# Using LangChain and ChatGPT to query Google Calendar

Or, how to use LangChain and ChatGPT as a personal assistant.

Google Calendar guides my workday, and recently I've begun using it to apply [Timeboxing](https://www.mindtools.com/a9bt6jr/timeboxing) in my day-to-day life.

Naturally, I was curious whether I could feed that data to ChatGPT as context and run natural-language queries over my Google Calendar to get daily summaries, insights and reminders. Turns out you can do this and a lot more!

**Libraries used**

* LangChain - Framework for building LLM applications
* OpenAI API - Creates embeddings and provides the LLM
* ChromaDB - Vector store

To interact with the Google Calendar API, follow the instructions [here](https://developers.google.com/workspace/guides/create-credentials) to create an OAuth application and download `credentials.json`.

**⚠️ Disclaimer**

* Running LangChain with your own OpenAI API key will consume credits, so please be careful and monitor costs.
* This is my 2nd project with LangChain and LLMs, so I might not have the nomenclature down correctly. If there are inconsistencies, let me know and I will do my best to update.
* You can't run the notebook on Kaggle, as it starts a local server for the Google OAuth callback during authentication.

Once the credentials are downloaded, the next step is to load the calendar data. For this I'll be using a slightly modified version of the LlamaHub [GoogleCalendarReader](https://llamahub.ai/l/google_calendar); the original code can be found [here](https://github.com/emptycrown/llama-hub/tree/main/loader_hub/google_calendar).

In [None]:
"""
Google Calendar reader.
"""

import datetime
import os
from typing import Any, List, Optional, Union

from llama_index.readers.base import BaseReader
from llama_index.readers.schema.base import Document

SCOPES = ["https://www.googleapis.com/auth/calendar.readonly"]

# Copyright 2018 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


class GoogleCalendarReader(BaseReader):
    """Google Calendar reader.

    Reads events from Google Calendar

    """

    def load_data(
        self,
        number_of_results: Optional[int] = 100,
        start_date: Optional[Union[str, datetime.date]] = None,
    ) -> List[Document]:

        """Load data from user's calendar.

        Args:
            number_of_results (Optional[int]): the number of events to return. Defaults to 100.
            start_date (Optional[Union[str, datetime.date]]): the start date to return events from. Defaults to today.
        """

        from googleapiclient.discovery import build

        credentials = self._get_credentials()
        service = build("calendar", "v3", credentials=credentials)

        if start_date is None:
            start_date = datetime.date.today()
        elif isinstance(start_date, str):
            start_date = datetime.date.fromisoformat(start_date)

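        # Treat start_date as local midnight, then convert to UTC so the "Z" suffix is accurate.
        start_datetime = datetime.datetime.combine(start_date, datetime.time.min)
        start_datetime_utc = start_datetime.astimezone(datetime.timezone.utc).strftime(
            "%Y-%m-%dT%H:%M:%S.%fZ"
        )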

        events_result = (
            service.events()
            .list(
                calendarId="primary",
                timeMin=start_datetime_utc,
                maxResults=number_of_results,
                singleEvents=True,
                orderBy="startTime",
            )
            .execute()
        )

        events = events_result.get("items", [])

        if not events:
            return []

        results = []
        for event in events:
            if "dateTime" in event["start"]:
                start_time = event["start"]["dateTime"]
            else:
                start_time = event["start"]["date"]

            if "dateTime" in event["end"]:
                end_time = event["end"]["dateTime"]
            else:
                end_time = event["end"]["date"]

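            # "status" and "summary" can be absent (e.g. untitled events), so avoid a KeyError.
            event_string = f"Status: {event.get('status', 'N/A')}, "
            event_string += f"Summary: {event.get('summary', 'N/A')}, "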
            event_string += f"Start time: {start_time}, "
            event_string += f"End time: {end_time}, "

            organizer = event.get("organizer", {})
            display_name = organizer.get("displayName", "N/A")
            email = organizer.get("email", "N/A")
            if display_name != "N/A":
                event_string += f"Organizer: {display_name} ({email})"
            else:
                event_string += f"Organizer: {email}"

            results.append(Document(event_string))

        return results

    def _get_credentials(self) -> Any:
        """Get valid user credentials from storage.

        The file token.json stores the user's access and refresh tokens, and is
        created automatically when the authorization flow completes for the first
        time.

        Returns:
            Credentials, the obtained credential.
        """
        from google.auth.transport.requests import Request
        from google.oauth2.credentials import Credentials
        from google_auth_oauthlib.flow import InstalledAppFlow

        creds = None
        if os.path.exists("token.json"):
            creds = Credentials.from_authorized_user_file("token.json", SCOPES)
        # If there are no (valid) credentials available, let the user log in.
        if not creds or not creds.valid:
            if creds and creds.expired and creds.refresh_token:
                creds.refresh(Request())
            else:
                flow = InstalledAppFlow.from_client_secrets_file(
                    "credentials.json", SCOPES
                )
                creds = flow.run_local_server(port=3030)
            # Save the credentials for the next run
            with open("token.json", "w") as token:
                token.write(creds.to_json())

        return creds
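
To make it concrete what each `Document` ends up containing, here is a minimal standalone sketch of the flattening step from `load_data`. The `format_event` helper and the sample event are hypothetical and made up for illustration; only the field names are taken from the reader code above.

```python
from typing import Any, Dict


def format_event(event: Dict[str, Any]) -> str:
    """Flatten one Calendar API event dict into a single line of text."""
    # Timed events carry "dateTime"; all-day events carry only "date".
    start = event["start"].get("dateTime", event["start"].get("date"))
    end = event["end"].get("dateTime", event["end"].get("date"))

    organizer = event.get("organizer", {})
    email = organizer.get("email", "N/A")
    name = organizer.get("displayName")
    who = f"{name} ({email})" if name else email

    return (
        f"Status: {event.get('status', 'N/A')}, "
        f"Summary: {event.get('summary', 'N/A')}, "
        f"Start time: {start}, "
        f"End time: {end}, "
        f"Organizer: {who}"
    )


# Hypothetical sample event, shaped like one item returned by events().list()
sample_event = {
    "status": "confirmed",
    "summary": "Deep Work",
    "start": {"dateTime": "2023-05-09T10:00:00+01:00"},
    "end": {"dateTime": "2023-05-09T12:30:00+01:00"},
    "organizer": {"email": "me@example.com"},
}

print(format_event(sample_event))
```

This one-line string is all the LLM later sees about an event, so anything not included here (attendees, location, description) cannot be queried.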

In [None]:
# Load OPENAI_API_KEY from a local .env file

import dotenv
dotenv.load_dotenv()

In [None]:
'''
The following makes a request to the Google Calendar API with the OAuth token
and retrieves up to number_of_results events, starting from the specified date.
It assumes the reader class above is saved as GoogleCalendarReader.py.
'''

from GoogleCalendarReader import GoogleCalendarReader
from datetime import date

loader = GoogleCalendarReader()
documents = loader.load_data(start_date=date.today(), number_of_results=50)

LangChain has its own document schema, which we need to convert to. Luckily, loaders from LlamaHub work with LangChain: we just need to call `.to_langchain_format()` on each retrieved document.

In [None]:
from typing import List
from langchain.docstore.document import Document as LCDocument

formatted_documents: List[LCDocument] = [doc.to_langchain_format() for doc in documents]

With the data normalised, we just need to put it together in a ConversationalRetrievalChain. LangChain provides various ways of interacting with LLMs, from the simple [VectorstoreIndexCreator](https://python.langchain.com/en/latest/modules/indexes/getting_started.html) to more complex "chains", which rely on prompt engineering to get the most out of the LLM.

To query Google Calendar and ask follow-up questions, I'm going to use ConversationalRetrievalChain, which preserves the chat history and feeds it back to the LLM with each new question.
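
As a rough illustration of what "feeding the history back" means, here is a toy sketch in plain Python. It is not LangChain's implementation: `build_prompt`, `ask` and `fake_llm` are hypothetical names, and as far as I understand the real chain first rewrites the follow-up into a standalone question before retrieving documents. The core idea is the same, though: earlier turns travel along with each new question.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]  # (question, answer) pairs


def build_prompt(history: History, question: str) -> str:
    """Prepend earlier turns so a follow-up question keeps its context."""
    lines = []
    for human, ai in history:
        lines.append(f"Human: {human}")
        lines.append(f"Assistant: {ai}")
    lines.append(f"Human: {question}")
    return "\n".join(lines)


def ask(llm: Callable[[str], str], history: History, question: str) -> str:
    """Send history plus question to the model, then record the new turn."""
    answer = llm(build_prompt(history, question))
    history.append((question, answer))
    return answer


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model: reports how many lines of context it saw."""
    return f"(saw {len(prompt.splitlines())} lines)"


history: History = []
ask(fake_llm, history, "What am I doing on 2023-05-09?")
ask(fake_llm, history, "And the day after?")
print(history)
```

The second call sees three lines instead of one, which is why a question like "And the day after?" can be answered at all.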

In [None]:
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationalRetrievalChain
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.text_splitter import CharacterTextSplitter
from langchain.memory import ConversationBufferMemory

# OpenAIEmbeddings uses text-embedding-ada-002 by default.

# Split documents into chunks, embed them, and index them in Chroma.
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
documents = text_splitter.split_documents(formatted_documents)
embeddings = OpenAIEmbeddings()
vector_store = Chroma.from_documents(documents, embeddings)
# Keep the running conversation so follow-up questions have context.
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

# gpt-4 is a chat model, so use ChatOpenAI. temperature=0 keeps answers close to
# the calendar data; high values such as 2 tend to produce incoherent text.
qa = ConversationalRetrievalChain.from_llm(
    ChatOpenAI(temperature=0, model_name="gpt-4"),
    vector_store.as_retriever(),
    memory=memory,
)
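
To give an intuition for what the vector store's retriever does before the LLM is called, here is a toy sketch that ranks event strings by cosine similarity to the query. It is an assumption-laden stand-in: `embed` uses simple word counts rather than OpenAI embeddings, and `top_k` and the sample events are hypothetical, not Chroma's API.

```python
import math
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real model returns dense floats)."""
    return Counter(text.lower().replace(",", " ").split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, docs: List[str], k: int = 2) -> List[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


# Hypothetical event strings in the same shape the reader produces
events = [
    "Summary: Deep Work, Start time: 2023-05-09T10:00:00",
    "Summary: Work Lunch, Start time: 2023-05-09T13:00:00",
    "Summary: Dentist, Start time: 2023-05-12T09:00:00",
]
print(top_k("When is my Dentist appointment", events, k=1))
```

Only the top-ranked events are passed to the LLM as context, so a query that matches few words or dates in the event strings may miss relevant entries.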


We have everything set up and are ready to start querying.

In [None]:
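from IPython.display import Markdown

# The chain's memory tracks the chat history, so only the question is passed in.
query = "Create a summary for what am I doing on the day: 2023-05-09"
result = qa({"question": query})
# Render the model's answer as Markdown in the notebook.
Markdown(result["answer"])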

On May 9, 2023, you have the following activities:

* Deep Work from 10:00 AM to 12:30 PM
* Work Lunch from 1:00 PM to 2:00 PM
* Interview from 2:30 PM to 3:30 PM
* Standup Meeting - Daily from 3:30 PM to 4:00 PM

Some other ideas:

* Find suitable timeslots on the calendar: "When is the best time to schedule a one-hour workout?"
* Integrate multiple calendars and find the best time to schedule events.

Thank you for reading, and hopefully this is a useful showcase of how to give ChatGPT context over your calendar.