# Google Chat Reader Test

Demonstrates our Google Chat data connector.

If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.

In [None]:
%pip install llama-index llama-index-readers-google


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.0[0m[39;49m -> [0m[32;49m24.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m
Note: you may need to restart the kernel to use updated packages.


This loader takes in IDs of Google Chat spaces or messages and parses the chat history into `Document`s. The space/message ID can be found in the URL, as shown below:

- mail.google.com/chat/u/0/#chat/space/**<CHAT_ID>**
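
If you copy the URL from the browser, the ID can be pulled out programmatically. The sketch below assumes the URL has the shape shown above. `extract_space_id` is a hypothetical helper written for illustration and is not part of the reader.

```python
from urllib.parse import urlparse


def extract_space_id(url: str) -> str:
    """Return the ID that follows 'space/' in a Google Chat URL.

    Hypothetical helper; assumes the URL format shown above.
    """
    # The space path lives in the fragment (the part after '#').
    fragment = urlparse(url).fragment  # e.g. "chat/space/AAAAtTPwdzg"
    parts = [p for p in fragment.split("/") if p]
    if "space" not in parts or parts.index("space") + 1 >= len(parts):
        raise ValueError(f"No space ID found in URL: {url!r}")
    return parts[parts.index("space") + 1]


space_id = extract_space_id(
    "https://mail.google.com/chat/u/0/#chat/space/AAAAtTPwdzg"
)
print(space_id)  # AAAAtTPwdzg
```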

Before using this loader, you need to create a Google Cloud Platform (GCP) project with a Google Workspace account. Then, you need to authorize the app with user credentials. Follow the prerequisites and steps 1 and 2 of [this guide](https://developers.google.com/workspace/chat/authenticate-authorize-chat-user). After downloading the client secret JSON file, rename it as **`credentials.json`** and save it into your project folder.
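
A missing or misnamed credentials file is an easy mistake to make, so it can help to check for it before constructing the reader. This is a minimal sketch. `find_credentials` is a hypothetical helper, and it assumes the file sits directly in the project folder, as described above.

```python
from pathlib import Path


def find_credentials(project_dir: str = ".") -> Path:
    """Return the path to credentials.json, or raise if it is missing.

    Hypothetical helper; only checks that the file exists, not that it is valid.
    """
    path = Path(project_dir) / "credentials.json"
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found. Download the client secret JSON file "
            "and rename it to credentials.json."
        )
    return path


try:
    print(f"Found {find_credentials()}")
except FileNotFoundError as err:
    print(err)
```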

This example parses a chat between two users. They first discuss math homework, then they plan a trip to San Francisco in a thread. At the end, they discuss finishing an essay. See the full thread [here](https://pastebin.com/FrYscMAa).

## Basic Usage

The example below loads the entire chat history into a `SummaryIndex`.

In [None]:
from llama_index.core import SummaryIndex
from llama_index.readers.google import GoogleChatReader

space_ids = [
    "AAAAtTPwdzg"
]  # The Google account you authenticated with must have access to this space
reader = GoogleChatReader()
docs = reader.load_data(space_names=space_ids)

In [None]:
index = SummaryIndex.from_documents(docs)

In [None]:
query_engine = index.as_query_engine()
response = query_engine.query("What was the overall conversation about?")

In [None]:
from IPython.display import Markdown, display

display(Markdown(f"{response}"))

The overall conversation was about discussing and planning a trip to San Francisco, including visiting various landmarks and using public transportation to get around the city. Additionally, there was a brief mention of finishing homework and essays before the trip.

## Filtering and Ordering

### Ordering
You can retrieve the chat history in ascending or descending chronological order with the `order_asc` parameter. Pass `order_asc=False` for descending order.

In [None]:
docs = reader.load_data(space_names=space_ids, order_asc=False)

index = SummaryIndex.from_documents(docs)
query_engine = index.as_query_engine()
response = query_engine.query(
    "List the things that the users discussed in the order they were discussed in. Make the list short."
)
display(Markdown(f"{response}"))

1. Visiting San Francisco
2. Planning a trip itinerary in San Francisco
3. Taking public transit to San Francisco
4. Discussing transportation options in San Francisco
5. Working on a math problem together

Even though the messages were retrieved in reverse order, the list is still in the correct order because messages have a timestamp in their metadata.
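
If downstream code needs chronological order regardless of how the messages were retrieved, sorting on the timestamp is one option. The sketch below uses plain dictionaries with illustrative values as stand-ins for loaded documents. The `timestamp` metadata key and its ISO 8601 format are assumptions, so inspect `docs[0].metadata` to confirm what the reader actually provides.

```python
from datetime import datetime

# Stand-ins for loaded documents, newest first (as with order_asc=False).
# The "timestamp" key and these values are assumptions for illustration.
messages = [
    {"text": "essay", "metadata": {"timestamp": "2024-06-25T14:55:00-07:00"}},
    {"text": "trip", "metadata": {"timestamp": "2024-06-25T14:30:00-07:00"}},
    {"text": "math", "metadata": {"timestamp": "2024-06-25T14:00:00-07:00"}},
]

# Parse each timestamp and sort oldest first.
chronological = sorted(
    messages,
    key=lambda m: datetime.fromisoformat(m["metadata"]["timestamp"]),
)
print([m["text"] for m in chronological])  # ['math', 'trip', 'essay']
```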

### Message Limiting
Use the `num_messages` parameter to limit how many messages are loaded. The number of messages actually loaded may not match this value exactly. If `order_asc` is True, the reader takes the first `num_messages` messages within the given time frame. If `order_asc` is False, it takes the last `num_messages` messages within the time frame.
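
The selection rule can be pictured with a plain list. This sketch only mimics the behavior described above; it is not the reader's implementation, and `limit_messages` is a hypothetical name. Returning the newest message first in descending mode is also an assumption.

```python
def limit_messages(messages, num_messages, order_asc=True):
    """Illustrative only: pick the first or last num_messages items.

    `messages` is assumed to be sorted oldest first.
    """
    if num_messages <= 0:
        return []
    if order_asc:
        # First num_messages messages, oldest first.
        return messages[:num_messages]
    # Last num_messages messages, newest first.
    return messages[-num_messages:][::-1]


history = list(range(1, 21))  # message 1 is the oldest, 20 the newest
print(limit_messages(history, 3))  # [1, 2, 3]
print(limit_messages(history, 3, order_asc=False))  # [20, 19, 18]
```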

In [None]:
docs = reader.load_data(
    space_names=space_ids, num_messages=10
)  # in ascending order, only contains messages about math HW

index = SummaryIndex.from_documents(docs)
query_engine = index.as_query_engine()
response = query_engine.query("What was discussed in this conversation?")
display(Markdown(f"{response}"))

The conversation revolved around a student seeking help with their math homework, specifically understanding and applying the chain rule in calculus to find the derivative of a function involving cosine. The student was stuck on problem 4b and needed assistance with taking the derivative of cos(2x). The other participant explained the application of the chain rule step by step, leading to the correct derivative of -2sin(2x). The student expressed gratitude for the explanation and indicated that they could now proceed with solving the remaining problems.

Notice that the summary covers only the first 10 messages, which involve only the math homework. Below is an example of retrieving the last 16 messages, which involve only the essay. The "cost of a trip" in its summary refers to a reply in the SF trip thread that was made during the discussion of the essay.

In [None]:
docs = reader.load_data(
    space_names=space_ids, num_messages=16, order_asc=False
)  # in descending order, only contains messages about essay

index = SummaryIndex.from_documents(docs)
query_engine = index.as_query_engine()
response = query_engine.query("What was discussed in this conversation?")
display(Markdown(f"{response}"))

The conversation revolved around the completion of an essay, specifically focusing on the contrast between old money and new money rather than the American Dream and The Great Gatsby. There were mentions of procrastination, starting work on the essay, and concerns about the cost of a trip.

### Time Frame

You can also restrict the time frame with the `before` and `after` parameters, which take `datetime` objects.
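
The examples in this section build timezone-aware `datetime` objects from ISO 8601 strings. As a quick sketch using only the standard library, the same instant can be constructed explicitly and compared in UTC. Whether the reader also accepts naive `datetime` objects is not covered here.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-07:00 offset, matching the "-07:00" suffix in the string form.
utc_minus_7 = timezone(timedelta(hours=-7))

from_string = datetime.fromisoformat("2024-06-25 14:27:00-07:00")
explicit = datetime(2024, 6, 25, 14, 27, tzinfo=utc_minus_7)

print(from_string == explicit)  # True
print(from_string.astimezone(timezone.utc).isoformat())  # 2024-06-25T21:27:00+00:00
```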

In [None]:
import datetime

date1 = datetime.datetime.fromisoformat(
    "2024-06-25 14:27:00-07:00"
)  # when they start talking about trip
docs = reader.load_data(space_names=space_ids, before=date1)

index = SummaryIndex.from_documents(docs)
query_engine = index.as_query_engine()
response = query_engine.query(
    "What was discussed in this conversation?"
)  # should only be about math HW
display(Markdown(f"{response}"))

The conversation revolved around a student seeking help with understanding the chain rule in calculus, specifically how to take the derivative of cos(2x). The student was stuck on problem 4b and requested assistance, to which the other student explained the application of the chain rule step by step. The explanation clarified how to differentiate cos(2x) using the chain rule, resulting in the derivative being -2sin(2x). The student who sought help expressed gratitude and indicated that they could now proceed with solving the remaining problems.

In [None]:
date2 = datetime.datetime.fromisoformat(
    "2024-06-25 14:51:00-07:00"
)  # when they start talking about essay
docs = reader.load_data(space_names=space_ids, after=date2)

index = SummaryIndex.from_documents(docs)
query_engine = index.as_query_engine()
response = query_engine.query(
    "What was discussed in this conversation?"
)  # should only be about essay + cost of trip (in thread)
display(Markdown(f"{response}"))

The conversation revolved around finishing an essay on the contrast between old money and new money, concerns about the cost of a trip, and reassurances that transportation expenses would be affordable.

In [None]:
docs = reader.load_data(space_names=space_ids, after=date1, before=date2)

index = SummaryIndex.from_documents(docs)
query_engine = index.as_query_engine()
response = query_engine.query(
    "What was discussed in this conversation?"
)  # should only be about trip
display(Markdown(f"{response}"))

The conversation revolved around planning a trip to San Francisco for the weekend. They discussed visiting various landmarks such as the Golden Gate Bridge, Fisherman's Wharf, and the Ferry Building, with the possibility of going to Alcatraz or Twin Peaks if time allowed. They also talked about transportation options like taking Caltrain or BART, renting a scooter or BayWheels bike, and using public transit to move around the city. Additionally, they mentioned exploring Chinatown and taking bus routes back to the city center for flexibility.