# Creating a more robust RAQA system using LlamaIndex

We'll be putting together a system for querying both qualitative and quantitative data using LlamaIndex. 

To stick to a theme, we'll continue to use BarbenHeimer data as our base - but this can, and should, be extended to other topics/domains.

# Build 🏗️
There are 3 main tasks in this notebook:

- Create a qualitative `VectorStore` query engine
- Create a quantitative NL-to-SQL query engine
- Combine the two using LlamaIndex's OpenAI agent framework

# Ship 🚢
Create and host a Gradio or Chainlit application to serve your project on Hugging Face Spaces.

# Share 🚀
Make a social media post about your final application and tag @AIMakerspace

### A note on terminology:

You'll notice quite a few similarities between LangChain and LlamaIndex. In some ways, LlamaIndex can be thought of as an extension of LangChain, but some of the terminology has moved around. Let's spend a few moments disambiguating it.

- `QueryEngine` -> `RetrievalQA`:
  -  `QueryEngine` is just LlamaIndex's way of indicating something is an LLM "chain" on top of a retrieval system
- `OpenAIAgent` vs. `ZeroShotAgent`:
  - The two agents have the same fundamental pattern: Decide which of a list of tools to use to answer a user's query.
  - `OpenAIAgent` (LlamaIndex's primary agent) does not need to rely on an agent executor because it leverages OpenAI's [function calling API](https://openai.com/blog/function-calling-and-other-api-updates), which allows the agent to interface "directly" with the tools instead of operating through an intermediary application process.
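
The shared pattern (pick a tool, then call it with arguments) can be sketched without either library. In the minimal sketch below, `choose_tool` is a hypothetical stand-in for the model's function call, and the tool names and return strings are invented for illustration; it is not how `OpenAIAgent` is implemented.

```python
import json
from typing import Callable, Dict

# Hypothetical tools: each takes keyword arguments and returns a string.
def lookup_plot(title: str) -> str:
    return f"plot summary for {title}"

def average_rating(table: str) -> str:
    return f"average rating from {table}"

TOOLS: Dict[str, Callable[..., str]] = {
    "lookup_plot": lookup_plot,
    "average_rating": average_rating,
}

def choose_tool(user_query: str) -> dict:
    """Stand-in for the LLM: return a function call as a name plus JSON arguments."""
    if "rating" in user_query.lower():
        return {"name": "average_rating", "arguments": json.dumps({"table": "barbie"})}
    return {"name": "lookup_plot", "arguments": json.dumps({"title": "Barbie (film)"})}

def run_agent(user_query: str) -> str:
    call = choose_tool(user_query)
    tool = TOOLS[call["name"]]              # dispatch directly on the function name
    kwargs = json.loads(call["arguments"])  # arguments arrive as a JSON string
    return tool(**kwargs)

print(run_agent("What is the average rating?"))
```

The point of the sketch is that the "decision" is just structured data (a name and arguments), so the application can call the tool directly.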

There is, however, a much larger terminological difference when it comes to discussing data.

##### Nodes vs. Documents

As you know from the previous weeks' assignments, NLP has the idea of `documents`: text objects that exist within a corpus.

LlamaIndex takes this a step further and reclassifies `documents` as `nodes`. Confusingly, it refers to the `Source Document` as simply `Documents`.

The `Document` -> `node` structure is almost exactly equivalent to the `Source Document` -> `Document` structure found in LangChain, but the new terminology adds some clarity when working with different structured indices.

We won't be leveraging those structured indices today, but we will be leveraging a "benefit" of the `node` structure that exists by default in LlamaIndex: the ability to quickly filter nodes based on their metadata.

![image](https://i.imgur.com/B1QDjs5.png)
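
To make the `Document` -> `node` relationship and metadata filtering concrete, here is a minimal sketch in plain Python. The `Document` and `Node` dataclasses, the naive sentence split, and the sample texts are assumptions made for illustration; they are not LlamaIndex's actual classes.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Document:
    """A full source text (what LlamaIndex calls a `Document`)."""
    text: str
    metadata: Dict[str, str] = field(default_factory=dict)

@dataclass
class Node:
    """A chunk of a Document that carries its own copy of the metadata."""
    text: str
    metadata: Dict[str, str] = field(default_factory=dict)

def to_nodes(doc: Document, sentences_per_node: int = 1) -> List[Node]:
    # Naive sentence split, purely for illustration.
    sentences = [s.strip() for s in doc.text.split(".") if s.strip()]
    return [
        Node(". ".join(sentences[i : i + sentences_per_node]) + ".", dict(doc.metadata))
        for i in range(0, len(sentences), sentences_per_node)
    ]

def filter_nodes(nodes: List[Node], key: str, value: str) -> List[Node]:
    # Keep only the nodes whose metadata has an exact match for key/value.
    return [n for n in nodes if n.metadata.get(key) == value]

docs = [
    Document("Barbie lives in Barbieland. Ken follows her.", {"title": "Barbie (film)"}),
    Document("Oppenheimer leads Los Alamos.", {"title": "Oppenheimer (film)"}),
]
nodes = [node for doc in docs for node in to_nodes(doc)]
barbie_nodes = filter_nodes(nodes, "title", "Barbie (film)")
print(len(nodes), len(barbie_nodes))
```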

### BOILERPLATE

This is only relevant when running the code in a Jupyter Notebook.

In [26]:
import nest_asyncio

nest_asyncio.apply()

import logging
import sys

logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

### Primary Dependencies and Context Setting

#### Dependencies and OpenAI API key setting

First of all, we'll need our primary libraries - and to set up our OpenAI API key.

In [5]:
!pip install -U -q openai==0.27.8 llama-index==0.8.6 nltk==3.8.1

In [27]:
import os
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key: ")

import openai
openai.api_key = os.environ["OPENAI_API_KEY"]

#### Context Setting

Now, LlamaIndex has the ability to set `ServiceContext`. You can think of this as a config file of sorts. The basic idea here is that we use this to establish some core properties and then can pass it to various services. 

While we could set this up as a global context, we're going to leave it as `ServiceContext` so we can see where it's applied.

We'll set a few significant contexts:

- `chunk_size` - the maximum size (in tokens) of each chunk when documents are split into nodes
- `llm` - this is where we can set what model we wish to use as our primary LLM when we're making `QueryEngine`s and more
- `embed_model` - this will help us keep our embedding model consistent across use cases


We'll also create some resources we're going to keep consistent across all of our indices today.

- `text_splitter` - This is what we'll use to split our text; feel free to experiment here
- `SimpleNodeParser` - This works in tandem with the `text_splitter` to parse our full-sized documents into nodes.
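
To build some intuition for what `chunk_size` controls, here is a minimal sketch of token-based chunking. It assumes whitespace-separated words as a stand-in for real tokens, and the `split_by_tokens` helper and its `chunk_overlap` parameter are hypothetical names for illustration; the real `TokenTextSplitter` uses a proper tokenizer.

```python
from typing import List

def split_by_tokens(text: str, chunk_size: int, chunk_overlap: int = 0) -> List[str]:
    """Split text into chunks of at most `chunk_size` whitespace tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    tokens = text.split()
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start : start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # the last chunk already reaches the end of the text
    return chunks

chunks = split_by_tokens("one two three four five six seven", chunk_size=3, chunk_overlap=1)
print(chunks)
```

Smaller chunks give more precise retrieval hits but less context per node; larger chunks do the opposite, which is why this value is worth experimenting with.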

In [17]:
from llama_index import ServiceContext
from llama_index.node_parser.simple import SimpleNodeParser
from llama_index.langchain_helpers.text_splitter import TokenTextSplitter
from llama_index.llms import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

embed_model = OpenAIEmbedding()
chunk_size = 1000 ### YOUR CODE HERE
llm = OpenAI(
    temperature=0, 
    model="gpt-3.5-turbo", 
    streaming=True
)

service_context = ServiceContext.from_defaults(
    llm=llm, ### YOUR CODE HERE, 
    chunk_size=chunk_size,### YOUR CODE HERE, 
    embed_model=embed_model### YOUR CODE HERE
)

text_splitter = TokenTextSplitter(
    chunk_size=chunk_size### YOUR CODE HERE
)

node_parser = SimpleNodeParser(
    text_splitter=text_splitter### YOUR CODE HERE
)

### BarbenHeimer Wikipedia Retrieval Tool

Now we can get to work creating our semantic `QueryEngine`!

We'll follow a similar pattern as we did with LangChain here - and the first step (as always) is to get dependencies.

In [7]:
!pip install -U -q chromadb==0.4.6 tiktoken==0.4.0 sentence-transformers==2.2.2 pydantic==1.10.11

In [18]:
from llama_index import VectorStoreIndex
from llama_index.vector_stores import ChromaVectorStore
from llama_index.storage.storage_context import StorageContext
import chromadb

#### ChromaDB

We'll be using [ChromaDB](https://www.trychroma.com/) as our `VectorStore` today!

It works in a similar fashion to tools like Pinecone, Weaviate, and more - but it's locally hosted and will serve our purposes fine.

You'll also notice the return of `OpenAIEmbedding()`, which is the embeddings model we'll be leveraging. Of course, this is using the `ada` model under the hood - and already comes equipped with in-memory caching.

You'll notice we can pass our `service_context` into our `VectorStoreIndex`!

In [28]:
chroma_client = chromadb.Client()
# get_or_create_collection avoids a "Collection ... already exists" ValueError when this cell is re-run
chroma_collection = chroma_client.get_or_create_collection("wikipedia_barbie_opp")

vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
wiki_vector_index = VectorStoreIndex([], storage_context=storage_context, service_context=service_context)

INFO:chromadb.telemetry.posthog:Anonymized telemetry enabled. See https://docs.trychroma.com/telemetry for more information.

In [23]:
!pip install -U -q wikipedia

This is essentially the same as the LangChain example: we're just pulling information straight from Wikipedia using the built-in `WikipediaReader`.

Setting `auto_suggest=False` ensures we run into fewer errors caused by auto-correction of page titles.

In [29]:
from llama_index.readers.wikipedia import WikipediaReader

movie_list = ["Barbie (film)", "Oppenheimer (film)"]

wiki_docs = WikipediaReader().load_data(pages=movie_list, auto_suggest=False)

#### Node Construction

Now we will loop through our documents and metadata and construct nodes (associated with particular metadata for easy filtration later).

We're using the `node_parser` we created at the top of the Notebook.

In [30]:
for movie, wiki_doc in zip(movie_list, wiki_docs):
    nodes = node_parser.get_nodes_from_documents([wiki_doc])
    for node in nodes:
        node.metadata = {"title" : movie} 
    wiki_vector_index.insert_nodes(nodes)

#https://gpt-index.readthedocs.io/en/stable/api_reference/indices/vector_store.html

#### Auto Retriever Functional Tool

This tool will leverage OpenAI's function calling endpoint to select the correct metadata filter and query the filtered index - only looking at nodes with the desired metadata.

A simplified diagram: ![image](https://i.imgur.com/AICDPav.png)

First, we need to create our `VectorStoreInfo` object, which will hold all the relevant metadata we need for each component (in this case, title metadata).

Notice that the allowed values for `title` are spelled out as a list inside the text `description`.

In [32]:
from llama_index.tools import FunctionTool
from llama_index.vector_stores.types import (
    VectorStoreInfo,
    MetadataInfo,
    ExactMatchFilter,
    MetadataFilters,
)
from llama_index.retrievers import VectorIndexRetriever
from llama_index.query_engine import RetrieverQueryEngine

from typing import List, Tuple, Any
from pydantic import BaseModel, Field

top_k = 3

vector_store_info = VectorStoreInfo(
    content_info="semantic information about movies",
    metadata_info=[MetadataInfo(
        name="title",
        type="str",
        description="title of the movie, one of [Barbie (film), Oppenheimer (film)]",
    )]
)

Now we'll create our base Pydantic object that we can use to ensure compatibility with our application layer. This verifies that the response from the OpenAI endpoint conforms to this schema.

In [33]:
class AutoRetrieveModel(BaseModel):
    query: str = Field(..., description="natural language query string")
    filter_key_list: List[str] = Field(
        ..., description="List of metadata filter field names"
    )
    filter_value_list: List[str] = Field(
        ...,
        description=(
            "List of metadata filter field values (corresponding to names specified in filter_key_list)"
        )
    )

Now we can build the function that the OpenAI function calling endpoint will invoke.

>The `docstring` is important to the functionality of the application.

In [34]:
def auto_retrieve_fn(
    query: str, filter_key_list: List[str], filter_value_list: List[str]
):
    """Auto retrieval function.

    Performs auto-retrieval from a vector database, and then applies a set of filters.

    """
    query = query or "Query"

    exact_match_filters = [
        ExactMatchFilter(key=k, value=v)
        for k, v in zip(filter_key_list, filter_value_list)
    ]
    retriever = VectorIndexRetriever(
        wiki_vector_index,
        filters=MetadataFilters(filters=exact_match_filters),
        similarity_top_k=top_k,  # VectorIndexRetriever names this parameter `similarity_top_k`
    )
    query_engine = RetrieverQueryEngine.from_args(retriever)

    response = query_engine.query(query)
    return str(response)
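
Conceptually, the retrieval step pairs each filter key with its value, keeps only the nodes whose metadata match, and then ranks what is left by similarity. The sketch below illustrates that idea in plain Python; the toy vectors, the `INDEX` list, and the `retrieve` helper are made up for illustration, and a real vector store may apply the filter differently internally.

```python
import math
from typing import Dict, List, Tuple

# Toy "index": (embedding, metadata, text). The vectors are made up.
INDEX: List[Tuple[List[float], Dict[str, str], str]] = [
    ([1.0, 0.0], {"title": "Barbie (film)"}, "Barbie plot"),
    ([0.9, 0.1], {"title": "Barbie (film)"}, "Barbie cast"),
    ([0.0, 1.0], {"title": "Barbie (film)"}, "Barbie box office"),
    ([1.0, 0.0], {"title": "Oppenheimer (film)"}, "Oppenheimer plot"),
]

def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(
    query_vec: List[float],
    filter_keys: List[str],
    filter_values: List[str],
    top_k: int,
) -> List[str]:
    # Pair keys with values, mirroring zip(filter_key_list, filter_value_list).
    filters = dict(zip(filter_keys, filter_values))
    candidates = [
        (vec, text)
        for vec, meta, text in INDEX
        if all(meta.get(k) == v for k, v in filters.items())
    ]
    # Rank the remaining candidates by similarity and keep the best top_k.
    ranked = sorted(candidates, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    return [text for _, text in ranked[:top_k]]

print(retrieve([1.0, 0.0], ["title"], ["Barbie (film)"], top_k=2))
```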

Now we need to wrap our system in a tool in order to integrate it into the larger application.

Source Code Here:
- [`FunctionTool`](https://github.com/jerryjliu/llama_index/blob/d24767b0812ac56104497d8f59095eccbe9f2b08/llama_index/tools/function_tool.py#L21)

In [36]:
vector_store_info.metadata_info

[MetadataInfo(name='title', type='str', description='title of the movie, one of [Barbie (film), Oppenheimer (film)]')]

In [39]:
# These descriptions are important: the agent uses them to decide when to call the tool
description = f"""\
Use this tool to look up semantic information about films.
The vector database schema is given below:
{vector_store_info.json()}
"""

auto_retrieve_tool = FunctionTool.from_defaults(
    fn=auto_retrieve_fn, # YOUR CODE HERE,
    name="AutoRetrieve",# YOUR CODE HERE,
    description=description,# YOUR CODE HERE,
    fn_schema=AutoRetrieveModel# YOUR CODE HERE,
)

All that's left to do is attach the tool to an OpenAIAgent and let it rip!

Source Code Here:
- [`OpenAIAgent`](https://github.com/jerryjliu/llama_index/blob/d24767b0812ac56104497d8f59095eccbe9f2b08/llama_index/agent/openai_agent.py#L361)

In [41]:
from llama_index.agent import OpenAIAgent

agent = OpenAIAgent.from_tools([auto_retrieve_tool], llm=llm, verbose=True)


In [42]:
response = agent.chat("Tell me what happens (briefly) in the Barbie movie.")
print(str(response))

=== Calling Function ===
Calling function: AutoRetrieve with args: {
  "query": "Barbie movie",
  "filter_key_list": ["title"],
  "filter_value_list": ["Barbie (film)"]
}
Got output: Barbie is a 2023 American fantasy comedy film directed by Greta Gerwig. It is the first live-action Barbie film based on the Barbie fashion dolls by Mattel. The film follows Barbie and Ken on a journey of self-discovery after Barbie experiences an existential crisis. The film received critical acclaim and has become the second-highest-grossing film of 2023.
In the Barbie movie, Barbie and Ken go on a journey of self-discovery after Barbie experiences an existential crisis. The film is a fantasy comedy and has received critical acclaim. It is the first live-action Barbie film based on the Barbie fashion dolls by Mattel.


### BarbenHeimer SQL Tool

We'll walk through the steps of creating a natural language to SQL system in the following section.

> NOTICE: This system does not parse user inputs or check the intermediate calls to ensure that only safe SQL queries are run. Do not use it in a production environment without adding guardrails on both sides of the application (the user input and the generated SQL).
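
As one possible guardrail on the generated-SQL side, the sketch below uses the standard library's `sqlite3` authorizer hook to deny anything that is not part of a plain `SELECT`. It assumes a direct `sqlite3` connection rather than the SQLAlchemy engine used in this notebook, and the table contents and the `run_generated_sql` helper are invented for illustration.

```python
import sqlite3

# Actions a plain SELECT needs: the statement itself, column reads, SQL functions.
ALLOWED_ACTIONS = {sqlite3.SQLITE_SELECT, sqlite3.SQLITE_READ, sqlite3.SQLITE_FUNCTION}

def read_only_authorizer(action, arg1, arg2, db_name, source):
    # SQLite calls this while preparing each statement; deny everything else.
    return sqlite3.SQLITE_OK if action in ALLOWED_ACTIONS else sqlite3.SQLITE_DENY

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE barbie (Rating REAL)")
conn.executemany("INSERT INTO barbie VALUES (?)", [(6.0,), (8.0,)])
conn.commit()

# Install the guard only after the data has been loaded.
conn.set_authorizer(read_only_authorizer)

def run_generated_sql(sql: str):
    """Run model-generated SQL, returning rows or a rejection message."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.Error as exc:
        return f"rejected: {exc}"

print(run_generated_sql("SELECT AVG(Rating) FROM barbie"))
print(run_generated_sql("DROP TABLE barbie"))
```

This only covers one side of the application; input validation and database permissions would still need separate treatment.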

In [43]:
!pip install -q -U sqlalchemy pandas

The next few steps should be largely straightforward. We'll want to:

1. Read our `.csv` files into `pd.DataFrame` objects
2. Create an in-memory `sqlite`-powered `sqlalchemy` engine
3. Write our `pd.DataFrame` objects to the SQL engine as tables
4. Create a `SQLDatabase` object through LlamaIndex
5. Use that to create a `QueryEngineTool` that we can interact with through the `NLSQLTableQueryEngine`!

If you get stuck, please consult the documentation.
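
For orientation, the first three steps can be sketched with only the standard library. This is a dependency-free analogue, not the pandas/SQLAlchemy code used in the cells that follow; the CSV text, the `reviews` table, and its columns are made up for illustration.

```python
import csv
import io
import sqlite3

# Made-up rows standing in for a reviews CSV file.
CSV_TEXT = """Author,Rating,Review
alice,6.0,Fun but preachy
bob,9.0,Loved it
"""

# Step 1: read the CSV into a list of row dictionaries.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# Step 2: create an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reviews (Author TEXT, Rating REAL, Review TEXT)")

# Step 3: load the rows into a SQL table (named placeholders match the CSV headers).
conn.executemany("INSERT INTO reviews VALUES (:Author, :Rating, :Review)", rows)

# Once the data is in SQL, quantitative questions become plain queries.
avg_rating = conn.execute("SELECT AVG(Rating) FROM reviews").fetchone()[0]
print(avg_rating)
```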

#### Read `.csv` Into Pandas

In [45]:
import pandas as pd

barbie_df = pd.read_csv('./barbie_data/barbie.csv')# YOUR CODE HERE
oppenheimer_df = pd.read_csv('./oppenheimer_data/oppenheimer.csv')# YOUR CODE HERE

#### Create SQLAlchemy engine with SQLite

In [46]:
from sqlalchemy import create_engine

engine = create_engine("sqlite+pysqlite:///:memory:")

#### Convert `pd.DataFrame` to SQL tables

In [47]:
barbie_df.to_sql('barbie', con=engine, if_exists='append')

125

In [48]:
oppenheimer_df.to_sql('oppenheimer', con=engine, if_exists='append')

150

#### Construct a `SQLDatabase` index

Source Code Here:
- [`SQLDatabase`](https://github.com/jerryjliu/llama_index/blob/d24767b0812ac56104497d8f59095eccbe9f2b08/llama_index/langchain_helpers/sql_wrapper.py#L9)

In [49]:
from llama_index import SQLDatabase

sql_database = SQLDatabase(
    engine,
    include_tables=['barbie','oppenheimer'])

#https://github.com/jerryjliu/llama_index/blob/main/docs/examples/index_structs/struct_indices/SQLIndexDemo.ipynb

In [51]:
sql_database.table_info

'\nCREATE TABLE barbie (\n\t"index" BIGINT, \n\t"Unnamed: 0" BIGINT, \n\t"Review_Date" TEXT, \n\t"Author" TEXT, \n\t"Rating" FLOAT, \n\t"Review_Title" TEXT, \n\t"Review" TEXT, \n\t"Review_Url" TEXT\n)\n\n/*\n3 rows from barbie table:\nindex\tUnnamed: 0\tReview_Date\tAuthor\tRating\tReview_Title\tReview\tReview_Url\n0\t0\t21 July 2023\tLoveofLegacy\t6.0\t Beautiful film, but so preachy\n\tMargot does the best with what she\'s given, but this film was very disappointing to me. It was marke\t/review/rw9199947/?ref_=tt_urv\n1\t1\t22 July 2023\timseeg\t7.0\t 3 reasons FOR seeing it and 1 reason AGAINST.\n\tThe first reason to go see it:\t/review/rw9199947/?ref_=tt_urv\n2\t2\t22 July 2023\tNatcat87\t6.0\t Too heavy handed\n\tAs a woman that grew up with Barbie, I was very excited for this movie. I was curious to see how the\t/review/rw9199947/?ref_=tt_urv\n*/\n\n\nCREATE TABLE oppenheimer (\n\t"index" BIGINT, \n\t"Unnamed: 0" BIGINT, \n\t"Review_Date" TEXT, \n\t"Author" TEXT, \n\t"Rating" FL

#### Create the NLSQLTableQueryEngine interface for all added SQL tables

Source Code Here:
- [`NLSQLTableQueryEngine`](https://github.com/jerryjliu/llama_index/blob/d24767b0812ac56104497d8f59095eccbe9f2b08/llama_index/indices/struct_store/sql_query.py#L75C1-L75C1)

In [52]:
from llama_index.indices.struct_store.sql_query import NLSQLTableQueryEngine

sql_query_engine = NLSQLTableQueryEngine(
    sql_database=sql_database,
    tables=['barbie', 'oppenheimer'],
)

#### Wrap It All Up in a `QueryEngineTool`

You'll want to ensure you have a descriptive...description. 

An example is provided here:

```
"Useful for translating a natural language query into a SQL query over tables containing: "
"barbie, containing information related to reviews of the Barbie movie; "
"oppenheimer, containing information related to reviews of the Oppenheimer movie"
```

Source Code Here:

- [`QueryEngineTool`](https://github.com/jerryjliu/llama_index/blob/d24767b0812ac56104497d8f59095eccbe9f2b08/llama_index/tools/query_engine.py#L13)

In [53]:
from llama_index.tools.query_engine import QueryEngineTool

sql_tool = QueryEngineTool.from_defaults(
    query_engine=sql_query_engine,
    name='sql',
    # Adjacent string literals are concatenated, so each piece ends with a separator
    description=(
        "Useful for translating a natural language query into a SQL query over tables containing: "
        "barbie, containing information related to reviews of the Barbie movie; "
        "oppenheimer, containing information related to reviews of the Oppenheimer movie"
    ),
)

In [55]:
agent = OpenAIAgent.from_tools([sql_tool], llm=llm, verbose=True)

In [56]:
response = agent.chat("What is the average rating of the two films?")

=== Calling Function ===
Calling function: sql with args: {
  "input": "SELECT AVG(rating) AS average_rating FROM barbie UNION SELECT AVG(rating) AS average_rating FROM oppenheimer"
}
INFO:llama_index.indices.struct_store.sql_query:> Table desc str: Table 'barbie' has columns: index (BIGINT), Unnamed: 0 (BIGINT), Review_Date (TEXT), Author (TEXT), Rating (FLOAT), Review_Title (TEXT), Review (TEXT), Review_Url (TEXT), and foreign keys: .

Table 'oppenheimer' has columns: index (BIGINT), Unnamed: 0 (BIGINT), Review_Date (TEXT), Author (TEXT), Rating (FLOAT), Review_Title (TEXT), Review (TEXT), Review_Url (TEXT), and foreign keys: .
> Table desc str: Table 'barbie' has columns: index (BIGINT), Unnamed: 0 (BIGINT), Review_Date (TEXT), Author (TEXT), Rating (FLOAT), Review_Title (TEXT), Review (TEXT), Review_Url (TEXT), and foreign keys: .

Table 'oppenheimer' has columns: index (BIGINT), Unnamed: 0 (BIGINT), Review_Date (TEXT), Author (TEXT), Rating (FLOAT), Review_Title (TEXT), Review (TE

In [57]:
print(str(response))

The average rating of the two films is 7.36 for Barbie and 8.35 for Oppenheimer.
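
Note that the generated query joins two unlabeled averages with `UNION`, which removes duplicate rows and does not say which film each value belongs to. A possible refinement, sketched here with the standard library's `sqlite3` and made-up ratings, is to label each row and use `UNION ALL`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE barbie (Rating REAL);
    CREATE TABLE oppenheimer (Rating REAL);
    INSERT INTO barbie VALUES (7.0), (9.0);
    INSERT INTO oppenheimer VALUES (8.0), (8.0);
    """
)

# Unlabeled UNION: two equal averages collapse into a single row.
unlabeled = conn.execute(
    "SELECT AVG(Rating) FROM barbie UNION SELECT AVG(Rating) FROM oppenheimer"
).fetchall()

# Labeled UNION ALL: one row per film, each tagged with its source table.
labeled = conn.execute(
    "SELECT 'barbie' AS film, AVG(Rating) AS average_rating FROM barbie "
    "UNION ALL "
    "SELECT 'oppenheimer' AS film, AVG(Rating) AS average_rating FROM oppenheimer"
).fetchall()

print(unlabeled)
print(labeled)
```

One way to nudge the agent toward the labeled form is to mention it in the tool description, though whether the model follows that hint is not guaranteed.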


### Combining The Tools Together

Now, we can simply add our tools into the `OpenAIAgent`, and off we go!

In [58]:
barbenheimer_agent = OpenAIAgent.from_tools(
    [auto_retrieve_tool, sql_tool],
    llm=llm,
    verbose=True,
)

In [59]:
response = barbenheimer_agent.chat("What is the lowest rating of the two films - and can you summarize what the reviewer said?")

=== Calling Function ===
Calling function: sql with args: {
  "input": "SELECT MIN(rating) AS lowest_rating, review FROM barbie UNION SELECT MIN(rating) AS lowest_rating, review FROM oppenheimer"
}
INFO:llama_index.indices.struct_store.sql_query:> Table desc str: Table 'barbie' has columns: index (BIGINT), Unnamed: 0 (BIGINT), Review_Date (TEXT), Author (TEXT), Rating (FLOAT), Review_Title (TEXT), Review (TEXT), Review_Url (TEXT), and foreign keys: .

Table 'oppenheimer' has columns: index (BIGINT), Unnamed: 0 (BIGINT), Review_Date (TEXT), Author (TEXT), Rating (FLOAT), Review_Title (TEXT), Review (TEXT), Review_Url (TEXT), and foreign keys: .
> Table desc str: Table 'barbie' has columns: index (BIGINT), Unnamed: 0 (BIGINT), Review_Date (TEXT), Author (TEXT), Rating (FLOAT), Review_Title (TEXT), Review (TEXT), Review_Url (TEXT), and foreign keys: .

Table 'oppenheimer' has columns: index (BIGINT), Unnamed: 0 (BIGINT), Review_Date (TEXT), Author (TEXT), Rating (FLOAT), Review_Title (TEX

In [60]:
print(str(response))

The lowest rating for both films is 3.0. 

For the Barbie movie, one reviewer mentioned that it lacked science and interesting problem-solving. 

For the Oppenheimer movie, one reviewer mentioned that while the production value, cinematography, and acting were good, the movie fell short in some aspects.
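
The generated query pairs `MIN(rating)` with a bare `review` column. SQLite fills a bare column from the row that holds the minimum, but other engines such as PostgreSQL reject this form. A more portable way to fetch the lowest-rated review is sketched below with the standard library's `sqlite3`; the table contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE barbie (Rating REAL, Review TEXT);
    INSERT INTO barbie VALUES (8.0, 'great'), (3.0, 'not for me'), (6.0, 'fine');
    """
)

# Sort by rating and take the first row instead of mixing MIN() with a bare column.
lowest = conn.execute(
    "SELECT Rating, Review FROM barbie ORDER BY Rating ASC LIMIT 1"
).fetchone()
print(lowest)
```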


In [61]:
response = barbenheimer_agent.chat("How many times do the Barbie reviews mention 'Ken', and what is a summary of his character in the Barbie movie?")

=== Calling Function ===
Calling function: AutoRetrieve with args: {
  "query": "Barbie",
  "filter_key_list": ["title"],
  "filter_value_list": ["Barbie (film)"]
}
Got output: Barbie is a 2023 American fantasy comedy film directed by Greta Gerwig. It is the first live-action Barbie film and is based on the Barbie fashion dolls by Mattel. The film follows Barbie and Ken on a journey of self-discovery after Barbie experiences an existential crisis. It features a supporting ensemble cast and has received critical acclaim. The film was released in July 2023 and has become the second-highest-grossing film of the year.
=== Calling Function ===
Calling function: AutoRetrieve with args: {
  "query": "Ken",
  "filter_key_list": ["title"],
  "filter_value_list": ["Barbie (film)"]
}
Got output: Ken is a character in the film Barbie. He is portrayed by actor Ryan Gosling. In the film, Ken has low self-esteem and seeks approval from Barbie. He also has a power ballad, which is considered a signifi

In [62]:
print(str(response))

The character of Ken is mentioned in the Barbie movie. 

According to the information available, Ken is portrayed by actor Ryan Gosling. In the film, Ken has low self-esteem and seeks approval from Barbie. He also has a power ballad, which is considered a significant moment in the movie. The casting process for the film involved looking for actors with "Barbie energy" to portray Ken and other characters.


In [66]:
response = barbenheimer_agent.chat("Do Barbie and Oppenheimer share the same themes based on wikipedia? ")

=== Calling Function ===
Calling function: AutoRetrieve with args: {
  "query": "Barbie",
  "filter_key_list": ["title"],
  "filter_value_list": ["Barbie (film)"]
}
Got output: Barbie is a 2023 American fantasy comedy film directed by Greta Gerwig. It is the first live-action Barbie film and is based on the Barbie fashion dolls by Mattel. The film follows Barbie and Ken on a journey of self-discovery after Barbie experiences an existential crisis. It features a supporting ensemble cast and has received critical acclaim. The film was released in July 2023 and has become one of the highest-grossing films of the year.
=== Calling Function ===
Calling function: AutoRetrieve with args: {
  "query": "Oppenheimer",
  "filter_key_list": ["title"],
  "filter_value_list": ["Oppenheimer (film)"]
}
Got output: Oppenheimer is a 2023 epic biographical thriller film written and directed by Christopher Nolan. It is based on the 2005 biography American Prometheus by Kai Bird and Martin J. Sherwin. The 

In [67]:
print(str(response))

Based on the information available, the Barbie movie and the Oppenheimer movie do not share the same themes. 

The Barbie movie is described as a fantasy comedy film that follows Barbie and Ken on a journey of self-discovery after Barbie experiences an existential crisis. It is based on the Barbie fashion dolls by Mattel.

On the other hand, the Oppenheimer movie is an epic biographical thriller that chronicles the career of American theoretical physicist J. Robert Oppenheimer. It focuses on his studies, his role in the Manhattan Project during World War II, and his subsequent fall from grace due to a security hearing in 1954.

Therefore, the themes of the two movies are different.


In [68]:
response = barbenheimer_agent.chat("Do Barbie and Oppenheimer share the same themes based on reviews? ")

=== Calling Function ===
Calling function: AutoRetrieve with args: {
  "query": "Barbie",
  "filter_key_list": ["title"],
  "filter_value_list": ["Barbie (film)"]
}
Got output: Barbie is a 2023 American fantasy comedy film directed by Greta Gerwig. It is the first live-action Barbie film and is based on the Barbie fashion dolls by Mattel. The film follows Barbie and Ken on a journey of self-discovery after Barbie experiences an existential crisis. It features a supporting ensemble cast and has received critical acclaim. The film was released in July 2023 and has become the second-highest-grossing film of the year.
=== Calling Function ===
Calling function: AutoRetrieve with args: {
  "query": "Oppenheimer",
  "filter_key_list": ["title"],
  "filter_value_list": ["Oppenheimer (film)"]
}
Got output: Oppenheimer is a 2023 epic biographical thriller film written and directed by Christopher Nolan. It is based on the 2005 biography American Prometheus by Kai Bird and Martin J. Sherwin. The f

In [69]:
print(str(response))

Based on the available information, the Barbie movie explores themes related to feminism, capitalism, and existentialism. Some reviews see the film as a satire that critiques capitalism, while others view it as a clever use of capitalist themes. The movie also delves into the concept of what makes us human and raises questions about autonomy.

On the other hand, the Oppenheimer movie focuses on the life and career of J. Robert Oppenheimer, a theoretical physicist involved in the Manhattan Project during World War II. The film explores themes such as scientific discovery, the ethics of nuclear weapons, and the consequences of scientific advancements.

Therefore, based on the reviews, the Barbie and Oppenheimer movies have different themes.


In [70]:
response = barbenheimer_agent.chat("which movie should i watch if i want to watch higher rated move? ")

=== Calling Function ===
Calling function: sql with args: {
  "input": "SELECT AVG(rating) AS average_rating FROM barbie UNION SELECT AVG(rating) AS average_rating FROM oppenheimer"
}
INFO:llama_index.indices.struct_store.sql_query:> Table desc str: Table 'barbie' has columns: index (BIGINT), Unnamed: 0 (BIGINT), Review_Date (TEXT), Author (TEXT), Rating (FLOAT), Review_Title (TEXT), Review (TEXT), Review_Url (TEXT), and foreign keys: .

Table 'oppenheimer' has columns: index (BIGINT), Unnamed: 0 (BIGINT), Review_Date (TEXT), Author (TEXT), Rating (FLOAT), Review_Title (TEXT), Review (TEXT), Review_Url (TEXT), and foreign keys: .
> Table desc str: Table 'barbie' has columns: index (BIGINT), Unnamed: 0 (BIGINT), Review_Date (TEXT), Author (TEXT), Rating (FLOAT), Review_Title (TEXT), Review (TEXT), Review_Url (TEXT), and foreign keys: .

Table 'oppenheimer' has columns: index (BIGINT), Unnamed: 0 (BIGINT), Review_Date (TEXT), Author (TEXT), Rating (FLOAT), Review_Title (TEXT), Review (TE

In [71]:
print(str(response))

If you want to watch a higher-rated movie, I would recommend watching the Oppenheimer movie. It has an average rating of 8.35, which is higher than the average rating of 7.36 for the Barbie movie.
