# Creating a LlamaIndex RAG Pipeline with NL2SQL and Metadata Filtering!

We'll be putting together a system for querying both qualitative and quantitative data using LlamaIndex.

The activities will be broken down as follows:

- 🤝 Breakout Room #1
  - Task 1: Load Dependencies
  - Task 2: Set Env Variables and Set Up WandB Callback
  - Task 3: Initialize Settings
  - Task 4: Index Creation
  - Task 5: Simple RAG - `QueryEngine`
  - Task 6: Auto Retriever Functional Tool
- 🤝 Breakout Room #2
  - Task 1: Quantitative RAG Pipeline with NL2SQL Tooling
  - Task 2: Combined RAG Pipeline

A quick note on terminology is included in Task 4, right before we start collecting data.

# 🤝 Breakout Room #1

## BOILERPLATE

This is only relevant when running the code in a Jupyter Notebook.

In [1]:
import nest_asyncio

nest_asyncio.apply()

import logging
import sys

logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

## Task 1: Load Dependencies

Let's grab our core `llama-index` library, as well as OpenAI's Python SDK.

We'll be leveraging OpenAI's suite of APIs to power our RAG pipelines today.

> NOTE: You can safely ignore any pip errors that occur during the running of these cells.

In [2]:
!pip install -qU llama-index openai anthropic




We'll be collecting our semantic data from Wikipedia - and so will need the [Wikipedia Reader](https://github.com/run-llama/llama_index/tree/main/llama-index-integrations/readers/llama-index-readers-wikipedia)!

In [3]:
!pip install -qU wikipedia llama-index-readers-wikipedia




Our vector database today will be powered by [Qdrant](https://qdrant.tech/) and so we'll need that package as well!

In [4]:
!pip install -qU llama-index-vector-stores-qdrant qdrant-client




Finally, we'll need to grab a few dependencies related to our quantitative data!

In [5]:
!pip install -q -U sqlalchemy pandas




We can use [Weights and Biases](https://docs.wandb.ai/guides/prompts) (WandB) as a visibility platform, as well as for storing our index!

In [6]:
!pip install -qU wandb llama-index-callbacks-wandb

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
grpcio-tools 1.64.1 requires protobuf<6.0dev,>=5.26.1, but you have protobuf 4.25.3 which is incompatible.



In [2]:
import os
from dotenv import load_dotenv

load_dotenv()

True

We'll also need to set a callback handler for WandB to ensure smooth operation of our traces!

In [3]:
import llama_index
from llama_index.core import set_global_handler

set_global_handler("wandb", run_args={"project": "llama-index-rag-v1"})
wandb_callback = llama_index.core.global_handler

ERROR:wandb.jupyter:Failed to detect the name of this notebook, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable to enable code saving.


wandb: Streaming LlamaIndex events to W&B at https://wandb.ai/lakshyaag/llama-index-rag-v1/runs/esuuk7ge
wandb: `WandbCallbackHandler` is currently in beta.
wandb: Please report any issues to https://github.com/wandb/wandb/issues with the tag `llamaindex`.


## Task 2: Set Env Variables and Set Up WandB Callback

Let's set our API keys for both OpenAI and WandB!

In [None]:
import os
import getpass

# os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key: ")
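If you'd rather not hard-code or re-enter keys, a small helper can prompt only when a variable is missing. This is a minimal sketch, not part of the original notebook: `ensure_env` and its `prompt_fn` parameter are hypothetical names introduced here for illustration.

```python
import getpass
import os


def ensure_env(name: str, prompt_fn=getpass.getpass) -> str:
    """Return os.environ[name], prompting for it only when it is missing or empty."""
    value = os.environ.get(name)
    if not value:
        # Nothing loaded from the environment or a .env file, so ask the user.
        value = prompt_fn(f"{name}: ")
        os.environ[name] = value
    return value
```

Calling `ensure_env("OPENAI_API_KEY")` in a cell would then reuse a key loaded by `load_dotenv()` and only fall back to the `getpass` prompt when nothing is set.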

### OPTIONAL ADVANCED PATH:

Instead of OpenAI - you could use Anthropic's new Claude model `Sonnet 3.5`!

Let's see how the flow might be different if you wanted to use the latest and greatest from Anthropic!

> NOTE: You will need an [API Key](https://www.anthropic.com/news/claude-3-5-sonnet) for `Sonnet 3.5` for the following cells to work!

In [9]:
# OPTIONAL ADVANCED PATH
!pip install -qU llama-index-llms-anthropic

In [17]:
# OPTIONAL ADVANCED PATH
os.environ["ANTHROPIC_API_KEY"] = getpass.getpass("Anthropic API Key: ")

Anthropic API Key: ··········


In [18]:
# OPTIONAL ADVANCED PATH
from llama_index.llms.anthropic import Anthropic
from llama_index.core import Settings

Settings.llm = Anthropic(model="claude-3-5-sonnet-20240620")

## Task 3: Settings

LlamaIndex lets us set global settings which we can use to influence the default behaviour of our components.

Let's set our LLM and our Embedding Model!

In [None]:
from llama_index.llms.openai import OpenAI
from llama_index.core import Settings

Settings.llm = OpenAI(model="gpt-4o")

In [None]:
from llama_index.embeddings.openai import OpenAIEmbedding

# `embed_model` is the attribute LlamaIndex reads for the global embedding model.
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")

## Task 4: `Index` Creation

In order for us to perform RAG in the traditional sense - we need an `Index`.

So what is an `Index`? Well - let's see how LlamaIndex defines it:

> In LlamaIndex terms, an `Index` is a data structure composed of Document objects, designed to enable querying by an LLM. Your Index is designed to be complementary to your querying strategy.

Okay, so we know that we have a boatload of Wikipedia content - and we know that we want to be able to query the `Index` and receive documents that are related to our query - so let's use an `Index` built on the idea of embedding-vectors.

Introducing: `VectorStoreIndex`!

Again, let's see how LlamaIndex defines this:

> A `VectorStoreIndex` is by far the most frequent type of `Index` you'll encounter. The Vector Store Index takes your Documents and splits them up into Nodes. It then creates `vector` embeddings of the text of every node, ready to be queried by an LLM.

Alright, that sounds awesome - let's make one!

### Data Collection

We're just going to be pulling information straight from Wikipedia using the built-in `WikipediaReader`.

> NOTE: Setting `auto_suggest=False` ensures we run into fewer auto-correct based errors.

### A note on terminology:

You'll notice that there are quite a few similarities between LangChain and LlamaIndex. LlamaIndex can largely be thought of as an extension to LangChain, in some ways - but they moved some of the language around. Let's spend a few moments disambiguating the language.

- `QueryEngine` -> `LCEL Chain`:
  -  `QueryEngine` is just LlamaIndex's way of indicating something is an LLM "chain" on top of a retrieval system
- `OpenAIAgent` vs. `Agent`:
  - The two agents have the same fundamental pattern: Decide which of a list of tools to use to answer a user's query.
  - `OpenAIAgent` (LlamaIndex's primary agent) does not need to rely on an agent executor because it leverages OpenAI's [function calling API](https://openai.com/blog/function-calling-and-other-api-updates), which allows the agent to interface "directly" with the tools instead of operating through an intermediary application process.

There is, however, a much larger terminological difference when it comes to discussing data.

##### Nodes vs. Documents

As you're aware from the previous weeks' assignments, there's an idea of `documents` in NLP, which refers to text objects that exist within a corpus.

LlamaIndex takes this a step further and reclassifies `documents` as `nodes`. Confusingly, it refers to the `Source Document` as simply `Documents`.

The `Document` -> `node` structure is, almost exactly, equivalent to the `Source Document` -> `Document` structure found in LangChain - but the new terminology comes with some clarity about different structure-indices.

We won't be leveraging those structured indices today, but we will be leveraging a "benefit" of the `node` structure that exists as a default in LlamaIndex, which is the ability to quickly filter nodes based on their metadata.

![image](https://i.imgur.com/B1QDjs5.png)
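To make the `Document` -> `node` relationship concrete, here is a plain-Python sketch. It does not use LlamaIndex's real classes: `Document`, `Node`, and `split_into_nodes` are simplified stand-ins, and the whitespace splitting is an assumption made for illustration rather than how LlamaIndex tokenizes.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Stand-in for a source document (e.g. one Wikipedia page)."""
    doc_id: str
    text: str


@dataclass
class Node:
    """Stand-in for one chunk of a document, carrying its own metadata."""
    text: str
    source_doc_id: str
    metadata: dict = field(default_factory=dict)


def split_into_nodes(doc: Document, title: str, chunk_size: int = 8) -> list[Node]:
    """Split on whitespace into chunks of at most `chunk_size` words."""
    words = doc.text.split()
    return [
        Node(
            text=" ".join(words[i : i + chunk_size]),
            source_doc_id=doc.doc_id,
            metadata={"title": title},  # every chunk remembers its movie
        )
        for i in range(0, len(words), chunk_size)
    ]


doc = Document(
    doc_id="demo-1",  # hypothetical id
    text="Dune is a 2021 American epic science fiction film directed by Denis Villeneuve",
)
nodes = split_into_nodes(doc, title="Dune (2021 film)", chunk_size=5)
for node in nodes:
    print(node.metadata["title"], "|", node.text)
```

Each node keeps a pointer back to its source document plus a `metadata` dict, which is what makes the title-based filtering later in this notebook possible.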

In [28]:
import httpx
from llama_index.readers.wikipedia import WikipediaReader

wiki_reader = WikipediaReader()

movie_list = [
    "Dune (2021 film)",
    "Dune: Part Two",
    "The Lord of the Rings: The Fellowship of the Ring",
]

In [29]:
wiki_docs = wiki_reader.load_data(
    pages=movie_list, auto_suggest=False, lang_prefix="en"
)

ConnectTimeout: HTTPSConnectionPool(host='en.wikipedia.org', port=443): Max retries exceeded with url: /w/api.php?prop=info%7Cpageprops&inprop=url&ppprop=disambiguation&redirects=&titles=Dune%3A+Part+Two&format=json&action=query (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x000001714D7B4A10>, 'Connection to en.wikipedia.org timed out. (connect timeout=None)'))

In [16]:
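# Use a separate name so the original `movie_list` is not overwritten;
# the node-construction loop later zips `movie_list` with `wiki_docs`.
extra_movies = [
    "The Lord of the Rings: The Two Towers",
]

wiki_docs.extend(wiki_reader.load_data(pages=extra_movies, auto_suggest=False))
movie_list.extend(extra_movies)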


ConnectTimeout: HTTPConnectionPool(host='en.wikipedia.org', port=80): Max retries exceeded with url: /w/api.php?prop=extracts%7Crevisions&explaintext=&rvprop=ids&titles=The+Lord+of+the+Rings%3A+The+Two+Towers&format=json&action=query (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x000001714AFA3A90>, 'Connection to en.wikipedia.org timed out. (connect timeout=None)'))

In [30]:
wiki_docs

[Document(id_='52659577', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text='Dune (titled onscreen as Dune: Part One) is a 2021 American epic science fiction film directed and co-produced by Denis Villeneuve, who co-wrote the screenplay with Jon Spaihts, and Eric Roth. It is the first of a two-part adaptation of the 1965 novel of the same name by Frank Herbert. Set in the distant future, the film follows Paul Atreides as his family, the noble House Atreides, is thrust into a war for the deadly and inhospitable desert planet Arrakis. The ensemble cast includes Timothée Chalamet, Rebecca Ferguson, Oscar Isaac, Josh Brolin, Stellan Skarsgård, Dave Bautista, Stephen McKinley Henderson, Zendaya, Chang Chen, Sharon Duncan-Brewster, Charlotte Rampling, Jason Momoa, and Javier Bardem.\nThe film is the third adaptation of Dune, following David Lynch\'s 1984 film and John Harrison\'s 2000 television miniseries. After an unsuccessf

### Initializing our `VectorStoreIndex` with Qdrant

Qdrant is an open-source vector database that can be hosted locally.

It offers powerful features like metadata filtering out of the box, and will suit our needs well today!

We'll start by creating our local `:memory:` client (in-memory and not meant for production use-cases) and our collection.

In [None]:
from llama_index.vector_stores.qdrant import QdrantVectorStore
from qdrant_client import QdrantClient, models

client = QdrantClient(location=":memory:")

client.create_collection(
    collection_name="movie_wikis",
    vectors_config=models.VectorParams(size=1536, distance=models.Distance.COSINE)
)

True
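The collection above is configured with `size=1536`, which matches the output dimension of `text-embedding-3-small`, and with cosine distance. As a rough illustration of what cosine-based ranking means, here is a plain-Python sketch on toy 3-dimensional vectors. The vectors and the `cosine_similarity` helper are made up for the example; Qdrant's own implementation is not shown here.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


query = [1.0, 0.0, 1.0]
stored = {"a": [2.0, 0.0, 2.0], "b": [0.0, 1.0, 0.0], "c": [1.0, 1.0, 0.0]}

# Rank stored vectors from most to least similar to the query.
ranked = sorted(stored, key=lambda k: cosine_similarity(query, stored[k]), reverse=True)
print(ranked)
```

Note that `a` points in the same direction as the query even though it is twice as long, so it ranks first: cosine similarity ignores vector magnitude.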

Then we'll create our `VectorStore` and `StorageContext` which will allow us to create an empty `VectorStoreIndex` which we will be able to add nodes to later!

In [None]:
from llama_index.core import VectorStoreIndex
from llama_index.core import StorageContext

vector_store = QdrantVectorStore(client=client, collection_name="movie_wikis")

storage_context = StorageContext.from_defaults(vector_store=vector_store)

index = VectorStoreIndex.from_documents(
    [],
    storage_context=storage_context,
)

wandb: Logged trace tree to W&B.


### Node Construction

Now we will loop through our documents and metadata and construct nodes.

We'll make sure to explicitly associate our nodes with their respective movie so we can filter by the movie title in the upcoming cells.

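You might be thinking to yourself - wait, we never indicated which embedding model this should use - but remember, we set the embedding model globally through `Settings` in Task 3, so the index picks it up by default.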

In [None]:
from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.node_parser import TokenTextSplitter

# Split each document into token-based chunks (nodes).
pipeline = IngestionPipeline(transformations=[TokenTextSplitter()])

for movie, wiki_doc in zip(movie_list, wiki_docs):
    nodes = pipeline.run(documents=[wiki_doc])
    # Tag every node with its movie title so we can filter on it later.
    for node in nodes:
        node.metadata = {"title": movie}
    index.insert_nodes(nodes)

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
wandb: Logged trace tree to W&B.
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
wandb: Logged trace tree to W&B.
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
wandb: Logged trace tree to W&B.


#### ❓ Question #1:

What `metadata` fields will the nodes in our index have?

> You will need to write code to find this information

#### Answer #1:

The nodes in the index have metadata with:
- `title`: The title of the Wikipedia page

In [34]:
### YOUR CODE HERE
index._vector_store.get_nodes()

[TextNode(id_='017530d6-c511-40f4-bcbf-955572f094a8', embedding=None, metadata={'title': 'Dune (2021 film)'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='52659577', node_type=<ObjectType.DOCUMENT: '4'>, metadata={}, hash='30576c4b0d2c665aacf99fca36da60c7a55b87e5abd30680640ce53ab573cb2e'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='3e3cbc7b-1918-4833-8cac-669e31cd1ce9', node_type=<ObjectType.TEXT: '1'>, metadata={}, hash='de0120a18a0b5f62d7096302fb0345f5b275c61deb082e7f6d66bd249ce3ea49'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='916b53a3-18e7-49a8-a106-db9c866f6152', node_type=<ObjectType.TEXT: '1'>, metadata={}, hash='0da21008b2fa8077a2f3a38931473bdb8145963d3afcdd3dd95eff2105007bbe')}, text='it. Following its premiere at the Venice Film Festival in September 2021, early reception was generally positive, receiving praise for its ambition, story, scope, worldbuilding,

### Persisting and Loading Stored Index with Weights and Biases

Now we can utilize a powerful feature of Weights and Biases - index and artifact versioning!

We can persist our index to WandB to be used and loaded later!

In [None]:
wandb_callback.persist_index(index, index_name="movie-index-qdrant")

wandb: Adding directory to artifact (e:\Projects\AI-Maven\AIE3\Week 5\Day 1\wandb\run-20240625_195142-esuuk7ge\files\storage)... Done. 0.0s


Now we can load our index from WandB, which is a truly powerful tool!

In [None]:
from llama_index.core import load_index_from_storage

storage_context = wandb_callback.load_storage_context(
    # Replace with your own `<username>/<project>/<artifact>:<version>` path.
    artifact_url="lakshyaag/llama-index-rag-v1/movie-index-qdrant:v0"
)

wandb:   4 of 4 files downloaded.  


#### ❓ Question #2:

Provide a screenshot of your index version history as shown in WandB.

You can find your screenshot by doing the following:

![image](https://i.imgur.com/Y0AHkQI.png)


#### Answer #2:

![Screenshot of W&B](./w&b_ss.png)

## Task 5: Simple RAG - QueryEngine

Now that we've created our `VectorStoreIndex`, powered by a Qdrant vector store, we can wrap it in a simple `QueryEngine` using the `as_query_engine()` method - which will connect a few things together for us:

In [None]:
simple_rag = index.as_query_engine()

Before we test this out - let's see what we can find out about our new `QueryEngine`!

In [None]:
for k, v in simple_rag.get_prompts().items():
    print(v.get_template())
    print("\n~~~~~~~~~~~~~~~~~~\n")

Context information is below.
---------------------
{context_str}
---------------------
Given the context information and not prior knowledge, answer the query.
Query: {query_str}
Answer: 

~~~~~~~~~~~~~~~~~~

The original query is as follows: {query_str}
We have provided an existing answer: {existing_answer}
We have the opportunity to refine the existing answer (only if needed) with some more context below.
------------
{context_msg}
------------
Given the new context, refine the original answer to better answer the query. If the context isn't useful, return the original answer.
Refined Answer: 

~~~~~~~~~~~~~~~~~~
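To see how the first template gets used, here is a minimal sketch that fills its two placeholders with plain `str.format`. The template text is copied from the output above; `build_qa_prompt` is a hypothetical helper, and joining chunks with a blank line is an assumption rather than LlamaIndex's exact behaviour.

```python
QA_TEMPLATE = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, answer the query.\n"
    "Query: {query_str}\n"
    "Answer: "
)


def build_qa_prompt(query: str, chunks: list[str]) -> str:
    """Stuff the retrieved chunks and the user query into the QA template."""
    return QA_TEMPLATE.format(context_str="\n\n".join(chunks), query_str=query)


prompt = build_qa_prompt(
    "Who directed Dune?",
    ["Dune is a 2021 film directed by Denis Villeneuve."],
)
print(prompt)
```

The second template follows the same pattern with `{existing_answer}` and `{context_msg}`: it lets an earlier answer be refined when more retrieved context is available.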



Let's see how it does!

In [None]:
response = simple_rag.query("Who is the evil Wizard in the story?")

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
wandb: Logged trace tree to W&B.


In [None]:
response.response

'The evil Wizard in the story is Saruman.'


That makes sense!

Let's ask a question that's slightly more...ambiguous.

In [None]:
response = simple_rag.query("Who are the giant beings that roam across the world?")

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
wandb: Logged trace tree to W&B.


In [None]:
response.response

'The giant beings that roam across the world are the sandworms.'


We can check the source nodes to see which movies we retrieved.

In [None]:
print([x.metadata["title"] for x in response.source_nodes])

['Dune (2021 film)', 'The Lord of the Rings: The Fellowship of the Ring']


Okay, so in this case - we've gone with "Sandworms" from Dune.

But there are also the Ents from Lord of the Rings, and it looks like we got documents from Lord of the Rings as well.

Let's see if there's a way we can use the title metadata we added to filter the results we get!

## Task 6: Auto Retriever Functional Tool

This tool will leverage OpenAI's function calling endpoint to select the correct metadata filter and query the filtered index - only looking at nodes with the desired metadata.

A simplified diagram: ![image](https://i.imgur.com/AICDPav.png)

First, we need to create our `VectorStoreInfo` object which will hold all the relevant metadata we need for each component (in this case title metadata).

Notice that the valid titles are spelled out as a list inside the `description` text, so the LLM knows which filter values it can choose from.

In [None]:
from llama_index.core.tools import FunctionTool
from llama_index.core.vector_stores.types import (
    VectorStoreInfo,
    MetadataInfo,
    ExactMatchFilter,
    MetadataFilters,
)
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.query_engine import RetrieverQueryEngine

from typing import List, Tuple, Any
from pydantic import BaseModel, Field

top_k = 3

vector_store_info = VectorStoreInfo(
    content_info="semantic information about movies",
    metadata_info=[
        MetadataInfo(
            name="title",
            type="str",
            description='title of the movie, one of ["Dune (2021 film)", "Dune: Part Two", "The Lord of the Rings: The Fellowship of the Ring", "The Lord of the Rings: The Two Towers"]',
        )
    ],
)

Now we'll create our base Pydantic object that we can use to ensure compatibility with our application layer. This verifies that the response from the OpenAI endpoint conforms to this schema.

In [None]:
class AutoRetrieveModel(BaseModel):
    query: str = Field(..., description="natural language query string")
    filter_key_list: List[str] = Field(
        ..., description="List of metadata filter field names"
    )
    filter_value_list: List[str] = Field(
        ...,
        description=(
            "List of metadata filter field values (corresponding to names specified in filter_key_list)"
        ),
    )

Now we can build the function that the agent will call through the function calling endpoint.

In [None]:
def auto_retrieve_fn(
    query: str, filter_key_list: List[str], filter_value_list: List[str]
):
    """Auto retrieval function.

    Performs auto-retrieval from a vector database, and then applies a set of filters.

    """
    query = query or "Query"

    exact_match_filters = [
        ExactMatchFilter(key=k, value=v)
        for k, v in zip(filter_key_list, filter_value_list)
    ]
    retriever = VectorIndexRetriever(
        index,
        filters=MetadataFilters(filters=exact_match_filters),
        similarity_top_k=top_k,  # `VectorIndexRetriever` takes `similarity_top_k`, not `top_k`
    )
    query_engine = RetrieverQueryEngine.from_args(retriever)

    response = query_engine.query(query)
    return str(response)
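Conceptually, the function above does two things: it keeps only the nodes whose metadata matches every key/value pair, and then it returns the best-scoring `top_k` of those. Here is a plain-Python sketch of that idea on toy data. The nodes, the scores, and the `exact_match_retrieve` helper are all made up for illustration, and the real filtering happens inside the vector store rather than in Python.

```python
def exact_match_retrieve(nodes, scores, filter_keys, filter_values, top_k=3):
    """Keep nodes matching every (key, value) pair, then return the top_k by score."""
    if len(filter_keys) != len(filter_values):
        # `zip` would silently drop the unmatched tail, so fail loudly instead.
        raise ValueError("filter_keys and filter_values must have the same length")
    pairs = list(zip(filter_keys, filter_values))
    candidates = [
        (score, node)
        for node, score in zip(nodes, scores)
        if all(node["metadata"].get(key) == value for key, value in pairs)
    ]
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    return [node for _, node in candidates[:top_k]]


toy_nodes = [
    {"text": "Sandworms roam Arrakis.", "metadata": {"title": "Dune (2021 film)"}},
    {"text": "Ents guard Fangorn Forest.", "metadata": {"title": "The Lord of the Rings: The Two Towers"}},
    {"text": "Paul Atreides arrives on Arrakis.", "metadata": {"title": "Dune (2021 film)"}},
]
toy_scores = [0.91, 0.88, 0.42]  # hypothetical similarity scores

hits = exact_match_retrieve(toy_nodes, toy_scores, ["title"], ["Dune (2021 film)"], top_k=1)
print([hit["text"] for hit in hits])
```

The length check is worth noting: `auto_retrieve_fn` pairs keys and values with `zip`, so if the LLM ever sends lists of different lengths, the extra entries are quietly ignored.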

Now we need to wrap our system in a tool in order to integrate it into the larger application.

Source Code Here:
- [`FunctionTool`](https://github.com/jerryjliu/llama_index/blob/d24767b0812ac56104497d8f59095eccbe9f2b08/llama_index/tools/function_tool.py#L21)

In [None]:
description = f"""\
Use this tool to look up non-review based information about films.
The vector database schema is given below:
{vector_store_info.json()}
"""

auto_retrieve_tool = FunctionTool.from_defaults(
    fn=auto_retrieve_fn,
    name="semantic-film-info",
    description=description,
    fn_schema=AutoRetrieveModel,
)

#### ❓ Question #3:

Is the text in the description of our `FunctionTool` important or not? Please explain your answer.

#### Answer #3:

Yes, the text in the description is important as it gets sent to the LLM when information about the tool and its parameters are sent. Essentially, it helps the LLM understand what the tool does and its purpose, to better inform the LLM's decision-making process.
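To make that concrete, here is roughly what a tool definition looks like in the `tools` format of OpenAI's chat completions API. This is a hand-written sketch: the exact payload LlamaIndex generates from `FunctionTool` and `AutoRetrieveModel` may differ in detail, and the description is shortened here.

```python
import json

# JSON Schema mirroring the fields of `AutoRetrieveModel`.
parameters = {
    "type": "object",
    "properties": {
        "query": {"type": "string", "description": "natural language query string"},
        "filter_key_list": {
            "type": "array",
            "items": {"type": "string"},
            "description": "List of metadata filter field names",
        },
        "filter_value_list": {
            "type": "array",
            "items": {"type": "string"},
            "description": "List of metadata filter field values",
        },
    },
    "required": ["query", "filter_key_list", "filter_value_list"],
}

tool_spec = {
    "type": "function",
    "function": {
        "name": "semantic-film-info",
        # The description is the only prose the model sees about this tool.
        "description": "Use this tool to look up non-review based information about films.",
        "parameters": parameters,
    },
}

print(json.dumps(tool_spec, indent=2))
```

The model never sees the Python source of `auto_retrieve_fn` - only the name, the description, and the parameter schema - which is why the wording of the description matters so much.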

All that's left to do is attach the tool to an agent (here built from a `FunctionCallingAgentWorker`) and let it rip!

Source Code Here:
- [`OpenAIAgent`](https://github.com/jerryjliu/llama_index/blob/d24767b0812ac56104497d8f59095eccbe9f2b08/llama_index/agent/openai_agent.py#L361)

In [None]:
from llama_index.core.agent import FunctionCallingAgentWorker

agent_worker = FunctionCallingAgentWorker.from_tools(
    tools=[auto_retrieve_tool],
    verbose=True,
)

agent = agent_worker.as_agent()
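Under the hood, an agent step boils down to "pick a tool, pick its arguments, call it". The sketch below shows that pattern in plain Python. `choose_tool` is a hard-coded stub standing in for the LLM's decision, and `semantic_film_info` is a toy replacement for our real tool, so none of this reflects LlamaIndex's internal implementation.

```python
from typing import Callable


def semantic_film_info(query: str, title: str) -> str:
    """Toy tool: a stand-in for the filtered query engine."""
    return f"[{title}] answer to: {query}"


# Registry of tools the agent is allowed to call, keyed by tool name.
TOOLS: dict[str, Callable[..., str]] = {"semantic-film-info": semantic_film_info}


def choose_tool(message: str) -> tuple[str, dict]:
    """Stub for the LLM's decision; a real agent gets this from the model."""
    title = "Dune (2021 film)" if "2021" in message else "Dune: Part Two"
    return "semantic-film-info", {"query": message, "title": title}


def run_agent_step(message: str) -> str:
    """Select a tool, call it with the chosen arguments, and return its output."""
    tool_name, kwargs = choose_tool(message)
    tool = TOOLS[tool_name]
    return tool(**kwargs)


print(run_agent_step("Who starred in the 2021 film?"))
```

A real agent would pass the tool output back to the LLM so it can write the final answer, and it may loop through several tool calls before responding.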

In [None]:
response = agent.chat("Who starred in the 2021 film?")

Added user message to memory: Who starred in the 2021 film?
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling function: semantic-film-info with args: {"query": "Who starred in the 2021 film?", "filter_key_list": ["title"], "filter_value_list": ["Dune (2021 film)"]}
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
=== Function Output ===
The 2021 film "Dune" starred Timothée Chalamet, Rebecca Ferguson, Dave Bautista, Stellan Skarsgård, Charlotte Rampling, Oscar Isaac, Zendaya, Javier Bardem, Josh Brolin, Jason Momoa, David Dastmalchian, Stephen McKinley Hender

wandb: Logged trace tree to W&B.


In [None]:
response = agent.chat("Who are the giant beings that roam across the world in the movies?")

Added user message to memory: Who are the giant beings that roam across the world in the movies?
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling function: semantic-film-info with args: {"query": "giant beings that roam across the world", "filter_key_list": ["title"], "filter_value_list": ["Dune (2021 film)"]}
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
=== Function Output ===
The giant beings that roam across the world are the sandworms.
=== Calling Function ===
Calling function: semantic-film-info with args: {"query": "giant beings that roam across the world", "filter_key_list": ["title"], "filter_value_list": ["The Lord of the Rings: The Fellows

wandb: Logged trace tree to W&B.


# 🤝 Breakout Room #2

## Quantitative RAG Pipeline with NL2SQL Tooling

We'll walk through the steps of creating a natural language to SQL system in the following section.

> NOTICE: This does not have parsing on the inputs or intermediary calls to ensure that users are using safe SQL queries. Use this with caution in a production environment without adding specific guardrails from either side of the application.
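As one example of the kind of guardrail that notice is asking for, here is a deliberately conservative check that only lets a single `SELECT` statement through. This is a sketch and not a complete defence: `is_read_only_select` is a hypothetical helper, it rejects some legitimate queries (for example CTEs starting with `WITH`, or string literals containing `;`), and a read-only database connection is a stronger safeguard.

```python
import re

# Keywords that should never appear in a read-only query (deliberately conservative).
_WRITE_KEYWORDS = re.compile(
    r"\b(insert|update|delete|drop|alter|create|replace|attach|detach|pragma)\b",
    re.IGNORECASE,
)


def is_read_only_select(sql: str) -> bool:
    """Return True only for a single statement that starts with SELECT."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        return False  # more than one statement
    if not re.match(r"select\b", statement, re.IGNORECASE):
        return False
    return _WRITE_KEYWORDS.search(statement) is None


print(is_read_only_select("SELECT COUNT(*) FROM demo_reviews;"))
print(is_read_only_select("SELECT 1; DELETE FROM demo_reviews"))
```

A check like this would sit between the LLM-generated SQL and the database, refusing to execute anything it does not recognise as a plain read.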

The next few steps should be largely straightforward. We'll want to:

1. Read our `.csv` files into `pd.DataFrame` objects
2. Create an in-memory `sqlite` powered `sqlalchemy` engine
3. Write our `pd.DataFrame` objects to the SQL engine as tables
4. Create an `SQLDatabase` object through LlamaIndex
5. Use that to create a `QueryEngineTool` that we can interact with through the `NLSQLTableQueryEngine`!

If you get stuck, please consult the documentation.
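The first three steps can be sketched using only the standard library, swapping in `csv` and `sqlite3` for `pandas` and `sqlalchemy` so the example runs anywhere. The CSV content, the `demo_reviews` table, and the column names below are invented for illustration - the real review files may use different columns.

```python
import csv
import io
import sqlite3

# Hypothetical stand-in for a downloaded review CSV; real column names may differ.
CSV_TEXT = """Review_Title,Rating
Great film,9
Too long,6
Stunning visuals,10
"""

# Step 1: read the CSV into rows (the notebook uses pd.read_csv for this).
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# Step 2: create an in-memory SQLite database.
conn = sqlite3.connect(":memory:")

# Step 3: write the rows to a table (the notebook uses DataFrame.to_sql for this).
conn.execute("CREATE TABLE demo_reviews (Review_Title TEXT, Rating INTEGER)")
conn.executemany(
    "INSERT INTO demo_reviews (Review_Title, Rating) VALUES (?, ?)",
    [(row["Review_Title"], int(row["Rating"])) for row in rows],
)

# The kind of SQL an NL2SQL engine might generate for
# "What is the average rating?"
avg_rating = conn.execute("SELECT AVG(Rating) FROM demo_reviews").fetchone()[0]
print(avg_rating)
```

Steps 4 and 5 then hand the populated database to LlamaIndex, which takes care of turning a natural language question into a query like the one above.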

In [None]:
!curl -o dune1.csv https://raw.githubusercontent.com/AI-Maker-Space/DataRepository/main/dune1.csv


In [None]:
!curl -o dune2.csv https://raw.githubusercontent.com/AI-Maker-Space/DataRepository/main/dune2.csv


In [None]:
!curl -o lotr_fotr.csv https://raw.githubusercontent.com/AI-Maker-Space/DataRepository/main/lotr_fotr.csv


In [None]:
!curl -o lotr_tt.csv https://raw.githubusercontent.com/AI-Maker-Space/DataRepository/main/lotr_tt.csv


#### Read `.csv` Into Pandas

In [None]:
import pandas as pd

dune1 = pd.read_csv("./dune1.csv")
dune2 = pd.read_csv("./dune2.csv")
lotr_fotr = pd.read_csv("./lotr_fotr.csv")
lotr_tt = pd.read_csv("./lotr_tt.csv")

#### Create SQLAlchemy engine with SQLite

In [None]:
from sqlalchemy import create_engine

engine = create_engine("sqlite+pysqlite:///:memory:")

#### Convert `pd.DataFrame` to SQL tables

In [None]:
dune1.to_sql(
  "Dune (2021 film)",
  engine
)

274

In [None]:
dune2.to_sql(
  "Dune: Part Two",
  engine
)

175

In [None]:
lotr_fotr.to_sql(
  "The Lord of the Rings: The Fellowship of the Ring",
  engine
)

250

In [None]:
lotr_tt.to_sql(
  "The Lord of the Rings: The Two Towers",
  engine
)

149
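The table names above are full movie titles, so they contain spaces, colons, and parentheses. Any SQL that references them has to quote the identifier, and the agent traces later in this notebook show the model quoting these names in more than one style. The sketch below uses only the standard library (`csv` and `sqlite3`, rather than pandas and SQLAlchemy) to show the standard double-quoted form. The sample rows and the `quote_identifier` and `load_reviews` helpers are hypothetical and exist only for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical sample rows standing in for one of the review CSVs.
SAMPLE_CSV = """Author,Rating,Review_Title
alice,10.0,Epic
bob,9.0,Great sequel
carol,8.0,Slow in places
"""


def quote_identifier(name: str) -> str:
    """Quote a table or column name the standard SQL way (double quotes)."""
    return '"' + name.replace('"', '""') + '"'


def load_reviews(conn: sqlite3.Connection, table: str, csv_text: str) -> int:
    """Create `table` from CSV text and return the number of rows inserted."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    conn.execute(
        f"CREATE TABLE {quote_identifier(table)} "
        "(Author TEXT, Rating REAL, Review_Title TEXT)"
    )
    conn.executemany(
        f"INSERT INTO {quote_identifier(table)} VALUES (?, ?, ?)",
        [(r["Author"], float(r["Rating"]), r["Review_Title"]) for r in rows],
    )
    return len(rows)


conn = sqlite3.connect(":memory:")
table_name = "The Lord of the Rings: The Two Towers"
inserted = load_reviews(conn, table_name, SAMPLE_CSV)

# The table name must be quoted because it contains spaces and a colon.
(average_rating,) = conn.execute(
    f"SELECT AVG(Rating) FROM {quote_identifier(table_name)}"
).fetchone()
print(inserted, average_rating)
```

Doubling any embedded double quote inside `quote_identifier` keeps the quoting safe even if a title itself contains a quote character.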

#### Construct a `SQLDatabase` index

Source Code Here:
- [`SQLDatabase`](https://github.com/jerryjliu/llama_index/blob/d24767b0812ac56104497d8f59095eccbe9f2b08/llama_index/langchain_helpers/sql_wrapper.py#L9)

In [None]:
from llama_index.core import SQLDatabase

sql_database = SQLDatabase(engine=engine, include_tables=movie_list)

#### Create the NLSQLTableQueryEngine interface for all added SQL tables

Source Code Here:
- [`NLSQLTableQueryEngine`](https://github.com/jerryjliu/llama_index/blob/d24767b0812ac56104497d8f59095eccbe9f2b08/llama_index/indices/struct_store/sql_query.py#L75C1-L75C1)

In [None]:
from llama_index.core.indices.struct_store.sql_query import NLSQLTableQueryEngine

sql_query_engine = NLSQLTableQueryEngine(
    sql_database=sql_database,
    tables=movie_list,
)

#### Wrap It All Up in a `QueryEngineTool`

You'll want to ensure you have a descriptive...description!

This is what will help the LLM decide which table to use when querying!

Source Code Here:

- [`QueryEngineTool`](https://github.com/jerryjliu/llama_index/blob/d24767b0812ac56104497d8f59095eccbe9f2b08/llama_index/tools/query_engine.py#L13)

#### 🏗️ Activity #1:

Please write a Natural Language Description for the tables that we are using today.

Here is an example:

```
This tool should be used to answer any and all review-related inquiries by translating a natural language query into a SQL query with access to tables:
'Dune (2021 film)' - containing info. about the first movie in the Dune series,
'Dune: Part Two' - containing info. about the second movie in the Dune series,
'The Lord of the Rings: The Fellowship of the Ring' - containing info. about the first movie in the Lord of the Rings series,
'The Lord of the Rings: The Two Towers' - containing info. about the second movie in the Lord of the Rings series
```

In [72]:
DESCRIPTION = """\
Use this tool to answer any and all review-related inquiries by translating a natural language query into a SQL query with access to the tables listed below.

Every table shares the same columns:
- Review_Date: Date of the review
- Review_Title: Title of the review
- Review: Review text
- Rating: Rating given by the reviewer
- Author: Author of the review
- Review_Url: URL of the review

The 4 tables are:
- 'Dune (2021 film)' - containing info. about the first movie in the Dune series,
- 'Dune: Part Two' - containing info. about the second movie in the Dune series,
- 'The Lord of the Rings: The Fellowship of the Ring' - containing info. about the first movie in the Lord of the Rings series,
- 'The Lord of the Rings: The Two Towers' - containing info. about the second movie in the Lord of the Rings series.
"""


In [None]:
from llama_index.core.tools.query_engine import QueryEngineTool

sql_tool = QueryEngineTool.from_defaults(
    query_engine=sql_query_engine,
    name="sql-query",
    description=DESCRIPTION,
)
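Since the description is what the LLM reads when deciding whether and how to call the tool, it can help to generate it from the same table and column names that were written to the database, so the two cannot drift apart. The following is a minimal sketch under that assumption. `build_tool_description` is a hypothetical helper, and the table summaries and column meanings passed to it are illustrative.

```python
# Hypothetical helper: assemble a tool description from table and column
# metadata so the text always matches the names used in the database.
def build_tool_description(tables: dict[str, str], columns: dict[str, str]) -> str:
    lines = [
        "Use this tool to answer review-related questions by translating "
        "a natural language query into a SQL query.",
        "",
        "Every table shares these columns:",
    ]
    lines += [f"- {name}: {meaning}" for name, meaning in columns.items()]
    lines += ["", f"The {len(tables)} tables are:"]
    lines += [f"- '{name}' - {summary}" for name, summary in tables.items()]
    return "\n".join(lines)


description = build_tool_description(
    tables={
        "Dune (2021 film)": "reviews of the first Dune film",
        "Dune: Part Two": "reviews of the second Dune film",
    },
    columns={
        "Rating": "numeric rating given by the reviewer",
        "Review": "full text of the review",
    },
)
print(description)
```

The resulting string could be passed as the `description` argument in place of a hand-written one.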

In [None]:
agent_worker = FunctionCallingAgentWorker.from_tools(
    tools=[sql_tool],
    verbose=True
)

agent = agent_worker.as_agent()

In [None]:
response = agent.chat("What is the average rating of the 2nd Lord of the Rings movie?")

Added user message to memory: What is the average rating of the 2nd Lord of the Rings movie?
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling function: sql-query with args: {"input": "SELECT AVG(Rating) AS Average_Rating FROM 'The Lord of the Rings: The Two Towers'"}
INFO:llama_index.core.indices.struct_store.sql_retriever:> Table desc str: Table 'Dune (2021 film)' has columns: index (BIGINT), Unnamed: 0 (BIGINT), Review_Date (TEXT), Author (TEXT), Rating (FLOAT), Review_Title (TEXT), Review (TEXT), Review_Url (TEXT), and foreign keys: .

Table 'Dune: Part Two' has columns: index (BIGINT), Unnamed: 0 (BIGINT), Review_Date (TEXT), Author (TEXT), Rating (FLOAT), Review_Title (TEXT), Review (TEXT), Review_Url (TEXT), and foreign keys: .

Table 'The Lord of the Rings: The Fellowship of the Ring' has columns: index (BIGINT), Unnamed: 0 (B

wandb: Logged trace tree to W&B.


In [None]:
print(str(response))

The average rating for "The Lord of the Rings: The Two Towers" is approximately 9.87.


In [None]:
response = agent.chat("What movie series has better reviews, Lord of the Rings or Dune?")

Added user message to memory: What movie series has better reviews, Lord of the Rings or Dune?
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 500 Internal Server Error"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 500 Internal Server Error"
INFO:openai._base_client:Retrying request to /chat/completions in 0.969333 seconds
Retrying request to /chat/completions in 0.969333 seconds
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling function: sql-query with args: {"input": "SELECT AVG(Rating) AS Average_Rating FROM 'Dune (2021 film)'"}
INFO:llama_index.core.indices.struct_store.sql_retriever:> Table desc str: Table 'Dune (2021 film)' has columns: index (BIGINT), Unnamed: 0 (BIGINT), Review_Date (TEXT), Author (TEXT), Rating (FLOAT), Review_Title (TEXT), Review (TEXT), Review_Ur

wandb: Logged trace tree to W&B.


In [None]:
print(str(response))

Here are the average ratings for each movie:

- **Dune (2021 film)**: 8.34
- **Dune: Part Two**: 8.71
- **The Lord of the Rings: The Fellowship of the Ring**: 9.87
- **The Lord of the Rings: The Two Towers**: 9.87

Based on these average ratings, the **Lord of the Rings** series has better reviews compared to the **Dune** series.


### Task 2: Combined RAG Pipeline

Now, we can simply add both of our tools to a single `FunctionCallingAgentWorker`, and off we go!

In [None]:
combined_tool_agent_worker = FunctionCallingAgentWorker.from_tools(
    tools=[auto_retrieve_tool, sql_tool],
    verbose=True
)

combined_tool_agent = combined_tool_agent_worker.as_agent()

In [None]:
response = combined_tool_agent.chat("Which movie is about a ring, and what is the average rating of the movie?")

Added user message to memory: Which movie is about a ring, and what is the average rating of the movie?
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling function: semantic-film-info with args: {"query": "movie about a ring", "filter_key_list": ["title"], "filter_value_list": ["The Lord of the Rings: The Fellowship of the Ring", "The Lord of the Rings: The Two Towers"]}
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
=== Function Output ===
The movie is "The Lord of the Rings: The Fellowship of the Ring." It tells the story of the Dark Lord Sauron, who seeks

wandb: Logged trace tree to W&B.


In [None]:
print(str(response))

The movie about a ring is **"The Lord of the Rings: The Fellowship of the Ring."** It tells the story of the Dark Lord Sauron, who seeks the One Ring to return to power. The Ring has found its way to a young hobbit named Frodo Baggins, who, along with eight companions, embarks on a perilous journey to Mount Doom in Mordor to destroy it.

The average rating for **"The Lord of the Rings: The Fellowship of the Ring"** is approximately **9.87**.


In [None]:
response = combined_tool_agent.chat("What worlds do the LoTR, and Dune movies take place in?")

Added user message to memory: What worlds do the LoTR, and Dune movies take place in?
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling function: semantic-film-info with args: {"query": "world where the movie takes place", "filter_key_list": ["title"], "filter_value_list": ["The Lord of the Rings: The Fellowship of the Ring", "The Lord of the Rings: The Two Towers"]}
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
=== Function Output ===
The movie takes place in the fictional world of Middle-earth.
=== Calling Function ===
Calling function: semantic-film-inf

wandb: Logged trace tree to W&B.


In [None]:
print(str(response))

- The **"The Lord of the Rings"** movies take place in the fictional world of **Middle-earth**.

- The **"Dune"** movies take place in a distant future where humanity has spread across the universe and settled on various planets. The story primarily revolves around the desert planet **Arrakis**, also known as **Dune**, which is the only source of the valuable substance known as "spice." Other significant planets include **Caladan**, the oceanic home of House Atreides, and **Giedi Prime**, the industrial home of House Harkonnen. The universe is characterized by its complex political, social, and ecological systems.


In [None]:
response = combined_tool_agent.chat("Which of the following movie series is considered the 'best': Dune, or Lord of the Rings? Base your answer on both reviews, and non-review information.")

Added user message to memory: Which of the following movie series is considered the 'best': Dune, or Lord of the Rings? Base your answer on both reviews, and non-review information.
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling function: sql-query with args: {"input": "SELECT AVG(Rating) as AverageRating FROM \"Dune (2021 film)\""}
INFO:llama_index.core.indices.struct_store.sql_retriever:> Table desc str: Table 'Dune (2021 film)' has columns: index (BIGINT), Unnamed: 0 (BIGINT), Review_Date (TEXT), Author (TEXT), Rating (FLOAT), Review_Title (TEXT), Review (TEXT), Review_Url (TEXT), and foreign keys: .

Table 'Dune: Part Two' has columns: index (BIGINT), Unnamed: 0 (BIGINT), Review_Date (TEXT), Author (TEXT), Rating (FLOAT), Review_Title (TEXT), Review (TEXT), Review_Url (TEXT), and foreign keys: .

Table 'The Lord of the Rings: T

wandb: Logged trace tree to W&B.


In [None]:
print(str(response))

Based on both reviews and non-review information:

### Reviews:
- **Dune Series:**
  - **Dune (2021)**: Average rating of approximately 8.34.
  - **Dune: Part Two**: Average rating of approximately 8.71.

- **Lord of the Rings Series:**
  - **The Lord of the Rings: The Fellowship of the Ring**: Average rating of approximately 9.87.
  - **The Lord of the Rings: The Two Towers**: (Rating not explicitly provided, but the series is highly acclaimed).

### Non-Review Information:
- **Dune Series:**
  - "Dune" (2021) received numerous accolades, including ten Academy Award nominations and six wins.

- **Lord of the Rings Series:**
  - The "Lord of the Rings" film series is highly celebrated, particularly "The Return of the King," which won eleven Academy Awards.

### Conclusion:
While both series are critically acclaimed, the **"Lord of the Rings"** series, particularly "The Fellowship of the Ring," has higher average ratings and has received significant recognition in terms of awards. There


#### ❓ Question #4:

How can you verify which tool was used for which query?

#### Answer #4:

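Every call the agent makes is recorded in the W&B trace for the run, so inspecting that trace pinpoints the exact tool used for each query. Because the agents are created with `verbose=True`, the same information is also printed in the cell output: each `=== Calling Function ===` block names the tool that was called and the arguments it received.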
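For a quick check without opening W&B, the verbose output can also be parsed directly. The sketch below assumes the trace has been captured as a string and that tool calls appear on lines shaped like the `Calling function: ... with args: ...` lines printed in this notebook. That line format could change between LlamaIndex versions, and `extract_tool_calls` is a hypothetical helper rather than a library function.

```python
import json
import re

# Matches lines such as:
#   Calling function: sql-query with args: {"input": "SELECT ..."}
_CALL_LINE = re.compile(
    r"^Calling function: (?P<tool>\S+) with args: (?P<args>\{.*\})\s*$"
)


def extract_tool_calls(trace: str) -> list[tuple[str, dict]]:
    """Return (tool_name, arguments) pairs in the order they were called."""
    calls = []
    for line in trace.splitlines():
        match = _CALL_LINE.match(line.strip())
        if match is None:
            continue
        try:
            args = json.loads(match.group("args"))
        except json.JSONDecodeError:
            # Truncated or malformed argument payloads are kept as raw text.
            args = {"raw": match.group("args")}
        calls.append((match.group("tool"), args))
    return calls


sample_trace = """\
=== Calling Function ===
Calling function: semantic-film-info with args: {"query": "movie about a ring", "filter_key_list": [], "filter_value_list": []}
=== Function Output ===
The movie is "The Lord of the Rings: The Fellowship of the Ring."
=== Calling Function ===
Calling function: sql-query with args: {"input": "SELECT AVG(Rating) FROM 'Dune (2021 film)'"}
"""

tool_calls = extract_tool_calls(sample_trace)
for tool_name, arguments in tool_calls:
    print(tool_name, arguments)
```

The W&B trace remains the more complete record, since it also captures the intermediate LLM calls that the printed output omits or truncates.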

In [86]:
wandb_callback.finish()