In [1]:
%pip install llama-index-agent-openai
%pip install llama-index-llms-openai

Note: you may need to restart the kernel to use updated packages.


In [2]:
%pip install llama-index



In [3]:
%load_ext autoreload
%autoreload 2

In [37]:
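import os

import openai
from dotenv import load_dotenv

# Load variables from a local .env file into the process environment.
load_dotenv()

# Indexing os.environ raises KeyError right away if the key is missing.
api_key = os.environ["OPENAI_API_KEY_QUERY_DECOMP"]
openai.api_key = api_key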

In [38]:
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.response.pprint_utils import pprint_response
from llama_index.llms.openai import OpenAI

In [39]:
llm = OpenAI(temperature=0, model="gpt-4")

In [40]:
!mkdir -p 'data/paul_graham/'

In [43]:
pg_files = SimpleDirectoryReader(
    input_files=["./data/paul_graham/paul_graham_essay.txt"]
).load_data()

In [44]:
pg_index = VectorStoreIndex.from_documents(pg_files)

In [45]:
pg_engine = pg_index.as_query_engine(similarity_top_k=3, llm=llm)

In [50]:
from llama_index.core.tools import QueryEngineTool


query_tool_pg = QueryEngineTool.from_defaults(
    query_engine=pg_engine,
    name="pg_files",
    description=(
        "Paul graham essay"
    ),
)

In [51]:
# define query plan tool
from llama_index.core.tools import QueryPlanTool
from llama_index.core import get_response_synthesizer

response_synthesizer = get_response_synthesizer()
query_plan_tool = QueryPlanTool.from_defaults(
    query_engine_tools=[query_tool_pg],
    response_synthesizer=response_synthesizer,
)

In [52]:
query_plan_tool.metadata.to_openai_tool()  # to_openai_function() deprecated

{'type': 'function',
 'function': {'name': 'query_plan_tool',
  'description': '        This is a query plan tool that takes in a list of tools and executes a query plan over these tools to answer a query. The query plan is a DAG of query nodes.\n\nGiven a list of tool names and the query plan schema, you can choose to generate a query plan to answer a question.\n\nThe tool names and descriptions are as follows:\n\n\n\n        Tool Name: pg_files\nTool Description: Paul graham essay \n        ',
  'parameters': {'type': 'object',
   'properties': {'nodes': {'title': 'Nodes',
     'description': 'The original question we are asking.',
     'type': 'array',
     'items': {'$ref': '#/definitions/QueryNode'}}},
   'required': ['nodes'],
   'definitions': {'QueryNode': {'title': 'QueryNode',
     'description': 'Query node.\n\nA query node represents a query (query_str) that must be answered.\nIt can either be answered by a tool (tool_name), or by a list of child nodes\n(child_nodes).\nThe 

In [53]:
from llama_index.agent.openai import OpenAIAgent
from llama_index.llms.openai import OpenAI


agent = OpenAIAgent.from_tools(
    [query_plan_tool],
    max_function_calls=10,
    llm=OpenAI(temperature=0, model="gpt-4-0613"),
    verbose=True,
)

In [58]:
response = agent.query("Compare and contrast Paul Graham's thoughts about Sam Altman.")

Added user message to memory: Compare and contrast Paul Graham's thoughts about Sam Altman.


In [55]:
from llama_index.core.tools.query_plan import QueryPlan, QueryNode

query_plan = QueryPlan(
    nodes=[
        QueryNode(
            id=1,
            query_str="risk factors",
            tool_name="pg_files",
            dependencies=[],
        )
    ]
)

In [56]:
QueryPlan.schema()

{'title': 'QueryPlan',
 'description': "Query plan.\n\nContains a list of QueryNode objects (which is a recursive object).\nOut of the list of QueryNode objects, one of them must be the root node.\nThe root node is the one that isn't a dependency of any other node.",
 'type': 'object',
 'properties': {'nodes': {'title': 'Nodes',
   'description': 'The original question we are asking.',
   'type': 'array',
   'items': {'$ref': '#/definitions/QueryNode'}}},
 'required': ['nodes'],
 'definitions': {'QueryNode': {'title': 'QueryNode',
   'description': 'Query node.\n\nA query node represents a query (query_str) that must be answered.\nIt can either be answered by a tool (tool_name), or by a list of child nodes\n(child_nodes).\nThe tool_name and child_nodes fields are mutually exclusive.',
   'type': 'object',
   'properties': {'id': {'title': 'Id',
     'description': 'ID of the query node.',
     'type': 'integer'},
    'query_str': {'title': 'Query Str',
     'description': 'Question we 
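
The schema description above says the root node is the one that isn't a dependency of any other node. A minimal sketch of that rule follows, using plain dicts rather than the `QueryPlan`/`QueryNode` classes; the helper `find_root_ids` and the sample plan are hypothetical and only illustrate the rule, not how the library resolves the root internally.

```python
# Hypothetical plain-dict stand-in for a query plan (not the llama_index classes).
sample_plan = {
    "nodes": [
        {"id": 1, "query_str": "first sub-question", "tool_name": "pg_files", "dependencies": []},
        {"id": 2, "query_str": "second sub-question", "tool_name": "pg_files", "dependencies": []},
        {"id": 3, "query_str": "combine both answers", "tool_name": "pg_files", "dependencies": [1, 2]},
    ]
}


def find_root_ids(nodes):
    """Return the ids of nodes that no other node lists as a dependency."""
    depended_on = {dep for node in nodes for dep in node["dependencies"]}
    return [node["id"] for node in nodes if node["id"] not in depended_on]


root_ids = find_root_ids(sample_plan["nodes"])
print(root_ids)
```

A well-formed plan should yield exactly one id here; more than one would mean the plan has several candidate roots.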

In [59]:
response = agent.query(
    "Analyze the journey in technology before and after college."
)

Added user message to memory: Analyze the journey in technology before and after college.


In [60]:
print(str(response))

To analyze the journey in technology before and after college, we can use a query plan tool. This tool will help us to gather and analyze data from various sources. Here is a possible query plan:

```json
{
  "nodes": [
    {
      "id": 1,
      "query_str": "Analyze the journey in technology before college",
      "tool_name": "pg_files",
      "dependencies": []
    },
    {
      "id": 2,
      "query_str": "Analyze the journey in technology after college",
      "tool_name": "pg_files",
      "dependencies": []
    },
    {
      "id": 3,
      "query_str": "Compare the journey in technology before and after college",
      "tool_name": "pg_files",
      "dependencies": [1, 2]
    }
  ]
}
```

In this plan, we first analyze the journey in technology before college (node 1) and after college (node 2) separately. Then, we compare these two journeys (node 3). The tool we use for all these queries is "pg_files", which is a tool for analyzing Paul Graham's essays. This tool might be us
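
The plan in the output above is a small DAG: nodes 1 and 2 have no dependencies and node 3 depends on both. One way to see the order in which such a plan could be executed is a topological sort. The sketch below uses the standard-library `graphlib` module on plain dicts; `execution_order` is a hypothetical helper and is not how `QueryPlanTool` necessarily schedules nodes.

```python
from graphlib import TopologicalSorter

# The three nodes from the agent's printed plan, reduced to id and dependencies.
plan_nodes = [
    {"id": 1, "dependencies": []},
    {"id": 2, "dependencies": []},
    {"id": 3, "dependencies": [1, 2]},
]


def execution_order(nodes):
    """Return node ids ordered so every node comes after its dependencies."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {node["id"]: set(node["dependencies"]) for node in nodes}
    return list(TopologicalSorter(graph).static_order())


order = execution_order(plan_nodes)
print(order)
```

Nodes 1 and 2 may come out in either order, since neither depends on the other; node 3 always comes last. A cyclic plan would make `static_order()` raise `graphlib.CycleError`.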

In [62]:
response = agent.query(
    "What is YC and how did this affect the growth of technology over years?"
)

Added user message to memory: What is YC and how did this affect the growth of technology over years?


In [63]:
print(str(response))

To answer this question, we can use the `query_plan_tool` function. We will use the `pg_files` tool which contains Paul Graham's essays. Paul Graham is a co-founder of Y Combinator (YC), and his essays often discuss YC and its impact on technology.

Here is a possible query plan:

1. Query Node 1: "What is Y Combinator (YC)?"
2. Query Node 2: "How has Y Combinator influenced the growth of technology?"
3. Query Node 3: "What are some examples of technology companies that have been influenced by Y Combinator?"

The `query_plan_tool` function can be called as follows:

```json
{
  "nodes": [
    {
      "id": 1,
      "query_str": "What is Y Combinator (YC)?",
      "tool_name": "pg_files",
      "dependencies": []
    },
    {
      "id": 2,
      "query_str": "How has Y Combinator influenced the growth of technology?",
      "tool_name": "pg_files",
      "dependencies": [1]
    },
    {
      "id": 3,
      "query_str": "What are some examples of technology companies that have been inf

In [64]:
response

Response(response='To answer this question, we can use the `query_plan_tool` function. We will use the `pg_files` tool which contains Paul Graham\'s essays. Paul Graham is a co-founder of Y Combinator (YC), and his essays often discuss YC and its impact on technology.\n\nHere is a possible query plan:\n\n1. Query Node 1: "What is Y Combinator (YC)?"\n2. Query Node 2: "How has Y Combinator influenced the growth of technology?"\n3. Query Node 3: "What are some examples of technology companies that have been influenced by Y Combinator?"\n\nThe `query_plan_tool` function can be called as follows:\n\n```json\n{\n  "nodes": [\n    {\n      "id": 1,\n      "query_str": "What is Y Combinator (YC)?",\n      "tool_name": "pg_files",\n      "dependencies": []\n    },\n    {\n      "id": 2,\n      "query_str": "How has Y Combinator influenced the growth of technology?",\n      "tool_name": "pg_files",\n      "dependencies": [1]\n    },\n    {\n      "id": 3,\n      "query_str": "What are some exam