In [1]:
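# Patch asyncio so an event loop can be re-entered inside Jupyter's
# already-running loop (the list-index query config below sets use_async=True).
import nest_asyncio
nest_asyncio.apply()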

In [2]:
import logging
import sys

# Send INFO-level logs to stdout so they appear in the notebook output.
logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
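
Each INFO record in the output further down appears twice, once with an `INFO:<logger name>:` prefix and once as the bare message. A plausible explanation is that the setup above attaches two stdout handlers to the root logger: one from `basicConfig` (with the default format) and one from the extra `addHandler` call (with no formatter). The sketch below reproduces that effect with the standard library only. The function name `log_once` and the logger name `demo` are made up for illustration, and a standalone logger is used so the notebook's own logging setup is not touched.

```python
import io
import logging
from typing import List


def log_once(extra_handler: bool) -> List[str]:
    """Log one INFO record and return the lines that reach the stream."""
    stream = io.StringIO()
    # A standalone logger keeps this demo away from the notebook's root logger.
    logger = logging.Logger("demo", level=logging.INFO)

    # Stand-in for basicConfig: one handler using the default
    # "LEVEL:name:message" format.
    formatted = logging.StreamHandler(stream)
    formatted.setFormatter(logging.Formatter(logging.BASIC_FORMAT))
    logger.addHandler(formatted)

    if extra_handler:
        # Stand-in for the extra addHandler call: no formatter, message only.
        logger.addHandler(logging.StreamHandler(stream))

    logger.info("Total LLM token usage: 0 tokens")
    return stream.getvalue().splitlines()


print(log_once(extra_handler=False))
print(log_once(extra_handler=True))
```

With one handler the record is written once; with the second handler it is written a second time without the prefix. Dropping the extra `addHandler` call should be enough if single log lines are preferred.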

In [3]:
from llama_index.composability.joint_qa_summary import QASummaryGraphBuilder
from llama_index import SimpleDirectoryReader, ServiceContext, LLMPredictor
from llama_index.composability import ComposableGraph
from langchain.chat_models import ChatOpenAI


In [4]:
reader = SimpleDirectoryReader('../paul_graham_essay/data')
documents = reader.load_data()

In [5]:
# GPT-4 service context: used to build the graph and to answer the queries below.
llm_predictor_gpt4 = LLMPredictor(llm=ChatOpenAI(temperature=0, model_name="gpt-4"))
service_context_gpt4 = ServiceContext.from_defaults(llm_predictor=llm_predictor_gpt4, chunk_size_limit=1024)

# gpt-3.5-turbo service context: defined as an alternative; the cells below only use the GPT-4 one.
llm_predictor_chatgpt = LLMPredictor(llm=ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo"))
service_context_chatgpt = ServiceContext.from_defaults(llm_predictor=llm_predictor_chatgpt, chunk_size_limit=1024)

Unknown max input size for gpt-3.5-turbo, using defaults.


In [6]:
# NOTE: can also specify an existing docstore, service context, summary text, qa_text, etc.
graph_builder = QASummaryGraphBuilder(service_context=service_context_gpt4)
graph = graph_builder.build_graph_from_documents(documents)

INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total LLM token usage: 0 tokens
> [build_index_from_nodes] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total embedding token usage: 20729 tokens
> [build_index_from_nodes] Total embedding token usage: 20729 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total LLM token usage: 0 tokens
> [build_index_from_nodes] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total embedding token usage: 0 tokens
> [build_index_from_nodes] Total embedding token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total LLM token usage: 0 tokens
> [build_index_from_nodes] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total embedding token usage: 0 tokens
> [build_index_from_nodes] Total embedding 

In [7]:
graph.save_to_disk('test_qa_summary_graph.json')

In [8]:
graph = ComposableGraph.load_from_disk('test_qa_summary_graph.json')

In [9]:
# Query settings, one entry per index type in the graph
query_configs = [
    {
        "index_struct_type": "simple_dict",
        "query_mode": "default",
        "query_kwargs": {
            "similarity_top_k": 1
        },
    },
    {
        "index_struct_type": "list",
        "query_mode": "default",
        "query_kwargs": {
            "response_mode": "tree_summarize",
            "use_async": True,
            "verbose": True
        },
    },
    {
        "index_struct_type": "tree",
        "query_mode": "default",
        "query_kwargs": {
            "verbose": True
        },
    },
]
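
Each entry in `query_configs` is keyed by `index_struct_type`, so the settings can be read as "for this kind of index, query it this way". Judging by the logs further down, `tree` appears to be the root index that picks a branch, while `simple_dict` and `list` appear to be the retrieval and summarization sub-indices; that mapping is an inference from this notebook, not something it states. The helper below, `query_kwargs_for`, is a hypothetical utility (not part of llama_index) that looks up the keyword arguments for one index type, which can be handy when checking a config list by hand. The list is repeated as `example_configs` so the block runs on its own.

```python
from typing import Any, Dict, List

# Copy of the config list above, repeated so this block is self-contained.
example_configs: List[Dict[str, Any]] = [
    {
        "index_struct_type": "simple_dict",
        "query_mode": "default",
        "query_kwargs": {"similarity_top_k": 1},
    },
    {
        "index_struct_type": "list",
        "query_mode": "default",
        "query_kwargs": {
            "response_mode": "tree_summarize",
            "use_async": True,
            "verbose": True,
        },
    },
    {
        "index_struct_type": "tree",
        "query_mode": "default",
        "query_kwargs": {"verbose": True},
    },
]


def query_kwargs_for(configs: List[Dict[str, Any]], index_struct_type: str) -> Dict[str, Any]:
    """Hypothetical helper: return a copy of the query_kwargs for one index type.

    Returns an empty dict when no entry matches.
    """
    for config in configs:
        if config["index_struct_type"] == index_struct_type:
            return dict(config.get("query_kwargs", {}))
    return {}


print(query_kwargs_for(example_configs, "list"))
```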

In [13]:
response = graph.query(
    "Can you give me a summary of the author's life?", 
    query_configs=query_configs, 
    service_context=service_context_gpt4
)

>[Level 0] Current response: ANSWER: 2

This summary was selected because the question asks for a summary of the author's life, which implies needing a summarized version rather than a specific context from the documents. Hence, choice 2 is more relevant for summarization queries.
INFO:llama_index.indices.tree.leaf_query:>[Level 0] Selected node: [2]/[2]
>[Level 0] Selected node: [2]/[2]
>[Level 0] Selected node: [2]/[2]
>[Level 0] Node [2] Summary text: Use this index for summarization queries
> Got node text:

What I Worked On

February 2021

Before college the two main things I worked on, outside of school, were writing and programming. I didn't write essays. I wrote what beginning writers were supp...
> Got node text: fields would be mere domain knowledge. What I discovered when I got to college was that the other fields took up so much of the space of ideas that there wasn't much left for these supposed ultimat...
> Got node text: 

In [14]:
print(response)

The author's life has been a journey of exploration, learning, and creating, encompassing a diverse range of interests such as computers, painting, and writing. They studied art in Italy and computer science in the United States before co-founding Viaweb, which was later sold to Yahoo. They also became an influential online essayist and co-founded the successful startup accelerator Y Combinator. Additionally, they worked on designing a new programming language called Bel. After handing over Y Combinator's leadership to Sam Altman, the author returned to painting for a while, but later lost interest, illustrating their continuous search for personal growth and passion-driven endeavors.
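
The log above shows the routing step: the model replies with `ANSWER: 2` plus a justification, and the node with that number ("Use this index for summarization queries") is then selected. As a rough illustration of that step only, the sketch below pulls the choice number out of such a reply and maps it to a node summary. The function `parse_selected_choice` and the `node_summaries` dict are hypothetical and written for this example; this is not the library's own parsing code, and the two summary strings are copied from the log output in this notebook.

```python
import re
from typing import Dict, Optional

# Node summaries as they appear in the log output of this notebook.
node_summaries: Dict[int, str] = {
    1: "Use this index for queries that require retrieval of specific context from documents.",
    2: "Use this index for summarization queries",
}


def parse_selected_choice(llm_text: str) -> Optional[int]:
    """Hypothetical helper: extract N from a reply containing 'ANSWER: N'.

    Returns None when the reply has no such marker.
    """
    match = re.search(r"ANSWER:\s*(\d+)", llm_text)
    return int(match.group(1)) if match else None


reply = "ANSWER: 2\n\nThis summary was selected because the question asks for a summary."
choice = parse_selected_choice(reply)
print(choice, node_summaries.get(choice))
```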


In [15]:
response = graph.query(
    "What did the author do growing up?", 
    query_configs=query_configs,
    service_context=service_context_gpt4
)

>[Level 0] Current response: ANSWER: 1

The question, "What did the author do growing up?" asks for specific context from documents about the author's experiences or activities during their childhood. Choice 1 mentions retrieval of specific context from documents, which is more in line with answering this type of question than summarization queries mentioned in Choice 2.
INFO:llama_index.indices.tree.leaf_query:>[Level 0] Selected node: [1]/[1]
>[Level 0] Selected node: [1]/[1]
>[Level 0] Selected node: [1]/[1]
>[Level 0] Node [1] Summary text: Use this index for queries that require retrieval of specific context from documents.
> Got node text: Growing up, the author worked on writing short stories and programming on the IBM 1401, a computer used in their school district. They also experimented with programming on a TRS-80 microcomputer, ...

In [16]:
print(response)

Growing up, the author worked on writing short stories and programming on the IBM 1401 computer. They also experimented with programming on a TRS-80 microcomputer, creating simple games, a model rocket prediction program, and a word processor for their father.


In [17]:
response = graph.query(
    "What did the author do during his time in art school?", 
    query_configs=query_configs,
    service_context=service_context_gpt4
)

>[Level 0] Current response: ANSWER: 1

This summary was selected in relation to the question because it involves retrieval of specific context from documents, which is necessary to know what the author did during his time in art school. Choice 2 is focused on summarization queries, which isn't related to the specific detail needed to answer the question.
INFO:llama_index.indices.tree.leaf_query:>[Level 0] Selected node: [1]/[1]
>[Level 0] Selected node: [1]/[1]
>[Level 0] Selected node: [1]/[1]
>[Level 0] Node [1] Summary text: Use this index for queries that require retrieval of specific context from documents.
> Got node text: The author took classes in fundamental subjects like drawing, color, and design during the foundation program at RISD. He also prepared for the entrance exam at the Accademia di Belli Arti in Flore...

In [18]:
print(response)

During his time in art school, the author took classes in fundamental subjects, such as drawing, color, and design, as part of the foundation program at RISD. He also prepared for the entrance exam at the Accademia di Belli Arti in Florence, which involved learning Italian.
