In [1]:
import nest_asyncio
# Allow nested event loops so async calls work inside Jupyter's already-running loop.
nest_asyncio.apply()

In [2]:
import logging
import sys

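# basicConfig attaches a stdout handler that prints "LEVEL:logger:message".
logging.basicConfig(stream=sys.stdout, level=logging.INFO)
# This second, unformatted stdout handler prints each message again,
# which is why every log record in the outputs below appears twice.
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))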

In [10]:
from llama_index.composability.joint_qa_summary import QASummaryQueryEngineBuilder
from llama_index import SimpleDirectoryReader, ServiceContext, LLMPredictor
from llama_index.response.notebook_utils import display_response
from langchain.chat_models import ChatOpenAI

In [4]:
reader = SimpleDirectoryReader('../paul_graham_essay/data')
documents = reader.load_data()

In [5]:
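# GPT-4 service context: temperature=0 for more repeatable output; documents are split into chunks of at most 1024 tokens.
llm_predictor_gpt4 = LLMPredictor(llm=ChatOpenAI(temperature=0, model_name="gpt-4"))
service_context_gpt4 = ServiceContext.from_defaults(llm_predictor=llm_predictor_gpt4, chunk_size_limit=1024)

# Equivalent gpt-3.5-turbo context, defined for comparison; the cells below use only the GPT-4 one.
llm_predictor_chatgpt = LLMPredictor(llm=ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo"))
service_context_chatgpt = ServiceContext.from_defaults(llm_predictor=llm_predictor_chatgpt, chunk_size_limit=1024)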

Unknown max input size for gpt-3.5-turbo, using defaults.


In [6]:
# NOTE: the builder can also take an existing docstore, a service context, and custom summary_text / qa_text descriptions.
query_engine_builder = QASummaryQueryEngineBuilder(service_context=service_context_gpt4)
query_engine = query_engine_builder.build_from_documents(documents)

INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total LLM token usage: 0 tokens
> [build_index_from_nodes] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total embedding token usage: 20729 tokens
> [build_index_from_nodes] Total embedding token usage: 20729 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total LLM token usage: 0 tokens
> [build_index_from_nodes] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total embedding token usage: 0 tokens
> [build_index_from_nodes] Total embedding token usage: 0 tokens
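
The two `build_index_from_nodes` passes logged above (one consuming 20729 embedding tokens, one consuming none) are consistent with the builder creating two indexes, an embedding-based one for question answering and one for summarization, with a router in front of them. The toy sketch below illustrates only that routing pattern. `Tool`, `keyword_selector`, and `route` are hypothetical names, and the keyword rule is a stand-in for LLM-based selection; none of this is the library's actual code.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Tool:
    """A query engine plus the description a selector would read to choose it."""
    name: str
    description: str
    run: Callable[[str], str]


def keyword_selector(query: str, tools: List[Tool]) -> int:
    """Hypothetical stand-in for an LLM selector.

    Returns index 1 (summary) when the query asks for a summary,
    otherwise index 0 (question answering).
    """
    lowered = query.lower()
    wants_summary = any(w in lowered for w in ("summary", "summarize", "overview"))
    return 1 if wants_summary else 0


def route(query: str, tools: List[Tool]) -> str:
    """Pick exactly one tool for the query and delegate to it."""
    choice = keyword_selector(query, tools)
    return tools[choice].run(query)


# Index 0 answers specific questions; index 1 summarizes everything.
tools = [
    Tool("qa", "Retrieves specific context to answer a question.",
         lambda q: f"[qa] answered: {q}"),
    Tool("summary", "Summarizes the whole document set.",
         lambda q: f"[summary] answered: {q}"),
]

print(route("Can you give me a summary of the author's life?", tools))
print(route("What did the author do growing up?", tools))
```

In this sketch the tool descriptions are unused by the keyword rule; with an LLM selector they would be the text the model reads when deciding which engine fits the query.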


In [14]:
response = query_engine.query(
    "Can you give me a summary of the author's life?",
)

INFO:llama_index.query_engine.router_query_engine:Selecting query engine 1 because: This choice is relevant because it is specifically for summarization queries, which matches the request for a summary of the author's life..
Selecting query engine 1 because: This choice is relevant because it is specifically for summarization queries, which matches the request for a summary of the author's life..
INFO:llama_index.indices.common_tree.base:> Building index from nodes: 6 chunks
> Building index from nodes: 6 chunks
INFO:llama_index.token_counter.token_counter:> [get_response] Total LLM token usage: 1012 tokens
> [get_response] Total LLM token usage: 1012 tokens
INFO:llama_index.token_counter.token_counter:> [get_response] Total embedding token usage: 0 tokens
> [get_response] Total embedding token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [get_response] Total LLM token usage: 23485 tokens
> [get_response] Total LLM token usage: 23485 tokens
INFO:llama_index.token_coun

In [15]:
display_response(response)

**`Final Response:`** The author's life has been a series of diverse experiences and accomplishments, starting with an interest in programming and writing short stories. They studied Artificial Intelligence in college and later developed an interest in art, studying at RISD and the Accademia in Florence, Italy. They worked at a software company called Interleaf and co-founded a startup called Viaweb, which was eventually sold to Yahoo. The author then began writing essays, co-founded Y Combinator, a startup accelerator, and developed a programming language called Arc. Throughout their life, the author has navigated various fields, including art, software, entrepreneurship, and essay writing.

---

**`Source Node 1/24`**

**Document ID:** 36cbb5a0-9b49-4c50-aa4c-b8839ecd3237<br>**Similarity:** None<br>**Text:** What I Worked On

February 2021

Before college the two main things I worked on, outside of schoo...<br>

---

**`Source Node 2/24`**

**Document ID:** b3acf71e-ffb6-46ff-b5e5-4985dc7671a7<br>**Similarity:** None<br>**Text:** fields would be mere domain knowledge. What I discovered when I got to college was that the other...<br>

---

**`Source Node 3/24`**

**Document ID:** bf07251f-6d6c-48cf-814f-386ff529c6fa<br>**Similarity:** None<br>**Text:** mean the sort of AI in which a program that's told "the dog is sitting on the chair" translates t...<br>

---

**`Source Node 4/24`**

**Document ID:** bb833522-2dc0-4d3d-92eb-b3d7dc08e5e8<br>**Similarity:** None<br>**Text:** make enough to survive. And as an artist you could be truly independent. You wouldn't have a boss...<br>

---

**`Source Node 5/24`**

**Document ID:** 516a0e65-7ec0-4cce-bd20-ba31db5ef326<br>**Similarity:** None<br>**Text:** fall. This was now only weeks away. My nice landlady let me leave my stuff in her attic. I had so...<br>

---

**`Source Node 6/24`**

**Document ID:** c08df16f-c5c4-4526-a727-90518545deb4<br>**Similarity:** None<br>**Text:** information-theoretic sense. [4]

I liked painting still lives because I was curious about what I...<br>

---

**`Source Node 7/24`**

**Document ID:** 0a79f1c1-378c-4879-af47-2ec70a1c8c92<br>**Similarity:** None<br>**Text:** people than sales people (though sales is a real skill and people who are good at it are really g...<br>

---

**`Source Node 8/24`**

**Document ID:** e3c462dc-6b59-4dda-9669-75ede12417df<br>**Similarity:** None<br>**Text:** at RISD, but otherwise I was basically teaching myself to paint, and I could do that for free. So...<br>

---

**`Source Node 9/24`**

**Document ID:** cd9a32c6-a162-4ca4-a44e-5c0609d69c2f<br>**Similarity:** None<br>**Text:** and Robert wrote some to resize images and set up an http server to serve the pages. Then we trie...<br>

---

**`Source Node 10/24`**

**Document ID:** 09552906-ef35-4613-8c10-1bce29a18fdd<br>**Similarity:** None<br>**Text:** in September, but we got more ambitious about the software as we worked on it. Eventually we mana...<br>

---

**`Source Node 11/24`**

**Document ID:** a9a2707b-5128-4df8-ab00-552b353bae02<br>**Similarity:** None<br>**Text:** of some clever insight that we set the price low. We had no idea what businesses paid for things....<br>

---

**`Source Node 12/24`**

**Document ID:** 9972da57-9b47-4703-a58e-399c0ddc4ac3<br>**Similarity:** None<br>**Text:** bought us it felt like going from rags to riches. Since we were going to California, I bought a c...<br>

---

**`Source Node 13/24`**

**Document ID:** eaf7f9a5-4c68-4f2b-903c-16ac1fb10a7a<br>**Similarity:** None<br>**Text:** old patterns, except now there were doors where there hadn't been. Now when I was tired of walkin...<br>

---

**`Source Node 14/24`**

**Document ID:** 3957e728-5d15-4a17-93e5-f127aacf64d9<br>**Similarity:** None<br>**Text:** I doing this? If this vision had to be realized as a company, then screw the vision. I'd build a ...<br>

---

**`Source Node 15/24`**

**Document ID:** c73a147a-5e7e-41a6-866c-6e5b2c1fa8cb<br>**Similarity:** None<br>**Text:** of the most conspicuous patterns I've noticed in my life is how well it has worked, for me at lea...<br>

---

**`Source Node 16/24`**

**Document ID:** 4d28c767-892b-40b4-bc01-bbe53650d5a9<br>**Similarity:** None<br>**Text:** start a startup. Maybe they'd be able to avoid the worst of the mistakes we'd made.

So I gave th...<br>

---

**`Source Node 17/24`**

**Document ID:** a0d065dc-a6e4-4e52-8e13-f3031c6e1a9e<br>**Similarity:** None<br>**Text:** due to our ignorance about investing. We needed to get experience as investors. What better way, ...<br>

---

**`Source Node 18/24`**

**Document ID:** 6a603dd2-5d09-4870-b04f-0e0e183c86cd<br>**Similarity:** None<br>**Text:** in. We also noticed that the startups were becoming one another's customers. We used to refer jok...<br>

---

**`Source Node 19/24`**

**Document ID:** 25582ee7-6f67-4efe-bcff-0bf81bb17b7f<br>**Similarity:** None<br>**Text:** day in 2010, when he was visiting California for interviews, Robert Morris did something astonish...<br>

---

**`Source Node 20/24`**

**Document ID:** 00847b3d-eb10-4b3e-b97c-c0674bcebb6d<br>**Similarity:** None<br>**Text:** took a while to get back into shape, but it was at least completely engaging. [18]

I spent most ...<br>

---

**`Source Node 21/24`**

**Document ID:** 9a964817-3224-4e6c-950e-e1a908fd3f7c<br>**Similarity:** None<br>**Text:** 2019. It was fortunate that I had a precisely defined goal, or it would have been hard to keep at...<br>

---

**`Source Node 22/24`**

**Document ID:** 324c3d08-f3b9-4db2-ab78-1445114fa543<br>**Similarity:** None<br>**Text:** Italian words for abstract concepts can nearly always be predicted from their English cognates (e...<br>

---

**`Source Node 23/24`**

**Document ID:** 7eb1a631-d012-45a5-87c8-05b9d88ab493<br>**Similarity:** None<br>**Text:** that our experience with Y Combinator also teaches: Customs continue to constrain you long after ...<br>

---

**`Source Node 24/24`**

**Document ID:** 7d15111e-6ba2-459d-9936-1026764887ad<br>**Similarity:** None<br>**Text:** up a deeply rooted tree.

[19] One way to get more precise about the concept of invented vs disco...<br>

{'36cbb5a0-9b49-4c50-aa4c-b8839ecd3237': None,
 'b3acf71e-ffb6-46ff-b5e5-4985dc7671a7': None,
 'bf07251f-6d6c-48cf-814f-386ff529c6fa': None,
 'bb833522-2dc0-4d3d-92eb-b3d7dc08e5e8': None,
 '516a0e65-7ec0-4cce-bd20-ba31db5ef326': None,
 'c08df16f-c5c4-4526-a727-90518545deb4': None,
 '0a79f1c1-378c-4879-af47-2ec70a1c8c92': None,
 'e3c462dc-6b59-4dda-9669-75ede12417df': None,
 'cd9a32c6-a162-4ca4-a44e-5c0609d69c2f': None,
 '09552906-ef35-4613-8c10-1bce29a18fdd': None,
 'a9a2707b-5128-4df8-ab00-552b353bae02': None,
 '9972da57-9b47-4703-a58e-399c0ddc4ac3': None,
 'eaf7f9a5-4c68-4f2b-903c-16ac1fb10a7a': None,
 '3957e728-5d15-4a17-93e5-f127aacf64d9': None,
 'c73a147a-5e7e-41a6-866c-6e5b2c1fa8cb': None,
 '4d28c767-892b-40b4-bc01-bbe53650d5a9': None,
 'a0d065dc-a6e4-4e52-8e13-f3031c6e1a9e': None,
 '6a603dd2-5d09-4870-b04f-0e0e183c86cd': None,
 '25582ee7-6f67-4efe-bcff-0bf81bb17b7f': None,
 '00847b3d-eb10-4b3e-b97c-c0674bcebb6d': None,
 '9a964817-3224-4e6c-950e-e1a908fd3f7c': None,
 '324c3d08-f3
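
The summary query above lists all 24 source nodes with no similarity score, and its log mentions `common_tree` building an index from 6 chunks, which suggests a bottom-up, tree-style reduction over the whole document rather than a top-k lookup. The sketch below shows that general idea only. `tree_summarize`, `fake_summarize`, and the `fan_in` value are assumptions for illustration; the real prompts, grouping, and chunk packing are not shown in this notebook.

```python
from typing import Callable, List


def tree_summarize(chunks: List[str],
                   summarize: Callable[[List[str]], str],
                   fan_in: int = 4) -> str:
    """Merge groups of `fan_in` texts level by level until one text remains."""
    if not chunks:
        raise ValueError("need at least one chunk")
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2 so each level shrinks")
    level = list(chunks)
    while len(level) > 1:
        # Each pass replaces every group of up to `fan_in` texts with one summary.
        level = [summarize(level[i:i + fan_in])
                 for i in range(0, len(level), fan_in)]
    return level[0]


def fake_summarize(texts: List[str]) -> str:
    """Hypothetical stand-in for an LLM call: shows the grouping structure."""
    return "(" + "+".join(texts) + ")"


chunks = [f"c{i}" for i in range(6)]
print(tree_summarize(chunks, fake_summarize))
```

Because every chunk passes through the summarizer at least once whenever there is more than one chunk, LLM token usage grows with document size, in line with the much larger `get_response` token count logged for this query than for the retrieval-style queries.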

In [20]:
response = query_engine.query(
    "What did the author do growing up?",
)

INFO:llama_index.query_engine.router_query_engine:Selecting query engine 0 because: This choice is relevant because it involves retrieving specific context from documents, which is needed to answer the question about the author's activities growing up..
Selecting query engine 0 because: This choice is relevant because it involves retrieving specific context from documents, which is needed to answer the question about the author's activities growing up..
INFO:llama_index.token_counter.token_counter:> [retrieve] Total LLM token usage: 0 tokens
> [retrieve] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [retrieve] Total embedding token usage: 8 tokens
> [retrieve] Total embedding token usage: 8 tokens
INFO:llama_index.token_counter.token_counter:> [get_response] Total LLM token usage: 1893 tokens
> [get_response] Total LLM token usage: 1893 tokens
INFO:llama_index.token_counter.token_counter:> [get_response] Total embedding token usage: 0 tokens
> [get_resp

In [21]:
display_response(response)

**`Final Response:`** Growing up, the author worked on writing short stories and programming. They started programming on an IBM 1401 in 9th grade, using an early version of Fortran. Later, they got a TRS-80 microcomputer and wrote simple games, a program to predict model rocket flight, and a word processor.

---

**`Source Node 1/2`**

**Document ID:** 36cbb5a0-9b49-4c50-aa4c-b8839ecd3237<br>**Similarity:** 0.8085773050574637<br>**Text:** What I Worked On

February 2021

Before college the two main things I worked on, outside of schoo...<br>

---

**`Source Node 2/2`**

**Document ID:** bb833522-2dc0-4d3d-92eb-b3d7dc08e5e8<br>**Similarity:** 0.806118318463326<br>**Text:** make enough to survive. And as an artist you could be truly independent. You wouldn't have a boss...<br>

{'36cbb5a0-9b49-4c50-aa4c-b8839ecd3237': None,
 'bb833522-2dc0-4d3d-92eb-b3d7dc08e5e8': None}
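
Unlike the summary query, this response cites only two source nodes, each with a similarity score, which is what an embedding-based top-k retrieval step would produce. A minimal sketch of that step follows, assuming cosine similarity and k=2 (matching the "1/2" and "2/2" labels above). The vectors and node names are made up, and `cosine` and `top_k` are hypothetical helpers, not library functions.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: Sequence[float],
          node_vecs: Dict[str, List[float]],
          k: int = 2) -> List[Tuple[str, float]]:
    """Score every node against the query and keep the k most similar."""
    scored = [(node_id, cosine(query_vec, vec)) for node_id, vec in node_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up 2-D embeddings; real embeddings have far more dimensions.
node_vecs = {
    "node-a": [1.0, 0.0],
    "node-b": [0.6, 0.8],
    "node-c": [0.0, 1.0],
}
hits = top_k([1.0, 0.2], node_vecs, k=2)
for node_id, score in hits:
    print(node_id, round(score, 3))
```

Only the retrieved nodes are passed to the LLM, which fits the lower `get_response` token usage logged for this query.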

In [17]:
response = query_engine.query(
    "What did the author do during his time in art school?",
)

INFO:llama_index.query_engine.router_query_engine:Selecting query engine 0 because: This choice is relevant because it involves retrieving specific context from documents, which is needed to answer the question about the author's activities in art school..
Selecting query engine 0 because: This choice is relevant because it involves retrieving specific context from documents, which is needed to answer the question about the author's activities in art school..
INFO:llama_index.token_counter.token_counter:> [retrieve] Total LLM token usage: 0 tokens
> [retrieve] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [retrieve] Total embedding token usage: 12 tokens
> [retrieve] Total embedding token usage: 12 tokens
INFO:llama_index.token_counter.token_counter:> [get_response] Total LLM token usage: 1883 tokens
> [get_response] Total LLM token usage: 1883 tokens
INFO:llama_index.token_counter.token_counter:> [get_response] Total embedding token usage: 0 tokens
> [

In [19]:
display_response(response)

**`Final Response:`** During his time in art school, the author attended the Accademia di Belli Arti in Florence, took painting classes, and painted still lives in his bedroom at night. He also spent time observing and learning from his fellow students and faculty, although he found the teaching and learning arrangement at the school to be somewhat lacking.

---

**`Source Node 1/2`**

**Document ID:** bb833522-2dc0-4d3d-92eb-b3d7dc08e5e8<br>**Similarity:** 0.859983360121018<br>**Text:** make enough to survive. And as an artist you could be truly independent. You wouldn't have a boss...<br>

---

**`Source Node 2/2`**

**Document ID:** 516a0e65-7ec0-4cce-bd20-ba31db5ef326<br>**Similarity:** 0.8319605972897547<br>**Text:** fall. This was now only weeks away. My nice landlady let me leave my stuff in her attic. I had so...<br>

{'bb833522-2dc0-4d3d-92eb-b3d7dc08e5e8': None,
 '516a0e65-7ec0-4cce-bd20-ba31db5ef326': None}