# Hands-on 1: Ingestion and Chunking

### Problem
Given Paul Graham’s essay, build a sophisticated chat engine that can handle this target conversation:

- Primary query: “What did Paul Graham do in the summer of 1995?”
- Follow-up query: “Why did he organize it?”

### Suggested tasks
- 🔍 Use the previous QueryFusionRetriever
- 🔤 Create a ChatMemoryBuffer
- 🚀 Create a CondensePlusContextChatEngine
- 📊 Validate the result!
- 🌐 [+] Create a Chat prompt and a QueryGeneration prompt

## Code

In [None]:
# Install the dependencies and download the essay (needed on Colab or in a fresh environment)
%pip install httpx==0.27.2
%pip install llama-index==0.12.3
%pip install "llama-index-retrievers-bm25>=0.5.0"
%pip install openai==1.57.0
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'

In [None]:
# load .env
# from dotenv import load_dotenv
# load_dotenv()
import os
os.environ["OPENAI_API_KEY"] = "sk-..."


In [3]:
import nest_asyncio

nest_asyncio.apply()

In [4]:

from llama_index.core import VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter
from rich import print as rprint
from llama_index.retrievers.bm25 import BM25Retriever
from llama_index.core.retrievers import QueryFusionRetriever
from llama_index.core.memory import ChatMemoryBuffer
from llama_index.core import SimpleDirectoryReader
from llama_index.core.retrievers.fusion_retriever import FUSION_MODES
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.chat_engine.condense_plus_context import CondensePlusContextChatEngine
from llama_index.core.base.base_retriever import BaseRetriever
from llama_index.core.base.base_query_engine import BaseQueryEngine


In [5]:
documents = SimpleDirectoryReader("./data/paul_graham").load_data()

splitter = SentenceSplitter(chunk_size=256)

index = VectorStoreIndex.from_documents(
    documents, transformations=[splitter], show_progress=True
)

Parsing nodes: 100%|██████████| 1/1 [00:00<00:00, 21.72it/s]
Generating embeddings: 100%|██████████| 587/587 [00:09<00:00, 60.57it/s]


In [6]:
QUERY_GEN_PROMPT="""You are a helpful assistant that generates multiple search queries based on a single input query. Generate {num_queries} search queries, one on each line, related to the following input query:
Query: {query}
Queries:"""
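
To see what the LLM actually receives, the template can be rendered by hand. The sketch below assumes that the retriever fills `{num_queries}` with one less than the configured `num_queries`, counting the original query toward the total; that matches the logged output in this notebook (two generated queries for `num_queries=3`), but treat it as an assumption. `render_query_gen_prompt` is a hypothetical helper, not part of LlamaIndex.

```python
# Same template as in the notebook, repeated so the snippet is self-contained.
QUERY_GEN_PROMPT = """You are a helpful assistant that generates multiple search queries based on a single input query. Generate {num_queries} search queries, one on each line, related to the following input query:
Query: {query}
Queries:"""


def render_query_gen_prompt(template: str, query: str, num_queries: int) -> str:
    """Hypothetical helper: fill the template the way the retriever is assumed to."""
    # Assumption: the original query counts as one of the num_queries,
    # so only num_queries - 1 new queries are requested from the LLM.
    return template.format(num_queries=num_queries - 1, query=query)


rendered = render_query_gen_prompt(
    QUERY_GEN_PROMPT,
    query="What did Paul Graham do in the summer of 1995?",
    num_queries=3,
)
print(rendered)
```

Printing the rendered prompt is a quick way to check that a customized template still contains both placeholders before passing it to the retriever.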

In [7]:
def init_query_fusion_retriever(
    index: VectorStoreIndex,
    similarity_top_k: int,
    mode: FUSION_MODES,
    num_queries: int = 3,
) -> QueryFusionRetriever:
    """Combine a dense (vector) and a sparse (BM25) retriever behind one fusion retriever."""
    # Dense retriever: embedding similarity over the index
    vector_retriever = index.as_retriever(similarity_top_k=similarity_top_k)
    # Sparse retriever: BM25 keyword matching over the same nodes
    bm25_retriever = BM25Retriever.from_defaults(
        docstore=index.docstore, similarity_top_k=similarity_top_k
    )
    retriever = QueryFusionRetriever(
        [vector_retriever, bm25_retriever],
        similarity_top_k=similarity_top_k,
        num_queries=num_queries,  # total queries, including the original one
        mode=mode,
        use_async=True,
        verbose=True,
        query_gen_prompt=QUERY_GEN_PROMPT,
    )
    return retriever
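
The `RECIPROCAL_RANK` mode used later merges the ranked lists from both retrievers. As a rough illustration of the idea (not LlamaIndex's actual implementation), the sketch below scores each chunk by summing `1 / (k + rank)` over every list it appears in. The constant `k=60`, the 1-based ranks, and the chunk ids are assumptions chosen for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60.0, top_k=5):
    """Illustrative RRF: fuse several ranked lists of ids into one ranking."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        # A chunk ranked high in any list gets a larger contribution.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, best first, and keep the top_k entries.
    fused = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return fused[:top_k]


# Hypothetical results from the vector and BM25 retrievers for one query.
vector_hits = ["chunk_a", "chunk_b", "chunk_c"]
bm25_hits = ["chunk_b", "chunk_d", "chunk_a"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits], top_k=3)
for doc_id, score in fused:
    print(f"{doc_id}: {score:.4f}")
```

Chunks found by both retrievers (`chunk_a`, `chunk_b`) end up ahead of chunks found by only one, which is the reason for combining dense and sparse retrieval in the first place.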

In [8]:
CHAT_PROMPT = """You're a chatbot, able to have normal conversations, about an essay discussing Paul Graham's life.
Here are the relevant documents for the context:
{context_str}
Instruction: use the previous chat history, or the context above, to interact with the user. Use only the information provided in the context to generate responses."""

In [9]:
def init_condense_plus_context_chat_engine(retriever: BaseRetriever, query_engine: BaseQueryEngine) -> CondensePlusContextChatEngine:

    memory = ChatMemoryBuffer.from_defaults(token_limit=3000)

    chat_engine = CondensePlusContextChatEngine.from_defaults(
        retriever=retriever,
        query_engine=query_engine,
        memory=memory,
        context_prompt=CHAT_PROMPT,
    )
    return chat_engine
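
The `token_limit=3000` argument caps how much chat history is sent back to the model on each turn. A minimal sketch of that idea, assuming the buffer keeps the most recent messages that fit the budget: `trim_history` is a hypothetical helper, and whitespace-separated words stand in for real tokens, so the numbers are only illustrative.

```python
def trim_history(messages, token_limit):
    """Keep the newest (role, text) messages whose combined size fits token_limit."""
    kept = []
    used = 0
    # Walk from newest to oldest and stop at the first message that does not fit.
    for role, text in reversed(messages):
        cost = len(text.split())  # crude stand-in for a tokenizer
        if used + cost > token_limit:
            break
        kept.append((role, text))
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = [
    ("user", "What did Paul Graham do in the summer of 1995?"),
    ("assistant", "(first answer, shortened for the example)"),
    ("user", "Why did he organize it?"),
]

trimmed = trim_history(history, token_limit=12)
for role, text in trimmed:
    print(f"{role}: {text}")
```

With a small budget the oldest turn is dropped first, which is why a very low `token_limit` can make follow-up questions such as "Why did he organize it?" lose their referent.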


In [10]:
retriever = init_query_fusion_retriever(index, similarity_top_k=5, mode=FUSION_MODES.RECIPROCAL_RANK, num_queries=3)
query_engine = RetrieverQueryEngine(retriever)
chat_engine = init_condense_plus_context_chat_engine(retriever=retriever, query_engine=query_engine)

In [11]:

chat_engine.chat("What did Paul Graham do in the summer of 1995?")

Generated queries:
1. What projects did Paul Graham work on in the summer of 1995?
2. How did Paul Graham's experiences in the summer of 1995 influence his career?


AgentChatResponse(response='In the summer of 1995, Paul Graham and his team came up with something called the Summer Founders Program. They used a building in Cambridge as their headquarters and invited undergraduates to apply for the program. It was a way for them to practice being investors and for the students to have a more interesting summer than working at Microsoft.', sources=[ToolOutput(content='[NodeWithScore(node=TextNode(id_=\'4a132594-a960-4e7b-aeb3-e81a9787ae00\', embedding=None, metadata={\'file_path\': \'c:\\\\Users\\\\n.fretti\\\\Desktop\\\\projects\\\\rag_and_roll\\\\lesson_02\\\\data\\\\paul_graham\\\\paul_graham_essay.txt\', \'file_name\': \'paul_graham_essay.txt\', \'file_type\': \'text/plain\', \'file_size\': 75042, \'creation_date\': \'2024-12-04\', \'last_modified_date\': \'2024-12-04\'}, excluded_embed_metadata_keys=[\'file_name\', \'file_type\', \'file_size\', \'creation_date\', \'last_modified_date\', \'last_accessed_date\'], excluded_llm_metadata_keys=[\'file
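
The raw `AgentChatResponse` repr is hard to read. For validating results, a small formatter can help. The sketch below assumes the response exposes `.response` and `.source_nodes`, and that each source has `.score` and `.node.get_content()`; `summarize_response` is a hypothetical helper, and the stub object only stands in for a real response so the snippet runs without an API key.

```python
from types import SimpleNamespace


def summarize_response(response, max_chars=80):
    """Return the answer plus one line per retrieved source chunk."""
    lines = [f"Answer: {response.response}"]
    for i, hit in enumerate(response.source_nodes, start=1):
        # Show only the start of each chunk, flattened onto one line.
        snippet = hit.node.get_content()[:max_chars].replace("\n", " ")
        score = "n/a" if hit.score is None else f"{hit.score:.3f}"
        lines.append(f"[{i}] score={score} {snippet}")
    return "\n".join(lines)


# Stub with the same attribute names, used here instead of a live chat engine.
stub_response = SimpleNamespace(
    response="stub answer",
    source_nodes=[
        SimpleNamespace(
            score=0.5,
            node=SimpleNamespace(get_content=lambda: "stub chunk text"),
        )
    ],
)

summary = summarize_response(stub_response)
print(summary)
```

In the notebook, the same helper could be applied to the value returned by `chat_engine.chat(...)` to check which chunks supported each answer.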

In [12]:
chat_engine.chat("Why did he organize it?")

Generated queries:
1. What were the goals of the Summer Founders Program organized by Paul Graham in 1995?
2. How did the Summer Founders Program impact the startup ecosystem in 1995 and beyond?


AgentChatResponse(response='Paul Graham organized the Summer Founders Program to gain experience as investors. They wanted to fund a variety of startups at once and thought that organizing a summer program where undergraduates would start startups instead of taking temporary jobs at tech companies would be a great way to do so. It was a way for them to practice being investors while providing the students with a more engaging summer experience.', sources=[ToolOutput(content='[NodeWithScore(node=TextNode(id_=\'2d5033e8-6fc8-46f8-b715-5def47165a6e\', embedding=None, metadata={\'file_path\': \'c:\\\\Users\\\\n.fretti\\\\Desktop\\\\projects\\\\rag_and_roll\\\\lesson_02\\\\data\\\\paul_graham\\\\paul_graham_essay.txt\', \'file_name\': \'paul_graham_essay.txt\', \'file_type\': \'text/plain\', \'file_size\': 75042, \'creation_date\': \'2024-12-04\', \'last_modified_date\': \'2024-12-04\'}, excluded_embed_metadata_keys=[\'file_name\', \'file_type\', \'file_size\', \'creation_date\', \'last_mod

In [13]:
chat_engine.reset()

In [14]:
chat_engine.chat("What did I ask you?")

Generated queries:
1. How can I retrieve my previous questions to you?
2. Can you provide a summary of the recent queries I have made to you?


AgentChatResponse(response="You asked me about the content of an essay discussing Paul Graham's life. Would you like to know more details about it?", sources=[ToolOutput(content='[NodeWithScore(node=TextNode(id_=\'3c89b872-e28a-4297-9ea2-67dfc31039ee\', embedding=None, metadata={\'file_path\': \'c:\\\\Users\\\\n.fretti\\\\Desktop\\\\projects\\\\rag_and_roll\\\\lesson_02\\\\data\\\\paul_graham\\\\paul_graham_essay.txt\', \'file_name\': \'paul_graham_essay.txt\', \'file_type\': \'text/plain\', \'file_size\': 75042, \'creation_date\': \'2024-12-04\', \'last_modified_date\': \'2024-12-04\'}, excluded_embed_metadata_keys=[\'file_name\', \'file_type\', \'file_size\', \'creation_date\', \'last_modified_date\', \'last_accessed_date\'], excluded_llm_metadata_keys=[\'file_name\', \'file_type\', \'file_size\', \'creation_date\', \'last_modified_date\', \'last_accessed_date\'], relationships={<NodeRelationship.SOURCE: \'1\'>: RelatedNodeInfo(node_id=\'299e6d08-2fe7-48d1-9e37-cca5787b31e8\', node_t

Try asking more questions, or experiment with QUERY_GEN_PROMPT and CHAT_PROMPT, and see how the answers change.