# Hands-on 1: Ingestion and Chunking

### Problem
Given Paul Graham’s essay, build a sophisticated chat engine that can handle this target conversation:

- Primary query: “What did Paul Graham do in the summer of 1995?”
- Follow-up query: “Why did he organize it?”

### Suggested tasks
- 🔍 Use the previous QueryFusionRetriever
- 🔤 Create a ChatMemoryBuffer
- 🚀 Create a CondensePlusContextChatEngine
- 📊 Validate the result!
- 🌐 [+] Create a Chat & QueryGeneration prompt

## Code

In [None]:
# Run this cell only when working on Colab: it installs the dependencies and downloads the essay
%pip install httpx==0.27.2
%pip install llama-index==0.12.3
%pip install "llama-index-retrievers-bm25>=0.5.0"
%pip install openai==1.57.0
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'

In [1]:
# load .env
# from dotenv import load_dotenv
# load_dotenv()
import os
os.environ["OPENAI_API_KEY"] = "sk-..."

In [2]:
import nest_asyncio

nest_asyncio.apply()

In [None]:

from llama_index.core import VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter
from rich import print as rprint
from llama_index.retrievers.bm25 import BM25Retriever
from llama_index.core.retrievers import QueryFusionRetriever
from llama_index.core.memory import ChatMemoryBuffer
from llama_index.core import SimpleDirectoryReader
from llama_index.core.retrievers.fusion_retriever import FUSION_MODES
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.chat_engine.condense_plus_context import CondensePlusContextChatEngine
from llama_index.core.base.base_retriever import BaseRetriever
from llama_index.core.base.base_query_engine import BaseQueryEngine

In [None]:
documents = SimpleDirectoryReader("./data/paul_graham").load_data()

splitter = SentenceSplitter(chunk_size=256)

index = VectorStoreIndex.from_documents(
    documents, transformations=[splitter], show_progress=True
)

In [5]:
QUERY_GEN_PROMPT = """You are a helpful assistant that generates multiple search queries based on a single input query. Generate {num_queries} search queries, one on each line, related to the following input query:
Query: {query}
Queries:"""
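
To make the template concrete, the sketch below fills its two placeholders with `str.format` and splits a reply into individual queries. The `parse_queries` helper is hypothetical and written only for illustration; it is not part of LlamaIndex. The sample reply is hand-written, not real model output.

```python
QUERY_GEN_PROMPT = """You are a helpful assistant that generates multiple search queries based on a single input query. Generate {num_queries} search queries, one on each line, related to the following input query:
Query: {query}
Queries:"""


def parse_queries(reply: str, num_queries: int) -> list[str]:
    """Hypothetical helper: keep at most num_queries non-empty lines of a reply."""
    lines = [line.strip() for line in reply.splitlines()]
    return [line for line in lines if line][:num_queries]


# Fill both placeholders of the template.
prompt = QUERY_GEN_PROMPT.format(
    num_queries=2, query="What did Paul Graham do in the summer of 1995?"
)
print(prompt)

# Hand-written stand-in for a model reply (an assumption, not real output).
sample_reply = "Paul Graham summer 1995 activities\n\nPaul Graham 1995 projects\n"
queries = parse_queries(sample_reply, num_queries=2)
print(queries)
```

Each generated query is then sent to every underlying retriever, which is why the number of queries directly multiplies the retrieval work.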

In [6]:
def init_query_fusion_retriever(index: VectorStoreIndex, similarity_top_k: int, mode: FUSION_MODES, num_queries: int = 3) -> QueryFusionRetriever:
    # TODO: write your implementation here
    # create a vector retriever
    # create a bm25 retriever
    # create a QueryFusionRetriever with the vector and bm25 retrievers
    raise NotImplementedError("QueryFusionRetriever is not implemented yet")
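
Before filling in the stub, it may help to see what reciprocal rank fusion does with the two result lists. The following is a minimal plain-Python sketch of the idea, not the LlamaIndex implementation: the function name, the chunk ids, and the constant `k = 60` (a commonly used value) are all assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: float = 60.0
) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k = 60 is an assumed, commonly used constant.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


vector_hits = ["chunk_a", "chunk_b", "chunk_c"]  # hypothetical vector-retriever order
bm25_hits = ["chunk_b", "chunk_d", "chunk_a"]    # hypothetical BM25 order

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

A chunk that both retrievers rank highly (`chunk_b` here) ends up ahead of a chunk that only one retriever returns, which is the reason for combining a vector retriever with BM25 in the first place.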

In [7]:
CHAT_PROMPT = """You're a chatbot able to hold normal conversations about an essay discussing Paul Graham's life.
Here are the relevant documents for the context:
{context_str}
Instruction: use the previous chat history, or the context above, to interact with the user. Use only the information provided in the context to generate responses."""

In [8]:
def init_condense_plus_context_chat_engine(retriever: BaseRetriever, query_engine: BaseQueryEngine) -> CondensePlusContextChatEngine:
    # TODO: write your implementation here
    # create a ChatMemoryBuffer
    # create a CondensePlusContextChatEngine with the retriever, query_engine, memory
    raise NotImplementedError("CondensePlusContextChatEngine is not implemented yet")
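
The memory buffer is what lets the engine answer a follow-up and what `reset()` later clears. The toy class below sketches that behaviour under stated assumptions: it is not the LlamaIndex `ChatMemoryBuffer`, the names `Message` and `TinyMemoryBuffer` are hypothetical, and tokens are approximated by whitespace-separated words.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    role: str      # "user" or "assistant"
    content: str


@dataclass
class TinyMemoryBuffer:
    """Toy stand-in for a chat memory buffer (not the LlamaIndex class)."""

    token_limit: int
    messages: list[Message] = field(default_factory=list)

    @staticmethod
    def count_tokens(text: str) -> int:
        # Crude assumption: one token per whitespace-separated word.
        return len(text.split())

    def put(self, message: Message) -> None:
        self.messages.append(message)

    def get(self) -> list[Message]:
        """Return the most recent messages whose total size fits the limit."""
        kept: list[Message] = []
        used = 0
        for message in reversed(self.messages):
            cost = self.count_tokens(message.content)
            if used + cost > self.token_limit:
                break
            kept.append(message)
            used += cost
        return list(reversed(kept))

    def reset(self) -> None:
        self.messages.clear()


memory = TinyMemoryBuffer(token_limit=12)
memory.put(Message("user", "What did Paul Graham do in the summer of 1995?"))
memory.put(Message("assistant", "placeholder answer"))
memory.put(Message("user", "Why did he organize it?"))

# The oldest message no longer fits in the 12-"token" window.
window = memory.get()
print([m.content for m in window])

memory.reset()
print(memory.get())
```

With a small limit the oldest turn drops out of the window, so the token limit chosen for the real buffer decides how far back a follow-up question can reach.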

In [9]:
retriever = init_query_fusion_retriever(index, similarity_top_k=5, mode=FUSION_MODES.RECIPROCAL_RANK, num_queries=3)
query_engine = RetrieverQueryEngine(retriever)
chat_engine = init_condense_plus_context_chat_engine(retriever=retriever, query_engine=query_engine)

In [None]:

chat_engine.chat("What did Paul Graham do in the summer of 1995?")

In [None]:
chat_engine.chat("Why did he organize it?")
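
The follow-up has no referent on its own: "he" and "it" only make sense next to the first turn. A condense-plus-context engine deals with this by first asking the LLM to rewrite the follow-up as a standalone question and then retrieving with the rewritten question. The sketch below only builds such a condense prompt; the template text and the `build_condense_prompt` helper are hypothetical, and the prompt LlamaIndex actually uses may differ.

```python
# Hypothetical template; the prompt LlamaIndex actually uses may differ.
CONDENSE_TEMPLATE = """Given the conversation below and a follow-up message, rewrite the follow-up as a standalone question.

Chat history:
{chat_history}

Follow-up: {question}
Standalone question:"""


def build_condense_prompt(history: list[tuple[str, str]], question: str) -> str:
    """Render (role, text) pairs plus the follow-up into one prompt string."""
    rendered = "\n".join(f"{role}: {text}" for role, text in history)
    return CONDENSE_TEMPLATE.format(chat_history=rendered, question=question)


history = [
    ("user", "What did Paul Graham do in the summer of 1995?"),
    ("assistant", "<first answer goes here>"),  # placeholder, not real output
]
prompt = build_condense_prompt(history, "Why did he organize it?")
print(prompt)
```

When validating the exercise, checking the rewritten question is a quick way to tell whether a poor answer comes from the condense step or from retrieval.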

In [12]:
chat_engine.reset()

In [None]:
chat_engine.chat("What did I ask you?")

Try asking more questions, or plug in QUERY_GEN_PROMPT and CHAT_PROMPT and see how the answers change.