<a href="https://colab.research.google.com/github/towardsai/ragbook-notebooks/blob/main/notebooks/Chapter%2008%20-%20Mastering_Advanced_RAG.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [18]:
!pip install -q llama-index==0.12.43 deeplake==4.2.10 openai==1.92.0 cohere==5.15.0 llama-index-vector-stores-deeplake==0.3.3 llama-index-llms-openai==0.4.7 llama-index-postprocessor-cohere-rerank==0.4.0 jedi==0.19.2

In [19]:
import os

# os.environ['OPENAI_API_KEY'] = '<YOUR_OPENAI_API_KEY>'
# os.environ['ACTIVELOOP_TOKEN'] = '<YOUR_ACTIVELOOP_API_KEY>'
# os.environ['COHERE_API_KEY'] = '<YOUR_COHERE_API_KEY>'

from google.colab import userdata
os.environ['OPENAI_API_KEY'] = userdata.get('OPENAI_API_KEY')
os.environ['ACTIVELOOP_TOKEN'] = userdata.get('ACTIVELOOP_TOKEN')
os.environ['COHERE_API_KEY'] = userdata.get('COHERE_API_KEY')

In [20]:
!mkdir -p './paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O './paul_graham/paul_graham_essay.txt'

--2026-01-25 02:40:18--  https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.110.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2026-01-25 02:40:19 ERROR 404: Not Found.



In [21]:
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

# Configure global settings (Settings replaces ServiceContext)
Settings.llm = OpenAI(model="gpt-4.1-mini", temperature=0.0)
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
Settings.chunk_size = 512
Settings.chunk_overlap = 50

In [22]:
from llama_index.core import SimpleDirectoryReader

# load documents
documents = SimpleDirectoryReader("./paul_graham").load_data()
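
The download log above ends in `404 Not Found`, and `wget -O` may still leave an empty file at the target path, so the reader can load a document with no text. The following is a minimal sketch of a sanity check to run before indexing. The helper name `essay_is_usable` and the `min_chars` threshold are hypothetical choices for illustration, not part of LlamaIndex.

```python
from pathlib import Path


def essay_is_usable(path, min_chars=1000):
    """Return True if `path` is a file holding at least `min_chars` characters of text."""
    p = Path(path)
    if not p.is_file():
        return False
    # Strip whitespace so a file of blank lines does not count as content.
    text = p.read_text(encoding="utf-8", errors="replace").strip()
    return len(text) >= min_chars


# Same path as the wget command above.
essay_path = "./paul_graham/paul_graham_essay.txt"
if not essay_is_usable(essay_path):
    print(f"Warning: {essay_path} is missing or nearly empty; check the download URL before indexing.")
```

If the warning prints, answers from the later cells cannot be grounded in the essay.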

In [23]:
from llama_index.core import Settings

# Override the chunk overlap set earlier (50 -> 64) before parsing the nodes
Settings.chunk_size = 512
Settings.chunk_overlap = 64

node_parser = Settings.node_parser

nodes = node_parser.get_nodes_from_documents(documents)

In [24]:
from llama_index.vector_stores.deeplake import DeepLakeVectorStore

my_activeloop_org_id = "" # TODO: use your organization id here
my_activeloop_dataset_name = "LlamaIndex_paulgraham_essay"
dataset_path = f"al://{my_activeloop_org_id}/{my_activeloop_dataset_name}" # Prefix corrected from hub:// to al://

# Create the Deep Lake vector store that will back the index over the documents
vector_store = DeepLakeVectorStore(dataset_path=dataset_path, overwrite=True)

In [25]:
from llama_index.core import StorageContext

storage_context = StorageContext.from_defaults(vector_store=vector_store)
storage_context.docstore.add_documents(nodes)

In [26]:
from llama_index.core import VectorStoreIndex

vector_index = VectorStoreIndex(nodes, storage_context=storage_context)

In [27]:
query_engine = vector_index.as_query_engine(streaming=True, similarity_top_k=10)

In [28]:
streaming_response = query_engine.query(
    "What does Paul Graham do?",
)
streaming_response.print_response_stream()

Paul Graham is an essayist.

# SubQuestion Query Engine

In [29]:
query_engine = vector_index.as_query_engine(similarity_top_k=10)

In [30]:
from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.core.query_engine import SubQuestionQueryEngine

query_engine_tools = [
    QueryEngineTool(
        query_engine=query_engine,
        metadata=ToolMetadata(
            name="pg_essay",
            description="Paul Graham essay on What I Worked On",
        ),
    ),
]

query_engine = SubQuestionQueryEngine.from_defaults(
    query_engine_tools=query_engine_tools,
    use_async=True,
)


In [31]:
response = query_engine.query(
    "How was Paul Graham's life different before, during, and after YC?"
)

Generated 3 sub questions.
[pg_essay] Q: What details does Paul Graham provide about his life before starting Y Combinator in his essay?
[pg_essay] Q: What experiences and challenges does Paul Graham describe during the time he was involved with Y Combinator?
[pg_essay] Q: How does Paul Graham describe his life and activities after his involvement with Y Combinator?
[pg_essay] A: Paul Graham describes his life after his involvement with Y Combinator as focused on writing essays and thinking deeply about startups and related topics. He shifted from active startup involvement to sharing his insights and experiences through his essays, which have influenced many in the tech and startup communities.
[pg_essay] A: Paul Graham describes his background before starting Y Combinator by discussing his experiences as a programmer and writer. He mentions his work on various

In [32]:
print( ">>> The final response:\n", response )

>>> The final response:
 Before starting Y Combinator, Paul Graham was primarily engaged as a programmer and writer, working on programming languages and software development while cultivating an interest in startups and technology. During his time with Y Combinator, his life involved intense efforts in selecting and supporting startups, mentoring founders, and continuously adapting the program to better meet entrepreneurs' needs, balancing hands-on guidance with allowing startups independence. After his involvement with Y Combinator, he shifted his focus to writing essays and reflecting deeply on startups and related topics, sharing his insights with the broader tech and startup communities.
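
The log above shows the pattern at work: the question is split into sub-questions, each is sent to a named tool, and the answers are combined into one response. The toy sketch below mirrors that flow with plain callables standing in for the LLM and the query engine. The function `run_sub_questions`, the stub tool, and the joining step are all hypothetical and are not LlamaIndex's implementation.

```python
from typing import Callable, Dict, List, Tuple


def run_sub_questions(
    sub_questions: List[Tuple[str, str]],
    tools: Dict[str, Callable[[str], str]],
    synthesize: Callable[[List[Tuple[str, str]]], str],
) -> str:
    """Answer each (tool_name, question) pair with its tool, then combine the answers."""
    qa_pairs = []
    for tool_name, question in sub_questions:
        answer = tools[tool_name](question)  # route the sub-question to the named tool
        qa_pairs.append((question, answer))
    return synthesize(qa_pairs)


# Stand-in tool: a real setup would call a query engine here.
tools = {"pg_essay": lambda q: f"[answer to: {q}]"}

# Stand-in decomposition: a real setup would ask an LLM to generate these.
sub_questions = [
    ("pg_essay", "What was his life like before YC?"),
    ("pg_essay", "What was his life like during YC?"),
    ("pg_essay", "What was his life like after YC?"),
]

final = run_sub_questions(
    sub_questions,
    tools,
    synthesize=lambda pairs: "\n".join(f"{q} -> {a}" for q, a in pairs),
)
print(final)
```

In the notebook, the decomposition and the final synthesis are both LLM calls, and `use_async=True` lets the sub-questions run concurrently rather than in a loop.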


# Cohere Rerank

In [33]:
import cohere

# Get a Cohere API key at www.cohere.com
co = cohere.Client(os.environ['COHERE_API_KEY'])

# Example query and passages
query = "What is the capital of the United States?"
documents = [
   "Carson City is the capital city of the American state of Nevada. At the  2010 United States Census, Carson City had a population of 55,274.",
   "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean that are a political division controlled by the United States. Its capital is Saipan.",
   "Charlotte Amalie is the capital and largest city of the United States Virgin Islands. It has about 20,000 people. The city is on the island of Saint Thomas.",
   "Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district. ",
   "Capital punishment (the death penalty) has existed in the United States since before the United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states.",
   "North Dakota is a state in the United States. 672,591 people lived in North Dakota in the year 2010. The capital and seat of government is Bismarck."
   ]

In [34]:
# top_n sets how many results are returned; if it is omitted, all results are returned.
results = co.rerank(
    query=query,
    documents=documents,
    top_n=3,
    model='rerank-english-v3.0',
)


for idx, r in enumerate(results.results):
    print(f"Document Rank: {idx + 1}, Document Index: {r.index}")
    print(f"Relevance Score: {r.relevance_score:.5f}")
    print("\n")


Document Rank: 1, Document Index: 3
Relevance Score: 0.99907


Document Rank: 2, Document Index: 4
Relevance Score: 0.77798


Document Rank: 3, Document Index: 1
Relevance Score: 0.08882
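
Each rerank result carries an index into the original `documents` list rather than the text itself. The sketch below maps indices back to passages and applies an optional score cutoff. It hard-codes the indices and scores printed above so it runs without an API call. The helper `top_passages`, the `min_score` parameter, and the shortened passages (first sentence of each original) are assumptions for illustration.

```python
# First sentence of each of the six passages, in the same order as `documents`.
passages = [
    "Carson City is the capital city of the American state of Nevada.",
    "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean that are a political division controlled by the United States.",
    "Charlotte Amalie is the capital and largest city of the United States Virgin Islands.",
    "Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States.",
    "Capital punishment (the death penalty) has existed in the United States since before the United States was a country.",
    "North Dakota is a state in the United States.",
]

# (index, relevance_score) pairs copied from the printed output above.
ranked = [(3, 0.99907), (4, 0.77798), (1, 0.08882)]


def top_passages(passages, ranked, min_score=0.0):
    """Pair each ranked index with its passage, dropping scores below min_score."""
    return [(passages[i], score) for i, score in ranked if score >= min_score]


for text, score in top_passages(passages, ranked, min_score=0.5):
    print(f"{score:.5f}  {text}")
```

With a cutoff of 0.5 the capital-punishment passage (score 0.77798) still passes, so a threshold alone does not guarantee that every kept passage answers the query.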




# Cohere in LlamaIndex

In [35]:
import os
from llama_index.postprocessor.cohere_rerank import CohereRerank

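# top_n is the number of nodes kept after reranking. With top_n equal to
# similarity_top_k (10 below), the reranker only reorders the retrieved nodes;
# use a smaller top_n to also filter them.
cohere_rerank = CohereRerank(api_key=os.environ['COHERE_API_KEY'], top_n=10)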

In [36]:
query_engine = vector_index.as_query_engine(
    similarity_top_k=10,
    node_postprocessors=[cohere_rerank],
)

In [37]:
response = query_engine.query(
    "What did Sam Altman do in this essay?",
)
print( response )

The essay discusses Sam Altman's role as a key figure in the startup and technology world, highlighting his leadership and influence in fostering innovation and supporting new ventures. It portrays him as someone who has contributed significantly to the growth and development of startups, often emphasizing his ability to identify promising ideas and help turn them into successful companies.
