# CONTEXTUAL RETRIEVAL WITH LLAMA_INDEX

This notebook covers contextual retrieval with the llama_index `DocumentContextExtractor`.

Based on an Anthropic [blog post](https://www.anthropic.com/news/contextual-retrieval), the concept is to:
1. Use an LLM to generate a 'context' for each chunk based on the entire document
2. Embed the chunk + context together
3. Reap the benefits of higher RAG accuracy

While you can also do this manually, the DocumentContextExtractor offers a lot of convenience and error handling, plus you can integrate it into your llama_index pipelines! Let's get started.
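
For a sense of what "doing this manually" involves, here is a minimal sketch of the three steps in plain Python. It is not how `DocumentContextExtractor` is implemented. The prompt wording is adapted from the Anthropic blog post, and `contextualize_chunks`, `complete`, and `fake_llm` are hypothetical names introduced for illustration; `complete` stands in for whatever LLM call you would actually make.

```python
from typing import Callable, List

# Prompt wording adapted from the Anthropic blog post; illustrative only.
CONTEXT_PROMPT = (
    "<document>\n{document}\n</document>\n"
    "Here is the chunk we want to situate within the whole document\n"
    "<chunk>\n{chunk}\n</chunk>\n"
    "Please give a short succinct context to situate this chunk within the "
    "overall document for the purposes of improving search retrieval of the "
    "chunk. Answer only with the succinct context and nothing else."
)


def contextualize_chunks(
    document: str,
    chunks: List[str],
    complete: Callable[[str], str],
) -> List[str]:
    """Return one 'context + chunk' string per chunk, ready to be embedded."""
    contextualized = []
    for chunk in chunks:
        # Step 1: ask the LLM to situate this chunk within the whole document.
        prompt = CONTEXT_PROMPT.format(document=document, chunk=chunk)
        context = complete(prompt).strip()
        # Step 2: the text that gets embedded is context + chunk together.
        contextualized.append(f"{context}\n\n{chunk}")
    return contextualized


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; always returns the same context."""
    return "From an essay about starting Y Combinator."


example = contextualize_chunks(
    document="I met the Reddits before we even started Y Combinator. ...",
    chunks=["I met the Reddits before we even started Y Combinator."],
    complete=fake_llm,
)
print(example[0])
```

A real version also has to handle documents that exceed the model's context window, failed LLM calls, and retries, which is the convenience the extractor is meant to provide.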

# INSTALL PACKAGES

In [None]:
%pip install llama-index-readers-file
%pip install llama-index-embeddings-huggingface
%pip install llama-index-llms-openai
%pip install -e "C:\\Users\\cklap\\llama_index\\llama-index-core"



# SETUP AN LLM
You can use the `MockLLM` or a real LLM of your choice here. Gemini 2.0 Flash and gpt-4o-mini both work well.

In [None]:
from llama_index.llms.openai import OpenAI
from llama_index.core import Settings

OPENAI_API_KEY = ""
llm = OpenAI(model="gpt-4o-mini", api_key=OPENAI_API_KEY)
Settings.llm = llm

# Setup a data pipeline

We'll need an embedding model, an index store, a vector store, and a way to split text into token-sized chunks.

# Build Pipeline & Index

In [None]:
from llama_index.core import VectorStoreIndex, StorageContext, Settings
from llama_index.core.node_parser import TokenTextSplitter
from llama_index.core.storage.index_store.simple_index_store import (
    SimpleIndexStore,
)
from llama_index.core.vector_stores.simple import SimpleVectorStore
from llama_index.core.storage.docstore.simple_docstore import (
    SimpleDocumentStore,
)
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Initialize document store and embedding model
docstore = SimpleDocumentStore()
embed_model = HuggingFaceEmbedding(model_name="all-MiniLM-L6-v2")

# Create storage context
storage_context = StorageContext.from_defaults(docstore=docstore)

text_splitter = TokenTextSplitter(
    separator=" ", chunk_size=512, chunk_overlap=10
)
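
To make `chunk_size` and `chunk_overlap` concrete, here is a rough sketch of overlapping windows. It is not the `TokenTextSplitter` implementation: it treats whitespace-separated words as "tokens", whereas the real splitter counts tokens with a tokenizer. The function name `split_with_overlap` is hypothetical.

```python
from typing import List


def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Split text into windows of `chunk_size` words sharing `chunk_overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    tokens = text.split(" ")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start : start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the text
    return chunks


words = " ".join(f"w{i}" for i in range(10))
chunks = split_with_overlap(words, chunk_size=4, chunk_overlap=1)
for chunk in chunks:
    print(chunk)
```

With a chunk size of 4 and an overlap of 1, each window starts on the last word of the previous one. The notebook's settings (512 and 10) do the same thing at a larger scale.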

#### DocumentContextExtractor

In [None]:
# This is the new part!

from llama_index.core.extractors import DocumentContextExtractor

context_extractor = DocumentContextExtractor(
    # mandatory
    docstore=docstore,
    max_context_length=128000,
    # optional
    llm=llm,  # default to Settings.llm
    oversized_document_strategy="warn",
    max_output_tokens=100,
    key="context",
    prompt=DocumentContextExtractor.ORIGINAL_CONTEXT_PROMPT,
)

#### Build Index

In [None]:
import nest_asyncio

nest_asyncio.apply()

index = VectorStoreIndex.from_documents(
    documents=[],
    storage_context=storage_context,
    embed_model=embed_model,
    transformations=[text_splitter, context_extractor],
)

# Baseline index built without the context extractor, for comparison

index_nocontext = VectorStoreIndex.from_documents(
    documents=[],
    storage_context=storage_context,
    embed_model=embed_model,
    transformations=[text_splitter],
)






# LOAD DATA

In [None]:
!wget "https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt" -O "paul_graham_essay.txt"

In [None]:
from llama_index.core import SimpleDirectoryReader

reader = SimpleDirectoryReader(input_files=["paul_graham_essay.txt"])
documents = reader.load_data()

# Run the pipeline, then search

In [None]:
import nest_asyncio

nest_asyncio.apply()

# The docstore must be kept up to date for the DocumentContextExtractor to function.
# Every time we insert a doc, the entire pipeline runs and context is generated.
storage_context.docstore.add_documents(documents)
for doc in documents:
    index.insert(doc)
    index_nocontext.insert(doc)







In [None]:
# Verify all nodes have context
assert context_extractor.is_job_complete()

In [None]:
retriever = index.as_retriever(similarity_top_k=5)
nodes_fromcontext = retriever.retrieve("Who is Paul Graham.")

retriever_nocontext = index_nocontext.as_retriever(similarity_top_k=5)
nodes_nocontext = retriever_nocontext.retrieve("Who is Paul Graham.")
# Print each node's content
print("==========")
print("NO CONTEXT")
for i, node in enumerate(nodes_nocontext, 1):
    print(f"\nChunk {i}:")
    print(f"Score: {node.score}")  # Similarity score
    print(f"Content: {node.node.text}")  # The actual text content

# Print each node's content
print("==========")
print("WITH CONTEXT")
for i, node in enumerate(nodes_fromcontext, 1):
    print(f"\nChunk {i}:")
    print(f"Score: {node.score}")  # Similarity score
    print(f"Content: {node.node.text}")  # The actual text content

NO CONTEXT

Chunk 1:
Score: 0.1881885285336999
Content: I met the Reddits before we even started Y Combinator. In fact they were one of the reasons we started it.

YC grew out of a talk I gave to the Harvard Computer Society (the undergrad computer club) about how to start a startup. Everyone else in the audience was probably local, but Steve and Alexis came up on the train from the University of Virginia, where they were seniors. Since they'd come so far I agreed to meet them for coffee. They told me about the startup idea we'd later fund them to drop: a way to order fast food on your cellphone.

This was before smartphones. They'd have had to make deals with cell carriers and fast food chains just to get it launched. So it was not going to happen. It still doesn't exist, 19 years later. But I was impressed with their brains and their energy. In fact I was so impressed with them and some of the other people I met at that talk that I decided to start something to fund them. A few days 
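
The printouts above show only the chunk text, so it can be hard to tell whether a retrieved node carries generated context. Here is a small hedged helper that prints the context next to a preview of each hit. It assumes the extractor stores its output in node metadata under the configured `key` (`"context"` in this notebook). `describe_hits` and the `fake_hits` stand-ins are hypothetical and exist only so the sketch runs without an index.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List


def describe_hits(hits: Iterable[Any], key: str = "context", width: int = 80) -> List[str]:
    """Summarize retrieved hits as 'rank. score | context | text preview' lines."""
    lines = []
    for rank, hit in enumerate(hits, 1):
        # Scores may be missing on some results, so guard before formatting.
        score = "n/a" if hit.score is None else f"{hit.score:.3f}"
        context = hit.node.metadata.get(key) or "<no context>"
        preview = hit.node.text[:width].replace("\n", " ")
        lines.append(f"{rank}. score={score} | {context} | {preview}")
    return lines


# Stand-in objects shaped like retrieval results (assumption: .score, .node.text, .node.metadata).
fake_hits = [
    SimpleNamespace(
        score=0.42,
        node=SimpleNamespace(text="I met the Reddits...", metadata={"context": "How YC began."}),
    ),
    SimpleNamespace(
        score=0.19,
        node=SimpleNamespace(text="This was before smartphones.", metadata={}),
    ),
]

for line in describe_hits(fake_hits):
    print(line)
```

In the notebook, calling `describe_hits(nodes_fromcontext)` and `describe_hits(nodes_nocontext)` would give a side-by-side view of which results have context attached.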

In [None]:
# Save the index and vector store, because generating context can take time and money!

# for google drive support
# persist_dir = '/content/drive/MyDrive/your_project_folder'
persist_dir = "./"
storage_context.persist(persist_dir=persist_dir)