# 02. Advanced RAG with PDF
In this notebook, we will:
1. Download a sample PDF.
2. Ingest and split the text.
3. Store embeddings in Qdrant (Persistent).
4. Ask questions against the PDF using RAG.

In [1]:
import os
import requests
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_qdrant import QdrantVectorStore
from langchain_ollama import ChatOllama
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

# Ensure data directory exists
os.makedirs('data', exist_ok=True)
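
The later cells read `QDRANT_URL` and `OLLAMA_BASE_URL` with `os.environ.get`, which returns `None` when a variable is unset. Here is a minimal sketch of reading both with explicit fallbacks. The helper name `get_service_url` and the localhost defaults (the usual local ports for Qdrant and Ollama) are assumptions for illustration, not part of the original notebook.

```python
import os

# Assumed defaults: the usual local ports for Qdrant and Ollama.
# Adjust these to match your own deployment.
DEFAULT_URLS = {
    'QDRANT_URL': 'http://localhost:6333',
    'OLLAMA_BASE_URL': 'http://localhost:11434',
}

def get_service_url(name, environ=None):
    """Return the URL stored in `name`, or the assumed local default."""
    environ = os.environ if environ is None else environ
    value = environ.get(name, '').strip()
    if value:
        return value
    if name not in DEFAULT_URLS:
        raise KeyError(f'No value or default known for {name}')
    return DEFAULT_URLS[name]
```

With this in place, `get_service_url('QDRANT_URL')` could stand in for `os.environ.get('QDRANT_URL')` wherever a missing variable should fall back to a local service instead of passing `None` along.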

## 0. Cleanup Collection
Ensures we start fresh each run.

In [2]:
from qdrant_client import QdrantClient

try:
    client = QdrantClient(url=os.environ.get('QDRANT_URL'))
    client.delete_collection('pdf_rag')
    print('Collection cleared!')
except Exception as e:
    # Report the failure instead of silently swallowing it
    print(f'Could not clear collection: {e}')

Collection cleared!


## 0.1 Baseline (No RAG)
Let's see what the model knows about context engineering without any retrieved context.
*(This represents the "Before" state)*

In [3]:
# Baseline Query
from langchain_ollama import ChatOllama

# Initialize LLM for this cell
llm = ChatOllama(
    base_url=os.environ.get('OLLAMA_BASE_URL'),
    model='llama3'
)

#query = 'what is the best partitioning strategy for a lookup stage in datastage?'
query = 'what is context engineering'
print(f'Question: {query}')
print('Answer:')
print(llm.invoke(query).content)

Question: what is context engineering
Answer:
Context Engineering is a relatively new and rapidly evolving field that combines insights from psychology, sociology, computer science, and design to create personalized experiences for individuals. It's an approach that focuses on understanding the nuances of human behavior in different contexts to inform product development, marketing strategies, and user experience design.

The term "context" refers to the complex web of factors that influence a person's thoughts, feelings, and behaviors, including:

1. Physical environment: Where someone is, what they're doing, and who they're with.
2. Emotional state: Their mood, emotions, and stress levels at any given moment.
3. Social context: The relationships, roles, and cultural norms that shape their behavior.
4. Temporal context: The time of day, week, month, or year, which can impact people's routines, habits, and decisions.

Context Engineers analyze these contextual factors to identify patte

In [4]:
# 1. Download the Weaviate Context Engineering ebook PDF
pdf_url = 'https://8738733.fs1.hubspotusercontent-na1.net/hubfs/8738733/eBooks/Weaviate-Context-Engineering-ebook.pdf'
pdf_path = 'data/sample.pdf'

if not os.path.exists(pdf_path):
    print('Downloading PDF...')
    response = requests.get(pdf_url, timeout=60)
    response.raise_for_status()  # Fail fast rather than saving an error page as the PDF
    with open(pdf_path, 'wb') as f:
        f.write(response.content)
    print('PDF Downloaded')
else:
    print('PDF already exists')

PDF already exists
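
The download cell skips the request whenever `data/sample.pdf` exists, so a truncated file or a saved error page from an earlier run would be reused as-is. A small sanity check we could run before loading is sketched below. It relies only on the fact that PDF files normally begin with the bytes `%PDF-`; the helper name `looks_like_pdf` is hypothetical.

```python
def looks_like_pdf(path):
    """Return True if the file starts with the PDF magic bytes '%PDF-'."""
    try:
        with open(path, 'rb') as f:
            return f.read(5) == b'%PDF-'
    except OSError:
        # Missing or unreadable file: treat as not a PDF
        return False
```

If `looks_like_pdf(pdf_path)` returns `False`, deleting the file and re-running the download cell is probably the simplest recovery. This is only a header check; it does not prove the rest of the file is intact.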


In [5]:
# 2. Load and Split PDF
loader = PyPDFLoader(pdf_path)
pages = loader.load()
print(f'Loaded {len(pages)} pages')

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(pages)
print(f'Created {len(splits)} splits')

Loaded 23 pages
Created 83 splits
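
To build intuition for `chunk_size=1000` and `chunk_overlap=200`, here is a simplified sketch of overlapping chunking as a plain sliding window over characters. It is not how `RecursiveCharacterTextSplitter` works internally: the real splitter prefers to break on separators such as paragraphs, lines, and spaces, so its chunks are usually shorter than `chunk_size` and end at more natural boundaries. The function name `split_with_overlap` is hypothetical.

```python
def split_with_overlap(text, chunk_size=1000, chunk_overlap=200):
    """Split text into fixed-size character windows that overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError('chunk_overlap must be smaller than chunk_size')
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks
```

Each window starts 800 characters after the previous one, so the last 200 characters of one chunk reappear at the start of the next. That repetition is what lets a sentence that straddles a boundary show up whole in at least one chunk.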


In [6]:
# 3. Index in Qdrant (Persistent)
# We use a specific collection name 'pdf_rag'
embeddings = HuggingFaceEmbeddings(model_name='all-MiniLM-L6-v2')
url = os.environ.get('QDRANT_URL')

qdrant = QdrantVectorStore.from_documents(
    splits,
    embeddings,
    url=url,
    prefer_grpc=False,
    collection_name='pdf_rag',
    force_recreate=True  # Clean start for this tutorial
)
print('PDF Content Indexed!')

PDF Content Indexed!
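
Conceptually, the index stores one vector per chunk, and at query time the store ranks those vectors by similarity to the query vector. The sketch below shows that ranking with brute-force cosine similarity over toy 3-dimensional vectors. It is an illustration only: `all-MiniLM-L6-v2` produces 384-dimensional vectors, Qdrant uses an approximate index rather than a full scan, and the use of cosine distance here is an assumption about the collection's configuration. The names `cosine_similarity`, `top_k`, and `toy_index` are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)

def top_k(query_vec, indexed, k=3):
    """Return the k (score, text) pairs most similar to query_vec.

    `indexed` is a list of (chunk_text, vector) pairs.
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Toy data standing in for embedded PDF chunks
toy_index = [
    ('chunk about retrieval', [1.0, 0.0, 0.0]),
    ('chunk about agents', [0.0, 1.0, 0.0]),
    ('chunk about memory', [0.7, 0.7, 0.0]),
]
```

Calling `top_k([1.0, 0.1, 0.0], toy_index, k=2)` ranks the retrieval chunk first and the memory chunk second, because their vectors point in the direction closest to the query.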


In [7]:
# 4. Perform RAG
llm = ChatOllama(
    base_url=os.environ.get('OLLAMA_BASE_URL'),
    model='llama3'
)

retriever = qdrant.as_retriever(search_kwargs={'k': 3})
template = """Answer the question based only on the following context:
{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)

chain = (
    {'context': retriever, 'question': RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

#query = 'What did the president say about Ukraine?'
query = 'what is context engineering, and ground your answers with topics tied to the knowledge base,documents, page numbers,  cite them appropriately'
print(f'Question: {query}')
print('Answer:')
for chunk in chain.stream(query):
    print(chunk, end='', flush=True)

Question: what is context engineering, and ground your answers with topics tied to the knowledge base,documents, page numbers,  cite them appropriately
Answer:
Based on the provided documents, I can answer that context engineering refers to the process of designing a well-structured retrieval system that provides reliable and relevant answers. This is stated in Document 1 (page 13), where it says: "A well-designed retrieval system is the difference between an LLM that guesses and one that provides fact-based, reliable, and contextually relevant answers."

This concept is further elaborated on in Document 2 (page 6), which lists topics such as Knowledge Collections, External Knowledge Sources, and Contextual Relevance. It highlights the importance of considering contextual information, including previous conversations and other relevant data, to provide accurate responses.

In Document 3 (page 9), context engineering is also mentioned as a key aspect in designing query agents that can i
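
The chain above places the retrieved documents into the prompt as-is, so the page numbers the model cites have to come from however those documents are rendered as text. One option is to format the context explicitly so each chunk carries a label. The sketch below uses a small stand-in class, `Doc`, instead of LangChain's `Document` so that it runs on its own; `Doc` and `format_docs` are hypothetical names, and treating the `page` metadata value as a zero-based index is an assumption worth checking against your loader's output.

```python
from dataclasses import dataclass, field

@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def format_docs(docs):
    """Join retrieved chunks into one context string with citation labels."""
    blocks = []
    for i, doc in enumerate(docs, start=1):
        page = doc.metadata.get('page')
        # Assumption: 'page' is a zero-based index, so add 1 for display.
        page_label = page + 1 if isinstance(page, int) else 'unknown'
        blocks.append(f'[Document {i}, page {page_label}]\n{doc.page_content}')
    return '\n\n'.join(blocks)
```

In the chain, something like `retriever | format_docs` in the `'context'` slot should feed this labeled string to the prompt, which gives the model a consistent source for its "Document N (page M)" citations.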