# 02. Advanced RAG with PDF
In this notebook, we will:
1. Download a sample PDF.
2. Ingest and split the text.
3. Store embeddings in Qdrant (Persistent).
4. Ask RAG questions against the PDF.

In [1]:
import os
import requests
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_qdrant import QdrantVectorStore
from langchain_ollama import ChatOllama
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

# Ensure data directory exists
os.makedirs('data', exist_ok=True)

## 0. Cleanup Collection
Delete any existing `pdf_rag` collection so each run starts fresh.

In [6]:
from qdrant_client import QdrantClient

try:
    client = QdrantClient(url=os.environ.get('QDRANT_URL'))
    client.delete_collection('pdf_rag')
    print('Collection cleared!')
except Exception as e:
    # Report the failure instead of silently ignoring it
    print(f'Could not clear collection: {e}')

Collection cleared!


## 0.1 Baseline (No RAG)
Let's see what the model knows about context engineering without any retrieved context.
*(This represents the "Before" state)*

In [10]:
# Baseline Query
from langchain_ollama import ChatOllama

# Initialize LLM for this cell
llm = ChatOllama(
    base_url=os.environ.get('OLLAMA_BASE_URL'),
    model='llama3'
)

query = 'what is context engineering'
print(f'Question: {query}')
print('Answer:')
print(llm.invoke(query).content)

Question: what is context engineering
A fascinating topic!

Context Engineering refers to the practice of designing, building, and maintaining systems that can adapt to different contexts or environments. A context can be defined as a specific situation, setting, or circumstance in which an individual or system operates. Contexts can be physical (e.g., indoor vs. outdoor), temporal (e.g., daytime vs. nighttime), social (e.g., with friends vs. alone), or even emotional (e.g., stressed vs. relaxed).

The goal of Context Engineering is to create systems that can recognize and respond effectively to different contexts, taking into account the varying needs, preferences, and behaviors of individuals within those contexts. This requires a deep understanding of human behavior, psychology, sociology, and technology.

Some examples of Context Engineering include:

1. **Smart homes**: Systems that adjust lighting, temperature, and entertainment settings based on the time of day, occupancy, or us

In [12]:
# 1. Download the Weaviate Context Engineering ebook (PDF)
pdf_url = 'https://8738733.fs1.hubspotusercontent-na1.net/hubfs/8738733/eBooks/Weaviate-Context-Engineering-ebook.pdf'
pdf_path = 'data/sample.pdf'

if not os.path.exists(pdf_path):
    print('Downloading PDF...')
    response = requests.get(pdf_url, timeout=60)
    response.raise_for_status()  # fail early rather than saving an error page as the PDF
    with open(pdf_path, 'wb') as f:
        f.write(response.content)
    print('PDF Downloaded')
else:
    print('PDF already exists')

Downloading PDF...
PDF Downloaded


In [13]:
# 2. Load and Split PDF
loader = PyPDFLoader(pdf_path)
pages = loader.load()
print(f'Loaded {len(pages)} pages')

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(pages)
print(f'Created {len(splits)} splits')

Loaded 23 pages
Created 83 splits
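
To build intuition for `chunk_size=1000` and `chunk_overlap=200`, here is a simplified sketch of overlapping chunks. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraphs and sentences); the function name `sliding_window_chunks` and the sample text are made up for illustration. It only shows how overlap lets the start of one chunk repeat the end of the previous one.

```python
def sliding_window_chunks(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Split text into fixed-size windows; each window repeats the last
    `chunk_overlap` characters of the previous one (illustrative only)."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Hypothetical sample: 2500 characters of repeating digits
sample = "".join(str(i % 10) for i in range(2500))
chunks = sliding_window_chunks(sample)
print([len(c) for c in chunks])           # [1000, 1000, 900]
print(chunks[0][-200:] == chunks[1][:200])  # True
```

A larger overlap reduces the chance that a sentence is cut in half at a chunk boundary, at the cost of storing more (partly duplicated) chunks.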


In [14]:
# 3. Index in Qdrant (Persistent)
# We use a specific collection name 'pdf_rag'
embeddings = HuggingFaceEmbeddings(model_name='all-MiniLM-L6-v2')
url = os.environ.get('QDRANT_URL')

qdrant = QdrantVectorStore.from_documents(
    splits,
    embeddings,
    url=url,
    prefer_grpc=False,
    collection_name='pdf_rag',
    force_recreate=True  # Clean start for this tutorial
)
print('PDF Content Indexed!')

PDF Content Indexed!
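
Once indexed, each chunk is stored in Qdrant as an embedding vector, and retrieval ranks chunks by their similarity to the query vector. The sketch below shows that idea with brute-force cosine similarity over a toy index. The vectors, chunk labels, and the helper names `cosine_similarity` and `top_k` are invented for illustration; real `all-MiniLM-L6-v2` vectors have 384 dimensions, and Qdrant uses an approximate nearest-neighbour index rather than scanning every vector. The distance metric actually used depends on how the collection was created (cosine is assumed here).

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], indexed: list[tuple[str, list[float]]], k: int = 3):
    """Return the k (score, text) pairs most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy 3-dimensional "embeddings" (hypothetical values)
toy_index = [
    ("chunk about retrieval", [1.0, 0.0, 0.0]),
    ("chunk about memory", [0.0, 1.0, 0.0]),
    ("chunk about agents", [0.7, 0.7, 0.0]),
    ("chunk about tools", [0.0, 0.0, 1.0]),
]

results = top_k([1.0, 0.1, 0.0], toy_index, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```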


In [18]:
# 4. Perform RAG
llm = ChatOllama(
    base_url=os.environ.get('OLLAMA_BASE_URL'),
    model='llama3'
)

retriever = qdrant.as_retriever(search_kwargs={'k': 3})
template = """Answer the question based only on the following context:
{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)

chain = (
    {'context': retriever, 'question': RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

query = 'what is context engineering, and ground your answers with topics tied to the knowledge base,documents, page numbers,  cite them appropriately'
print(f'Question: {query}')
print('Answer:')
for chunk in chain.stream(query):
    print(chunk, end='', flush=True)

Question: what is context engineering, and ground your answers with topics tied to the knowledge base,documents, page numbers,  cite them appropriately
Answer:
Based on the provided documents, I can infer that context engineering refers to the process of designing a well-structured database system that enables efficient querying and retrieval of relevant information. This concept is mentioned in multiple pages across different documents.

Page 14 (Document 1) explicitly mentions the importance of mastering two elements ("How" and "When") in context engineering, stating that it is fundamental to contextual engineering. It also highlights the difference between an LLM (Large Language Model) that guesses and one that provides fact-based, reliable, and contextually relevant answers.

Page 13 (Document 3) further elaborates on the concept of context engineering by mentioning the importance of previous conversation history and other relevant information in maintaining contextual awareness.
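
The chain above passes the retriever's raw `Document` objects straight into the prompt, so the model has to pick page numbers out of their string representation. One possible refinement is to format the retrieved chunks into labelled blocks first. The sketch below assumes each document exposes `page_content` and a `metadata` dict with a 0-based `page` key, which is what `PyPDFLoader` is expected to produce; `FakeDoc` and `format_docs` are hypothetical names used here so the example runs without LangChain installed.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDoc:
    """Hypothetical stand-in for a LangChain Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs) -> str:
    """Join retrieved chunks into one context string with source and page labels."""
    blocks = []
    for i, doc in enumerate(docs, start=1):
        page = doc.metadata.get("page")
        # Assumption: 'page' is a 0-based index, so add 1 for a human-readable number
        label = f"page {page + 1}" if isinstance(page, int) else "page unknown"
        blocks.append(f"[Source {i}, {label}]\n{doc.page_content.strip()}")
    return "\n\n".join(blocks)


example_docs = [
    FakeDoc("  First retrieved chunk. ", {"page": 13}),
    FakeDoc("Second retrieved chunk.", {}),
]
print(format_docs(example_docs))
```

In the chain, this would typically be wired in as `'context': retriever | format_docs`, so the prompt receives the labelled text instead of raw objects. Whether this improves the citations for this particular PDF is something to verify by rerunning the query.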