## Create a fresh environment

In [100]:
!pip install -U langchain langchain-community langchain-text-splitters
!pip install -U langchain-ollama langchain-huggingface
!pip install -U langchain-chroma chromadb
!pip install -U pypdf



## Step 1: Install & run an open-source LLM (Ollama)
Instead of relying on closed APIs, the system uses Ollama for local LLM inference, HuggingFace for embeddings, and Chroma or FAISS for vector storage.
The use case is document-grounded question answering over the Scrum Guide PDF, so that answers are backed by retrieved passages rather than hallucinated.


In [None]:
!ollama --version
!ollama run llama3 "Say hello in one line"


ollama version is 0.14.1


## Sanity check
This step confirms that:
- Ollama is running
- LangChain can talk to the model
- The LLM can successfully generate text

In [1]:
import requests
r = requests.get("http://localhost:11434")
print("Status:", r.status_code)
print(r.text[:200])


Status: 200
Ollama is running


In [2]:
from langchain_ollama import ChatOllama

llm = ChatOllama(model="llama3", temperature=0)
llm.invoke("Say hello in one line")


AIMessage(content='"Hello there!"', additional_kwargs={}, response_metadata={'model': 'llama3', 'created_at': '2026-01-16T09:21:32.1665224Z', 'done': True, 'done_reason': 'stop', 'total_duration': 13857604900, 'load_duration': 10848544600, 'prompt_eval_count': 15, 'prompt_eval_duration': 2003725300, 'eval_count': 5, 'eval_duration': 962335800, 'logprobs': None, 'model_name': 'llama3', 'model_provider': 'ollama'}, id='lc_run--019bc61b-efd7-7571-bb76-fc4093efad39-0', tool_calls=[], invalid_tool_calls=[], usage_metadata={'input_tokens': 15, 'output_tokens': 5, 'total_tokens': 20})

## Document Ingestion (PDF Loading & Validation)
This step ingests the Scrum Guide PDF as structured documents.
Each page is converted into a LangChain Document object,
which enables chunking, embedding, and retrieval during QA.
###### PyPDFLoader turns the Scrum Guide into structured Document objects
The document was loaded page by page into 14 structured Document objects.
At this stage, the full text is too large to feed to the LLM directly, which makes chunking necessary.



In [3]:
from langchain_community.document_loaders import PyPDFLoader
import os

pdf_path = r"C:\Users\ankit\OneDrive\Desktop\Data Science\2020-Scrum-Guide-US.pdf"

if os.path.exists(pdf_path):
    loader = PyPDFLoader(pdf_path)
    docs = loader.load()
    print(f"Pages loaded: {len(docs)}")
    print(docs[0].page_content[:400])
else:
    print(f"Error: The file '{pdf_path}' does not exist.")



Pages loaded: 14
Ken Schwaber & Jeff Sutherland 
 
 
 
 
 
 
 
 
 
 
The Scrum Guide 
 
The Definitive Guide to Scrum: The Rules of the Game 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
November 2020


## Split into chunks
The pages are split with a Recursive Character Text Splitter (to maintain semantic context), using a chunk size of 800 characters and a 120-character overlap.
Overlap preserves continuity of context across chunk boundaries.
This resulted in 51 chunks, a size intended to balance retrieval precision with sufficient semantic context.

## Text Chunking (Preparing Documents for Retrieval)
This step breaks the loaded Scrum Guide text into smaller, overlapping pieces so the system can:
- Create embeddings
- Store them efficiently
- Retrieve only the most relevant parts for each question

1️⃣ Recursive Character Text Splitter

This is a text-splitting method that:

- Tries to split text at natural boundaries first (paragraphs, then lines, then words)
- Falls back to splitting by characters only if it can’t find a clean boundary

👉 Purpose: keep meaning intact and avoid breaking sentences awkwardly.

2️⃣ “To maintain semantic context”

This explains why it’s used:

- Each chunk tends to contain complete, meaningful ideas
- Related sentences stay together
- Embeddings represent coherent concepts instead of fragments

👉 Result: better retrieval + fewer hallucinations in RAG.

3️⃣ Chunk size = 800 characters

This means each chunk can be up to 800 characters long. Roughly:

- ~120–150 words
- ~6–8 sentences (varies)

👉 Why 800?

- Big enough to hold context
- Small enough to fit embedding & LLM context windows efficiently

4️⃣ Overlap = 120 characters

This means up to the last 120 characters of one chunk are repeated at the start of the next chunk.

Example:

Chunk 1: [.........ABCDE]
Chunk 2:        [ABCDE.........]


👉 Why overlap?

- Reduces the loss of context at chunk boundaries
- Makes it more likely that a sentence cut at a boundary appears whole in at least one chunk
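
To make the overlap diagram concrete, here is a minimal sketch of fixed-window chunking with overlap in plain Python. It is a simplification for illustration only: `chunk_with_overlap` is a hypothetical helper, and unlike the Recursive Character Text Splitter it does not look for paragraph or word boundaries.

```python
def chunk_with_overlap(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into fixed-size windows that share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = chunk_with_overlap(sample, chunk_size=10, overlap=3)
for chunk in chunks:
    print(chunk)
```

With these toy values, each chunk starts with the last 3 characters of the previous one, which is the same pattern the 800/120 configuration follows at a larger scale.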


In [4]:
from langchain_text_splitters import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=800,
    chunk_overlap=120,
)
splits = text_splitter.split_documents(docs)
print("Chunks Created:", len(splits))
print(splits[0].page_content[:400])


Chunks Created: 51
Ken Schwaber & Jeff Sutherland 
 
 
 
 
 
 
 
 
 
 
The Scrum Guide 
 
The Definitive Guide to Scrum: The Rules of the Game 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
November 2020


In [6]:
pip install sentence-transformers

Note: you may need to restart the kernel to use updated packages.


In [7]:
pip install tf-keras





## Create embeddings (open-source)
Embeddings convert text into vectors of numbers so the computer can compare meaning, not just words.

In [None]:
# "This is a Scrum Guide"
# → [0.012, -0.443, 0.981, ..., 0.217]   ← 768-dim vector
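
As a rough illustration of what "comparing meaning" looks like numerically, the sketch below computes cosine similarity between small hand-made vectors. The three 3-dimensional vectors are hypothetical stand-ins, not real model output; actual embeddings from this notebook have 768 dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings" (made up for illustration)
scrum_master = [0.9, 0.1, 0.0]
scrum_coach = [0.8, 0.2, 0.1]
weather = [0.0, 0.1, 0.9]

print("related:  ", cosine_similarity(scrum_master, scrum_coach))
print("unrelated:", cosine_similarity(scrum_master, weather))
```

Vectors pointing in similar directions score close to 1, while unrelated ones score close to 0.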


In [9]:
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-mpnet-base-v2"
)

print("Embeddings ready ✅")


'(ReadTimeoutError("HTTPSConnectionPool(host='huggingface.co', port=443): Read timed out. (read timeout=10)"), '(Request ID: 3a44a562-4295-47b3-b1f2-93b127943892)')' thrown while requesting HEAD https://huggingface.co/sentence-transformers/all-mpnet-base-v2/resolve/main/./config_sentence_transformers.json
Retrying in 1s [Retry 1/5].
'(ReadTimeoutError("HTTPSConnectionPool(host='huggingface.co', port=443): Read timed out. (read timeout=10)"), '(Request ID: ad221f06-9d6e-4120-a17c-66a501717672)')' thrown while requesting HEAD https://huggingface.co/sentence-transformers/all-mpnet-base-v2/resolve/main/./config_sentence_transformers.json
Retrying in 2s [Retry 2/5].
'(ReadTimeoutError("HTTPSConnectionPool(host='huggingface.co', port=443): Read timed out. (read timeout=10)"), '(Request ID: df8c9486-f953-4d50-8b8c-d52a4ec25892)')' thrown while requesting HEAD https://huggingface.co/sentence-transformers/all-mpnet-base-v2/resolve/main/./README.md
Retrying in 1s [Retry 1/5].
'(ReadTimeoutError(

Embeddings ready ✅


In [10]:
!pip install ipywidgets




In [11]:
test_vec = embeddings.embed_query("hello world")
len(test_vec), test_vec[:5]


(768,
 [0.02624971978366375,
  0.013395597226917744,
  -0.004533165134489536,
  -0.021791458129882812,
  0.05455183982849121])

## Store chunks in a vector database (Chroma)

In [12]:
from langchain_chroma import Chroma
vector_store = Chroma(
    collection_name="scrum_guide",
    embedding_function=embeddings,
    persist_directory="./chroma_scrum_db",
)
ids = vector_store.add_documents(splits)
print("Stored chunks:", len(ids))

Stored chunks: 51


## Retrieve relevant chunks (test retrieval)
1️⃣ The query was converted into an embedding vector
2️⃣ Every stored chunk already has an embedding vector
3️⃣ The vector store compared the query vector with the stored vectors (Chroma uses squared L2 distance unless configured otherwise; for unit-length embeddings this gives the same ranking as cosine similarity)
4️⃣ The top-3 closest chunks were returned

The last line prints the first 600 characters of the most relevant retrieved chunk, so the retrieval output can be inspected and validated before it is passed to the LLM.
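
The retrieval steps above can be sketched in plain Python with a toy in-memory store. Everything here is hypothetical (the texts, the 2-D vectors, and the `top_k` helper); it only illustrates nearest-neighbour ranking, not how a real vector store indexes data. For unit-length vectors, squared L2 distance equals 2 - 2·cosine, so ranking by smallest distance matches ranking by highest cosine similarity.

```python
import math


def normalize(v: list[float]) -> list[float]:
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def squared_l2(a: list[float], b: list[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top_k(query_vec: list[float], store: list[tuple[str, list[float]]], k: int = 3) -> list[str]:
    """Return the k texts whose vectors are closest to the query vector."""
    ranked = sorted(store, key=lambda item: squared_l2(query_vec, item[1]))
    return [text for text, _ in ranked[:k]]


# Hypothetical chunks with made-up 2-D vectors (real ones have 768 dimensions)
store = [
    ("Scrum Master serves the team", normalize([0.9, 0.1])),
    ("Product Owner orders the backlog", normalize([0.6, 0.4])),
    ("Sprint Review inspects the outcome", normalize([0.3, 0.7])),
    ("Definition of Done", normalize([0.1, 0.9])),
]
query = normalize([1.0, 0.0])
print(top_k(query, store, k=3))
```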


In [13]:
query = "What are the responsibilities of the Scrum Master?"
retrieved_docs = vector_store.similarity_search(query, k=3)

print("Retrieved:", len(retrieved_docs))
print("\n--- Top retrieved chunk ---\n")
print(retrieved_docs[0].page_content[:600])


Retrieved: 3

--- Top retrieved chunk ---

Scrum Masters are true leaders who serve the Scrum Team and the larger organization. 
 
The Scrum Master serves the Scrum Team in several ways, including: 
  
● Coaching the team members in self-management and cross-functionality; 
● Helping the Scrum Team focus on creating high-value Increments that meet the Definition of 
Done; 
● Causing the removal of impediments to the Scrum Team’s progress; and, 
● Ensuring that all Scrum events take place and are positive, productive, and kept within the 
timebox. 
 
The Scrum Master serves the Product Owner in several ways, including:


## (Assignment requirement): Run at least two different queries
This code connects retrieval + LLM so that:

1. A user question is asked
2. Relevant chunks are retrieved from the vector store
3. Those chunks are passed to the LLM
4. The LLM generates a grounded answer
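
The four steps can be wired together without committing to a specific vector store or model. The sketch below is one possible shape, not the notebook's final implementation: `make_rag_answer`, `fake_retrieve`, and `fake_generate` are hypothetical names, and the two stand-in functions would be replaced by a real retriever and a real LLM call.

```python
from typing import Callable


def make_rag_answer(
    retrieve: Callable[[str, int], list[str]],
    generate: Callable[[str], str],
) -> Callable[..., tuple[str, list[str]]]:
    """Build a rag_answer function from a retriever and a text generator."""

    def rag_answer(question: str, k: int = 3) -> tuple[str, list[str]]:
        chunks = retrieve(question, k)      # retrieve relevant chunks
        context = "\n\n".join(chunks)       # pass them to the LLM via the prompt
        prompt = (
            "Answer ONLY using the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {question}\n\nAnswer:"
        )
        return generate(prompt), chunks     # grounded answer plus its sources

    return rag_answer


# Stand-ins so the sketch runs without a vector store or a model (assumptions)
def fake_retrieve(question: str, k: int) -> list[str]:
    corpus = ["chunk about Scrum Master", "chunk about Product Owner", "chunk about Sprints"]
    return corpus[:k]


def fake_generate(prompt: str) -> str:
    return f"(model output for a prompt of {len(prompt)} characters)"


rag_answer = make_rag_answer(fake_retrieve, fake_generate)
answer, sources = rag_answer("What are the responsibilities of the Scrum Master?", k=2)
print(answer)
print(sources)
```

Passing the retriever and generator in as arguments keeps the pipeline testable: the same function works with Chroma or FAISS and with Ollama or another chat model.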

In [15]:
# First, define the rag_answer function or import it from the appropriate module
def rag_answer(question, k=3):
    # Implement your RAG (Retrieval-Augmented Generation) logic here
    # This is a placeholder implementation
    return f"Answer to: {question} (using {k} documents)"

# Now use the function
questions = [
     "What are the responsibilities of Scrum Master ?",
     "What does the Scrum guide say about product owner role?"
]
for q in questions:
    ans = rag_answer(q, k=4)  # Now rag_answer is defined
    print("\nQ:", q)
    print("A:", ans)


Q: What are the responsibilities of Scrum Master ?
A: Answer to: What are the responsibilities of Scrum Master ? (using 4 documents)

Q: What does the Scrum guide say about product owner role?
A: Answer to: What does the Scrum guide say about product owner role? (using 4 documents)


In [17]:
def rag_answer(question, k=3):
    if "Scrum Master" in question:
        return "The Scrum Master is responsible for coaching the team, removing impediments, and ensuring Scrum is understood."
    elif "Product Owner" in question:
        return "The Product Owner is accountable for maximizing product value and managing the Product Backlog."
    else:
        return "I don't know yet."


In [20]:
!pip install -U langchain langchain-community langchain-openai


Collecting langchain-openai
  Downloading langchain_openai-1.1.7-py3-none-any.whl.metadata (2.6 kB)
Downloading langchain_openai-1.1.7-py3-none-any.whl (84 kB)
Installing collected packages: langchain-openai
  Attempting uninstall: langchain-openai
    Found existing installation: langchain-openai 1.1.6
    Uninstalling langchain-openai-1.1.6:
      Successfully uninstalled langchain-openai-1.1.6
Successfully installed langchain-openai-1.1.7


In [21]:
import sys
print(sys.executable)


C:\Users\ankit\anaconda3\python.exe


In [22]:
!pip show langchain


Name: langchain
Version: 1.2.4
Summary: Building applications with LLMs through composability
Home-page: https://docs.langchain.com/
Author: 
Author-email: 
License: MIT
Location: C:\Users\ankit\anaconda3\Lib\site-packages
Requires: langchain-core, langgraph, pydantic
Required-by: 


In [23]:
!pip uninstall langchain -y
!pip uninstall langchain-community -y
!pip uninstall langchain-openai -y

!pip install --no-cache-dir langchain langchain-community langchain-openai


Found existing installation: langchain 1.2.4
Uninstalling langchain-1.2.4:
  Successfully uninstalled langchain-1.2.4
Found existing installation: langchain-community 0.4.1
Uninstalling langchain-community-0.4.1:
  Successfully uninstalled langchain-community-0.4.1
Found existing installation: langchain-openai 1.1.7
Uninstalling langchain-openai-1.1.7:
  Successfully uninstalled langchain-openai-1.1.7
Collecting langchain
  Downloading langchain-1.2.4-py3-none-any.whl.metadata (4.9 kB)
Collecting langchain-community
  Downloading langchain_community-0.4.1-py3-none-any.whl.metadata (3.0 kB)
Collecting langchain-openai
  Downloading langchain_openai-1.1.7-py3-none-any.whl.metadata (2.6 kB)
Downloading langchain-1.2.4-py3-none-any.whl (107 kB)
Downloading langchain_community-0.4.1-py3-none-any.whl (2.5 MB)
Downloading langchain_openai-1.1.7-py3-non

In [24]:
!pip list | findstr langchain


langchain                                1.2.4
langchain-chroma                         1.1.0
langchain-classic                        1.0.1
langchain-community                      0.4.1
langchain-core                           1.2.6
langchain-huggingface                    1.2.0
langchain-ollama                         1.0.1
langchain-openai                         1.1.7
langchain-text-splitters                 1.1.0


In [28]:
import sys
!{sys.executable} -m pip install -U langchain langchain-community langchain-openai




In [30]:
# If you're on the newer LangChain packages
from langchain_openai import ChatOpenAI


In [31]:
import os
# ChatOpenAI now lives in langchain_openai (previously langchain.chat_models)
from langchain_openai import ChatOpenAI

# Method 1: set the API key as an environment variable
# (placeholder shown; replace it with your actual OpenAI API key and never commit a real key)
os.environ["OPENAI_API_KEY"] = "your-api-key-here"

# Method 2 (alternative): pass the key directly to the constructor instead
# llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0, api_key="your-api-key-here")

# With Method 1 in place, ChatOpenAI reads the key from the environment.
# Use either Method 1 or Method 2, not both.
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)









In [32]:
# First, define or import the rag_answer function
# For example, if it's a custom function you need to define:
def rag_answer(question, k=3):
    # This is a placeholder implementation
    # Replace with your actual RAG (Retrieval-Augmented Generation) implementation
    answer = f"This is a simulated answer to: {question}"
    context = ["context1", "context2", "context3", "context4"][:k]
    return answer, context

# Now use the function
questions = [
    "What are the responsibilities of the Scrum Master?",
    "What does the Scrum Guide say about the Product Owner role?"
]

for q in questions:
    ans, _ = rag_answer(q, k=4)
    print("\nQ:", q)
    print("A:", ans)
    print("-"*80)


Q: What are the responsibilities of the Scrum Master?
A: This is a simulated answer to: What are the responsibilities of the Scrum Master?
--------------------------------------------------------------------------------

Q: What does the Scrum Guide say about the Product Owner role?
A: This is a simulated answer to: What does the Scrum Guide say about the Product Owner role?
--------------------------------------------------------------------------------


In [33]:
def rag_answer(question, k=4):
    # ensure retriever uses k docs
    retriever.search_kwargs["k"] = k
    
    result = rag_chain.invoke({"input": question})
    
    # Depending on langchain version, output key may differ
    answer = result.get("answer") or result.get("output_text") or str(result)
    context_docs = result.get("context", [])
    
    return answer, context_docs


In [35]:
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings


In [36]:
embeddings = OpenAIEmbeddings()


In [69]:
from pathlib import Path
from langchain_community.document_loaders import PyPDFLoader

# Fix the path by using raw string (r prefix) or double backslashes or forward slashes
# Option 1: Use raw string
PDF_PATH = r"C:\Users\ankit\OneDrive\Desktop\Data Science\GEN AI Tiger\2020-Scrum-Guide-US.pdf"

# Option 2: Use double backslashes
# PDF_PATH = "C:\\Users\\ankit\\OneDrive\\Desktop\\Data Science\\GEN AI Tiger\\2020-Scrum-Guide-US.pdf"

# Option 3: Use forward slashes (works on Windows too)
# PDF_PATH = "C:/Users/ankit/OneDrive/Desktop/Data Science/GEN AI Tiger/2020-Scrum-Guide-US.pdf"

# Option 4: Use pathlib (recommended)
# PDF_PATH = Path("C:/Users/ankit/OneDrive/Desktop/Data Science/GEN AI Tiger/2020-Scrum-Guide-US.pdf")

loader = PyPDFLoader(str(PDF_PATH))
documents = loader.load()

In [70]:
len(documents)


14

In [71]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200
)

chunks = text_splitter.split_documents(documents)

print(f"Total chunks created: {len(chunks)}")
print(chunks[0].page_content[:300])


Total chunks created: 42
Ken Schwaber & Jeff Sutherland 
 
 
 
 
 
 
 
 
 
 
The Scrum Guide 
 
The Definitive Guide to Scrum: The Rules of the Game 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
November 2020


In [40]:
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()


In [41]:
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-mpnet-base-v2"
)
print("Embeddings ready ✅")


Embeddings ready ✅


In [42]:
import sys
!{sys.executable} -m pip install faiss-cpu




In [72]:
from langchain_community.vectorstores import FAISS

vectordb = FAISS.from_documents(chunks, embeddings)
retriever = vectordb.as_retriever(search_kwargs={"k": 4})
print("FAISS vector DB ready ✅")


FAISS vector DB ready ✅


In [76]:
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate


In [77]:
docs = retriever.invoke("What is a Scrum Master?")

print(len(docs))
print(docs[0].page_content[:300])
print(docs[0].metadata)


4
many stakeholders in the Product Backlog. Those wanting to change the Product Backlog can do so by 
trying to convince the Product Owner. 
Scrum Master 
The Scrum Master is accountable for establishing Scrum as defined in the Scrum Guide. They do this by 
helping everyone understand Scrum theory and
{'producer': 'PyPDF', 'creator': 'Microsoft Word', 'creationdate': '2020-11-09T13:11:26+00:00', 'moddate': '2020-11-09T13:11:26+00:00', 'source': 'C:\\Users\\ankit\\OneDrive\\Desktop\\Data Science\\GEN AI Tiger\\2020-Scrum-Guide-US.pdf', 'total_pages': 14, 'page': 6, 'page_label': '7'}
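
The metadata above carries both `page: 6` and `page_label: '7'`, which suggests the `page` field is a zero-based index while `page_label` is the number a reader would see. When citing sources, a small helper can convert to reader-facing page numbers. This is a sketch: `format_sources` is a hypothetical name, and it assumes each metadata dict has at least a `page` key.

```python
def format_sources(metadatas: list[dict]) -> list[str]:
    """Build reader-facing page references from PyPDFLoader-style metadata."""
    labels: list[str] = []
    for meta in metadatas:
        # Prefer the PDF's own label; otherwise shift the zero-based index by one
        label = meta.get("page_label") or str(meta["page"] + 1)
        if label not in labels:  # drop repeats while keeping retrieval order
            labels.append(label)
    return [f"p. {label}" for label in labels]


# Hypothetical metadata for four retrieved chunks
retrieved = [
    {"page": 6, "page_label": "7"},
    {"page": 6, "page_label": "7"},
    {"page": 5},
    {"page": 7, "page_label": "8"},
]
print(format_sources(retrieved))
```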


In [78]:
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template(
"""Answer ONLY using the context from the Scrum Guide.
If the answer is not in the context, say: "Not found in the provided document."

Question:
{question}

Context:
{context}

Answer:"""
)


In [79]:
def rag_answer(question, k=4):
    # retrieve the top-k chunks; query the vector store directly so that
    # the k argument actually takes effect (the retriever has its own fixed k)
    docs = vectordb.similarity_search(question, k=k)

    context = "\n\n".join(d.page_content for d in docs)

    # build prompt
    messages = prompt.format_messages(
        question=question,
        context=context
    )

    # call LLM
    response = llm.invoke(messages)

    return response.content, docs


In [80]:
import os
os.environ["OPENAI_API_KEY"] = "sk-REPLACE_WITH_REAL_KEY"


In [81]:
import os
os.environ["OPENAI_API_KEY"] = "sk-PASTE_YOUR_REAL_KEY_HERE"


In [82]:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)


In [83]:
import os
print("Has OPENAI_API_KEY?", bool(os.getenv("OPENAI_API_KEY")))
print("Key prefix:", os.getenv("OPENAI_API_KEY")[:3])


Has OPENAI_API_KEY? True
Key prefix: sk-


In [84]:
# Use the dedicated langchain-ollama package (installed earlier) rather than the deprecated community import
from langchain_ollama import ChatOllama

llm = ChatOllama(model="llama3", temperature=0)


a) Query embedding

Your question q is converted into an embedding vector.

b) Retrieval

The top 4 most similar chunks are retrieved from the vector store. These chunks come from the Scrum Guide PDF.

c) Context injection

The retrieved chunks are stuffed into the prompt (chain_type="stuff").

d) LLM generation

Llama 3 is instructed to generate an answer using only that context.

3️⃣ Why the SAME question can produce DIFFERENT answers

This is an important RAG concept: in RAG, answers are a function of the context, not just the question.

Conceptually:

Answer = f(Question, Retrieved Context, Prompt, LLM)


Earlier attempts:

- Context was empty / fake / partial

7th attempt:

- Context = real Scrum Guide sections
- Prompt = instructive
- LLM = near-deterministic (temperature=0)

➡️ Hence: longer, better, grounded answers
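
A quick way to see why the same question can lead to different answers is to build the prompt twice, once with empty context and once with a real passage. The sketch below reuses the prompt wording from this notebook; `build_prompt` is a hypothetical helper, and no LLM is called.

```python
PROMPT_TEMPLATE = """Answer ONLY using the context from the Scrum Guide.
If the answer is not in the context, say: "Not found in the provided document."

Question:
{question}

Context:
{context}

Answer:"""


def build_prompt(question: str, chunks: list[str]) -> str:
    """Fill the template with the question and the retrieved chunks."""
    return PROMPT_TEMPLATE.format(question=question, context="\n\n".join(chunks))


question = "What are the responsibilities of the Scrum Master?"
empty_prompt = build_prompt(question, [])
grounded_prompt = build_prompt(
    question,
    ["The Scrum Master is accountable for establishing Scrum as defined in the Scrum Guide."],
)

# Same question, different input to the model
print(empty_prompt == grounded_prompt)
print(len(empty_prompt), len(grounded_prompt))
```

Because the model only ever sees the assembled prompt, any change in retrieved context changes its input, and therefore potentially its answer, even at temperature 0.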

In [85]:
for q in questions:
    ans, docs = rag_answer(q, k=4)
    print("\nQ:", q)
    print("A:", ans)
    print("Sources/pages:", [d.metadata.get("page") for d in docs])
    print("-"*80)



Q: What are the responsibilities of the Scrum Master?
A: According to the provided context, the responsibilities of the Scrum Master are:

* Helping the Scrum Team focus on creating high-value Increments that meet the Definition of Done;
* Causing the removal of impediments to the Scrum Team’s progress; and,
* Ensuring that all Scrum events take place and are positive, productive, and kept within the timebox.

Additionally, the Scrum Master serves the Product Owner by helping them manage the Product Backlog.
Sources/pages: [6, 6, 5, 7]
--------------------------------------------------------------------------------

Q: What does the Scrum Guide say about the Product Owner role?
A: According to the Scrum Guide, the Product Owner is accountable for:

* Developing and explicitly communicating the Product Goal
* Creating and clearly communicating Product Backlog items
* Ordering Product Backlog items
* Ensuring that the Product Backlog is transparent, visible, and understood
* Maximizin

In [86]:
!pip install chromadb




In [87]:
from langchain_chroma import Chroma  # dedicated package; the langchain_community import is deprecated
from langchain_text_splitters import RecursiveCharacterTextSplitter


In [58]:
from langchain_core.messages import HumanMessage

def rag_answer_with_vs(question: str, k: int = 3):
    docs_ret = vs_small.similarity_search(question, k=k)
    context = "\n\n".join(d.page_content for d in docs_ret)

    prompt_text = f"""Use ONLY the context below to answer.
If the answer is not in the context, say: "Not found in the provided document."

Context:
{context}

Question: {question}

Answer:"""

    response = llm.invoke([HumanMessage(content=prompt_text)])
    return response.content, docs_ret


## Small Chunks

In [9]:
from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings  # replaces the deprecated langchain_community.embeddings import
from langchain_core.documents import Document

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

# ✅ YOU MUST define this (replace with your real chunks list)
small_chunks = [
    "chunk 1 text ...",
    "chunk 2 text ...",
]

docs_small = [
    Document(page_content=ch, metadata={"source": "scrum_guide", "chunk_type": "small", "chunk_id": i})
    for i, ch in enumerate(small_chunks)
]

vs_small = FAISS.from_documents(docs_small, embeddings)
print("vs_small ready. Total docs:", len(docs_small))



vs_small ready. Total docs: 2


In [10]:
[x for x in globals().keys() if "chunk" in x.lower() or "split" in x.lower() or "doc" in x.lower()]


['__doc__', 'Document', 'small_chunks', 'docs_small']

## Large Chunks

In [60]:
pip install -U langchain-chroma chromadb


Note: you may need to restart the kernel to use updated packages.


In [61]:
from langchain_chroma import Chroma


In [62]:
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_chroma import Chroma

splitter_large = RecursiveCharacterTextSplitter(chunk_size=1500, chunk_overlap=200)
splits_large = splitter_large.split_documents(docs)

vs_large = Chroma(
    collection_name="scrum_large",
    embedding_function=embeddings,
    persist_directory="./chroma_scrum_large"
)

vs_large.add_documents(splits_large)
print("Large chunks stored:", len(splits_large))


Large chunks stored: 28
