#### **Context Window Enhancement**

 - After retrieval, we add each retrieved chunk's previous and next chunks to it, which restores surrounding context or parts that were cut off at the chunk boundary.

##### **When adding data to the vector_store**
 - When adding the chunks to the vector_store, we attach a chunk_id to each chunk's metadata. This index lets us look up the previous and next chunks of any retrieved chunk.
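
The index arithmetic behind this idea can be sketched in a few lines. This is only an illustration: `neighbor_ids` and its `window` parameter are hypothetical names, and it assumes chunk IDs run from 0 to `total_chunks - 1` without gaps.

```python
def neighbor_ids(chunk_id: int, total_chunks: int, window: int = 1) -> list[int]:
    """Return the IDs of a chunk and its neighbours, clamped to the valid range."""
    start = max(0, chunk_id - window)                 # first chunk has no predecessor
    end = min(total_chunks - 1, chunk_id + window)    # last chunk has no successor
    return list(range(start, end + 1))


print(neighbor_ids(0, 5))  # [0, 1]
print(neighbor_ids(2, 5))  # [1, 2, 3]
print(neighbor_ids(4, 5))  # [3, 4]
```

With `window=1` this gives the previous and next chunk; a larger window would pull in more surrounding text at the cost of a longer prompt.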

 ---

 #### **LLM used**

In [1]:
# LLM used is 'llama3.2'

from langchain_ollama import ChatOllama

llm = ChatOllama(
    model='llama3.2',
    temperature=0,
    verbose=True
)

llm.invoke('How are you?')



AIMessage(content="I'm just a language model, so I don't have feelings or emotions like humans do. However, I'm functioning properly and ready to assist you with any questions or tasks you may have! How can I help you today?", additional_kwargs={}, response_metadata={'model': 'llama3.2', 'created_at': '2025-12-15T16:37:16.842442Z', 'done': True, 'done_reason': 'stop', 'total_duration': 30122713625, 'load_duration': 3654286292, 'prompt_eval_count': 29, 'prompt_eval_duration': 13227501667, 'eval_count': 47, 'eval_duration': 10374591997, 'logprobs': None, 'model_name': 'llama3.2', 'model_provider': 'ollama'}, id='lc_run--07cc604d-479a-4731-ae3b-4fdce0d3f555-0', usage_metadata={'input_tokens': 29, 'output_tokens': 47, 'total_tokens': 76})

#### **Embedding Model used**

In [2]:
from langchain_huggingface import HuggingFaceEmbeddings 

embedding_model = HuggingFaceEmbeddings(model='all-MiniLM-L6-v2')

sample_embeddings = embedding_model.embed_query("Hey How are you?")
print(f"Length of sample embeddings : {len(sample_embeddings)}")
print(f"Sample embedding : {sample_embeddings[:100]}")

Length of sample embeddings : 384
Sample embedding : [-0.013380538672208786, 0.003255972173064947, 0.10806030035018921, 0.08322358131408691, 0.02040085941553116, -0.049066152423620224, 0.0722508355975151, 0.002980925841256976, -0.08823534101247787, 0.016058299690485, -0.03367079421877861, -4.332493062975118e-06, -0.02510129101574421, 0.0007887802203185856, 0.060331884771585464, -0.0415474958717823, 0.07702311128377914, -0.14256997406482697, -0.13958506286144257, 0.06023767963051796, 0.003192346775904298, 0.018982844427227974, 0.02300790697336197, 0.06056844815611839, -0.07911035418510437, -0.05399537831544876, -0.0008475205395370722, 0.03202424943447113, -0.029674910008907318, -0.04484577104449272, -0.10411098599433899, 0.06399180740118027, -0.05713418126106262, -0.02695028856396675, -0.028776653110980988, 0.00333896791562438, -0.0355900302529335, -0.13525626063346863, 0.009469274431467056, 0.0003555373114068061, 0.009924577549099922, -0.0014938903041183949, -0.009747199714183807, -0.0

#### **Loading the Data**

In [3]:
from langchain_community.document_loaders import PyPDFLoader 

file_path = '../data/Understanding_Climate_Change.pdf'

loader = PyPDFLoader(file_path)

docs = loader.load()

print(f"Number of docs : {len(docs)}")

Number of docs : 33


#### **Create Chunks**

In [4]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

chunk_size=400
chunk_overlap=50

text_splitter = RecursiveCharacterTextSplitter(chunk_size=chunk_size, chunk_overlap=chunk_overlap)

chunks = text_splitter.split_documents(docs)
print(f"Number of Chunks are : {len(chunks)}")


Number of Chunks are : 215
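
Because the splitter uses `chunk_overlap=50`, consecutive chunks from the same page can share some text at the seam. If neighbouring chunks are later stitched back together, that shared span would appear twice. The sketch below shows one possible way to trim it. `merge_with_overlap` and its `max_overlap` / `min_overlap` parameters are hypothetical helpers, not part of LangChain, and the suffix/prefix match is a heuristic that assumes the overlap is an exact character-for-character repeat.

```python
def merge_with_overlap(left: str, right: str, max_overlap: int = 50, min_overlap: int = 10) -> str:
    """Join two adjacent chunks, dropping text repeated at the seam (heuristic)."""
    limit = min(max_overlap, len(left), len(right))
    # Try the longest candidate overlap first; ignore very short matches,
    # which are likely to be coincidental.
    for size in range(limit, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    # No shared span found: keep both chunks in full.
    return left + "\n" + right


a = "climate change affects weather patterns"
b = "weather patterns over decades"
print(merge_with_overlap(a, b))  # climate change affects weather patterns over decades
```

The `min_overlap` guard is a design choice: without it, a single matching character at the seam would be treated as overlap and silently removed.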


In [7]:
# Let's see what a chunk looks like
from pprint import pprint 

pprint(chunks[0])

Document(metadata={'producer': 'Microsoft® Word 2021', 'creator': 'Microsoft® Word 2021', 'creationdate': '2024-07-13T20:17:34+03:00', 'author': 'Nir', 'moddate': '2024-07-13T20:17:34+03:00', 'source': '../data/Understanding_Climate_Change.pdf', 'total_pages': 33, 'page': 0, 'page_label': '1'}, page_content='Understanding Climate Change \nChapter 1: Introduction to Climate Change \nClimate change refers to significant, long-term changes in the global climate. The term \n"global climate" encompasses the planet\'s overall weather patterns, including temperature, \nprecipitation, and wind patterns, over an extended period. Over the past century, human')


In [9]:
chunks[0].metadata

{'producer': 'Microsoft® Word 2021',
 'creator': 'Microsoft® Word 2021',
 'creationdate': '2024-07-13T20:17:34+03:00',
 'author': 'Nir',
 'moddate': '2024-07-13T20:17:34+03:00',
 'source': '../data/Understanding_Climate_Change.pdf',
 'total_pages': 33,
 'page': 0,
 'page_label': '1'}

#### **Add a chunk_id to the metadata**

In [10]:
# Tag each chunk with a sequential chunk_id (its position in the list)
for c_id, chunk in enumerate(chunks):
    chunk.metadata['chunk_id'] = c_id


In [11]:
chunks[0].metadata

{'producer': 'Microsoft® Word 2021',
 'creator': 'Microsoft® Word 2021',
 'creationdate': '2024-07-13T20:17:34+03:00',
 'author': 'Nir',
 'moddate': '2024-07-13T20:17:34+03:00',
 'source': '../data/Understanding_Climate_Change.pdf',
 'total_pages': 33,
 'page': 0,
 'page_label': '1',
 'chunk_id': 0}

#### **Create a Vector Store**

In [12]:
import faiss
from langchain_community.docstore.in_memory import InMemoryDocstore
from langchain_community.vectorstores import FAISS

index = faiss.IndexFlatL2(len(embedding_model.embed_query("hello world")))

vector_store = FAISS(
    embedding_function=embedding_model,
    index=index,
    docstore=InMemoryDocstore(),
    index_to_docstore_id={},
)

In [14]:
vector_store.add_documents(documents=chunks)

['693c5fed-3632-4b58-8dd3-c3172f80db7b',
 '2a1e17a5-e925-482c-9807-43b739fcc49e',
 'd81e63b9-bcbb-4539-af91-52221a3023c6',
 'f8dcccf1-b4c8-40ed-868f-727279833277',
 '90a918c6-8fa7-44e5-992c-2488dc73dfda',
 'd8f35aaa-b8f5-435e-9895-d1358acd0335',
 '9418fa84-79f4-4e9c-a73e-8b62b03f2df7',
 'e298cd87-ba07-4af8-9b5d-5fb76a06851c',
 '046e15fd-827a-4fe2-83d7-5d59ff48526f',
 'f5ce6c4a-de9e-41bb-8831-6a8fcb9e53dc',
 'aa092c6e-0f29-4444-be24-7091abf9cdce',
 'ad6825d4-649f-4c60-8bd0-d043307e24e9',
 '157a9879-82e1-4fbf-94d3-5215cd51029d',
 'dfaca812-6f82-4a99-b488-4bdc9a121816',
 '9f4cae15-d812-4e85-a640-54a63a3cdae5',
 '7301e099-7e74-4b2b-9181-f24f1ae35657',
 '87542339-904b-40e9-a69d-6e324d8c894e',
 '7029857a-7dbc-4e5c-bd44-c103a7947712',
 'a4358680-b4cd-465b-bbb8-c661063f0b0e',
 '43c4ee39-1c3e-4d72-bc66-217e5228bf6b',
 'd9d1136c-7644-43a3-91d5-9e662431cfc0',
 '32494972-3b1f-4615-ac41-b1cd9af8c47e',
 '550c896c-9c90-41cf-9d59-82d4e511b961',
 'f9c477ec-e568-4693-857a-0b6a83b8d2d5',
 '27a2fd6f-8f6a-

#### **Make a retriever**

In [18]:
retriever = vector_store.as_retriever(search_type="mmr", search_kwargs={"k": 3})

contexts = retriever.invoke("What is Climate Change?")

for doc in contexts:
    print("-"*89)
    print(doc.page_content)

-----------------------------------------------------------------------------------------
Understanding Climate Change 
Chapter 1: Introduction to Climate Change 
Climate change refers to significant, long-term changes in the global climate. The term 
"global climate" encompasses the planet's overall weather patterns, including temperature, 
precipitation, and wind patterns, over an extended period. Over the past century, human
-----------------------------------------------------------------------------------------
Chapter 14: Climate Change and the Economy 
Economic Transformation
-----------------------------------------------------------------------------------------
and infrastructure. Cities are particularly vulnerable due to the "urban heat island" effect. 
Heatwaves can lead to heat-related illnesses and exacerbate existing health conditions. 
Changing Seasons 
Climate change is altering the timing and length of seasons, affecting ecosystems and human 
activities. For example, 

In [None]:
# Next, for each retrieved chunk, add the chunk that comes before it and the chunk that comes after it
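
One way this step could look is sketched below. It is not the notebook's final implementation: `expand_with_neighbors` and `window` are hypothetical names, and the function only assumes that every chunk exposes `page_content` and a `metadata` dict containing `chunk_id` (and optionally `source`). The demo uses `SimpleNamespace` objects as stand-ins for LangChain `Document`s so that it runs on its own.

```python
from types import SimpleNamespace


def expand_with_neighbors(retrieved, all_chunks, window=1):
    """Return one text block per retrieved chunk: previous + hit + next chunk(s)."""
    # Map chunk_id -> chunk so neighbours can be fetched without a vector search.
    by_id = {c.metadata["chunk_id"]: c for c in all_chunks}
    expanded = []
    for doc in retrieved:
        cid = doc.metadata["chunk_id"]
        source = doc.metadata.get("source")
        parts = []
        for nid in range(cid - window, cid + window + 1):
            neighbor = by_id.get(nid)
            # Skip IDs past either end, and chunks that belong to another source file.
            if neighbor is not None and neighbor.metadata.get("source") == source:
                parts.append(neighbor.page_content)
        expanded.append("\n".join(parts))
    return expanded


# Stand-in data (assumption: real chunks carry the same two attributes).
demo_chunks = [
    SimpleNamespace(page_content=f"text {i}", metadata={"chunk_id": i, "source": "demo.pdf"})
    for i in range(4)
]
demo_hits = [demo_chunks[0], demo_chunks[2]]

for block in expand_with_neighbors(demo_hits, demo_chunks):
    print("-" * 40)
    print(block)
```

In this notebook the call would presumably be `expand_with_neighbors(contexts, chunks)`. Since the chunks were split with an overlap, a few words may repeat where two neighbouring chunks meet.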