In [1]:
# PyPDFLoader needs pypdf and the FAISS vector store needs faiss-cpu
%pip install langchain-community langchain pypdf faiss-cpu

from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

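# Load the PDF; PyPDFLoader returns one Document per page
loader = PyPDFLoader("Atomic habits ( PDFDrive ).pdf")
pages = loader.load()

# Split pages into chunks of at most 500 characters,
# with 50 characters of overlap between neighbouring chunks
splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=50
)

chunks = splitter.split_documents(pages)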

print(f"Loaded {len(pages)} pages.")
print(f"Split into {len(chunks)} chunks.")
print(chunks[0].page_content[:300])

Loaded 256 pages.
Split into 1136 chunks.
AN	IMPRINT	OF	PENGUIN	RANDOM	HOUSE	LLC
375	Hudson	Street
New	York,	New	York	10014
Copyright	©	2018	by	James	Clear
Penguin	supports	copyright.	Copyright	fuels	creativity,	encourages	diverse	voices,	promotes	free	speech,	and	creates	a	vibrant	culture.	Thank	you	for	buying	an	authorized	edition	of	this
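
The counts above work out to roughly 4.4 chunks per page (1136 / 256). To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of fixed-size chunking with overlap. It is a simplification and not how `RecursiveCharacterTextSplitter` works internally (that class prefers to break on separators such as paragraphs and newlines); the function name `sliding_window_chunks` and the sample text are assumptions made for illustration.

```python
def sliding_window_chunks(text, chunk_size=500, chunk_overlap=50):
    """Cut text into fixed-size pieces where each piece repeats the
    last `chunk_overlap` characters of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


# Hypothetical sample: 1200 characters of repeating digits
sample = "".join(str(i % 10) for i in range(1200))
pieces = sliding_window_chunks(sample)
print([len(p) for p in pieces])
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.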


In [2]:
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import OllamaEmbeddings

embedding = OllamaEmbeddings(model="nomic-embed-text")
vectorstore = FAISS.from_documents(chunks, embedding=embedding)
vectorstore.save_local("faiss_index")

print("✅ Vector store created and saved.")

✅ Vector store created and saved.
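
Conceptually, the vector store maps each chunk to an embedding vector and, at query time, returns the chunks whose vectors lie closest to the query vector. The sketch below shows that idea as a brute-force nearest-neighbour search in plain Python. It is not FAISS's implementation, and the three-dimensional vectors, chunk texts, and the names `l2_distance` and `top_k` are made up for illustration; real embeddings from `nomic-embed-text` have far more dimensions.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query_vec, indexed, k=4):
    """Return the texts of the k entries closest to query_vec.

    `indexed` is a list of (text, vector) pairs.
    """
    ranked = sorted(indexed, key=lambda item: l2_distance(query_vec, item[1]))
    return [text for text, _ in ranked[:k]]


# Toy index with hypothetical 3-dimensional "embeddings"
toy_index = [
    ("habits compound over time", [0.9, 0.1, 0.0]),
    ("design your environment", [0.1, 0.9, 0.0]),
    ("copyright notice", [0.0, 0.0, 1.0]),
]

nearest = top_k([0.8, 0.2, 0.0], toy_index, k=2)
print(nearest)
```

A dedicated library matters once the index holds thousands of high-dimensional vectors, where a linear scan like this becomes slow.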


In [3]:
from langchain.chains import RetrievalQA
from langchain_community.llms import Ollama

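# Load the FAISS index saved in the previous cell.
# allow_dangerous_deserialization=True unpickles the stored documents,
# so only use it on index files you created yourself.
vectorstore = FAISS.load_local("faiss_index", embedding, allow_dangerous_deserialization=True)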

# Create retriever
retriever = vectorstore.as_retriever()

# Connect Ollama LLM (Mistral)
llm = Ollama(model="mistral")

# Build the RAG QA chain
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=retriever,
    return_source_documents=True
)

# Ask a question!
query = "What are the key habits described in the book?"
result = qa_chain.invoke({"query": query})

print("Answer:", result["result"])
print("\nSources:")
for doc in result["source_documents"]:
    print("-", doc.metadata.get("source", "No source"))


Answer:  The book does not specifically describe any particular key habits, but instead focuses on the process of building better habits in general. The Four Laws of Behavior Change outlined in the text are considered a simple set of rules to build better habits:

1. Make it obvious - Identify the cue and craving that trigger a behavior, and make the desired response associated with it more apparent.
2. Make it attractive - Increase the reward and satisfaction you get from performing the desired habit.
3. Make it easy - Minimize the effort or barriers to doing the desired habit.
4. Make it satisfying - Reward yourself for performing the desired habit to reinforce it.

Sources:
- Atomic habits ( PDFDrive ).pdf
- Atomic habits ( PDFDrive ).pdf
- Atomic habits ( PDFDrive ).pdf
- Atomic habits ( PDFDrive ).pdf
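
The source list above repeats the same filename because every chunk comes from one PDF. `PyPDFLoader` also records a zero-based `page` entry in each document's metadata, so the listing could show which pages were retrieved. Below is a minimal sketch of such a formatter; it works on plain metadata dictionaries, the helper name `format_sources` is hypothetical, and the page numbers in the sample are invented, not taken from an actual run.

```python
def format_sources(metadatas):
    """Build one display line per unique (source, page) pair,
    keeping the order in which they were retrieved."""
    seen = set()
    lines = []
    for meta in metadatas:
        source = meta.get("source", "No source")
        page = meta.get("page")  # zero-based page index, or None if absent
        key = (source, page)
        if key in seen:
            continue  # skip duplicates of the same page
        seen.add(key)
        if page is None:
            lines.append(source)
        else:
            lines.append(f"{source}, page {page + 1}")
    return lines


# Hypothetical metadata, shaped like doc.metadata for each retrieved chunk
sample_metadata = [
    {"source": "Atomic habits ( PDFDrive ).pdf", "page": 41},
    {"source": "Atomic habits ( PDFDrive ).pdf", "page": 41},
    {"source": "Atomic habits ( PDFDrive ).pdf", "page": 57},
    {"source": "Atomic habits ( PDFDrive ).pdf"},
]

for line in format_sources(sample_metadata):
    print("-", line)
```

With the chain's real output, the input would be `[doc.metadata for doc in result["source_documents"]]`.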


In [8]:
query = "tell more about walk slowely but never backward"
result = qa_chain.invoke({"query": query})

print("Answer:", result["result"])
print("\nSources:")
for doc in result["source_documents"]:
    print("-", doc.metadata.get("source", "No source"))

Answer:  The context provided does not directly mention walking slowly but never backward. However, the text discusses how habits form based on frequency, not time, and it emphasizes that once a habit is formed, it often determines the choices we make. If someone were to develop a habit of walking slowly, they might find it difficult to break that habit and start walking quickly or even walk backwards, as their brain has established a pattern of walking slowly. But without specific information about walking slowly but never backward, this answer is speculative and based on general interpretations from the provided context.

Sources:
- Atomic habits ( PDFDrive ).pdf
- Atomic habits ( PDFDrive ).pdf
- Atomic habits ( PDFDrive ).pdf
- Atomic habits ( PDFDrive ).pdf


In [7]:
query = "who is the author?"
result = qa_chain.invoke({"query": query})

print("Answer:", result["result"])
print("\nSources:")
for doc in result["source_documents"]:
    print("-", doc.metadata.get("source", "No source"))

Answer:  The author of the provided context is James Clear.

Sources:
- Atomic habits ( PDFDrive ).pdf
- Atomic habits ( PDFDrive ).pdf
- Atomic habits ( PDFDrive ).pdf
- Atomic habits ( PDFDrive ).pdf


In [6]:
query = "what is Enviorment matter more?"
result = qa_chain.invoke({"query": query})

print("Answer:", result["result"])
print("\nSources:")
for doc in result["source_documents"]:
    print("-", doc.metadata.get("source", "No source"))

Answer:  In the context provided, it's not entirely clear what "Enviorment matter more" refers to. However, based on the discussion about reducing friction associated with habits and designing one's environment for success, it seems that in this context, the emphasis is on how one's environment can significantly impact their habits and overall well-being. So, in a sense, the "environment" matters more when it comes to simplifying one's life and achieving success. This is achieved by designing an environment that aligns with one's goals and reduces the friction associated with one's habits.

Sources:
- Atomic habits ( PDFDrive ).pdf
- Atomic habits ( PDFDrive ).pdf
- Atomic habits ( PDFDrive ).pdf
- Atomic habits ( PDFDrive ).pdf


## Summary

In this assignment, I learned how to build a local RAG pipeline with LangChain and Ollama. I loaded a PDF, split it into overlapping 500-character chunks, embedded the chunks with a local model (nomic-embed-text), stored the vectors in a FAISS index, and queried the index with Mistral through a RetrievalQA chain. I also customized prompts to improve answer quality.

Tools used:
- LangChain
- Ollama (nomic-embed-text & mistral)
- FAISS vector store

This helped me understand how retrieval-based systems work in real-world GenAI apps.
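
The cells above use the chain's default prompt, so the customized prompt mentioned in the summary is not shown. As an illustration only, a retrieval prompt typically has a slot for the retrieved context and a slot for the question. The sketch below builds one with plain string formatting; the template wording and the name `build_prompt` are assumptions, not the prompt actually used in this assignment, and the context sentence is a stand-in.

```python
# Hypothetical template: the wording is an assumption for illustration
PROMPT_TEMPLATE = (
    "Answer the question using only the context below.\n"
    "If the context does not contain the answer, say you don't know.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)


def build_prompt(context_chunks, question):
    """Join the retrieved chunks and fill both slots of the template."""
    context = "\n\n".join(chunk.strip() for chunk in context_chunks)
    return PROMPT_TEMPLATE.format(context=context, question=question)


example_prompt = build_prompt(
    ["Habits form based on frequency, not time."],  # stand-in for retrieved text
    "who is the author?",
)
print(example_prompt)
```

In LangChain, a template like this would usually be wrapped in a `PromptTemplate` with `context` and `question` input variables and passed to `RetrievalQA.from_chain_type` through `chain_type_kwargs={"prompt": ...}`. An explicit instruction to admit when the context lacks the answer may reduce speculative replies such as the one to the "walk slowely" query.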