### Building a RAG System with LangChain and FAISS 
### Introduction to RAG (Retrieval-Augmented Generation)
RAG combines the power of retrieval systems with generative AI models. Instead of relying solely on the model's training data, RAG:

1. Retrieves relevant documents from a knowledge base
2. Uses these documents as context for the LLM
3. Generates responses based on both the retrieved context and the model's knowledge
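
The three steps above can be sketched with a toy example. This is only an illustration under stated assumptions: the `knowledge_base` dictionary, the word-overlap `retrieve` function, and `build_prompt` are hypothetical stand-ins, not LangChain APIs, and the final LLM call is left out.

```python
# Toy illustration of retrieve -> augment -> generate; not a real retriever or LLM.
knowledge_base = {
    "faiss": "FAISS is a library for similarity search over dense vectors.",
    "rag": "RAG retrieves documents and passes them to an LLM as context.",
    "pizza": "Pizza is a baked flatbread with toppings.",
}


def retrieve(question: str, k: int = 2) -> list[str]:
    """Rank documents by shared words (a stand-in for vector search)."""
    q_words = set(question.lower().split())
    ranked = sorted(
        knowledge_base.values(),
        key=lambda doc: len(q_words & set(doc.lower().replace(".", "").split())),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, docs: list[str]) -> str:
    """Place the retrieved documents in the prompt as context."""
    context = "\n".join(f"- {d}" for d in docs)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


question = "what is faiss used for"
docs = retrieve(question)               # step 1: retrieve
prompt = build_prompt(question, docs)   # step 2: add context
# Step 3 would send `prompt` to an LLM; that call is omitted here.
print(prompt)
```

The rest of the notebook replaces the word-overlap ranking with embedding similarity and the missing last step with a chat model.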

### FAISS 
https://github.com/facebookresearch/faiss

FAISS is a library for efficient similarity search and clustering of dense vectors.

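Key advantages:
1. Fast similarity search, with both exact and approximate nearest-neighbor indexes
2. Memory-efficient index types that compress vectors (for example, product quantization)
3. Optional GPU acceleration
4. Scales to millions of vectors and beyond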

How it works:
- Indexes vectors for fast nearest neighbor search
- Returns most similar vectors based on distance metrics
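
To make the nearest-neighbor idea concrete, here is a brute-force sketch in plain Python. It is not how FAISS is implemented (FAISS uses optimized index structures); `l2_distance` and `nearest_neighbors` are hypothetical helper names used only for illustration.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbors(index_vectors, query, k=2):
    """Return (position, distance) pairs for the k vectors closest to the query."""
    distances = [(i, l2_distance(vec, query)) for i, vec in enumerate(index_vectors)]
    distances.sort(key=lambda pair: pair[1])
    return distances[:k]


vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
hits = nearest_neighbors(vectors, query=[0.9, 0.1], k=2)
print(hits)  # positions of the two closest vectors, nearest first
```

A brute-force scan like this compares the query with every stored vector, which is the cost that dedicated indexes are designed to reduce.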


In [1]:
## load libraries
import os
from dotenv import load_dotenv
import numpy as np
import warnings
warnings.filterwarnings('ignore')

#Langchain core imports
from langchain_core.documents import Document
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_classic.chains.retrieval import create_retrieval_chain
from langchain_classic.chains.combine_documents import create_stuff_documents_chain

#Load environment variables (note the parentheses: the function must be called)
load_dotenv()

### Data Ingestion And Processing

In [2]:
sample_documents = [
    Document(
        page_content="""
        Artificial Intelligence (AI) is the simulation of human intelligence in machines.
        These systems are designed to think like humans and mimic their actions.
        AI can be categorized into narrow AI and general AI.
        """,
        metadata={"source": "AI Introduction", "page": 1, "topic": "AI"}
    ),
    Document(
        page_content="""
        Machine Learning is a subset of AI that enables systems to learn from data.
        Instead of being explicitly programmed, ML algorithms find patterns in data.
        Common types include supervised, unsupervised, and reinforcement learning.
        """,
        metadata={"source": "ML Basics", "page": 1, "topic": "ML"}
    ),
    Document(
        page_content="""
        Deep Learning is a subset of machine learning based on artificial neural networks.
        It uses multiple layers to progressively extract higher-level features from raw input.
        Deep learning has revolutionized computer vision, NLP, and speech recognition.
        """,
        metadata={"source": "Deep Learning", "page": 1, "topic": "DL"}
    ),
    Document(
        page_content="""
        Natural Language Processing (NLP) is a branch of AI that helps computers understand human language.
        It combines computational linguistics with machine learning and deep learning models.
        Applications include chatbots, translation, sentiment analysis, and text summarization.
        """,
        metadata={"source": "NLP Overview", "page": 1, "topic": "NLP"}
    )
]

print(sample_documents)

[Document(metadata={'source': 'AI Introduction', 'page': 1, 'topic': 'AI'}, page_content='\n        Artificial Intelligence (AI) is the simulation of human intelligence in machines.\n        These systems are designed to think like humans and mimic their actions.\n        AI can be categorized into narrow AI and general AI.\n        '), Document(metadata={'source': 'ML Basics', 'page': 1, 'topic': 'ML'}, page_content='\n        Machine Learning is a subset of AI that enables systems to learn from data.\n        Instead of being explicitly programmed, ML algorithms find patterns in data.\n        Common types include supervised, unsupervised, and reinforcement learning.\n        '), Document(metadata={'source': 'Deep Learning', 'page': 1, 'topic': 'DL'}, page_content='\n        Deep Learning is a subset of machine learning based on artificial neural networks.\n        It uses multiple layers to progressively extract higher-level features from raw input.\n        Deep learning has revolu

In [3]:
## text splitting
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap = 50,
    length_function = len,
    separators=[" "]
)

##split the documents into chunks
chunks = text_splitter.split_documents(sample_documents)
print(chunks[0])
print(chunks[1])

page_content='Artificial Intelligence (AI) is the simulation of human intelligence in machines.
        These systems are designed to think like humans and mimic their actions.
        AI can be categorized into narrow AI and general AI.' metadata={'source': 'AI Introduction', 'page': 1, 'topic': 'AI'}
page_content='Machine Learning is a subset of AI that enables systems to learn from data.
        Instead of being explicitly programmed, ML algorithms find patterns in data.
        Common types include supervised, unsupervised, and reinforcement learning.' metadata={'source': 'ML Basics', 'page': 1, 'topic': 'ML'}
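
Each sample document is shorter than `chunk_size`, so every document comes back as a single chunk. To show what `chunk_size` and `chunk_overlap` mean on longer text, here is a simplified sliding-window sketch. It is an assumption-laden illustration: `split_with_overlap` is a hypothetical helper, and `RecursiveCharacterTextSplitter` itself splits on separators and merges pieces rather than cutting fixed windows.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut fixed-size character windows; each chunk repeats the tail of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]


text = "abcdefghijklmnopqrstuvwxyz"
pieces = split_with_overlap(text, chunk_size=10, chunk_overlap=3)
for piece in pieces:
    print(piece)
```

The overlap keeps a little shared text between neighboring chunks, so a sentence cut at a boundary still appears whole in at least one of them.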


In [4]:
print(f"Created {len(chunks)} chunks from {len(sample_documents)} documents")
print("\nExample chunk:")
print(f"Content: {chunks[0].page_content}")
print(f"Metadata: {chunks[0].metadata}")

Created 4 chunks from 4 documents

Example chunk:
Content: Artificial Intelligence (AI) is the simulation of human intelligence in machines.
        These systems are designed to think like humans and mimic their actions.
        AI can be categorized into narrow AI and general AI.
Metadata: {'source': 'AI Introduction', 'page': 1, 'topic': 'AI'}


In [5]:
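### load the API key for the embedding model
load_dotenv()

# Fail early with a clear message if the key is missing
# (assigning None to os.environ would raise a TypeError)
openai_api_key = os.getenv("OPENAI_API_KEY")
if not openai_api_key:
    raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file")
os.environ["OPENAI_API_KEY"] = openai_api_key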

In [6]:
# Initialize OpenAI embeddings (1536 is the default output size of text-embedding-3-small)

embeddings = OpenAIEmbeddings(
    model="text-embedding-3-small",
    dimensions=1536
)

## Example: create an embedding for a single text
sample_text = "What is machine Learning"
sample_embedding = embeddings.embed_query(sample_text)
sample_text

'What is machine Learning'

In [7]:
sample_embedding

[-0.012354416772723198,
 0.0067105707712471485,
 -0.016795476898550987,
 -0.04177643358707428,
 0.007984110154211521,
 0.019396979361772537,
 -0.010933930985629559,
 0.00220012036152184,
 -0.03226298838853836,
 0.053510408848524094,
 0.01938609406352043,
 -0.032458916306495667,
 -0.0401436910033226,
 0.0022844786290079355,
 0.03010776825249195,
 0.01352999173104763,
 0.008931100368499756,
 -0.02436051517724991,
 0.023054322227835655,
 0.01219114288687706,
 0.00583977485075593,
 0.04419289156794548,
 0.02832263708114624,
 -0.020169809460639954,
 0.01378034520894289,
 -0.020670518279075623,
 0.02224883623421192,
 0.027016442269086838,
 0.0028491353150457144,
 -0.013889194466173649,
 -0.0038940904196351767,
 -0.029302282258868217,
 -0.027865469455718994,
 0.02947644144296646,
 0.03927289694547653,
 0.01148362085223198,
 -0.0014449770096689463,
 -0.012702735140919685,
 -0.05943182110786438,
 -0.0008646731148473918,
 -0.05246545374393463,
 -0.016773706302046776,
 0.0154239721596241,
 0.0801

In [8]:
texts = ["AI", "Machine Learning", "Deep Learning", "Neural Network"]
batch_embeddings = embeddings.embed_documents(texts)
print(batch_embeddings[0])

[-0.008146658539772034, -0.024611903354525566, 0.002883070847019553, 0.025180581957101822, 0.0064902156591415405, -0.028275255113840103, -0.004995780065655708, 0.02094855159521103, -0.036871567368507385, 0.012861405499279499, -0.0030467314645648003, -0.02016827091574669, 0.00026408862322568893, -0.03277178853750229, 0.006460458971560001, -0.02531283348798752, -0.031052524223923683, -0.0543815940618515, 0.03279823809862137, -0.018396107479929924, 0.01662394590675831, 0.048324499279260635, -0.02488962933421135, 0.014402128756046295, 0.029359711334109306, 0.004013816360384226, 0.00925756711512804, 0.013410246931016445, 0.0025061555206775665, -0.022588463500142097, 0.032136980444192886, -0.02798430249094963, 0.005392532795667648, -0.03819407522678375, -0.016690069809556007, 0.014362453483045101, -0.03861727938055992, -0.010348637588322163, -0.010540401563048363, -0.019189612939953804, 0.03203118219971657, 0.014679855667054653, -0.021504005417227745, 0.016094941645860672, -0.011836460791528

In [9]:
print(batch_embeddings[1])

[-0.0220950935035944, -0.003550386056303978, -0.019209157675504684, -0.03406089171767235, 0.03383275493979454, 0.008634994737803936, 0.0014451069291681051, 0.026030460372567177, -0.041361283510923386, 0.04215976595878601, -0.0006733613554388285, -0.03789359703660011, -0.03771108761429787, -0.0016283296281471848, 0.016049455851316452, 0.016482917591929436, 0.019254785031080246, -0.017144514247775078, 0.01755516231060028, 0.017646417021751404, 0.02183273620903492, 0.02427380345761776, 0.01995060220360756, -0.017361244186758995, 0.0400380864739418, -0.02029280923306942, 0.029155941680073738, 0.04083656892180443, -0.007260468322783709, -0.027193961665034294, -0.014965804293751717, -0.014566564001142979, -0.03805329278111458, -0.016505731269717216, 0.02393159829080105, -0.00165399513207376, -0.0030370771419256926, 0.02671487256884575, -0.049095138907432556, 0.008634994737803936, -0.028950616717338562, -0.015182534232735634, 0.015319416299462318, 0.07364270836114883, -0.017589382827281952, -

In [10]:
## compare Embeddings using cosine similarity

def compare_embeddings(text1: str, text2: str) -> float:
    """Compare semantic similarity of 2 texts using embeddings"""

    emb1 = np.array(embeddings.embed_query(text1))
    emb2 = np.array(embeddings.embed_query(text2))

    ## calculate the similarity score

    similarity = np.dot(emb1, emb2)/(np.linalg.norm(emb1) * np.linalg.norm(emb2))
    return similarity


In [11]:
# Test semantic similarity
print("\nSemantic Similarity Examples:")
print(f"'AI' vs 'Artificial Intelligence': {compare_embeddings('AI', 'Artificial Intelligence'):.3f}")


Semantic Similarity Examples:
'AI' vs 'Artificial Intelligence': 0.563


In [12]:
print(f"'AI' vs 'Pizza': {compare_embeddings('AI', 'Pizza'):.3f}")

'AI' vs 'Pizza': 0.254


In [13]:
print(f"'Machine Learning' vs 'ML': {compare_embeddings('Machine Learning', 'ML'):.3f}")

'Machine Learning' vs 'ML': 0.461
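
To see how the cosine similarity formula in `compare_embeddings` behaves, it can help to run it on small hand-made vectors. The sketch below is a plain-Python version of the same formula; the vectors are made up for illustration and are not real embeddings.

```python
import math


def cosine_similarity(a, b):
    """dot(a, b) / (|a| * |b|), the same formula used in compare_embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])   # parallel vectors
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])       # unrelated directions
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])        # opposite directions
print(same_direction, orthogonal, opposite)
```

Because the formula divides by both lengths, only the direction of the vectors matters: scaling a vector does not change its similarity to another.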


### Create FAISS Vector Store

In [14]:
## requires: pip install faiss-cpu
vectorestore = FAISS.from_documents(
    documents=chunks,
    embedding=embeddings
)

print(f"Vector store contains {vectorestore.index.ntotal} vectors")

Vector store contains 4 vectors


In [15]:
vectorestore

<langchain_community.vectorstores.faiss.FAISS at 0x1362f0558b0>

In [16]:
## Save vector store for later use
vectorestore.save_local("faiss_index")
print("vector store saved to 'faiss_index' directory" )

vector store saved to 'faiss_index' directory


In [17]:
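## load vector store
## allow_dangerous_deserialization=True is required because the docstore is saved
## with pickle; only load index directories you created yourself or fully trust
loaded_vector_store = FAISS.load_local(
    "faiss_index",
    embeddings,
    allow_dangerous_deserialization=True
)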

loaded_vector_store

<langchain_community.vectorstores.faiss.FAISS at 0x1362e1ace90>

In [18]:
## similarity search

query = "what is deep learning"

# k is the number of chunks to return
results = vectorestore.similarity_search(query, k=3)

In [19]:
results

[Document(id='c129b1f4-1f4a-408a-bad8-08cd90c9982c', metadata={'source': 'Deep Learning', 'page': 1, 'topic': 'DL'}, page_content='Deep Learning is a subset of machine learning based on artificial neural networks.\n        It uses multiple layers to progressively extract higher-level features from raw input.\n        Deep learning has revolutionized computer vision, NLP, and speech recognition.'),
 Document(id='8d6ce0f4-0867-4343-a029-b3050826388b', metadata={'source': 'ML Basics', 'page': 1, 'topic': 'ML'}, page_content='Machine Learning is a subset of AI that enables systems to learn from data.\n        Instead of being explicitly programmed, ML algorithms find patterns in data.\n        Common types include supervised, unsupervised, and reinforcement learning.'),
 Document(id='7387e3b0-3e64-4c32-b458-8b16603f2e45', metadata={'source': 'NLP Overview', 'page': 1, 'topic': 'NLP'}, page_content='Natural Language Processing (NLP) is a branch of AI that helps computers understand human la

In [20]:
print(f"Query: {query}\n")
print("Top 3 similar chunks:")
for i, doc in enumerate(results):
    print(f"\n{i+1}. Source: {doc.metadata['source']}")
    print(f"   Content: {doc.page_content[:200]}...")

Query: what is deep learning

Top 3 similar chunks:

1. Source: Deep Learning
   Content: Deep Learning is a subset of machine learning based on artificial neural networks.
        It uses multiple layers to progressively extract higher-level features from raw input.
        Deep learning ...

2. Source: ML Basics
   Content: Machine Learning is a subset of AI that enables systems to learn from data.
        Instead of being explicitly programmed, ML algorithms find patterns in data.
        Common types include supervised...

3. Source: NLP Overview
   Content: Natural Language Processing (NLP) is a branch of AI that helps computers understand human language.
        It combines computational linguistics with machine learning and deep learning models.
      ...

In [21]:
### Similarity search with scores (the score is a distance: lower means more similar)

results_with_scores = vectorestore.similarity_search_with_score(query, k=3)

print("\n\nSimilarity search with scores:")
for doc, score in results_with_scores:
    print(f"\nScore: {score:.3f}")
    print(f"Source: {doc.metadata['source']}")
    print(f"Content preview: {doc.page_content[:100]}...")



Similarity search with scores:

Score: 0.538
Source: Deep Learning
Content preview: Deep Learning is a subset of machine learning based on artificial neural networks.
        It uses m...

Score: 1.168
Source: ML Basics
Content preview: Machine Learning is a subset of AI that enables systems to learn from data.
        Instead of being...

Score: 1.287
Source: NLP Overview
Content preview: Natural Language Processing (NLP) is a branch of AI that helps computers understand human language.
...
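
The scores above grow as the chunks get less relevant, which is the opposite of the cosine similarities computed earlier. A minimal sketch of how the two relate, assuming the store uses LangChain's default flat L2 index (which reports squared Euclidean distance) and that the embeddings are unit length; if either assumption does not hold for your setup, the conversion does not apply. `cosine_from_squared_l2` is a hypothetical helper name.

```python
def cosine_from_squared_l2(squared_distance: float) -> float:
    """For unit-length vectors: ||a - b||^2 = 2 - 2 * cos(a, b)."""
    return 1.0 - squared_distance / 2.0


# Scores copied from the output above
scores = [("Deep Learning", 0.538), ("ML Basics", 1.168), ("NLP Overview", 1.287)]
for source, score in scores:
    print(f"{source}: distance={score:.3f} -> cosine ~ {cosine_from_squared_l2(score):.3f}")
```

Under these assumptions a distance of 0 corresponds to identical direction (cosine 1) and a distance of 2 to orthogonal vectors (cosine 0).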


In [22]:
### Search with metadata filtering
filter_dict = {"topic":"ML"}
filtered_results = vectorestore.similarity_search(
    query,
    k=3,
    filter=filter_dict
)
print(filtered_results)

[Document(id='8d6ce0f4-0867-4343-a029-b3050826388b', metadata={'source': 'ML Basics', 'page': 1, 'topic': 'ML'}, page_content='Machine Learning is a subset of AI that enables systems to learn from data.\n        Instead of being explicitly programmed, ML algorithms find patterns in data.\n        Common types include supervised, unsupervised, and reinforcement learning.')]


In [23]:
len(filtered_results)

1
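
Only one chunk comes back even though `k=3`, because only one chunk has `topic` equal to `"ML"`. Conceptually, a dictionary filter keeps a document when every key in the filter matches its metadata. The sketch below illustrates that idea only; `matches` is a hypothetical helper, and it does not reproduce how LangChain's FAISS wrapper applies filters internally (for example, its `fetch_k` candidate limit).

```python
def matches(metadata: dict, filter_dict: dict) -> bool:
    """True when every key/value pair in the filter is present in the metadata."""
    return all(metadata.get(key) == value for key, value in filter_dict.items())


# Metadata copied from the sample documents
records = [
    {"source": "AI Introduction", "page": 1, "topic": "AI"},
    {"source": "ML Basics", "page": 1, "topic": "ML"},
    {"source": "Deep Learning", "page": 1, "topic": "DL"},
    {"source": "NLP Overview", "page": 1, "topic": "NLP"},
]

kept = [m["source"] for m in records if matches(m, {"topic": "ML"})]
print(kept)
```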

### Building RAG with LCEL

In [26]:
## LLM: OpenAI chat model
from langchain.chat_models import init_chat_model

# Uses the OPENAI_API_KEY loaded earlier
llm = init_chat_model(
    model="gpt-4.1-mini",
    model_provider="openai"
)

llm

ChatOpenAI(client=<openai.resources.chat.completions.completions.Completions object at 0x000001362F06CB30>, async_client=<openai.resources.chat.completions.completions.AsyncCompletions object at 0x000001362F057920>, root_client=<openai.OpenAI object at 0x000001362E1D6180>, root_async_client=<openai.AsyncOpenAI object at 0x000001362F0558E0>, model_name='gpt-4.1-mini', model_kwargs={}, openai_api_key=SecretStr('**********'), stream_usage=True)

In [28]:
# 1. Simple RAG Chain with LCEL
simple_prompt = ChatPromptTemplate.from_template("""Answer the question based only on the following context:

Context: {context}
                                                 
Question: {question}
                                                 
Answer: """)

In [29]:
vectorestore

<langchain_community.vectorstores.faiss.FAISS at 0x1362f0558b0>

In [35]:
#Basic retriever
retriever = vectorestore.as_retriever(
    search_type = "similarity",
    search_kwargs = {"k" : 3}
)

In [36]:
from typing import List
# Format documents for the prompt
def format_docs(docs: List[Document]) -> str:
    """Format documents for insertion into prompt"""
    formatted = []
    for i, doc in enumerate(docs):
        source = doc.metadata.get('source', 'Unknown')
        formatted.append(f"Document {i+1} (Source: {source}):\n{doc.page_content}")
    return "\n\n".join(formatted)
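
To see the string that `format_docs` produces without calling the retriever, the function can be run on hand-made inputs. In this sketch `FakeDocument` is a hypothetical stand-in for LangChain's `Document` (only `page_content` and `metadata` are needed), and `format_docs` is repeated so the example runs on its own.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDocument:
    """Hypothetical stand-in for langchain_core.documents.Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs) -> str:
    """Same logic as the notebook's format_docs."""
    formatted = []
    for i, doc in enumerate(docs):
        source = doc.metadata.get("source", "Unknown")
        formatted.append(f"Document {i+1} (Source: {source}):\n{doc.page_content}")
    return "\n\n".join(formatted)


context = format_docs([
    FakeDocument("ML learns from data.", {"source": "ML Basics"}),
    FakeDocument("No metadata here."),  # falls back to "Unknown"
])
print(context)
```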

In [37]:
simple_rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | simple_prompt
    | llm
    | StrOutputParser()
)

In [38]:
simple_rag_chain

{
  context: VectorStoreRetriever(tags=['FAISS', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x000001362F0558B0>, search_kwargs={'k': 3})
           | RunnableLambda(format_docs),
  question: RunnablePassthrough()
}
| ChatPromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, template='Answer the question based only on the following context:\n\nContext: {context}\n\nQuestion: {question}\n\nAnswer: '), additional_kwargs={})])
| ChatOpenAI(client=<openai.resources.chat.completions.completions.Completions object at 0x000001362F06CB30>, async_client=<openai.resources.chat.completions.completions.AsyncCompletions object at 0x000001362F057920>, root_client=<openai.OpenAI object at 0x000001362E1D6180>, root_async_client=<openai.AsyncOpenAI object at 0x000001362F0558E0>, mode
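
The chain printed above reads left to right: the dictionary runs both branches on the same question, and each `|` feeds one step's output into the next. The sketch below imitates that data flow with a tiny class. It is an illustration under assumptions, not LangChain's implementation: `Step`, `parallel`, and the fake components are hypothetical names.

```python
class Step:
    """Hypothetical minimal runnable: wraps a function and supports `|`."""

    def __init__(self, fn):
        self.fn = fn

    def invoke(self, value):
        return self.fn(value)

    def __or__(self, other):
        # a | b -> a new Step that runs a, then passes its output to b
        return Step(lambda value: other.invoke(self.invoke(value)))


def parallel(**branches):
    """Run every branch on the same input and collect the results in a dict."""
    return Step(lambda value: {name: step.invoke(value) for name, step in branches.items()})


retrieve = Step(lambda q: f"[docs about: {q}]")   # stands in for retriever | format_docs
passthrough = Step(lambda q: q)                   # stands in for RunnablePassthrough()
prompt = Step(lambda d: f"Context: {d['context']}\nQuestion: {d['question']}")
fake_llm = Step(lambda p: p.upper())              # stands in for the chat model

chain = parallel(context=retrieve, question=passthrough) | prompt | fake_llm
result = chain.invoke("what is faiss")
print(result)
```

This is why `simple_rag_chain.invoke(question)` takes a plain string: the first step turns it into the `context` and `question` values the prompt expects.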

In [39]:
### Conversational RAG chain -> chat history is passed in with each call

conversational_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful AI assistant. Use the provided context to answer questions."),
    ("placeholder", "{chat_history}"),
    ("human", "Context: {context}\n\nQuestion: {input}")
])

In [40]:
def create_conversational_rag():
    """Create a conversational RAG chain; the caller supplies chat_history on each invoke"""
    return (
        RunnablePassthrough.assign(
            context=lambda x: format_docs(retriever.invoke(x["input"]))
        )
        | conversational_prompt
        | llm
        | StrOutputParser()
    )

In [43]:
conversational_rag = create_conversational_rag()
conversational_rag

RunnableAssign(mapper={
  context: RunnableLambda(lambda x: format_docs(retriever.invoke(x['input'])))
})
| ChatPromptTemplate(input_variables=['context', 'input'], optional_variables=['chat_history'], input_types={'chat_history': list[typing.Annotated[typing.Union[typing.Annotated[langchain_core.messages.ai.AIMessage, Tag(tag='ai')], typing.Annotated[langchain_core.messages.human.HumanMessage, Tag(tag='human')], typing.Annotated[langchain_core.messages.chat.ChatMessage, Tag(tag='chat')], typing.Annotated[langchain_core.messages.system.SystemMessage, Tag(tag='system')], typing.Annotated[langchain_core.messages.function.FunctionMessage, Tag(tag='function')], typing.Annotated[langchain_core.messages.tool.ToolMessage, Tag(tag='tool')], typing.Annotated[langchain_core.messages.ai.AIMessageChunk, Tag(tag='AIMessageChunk')], typing.Annotated[langchain_core.messages.human.HumanMessageChunk, Tag(tag='HumanMessageChunk')], typing.Annotated[langchain_core.messages.chat.ChatMessageChunk, Tag(tag=

In [44]:
## Streaming RAG chain -> no StrOutputParser, so .stream() yields message chunks with a .content attribute

streaming_rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | simple_prompt
    | llm
)

In [48]:
# Test function for different chain types
def test_rag_chains(question: str):
    """Test all RAG chain variants"""
    print(f"Question: {question}")
    print("=" * 80)

    #1. Simple RAG
    print("\n1. Simple RAG Chain:")
    answer = simple_rag_chain.invoke(question)
    print(f"Answer: {answer}")

    #2. Streaming RAG
    print("\n2. Streaming RAG:")
    print("Answer: ", end="", flush=True)
    for chunk in streaming_rag_chain.stream(question):
        print(chunk.content, end="", flush=True)
    print()




In [49]:
test_rag_chains("What is the difference between AI and machine Learning")

Question: What is the difference between AI and machine Learning

1. Simple RAG Chain:
Answer: Artificial Intelligence (AI) is the broader concept of machines simulating human intelligence and mimicking human actions, while Machine Learning (ML) is a subset of AI that enables systems to learn from data by finding patterns without being explicitly programmed. In other words, AI encompasses the overall goal of creating intelligent machines, and ML is one approach within AI that achieves this by allowing machines to learn from data.

2. Streaming RAG:
Answer: Artificial Intelligence (AI) is the simulation of human intelligence in machines, designed to think like humans and mimic their actions. Machine Learning (ML), on the other hand, is a subset of AI that enables systems to learn from data by finding patterns, rather than being explicitly programmed. In summary, AI is the broader concept of machines exhibiting human-like intelligence, while ML is a specific approach within AI focused on
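
The streaming loop prints each piece as it arrives, but the full answer is often needed afterwards as well. A minimal sketch of collecting the pieces while printing them; `FakeChunk` and `fake_stream` are hypothetical stand-ins for the message chunks and the `.stream()` call, so the example runs without an API key.

```python
from dataclasses import dataclass


@dataclass
class FakeChunk:
    """Hypothetical stand-in for a streamed message chunk with a .content attribute."""
    content: str


def fake_stream(text: str, size: int = 8):
    """Yield the text in small pieces, imitating a streaming response."""
    for start in range(0, len(text), size):
        yield FakeChunk(text[start:start + size])


pieces = []
for chunk in fake_stream("Machine learning is a subset of AI."):
    print(chunk.content, end="", flush=True)  # show text as it arrives
    pieces.append(chunk.content)              # keep it for later use
print()

full_answer = "".join(pieces)
```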

In [50]:
# Test with multiple questions
test_questions = [
    "What is the difference between AI and Machine Learning?",
    "Explain deep learning in simple terms",
    "How does NLP work?"
]

for question in test_questions:
    print("\n" + "=" * 80 + "\n")
    test_rag_chains(question)



Question: What is the difference between AI and Machine Learning?

1. Simple RAG Chain:
Answer: Artificial Intelligence (AI) is the broader concept of machines designed to simulate human intelligence and mimic human actions. Machine Learning (ML) is a subset of AI focused on enabling systems to learn from data by finding patterns, rather than being explicitly programmed. In other words, AI encompasses the overall goal of creating intelligent machines, while ML is one approach within AI that achieves this by allowing systems to learn from and improve with data.

2. Streaming RAG:
Answer: Artificial Intelligence (AI) is the broader concept of machines designed to simulate human intelligence and mimic human actions. Machine Learning (ML), on the other hand, is a subset of AI that specifically enables systems to learn from data by finding patterns, rather than being explicitly programmed. In other words, AI encompasses the overall goal of creating intelligent machines, while ML focuses o

In [56]:
## Conversational example
print("\n3. Conversational RAG Example:")
chat_history = []

# First question
q1 = "What is machine learning?"
a1 = conversational_rag.invoke({
    "input": q1,
    "chat_history": chat_history
})

print(f"Q1: {q1}")
print(f"A1: {a1}")


3. Conversational RAG Example:
Q1: What is machine learning?
A1: Machine learning is a subset of artificial intelligence (AI) that enables systems to learn from data. Instead of being explicitly programmed, machine learning algorithms find patterns in data. Common types of machine learning include supervised, unsupervised, and reinforcement learning.


In [58]:
# Update history
from langchain_core.messages import HumanMessage, AIMessage

chat_history.extend([
    HumanMessage(content=q1),
    AIMessage(content=a1)
])

In [59]:
# Follow-up question
q2 = "How is it different from traditional programming?"
a2 = conversational_rag.invoke({
    "input": q2,
    "chat_history": chat_history
})
print(f"\nQ2: {q2}")
print(f"A2: {a2}")


Q2: How is it different from traditional programming?
A2: Machine learning is different from traditional programming because, instead of being explicitly programmed with specific instructions, machine learning algorithms learn patterns from data. In traditional programming, developers write explicit rules for the computer to follow, whereas in machine learning, the system improves its performance by analyzing data and identifying patterns on its own.


**Traditional Programming:**

* **Explicit Instructions:**  Programmers write very specific instructions (code) that the computer follows step-by-step.  
* **Rule-Based:**  Programs rely on predefined rules and logic to process information.

**Machine Learning:**

* **Learning from Data:** Instead of explicit rules, ML algorithms learn patterns and relationships from large datasets.
* **Data-Driven:** The "program" is not fixed code but a set of algorithms that adapt and improve based on the data they are trained on.

**In essence:**

Traditional programming is like giving the computer a detailed recipe to follow. Machine learning is more like giving the computer a bunch of ingredients and letting it figure out how to make something delicious based on patterns it observes. 
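
The invoke-then-extend pattern used for Q1 and Q2 can be wrapped in a small helper so the history is never forgotten. This is a sketch under assumptions: `ask` and `fake_chain` are hypothetical names, the fake chain replaces `conversational_rag.invoke`, and plain tuples stand in for the `HumanMessage` and `AIMessage` objects used in the notebook.

```python
def ask(chain_invoke, question: str, chat_history: list) -> str:
    """Call the chain with the current history, then record the new turn."""
    answer = chain_invoke({"input": question, "chat_history": chat_history})
    chat_history.extend([("human", question), ("ai", answer)])
    return answer


def fake_chain(inputs: dict) -> str:
    """Stand-in for conversational_rag.invoke; reports which turn it is answering."""
    turn = len(inputs["chat_history"]) // 2 + 1
    return f"answer #{turn} to: {inputs['input']}"


history = []
first = ask(fake_chain, "What is machine learning?", history)
second = ask(fake_chain, "How is it different?", history)
print(first)
print(second)
```

Note that the retriever in `conversational_rag` only sees the latest question, so a follow-up such as "How is it different?" is retrieved without the earlier turns; the history reaches the model only through the prompt.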


