# Building a RAG Pipeline with LangChain and FAISS

In [8]:
# Import libraries
import os
from dotenv import load_dotenv
import numpy as np
from typing import List, Any, Dict
import warnings
warnings.filterwarnings("ignore")

# Langchain Core imports
from langchain_core.documents import Document
from langchain_core.prompts import PromptTemplate, ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser
from langchain_core.messages import HumanMessage, AIMessage

# Langchain imports
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.document_loaders import TextLoader, PyPDFLoader
from langchain_classic.chains import create_retrieval_chain
from langchain_classic.chains.combine_documents import create_stuff_documents_chain

# Load environment variables
load_dotenv()

True

## 1. Data Parsing and Ingestion

### 1. Sample Data creation

In [5]:
sample_documents = [
    Document(page_content="""
    Artificial Intelligence (AI) is the field of computer science focused on creating machines 
    and systems that can perform tasks typically requiring human intelligence. AI systems can 
    learn from data, recognize patterns, make decisions, solve problems, understand language, 
    perceive their environment, and adapt to new situations. The goal is to build machines that 
    can think, reason, and act intelligently.
    """,
    metadata={"source": "AI Introduction", "page": 1, "topic": "AI"} ),
    Document(page_content="""
    Machine Learning (ML) is a subset of AI that focuses on developing algorithms and statistical
    models that enable computers to learn from and make predictions or decisions based on data. 
    Instead of being explicitly programmed for every task, ML systems identify patterns in data 
    and use these patterns to improve their performance over time. Common ML techniques include 
    supervised learning, unsupervised learning, and reinforcement learning.
    """,
    metadata={"source": "ML Overview", "page": 2, "topic": "ML"} ),
    Document(page_content="""
    Deep Learning (DL) is a specialized subset of ML that uses artificial neural networks with  
    multiple layers (hence "deep") to model and understand complex patterns in large datasets. DL has 
    been particularly successful in areas such as image and speech recognition, natural language 
    processing, and game playing. By leveraging vast amounts of data and computational power, DL 
    models can automatically learn hierarchical representations of data, enabling them to perform 
    complex tasks with high accuracy.
    """,
    metadata={"source": "DL Basics", "page": 3, "topic": "DL"} )
]

print(sample_documents)

[Document(metadata={'source': 'AI Introduction', 'page': 1, 'topic': 'AI'}, page_content='\n    Artificial Intelligence (AI) is the field of computer science focused on creating machines \n    and systems that can perform tasks typically requiring human intelligence. AI systems can \n    learn from data, recognize patterns, make decisions, solve problems, understand language, \n    perceive their environment, and adapt to new situations. The goal is to build machines that \n    can think, reason, and act intelligently.\n    '), Document(metadata={'source': 'ML Overview', 'page': 2, 'topic': 'ML'}, page_content='\n    Machine Learning (ML) is a subset of AI that focuses on developing algorithms and statistical\n    models that enable computers to learn from and make predictions or decisions based on data. \n    Instead of being explicitly programmed for every task, ML systems identify patterns in data \n    and use these patterns to improve their performance over time. Common ML techniq

### 2. Text splitting into Chunks

In [6]:
# Create text splitter instance
text_split = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=150,
    length_function=len,
    separators=[" "],
)
# Split documents into chunks
chunks = text_split.split_documents(sample_documents)
print(f"Number of chunks created: {len(chunks)}")
print(chunks)


Number of chunks created: 4
[Document(metadata={'source': 'AI Introduction', 'page': 1, 'topic': 'AI'}, page_content='Artificial Intelligence (AI) is the field of computer science focused on creating machines \n    and systems that can perform tasks typically requiring human intelligence. AI systems can \n    learn from data, recognize patterns, make decisions, solve problems, understand language, \n    perceive their environment, and adapt to new situations. The goal is to build machines that \n    can think, reason, and act intelligently.'), Document(metadata={'source': 'ML Overview', 'page': 2, 'topic': 'ML'}, page_content='Machine Learning (ML) is a subset of AI that focuses on developing algorithms and statistical\n    models that enable computers to learn from and make predictions or decisions based on data. \n    Instead of being explicitly programmed for every task, ML systems identify patterns in data \n    and use these patterns to improve their performance over time. Common 
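
The effect of `chunk_size` and `chunk_overlap` is easier to see on a short string. The sketch below is a simplified fixed-window splitter written only for illustration; `sliding_window_chunks` is a hypothetical helper, not part of LangChain. `RecursiveCharacterTextSplitter` works differently in that it splits on the given separators and merges the pieces, so its chunk boundaries fall on separators rather than at fixed offsets.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size character windows that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


text = "abcdefghijklmnopqrstuvwxyz"
demo_chunks = sliding_window_chunks(text, chunk_size=10, chunk_overlap=4)
for chunk in demo_chunks:
    print(chunk)
```

Each window repeats the last four characters of the previous one. The same idea explains why the tail of the Deep Learning document shows up as its own overlapping chunk in the output above.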

### 3. Load embedding models and create vectors

In [9]:
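# Initialize the embedding model
# load_dotenv() has already populated os.environ; fail early if the key is missing
if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file.")
embeddings = OpenAIEmbeddings(model="text-embedding-3-small", dimensions=1536)

# Test with a single-sentence chunk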
sample_text = "Artificial Intelligence (AI) is the field of computer science focused on creating machines and systems that can perform tasks typically requiring human intelligence."
sample_embedding = embeddings.embed_query(sample_text)
print(f"Sample text embedding: {sample_embedding}")

Sample text embedding: [-0.00878689531236887, 0.0031398888677358627, -0.01423735823482275, -0.010479079559445381, 0.045904695987701416, -0.038368962705135345, -0.0009317799704149365, 0.03570365160703659, 0.006931724026799202, 0.031753625720739365, 0.0173916295170784, -0.006792705971747637, -0.0301237590610981, -0.04122602194547653, -0.002380083780735731, 0.003269319422543049, -0.01740121655166149, -0.031082503497600555, 0.06519463658332825, -0.03503252938389778, -0.00144291075412184, 0.03123590163886547, -1.8575678041088395e-05, -0.011504936031997204, 0.004724214319139719, -0.014697556383907795, 0.009736052714288235, 0.036528173834085464, -0.03202207386493683, 0.008513652719557285, -0.006840643472969532, -0.012300694361329079, -0.0083266980946064, -0.006960486527532339, 0.024716438725590706, 0.02015281282365322, -0.016423296183347702, -0.010316092520952225, 0.013547062873840332, -0.000422147277276963, -0.016394535079598427, 0.02260719984769821, 0.0284171923995018, 0.01752585358917713, 

In [10]:
texts = ["AI", "Machine Learning", "Deep Learning","Neural Networks"]
batch_embeddings = embeddings.embed_documents(texts)
print(batch_embeddings[0])
print(len(batch_embeddings[0]))

[-0.008208601735532284, -0.024612585082650185, 0.002946041990071535, 0.025167757645249367, 0.006533174309879541, -0.02826085314154625, -0.0050229765474796295, 0.020977536216378212, -0.036879222840070724, 0.012868072837591171, -0.0030385705176740885, -0.020144781097769737, 0.000270769844064489, -0.032781533896923065, 0.006414209026843309, -0.02529994025826454, -0.031010271981358528, -0.05440676957368851, 0.03280796855688095, -0.01842639409005642, 0.01658904179930687, 0.04832632467150688, -0.024876952171325684, 0.014341920614242554, 0.0293976329267025, 0.004044818226248026, 0.009246242232620716, 0.0133439339697361, 0.0025065315421670675, -0.02259017713367939, 0.03214704990386963, -0.028022922575473785, 0.005317085422575474, -0.038227494806051254, -0.0167080070823431, 0.014355138875544071, -0.03859760984778404, -0.01037641242146492, -0.01056146901100874, -0.01924593187868595, 0.03206774219870567, 0.014632724225521088, -0.02155914530158043, 0.016126397997140884, -0.011843650601804256, 0.00

In [11]:
### Compare embeddings using cosine similarity
def compare_embeddings(text1: str, text2: str) -> float:
    """Compare the semantic similarity of two texts via the cosine similarity of their embeddings."""
    emb1 = np.array(embeddings.embed_query(text1))
    emb2 = np.array(embeddings.embed_query(text2))

    # Cosine similarity: dot product divided by the product of the vector norms
    cosine_similarity = np.dot(emb1, emb2) / (np.linalg.norm(emb1) * np.linalg.norm(emb2))
    return float(cosine_similarity)

In [13]:
# Test semantic similarity
print("Semantic similarity examples:")
print(f"AI vs Artificial Intelligence: {compare_embeddings('AI', 'Artificial Intelligence'):.3f}")

Semantic similarity examples:
AI vs Artificial Intelligence: 0.563
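
To put a value such as 0.563 in context, it helps to see the range cosine similarity can take. The following is a small sketch on hand-picked 2-D vectors (toy values, not model output) using only the standard library; `cosine` is a hypothetical helper that mirrors the formula in `compare_embeddings`.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product divided by the product of the vector norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


same_direction = cosine([1.0, 2.0], [2.0, 4.0])   # parallel vectors, different lengths
orthogonal = cosine([1.0, 0.0], [0.0, 1.0])       # vectors at a right angle
opposite = cosine([1.0, 2.0], [-1.0, -2.0])       # vectors pointing opposite ways

print(round(same_direction, 3), round(orthogonal, 3), round(opposite, 3))
```

Vector length does not affect the result, only direction does, so the score ranges from -1 (opposite) through 0 (unrelated) to 1 (same direction).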


### 4. Create FAISS Vectorstore

In [15]:
vectorstore=FAISS.from_documents(documents=chunks, embedding=embeddings)
print(f"Number of chunks indexed: {len(chunks)}")
print(f"Vector store created with {vectorstore.index.ntotal} vectors.")

Number of chunks indexed: 4
Vector store created with 4 vectors.


In [16]:
vectorstore

<langchain_community.vectorstores.faiss.FAISS at 0x175cdcad0>

In [17]:
## Save vectorstore to disk
vectorstore.save_local("faiss_index")
print("Vector store is stored locally as 'faiss_index' directory.")

Vector store is stored locally as 'faiss_index' directory.


In [18]:
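## Load vectorstore from disk
# The saved index includes a pickled docstore, so loading requires an explicit opt-in.
# Only load index directories that you created yourself or otherwise trust.
load_vectorstore = FAISS.load_local(
    "faiss_index",
    embeddings,
    allow_dangerous_deserialization=True,
)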
print(f"Vector store loaded from 'faiss_index' directory contains {load_vectorstore.index.ntotal} vectors.")

Vector store loaded from 'faiss_index' directory contains 4 vectors.


In [20]:
# Similarity search
query = "What is deep learning?"

results = load_vectorstore.similarity_search(query=query, k=3)
print("Top 3 similar chunks:")
for i, doc in enumerate(results, 1):
    print(f"Chunk {i}:\n{doc.page_content}\n")
    print(f"Chunk {i} Metadata:\n{doc.metadata}\n")

Top 3 similar chunks:
Chunk 1:
Deep Learning (DL) is a specialized subset of ML that uses artificial neural networks with  
    multiple layers (hence "deep") to model and understand complex patterns in large datasets. DL has 
    been particularly successful in areas such as image and speech recognition, natural language 
    processing, and game playing. By leveraging vast amounts of data and computational power, DL 
    models can automatically learn hierarchical representations of data, enabling them to perform

Chunk 1 Metadata:
{'source': 'DL Basics', 'page': 3, 'topic': 'DL'}

Chunk 2:
amounts of data and computational power, DL 
    models can automatically learn hierarchical representations of data, enabling them to perform 
    complex tasks with high accuracy.

Chunk 2 Metadata:
{'source': 'DL Basics', 'page': 3, 'topic': 'DL'}

Chunk 3:
Machine Learning (ML) is a subset of AI that focuses on developing algorithms and statistical
    models that enable computers to learn fro

In [21]:
### Similarity search with scores
# With the default FAISS index the score is an L2 distance, so lower means more similar
results_with_scores = vectorstore.similarity_search_with_score(query=query, k=3)
for i, (doc, score) in enumerate(results_with_scores, 1):
    print(f"Chunk {i} (Score: {score:.4f}):\n{doc.page_content}\n")
    print(f"Chunk {i} Metadata:\n{doc.metadata}\n")


Chunk 1 (Score: 0.5757):
Deep Learning (DL) is a specialized subset of ML that uses artificial neural networks with  
    multiple layers (hence "deep") to model and understand complex patterns in large datasets. DL has 
    been particularly successful in areas such as image and speech recognition, natural language 
    processing, and game playing. By leveraging vast amounts of data and computational power, DL 
    models can automatically learn hierarchical representations of data, enabling them to perform

Chunk 1 Metadata:
{'source': 'DL Basics', 'page': 3, 'topic': 'DL'}

Chunk 2 (Score: 1.0125):
amounts of data and computational power, DL 
    models can automatically learn hierarchical representations of data, enabling them to perform 
    complex tasks with high accuracy.

Chunk 2 Metadata:
{'source': 'DL Basics', 'page': 3, 'topic': 'DL'}

Chunk 3 (Score: 1.1043):
Machine Learning (ML) is a subset of AI that focuses on developing algorithms and statistical
    models that ena
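
The scores above grow as the chunks become less relevant, which is what a distance (rather than a similarity) looks like. The sketch below ranks toy 2-D vectors by squared L2 distance and checks the identity `distance = 2 - 2 * cosine` that holds for unit-length vectors. It assumes two things not confirmed in this notebook: that the embeddings are normalized to unit length and that the index reports squared L2 distance. All vectors and helper names are hypothetical.

```python
import math


def normalize(v: list[float]) -> list[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def squared_l2(a: list[float], b: list[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def dot(a: list[float], b: list[float]) -> float:
    """Dot product; equals cosine similarity when both vectors are unit length."""
    return sum(x * y for x, y in zip(a, b))


# Toy "embeddings" (hypothetical values, not real model output)
query_vec = normalize([1.0, 0.2])
chunk_vecs = {
    "deep learning chunk": normalize([0.9, 0.3]),
    "machine learning chunk": normalize([0.5, 0.8]),
    "unrelated chunk": normalize([-0.2, 1.0]),
}

# Rank by distance: the smallest distance is the best match
ranked = sorted(chunk_vecs.items(), key=lambda item: squared_l2(query_vec, item[1]))
for name, vec in ranked:
    dist = squared_l2(query_vec, vec)
    cos = dot(query_vec, vec)
    print(f"{name}: distance={dist:.4f}, cosine={cos:.4f}, 2-2cos={2 - 2 * cos:.4f}")
```

If both assumptions hold for this index, a score of 0.5757 would correspond to a cosine similarity of roughly 0.71.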

In [22]:
### Search with metadata filter
# Only chunks whose metadata matches every key/value pair here are returned
filter_dict = {"topic": "ML"}
filtered_results = load_vectorstore.similarity_search(
    query=query,
    k=3,
    filter=filter_dict
)
print("Top 3 similar chunks with metadata filter:")
for i, doc in enumerate(filtered_results, 1):
    print(f"Chunk {i}:\n{doc.page_content}\n")
    print(f"Chunk {i} Metadata:\n{doc.metadata}\n") 

Top 3 similar chunks with metadata filter:
Chunk 1:
Machine Learning (ML) is a subset of AI that focuses on developing algorithms and statistical
    models that enable computers to learn from and make predictions or decisions based on data. 
    Instead of being explicitly programmed for every task, ML systems identify patterns in data 
    and use these patterns to improve their performance over time. Common ML techniques include 
    supervised learning, unsupervised learning, and reinforcement learning.

Chunk 1 Metadata:
{'source': 'ML Overview', 'page': 2, 'topic': 'ML'}
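
Only one chunk comes back even though `k=3`, because only one of the four indexed chunks carries `topic: ML`. A minimal sketch of that behaviour, assuming the filter is applied to the ranked candidates before the top `k` are taken; `matches` and `filtered_top_k` are hypothetical helpers, and the distances are illustrative values, not output from the store.

```python
def matches(metadata: dict, filter_dict: dict) -> bool:
    """True when every key/value pair in filter_dict is present in metadata."""
    return all(metadata.get(key) == value for key, value in filter_dict.items())


def filtered_top_k(scored_chunks: list[tuple[float, dict]], filter_dict: dict, k: int):
    """Sort (distance, metadata) pairs by distance, drop non-matching ones, keep k."""
    candidates = sorted(scored_chunks, key=lambda pair: pair[0])
    kept = [pair for pair in candidates if matches(pair[1], filter_dict)]
    return kept[:k]


# Illustrative (distance, metadata) pairs for four chunks
scored = [
    (0.58, {"source": "DL Basics", "topic": "DL"}),
    (1.01, {"source": "DL Basics", "topic": "DL"}),
    (1.10, {"source": "ML Overview", "topic": "ML"}),
    (1.35, {"source": "AI Introduction", "topic": "AI"}),
]

hits = filtered_top_k(scored, {"topic": "ML"}, k=3)
print(len(hits), hits)
```

The filter narrows the candidate set first, so `k` is an upper bound on the number of results rather than a guarantee.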



### 5. Build RAG Chain with LCEL

In [41]:
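## Groq LLM
from langchain.chat_models import init_chat_model
# load_dotenv() has already populated os.environ; fail early if the key is missing
if not os.getenv("GROQ_API_KEY"):
    raise RuntimeError("GROQ_API_KEY is not set; add it to your .env file.")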

llm = init_chat_model(model="groq:llama-3.1-8b-instant")
llm

ChatGroq(profile={'max_input_tokens': 131072, 'max_output_tokens': 8192, 'image_inputs': False, 'audio_inputs': False, 'video_inputs': False, 'image_outputs': False, 'audio_outputs': False, 'video_outputs': False, 'reasoning_output': False, 'tool_calling': True}, client=<groq.resources.chat.completions.Completions object at 0x14e0be510>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x14e0c4610>, model_name='llama-3.1-8b-instant', model_kwargs={}, groq_api_key=SecretStr('**********'))

In [42]:
# 1. Simple RAG Chain with LCEL
simple_prompt = ChatPromptTemplate.from_template("""Answer the question based on the context below.
Context: {context}
Question: {question}
Answer:""")

In [43]:
# Basic retriever
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 3})

In [44]:
retriever

VectorStoreRetriever(tags=['FAISS', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x175cdcad0>, search_kwargs={'k': 3})

In [45]:
def format_docs(docs: List[Document]) -> str:
    """Format retrieved documents into a single context string."""
    formatted = []
    for i, doc in enumerate(docs, 1):
        source = doc.metadata.get("source", "Unknown")
        formatted.append(f"Document {i} (Source: {source}):\n{doc.page_content}\n")
    return "\n\n".join(formatted)

In [46]:
simple_rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | simple_prompt
    | llm
    | StrOutputParser()
)
simple_rag_chain

{
  context: VectorStoreRetriever(tags=['FAISS', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x175cdcad0>, search_kwargs={'k': 3})
           | RunnableLambda(format_docs),
  question: RunnablePassthrough()
}
| ChatPromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, template='Answer the question based on the context below.\nContext: {context}\nQuestion: {question}\nAnswer:'), additional_kwargs={})])
| ChatGroq(profile={'max_input_tokens': 131072, 'max_output_tokens': 8192, 'image_inputs': False, 'audio_inputs': False, 'video_inputs': False, 'image_outputs': False, 'audio_outputs': False, 'video_outputs': False, 'reasoning_output': False, 'tool_calling': True}, client=<groq.resources.chat.completions.Completions object at 0x14e0be510>, async_client=<groq.resources.c
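
The data flow through the chain can be traced with plain functions. This is a sketch with stand-ins only: `fake_retriever`, `fake_prompt` and `fake_llm` are hypothetical stubs that return canned values, so it shows the order of the steps and not LangChain's actual implementation.

```python
def fake_retriever(question: str) -> list[str]:
    """Stand-in for the vector store retriever: returns canned chunks."""
    return ["Deep Learning uses neural networks with multiple layers."]


def join_docs(docs: list[str]) -> str:
    """Stand-in for format_docs: merge chunks into one context string."""
    return "\n\n".join(docs)


def build_inputs(question: str) -> dict:
    """Mirror the dict step: both branches receive the same question."""
    return {
        "context": join_docs(fake_retriever(question)),  # retriever | format_docs
        "question": question,                            # RunnablePassthrough()
    }


def fake_prompt(inputs: dict) -> str:
    """Stand-in for the prompt template: fill the two placeholders."""
    return (
        "Answer the question based on the context below.\n"
        f"Context: {inputs['context']}\n"
        f"Question: {inputs['question']}\n"
        "Answer:"
    )


def fake_llm(prompt_text: str) -> str:
    """Stand-in for the chat model: reports what it received instead of answering."""
    return f"[model reply to a {len(prompt_text)}-character prompt]"


def run_chain(question: str) -> str:
    """Apply the steps left to right, as the | operator does."""
    return fake_llm(fake_prompt(build_inputs(question)))


prompt_text = fake_prompt(build_inputs("What is deep learning?"))
print(prompt_text)
print(run_chain("What is deep learning?"))
```

The point to notice is that the question string is used twice: once to fetch context and once, unchanged, as the `question` slot in the prompt.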

In [47]:
### Conversational RAG Chain with LCEL
# Chat history is injected into the prompt; retrieval below still uses only the latest input.
from langchain_classic.chains import create_history_aware_retriever
from langchain_core.prompts import MessagesPlaceholder
from langchain_core.messages import HumanMessage, AIMessage

conversational_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful AI assistant. Use the provided context to answer the user's questions."),
    ("placeholder","{chat_history}"),
    ("human", "Context: {context}\n\nQuestion: {input}"),
])


In [48]:
def create_conversational_rag():
    """ Create a conversational RAG chain using LCEL. """
    return (
        RunnablePassthrough.assign(
            context=lambda x: format_docs(retriever.invoke(x["input"]))
        )
        | conversational_prompt
        | llm
        | StrOutputParser()
    )

conversational_rag = create_conversational_rag()

In [49]:
conversational_rag

RunnableAssign(mapper={
  context: RunnableLambda(lambda x: format_docs(retriever.invoke(x['input'])))
})
| ChatPromptTemplate(input_variables=['context', 'input'], optional_variables=['chat_history'], input_types={'chat_history': list[typing.Annotated[typing.Union[typing.Annotated[langchain_core.messages.ai.AIMessage, Tag(tag='ai')], typing.Annotated[langchain_core.messages.human.HumanMessage, Tag(tag='human')], typing.Annotated[langchain_core.messages.chat.ChatMessage, Tag(tag='chat')], typing.Annotated[langchain_core.messages.system.SystemMessage, Tag(tag='system')], typing.Annotated[langchain_core.messages.function.FunctionMessage, Tag(tag='function')], typing.Annotated[langchain_core.messages.tool.ToolMessage, Tag(tag='tool')], typing.Annotated[langchain_core.messages.ai.AIMessageChunk, Tag(tag='AIMessageChunk')], typing.Annotated[langchain_core.messages.human.HumanMessageChunk, Tag(tag='HumanMessageChunk')], typing.Annotated[langchain_core.messages.chat.ChatMessageChunk, Tag(tag=

In [50]:
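### Streaming RAG chain with LCEL
# No StrOutputParser here: .stream() yields message chunks, and the text is read from chunk.content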
streaming_rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | simple_prompt
    | llm
)
streaming_rag_chain

{
  context: VectorStoreRetriever(tags=['FAISS', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x175cdcad0>, search_kwargs={'k': 3})
           | RunnableLambda(format_docs),
  question: RunnablePassthrough()
}
| ChatPromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, template='Answer the question based on the context below.\nContext: {context}\nQuestion: {question}\nAnswer:'), additional_kwargs={})])
| ChatGroq(profile={'max_input_tokens': 131072, 'max_output_tokens': 8192, 'image_inputs': False, 'audio_inputs': False, 'video_inputs': False, 'image_outputs': False, 'audio_outputs': False, 'video_outputs': False, 'reasoning_output': False, 'tool_calling': True}, client=<groq.resources.chat.completions.Completions object at 0x14e0be510>, async_client=<groq.resources.c

In [51]:
print("Modern RAG chains created successfully.")
print("Available chains:")
print("- Simple RAG Chain: Basic Q&A")
print("- Conversational RAG Chain: Maintains conversation history")
print("- Streaming RAG Chain: Supports token streaming")

Modern RAG chains created successfully.
Available chains:
- Simple RAG Chain: Basic Q&A
- Conversational RAG Chain: Maintains conversation history
- Streaming RAG Chain: Supports token streaming


In [56]:
# Test function for the simple and streaming chains
def test_rag_chain(question: str):
    """Run one question through the simple and streaming RAG chains."""
    print(f"Question: {question}\n")
    print("=" * 80)
    print("\n1. Simple RAG Chain:")
    answer = simple_rag_chain.invoke(question)
    print(f"Answer: {answer}\n")
    print("\n\n2. Streaming RAG Chain:")
    print("Answer: ", end="", flush=True)
    for chunk in streaming_rag_chain.stream(question):
        print(chunk.content, end="", flush=True)
    print("\n")

test_rag_chain("What is deep learning?")

Question: What is deep learning?


1. Simple RAG Chain:
Answer: Deep learning is a specialized subset of Machine Learning (ML) that uses artificial neural networks with multiple layers to model and understand complex patterns in large datasets.



2. Streaming RAG Chain:
Answer: Deep learning is a specialized subset of Machine Learning (ML) that uses artificial neural networks with multiple layers to model and understand complex patterns in large datasets.
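
Both chains end up with the same text; the difference is when it becomes visible. A small sketch of the streaming pattern with a generator, where `fake_stream` is a hypothetical stand-in for `chain.stream()` that yields one word at a time instead of model tokens:

```python
from typing import Iterator


def fake_stream(answer: str) -> Iterator[str]:
    """Yield an answer piece by piece, the way a streaming chain yields chunks."""
    words = answer.split(" ")
    for i, word in enumerate(words):
        # Re-attach the space that split() removed, except after the last word
        yield word if i == len(words) - 1 else word + " "


full_answer = "Deep learning uses neural networks with multiple layers."

pieces = []
for piece in fake_stream(full_answer):
    print(piece, end="", flush=True)  # show each piece as soon as it arrives
    pieces.append(piece)
print()

streamed_answer = "".join(pieces)
```

Collecting the pieces, as done here, is useful when the full answer is needed afterwards, for example to append it to a chat history.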



In [57]:
test_questions = [
    "What is difference between AI and Machine Learning?",
    "Explain Deep learning in simple terms.",
    "How does NLP work?"
]

for question in test_questions:
    print("\n" + "="*100 + "\n")
    test_rag_chain(question)



Question: What is difference between AI and Machine Learning?


1. Simple RAG Chain:
Answer: Based on the provided context, the difference between AI and Machine Learning (ML) lies in their scope and focus.

Artificial Intelligence (AI) is a broader field of computer science that focuses on creating machines and systems that can perform tasks typically requiring human intelligence. This includes tasks such as learning, recognizing patterns, making decisions, solving problems, understanding language, perceiving the environment, and adapting to new situations.

Machine Learning (ML), on the other hand, is a subset of AI that specifically focuses on developing algorithms and statistical models that enable computers to learn from and make predictions or decisions based on data. While AI encompasses a wide range of tasks, ML is primarily concerned with enabling machines to learn from data and improve their performance over time.

In other words, all Machine Learning is a part of Artificia

In [59]:
## Conversational example
print("Conversational RAG Chain Example:\n")
chat_history = []

q1 = "What is machine learning?"
a1 = conversational_rag.invoke({"input": q1, "chat_history": chat_history})
print(f"User: {q1}\nAI: {a1}\n")
chat_history.extend([HumanMessage(content=q1),AIMessage(content=a1)])
q2 = "How is it different from AI?"
a2 = conversational_rag.invoke({"input": q2, "chat_history": chat_history})
print(f"User: {q2}\nAI: {a2}\n")
chat_history.extend([HumanMessage(content=q2),AIMessage(content=a2)])


Conversational RAG Chain Example:

User: What is machine learning?
AI: According to Document 1: Machine Learning (ML) is a subset of Artificial Intelligence (AI) that focuses on developing algorithms and statistical models that enable computers to learn from and make predictions or decisions based on data.

User: How is it different from AI?
AI: Based on the provided documents, Machine Learning (ML) is a subset or a part of Artificial Intelligence (AI). This means that AI encompasses a broader range of technologies and techniques, including but not limited to ML.

The key differences between AI and ML are:

1. **Scope**: AI is a more general field that focuses on creating machines and systems that can perform tasks typically requiring human intelligence. ML, on the other hand, is a specific subset of AI that focuses on developing algorithms and statistical models that enable computers to learn from and make predictions or decisions based on data.

2. **Approach**: AI often involves exp
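
One detail of the conversational chain is worth noting: the retriever is called with the latest input only, so a follow-up such as "How is it different from AI?" is searched without knowing what "it" refers to. The sketch below shows a deliberately naive workaround, prepending the previous user turn to the retrieval query. This is an assumption-laden illustration, not what the notebook does; history-aware retrievers normally ask an LLM to rewrite the question instead. `build_retrieval_query` and the `(role, text)` history format are hypothetical.

```python
def build_retrieval_query(user_input: str, history: list[tuple[str, str]]) -> str:
    """Prepend the most recent user turn so a follow-up carries its topic."""
    previous_user_turns = [text for role, text in history if role == "human"]
    if not previous_user_turns:
        return user_input  # first turn: nothing to add
    return f"{previous_user_turns[-1]} {user_input}"


history: list[tuple[str, str]] = []

q1 = "What is machine learning?"
query1 = build_retrieval_query(q1, history)
history.extend([("human", q1), ("ai", "(model answer)")])

q2 = "How is it different from AI?"
query2 = build_retrieval_query(q2, history)

print(query1)
print(query2)
```

With the topic of the earlier question included, the second query contains the words "machine learning" and so has a better chance of retrieving the relevant chunk.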