### Building a RAG System with LangChain and FAISS

### FAISS 
https://github.com/facebookresearch/faiss

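FAISS (Facebook AI Similarity Search) is a library for efficient similarity search and clustering of dense vectors.

Key advantages:
1. Fast similarity search, with both exact and approximate index types
2. Memory efficient, with index types that store compressed vectors
3. Supports GPU acceleration
4. Scales to millions of vectors and beyond

How it works:
- Builds an index over the vectors for fast nearest-neighbor search
- Returns the stored vectors closest to a query vector under a distance metric such as L2 (Euclidean) distance or inner product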
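
To make the two steps above concrete, here is a minimal pure-Python sketch of exact nearest-neighbor search by brute force. It is conceptually what an exact (flat) index computes; FAISS itself does this in optimized native code and also offers approximate indexes. The function names and the toy 2-D vectors are illustrative assumptions, not part of the FAISS API.

```python
def l2_squared(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_neighbors(index_vectors, query, k=2):
    """Return (position, distance) pairs for the k stored vectors closest to the query."""
    scored = [(i, l2_squared(vec, query)) for i, vec in enumerate(index_vectors)]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:k]


# Toy 2-D "embeddings"; real embeddings have hundreds or thousands of dimensions.
stored = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.9, 1.2]]
hits = nearest_neighbors(stored, query=[1.0, 1.0], k=2)
print(hits)  # positions 1 and 3 are the two closest vectors
```

Brute force compares the query against every stored vector, which is fine for four vectors but is the cost that approximate indexes are designed to avoid at scale.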


In [40]:
### Load libraries
import os
from dotenv import load_dotenv
import numpy as np
import warnings
warnings.filterwarnings('ignore')

# LangChain core imports
from langchain_core.documents import Document
from langchain_core.prompts import ChatPromptTemplate, PromptTemplate
from langchain_core.runnables import (
    RunnablePassthrough,
)
from langchain_core.output_parsers import StrOutputParser
from langchain_core.messages import HumanMessage, AIMessage

# LangChain integration imports
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import Chroma, FAISS
from langchain_community.document_loaders import TextLoader, PyPDFLoader
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain

# Load environment variables from the .env file
load_dotenv()

True

### Data Ingestion and Processing

In [2]:
sample_documents = [
    Document(
        page_content="""
        Artificial Intelligence (AI) is the simulation of human intelligence in machines.
        These systems are designed to think like humans and mimic their actions.
        AI can be categorized into narrow AI and general AI.
        """,
        metadata={"source": "AI Introduction", "page": 1, "topic": "AI"}
    ),
    Document(
        page_content="""
        Machine Learning is a subset of AI that enables systems to learn from data.
        Instead of being explicitly programmed, ML algorithms find patterns in data.
        Common types include supervised, unsupervised, and reinforcement learning.
        """,
        metadata={"source": "ML Basics", "page": 1, "topic": "ML"}
    ),
    Document(
        page_content="""
        Deep Learning is a subset of machine learning based on artificial neural networks.
        It uses multiple layers to progressively extract higher-level features from raw input.
        Deep learning has revolutionized computer vision, NLP, and speech recognition.
        """,
        metadata={"source": "Deep Learning", "page": 1, "topic": "DL"}
    ),
    Document(
        page_content="""
        Natural Language Processing (NLP) is a branch of AI that helps computers understand human language.
        It combines computational linguistics with machine learning and deep learning models.
        Applications include chatbots, translation, sentiment analysis, and text summarization.
        """,
        metadata={"source": "NLP Overview", "page": 1, "topic": "NLP"}
    )
]

print(sample_documents)

[Document(metadata={'source': 'AI Introduction', 'page': 1, 'topic': 'AI'}, page_content='\n        Artificial Intelligence (AI) is the simulation of human intelligence in machines.\n        These systems are designed to think like humans and mimic their actions.\n        AI can be categorized into narrow AI and general AI.\n        '), Document(metadata={'source': 'ML Basics', 'page': 1, 'topic': 'ML'}, page_content='\n        Machine Learning is a subset of AI that enables systems to learn from data.\n        Instead of being explicitly programmed, ML algorithms find patterns in data.\n        Common types include supervised, unsupervised, and reinforcement learning.\n        '), Document(metadata={'source': 'Deep Learning', 'page': 1, 'topic': 'DL'}, page_content='\n        Deep Learning is a subset of machine learning based on artificial neural networks.\n        It uses multiple layers to progressively extract higher-level features from raw input.\n        Deep learning has revolu

In [5]:
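### Text splitting
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size = 500,       # maximum chunk length, measured with length_function
    chunk_overlap = 50,     # length shared between consecutive chunks
    length_function = len,  # count characters
    separators=[" "]        # split on spaces only
)

### Split the documents into chunks
chunks = text_splitter.split_documents(sample_documents)
print(chunks[0])
print(chunks[1])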

page_content='Artificial Intelligence (AI) is the simulation of human intelligence in machines.
        These systems are designed to think like humans and mimic their actions.
        AI can be categorized into narrow AI and general AI.' metadata={'source': 'AI Introduction', 'page': 1, 'topic': 'AI'}
page_content='Machine Learning is a subset of AI that enables systems to learn from data.
        Instead of being explicitly programmed, ML algorithms find patterns in data.
        Common types include supervised, unsupervised, and reinforcement learning.' metadata={'source': 'ML Basics', 'page': 1, 'topic': 'ML'}
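
Each sample document is shorter than 500 characters, so every document comes back as a single chunk. To see what `chunk_size` and `chunk_overlap` do on longer text, here is a simplified sliding-window splitter. It is only a sketch of the overlap idea: `RecursiveCharacterTextSplitter` also respects separators, which this hypothetical `split_with_overlap` helper ignores.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Cut text into fixed-size windows where neighbours share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window moves each time
    pieces = []
    for start in range(0, len(text), step):
        pieces.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return pieces


demo_chunks = split_with_overlap("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=3)
print(demo_chunks)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

The last three characters of each chunk reappear at the start of the next one, which helps keep a sentence that straddles a boundary retrievable from either side.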


In [6]:
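### Embedding models
import os
load_dotenv()

# Fail early with a clear message if the key is missing;
# assigning None to os.environ would raise a TypeError.
openai_api_key = os.getenv("OPENAI_API_KEY")
if openai_api_key is None:
    raise RuntimeError("OPENAI_API_KEY is not set")
os.environ["OPENAI_API_KEY"] = openai_api_key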

In [7]:
embeddings = OpenAIEmbeddings(
    model = "text-embedding-3-small",
    dimensions = 1536
)

### Example: create an embedding for a single text
sample_text = "What is machine learning"
sample_embedding = embeddings.embed_query(sample_text)
print(sample_embedding)

[-0.0059221177361905575, -0.005889697000384331, 0.0005751235294155777, -0.033544257283210754, 0.021192103624343872, 0.02213229238986969, -0.0007868689717724919, 0.00929923728108406, -0.022564563900232315, 0.037823744118213654, 0.015183531679213047, -0.03676467761397362, -0.03397652879357338, 0.0016048074467107654, 0.026735983788967133, 0.016015654429793358, 0.0008530605118721724, -0.030864175409078598, 0.023385880514979362, 0.004903578199446201, 3.242035018047318e-05, 0.032550033181905746, 0.042276136577129364, -0.029307996854186058, 0.015961619094014168, -0.02092193439602852, 0.0266279149800539, 0.01239538099616766, 0.00744047062471509, -0.006019378546625376, -0.009423515759408474, -0.029027020558714867, -0.03127483278512955, 0.02755729854106903, 0.04173579812049866, 0.00043159592314623296, -0.0080564571544528, -0.010633875615894794, -0.055503640323877335, 0.0025855230633169413, -0.05632495880126953, -0.016264209523797035, 0.03028060868382454, 0.07612298429012299, -0.02600112184882164

In [8]:
texts = ["AI", "Machine Learning", "Deep Learning", "Neural Network"]
batch_embeddings=embeddings.embed_documents(texts)
print(batch_embeddings[0])

[-0.00816146656870842, -0.024636395275592804, 0.00291268527507782, 0.02513863891363144, 0.006509347818791866, -0.028257839381694794, -0.005015832372009754, 0.02094886638224125, -0.03687528893351555, 0.012899743393063545, -0.0030365942511707544, -0.02011619694530964, 0.000270121410721913, -0.032751601189374924, 0.0064399586990475655, -0.025297241285443306, -0.03103339858353138, -0.0544009655714035, 0.03283090516924858, -0.018411211669445038, 0.016640139743685722, 0.04832116886973381, -0.02488751709461212, 0.014406475238502026, 0.029394496232271194, 0.004087341949343681, 0.009271690621972084, 0.01337555330246687, 0.002489742822945118, -0.022534899413585663, 0.032143622636795044, -0.028046367689967155, 0.005359473172575235, -0.03819698467850685, -0.016706224530935287, 0.014393257908523083, -0.03861992806196213, -0.010355480015277863, -0.010580168105661869, -0.01920422725379467, 0.0320378877222538, 0.014697248116135597, -0.02155684493482113, 0.016124678775668144, -0.011829170398414135, 0.0

In [9]:
### Compare embeddings using cosine similarity

def compare_embeddings(text1: str, text2: str) -> float:
    """Return the cosine similarity between the embeddings of two texts."""

    emb1 = np.array(embeddings.embed_query(text1))
    emb2 = np.array(embeddings.embed_query(text2))

    # Cosine similarity: dot product divided by the product of the vector norms

    similarity = np.dot(emb1, emb2)/(np.linalg.norm(emb1)*np.linalg.norm(emb2))
    return similarity

In [10]:
### test semantic similarity 
print("\n Semantic Similarity Examples:")
print(f"'AI' vs 'Artificial Intelligence' : {compare_embeddings('AI', 'Artificial Intelligence'):.3f}")


 Semantic Similarity Examples:
'AI' vs 'Artificial Intelligence' : 0.563


In [12]:
### test semantic similarity 
print("\n Semantic Similarity Examples:")
print(f"'ML' vs 'Machine Learning' : {compare_embeddings('ML','Machine Learning'):.3f}")


 Semantic Similarity Examples:
'ML' vs 'Machine Learning' : 0.461
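
To build intuition for the scores above, the following sketch computes cosine similarity on tiny hand-made vectors using only the standard library. The vectors are made up for illustration; the formula is the same one `compare_embeddings` applies to real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])  # parallel vectors
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])      # unrelated directions
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])       # opposite directions

print(round(same_direction, 6), round(orthogonal, 6), round(opposite, 6))
```

The score depends only on direction, not length: scaling a vector leaves it unchanged. Values near 1 mean the vectors point the same way, values near 0 mean they are unrelated, and negative values mean they point in opposing directions.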


### Create FAISS Vector Store

In [13]:
vectorestore = FAISS.from_documents(
    documents=chunks, 
    embedding=embeddings
)

print(f"Vector store created with {vectorestore.index.ntotal} vectors")

Vector store created with 4 vectors


In [14]:
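### Save the vector store for later use

vectorestore.save_local("faiss_index")

In [15]:
### Load the vector store
# allow_dangerous_deserialization=True is needed because the saved store includes a pickle file;
# only enable it for index files you created yourself or otherwise trust.
load_vectorstore = FAISS.load_local(
    "faiss_index",
    embeddings,
    allow_dangerous_deserialization=True
)

print(f"Loaded vector store contains {load_vectorstore.index.ntotal} vectors")

Loaded vector store contains 4 vectors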


In [16]:
#### Similarity Search
query = "What is deep learning"

result = vectorestore.similarity_search(query, k=3)
print(result)

[Document(id='7b0fd4f1-5c8b-4231-8ec4-36f5a02bf39b', metadata={'source': 'Deep Learning', 'page': 1, 'topic': 'DL'}, page_content='Deep Learning is a subset of machine learning based on artificial neural networks.\n        It uses multiple layers to progressively extract higher-level features from raw input.\n        Deep learning has revolutionized computer vision, NLP, and speech recognition.'), Document(id='dfd2b136-572e-4e1c-baa0-80b0556d969b', metadata={'source': 'ML Basics', 'page': 1, 'topic': 'ML'}, page_content='Machine Learning is a subset of AI that enables systems to learn from data.\n        Instead of being explicitly programmed, ML algorithms find patterns in data.\n        Common types include supervised, unsupervised, and reinforcement learning.'), Document(id='6704697d-5428-4160-8f7c-e759da4b0c27', metadata={'source': 'NLP Overview', 'page': 1, 'topic': 'NLP'}, page_content='Natural Language Processing (NLP) is a branch of AI that helps computers understand human lang

In [18]:
print(f"Query: {query}\n")
print("Top 3 similar chunks:")
for i, doc in enumerate(result):
    print(f"\n{i+1}. Source: {doc.metadata['source']}")
    print(f"   Content: {doc.page_content[:200]}...")

Query: What is deep learning

Top 3 similar chunks:

1. Source: Deep Learning
   Content: Deep Learning is a subset of machine learning based on artificial neural networks.
        It uses multiple layers to progressively extract higher-level features from raw input.
        Deep learning ...

2. Source: ML Basics
   Content: Machine Learning is a subset of AI that enables systems to learn from data.
        Instead of being explicitly programmed, ML algorithms find patterns in data.
        Common types include supervised...

3. Source: NLP Overview
   Content: Natural Language Processing (NLP) is a branch of AI that helps computers understand human language.
        It combines computational linguistics with machine learning and deep learning models.
      ...


In [20]:
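### Similarity search with score
# The score is a distance (L2 is the default metric for this FAISS store), so lower means more similar.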

result_with_scores = vectorestore.similarity_search_with_score(
    query, k=3
)

print("\n\nSimilarity search with scores:")
for doc, score in result_with_scores:
    print(f"\nScore: {score:.3f}")
    print(f"Source: {doc.metadata['source']}")
    print(f"Content preview: {doc.page_content[:100]}...")



Similarity search with scores:

Score: 0.556
Source: Deep Learning
Content preview: Deep Learning is a subset of machine learning based on artificial neural networks.
        It uses m...

Score: 1.207
Source: ML Basics
Content preview: Machine Learning is a subset of AI that enables systems to learn from data.
        Instead of being...

Score: 1.274
Source: NLP Overview
Content preview: Natural Language Processing (NLP) is a branch of AI that helps computers understand human language.
...
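
The scores above grow as the match gets worse, which is the opposite of cosine similarity. The sketch below shows how the two relate under two assumptions that are not verified in this notebook: the vectors have unit length, and the score is the squared L2 distance. In that case, distance = 2 - 2 * cosine. The helper names are illustrative.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]


def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine(a, b):
    # For unit vectors the dot product is already the cosine similarity.
    return sum(x * y for x, y in zip(a, b))


def distance_to_cosine(distance):
    """Invert distance = 2 - 2 * cosine (valid only for unit-length vectors)."""
    return 1 - distance / 2


a = normalize([3.0, 4.0])
b = normalize([4.0, 3.0])
dist = squared_l2(a, b)
print(round(dist, 6), round(2 - 2 * cosine(a, b), 6))  # the two values agree
print(round(distance_to_cosine(dist), 6))              # recovers the cosine similarity
```

Under the same assumptions, the top score of 0.556 would correspond to a cosine similarity of roughly 0.72, and 1.207 to roughly 0.40.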


In [21]:
chunks

[Document(metadata={'source': 'AI Introduction', 'page': 1, 'topic': 'AI'}, page_content='Artificial Intelligence (AI) is the simulation of human intelligence in machines.\n        These systems are designed to think like humans and mimic their actions.\n        AI can be categorized into narrow AI and general AI.'),
 Document(metadata={'source': 'ML Basics', 'page': 1, 'topic': 'ML'}, page_content='Machine Learning is a subset of AI that enables systems to learn from data.\n        Instead of being explicitly programmed, ML algorithms find patterns in data.\n        Common types include supervised, unsupervised, and reinforcement learning.'),
 Document(metadata={'source': 'Deep Learning', 'page': 1, 'topic': 'DL'}, page_content='Deep Learning is a subset of machine learning based on artificial neural networks.\n        It uses multiple layers to progressively extract higher-level features from raw input.\n        Deep learning has revolutionized computer vision, NLP, and speech recogn

In [24]:
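### Search with metadata filtering
# Only chunks whose metadata has topic == "ML" are returned, even though the query asks about deep learning.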

filter_dict = {"topic":"ML"}

filtered_results = vectorestore.similarity_search(
    query, 
    k=3, 
    filter=filter_dict
)
print(filtered_results[0].page_content)


Machine Learning is a subset of AI that enables systems to learn from data.
        Instead of being explicitly programmed, ML algorithms find patterns in data.
        Common types include supervised, unsupervised, and reinforcement learning.
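
One way to picture a metadata filter is as an exact-match check applied to an already ranked result list. The sketch below uses plain dictionaries and hypothetical helper names; it illustrates the idea and is not LangChain's implementation.

```python
def matches(metadata, filter_dict):
    """True when every key in filter_dict has the same value in metadata."""
    return all(metadata.get(key) == value for key, value in filter_dict.items())


def filtered_top_k(ranked_docs, filter_dict, k):
    """Keep the first k documents, in rank order, whose metadata matches the filter."""
    kept = [doc for doc in ranked_docs if matches(doc["metadata"], filter_dict)]
    return kept[:k]


# Documents already sorted from most to least similar to the query.
ranked = [
    {"text": "Deep Learning is a subset of machine learning...", "metadata": {"topic": "DL"}},
    {"text": "Machine Learning is a subset of AI...", "metadata": {"topic": "ML"}},
    {"text": "Natural Language Processing (NLP)...", "metadata": {"topic": "NLP"}},
    {"text": "Supervised learning uses labelled data...", "metadata": {"topic": "ML"}},
]

ml_only = filtered_top_k(ranked, {"topic": "ML"}, k=3)
print([doc["text"] for doc in ml_only])
```

Note that a filter can return fewer than `k` results: here only two documents match even though three were requested.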


### Build RAG Chain with LCEL

In [26]:
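### Groq LLM

from langchain.chat_models import init_chat_model

# Fail early if the key is missing; assigning None to os.environ would raise a TypeError.
groq_api_key = os.getenv("GROQ_API_KEY")
if groq_api_key is None:
    raise RuntimeError("GROQ_API_KEY is not set")
os.environ["GROQ_API_KEY"] = groq_api_key
llm = init_chat_model(model = "groq:gemma2-9b-it")
llm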

ChatGroq(client=<groq.resources.chat.completions.Completions object at 0x00000204FB1C78C0>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x00000204FF4AC050>, model_name='gemma2-9b-it', model_kwargs={}, groq_api_key=SecretStr('**********'))

In [27]:
# 1. Simple RAG Chain with LCEL
simple_prompt = ChatPromptTemplate.from_template("""Answer the question based only on the following context:
Context: {context}

Question: {question}

Answer:""")
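
The template has two placeholders, `{context}` and `{question}`, that are filled in at run time. The sketch below shows only that substitution step with a plain Python string; `ChatPromptTemplate` additionally wraps the result in a chat message, which is not modelled here. The `fill_prompt` helper is an illustrative assumption.

```python
# Same wording as the notebook's prompt, held in a plain Python string.
TEMPLATE = """Answer the question based only on the following context:
Context: {context}

Question: {question}

Answer:"""


def fill_prompt(context: str, question: str) -> str:
    """Substitute the retrieved context and the user question into the template."""
    return TEMPLATE.format(context=context, question=question)


filled = fill_prompt(
    context="Machine Learning is a subset of AI.",
    question="What is ML?",
)
print(filled)
```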

In [30]:
retriever = vectorestore.as_retriever(
    search_type = "similarity", 
    search_kwargs = {"k":3}
)

In [31]:
from typing import List
# Format documents for the prompt
def format_docs(docs:List[Document])->str:
    """Format documents for insertion into prompt"""
    formatted = []
    for i, doc in enumerate(docs):
        source = doc.metadata.get('source', 'Unknown')
        formatted.append(f"Document {i+1} (Source:{source}):\n{doc.page_content}")
    return "\n\n".join(formatted)
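
To see the text that `format_docs` produces without calling the retriever, the sketch below repeats its logic on a small stand-in class. `SimpleDoc` and `format_docs_demo` are hypothetical names used only so the example runs without LangChain installed.

```python
from dataclasses import dataclass, field


@dataclass
class SimpleDoc:
    """Hypothetical stand-in with the two attributes the formatter reads."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs_demo(docs):
    """Same formatting logic as format_docs, applied to SimpleDoc objects."""
    formatted = []
    for i, doc in enumerate(docs):
        source = doc.metadata.get("source", "Unknown")
        formatted.append(f"Document {i+1} (Source:{source}):\n{doc.page_content}")
    return "\n\n".join(formatted)


demo_docs = [
    SimpleDoc("ML learns from data.", {"source": "ML Basics"}),
    SimpleDoc("No metadata here."),  # falls back to the 'Unknown' source label
]
context_block = format_docs_demo(demo_docs)
print(context_block)
```

Numbering the documents and naming their sources gives the model something concrete to refer to when it answers from the context.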

In [34]:
simple_rag_chain = (
    {"context":retriever | format_docs, "question":RunnablePassthrough()}
    |simple_prompt
    | llm
    | StrOutputParser()
)
simple_rag_chain

{
  context: VectorStoreRetriever(tags=['FAISS', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x00000204FAD46F90>, search_kwargs={'k': 3})
           | RunnableLambda(format_docs),
  question: RunnablePassthrough()
}
| ChatPromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, template='Answer the question based only on the following context:\nContext: {context}\n\nQuestion: {question}\n\nAnswer:'), additional_kwargs={})])
| ChatGroq(client=<groq.resources.chat.completions.Completions object at 0x00000204FB1C78C0>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x00000204FF4AC050>, model_name='gemma2-9b-it', model_kwargs={}, groq_api_key=SecretStr('**********'))
| StrOutputParser()
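
The `|` operator and the leading dictionary can look opaque at first. The toy sketch below imitates the same shape: a fan-out step that feeds one input to several branches, followed by steps chained with `|`. The `Step` class, `fan_out`, and the stand-in retriever and model are all assumptions made for illustration; LangChain's runnables are considerably richer.

```python
class Step:
    """Wraps a function so that steps can be chained with the | operator."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # Running the combined step means: run self, then pass the result to other.
        return Step(lambda value: other.invoke(self.invoke(value)))


def fan_out(**branches):
    """Run every branch on the same input and collect the results in a dict."""
    return Step(lambda value: {name: step.invoke(value) for name, step in branches.items()})


passthrough = Step(lambda value: value)                          # forwards the question unchanged
retrieve_and_format = Step(lambda q: "ML is a subset of AI.")    # stand-in for retriever | format_docs
fill_prompt_step = Step(lambda d: f"Context: {d['context']}\nQuestion: {d['question']}")
fake_llm = Step(lambda prompt: f"[model saw {len(prompt.splitlines())} lines]")  # stand-in model

toy_chain = fan_out(context=retrieve_and_format, question=passthrough) | fill_prompt_step | fake_llm
toy_answer = toy_chain.invoke("What is ML?")
print(toy_answer)  # [model saw 2 lines]
```

The point of the dictionary at the front of the real chain is the same as `fan_out` here: the single input question is used twice, once to retrieve context and once as the question itself.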

In [36]:
###Conversational RAG Chain

conversational_prompt = ChatPromptTemplate.from_messages([
    ("system","You are a helpful AI assistant. Use the provided context to answer questions."),
    ("placeholder","{chat_history}"),
    ("human", "Context:{context}\n\nQuestion: {input}"), 
])

In [42]:
from langchain_core.runnables import RunnablePassthrough
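def create_conversational_rag():
    """Create a RAG chain that takes the chat history as an input.

    The chain stores nothing itself: the caller passes chat_history on every call.
    Retrieval uses only the latest "input", not the earlier turns.
    """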
    return (
        RunnablePassthrough.assign(
            context = lambda x:format_docs(retriever.invoke(x["input"]))
        )
        | conversational_prompt
        | llm
        | StrOutputParser()
    )

conversational_rag = create_conversational_rag()

In [43]:
conversational_rag

RunnableAssign(mapper={
  context: RunnableLambda(lambda x: format_docs(retriever.invoke(x['input'])))
})
| ChatPromptTemplate(input_variables=['context', 'input'], optional_variables=['chat_history'], input_types={'chat_history': list[typing.Annotated[typing.Union[typing.Annotated[langchain_core.messages.ai.AIMessage, Tag(tag='ai')], typing.Annotated[langchain_core.messages.human.HumanMessage, Tag(tag='human')], typing.Annotated[langchain_core.messages.chat.ChatMessage, Tag(tag='chat')], typing.Annotated[langchain_core.messages.system.SystemMessage, Tag(tag='system')], typing.Annotated[langchain_core.messages.function.FunctionMessage, Tag(tag='function')], typing.Annotated[langchain_core.messages.tool.ToolMessage, Tag(tag='tool')], typing.Annotated[langchain_core.messages.ai.AIMessageChunk, Tag(tag='AIMessageChunk')], typing.Annotated[langchain_core.messages.human.HumanMessageChunk, Tag(tag='HumanMessageChunk')], typing.Annotated[langchain_core.messages.chat.ChatMessageChunk, Tag(tag=

In [45]:
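### Streaming RAG chain
# No StrOutputParser at the end, so .stream() yields message chunks; read the text from chunk.content.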

streaming_rag_chain = (
    {"context":retriever | format_docs, "question":RunnablePassthrough()}
    |simple_prompt
    |llm
)

streaming_rag_chain

{
  context: VectorStoreRetriever(tags=['FAISS', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x00000204FAD46F90>, search_kwargs={'k': 3})
           | RunnableLambda(format_docs),
  question: RunnablePassthrough()
}
| ChatPromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, template='Answer the question based only on the following context:\nContext: {context}\n\nQuestion: {question}\n\nAnswer:'), additional_kwargs={})])
| ChatGroq(client=<groq.resources.chat.completions.Completions object at 0x00000204FB1C78C0>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x00000204FF4AC050>, model_name='gemma2-9b-it', model_kwargs={}, groq_api_key=SecretStr('**********'))

In [46]:
print("Modern RAG chains created successfully!")
print("Available chains:")
print("- simple_rag_chain: Basic Q&A")
print("- conversational_rag: Uses chat history passed in by the caller")
print("- streaming_rag_chain: Supports token streaming")

Modern RAG chains created successfully!
Available chains:
- simple_rag_chain: Basic Q&A
- conversational_rag: Uses chat history passed in by the caller
- streaming_rag_chain: Supports token streaming


In [50]:
### Test function for the simple and streaming chains
def test_rag_chain(question:str):
    """Run the simple and the streaming RAG chain on one question"""
    print(f"Question: :{question}")
    print("="*90)

    #1.Simple RAG 
    print("\n Simple RAG chain:")
    answer = simple_rag_chain.invoke(question)
    print(f"Answer: {answer}")

    #2 Streaming RAG
    print("\n Streaming RAg:")
    print("Answer: ", end="", flush=True)
    for chunk in streaming_rag_chain.stream(question):
        print(chunk.content, end="", flush=True)
    print()

In [51]:
test_rag_chain("What is the difference between AI and Machine Learning")

Question: :What is the difference between AI and Machine Learning

 Simple RAG chain:
Answer: AI is the broad simulation of human intelligence in machines, while Machine Learning is a subset of AI that enables systems to learn from data without explicit programming.  


 Streaming RAg:
Answer: AI is the broad concept of machines simulating human intelligence, while Machine Learning is a subset of AI that focuses on enabling systems to learn from data without explicit programming. 
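
The streaming loop in `test_rag_chain` prints each piece as soon as it arrives instead of waiting for the full answer. The sketch below shows the same consumption pattern with a generator that stands in for the model; `fake_token_stream` is a made-up helper, and real chunks are message objects read through `.content` rather than plain strings.

```python
def fake_token_stream(text):
    """Yield a response one word at a time, the way a streaming model yields pieces."""
    for word in text.split(" "):
        yield word + " "


pieces = []
for piece in fake_token_stream("AI is broad; ML is a subset."):
    print(piece, end="", flush=True)  # show each piece immediately
    pieces.append(piece)              # keep it so the full answer is available afterwards
print()

full_answer = "".join(pieces).strip()
```

Collecting the pieces while printing them is useful when the complete answer is needed later, for example to append it to a chat history.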



In [52]:
### Test with multiple questions

test_question = [
    "What is the difference between AI and Machine learning?", 
    "Explain deep learning in simple terms", 
    "How does NLP work?"
]

for question in test_question:
    print("\n" + "="*80, "\n")
    test_rag_chain(question)



Question: :What is the difference between AI and Machine learning?

 Simple RAG chain:
Answer: Artificial Intelligence (AI) is a broad concept that encompasses the simulation of human intelligence in machines. Machine learning, on the other hand, is a *subset* of AI that focuses on enabling systems to learn from data without explicit programming.  


 Streaming RAg:
Answer: AI is the simulation of human intelligence in machines, while Machine Learning is a subset of AI that enables systems to learn from data instead of being explicitly programmed. 



Question: :Explain deep learning in simple terms

 Simple RAG chain:
Answer: Deep learning is a type of machine learning that uses artificial networks with many layers to learn complex patterns from data. 

Think of it like teaching a computer to recognize a cat in a picture. Instead of giving it specific rules, you show it thousands of cat pictures.  The deep learning network analyzes these images, learning to identify features that m

In [53]:
### Conversational Example 
print("\n. Conversational RAG Example:")
chat_history = []

#First question 
q1 = "What is machine learning?"
a1 = conversational_rag.invoke({
    "input":q1, 
    "chat_history":chat_history
})

print(f"Q1: {q1}")
print(f"A1: {a1}")


. Conversational RAG Example:
Q1: What is machine learning?
A1: Machine learning is a subset of Artificial Intelligence (AI) that allows systems to learn from data.  Instead of being explicitly programmed, machine learning algorithms identify patterns within data.  



In [54]:
chat_history.extend([
    HumanMessage(content = q1),
    AIMessage(content=a1)
])

In [56]:
chat_history

[HumanMessage(content='What is machine learning?', additional_kwargs={}, response_metadata={}),
 AIMessage(content='Machine learning is a subset of Artificial Intelligence (AI) that allows systems to learn from data.  Instead of being explicitly programmed, machine learning algorithms identify patterns within data.  \n', additional_kwargs={}, response_metadata={})]

In [57]:
### Follow up question 
q2 = "How is it different from traditional programming?"
a2 = conversational_rag.invoke({
    "input":q2, 
    "chat_history":chat_history
})

print(f"Q2: {q2}")
print(f"A2: {a2}")

Q2: How is it different from traditional programming?
A2: Based on the provided documents, here's how machine learning differs from traditional programming:

* **Traditional Programming:**  
    *  You explicitly write instructions for the computer to follow.
    *  The computer executes these instructions step-by-step to solve a specific problem.
* **Machine Learning:**
    * You provide the computer with data and algorithms.
    * The algorithms learn patterns and relationships within the data *without* explicit instructions.
    * The machine learns to make predictions or decisions based on the patterns it has identified.

Essentially, in traditional programming, you tell the computer *what* to do. In machine learning, you show the computer *data* and let it figure out *what* to do. 
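
Because the chain keeps no state, the caller has to append each turn to `chat_history` and decide how much of it to keep. The sketch below is one possible way to do that with a cap on the number of stored messages. Plain `(role, text)` tuples stand in for `HumanMessage` and `AIMessage`, and `add_turn` and `max_messages` are hypothetical names.

```python
def add_turn(history, question, answer, max_messages=6):
    """Append one question/answer pair and drop the oldest messages beyond the cap."""
    history.append(("human", question))
    history.append(("ai", answer))
    if len(history) > max_messages:
        # Remove from the front so the most recent turns survive.
        del history[:len(history) - max_messages]
    return history


demo_history = []
for n in range(1, 5):
    add_turn(demo_history, f"question {n}", f"answer {n}", max_messages=6)

print(len(demo_history))  # 6
print(demo_history[0])    # ('human', 'question 2')
```

Capping the history keeps the prompt from growing without bound over a long conversation, at the cost of the model forgetting the earliest turns.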




