FAISS is a library for efficient similarity search and clustering of dense vectors.

Key Advantages :
1) Extremely fast similarity search
2) Memory efficient
3) Supports GPU acceleration
4) Can handle millions of vectors


Working :
 - Indexes vectors for fast nearest-neighbour search
 - Returns the most similar vectors based on a distance metric
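
To make the two points above concrete, here is a minimal sketch of an exact (brute-force) nearest-neighbour search in plain Python. It is only an illustration of the idea, not how FAISS is implemented; the function names and the toy 2-D vectors are made up for this example.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_neighbours(query, vectors, k=2):
    """Return (index, distance) pairs for the k stored vectors closest to query."""
    scored = [(i, l2_distance(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:k]

# Toy 2-D "embeddings" (made-up values, for illustration only)
stored = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
top = nearest_neighbours([0.9, 1.2], stored, k=2)
print(top)  # index 1 comes first, then index 0
```

A real index avoids comparing the query against every stored vector once the collection gets large, which is where the speed advantage comes from.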
 

In [90]:
# Libraries required for this notebook

import os
from dotenv import load_dotenv
import numpy as np # for cosine similarity function
import warnings
warnings.filterwarnings('ignore')

# Langchain core imports
from langchain_core.documents import Document
from langchain_core.prompts import ChatPromptTemplate, PromptTemplate
from langchain_core.runnables import RunnablePassthrough


from langchain_core.output_parsers import StrOutputParser # converts the model's message output into a plain string
from langchain_core.messages import HumanMessage, AIMessage

# LangChain Specific Imports
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings

# Vector store
from langchain_community.vectorstores import FAISS


from langchain_community.document_loaders import TextLoader, PyPDFLoader
from langchain.chains import create_retrieval_chain
# from langchain.chains.combine_documents import create_stuff_documents_chain


- Before load_dotenv() is called, the Python program cannot see the variables defined in the .env file
- Once load_dotenv() has run, those variables are available as environment variables
- Use os.getenv("API_KEY") to read a variable
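
A rough sketch of what happens conceptually: KEY=VALUE lines are copied into the process environment, and os.getenv reads them back. The helper load_env_text and the variable names below are hypothetical and exist only for this illustration; the real load_dotenv handles many more cases (quoting rules, file lookup, and so on).

```python
import os

def load_env_text(text):
    """Hypothetical helper: copy KEY=VALUE lines into os.environ.

    Variables that already exist are left untouched, similar to
    load_dotenv's default of override=False.
    """
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blank lines, comments, and malformed lines
        key, value = line.split("=", 1)
        os.environ.setdefault(key.strip(), value.strip().strip('"'))

# Stand-in for the contents of a .env file (made-up key and value)
load_env_text('# demo file\nDEMO_API_KEY="abc123"\n')

print(os.getenv("DEMO_API_KEY"))                  # value loaded above
print(os.getenv("DEMO_MISSING_KEY", "not set"))   # fallback for an unset variable
```

Note that os.getenv returns None when a variable is missing and no default is given, so a misspelled key name fails silently until the value is used.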


In [91]:
# load the environment variable
load_dotenv()

True

Step 1 : Put all the data into the Document structure

In [92]:
sample_documents = [
    Document(
        page_content="""
        Artificial Intelligence (AI) is the simulation of human intelligence in machines.
        These systems are designed to think like humans and mimic their actions.
        AI can be categorized into narrow AI and general AI.
        """,
        metadata={"source": "AI Introduction", "page": 1, "topic": "AI"}
    ),
    Document(
        page_content="""
        Machine Learning is a subset of AI that enables systems to learn from data.
        Instead of being explicitly programmed, ML algorithms find patterns in data.
        Common types include supervised, unsupervised, and reinforcement learning.
        """,
        metadata={"source": "ML Basics", "page": 1, "topic": "ML"}
    ),
    Document(
        page_content="""
        Deep Learning is a subset of machine learning based on artificial neural networks.
        It uses multiple layers to progressively extract higher-level features from raw input.
        Deep learning has revolutionized computer vision, NLP, and speech recognition.
        """,
        metadata={"source": "Deep Learning", "page": 1, "topic": "DL"}
    ),
    Document(
        page_content="""
        Natural Language Processing (NLP) is a branch of AI that helps computers understand human language.
        It combines computational linguistics with machine learning and deep learning models.
        Applications include chatbots, translation, sentiment analysis, and text summarization.
        """,
        metadata={"source": "NLP Overview", "page": 1, "topic": "NLP"}
    )
]

sample_documents

[Document(metadata={'source': 'AI Introduction', 'page': 1, 'topic': 'AI'}, page_content='\n        Artificial Intelligence (AI) is the simulation of human intelligence in machines.\n        These systems are designed to think like humans and mimic their actions.\n        AI can be categorized into narrow AI and general AI.\n        '),
 Document(metadata={'source': 'ML Basics', 'page': 1, 'topic': 'ML'}, page_content='\n        Machine Learning is a subset of AI that enables systems to learn from data.\n        Instead of being explicitly programmed, ML algorithms find patterns in data.\n        Common types include supervised, unsupervised, and reinforcement learning.\n        '),
 Document(metadata={'source': 'Deep Learning', 'page': 1, 'topic': 'DL'}, page_content='\n        Deep Learning is a subset of machine learning based on artificial neural networks.\n        It uses multiple layers to progressively extract higher-level features from raw input.\n        Deep learning has revo

Step 2 : Split the text using a text splitter

In [93]:
## Text splitting
# create the splitter

text_split = RecursiveCharacterTextSplitter(
    chunk_size = 500,
    chunk_overlap = 50, # 10% of chunk_size
    length_function= len,
    separators= ["\n\n", "\n", " ", ""] # tried in order: paragraphs, lines, words, characters
)
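
To see what chunk_size and chunk_overlap mean, here is a simplified sketch that cuts a string into fixed-size character windows that share a few characters. This is an assumption-laden toy: the function chunk_with_overlap is hypothetical, and RecursiveCharacterTextSplitter additionally tries to split on the separators first, which this sketch ignores.

```python
def chunk_with_overlap(text, chunk_size, chunk_overlap):
    """Split text into windows of chunk_size characters that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks

pieces = chunk_with_overlap("abcdefghijklmnopqrst", chunk_size=8, chunk_overlap=2)
print(pieces)  # ['abcdefgh', 'ghijklmn', 'mnopqrst']
```

Each chunk starts with the last two characters of the previous one, so a sentence cut at a boundary still appears whole in at least one chunk when the overlap is large enough.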


In [94]:
# chunks
chunks = text_split.split_documents(sample_documents)
print(chunks[0].page_content)
print(chunks[1].page_content)
print(chunks[2].page_content)

Artificial Intelligence (AI) is the simulation of human intelligence in machines.
        These systems are designed to think like humans and mimic their actions.
        AI can be categorized into narrow AI and general AI.
Machine Learning is a subset of AI that enables systems to learn from data.
        Instead of being explicitly programmed, ML algorithms find patterns in data.
        Common types include supervised, unsupervised, and reinforcement learning.
Deep Learning is a subset of machine learning based on artificial neural networks.
        It uses multiple layers to progressively extract higher-level features from raw input.
        Deep learning has revolutionized computer vision, NLP, and speech recognition.


Step 3 : Load the embedding model and create embeddings of the chunks

In [95]:
Embeddings = HuggingFaceEmbeddings(
	model_name = "sentence-transformers/all-MiniLM-L6-v2" )


In [96]:
q1 = "This is AKASH"
q1_embeddings = Embeddings.embed_query(q1)
q1_embeddings


# embed a small list of texts
document_query = ["AI", "Machine Learning", "Deep Learning"]
doc_embeddings = Embeddings.embed_documents(document_query)
print(doc_embeddings[0]) # embedding vector for "AI"

[-0.03653930127620697, -0.015164414420723915, 0.016432441771030426, 0.010568802244961262, 0.006010616198182106, -0.01847323216497898, 0.08546523004770279, 0.020968416705727577, 0.027815338224172592, 0.012431556358933449, -0.029376475140452385, -0.031135255470871925, 0.03491252288222313, -0.01815079152584076, -0.06498481333255768, 0.05168246105313301, -0.019606152549386024, -0.015734178945422173, -0.13371680676937103, -0.09645994007587433, -0.025471778586506844, -0.0014895368367433548, -0.006349359638988972, -0.02582070417702198, -0.027371758595108986, 0.12269002944231033, -0.007792471442371607, -0.03852272033691406, 0.014383504167199135, -0.0921841710805893, 0.008695729076862335, 0.0026133896317332983, 0.0910346731543541, -0.03031357377767563, -0.09604643285274506, 0.022288983687758446, -0.0902431309223175, -0.0329473614692688, 0.07158324867486954, -0.00889309961348772, -0.025708917528390884, -0.0791395977139473, 0.014530429616570473, -0.07420433312654495, 0.08045005798339844, 0.078040

Step 4 : Cosine similarity function


In [97]:
def cosine_similarity(text1, text2) :
    vector1 = np.array(Embeddings.embed_query(text1))
    vector2 = np.array(Embeddings.embed_query(text2))

    # calculate similarity score

    dot_prod = np.dot(vector1, vector2)
    n1= np.linalg.norm(vector1)
    n2= np.linalg.norm(vector2)

    return dot_prod/(n1*n2)
    

In [98]:
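# Test similarity score
# single quotes inside the f-strings keep this valid on Python versions before 3.12
# note: 'Arificial' is misspelled, and the score recorded below was computed on that input
print(f"AI vs Artificial Intelligence : {cosine_similarity('AI', 'Arificial Intelligence'):.3f}")
print(f"AI vs Pizza : {cosine_similarity('AI', 'Pizza'):.3f}")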

AI vs Artificial Intelligence : 0.590
AI vs Pizza : 0.257
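
As a sanity check on the formula itself, the same calculation can be run on tiny hand-made vectors where the answer is known in advance. This is a sketch in plain Python (no embedding model involved); the function name cosine is chosen just for this example.

```python
import math

def cosine(u, v):
    """Cosine similarity: dot product divided by the product of the vector lengths."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

print(cosine([1.0, 0.0], [2.0, 0.0]))   # same direction  -> 1.0
print(cosine([1.0, 0.0], [0.0, 3.0]))   # orthogonal      -> 0.0
print(cosine([1.0, 0.0], [-1.0, 0.0]))  # opposite        -> -1.0
```

The length of the vectors does not matter, only the angle between them, which is why the first pair scores 1.0 even though one vector is twice as long.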


Step 5 : Create Vector Store

In [99]:
vectorstore = FAISS.from_documents(
    documents= chunks,
    embedding = Embeddings )

print(f"Vector store created with {vectorstore.index.ntotal} vectors")

Vector store created with 4 vectors


In [100]:
vectorstore

<langchain_community.vectorstores.faiss.FAISS at 0x30b18f920>

Saving and Loading the vector store

In [24]:
# save vector store for later use
vectorstore.save_local("faiss_index")


In [101]:
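# load the vector store
# allow_dangerous_deserialization=True is needed because the saved index uses pickle;
# only load index files that you created yourself or otherwise trust
loaded_vectorstore = FAISS.load_local(
    "faiss_index",
    Embeddings,
    allow_dangerous_deserialization= True
)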

print(f"Loaded vector store contains : {loaded_vectorstore.index.ntotal} vectors")


Loaded vector store contains : 4 vectors


In [102]:
## Similarity search - built-in function

query = "what is deep learning"
result = vectorstore.similarity_search(query, k=3)
print(result[0].page_content)
print(result[1].page_content)




Deep Learning is a subset of machine learning based on artificial neural networks.
        It uses multiple layers to progressively extract higher-level features from raw input.
        Deep learning has revolutionized computer vision, NLP, and speech recognition.
Machine Learning is a subset of AI that enables systems to learn from data.
        Instead of being explicitly programmed, ML algorithms find patterns in data.
        Common types include supervised, unsupervised, and reinforcement learning.


In [103]:
# The top 3 matches are stored in result, which is a plain list of Documents; iterate over it to inspect each one

for i, doc in enumerate(result) :
    print(f"{i+1}. Source : {doc.metadata['source']}")
    print(f"Content : {doc.page_content[:200]}")
    


1. Source : Deep Learning
Content : Deep Learning is a subset of machine learning based on artificial neural networks.
        It uses multiple layers to progressively extract higher-level features from raw input.
        Deep learning 
2. Source : ML Basics
Content : Machine Learning is a subset of AI that enables systems to learn from data.
        Instead of being explicitly programmed, ML algorithms find patterns in data.
        Common types include supervised
3. Source : NLP Overview
Content : Natural Language Processing (NLP) is a branch of AI that helps computers understand human language.
        It combines computational linguistics with machine learning and deep learning models.
      


In [104]:
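# similarity search with score
# each result is a (Document, score) tuple; the score is a distance (L2 by default in this vector store),
# so a lower score means a more similar document
results_with_score = vectorstore.similarity_search_with_score(query, k=3)
print(results_with_score[0])
print(type(results_with_score[0]))
print(len(results_with_score[0])) # 2 : first element is the Document, second is the score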


(Document(id='0c91efe6-4e75-40f7-a1ef-4a0d11bbb7a2', metadata={'source': 'Deep Learning', 'page': 1, 'topic': 'DL'}, page_content='Deep Learning is a subset of machine learning based on artificial neural networks.\n        It uses multiple layers to progressively extract higher-level features from raw input.\n        Deep learning has revolutionized computer vision, NLP, and speech recognition.'), np.float32(0.34321663))
<class 'tuple'>
2


In [105]:
for doc, score in results_with_score :
    print(f"Score : {score:.3f}")
    print(f"Source : {doc.metadata['source']}")
    print(f"Content preview : {doc.page_content[:100]}")
    

Score : 0.343
Source : Deep Learning
Content preview : Deep Learning is a subset of machine learning based on artificial neural networks.
        It uses m
Score : 1.090
Source : ML Basics
Content preview : Machine Learning is a subset of AI that enables systems to learn from data.
        Instead of being
Score : 1.154
Source : NLP Overview
Content preview : Natural Language Processing (NLP) is a branch of AI that helps computers understand human language.
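
The scores above grow as documents get less relevant, which is the opposite of cosine similarity. Under two assumptions that I have not verified in this notebook (the score is a squared L2 distance, and the embeddings are unit length), the two measures are linked by d^2 = 2 - 2*cos, so a rough conversion can be sketched like this. The function name is hypothetical.

```python
def cosine_from_squared_l2(squared_distance):
    """For unit-length vectors: ||a - b||^2 = 2 - 2*cos(a, b), so cos = 1 - d^2 / 2."""
    return 1.0 - squared_distance / 2.0

# Scores copied from the output above; the conversion only holds under the stated assumptions
recorded = [("Deep Learning", 0.343), ("ML Basics", 1.090), ("NLP Overview", 1.154)]
for source, score in recorded:
    print(f"{source}: approximate cosine similarity {cosine_from_squared_l2(score):.3f}")
```

If the assumptions hold, the ranking is the same either way: the smallest distance corresponds to the largest cosine similarity.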



In [106]:

list1 = [("Akash" , 1), ("Ashirwad" , 2), ("Bhim Singh" , 10)]

for name, number in list1 :
    print(f"Number  : {number}")
    print(f"Name : {name}")


Number  : 1
Name : Akash
Number  : 2
Name : Ashirwad
Number  : 10
Name : Bhim Singh


In [107]:
## Search with metadata filtering
filter_dict = {"topic" : "ML"}

filtered_result = vectorstore.similarity_search(query, k=3, filter = filter_dict)
print(filtered_result) # returns only 1 document, because only one has topic "ML" in its metadata


[Document(id='d6b160f5-f55b-49d2-9500-4a9e937107e4', metadata={'source': 'ML Basics', 'page': 1, 'topic': 'ML'}, page_content='Machine Learning is a subset of AI that enables systems to learn from data.\n        Instead of being explicitly programmed, ML algorithms find patterns in data.\n        Common types include supervised, unsupervised, and reinforcement learning.')]
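
The effect of the filter can be pictured as an exact-match check on each document's metadata. The sketch below uses plain dictionaries and a hypothetical matches helper; it is not the vector store's internal code, only an illustration of why k=3 still yields a single result here.

```python
def matches(metadata, filter_dict):
    """True when every key in filter_dict is present in metadata with the same value."""
    return all(metadata.get(key) == value for key, value in filter_dict.items())

# Metadata of the four sample documents (only the fields needed here)
docs = [
    {"source": "AI Introduction", "topic": "AI"},
    {"source": "ML Basics", "topic": "ML"},
    {"source": "Deep Learning", "topic": "DL"},
    {"source": "NLP Overview", "topic": "NLP"},
]

kept = [d["source"] for d in docs if matches(d, {"topic": "ML"})]
print(kept)  # ['ML Basics']
```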


In [108]:
# Using Anthropic LLM for RAG System
from langchain_anthropic import ChatAnthropic

llm = ChatAnthropic(
    model = "claude-haiku-4-5-20251001",
    temperature= 0.7,
    api_key=os.getenv("claudeAPI") 
)


In [109]:
output = llm.invoke("Hey Claude, How is everything goin ? ")
print(type(output))
print(output)

<class 'langchain_core.messages.ai.AIMessage'>
content="Hey! Everything's going well, thanks for asking! I'm doing good and ready to chat.\n\nHow about you? What's on your mind today?" additional_kwargs={} response_metadata={'id': 'msg_01RTc9QDSLqySmmAmtj6XwTN', 'model': 'claude-haiku-4-5-20251001', 'stop_reason': 'end_turn', 'stop_sequence': None, 'usage': {'cache_creation': {'ephemeral_1h_input_tokens': 0, 'ephemeral_5m_input_tokens': 0}, 'cache_creation_input_tokens': 0, 'cache_read_input_tokens': 0, 'input_tokens': 17, 'output_tokens': 35, 'server_tool_use': None, 'service_tier': 'standard'}, 'model_name': 'claude-haiku-4-5-20251001'} id='run--0a9871d3-6cba-41e8-a94c-58d0bf85c564-0' usage_metadata={'input_tokens': 17, 'output_tokens': 35, 'total_tokens': 52, 'input_token_details': {'cache_read': 0, 'cache_creation': 0, 'ephemeral_5m_input_tokens': 0, 'ephemeral_1h_input_tokens': 0}}


### Simple RAG Chain Using LCEL

In [110]:

simple_prompt = ChatPromptTemplate.from_template(
    """ 
    Answer the question based only on the following context
    Context : {context}    
    
    Question : {question}

    Answer : 
    """
)

- Next Step
- Convert Vector Store to Retriever : 
    a retriever wraps the vector store (where the embeddings are stored) and fetches the most relevant documents for a given query
    
User Query
   ↓
Retriever
   ↓
Vector Store (similarity search)
   ↓
Relevant Documents
   ↓
LLM

### Retriever - provides an interface that takes a query and returns the top K results

In [111]:
retriever = vectorstore.as_retriever(
            search_type= "similarity",
            search_kwargs = {"k":3}
)

In [112]:
retriever

VectorStoreRetriever(tags=['FAISS', 'HuggingFaceEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x30b18f920>, search_kwargs={'k': 3})

In [113]:
# Format the document for prompt

from typing import List
def format_docs (docs : List[Document]) -> str :
    """Format document for insertion into prompt """
    formatted = []
    
    for i, doc in enumerate(docs) :
        source = doc.metadata.get('source', 'unknown')
        formatted.append(f"Document  {i+1} (Source : {source}) : \n {doc.page_content}")
    return "\n\n".join(formatted)


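#### Create the RAG pipeline

1. Question -> retriever -> format_docs gives the context; the question itself is passed through unchanged
2. context, question -> simple_prompt
3. simple_prompt -> LLM
4. LLM -> StrOutputParser -> output string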
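
The `|` chaining used for this pipeline can be mimicked with a few lines of plain Python. This is a toy sketch to show the idea of composing steps left to right; the Step class and the fake stand-ins for retriever, prompt, and LLM are all hypothetical and are not LangChain's implementation.

```python
class Step:
    """Tiny stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # a | b builds a new step that runs a first, then feeds its result to b
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)

# Fake stages: build the input dict, fill a prompt, and "answer" by upper-casing
build_inputs = Step(lambda q: {"context": f"(docs retrieved for: {q})", "question": q})
fill_prompt = Step(lambda d: f"Context : {d['context']}\nQuestion : {d['question']}")
fake_llm = Step(lambda prompt: prompt.upper())

chain = build_inputs | fill_prompt | fake_llm
print(chain.invoke("what is ai"))
```

The point to take away is that the chain is itself a step: calling invoke once pushes the question through every stage in order.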

In [114]:
# RunnablePassthrough - forwards its input unchanged, so the question reaches the prompt as-is
# StrOutputParser() - extracts the LLM response text as a plain string


# simple_rag_chain.invoke(question)
# because of RunnablePassthrough, the chain can be invoked with the question string directly


simple_rag_chain = (
    {"context" : retriever | format_docs, "question" : RunnablePassthrough()}
    | simple_prompt
    | llm 
    | StrOutputParser()
)

simple_rag_chain

{
  context: VectorStoreRetriever(tags=['FAISS', 'HuggingFaceEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x30b18f920>, search_kwargs={'k': 3})
           | RunnableLambda(format_docs),
  question: RunnablePassthrough()
}
| ChatPromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, template=' \n    Answer the question based only on the following context\n    Context : {context}    \n\n    Question : {question}\n\n    Answer : \n    '), additional_kwargs={})])
| ChatAnthropic(model='claude-haiku-4-5-20251001', temperature=0.7, anthropic_api_url='https://api.anthropic.com', anthropic_api_key=SecretStr('**********'), model_kwargs={})
| StrOutputParser()

### Conversational Rag Chain

In [115]:
conversational_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful AI assistant. Use the provided context to answer questions."),
    ("placeholder", "{chat_history}"),
    ("human", "context : {context}\n\n Question : {input}")
])
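
The placeholder entry is where earlier turns get spliced in between the system message and the new human message. The sketch below imitates that assembly with plain tuples; build_messages is a hypothetical helper written for this note, not part of LangChain, and the sample answers are made up.

```python
def build_messages(chat_history, context, question):
    """Assemble system message + earlier turns + the new human message, in that order."""
    messages = [("system", "You are a helpful AI assistant. Use the provided context to answer questions.")]
    messages.extend(chat_history)  # empty on the first turn
    messages.append(("human", f"context : {context}\n\n Question : {question}"))
    return messages

history = []
first = build_messages(history, "ML is a subset of AI.", "What is machine learning")

# After the first answer, both sides of the exchange are appended to the history
history.extend([
    ("human", "What is machine learning"),
    ("ai", "A subset of AI that learns from data."),
])
second = build_messages(history, "ML is a subset of AI.", "How is it different from explicit programming?")

print(len(first), len(second))  # 2 4
```

With an empty history the prompt is just the system and human messages; each completed turn adds two more entries ahead of the next question.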


Define the function that builds the conversational RAG chain

In [116]:
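# lambda x receives the chain's input dict; x['input'] is the user question,
# which is sent to the retriever, and the retrieved documents are formatted into the context
def create_conversational_rag():
    """ Create a conversational RAG chain; the caller supplies chat_history on each call """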
    return (
        RunnablePassthrough.assign(
            context = lambda x : format_docs(retriever.invoke(x['input']))
        )
        | conversational_prompt
        |llm
        |StrOutputParser()
    )

conversational_rag = create_conversational_rag()
conversational_rag

RunnableAssign(mapper={
  context: RunnableLambda(lambda x: format_docs(retriever.invoke(x['input'])))
})
| ChatPromptTemplate(input_variables=['context', 'input'], optional_variables=['chat_history'], input_types={'chat_history': list[typing.Annotated[typing.Union[typing.Annotated[langchain_core.messages.ai.AIMessage, Tag(tag='ai')], typing.Annotated[langchain_core.messages.human.HumanMessage, Tag(tag='human')], typing.Annotated[langchain_core.messages.chat.ChatMessage, Tag(tag='chat')], typing.Annotated[langchain_core.messages.system.SystemMessage, Tag(tag='system')], typing.Annotated[langchain_core.messages.function.FunctionMessage, Tag(tag='function')], typing.Annotated[langchain_core.messages.tool.ToolMessage, Tag(tag='tool')], typing.Annotated[langchain_core.messages.ai.AIMessageChunk, Tag(tag='AIMessageChunk')], typing.Annotated[langchain_core.messages.human.HumanMessageChunk, Tag(tag='HumanMessageChunk')], typing.Annotated[langchain_core.messages.chat.ChatMessageChunk, Tag(tag=

Streaming Rag Chain

In [117]:
streaming_rag_chain = (
    {"context" : retriever | format_docs, "question" : RunnablePassthrough()}
    | simple_prompt
    | llm
)

- Test the different chain types

In [123]:
def test_rag_chains(question : str):
    """ Test all RAG chain variants """
    print(f"Question : {question}")
    print("="*80)

    # 1. simple rag
    print("\n 1.Simple Rag Chain")
    answer = simple_rag_chain.invoke(question)
    print(f"Answer : {answer}")
    

    print("\n 2. Streaming RAG")
    print("Answer : ", end = " ", flush= True)
    
    for chunk in streaming_rag_chain.stream(question):
        print(chunk.content, end = "", flush= True)
    print()
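
The difference between invoke and stream is only in how the answer arrives: all at once, or piece by piece. A generator makes a convenient stand-in for a streaming model. In this sketch fake_stream is hypothetical and yields plain strings, whereas the real chain above yields message chunks with a .content attribute.

```python
def fake_stream(answer, chunk_size=5):
    """Yield the answer a few characters at a time, imitating a streaming response."""
    for start in range(0, len(answer), chunk_size):
        yield answer[start:start + chunk_size]

collected = []
for piece in fake_stream("AI is the broader field."):
    print(piece, end="", flush=True)  # show each piece as soon as it arrives
    collected.append(piece)
print()

full_answer = "".join(collected)  # same text that a non-streaming call would return
```

Printing with end="" and flush=True is what makes the text appear progressively instead of one line per chunk.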

In [124]:
test_rag_chains("what is the difference b/w AI and Machine Learning")

Question : what is the difference b/w AI and Machine Learning

 1.Simple Rag Chain
Answer : # Difference between AI and Machine Learning

Based on the provided context:

**Artificial Intelligence (AI):**
- The simulation of human intelligence in machines
- Systems designed to think like humans and mimic their actions
- Can be categorized into narrow AI and general AI

**Machine Learning (ML):**
- A subset of AI that enables systems to learn from data
- Instead of being explicitly programmed, ML algorithms find patterns in data
- Common types include supervised, unsupervised, and reinforcement learning

**Key Difference:**
The main difference is that **AI is the broader field** focused on simulating human intelligence, while **Machine Learning is a specific subset of AI** that accomplishes this through algorithms that learn patterns from data, rather than through explicit programming.

 2. Streaming RAG
Answer :  Based on the provided context, here are the key differences between AI and

In [None]:
# Test with multiple questions

test_question = [
    "what is the difference b/w AI and Machine Learning",
    "Explain deep learning in simple terms",
    "How does NLP work ?"
]

for question in test_question:
    print("\n" + "="*80 + "\n")
    test_rag_chains(question)
    

Conversational Rag Chain

In [125]:
chat_history = []

q1= "What is machine learning"
a1 = conversational_rag.invoke(
   {
    "input" : q1, 
    "chat_history" : chat_history
   } 
)

print(q1)
print(a1)


What is machine learning
# What is Machine Learning?

Based on the provided context, **Machine Learning (ML) is a subset of Artificial Intelligence that enables systems to learn from data.**

## Key Characteristics:

- **Learning from Data**: Instead of being explicitly programmed with instructions, ML algorithms automatically find patterns in data
- **Types of ML**: Common approaches include:
  - Supervised learning
  - Unsupervised learning
  - Reinforcement learning

In essence, Machine Learning allows computers to improve their performance on tasks by learning from examples and patterns in data, rather than relying solely on pre-programmed rules.


In [None]:
chat_history.extend(
    [
        HumanMessage(content=q1),
        AIMessage(content = a1)

    ]
)