# Chroma DB Vector Store

### Building a RAG system using LangChain and ChromaDB
#### Introduction

RAG:
Retrieval-Augmented Generation (RAG) is an AI approach that improves the quality of responses from large language models (LLMs) by retrieving relevant information from external data sources before generating an answer.

LangChain:
LangChain is an open-source framework designed to help developers build applications powered by Large Language Models (LLMs), especially those that involve retrieval, reasoning, and multi-step workflows.

In [1]:
import os
from dotenv import load_dotenv
load_dotenv()

True

In [2]:
## Import langchain libraries
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import TextLoader
from langchain_openai import OpenAIEmbeddings
from langchain_core.documents import Document


# Vector store imports
from langchain_community.vectorstores import Chroma

# Utility imports
import numpy as np
from typing import List, Any, Dict



### 1. Sample Data

In [3]:
### 1. Sample Data
sample_docs = [
    """
    Machine Learning Fundamentals
    
    Machine learning is a subset of artificial intelligence that enables systems to learn 
    and improve from experience without being explicitly programmed. There are three main 
    types of machine learning: supervised learning, unsupervised learning, and reinforcement 
    learning. Supervised learning uses labeled data to train models, while unsupervised 
    learning finds patterns in unlabeled data. Reinforcement learning learns through 
    interaction with an environment using rewards and penalties.
    """,
    
    """
    Deep Learning and Neural Networks
    
    Deep learning is a subset of machine learning based on artificial neural networks. 
    These networks are inspired by the human brain and consist of layers of interconnected 
    nodes. Deep learning has revolutionized fields like computer vision, natural language 
    processing, and speech recognition. Convolutional Neural Networks (CNNs) are particularly 
    effective for image processing, while Recurrent Neural Networks (RNNs) and Transformers 
    excel at sequential data processing.
    """,
    
    """
    Natural Language Processing (NLP)
    
    NLP is a field of AI that focuses on the interaction between computers and human language. 
    Key tasks in NLP include text classification, named entity recognition, sentiment analysis, 
    machine translation, and question answering. Modern NLP heavily relies on transformer 
    architectures like BERT, GPT, and T5. These models use attention mechanisms to understand 
    context and relationships between words in text.
    """
]
sample_docs

['\n    Machine Learning Fundamentals\n\n    Machine learning is a subset of artificial intelligence that enables systems to learn \n    and improve from experience without being explicitly programmed. There are three main \n    types of machine learning: supervised learning, unsupervised learning, and reinforcement \n    learning. Supervised learning uses labeled data to train models, while unsupervised \n    learning finds patterns in unlabeled data. Reinforcement learning learns through \n    interaction with an environment using rewards and penalties.\n    ',
 '\n    Deep Learning and Neural Networks\n\n    Deep learning is a subset of machine learning based on artificial neural networks. \n    These networks are inspired by the human brain and consist of layers of interconnected \n    nodes. Deep learning has revolutionized fields like computer vision, natural language \n    processing, and speech recognition. Convolutional Neural Networks (CNNs) are particularly \n    effective f

In [4]:
## Save the sample docs into text files
import tempfile
temp_dir=tempfile.mkdtemp()

for i, doc in enumerate(sample_docs):
    with open(f"{temp_dir}/doc_{i}.txt", "w") as f:
        f.write(doc)

print(f"Sample_documents created in : {temp_dir}")

Sample_documents created in : /var/folders/6n/4n__nk894pn8cc6by3xc1xr00000gn/T/tmpisxiyz8r


In [5]:
## Also save copies of the sample docs in the current working directory

for i, doc in enumerate(sample_docs):
    with open(f"doc_{i}.txt", "w") as f:
        f.write(doc)

print("Sample_documents created")

Sample_documents created


### 2. Document Loading

In [6]:
from langchain_community.document_loaders import DirectoryLoader

loader = DirectoryLoader(
    path=temp_dir,
    glob="*.txt",
    loader_cls=TextLoader,
    loader_kwargs={"encoding":"utf-8"}
)

documents = loader.load()

print(f"\nNumber of documents loaded: {len(documents)}")
print(f"\nPreview of first document:")
print(f"{documents[0].page_content[:256]}\n")
print(f"Metadata of {documents[0].metadata}\n")
print(f"Metadata of {documents[1].metadata}\n")
print(f"Metadata of {documents[2].metadata}\n")


Number of documents loaded: 3

Preview of first document:

    Natural Language Processing (NLP)

    NLP is a field of AI that focuses on the interaction between computers and human language. 
    Key tasks in NLP include text classification, named entity recognition, sentiment analysis, 
    machine translation

Metadata of {'source': '/var/folders/6n/4n__nk894pn8cc6by3xc1xr00000gn/T/tmpisxiyz8r/doc_2.txt'}

Metadata of {'source': '/var/folders/6n/4n__nk894pn8cc6by3xc1xr00000gn/T/tmpisxiyz8r/doc_0.txt'}

Metadata of {'source': '/var/folders/6n/4n__nk894pn8cc6by3xc1xr00000gn/T/tmpisxiyz8r/doc_1.txt'}



### 3. Document Splitting with RecursiveCharacterTextSplitter

In [7]:
# Initialize text splitter
text_splitter = RecursiveCharacterTextSplitter(
    separators=[" "],
    chunk_size=500,
    chunk_overlap=50,
    length_function=len,
)

In [8]:
doc_chunks = text_splitter.split_documents(documents=documents)
print(f"\nNumber of Chunks {len(doc_chunks)} for the given set of documents {len(documents)}")
print(f"\nChunk Sample of First document:")
print(f"\nChunk Content: {doc_chunks[0].page_content[:300]}...")
print(f"\nChunk Metadata: {doc_chunks[0].metadata}")


Number of Chunks 5 for the given set of documents 3

Chunk Sample of First document:

Chunk Content: Natural Language Processing (NLP)

    NLP is a field of AI that focuses on the interaction between computers and human language. 
    Key tasks in NLP include text classification, named entity recognition, sentiment analysis, 
    machine translation, and question answering. Modern NLP heavily reli...

Chunk Metadata: {'source': '/var/folders/6n/4n__nk894pn8cc6by3xc1xr00000gn/T/tmpisxiyz8r/doc_2.txt'}
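
To make `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of fixed-size chunking with overlap. It is not the algorithm `RecursiveCharacterTextSplitter` uses (that one splits on the given separators and merges the pieces up to the size limit); `sliding_window_chunks` is a hypothetical helper that only illustrates how consecutive chunks share characters.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reached the end of the text
    return chunks


text = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_window_chunks(text, chunk_size=10, chunk_overlap=3)
for chunk in chunks:
    print(chunk)
```

Each chunk begins with the last three characters of the previous one, which is the role `chunk_overlap=50` plays in the splitter above: context that straddles a boundary appears in both neighbouring chunks.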


### 4. Embedding Models

In [9]:
# load_dotenv() has already exported OPENAI_API_KEY; fail fast if it is missing
assert os.getenv("OPENAI_API_KEY"), "OPENAI_API_KEY is not set"

In [10]:
embeddings=OpenAIEmbeddings(model="text-embedding-ada-002")
sample_text="Machine Learning is fascinating"

In [11]:
vector=embeddings.embed_query(sample_text)
vector

[-0.02172444760799408,
 0.016208980232477188,
 0.010213345289230347,
 -0.022516079246997833,
 -0.0037213172763586044,
 0.01783117651939392,
 4.82096329506021e-05,
 0.01027174387127161,
 -0.015547124668955803,
 -0.04134652763605118,
 0.007929293438792229,
 0.03628527745604515,
 -0.019128933548927307,
 -0.008234266191720963,
 -0.0013058676850050688,
 0.00581719446927309,
 0.03880292549729347,
 0.008811768144369125,
 -0.0005584409227594733,
 -0.008591149002313614,
 -0.031224025413393974,
 0.022048886865377426,
 -0.005914526060223579,
 -0.03441650792956352,
 -0.014898247085511684,
 0.0023018959909677505,
 0.003834871109575033,
 -0.03885483369231224,
 -0.012523352168500423,
 -0.002739888848736882,
 0.027590306475758553,
 -0.004736811853945255,
 -0.0170655008405447,
 -0.03981517627835274,
 -0.008513283915817738,
 -0.012211889959871769,
 -0.004152821376919746,
 0.0028583090752363205,
 -0.01670212857425213,
 -0.00013068816042505205,
 0.020076295360922813,
 0.02541007660329342,
 -0.008435418829

### 5. Initialize the ChromaDB Vector Store and store the chunks as embedding vectors

In [12]:
## Create a ChromaDB vector store
persist_directory="./chroma_db"

## Initialize the Chromadb with OpenAI Embeddings
vectorstore = Chroma.from_documents(
    documents=doc_chunks,
    embedding=embeddings,
    persist_directory=persist_directory,
    collection_name="rag_collection"
)

print(f"\nVector store created with {vectorstore._collection.count()} vectors\n")
print(f"Persisted to: {persist_directory}\n")


Vector store created with 20 vectors

Persisted to: ./chroma_db
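
The count above (20) is larger than the 5 chunks produced in step 3, and the search results that follow return the same text from several different temp directories. A likely cause is that `./chroma_db` already held vectors from earlier runs of this notebook, with each run adding the chunks again under fresh IDs. One possible remedy, sketched below, is to derive a stable ID from each chunk's text. `chunk_id` is a hypothetical helper, and hashing only the text (not the temp-dir source path) is an assumption made so that re-runs produce the same IDs.

```python
import hashlib


def chunk_id(text: str) -> str:
    """Return a stable ID derived from the chunk text (shortened SHA-256 hex digest)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:32]


# Stand-ins for doc_chunks[i].page_content; the first and last are duplicates.
texts = [
    "Machine Learning Fundamentals",
    "Deep Learning and Neural Networks",
    "Machine Learning Fundamentals",
]
ids = [chunk_id(t) for t in texts]
unique_ids = sorted(set(ids))
print(len(ids), len(unique_ids))

# With the notebook's objects this could look like the following
# (assumes Chroma.from_documents accepts an ids= argument):
# ids = [chunk_id(c.page_content) for c in doc_chunks]
# vectorstore = Chroma.from_documents(
#     documents=doc_chunks, embedding=embeddings, ids=ids,
#     persist_directory=persist_directory, collection_name="rag_collection",
# )
```

Identical text always maps to the same ID, so re-adding a chunk should overwrite the stored entry instead of creating a second copy, though that behavior is worth confirming for the installed version. Another option is to call `vectorstore.delete_collection()` (or remove the `./chroma_db` folder) before rebuilding.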



### 6. Test the Similarity search

In [13]:
query="What are the types of machine learning?"
# query_embed=embeddings.embed_query(query)
similar_docs=vectorstore.similarity_search(query=query,k=3)
similar_docs

[Document(metadata={'source': '/var/folders/6n/4n__nk894pn8cc6by3xc1xr00000gn/T/tmpyzdx9kmx/doc_0.txt'}, page_content='Machine Learning Fundamentals\n\n    Machine learning is a subset of artificial intelligence that enables systems to learn \n    and improve from experience without being explicitly programmed. There are three main \n    types of machine learning: supervised learning, unsupervised learning, and reinforcement \n    learning. Supervised learning uses labeled data to train models, while unsupervised \n    learning finds patterns in unlabeled data. Reinforcement learning learns through'),
 Document(metadata={'source': '/var/folders/6n/4n__nk894pn8cc6by3xc1xr00000gn/T/tmpqftcpjhz/doc_0.txt'}, page_content='Machine Learning Fundamentals\n\n    Machine learning is a subset of artificial intelligence that enables systems to learn \n    and improve from experience without being explicitly programmed. There are three main \n    types of machine learning: supervised learning, uns

In [14]:
print(f"Query: {query}\n\n")
print(f"Top {len(similar_docs)} relevant documents of the query is fetched!")
for i, doc in enumerate(similar_docs):
    print(f"\n----- Chunk {i+1} -----\n")
    print(doc.page_content[:200]+"... ")
    print(f"Source: {doc.metadata.get('source','unknown')}")

Query: What are the types of machine learning?


Top 3 relevant documents of the query is fetched!

----- Chunk 1 -----

Machine Learning Fundamentals

    Machine learning is a subset of artificial intelligence that enables systems to learn 
    and improve from experience without being explicitly programmed. There are... 
Source: /var/folders/6n/4n__nk894pn8cc6by3xc1xr00000gn/T/tmpyzdx9kmx/doc_0.txt

----- Chunk 2 -----

Machine Learning Fundamentals

    Machine learning is a subset of artificial intelligence that enables systems to learn 
    and improve from experience without being explicitly programmed. There are... 
Source: /var/folders/6n/4n__nk894pn8cc6by3xc1xr00000gn/T/tmpqftcpjhz/doc_0.txt

----- Chunk 3 -----

Machine Learning Fundamentals

    Machine learning is a subset of artificial intelligence that enables systems to learn 
    and improve from experience without being explicitly programmed. There are... 
Source: /var/folders/6n/4n__nk894pn8cc6by3xc1xr00000gn/T/tmpis

In [15]:
query="What is NLP?"
# query_embed=embeddings.embed_query(query)
similar_docs=vectorstore.similarity_search(query=query,k=3)
similar_docs

[Document(metadata={'source': '/var/folders/6n/4n__nk894pn8cc6by3xc1xr00000gn/T/tmpyzdx9kmx/doc_2.txt'}, page_content='Natural Language Processing (NLP)\n\n    NLP is a field of AI that focuses on the interaction between computers and human language. \n    Key tasks in NLP include text classification, named entity recognition, sentiment analysis, \n    machine translation, and question answering. Modern NLP heavily relies on transformer \n    architectures like BERT, GPT, and T5. These models use attention mechanisms to understand \n    context and relationships between words in text.'),
 Document(metadata={'source': '/var/folders/6n/4n__nk894pn8cc6by3xc1xr00000gn/T/tmpqftcpjhz/doc_2.txt'}, page_content='Natural Language Processing (NLP)\n\n    NLP is a field of AI that focuses on the interaction between computers and human language. \n    Key tasks in NLP include text classification, named entity recognition, sentiment analysis, \n    machine translation, and question answering. Moder

In [16]:
print(f"Query: {query}\n\n")
print(f"Top {len(similar_docs)} relevant documents of the query is fetched!")
for i, doc in enumerate(similar_docs):
    print(f"\n----- Chunk {i+1} -----\n")
    print(doc.page_content[:200]+"... ")
    print(f"Source: {doc.metadata.get('source','unknown')}")

Query: What is NLP?


Top 3 relevant documents of the query is fetched!

----- Chunk 1 -----

Natural Language Processing (NLP)

    NLP is a field of AI that focuses on the interaction between computers and human language. 
    Key tasks in NLP include text classification, named entity recogn... 
Source: /var/folders/6n/4n__nk894pn8cc6by3xc1xr00000gn/T/tmpyzdx9kmx/doc_2.txt

----- Chunk 2 -----

Natural Language Processing (NLP)

    NLP is a field of AI that focuses on the interaction between computers and human language. 
    Key tasks in NLP include text classification, named entity recogn... 
Source: /var/folders/6n/4n__nk894pn8cc6by3xc1xr00000gn/T/tmpqftcpjhz/doc_2.txt

----- Chunk 3 -----

Natural Language Processing (NLP)

    NLP is a field of AI that focuses on the interaction between computers and human language. 
    Key tasks in NLP include text classification, named entity recogn... 
Source: /var/folders/6n/4n__nk894pn8cc6by3xc1xr00000gn/T/tmpisxiyz8r/doc_2.txt


### 7. Advanced Similarity search with scores

In [17]:
results_scores=vectorstore.similarity_search_with_score(query,k=3)
results_scores

[(Document(metadata={'source': '/var/folders/6n/4n__nk894pn8cc6by3xc1xr00000gn/T/tmpyzdx9kmx/doc_2.txt'}, page_content='Natural Language Processing (NLP)\n\n    NLP is a field of AI that focuses on the interaction between computers and human language. \n    Key tasks in NLP include text classification, named entity recognition, sentiment analysis, \n    machine translation, and question answering. Modern NLP heavily relies on transformer \n    architectures like BERT, GPT, and T5. These models use attention mechanisms to understand \n    context and relationships between words in text.'),
  0.2073928862810135),
 (Document(metadata={'source': '/var/folders/6n/4n__nk894pn8cc6by3xc1xr00000gn/T/tmpqftcpjhz/doc_2.txt'}, page_content='Natural Language Processing (NLP)\n\n    NLP is a field of AI that focuses on the interaction between computers and human language. \n    Key tasks in NLP include text classification, named entity recognition, sentiment analysis, \n    machine translation, and 

#### Understanding similarity scores


The score returned by `similarity_search_with_score` is the distance between the query vector and a chunk vector, so how to read it depends on the distance metric the collection uses:

ChromaDB default: squared L2 (Euclidean) distance


- Lower score = more similar (closer in vector space)
- Score of 0 = identical vectors
- For unit-length embeddings such as OpenAI's, the range is 0 to 4

Cosine (if configured): ChromaDB reports cosine distance, which is 1 minus cosine similarity

- Lower score = more similar
- Score of 0 = vectors pointing in the same direction
- Range is 0 to 2
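
The relationship between these metrics can be checked numerically. The sketch below computes squared L2 distance, cosine similarity, and cosine distance (1 minus cosine similarity) for small hand-made unit vectors; the vectors and helper names are illustrative and not taken from the notebook.

```python
import math


def squared_l2(a, b):
    """Sum of squared component differences (no square root taken)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product divided by the product of the vector lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


query = [1.0, 0.0]
same = [1.0, 0.0]
orthogonal = [0.0, 1.0]
opposite = [-1.0, 0.0]

for name, vec in [("same", same), ("orthogonal", orthogonal), ("opposite", opposite)]:
    d2 = squared_l2(query, vec)
    cos = cosine_similarity(query, vec)
    print(f"{name:10s} squared_l2={d2:.1f} cosine_sim={cos:.1f} cosine_dist={1 - cos:.1f}")
```

For unit-length vectors, squared L2 distance equals twice the cosine distance (0, 2, 4 versus 0, 1, 2 in the three rows), so both metrics rank neighbours in the same order.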


#### Initialize LLM, RAG Chain, Prompt Template, Query RAG

In [18]:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0
)

In [19]:
test_response=llm.invoke("What is LLM?")
test_response

AIMessage(content='LLM stands for "Large Language Model." It refers to a type of artificial intelligence model that is designed to understand and generate human language. These models are trained on vast amounts of text data and use deep learning techniques, particularly neural networks, to learn patterns, grammar, facts, and even some reasoning abilities from the data.\n\nLLMs can perform a variety of language-related tasks, including:\n\n- Text generation\n- Translation\n- Summarization\n- Question answering\n- Sentiment analysis\n- Conversational agents (chatbots)\n\nExamples of large language models include OpenAI\'s GPT-3 and GPT-4, Google\'s BERT, and others. These models have gained significant attention for their ability to produce coherent and contextually relevant text, making them useful in various applications across industries.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 158, 'prompt_tokens': 12, 'total_tokens': 170, 'comp

In [20]:
from langchain.chat_models import init_chat_model

# Provider-agnostic alternative to ChatOpenAI; note that this rebinds llm without temperature=0
llm=init_chat_model("openai:gpt-4o-mini")
llm

ChatOpenAI(profile={'max_input_tokens': 128000, 'max_output_tokens': 16384, 'image_inputs': True, 'audio_inputs': False, 'video_inputs': False, 'image_outputs': False, 'audio_outputs': False, 'video_outputs': False, 'reasoning_output': False, 'tool_calling': True, 'structured_output': True, 'image_url_inputs': True, 'pdf_inputs': True, 'pdf_tool_message': True, 'image_tool_message': True, 'tool_choice': True}, client=<openai.resources.chat.completions.completions.Completions object at 0x10d9e42d0>, async_client=<openai.resources.chat.completions.completions.AsyncCompletions object at 0x157ea6150>, root_client=<openai.OpenAI object at 0x1080ccbd0>, root_async_client=<openai.AsyncOpenAI object at 0x157ea5e10>, model_name='gpt-4o-mini', model_kwargs={}, openai_api_key=SecretStr('**********'), stream_usage=True)

In [21]:
llm.invoke("What is AI?")

AIMessage(content='Artificial Intelligence (AI) refers to the simulation of human intelligence processes by computer systems. These processes include learning (the acquisition of information and rules for using it), reasoning (the use of rules to reach approximate or definite conclusions), and self-correction. AI can be classified into two main categories:\n\n1. **Narrow AI**: Also known as weak AI, this type of AI is designed and trained to perform specific tasks. Examples include voice assistants like Siri and Alexa, recommendation systems, and image recognition software. Narrow AI systems excel in their particular domain but lack general intelligence.\n\n2. **General AI**: Also referred to as strong AI or human-level AI, this is a theoretical form of AI that possesses the ability to understand, learn, and apply intelligence across a wide range of tasks, similar to a human being. General AI does not yet exist and remains a topic of research and speculation.\n\nAI encompasses various 

### 8. Modern RAG Chain

In [22]:
# from langchain.chains import create_retrieval_chain
from langchain_classic.chains.retrieval import create_retrieval_chain
from langchain_core.prompts import ChatPromptTemplate
from langchain_classic.chains.combine_documents import create_stuff_documents_chain

In [23]:
import importlib.metadata

try:
    # This is the modern standard way
    version = importlib.metadata.version('langchain')
    print(f"The installed LangChain version is: {version}")
except importlib.metadata.PackageNotFoundError:
    print("LangChain is not installed in the current environment.")

The installed LangChain version is: 1.2.0


In [24]:
retriever=vectorstore.as_retriever(
    search_kwargs={"k":3} # retrieve top 3 relevant chunks
)
retriever

VectorStoreRetriever(tags=['Chroma', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.chroma.Chroma object at 0x15670dd90>, search_kwargs={'k': 3})

In [25]:
## Create a prompt template
system_prompt=""" 

You are an assistant for a question answering tasks. 
Use the following pieces of retrieved context to answer the question.
If you don't know the answer, just say that you don't know.
Use three sentences maximum and keep the answer concise. 

Context: {context}"""

prompt = ChatPromptTemplate.from_messages([
    {'role': "system", 'content': system_prompt},
    {'role':"human", 'content': "{input}"}
])

In [26]:
# Create a document chain that stuffs the retrieved documents into the prompt's {context} slot
document_chain=create_stuff_documents_chain(llm=llm, prompt=prompt)
document_chain


RunnableBinding(bound=RunnableBinding(bound=RunnableAssign(mapper={
  context: RunnableLambda(format_docs)
}), kwargs={}, config={'run_name': 'format_inputs'}, config_factories=[])
| ChatPromptTemplate(input_variables=['context', 'input'], input_types={}, partial_variables={}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context'], input_types={}, partial_variables={}, template=" \n\nYou are an assistant for a question answering tasks. \nUse the following pieces of retrieved context to answer the question.\nIf you don't know the answer, just say that you don't know.\nUse three sentences maximum and keep the answer concise. \n\nContext: {context}"), additional_kwargs={}), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], input_types={}, partial_variables={}, template='{input}'), additional_kwargs={})])
| ChatOpenAI(profile={'max_input_tokens': 128000, 'max_output_tokens': 16384, 'image_inputs': True, 'audio_inputs': False, 'vide

In [27]:
### Create the final RAG chain
rag_chain=create_retrieval_chain(retriever=retriever, combine_docs_chain=document_chain)
rag_chain

RunnableBinding(bound=RunnableAssign(mapper={
  context: RunnableBinding(bound=RunnableLambda(lambda x: x['input'])
           | VectorStoreRetriever(tags=['Chroma', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.chroma.Chroma object at 0x15670dd90>, search_kwargs={'k': 3}), kwargs={}, config={'run_name': 'retrieve_documents'}, config_factories=[])
})
| RunnableAssign(mapper={
    answer: RunnableBinding(bound=RunnableBinding(bound=RunnableAssign(mapper={
              context: RunnableLambda(format_docs)
            }), kwargs={}, config={'run_name': 'format_inputs'}, config_factories=[])
            | ChatPromptTemplate(input_variables=['context', 'input'], input_types={}, partial_variables={}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context'], input_types={}, partial_variables={}, template=" \n\nYou are an assistant for a question answering tasks. \nUse the following pieces of retrieved context to answer the question.\nIf you

In [28]:
response=rag_chain.invoke({"input":"What is deep learning?"})

In [29]:
response["answer"]

'Deep learning is a subset of machine learning that utilizes artificial neural networks inspired by the human brain. These networks consist of layers of interconnected nodes and have significantly advanced fields such as computer vision, natural language processing, and speech recognition.'

In [30]:
response["context"]

[Document(metadata={'source': '/var/folders/6n/4n__nk894pn8cc6by3xc1xr00000gn/T/tmpo73gxp3b/doc_1.txt'}, page_content='Deep Learning and Neural Networks\n\n    Deep learning is a subset of machine learning based on artificial neural networks. \n    These networks are inspired by the human brain and consist of layers of interconnected \n    nodes. Deep learning has revolutionized fields like computer vision, natural language \n    processing, and speech recognition. Convolutional Neural Networks (CNNs) are particularly \n    effective for image processing, while Recurrent Neural Networks (RNNs) and Transformers'),
 Document(metadata={'source': '/var/folders/6n/4n__nk894pn8cc6by3xc1xr00000gn/T/tmpyzdx9kmx/doc_1.txt'}, page_content='Deep Learning and Neural Networks\n\n    Deep learning is a subset of machine learning based on artificial neural networks. \n    These networks are inspired by the human brain and consist of layers of interconnected \n    nodes. Deep learning has revolutioniz

In [31]:
response["input"]

'What is deep learning?'

In [32]:
# Function to query the modern RAG System

def query_rag_modern(question):
    print(f"Question: {question}")
    print("-" * 50)

    # Using retrieval chain approach
    result = rag_chain.invoke({"input":question})

    print(f"Answer: {result['answer']}")
    print("\nRetrieved Context:")
    for i, doc in enumerate(result['context']):
        print(f"\n--- Source {i+1} ---")
        print(doc.page_content[:200] + "...")
    return result

# Test Queries
test_questions = [
    "What are the three types of machine learning?",
    "What is deep learning and how does it relate to neural network?",
    "What are CNNs best used for?"
]

for question in test_questions:
    result = query_rag_modern(question=question)
    print("\n" + "="*80 + "\n")

Question: What are the three types of machine learning?
--------------------------------------------------
Answer: The three main types of machine learning are supervised learning, unsupervised learning, and reinforcement learning. Supervised learning uses labeled data to train models, unsupervised learning finds patterns in unlabeled data, and reinforcement learning learns through interactions with the environment.

Retrieved Context:

--- Source 1 ---
Machine Learning Fundamentals

    Machine learning is a subset of artificial intelligence that enables systems to learn 
    and improve from experience without being explicitly programmed. There are...

--- Source 2 ---
Machine Learning Fundamentals

    Machine learning is a subset of artificial intelligence that enables systems to learn 
    and improve from experience without being explicitly programmed. There are...

--- Source 3 ---
Machine Learning Fundamentals

    Machine learning is a subset of artificial intelligence that en

### 9. Create RAG Chain Using LCEL (LangChain Expression Language)

In [33]:
# LCEL
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableParallel, RunnablePassthrough

In [34]:
# Create custom prompt
custom_prompt = ChatPromptTemplate.from_template(""" Use the following context to answer the question. 
If you don't know the answer based on the context, say you don't know.
Provide specific details from the context to support your answer.
                                                 

Context:
{context}

Question: {question}
Answer: 
""")
custom_prompt

ChatPromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, template=" Use the following context to answer the question. \nIf you don't know the answer based on the context, say you don't know.\nProvide specific details from the context to support your answer.\n\n\nContext:\n{context}\n\nQuestion: {question}\nAnswer: \n"), additional_kwargs={})])

In [35]:
retriever

VectorStoreRetriever(tags=['Chroma', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.chroma.Chroma object at 0x15670dd90>, search_kwargs={'k': 3})

In [36]:
# Format retriever output for context
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

In [37]:
# Build a chain using LCEL
rag_chain_lcel=(
    {
        "context":retriever | format_docs, 
        "question": RunnablePassthrough()
    }
    | custom_prompt
    | llm
    | StrOutputParser()
)

rag_chain_lcel

{
  context: VectorStoreRetriever(tags=['Chroma', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.chroma.Chroma object at 0x15670dd90>, search_kwargs={'k': 3})
           | RunnableLambda(format_docs),
  question: RunnablePassthrough()
}
| ChatPromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, template=" Use the following context to answer the question. \nIf you don't know the answer based on the context, say you don't know.\nProvide specific details from the context to support your answer.\n\n\nContext:\n{context}\n\nQuestion: {question}\nAnswer: \n"), additional_kwargs={})])
| ChatOpenAI(profile={'max_input_tokens': 128000, 'max_output_tokens': 16384, 'image_inputs': True, 'audio_inputs': False, 'video_inputs': False, 'image_outputs': False, 'audio_outputs': False, 'video_outputs': False

In [38]:
response = rag_chain_lcel.invoke("What is deep learning?")
response

'Deep learning is a subset of machine learning based on artificial neural networks. These networks are inspired by the human brain and consist of layers of interconnected nodes. Deep learning has significantly impacted various fields, including computer vision, natural language processing, and speech recognition.'
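
The `|` operator in the chain above composes steps so that each step's output becomes the next step's input, and the leading dict runs each of its values on the same input to build the prompt variables. The toy `Step` class below mimics that data flow in plain Python. It is a hypothetical stand-in, not LangChain's `Runnable` implementation, and the fake retriever and fake model are placeholders.

```python
class Step:
    """Tiny stand-in for a runnable: wraps a function and supports `|` composition."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # a | b: run a first, then feed its result to b
        return Step(lambda value: other.invoke(self.invoke(value)))


def parallel(mapping):
    """Run every step in the mapping on the same input and collect the results in a dict."""
    return Step(lambda value: {key: step.invoke(value) for key, step in mapping.items()})


fake_retriever = Step(lambda question: ["Deep learning is based on neural networks."])
format_context = Step(lambda docs: "\n\n".join(docs))
passthrough = Step(lambda value: value)
fill_prompt = Step(lambda d: f"Context:\n{d['context']}\n\nQuestion: {d['question']}")
fake_llm = Step(lambda prompt: f"[model saw {len(prompt)} characters]")

prompt_chain = parallel({"context": fake_retriever | format_context, "question": passthrough}) | fill_prompt
toy_chain = prompt_chain | fake_llm

print(prompt_chain.invoke("What is deep learning?"))
print(toy_chain.invoke("What is deep learning?"))
```

The question string reaches both branches of the dict: one branch turns it into retrieved context, the other passes it through unchanged, which mirrors the roles of `retriever | format_docs` and `RunnablePassthrough()` in `rag_chain_lcel`.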

In [41]:
# Use a separate name so the documents loaded in step 2 are not overwritten
retrieved_docs = retriever.invoke("What is deep learning?")
retrieved_docs

[Document(metadata={'source': '/var/folders/6n/4n__nk894pn8cc6by3xc1xr00000gn/T/tmpo73gxp3b/doc_1.txt'}, page_content='Deep Learning and Neural Networks\n\n    Deep learning is a subset of machine learning based on artificial neural networks. \n    These networks are inspired by the human brain and consist of layers of interconnected \n    nodes. Deep learning has revolutionized fields like computer vision, natural language \n    processing, and speech recognition. Convolutional Neural Networks (CNNs) are particularly \n    effective for image processing, while Recurrent Neural Networks (RNNs) and Transformers'),
 Document(metadata={'source': '/var/folders/6n/4n__nk894pn8cc6by3xc1xr00000gn/T/tmpyzdx9kmx/doc_1.txt'}, page_content='Deep Learning and Neural Networks\n\n    Deep learning is a subset of machine learning based on artificial neural networks. \n    These networks are inspired by the human brain and consist of layers of interconnected \n    nodes. Deep learning has revolutioniz

In [40]:
type(retriever)

langchain_core.vectorstores.base.VectorStoreRetriever

In [52]:
# Query using LCEL approach
queries = [
    "What is deep learning?",
    "What is machine learning?",
    "What are the types of machine learning?"
]
contexts = retriever.batch(queries)


responses = rag_chain_lcel.batch(queries)

input_dict = {
    query: (context, response) for query, context, response in zip(queries, contexts, responses)
}

for i, (query, (docs, response)) in enumerate(input_dict.items()):
    print(f"\nQuery {i+1}: {query}")
    print(f"\nRetrieved {len(docs)} documents:")
    for j, con in enumerate(docs):
        print(f"\n  Retrieved Document: {con.page_content}...")
    print(f"\nResponse is:\n {response}")




Query 1: What is deep learning?

Retrieved 3 documents:

  Retrieved Document: Deep Learning and Neural Networks

    Deep learning is a subset of machine learning based on artificial neural networks. 
    These networks are inspired by the human brain and consist of layers of interconnected 
    nodes. Deep learning has revolutionized fields like computer vision, natural language 
    processing, and speech recognition. Convolutional Neural Networks (CNNs) are particularly 
    effective for image processing, while Recurrent Neural Networks (RNNs) and Transformers...

  Retrieved Document: Deep Learning and Neural Networks

    Deep learning is a subset of machine learning based on artificial neural networks. 
    These networks are inspired by the human brain and consist of layers of interconnected 
    nodes. Deep learning has revolutionized fields like computer vision, natural language 
    processing, and speech recognition. Convolutional Neural Networks (CNNs) are particularly 


### 10. Adding a new document to the existing VectorStore

In [53]:
new_document = """ 
Reinforcement learning in detail

Reinforcement Learning (RL) is a type of machine learning where an agent learns to make 
decisions by interacting with an environment through trial and error. The agent takes actions 
in an environment, receives feedback in the form of rewards or penalties, and learns to maximize 
cumulative rewards over time. Think of it like training a dog - you give treats for good behavior 
and discourage bad behavior, and over time the dog learns what actions lead to rewards.
"""

In [54]:
new_document

' \nReinforcement learning in detail\n\nReinforcement Learning (RL) is a type of machine learning where an agent learns to make \ndecisions by interacting with an environment through trial and error. The agent takes actions \nin an environment, receives feedback in the form of rewards or penalties, and learns to maximize \ncumulative rewards over time. Think of it like training a dog - you give treats for good behavior \nand discourage bad behavior, and over time the dog learns what actions lead to rewards.\n'

In [55]:
doc_chunks

[Document(metadata={'source': '/var/folders/6n/4n__nk894pn8cc6by3xc1xr00000gn/T/tmpisxiyz8r/doc_2.txt'}, page_content='Natural Language Processing (NLP)\n\n    NLP is a field of AI that focuses on the interaction between computers and human language. \n    Key tasks in NLP include text classification, named entity recognition, sentiment analysis, \n    machine translation, and question answering. Modern NLP heavily relies on transformer \n    architectures like BERT, GPT, and T5. These models use attention mechanisms to understand \n    context and relationships between words in text.'),
 Document(metadata={'source': '/var/folders/6n/4n__nk894pn8cc6by3xc1xr00000gn/T/tmpisxiyz8r/doc_0.txt'}, page_content='Machine Learning Fundamentals\n\n    Machine learning is a subset of artificial intelligence that enables systems to learn \n    and improve from experience without being explicitly programmed. There are three main \n    types of machine learning: supervised learning, unsupervised lear

In [56]:
new_doc=Document(
    page_content=new_document,
    metadata={"source": "manual_addition", "topic":"reinforcement_learning"}
)
new_doc

Document(metadata={'source': 'manual_addition', 'topic': 'reinforcement_learning'}, page_content=' \nReinforcement learning in detail\n\nReinforcement Learning (RL) is a type of machine learning where an agent learns to make \ndecisions by interacting with an environment through trial and error. The agent takes actions \nin an environment, receives feedback in the form of rewards or penalties, and learns to maximize \ncumulative rewards over time. Think of it like training a dog - you give treats for good behavior \nand discourage bad behavior, and over time the dog learns what actions lead to rewards.\n')

In [58]:
### Create new chunks from the document
new_chunks=text_splitter.split_documents([new_doc])
new_chunks

[Document(metadata={'source': 'manual_addition', 'topic': 'reinforcement_learning'}, page_content='Reinforcement learning in detail\n\nReinforcement Learning (RL) is a type of machine learning where an agent learns to make \ndecisions by interacting with an environment through trial and error. The agent takes actions \nin an environment, receives feedback in the form of rewards or penalties, and learns to maximize \ncumulative rewards over time. Think of it like training a dog - you give treats for good behavior \nand discourage bad behavior, and over time the dog learns what actions lead to'),
 Document(metadata={'source': 'manual_addition', 'topic': 'reinforcement_learning'}, page_content='and over time the dog learns what actions lead to rewards.')]
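
The second chunk above starts with text that also ends the first chunk, which is what a non-zero chunk overlap produces. The sketch below illustrates the idea with a plain fixed-width sliding window. This is a simplification and not the separator-based recursive algorithm that RecursiveCharacterTextSplitter uses; the function name and the sizes are made up for illustration.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-width chunks; neighbours share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


# Illustrative values only; the notebook's splitter settings are defined earlier.
chunks = sliding_window_chunks("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=3)
for chunk in chunks:
    print(chunk)
```

Each chunk repeats the last three characters of the previous one, so a sentence cut at a chunk boundary still appears whole in at least one chunk.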

In [59]:
# Add chunks to vector store
vectorstore.add_documents(new_chunks)

['2b9c6f27-ae0b-4a38-9da2-79e232dc79d4',
 '7346d1a6-9ccd-4628-be74-76df0b9ee514']

In [60]:
print(f"Added {len(new_chunks)} new chunks to vector store.")
print(f"Total vectors now: {vectorstore._collection.count()}")

Added 2 new chunks to vector store.
Total vectors now: 22
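
The IDs returned by add_documents above are random UUIDs, so re-running that cell would most likely store the same chunks again and inflate the count. One possible way to make the step repeatable is to derive IDs from the chunk content. The helper below is a hypothetical sketch (chunk_id is not part of LangChain), and the commented usage assumes that add_documents forwards an ids keyword to the Chroma collection, which is worth checking against the installed version.

```python
import hashlib


def chunk_id(page_content: str, source: str) -> str:
    """Derive a stable ID from a chunk's source and text (hypothetical helper)."""
    payload = f"{source}\x00{page_content}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:32]


# Stand-ins for the (page_content, source) pairs of the chunks created above.
example_chunks = [
    ("Reinforcement learning in detail ...", "manual_addition"),
    ("and over time the dog learns what actions lead to rewards.", "manual_addition"),
]
ids = [chunk_id(text, source) for text, source in example_chunks]
print(ids)

# Possible use in the notebook (assumption: add_documents accepts `ids`):
# ids = [chunk_id(c.page_content, c.metadata["source"]) for c in new_chunks]
# vectorstore.add_documents(new_chunks, ids=ids)
```

Because the same text and source always hash to the same ID, a second run would target the existing entries instead of creating new ones, provided the store treats a repeated ID as an update.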


In [None]:
# Query with the updated vectorstore
new_question="What is reinforcement learning?"
new_ans = rag_chain_lcel.invoke(new_question)
new_ans

'Reinforcement Learning (RL) is a type of machine learning where an agent learns to make decisions by interacting with an environment through trial and error. The agent takes actions within the environment, receives feedback in the form of rewards or penalties, and aims to maximize cumulative rewards over time. The context likens this process to training a dog, where treats are given for good behavior and bad behavior is discouraged, allowing the dog to learn which actions yield positive outcomes.'

### 11. Advanced RAG - Conversational Memory

1. Create a prompt with a chat-history placeholder.
2. Create a history-aware retriever.
3. Create a document chain whose prompt also includes the chat history.
4. Create the conversational RAG chain from history_aware_retriever and history_aware_document_chain.


In [63]:
from langchain_classic.chains import create_history_aware_retriever
from langchain_core.prompts import MessagesPlaceholder
from langchain_core.messages import HumanMessage, AIMessage

In [65]:
## Create a prompt that includes the chat history
contextualize_qa_system_prompt = """Given a chat history and the latest user question, which
might reference context in the chat history, formulate a standalone question that can be understood
without the chat history. Do NOT answer the question; just reformulate it if needed and otherwise
return it as is."""

contextualize_qa_prompt = ChatPromptTemplate.from_messages([
    ("system", contextualize_qa_system_prompt),
    MessagesPlaceholder("chat_history"),
    ("human", "{input}"),
])

In [70]:
## Create a history aware retriever
history_aware_retriever = create_history_aware_retriever(
    llm, retriever, contextualize_qa_prompt)
history_aware_retriever

RunnableBinding(bound=RunnableBranch(branches=[(RunnableLambda(lambda x: not x.get('chat_history', False)), RunnableLambda(lambda x: x['input'])
| VectorStoreRetriever(tags=['Chroma', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.chroma.Chroma object at 0x15670dd90>, search_kwargs={'k': 3}))], default=ChatPromptTemplate(input_variables=['chat_history', 'input'], input_types={'chat_history': list[typing.Annotated[typing.Union[typing.Annotated[langchain_core.messages.ai.AIMessage, Tag(tag='ai')], typing.Annotated[langchain_core.messages.human.HumanMessage, Tag(tag='human')], typing.Annotated[langchain_core.messages.chat.ChatMessage, Tag(tag='chat')], typing.Annotated[langchain_core.messages.system.SystemMessage, Tag(tag='system')], typing.Annotated[langchain_core.messages.function.FunctionMessage, Tag(tag='function')], typing.Annotated[langchain_core.messages.tool.ToolMessage, Tag(tag='tool')], typing.Annotated[langchain_core.messages.ai.AIMessageChunk, Tag(tag='AIMe
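
The repr above shows a RunnableBranch: when chat_history is empty or missing, the input is passed straight to the retriever; otherwise the default path runs the contextualize prompt first. The plain-Python sketch below mirrors that control flow under stated assumptions. The functions fake_reformulate and fake_retrieve are hypothetical stand-ins for the LLM rewrite step and the vector store retriever, not LangChain APIs.

```python
from typing import Callable


def history_aware_retrieve(
    inputs: dict,
    reformulate: Callable[[list, str], str],
    retrieve: Callable[[str], list],
) -> list:
    """Mimic the branch: skip reformulation when there is no chat history."""
    if not inputs.get("chat_history", False):
        # Empty or missing history: the question is already standalone.
        query = inputs["input"]
    else:
        # Otherwise rewrite the question so it no longer depends on earlier turns.
        query = reformulate(inputs["chat_history"], inputs["input"])
    return retrieve(query)


# Hypothetical stand-ins so the sketch runs without an LLM or a vector store.
def fake_reformulate(chat_history: list, question: str) -> str:
    return question.replace("its", "machine learning's")


def fake_retrieve(query: str) -> list:
    return [f"docs for: {query}"]


first = history_aware_retrieve(
    {"chat_history": [], "input": "What is machine learning?"},
    fake_reformulate,
    fake_retrieve,
)
second = history_aware_retrieve(
    {"chat_history": [("human", "What is machine learning?")], "input": "What are its types?"},
    fake_reformulate,
    fake_retrieve,
)
print(first)
print(second)
```

The point of the rewrite step is that a follow-up such as "What are its types?" carries no searchable topic on its own, so retrieval only works once the pronoun is resolved.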

In [72]:
## Create new document chain with history
qa_system_prompt = """You are an assistant for question-answering tasks.
Use the following pieces of retrieved context to answer the question.
If you don't know the answer, just say that you don't know.
Use three sentences maximum and keep the answer concise.

Context: {context}"""

qa_prompt = ChatPromptTemplate.from_messages([
    ('system', qa_system_prompt),
    MessagesPlaceholder("chat_history"),
    ('human', '{input}'),
])

history_aware_document_chain = create_stuff_documents_chain(llm, qa_prompt)

conversation_rag_chain = create_retrieval_chain(
    history_aware_retriever,
    history_aware_document_chain
)
conversation_rag_chain

RunnableBinding(bound=RunnableAssign(mapper={
  context: RunnableBinding(bound=RunnableBranch(branches=[(RunnableLambda(lambda x: not x.get('chat_history', False)), RunnableLambda(lambda x: x['input'])
           | VectorStoreRetriever(tags=['Chroma', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.chroma.Chroma object at 0x15670dd90>, search_kwargs={'k': 3}))], default=ChatPromptTemplate(input_variables=['chat_history', 'input'], input_types={'chat_history': list[typing.Annotated[typing.Union[typing.Annotated[langchain_core.messages.ai.AIMessage, Tag(tag='ai')], typing.Annotated[langchain_core.messages.human.HumanMessage, Tag(tag='human')], typing.Annotated[langchain_core.messages.chat.ChatMessage, Tag(tag='chat')], typing.Annotated[langchain_core.messages.system.SystemMessage, Tag(tag='system')], typing.Annotated[langchain_core.messages.function.FunctionMessage, Tag(tag='function')], typing.Annotated[langchain_core.messages.tool.ToolMessage, Tag(tag='tool')], typin

In [73]:
# Calling the Conversational RAG chain
chat_history = []
# First question
first_answer = conversation_rag_chain.invoke({
    "chat_history": chat_history,
    "input": "What is machine learning?"
})
print(f"Q: What is machine learning?\nA: {first_answer['answer']}")

Q: What is machine learning?
A: Machine learning is a subset of artificial intelligence that allows systems to learn and improve from experience without being explicitly programmed. It includes three main types: supervised learning, unsupervised learning, and reinforcement learning. Each type has different methods for processing data and learning from it.


In [75]:
chat_history

[]

In [76]:
chat_history.extend([
    HumanMessage(content="What is machine learning?"),
    AIMessage(content=first_answer['answer'])
])

In [77]:
chat_history

[HumanMessage(content='What is machine learning?', additional_kwargs={}, response_metadata={}),
 AIMessage(content='Machine learning is a subset of artificial intelligence that allows systems to learn and improve from experience without being explicitly programmed. It includes three main types: supervised learning, unsupervised learning, and reinforcement learning. Each type has different methods for processing data and learning from it.', additional_kwargs={}, response_metadata={})]

In [78]:
second_answer = conversation_rag_chain.invoke({
    "chat_history": chat_history,
    "input": "What are its types?"
})
print(f"\nQ: What are its types?\nA: {second_answer['answer']}")


Q: What are its types?
A: The three main types of machine learning are supervised learning, unsupervised learning, and reinforcement learning. Supervised learning uses labeled data to train models, unsupervised learning identifies patterns in unlabeled data, and reinforcement learning learns through interactions with the environment.
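
The invoke-then-extend pattern used for the two questions above can be wrapped in a small helper so the history is never forgotten between turns. This is a minimal sketch: ask and EchoChain are hypothetical names, and EchoChain only imitates the shape of conversation_rag_chain's result. The helper stores turns as (role, content) tuples, which LangChain generally accepts as message-like input; treat that as an assumption and use HumanMessage and AIMessage, as in the cells above, to match the notebook exactly.

```python
def ask(chain, chat_history: list, question: str) -> str:
    """Invoke the chain, then record the turn so follow-ups can refer back to it."""
    result = chain.invoke({"chat_history": chat_history, "input": question})
    answer = result["answer"]
    chat_history.extend([("human", question), ("ai", answer)])
    return answer


class EchoChain:
    """Hypothetical stand-in for conversation_rag_chain, for illustration only."""

    def invoke(self, inputs: dict) -> dict:
        turns = len(inputs["chat_history"]) // 2
        return {"answer": f"answer to {inputs['input']!r} after {turns} earlier turn(s)"}


history: list = []
print(ask(EchoChain(), history, "What is machine learning?"))
print(ask(EchoChain(), history, "What are its types?"))
```

With the real chain, the call would look like ask(conversation_rag_chain, chat_history, "What are its types?").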


In [79]:
second_answer

{'chat_history': [HumanMessage(content='What is machine learning?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='Machine learning is a subset of artificial intelligence that allows systems to learn and improve from experience without being explicitly programmed. It includes three main types: supervised learning, unsupervised learning, and reinforcement learning. Each type has different methods for processing data and learning from it.', additional_kwargs={}, response_metadata={})],
 'input': 'What are its types?',
 'context': [Document(metadata={'source': '/var/folders/6n/4n__nk894pn8cc6by3xc1xr00000gn/T/tmpo73gxp3b/doc_0.txt'}, page_content='Machine Learning Fundamentals\n\n    Machine learning is a subset of artificial intelligence that enables systems to learn \n    and improve from experience without being explicitly programmed. There are three main \n    types of machine learning: supervised learning, unsupervised learning, and reinforcement \n    learning. Su
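
The result dictionary above carries the retrieved documents under the context key, alongside input, chat_history, and answer. A short sketch of how the sources behind an answer might be listed follows. The helper unique_sources is hypothetical, and the SimpleNamespace objects only imitate the metadata attribute of real Document objects.

```python
from types import SimpleNamespace


def unique_sources(result: dict) -> list[str]:
    """Return the distinct `source` values of the retrieved documents, in retrieval order."""
    seen: list[str] = []
    for doc in result.get("context", []):
        source = doc.metadata.get("source", "unknown")
        if source not in seen:
            seen.append(source)
    return seen


# Hypothetical stand-in for second_answer; real entries are Document objects.
fake_result = {
    "input": "What are its types?",
    "answer": "...",
    "context": [
        SimpleNamespace(metadata={"source": "doc_0.txt"}, page_content="..."),
        SimpleNamespace(metadata={"source": "manual_addition"}, page_content="..."),
        SimpleNamespace(metadata={"source": "doc_0.txt"}, page_content="..."),
    ],
}
print(unique_sources(fake_result))
```

In the notebook the equivalent call would be unique_sources(second_answer).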

### 12. Groq LLM

In [80]:
from langchain_groq import ChatGroq
from langchain.chat_models import init_chat_model


In [81]:
os.environ['GROQ_API_KEY']=os.getenv("GROQ_API_KEY")
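
os.getenv returns None when a variable is unset, and os.environ only accepts string values, so a missing key surfaces as a TypeError rather than a helpful message. A variant that fails early with a clear error could look like the sketch below. The helper require_env and the DEMO_* variable names are made up for illustration.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a descriptive error."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file or export it in the shell")
    return value


# Made-up variable names so the sketch does not depend on real credentials.
os.environ["DEMO_API_KEY"] = "demo-value"
os.environ.pop("DEMO_MISSING_KEY", None)

print(require_env("DEMO_API_KEY"))
try:
    require_env("DEMO_MISSING_KEY")
except RuntimeError as exc:
    print(exc)
```

In the notebook this would be called as require_env("GROQ_API_KEY") after load_dotenv() has run.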

In [82]:
llm2=ChatGroq(model="gemma2-9b-it")
llm2

ChatGroq(profile={'max_input_tokens': 8192, 'max_output_tokens': 8192, 'image_inputs': False, 'audio_inputs': False, 'video_inputs': False, 'image_outputs': False, 'audio_outputs': False, 'video_outputs': False, 'reasoning_output': False, 'tool_calling': True}, client=<groq.resources.chat.completions.Completions object at 0x165cee550>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x158230690>, model_name='gemma2-9b-it', model_kwargs={}, groq_api_key=SecretStr(''))

In [83]:
llm_init=init_chat_model("groq:gemma2-9b-it")
llm_init

ChatGroq(profile={'max_input_tokens': 8192, 'max_output_tokens': 8192, 'image_inputs': False, 'audio_inputs': False, 'video_inputs': False, 'image_outputs': False, 'audio_outputs': False, 'video_outputs': False, 'reasoning_output': False, 'tool_calling': True}, client=<groq.resources.chat.completions.Completions object at 0x158212450>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x158212b50>, model_name='gemma2-9b-it', model_kwargs={}, groq_api_key=SecretStr(''))