## Building RAG System with Langchain and ChromaDB
### Langchain + ChromaDB + OpenAI (for embeddings)

In [3]:
import os
from dotenv import load_dotenv
load_dotenv()

True

In [4]:
# Lanchain Import
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import TextLoader
from langchain_openai import OpenAIEmbeddings
from langchain.schema import Document

## Vectorstores
from langchain_community.vectorstores import Chroma

## Utility imports
import numpy as np
from typing import List



In [5]:
# RAG Architecture Overview
print("""
RAG (Retrieval-Augmented Generation) Architecture:

1. Document Loading: Load documents from various sources
2. Document Splitting: Break documents into smaller chunks
3. Embedding Generation: Convert chunks into vector representations
4. Vector Storage: Store embeddings in ChromaDB
5. Query Processing: Convert user query to embedding
6. Similarity Search: Find relevant chunks from vector store
7. Context Augmentation: Combine retrieved chunks with query
8. Response Generation: LLM generates answer using context

Benefits of RAG:
- Reduces hallucinations
- Provides up-to-date information
- Allows citing sources
- Works with domain-specific knowledge
""")


RAG (Retrieval-Augmented Generation) Architecture:

1. Document Loading: Load documents from various sources
2. Document Splitting: Break documents into smaller chunks
3. Embedding Generation: Convert chunks into vector representations
4. Vector Storage: Store embeddings in ChromaDB
5. Query Processing: Convert user query to embedding
6. Similarity Search: Find relevant chunks from vector store
7. Context Augmentation: Combine retrieved chunks with query
8. Response Generation: LLM generates answer using context

Benefits of RAG:
- Reduces hallucinations
- Provides up-to-date information
- Allows citing sources
- Works with domain-specific knowledge



### 1. Sample Data

In [6]:
## create sample documents
sample_docs = [
    """
    Machine Learning Fundamentals
    
    Machine learning is a subset of artificial intelligence that enables systems to learn 
    and improve from experience without being explicitly programmed. There are three main 
    types of machine learning: supervised learning, unsupervised learning, and reinforcement 
    learning. Supervised learning uses labeled data to train models, while unsupervised 
    learning finds patterns in unlabeled data. Reinforcement learning learns through 
    interaction with an environment using rewards and penalties.
    """,
    
    """
    Deep Learning and Neural Networks
    
    Deep learning is a subset of machine learning based on artificial neural networks. 
    These networks are inspired by the human brain and consist of layers of interconnected 
    nodes. Deep learning has revolutionized fields like computer vision, natural language 
    processing, and speech recognition. Convolutional Neural Networks (CNNs) are particularly 
    effective for image processing, while Recurrent Neural Networks (RNNs) and Transformers 
    excel at sequential data processing.
    """,
    
    """
    Natural Language Processing (NLP)
    
    NLP is a field of AI that focuses on the interaction between computers and human language. 
    Key tasks in NLP include text classification, named entity recognition, sentiment analysis, 
    machine translation, and question answering. Modern NLP heavily relies on transformer 
    architectures like BERT, GPT, and T5. These models use attention mechanisms to understand 
    context and relationships between words in text.
    """
]

sample_docs


['\n    Machine Learning Fundamentals\n\n    Machine learning is a subset of artificial intelligence that enables systems to learn \n    and improve from experience without being explicitly programmed. There are three main \n    types of machine learning: supervised learning, unsupervised learning, and reinforcement \n    learning. Supervised learning uses labeled data to train models, while unsupervised \n    learning finds patterns in unlabeled data. Reinforcement learning learns through \n    interaction with an environment using rewards and penalties.\n    ',
 '\n    Deep Learning and Neural Networks\n\n    Deep learning is a subset of machine learning based on artificial neural networks. \n    These networks are inspired by the human brain and consist of layers of interconnected \n    nodes. Deep learning has revolutionized fields like computer vision, natural language \n    processing, and speech recognition. Convolutional Neural Networks (CNNs) are particularly \n    effective f

In [7]:
## save sample documents to files
import tempfile
temp_dir=tempfile.mkdtemp()

for i,doc in enumerate(sample_docs):
    with open(f"{temp_dir}/doc_{i}.txt","w") as f:
        f.write(doc)

print(f"Sample document created in : {temp_dir}")

Sample document created in : /var/folders/qx/xqwt0y8n3vz4ycdfclqmx6zr0000gn/T/tmpt3mqr2pc


In [8]:
## save sample documents to files
import tempfile
temp_dir=tempfile.mkdtemp()

for i,doc in enumerate(sample_docs):
    with open(f"doc_{i+1}.txt","w") as f:
        f.write(doc)

In [9]:
temp_dir

'/var/folders/qx/xqwt0y8n3vz4ycdfclqmx6zr0000gn/T/tmphfdrarf2'

### 2.Document loading

In [10]:
# from langchain_community.document_loaders import DirectoryLoader,TextLoader
# from langchain_community

# #Load documents from the directory
# loader = DirectoryLoader(
#     "temp_dir",
#     glob="*.txt",
#     loader_cls=TextLoader,
#     loader_kwargs={"encoding": "utf-8"}
# )
# documents = loader.load()

# print(f"Loaded {len(documents)} documents")
# print(f"\nFirst document preview:")
# print(documents[0].page_content[:150]+ "...")

from langchain_community.document_loaders import DirectoryLoader,TextLoader

# Load documents from directory
loader = DirectoryLoader(
    "/Users/jamadagnikotamsetty/MyCode/RAG_Pipelines/2-Vector-Stores", 
    glob="*.txt", 
    loader_cls=TextLoader,
    loader_kwargs={'encoding': 'utf-8'}
)
documents = loader.load()

print(f"Loaded {len(documents)} documents")
print(f"\nFirst document preview:")
print(documents[0].page_content[:200] + "...")

Loaded 3 documents

First document preview:

    Natural Language Processing (NLP)

    NLP is a field of AI that focuses on the interaction between computers and human language. 
    Key tasks in NLP include text classification, named entity r...


### Document Splitting

In [11]:
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=50,
    separators=[" "]
)
chunks = text_splitter.split_documents(documents)

print(f"Created {len(chunks)} chunks from documents ")
print(f"\n Chunk example:")
print(f"Content : {chunks[0].page_content[:150]}")
print(f"Metadata : {chunks[0].metadata}")

Created 5 chunks from documents 

 Chunk example:
Content : Natural Language Processing (NLP)

    NLP is a field of AI that focuses on the interaction between computers and human language. 
    Key tasks in NL
Metadata : {'source': '/Users/jamadagnikotamsetty/MyCode/RAG_Pipelines/2-Vector-Stores/doc_3.txt'}


### Embedding Models 

In [12]:
os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY")

In [13]:
sample_text = "Neural Networks is fascinating"
embeddings = OpenAIEmbeddings()
embeddings

OpenAIEmbeddings(client=<openai.resources.embeddings.Embeddings object at 0x10fd6da90>, async_client=<openai.resources.embeddings.AsyncEmbeddings object at 0x10fd6e3c0>, model='text-embedding-ada-002', dimensions=None, deployment='text-embedding-ada-002', openai_api_version=None, openai_api_base=None, openai_api_type=None, openai_proxy=None, embedding_ctx_length=8191, openai_api_key=SecretStr('**********'), openai_organization=None, allowed_special=None, disallowed_special=None, chunk_size=1000, max_retries=2, request_timeout=None, headers=None, tiktoken_enabled=True, tiktoken_model_name=None, show_progress_bar=False, model_kwargs={}, skip_empty=False, default_headers=None, default_query=None, retry_min_seconds=4, retry_max_seconds=20, http_client=None, http_async_client=None, check_embedding_ctx_length=True)

In [14]:
vector = embeddings.embed_query(sample_text)
vector

[-0.027567453682422638,
 0.0169797632843256,
 0.005550317466259003,
 -0.014914833940565586,
 0.011758255772292614,
 0.0018281851662322879,
 0.007365350145846605,
 -0.01175167877227068,
 -0.013967860490083694,
 -0.04093030467629433,
 -0.004044366534799337,
 0.03640587255358696,
 -0.009239568375051022,
 0.0032157644163817167,
 0.004175890702754259,
 0.008844996802508831,
 0.04661214351654053,
 0.019163062795996666,
 -0.0043074144050478935,
 -0.002033691620454192,
 -0.03438040241599083,
 0.021162228658795357,
 -0.020294170826673508,
 -0.03427518159151077,
 -0.01010105200111866,
 0.01772945001721382,
 0.005247811786830425,
 -0.03206557780504227,
 -0.009377668611705303,
 -0.008923910558223724,
 0.02833029255270958,
 0.006865558680146933,
 -0.013461492955684662,
 -0.031776223331689835,
 0.003991756588220596,
 -0.012731533497571945,
 -0.006937896832823753,
 -0.01775575429201126,
 -0.007463993038982153,
 0.0169797632843256,
 0.02543676272034645,
 0.018505442887544632,
 -0.015217339619994164,
 

### Initialize the ChromaDB Vector Store and store the chunks in the Vector Representation

In [15]:
## Create a chromadb vectore store
persist_directory = './chroma_db'

# Initilize ChromaDB with OpenAI Embeddings
vectorstore = Chroma.from_documents(
    documents = chunks,
    embedding = OpenAIEmbeddings(),
    persist_directory=persist_directory,
    collection_name="rag_collection"

)
print(f"Vector Store created with {vectorstore._collection.count()} vectors")
print(f"Persisted to : {persist_directory}")


Vector Store created with 73 vectors
Persisted to : ./chroma_db


### Test Similarity Search

In [16]:
query = "What are the types of machine learning?"
similar_docs = vectorstore.similarity_search(query, k=3)
similar_docs

[Document(metadata={'source': '/Users/jamadagnikotamsetty/MyCode/RAG_Pipelines/2-Vector-Stores/doc_0.txt'}, page_content='Machine Learning Fundamentals\n\n    Machine learning is a subset of artificial intelligence that enables systems to learn \n    and improve from experience without being explicitly programmed. There are three main \n    types of machine learning: supervised learning, unsupervised learning, and reinforcement \n    learning. Supervised learning uses labeled data to train models, while unsupervised \n    learning finds patterns in unlabeled data. Reinforcement learning learns through'),
 Document(metadata={'source': '/Users/jamadagnikotamsetty/MyCode/RAG_Pipelines/2-Vector-Stores/doc_1.txt'}, page_content='Machine Learning Fundamentals\n\n    Machine learning is a subset of artificial intelligence that enables systems to learn \n    and improve from experience without being explicitly programmed. There are three main \n    types of machine learning: supervised learnin

In [17]:
query = "What is Deep Learning?"
similar_docs = vectorstore.similarity_search(query, k=3)
similar_docs

[Document(metadata={'source': '/Users/jamadagnikotamsetty/MyCode/RAG_Pipelines/2-Vector-Stores/doc_1.txt'}, page_content='Deep Learning and Neural Networks\n\n    Deep learning is a subset of machine learning based on artificial neural networks. \n    These networks are inspired by the human brain and consist of layers of interconnected \n    nodes. Deep learning has revolutionized fields like computer vision, natural language \n    processing, and speech recognition. Convolutional Neural Networks (CNNs) are particularly \n    effective for image processing, while Recurrent Neural Networks (RNNs) and Transformers'),
 Document(metadata={'source': '/Users/jamadagnikotamsetty/MyCode/RAG_Pipelines/2-Vector-Stores/doc_2.txt'}, page_content='Deep Learning and Neural Networks\n\n    Deep learning is a subset of machine learning based on artificial neural networks. \n    These networks are inspired by the human brain and consist of layers of interconnected \n    nodes. Deep learning has revolu

In [18]:
query = "What is NLP?"
similar_docs = vectorstore.similarity_search(query, k=3)
similar_docs

[Document(metadata={'source': '/Users/jamadagnikotamsetty/MyCode/RAG_Pipelines/2-Vector-Stores/doc_2.txt'}, page_content='Natural Language Processing (NLP)\n\n    NLP is a field of AI that focuses on the interaction between computers and human language. \n    Key tasks in NLP include text classification, named entity recognition, sentiment analysis, \n    machine translation, and question answering. Modern NLP heavily relies on transformer \n    architectures like BERT, GPT, and T5. These models use attention mechanisms to understand \n    context and relationships between words in text.'),
 Document(metadata={'source': '/Users/jamadagnikotamsetty/MyCode/RAG_Pipelines/2-Vector-Stores/doc_3.txt'}, page_content='Natural Language Processing (NLP)\n\n    NLP is a field of AI that focuses on the interaction between computers and human language. \n    Key tasks in NLP include text classification, named entity recognition, sentiment analysis, \n    machine translation, and question answering.

In [19]:
print(f"Query : {query}")
print(f"\n Top {len(similar_docs)} similar chunks :")
for i, doc in enumerate(similar_docs):
    print(f"\n--- Chunk {i+1} ---")
    print(doc.page_content[:200] + "...")
    print(f"Source : {doc.metadata.get('source','Unknown')}")

Query : What is NLP?

 Top 3 similar chunks :

--- Chunk 1 ---
Natural Language Processing (NLP)

    NLP is a field of AI that focuses on the interaction between computers and human language. 
    Key tasks in NLP include text classification, named entity recogn...
Source : /Users/jamadagnikotamsetty/MyCode/RAG_Pipelines/2-Vector-Stores/doc_2.txt

--- Chunk 2 ---
Natural Language Processing (NLP)

    NLP is a field of AI that focuses on the interaction between computers and human language. 
    Key tasks in NLP include text classification, named entity recogn...
Source : /Users/jamadagnikotamsetty/MyCode/RAG_Pipelines/2-Vector-Stores/doc_3.txt

--- Chunk 3 ---
Natural Language Processing (NLP)

    NLP is a field of AI that focuses on the interaction between computers and human language. 
    Key tasks in NLP include text classification, named entity recogn...
Source : /Users/jamadagnikotamsetty/MyCode/RAG_Pipelines/2-Vector-Stores/doc_3.txt


## Advanced Similarity Search with Scores

In [20]:
results_scores = vectorstore.similarity_search_with_score(query, k=3)
results_scores

[(Document(metadata={'source': '/Users/jamadagnikotamsetty/MyCode/RAG_Pipelines/2-Vector-Stores/doc_2.txt'}, page_content='Natural Language Processing (NLP)\n\n    NLP is a field of AI that focuses on the interaction between computers and human language. \n    Key tasks in NLP include text classification, named entity recognition, sentiment analysis, \n    machine translation, and question answering. Modern NLP heavily relies on transformer \n    architectures like BERT, GPT, and T5. These models use attention mechanisms to understand \n    context and relationships between words in text.'),
  0.20716720819473267),
 (Document(metadata={'source': '/Users/jamadagnikotamsetty/MyCode/RAG_Pipelines/2-Vector-Stores/doc_3.txt'}, page_content='Natural Language Processing (NLP)\n\n    NLP is a field of AI that focuses on the interaction between computers and human language. \n    Key tasks in NLP include text classification, named entity recognition, sentiment analysis, \n    machine translatio

## Initialize LLM, RAG Chain, Prompt template, Query the RAG system

In [21]:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.2, max_tokens = 500)



In [22]:
test_response = llm.predict("What is Large Language Model?")
test_response

  test_response = llm.predict("What is Large Language Model?")


"A Large Language Model (LLM) is a type of artificial intelligence model that is trained on vast amounts of text data to understand and generate human language. These models use deep learning techniques, such as neural networks, to process and generate text in a way that mimics human language patterns and structures. LLMs have been used in a variety of natural language processing tasks, such as language translation, text generation, and sentiment analysis. Some well-known examples of LLMs include OpenAI's GPT-3 and Google's BERT."

In [23]:
from langchain.chat_models.base import init_chat_model
llm = init_chat_model("openai:gpt-3.5-turbo")



In [24]:
llm.invoke("What is AI ?")

AIMessage(content='AI stands for Artificial Intelligence, which refers to the simulation of human intelligence in machines that are programmed to think and act like humans. AI encompasses tasks such as learning, reasoning, problem-solving, perception, and language understanding. It is a rapidly advancing field that has the potential to revolutionize many aspects of society and industry.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 64, 'prompt_tokens': 11, 'total_tokens': 75, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'id': 'chatcmpl-CK0O45y4b9aTbBWVKVvQQEt7Y1ayt', 'service_tier': 'default', 'finish_reason': 'stop', 'logprobs': None}, id='run--50ac61dd-1c7d-4b37-830d-4e2ccc2ad11c-0', usage_metadata={'input_tokens':

### Modern RAG Chain

In [25]:
from langchain.chains import create_retrieval_chain
from langchain_core.prompts import ChatPromptTemplate
from langchain.chains.combine_documents import create_stuff_documents_chain


In [26]:
## Convert vectore store to retriever 
retriever = vectorstore.as_retriever(
    searhc_kwargs={"k":3} # Retrieve top 3 relevant chunks
)
retriever

VectorStoreRetriever(tags=['Chroma', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.chroma.Chroma object at 0x10fd6e900>, search_kwargs={})

In [27]:
## Create a prompt template
system_prompt ="""You are an assistant for question-answering tasks. 
Use the following pieces of retrieved context to answer the question. 
If you don't know the answer, just say that you don't know. 
Use three sentences maximum and keep the answer concise.

Context: {context}"""

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}")
    ]
)

In [28]:
#prompt
print(prompt)

input_variables=['context', 'input'] input_types={} partial_variables={} messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context'], input_types={}, partial_variables={}, template="You are an assistant for question-answering tasks. \nUse the following pieces of retrieved context to answer the question. \nIf you don't know the answer, just say that you don't know. \nUse three sentences maximum and keep the answer concise.\n\nContext: {context}"), additional_kwargs={}), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], input_types={}, partial_variables={}, template='{input}'), additional_kwargs={})]


In [29]:
# Create a document chain
from langchain.chains.combine_documents import create_stuff_documents_chain
documents_chain = create_stuff_documents_chain(llm,prompt)
documents_chain

RunnableBinding(bound=RunnableBinding(bound=RunnableAssign(mapper={
  context: RunnableLambda(format_docs)
}), kwargs={}, config={'run_name': 'format_inputs'}, config_factories=[])
| ChatPromptTemplate(input_variables=['context', 'input'], input_types={}, partial_variables={}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context'], input_types={}, partial_variables={}, template="You are an assistant for question-answering tasks. \nUse the following pieces of retrieved context to answer the question. \nIf you don't know the answer, just say that you don't know. \nUse three sentences maximum and keep the answer concise.\n\nContext: {context}"), additional_kwargs={}), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], input_types={}, partial_variables={}, template='{input}'), additional_kwargs={})])
| ChatOpenAI(client=<openai.resources.chat.completions.completions.Completions object at 0x11c996c10>, async_client=<openai.resources.

##### What is create_stuff_documents_chain?
create_stuff_documents_chain creates a chain that "stuffs" (inserts) all retrieved documents into a single prompt and sends it to the LLM. It's called "stuff" because it literally stuffs all the documents into the context window at once.

This chain:

- Takes retrieved documents
- "Stuffs" them into the prompt's {context} placeholder
- Sends the complete prompt to the LLM
- Returns the LLM's response

In [30]:
from langchain.chains import create_retrieval_chain
rag_chain = create_retrieval_chain(retriever,documents_chain)
rag_chain

RunnableBinding(bound=RunnableAssign(mapper={
  context: RunnableBinding(bound=RunnableLambda(lambda x: x['input'])
           | VectorStoreRetriever(tags=['Chroma', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.chroma.Chroma object at 0x10fd6e900>, search_kwargs={}), kwargs={}, config={'run_name': 'retrieve_documents'}, config_factories=[])
})
| RunnableAssign(mapper={
    answer: RunnableBinding(bound=RunnableBinding(bound=RunnableAssign(mapper={
              context: RunnableLambda(format_docs)
            }), kwargs={}, config={'run_name': 'format_inputs'}, config_factories=[])
            | ChatPromptTemplate(input_variables=['context', 'input'], input_types={}, partial_variables={}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context'], input_types={}, partial_variables={}, template="You are an assistant for question-answering tasks. \nUse the following pieces of retrieved context to answer the question. \nIf you don't know 

In [31]:
rag_chain.invoke({"input":"What is Data Science?"})

{'input': 'What is Data Science?',
 'context': [Document(metadata={'source': '/Users/jamadagnikotamsetty/MyCode/RAG_Pipelines/2-Vector-Stores/doc_1.txt'}, page_content='Deep Learning and Neural Networks\n\n    Deep learning is a subset of machine learning based on artificial neural networks. \n    These networks are inspired by the human brain and consist of layers of interconnected \n    nodes. Deep learning has revolutionized fields like computer vision, natural language \n    processing, and speech recognition. Convolutional Neural Networks (CNNs) are particularly \n    effective for image processing, while Recurrent Neural Networks (RNNs) and Transformers'),
  Document(metadata={'source': '/Users/jamadagnikotamsetty/MyCode/RAG_Pipelines/2-Vector-Stores/doc_2.txt'}, page_content='Deep Learning and Neural Networks\n\n    Deep learning is a subset of machine learning based on artificial neural networks. \n    These networks are inspired by the human brain and consist of layers of inte

Create RAG Chain Alternative - Using LCEL(Langchain expression Language)

In [32]:
# even more flexible approach using LCEL 
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough,RunnableSequence,RunnableParallel
from typing import Any
from pydantic import BaseModel, ImportString, ValidationError

In [33]:
# Create a custom prompt
custom_prompt = ChatPromptTemplate.from_template("""Use the following context to answer the question. 
If you don't know the answer based on the context, say you don't know.
Provide specific details from the context to support your answer.

Context:
{context}

Question: {question}

Answer:""")
custom_prompt


ChatPromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, template="Use the following context to answer the question. \nIf you don't know the answer based on the context, say you don't know.\nProvide specific details from the context to support your answer.\n\nContext:\n{context}\n\nQuestion: {question}\n\nAnswer:"), additional_kwargs={})])

In [34]:
retriever

VectorStoreRetriever(tags=['Chroma', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.chroma.Chroma object at 0x10fd6e900>, search_kwargs={})

In [35]:
def format_docs(docs):
    return "\n\n" .join(doc.page_content for doc in docs)

In [36]:
## Build the chain using LCEL
rag_chain_lcel = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | custom_prompt
    | llm
    | StrOutputParser()
)

rag_chain_lcel

{
  context: VectorStoreRetriever(tags=['Chroma', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.chroma.Chroma object at 0x10fd6e900>, search_kwargs={})
           | RunnableLambda(format_docs),
  question: RunnablePassthrough()
}
| ChatPromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, template="Use the following context to answer the question. \nIf you don't know the answer based on the context, say you don't know.\nProvide specific details from the context to support your answer.\n\nContext:\n{context}\n\nQuestion: {question}\n\nAnswer:"), additional_kwargs={})])
| ChatOpenAI(client=<openai.resources.chat.completions.completions.Completions object at 0x11c996c10>, async_client=<openai.resources.chat.completions.completions.AsyncCompletions object at 0x11c996990>, root_client=<openai.Op

In [37]:
response = rag_chain_lcel.invoke("What is Deep Learning ?")
response

'Deep Learning is a subset of machine learning based on artificial neural networks. It is inspired by the human brain and consists of layers of interconnected nodes. Deep learning has revolutionized fields like computer vision, natural language processing, and speech recognition.'

In [38]:
# Query using the LCEL approach - Fixed Version
def query_rag_lcel(question):
    print(f"Question : {question}")
    print("-" * 50)

    # Method 1 Pass String directly (when using RunnablePassthrough)
    answer = rag_chain_lcel.invoke(question)
    print(f"Answer : {answer}")

    # Get source documents seperately if needed
    docs = retriever.get_relevant_documents(question)
    print("\n Source Documents :")
    for i, doc in enumerate(docs):
        print(f"\n --- Source {i+1} ---")
        print(doc.page_content[:200] + "...")

In [39]:
print("testing LCEL Chain:")
query_rag_lcel("What are the key concepts in reinforcement learning?")

testing LCEL Chain:
Question : What are the key concepts in reinforcement learning?
--------------------------------------------------
Answer : The key concepts in reinforcement learning are states, actions, rewards, policies, and value functions.


  docs = retriever.get_relevant_documents(question)



 Source Documents :

 --- Source 1 ---
Reinforcement Learning in Detail

Reinforcement learning (RL) is a type of machine learning where an agent learns to make 
decisions by interacting with an environment. The agent receives rewards or p...

 --- Source 2 ---
Reinforcement Learning in Detail

Reinforcement learning (RL) is a type of machine learning where an agent learns to make 
decisions by interacting with an environment. The agent receives rewards or p...

 --- Source 3 ---
Reinforcement Learning in Detail

Reinforcement learning (RL) is a type of machine learning where an agent learns to make 
decisions by interacting with an environment. The agent receives rewards or p...

 --- Source 4 ---
Reinforcement Learning in Detail

Reinforcement learning (RL) is a type of machine learning where an agent learns to make 
decisions by interacting with an environment. The agent receives rewards or p...


### Add New Documents to Existing Vectore Store

In [40]:
vectorstore

<langchain_community.vectorstores.chroma.Chroma at 0x10fd6e900>

In [41]:
## Add new documents to existing vectore store
new_document = """
Reinforcement Learning in Detail

Reinforcement learning (RL) is a type of machine learning where an agent learns to make 
decisions by interacting with an environment. The agent receives rewards or penalties 
based on its actions and learns to maximize cumulative reward over time. Key concepts 
in RL include: states, actions, rewards, policies, and value functions. Popular RL 
algorithms include Q-learning, Deep Q-Networks (DQN), Policy Gradient methods, and 
Actor-Critic methods. RL has been successfully applied to game playing (like AlphaGo), 
robotics, and autonomous systems.
"""

In [42]:
new_doc = Document(
    page_content=new_document,
    metadata = {"source":"manual_addition","topic":"Reinforcement Learning"}
    
)

In [43]:
new_doc

Document(metadata={'source': 'manual_addition', 'topic': 'Reinforcement Learning'}, page_content='\nReinforcement Learning in Detail\n\nReinforcement learning (RL) is a type of machine learning where an agent learns to make \ndecisions by interacting with an environment. The agent receives rewards or penalties \nbased on its actions and learns to maximize cumulative reward over time. Key concepts \nin RL include: states, actions, rewards, policies, and value functions. Popular RL \nalgorithms include Q-learning, Deep Q-Networks (DQN), Policy Gradient methods, and \nActor-Critic methods. RL has been successfully applied to game playing (like AlphaGo), \nrobotics, and autonomous systems.\n')

In [44]:
# Split the document
new_chunks = text_splitter.split_documents([new_doc])
new_chunks

[Document(metadata={'source': 'manual_addition', 'topic': 'Reinforcement Learning'}, page_content='Reinforcement Learning in Detail\n\nReinforcement learning (RL) is a type of machine learning where an agent learns to make \ndecisions by interacting with an environment. The agent receives rewards or penalties \nbased on its actions and learns to maximize cumulative reward over time. Key concepts \nin RL include: states, actions, rewards, policies, and value functions. Popular RL \nalgorithms include Q-learning, Deep Q-Networks (DQN), Policy Gradient methods, and \nActor-Critic methods. RL has been'),
 Document(metadata={'source': 'manual_addition', 'topic': 'Reinforcement Learning'}, page_content='methods, and \nActor-Critic methods. RL has been successfully applied to game playing (like AlphaGo), \nrobotics, and autonomous systems.')]

In [45]:
## Add new documents to the existing vectore store
vectorstore.add_documents(new_chunks)
#vectorstore

['bf87beb9-d7ef-4e5d-9903-4a5feff04010',
 '6e1213ff-5839-4268-bb7b-708c63c8b044']

In [46]:
print(f" Added {len(new_chunks)} new chunks to the vector store.")
print(f" Total vectors now: {vectorstore._collection.count()}")

 Added 2 new chunks to the vector store.
 Total vectors now: 75


In [47]:
# Query with the updated vector
new_question = " What are the key concepts in reinforcement learning?"
result = query_rag_lcel(new_question)
result

Question :  What are the key concepts in reinforcement learning?
--------------------------------------------------
Answer : The key concepts in reinforcement learning are states, actions, rewards, policies, and value functions.

 Source Documents :

 --- Source 1 ---
Reinforcement Learning in Detail

Reinforcement learning (RL) is a type of machine learning where an agent learns to make 
decisions by interacting with an environment. The agent receives rewards or p...

 --- Source 2 ---
Reinforcement Learning in Detail

Reinforcement learning (RL) is a type of machine learning where an agent learns to make 
decisions by interacting with an environment. The agent receives rewards or p...

 --- Source 3 ---
Reinforcement Learning in Detail

Reinforcement learning (RL) is a type of machine learning where an agent learns to make 
decisions by interacting with an environment. The agent receives rewards or p...

 --- Source 4 ---
Reinforcement Learning in Detail

Reinforcement learning (RL) 

### Advanced RAG Techniques Conversational Memory

- create_history_aware_retriever: Makes the retriever understand conversation context
- MessagesPlaceholder: Placeholder for chat history in prompts
- HumanMessage/AIMessage: Structured message types for conversation history

In [48]:
from langchain.chains import create_history_aware_retriever
from langchain_core.prompts import MessagesPlaceholder
from langchain_core.messages import HumanMessage, AIMessage

In [49]:
## Create a prompt that includes the chat history
contextualize_q_system_prompt = """Given a chat history and the latest user question 
which might reference context in the chat history, formulate a standalone question 
which can be understood without the chat history. Do NOT answer the question, 
just reformulate it if needed and otherwise return it as is."""

In [50]:
contextualize_q_prompt = ChatPromptTemplate.from_messages([
    ("system", contextualize_q_system_prompt),
    (MessagesPlaceholder("chat_history")),
    ("human", "{input}")
    
])

In [51]:
## Create a history-aware retriever
history_aware_retriever = create_history_aware_retriever(
    llm,retriever,contextualize_q_prompt
)
history_aware_retriever

RunnableBinding(bound=RunnableBranch(branches=[(RunnableLambda(lambda x: not x.get('chat_history', False)), RunnableLambda(lambda x: x['input'])
| VectorStoreRetriever(tags=['Chroma', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.chroma.Chroma object at 0x10fd6e900>, search_kwargs={}))], default=ChatPromptTemplate(input_variables=['chat_history', 'input'], input_types={'chat_history': list[typing.Annotated[typing.Union[typing.Annotated[langchain_core.messages.ai.AIMessage, Tag(tag='ai')], typing.Annotated[langchain_core.messages.human.HumanMessage, Tag(tag='human')], typing.Annotated[langchain_core.messages.chat.ChatMessage, Tag(tag='chat')], typing.Annotated[langchain_core.messages.system.SystemMessage, Tag(tag='system')], typing.Annotated[langchain_core.messages.function.FunctionMessage, Tag(tag='function')], typing.Annotated[langchain_core.messages.tool.ToolMessage, Tag(tag='tool')], typing.Annotated[langchain_core.messages.ai.AIMessageChunk, Tag(tag='AIMessageC

In [52]:
# Create a new document chain with history
qa_system_prompt = """You are an assistant for question-answering tasks. 
Use the following pieces of retrieved context to answer the question. 
If you don't know the answer, just say that you don't know. 
Use three sentences maximum and keep the answer concise.

Context: {context}"""

qa_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", qa_system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human","{input}")
    ]
)

question_answer_chain = create_stuff_documents_chain(llm,qa_prompt)

#Create conversational RAG Chain
conversational_rag_chain = create_retrieval_chain(
    history_aware_retriever,
    question_answer_chain
)

print("Conversational RAG Chain is created ")

Conversational RAG Chain is created 


In [53]:
chat_history = []
# First Question
result1 = conversational_rag_chain.invoke({
    "chat_history" : chat_history,
    "input" : "What is machine learning ?"
})

print(f"Q : What is machine learning ?")
print(f"A : {result1['answer']}")

Q : What is machine learning ?
A : Machine learning is a subset of artificial intelligence that allows systems to learn and improve from experience without being explicitly programmed. It involves three main types: supervised learning, unsupervised learning, and reinforcement learning. Supervised learning uses labeled data, unsupervised learning finds patterns in unlabeled data, and reinforcement learning learns through experimentation and feedback.


## Append the result to the chat history list 

In [54]:
chat_history

[]

In [55]:
chat_history.extend([
    HumanMessage(content= "What is Machine Learning ?"),
    AIMessage(content=result1['answer'])
])

In [56]:
chat_history

[HumanMessage(content='What is Machine Learning ?', additional_kwargs={}, response_metadata={}),
 AIMessage(content='Machine learning is a subset of artificial intelligence that allows systems to learn and improve from experience without being explicitly programmed. It involves three main types: supervised learning, unsupervised learning, and reinforcement learning. Supervised learning uses labeled data, unsupervised learning finds patterns in unlabeled data, and reinforcement learning learns through experimentation and feedback.', additional_kwargs={}, response_metadata={})]

In [57]:
result2 = conversational_rag_chain.invoke({
    "chat_history" : chat_history,
    "input" : "What are it's main types ?"
})

result2

{'chat_history': [HumanMessage(content='What is Machine Learning ?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='Machine learning is a subset of artificial intelligence that allows systems to learn and improve from experience without being explicitly programmed. It involves three main types: supervised learning, unsupervised learning, and reinforcement learning. Supervised learning uses labeled data, unsupervised learning finds patterns in unlabeled data, and reinforcement learning learns through experimentation and feedback.', additional_kwargs={}, response_metadata={})],
 'input': "What are it's main types ?",
 'context': [Document(metadata={'source': '/Users/jamadagnikotamsetty/MyCode/RAG_Pipelines/2-Vector-Stores/doc_0.txt'}, page_content='Machine Learning Fundamentals\n\n    Machine learning is a subset of artificial intelligence that enables systems to learn \n    and improve from experience without being explicitly programmed. There are three main \n    typ

In [58]:
result2['answer']

'The main types of machine learning are supervised learning, unsupervised learning, and reinforcement learning. Supervised learning uses labeled data to train models, unsupervised learning finds patterns in unlabeled data, and reinforcement learning learns through experimentation and rewards.'

### Using Groq

In [59]:
load_dotenv()

True

In [None]:
#os.environ["GROQ_API_KEY"] = os.getenv("GROQ_API_KEY")
#os.environ

In [60]:
from langchain_groq import ChatGroq
from langchain.chat_models import init_chat_model

In [61]:
os.environ["GROQ_API_KEY"] = os.getenv("GROQ_API_KEY")
#os.environ

In [64]:
llm = ChatGroq(model="gemma2-9b-it",api_key=os.getenv("GROQ_API_KEY"))
llm

ChatGroq(client=<groq.resources.chat.completions.Completions object at 0x11cfaead0>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x11cfaefd0>, model_name='gemma2-9b-it', model_kwargs={}, groq_api_key=SecretStr('**********'))

In [65]:
llm=init_chat_model(model="groq:gemma2-9b-it")
llm

ChatGroq(client=<groq.resources.chat.completions.Completions object at 0x11d2e43e0>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x11d2e48a0>, model_name='gemma2-9b-it', model_kwargs={}, groq_api_key=SecretStr('**********'))