# Building a RAG system with OpenAI and ChromaDB

#### Introduction
Retrieval-Augmented Generation (RAG) is a powerful technique that combines the capabilities of large language models with external knowledge retrieval. This notebook will walk you through building a complete RAG system using:

- LangChain: A framework for developing applications powered by language models
- ChromaDB: An open-source vector database for storing and retrieving embeddings
- OpenAI: For embeddings and language model (you can substitute with other providers)

In [9]:
import os
from dotenv import load_dotenv

load_dotenv()

# langchain imports
from langchain_openai import OpenAIEmbeddings
from langchain_community.document_loaders import (
    PyPDFLoader,
    PyMuPDFLoader,
    TextLoader
)
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_core.documents import Document

# vector stores
from langchain_community.vectorstores import Chroma

# Utilities
import numpy as np
from typing import List

## RAG Architecture Overview
### RAG (Retrieval-Augmented Generation) Architecture:

1. Document Loading: Load documents from various sources
2. Document Splitting: Break documents into smaller chunks
3. Embedding Generation: Convert chunks into vector representations
4. Vector Storage: Store embeddings in ChromaDB
5. Query Processing: Convert user query to embedding
6. Similarity Search: Find relevant chunks from vector store
7. Context Augmentation: Combine retrieved chunks with query
8. Response Generation: LLM generates answer using context

Benefits of RAG:
- Reduces hallucinations
- Provides up-to-date information
- Allows citing sources
- Works with domain-specific knowledge

### Create sample data

In [10]:
sample_docs = [
    """
    Machine Learning Fundamentals
    
    Machine learning is a subset of artificial intelligence that enables systems to learn 
    and improve from experience without being explicitly programmed. There are three main 
    types of machine learning: supervised learning, unsupervised learning, and reinforcement 
    learning. Supervised learning uses labeled data to train models, while unsupervised 
    learning finds patterns in unlabeled data. Reinforcement learning learns through 
    interaction with an environment using rewards and penalties.
    """,
    
    """
    Deep Learning and Neural Networks
    
    Deep learning is a subset of machine learning based on artificial neural networks. 
    These networks are inspired by the human brain and consist of layers of interconnected 
    nodes. Deep learning has revolutionized fields like computer vision, natural language 
    processing, and speech recognition. Convolutional Neural Networks (CNNs) are particularly 
    effective for image processing, while Recurrent Neural Networks (RNNs) and Transformers 
    excel at sequential data processing.
    """,
    
    """
    Natural Language Processing (NLP)
    
    NLP is a field of AI that focuses on the interaction between computers and human language. 
    Key tasks in NLP include text classification, named entity recognition, sentiment analysis, 
    machine translation, and question answering. Modern NLP heavily relies on transformer 
    architectures like BERT, GPT, and T5. These models use attention mechanisms to understand 
    context and relationships between words in text.
    """
]

In [11]:
# save to file
import tempfile

temp_dir = tempfile.mkdtemp()

for i, doc in enumerate(sample_docs):
    with open(f"{temp_dir}/doc_{i}.txt", "w") as f:
        f.write(doc)

print(f"Sample document created: {temp_dir}")

Sample document created: C:\Users\LSHIVA~1\AppData\Local\Temp\tmplfq91iub


### Document loading

In [12]:
from langchain_community.document_loaders import DirectoryLoader

In [13]:
loader = DirectoryLoader(
    path=temp_dir,
    glob="*.txt",
    loader_cls=TextLoader,
    loader_kwargs={'encoding':"utf-8"}
)

documents = loader.load()

print(f"Loaded {len(documents)} documents")
print(f"\nFirst document preview: ")
print(documents[0].page_content[:200] + "...")

Loaded 3 documents

First document preview: 

    Machine Learning Fundamentals
    
    Machine learning is a subset of artificial intelligence that enables systems to learn 
    and improve from experience without being explicitly programmed. ...


### Document splitting

In [14]:
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size = 500,
    chunk_overlap = 50,
    length_function = len,
    separators=["\n\n", "\n", ". ", " ", ""] # heirarchy of separators
)

chunks = text_splitter.split_documents(documents=documents)

print(f"Created {len(chunks)} chunks from {len(documents)} documents")
print(f"\nChunk example:")
print(f"Content: {chunks[0].page_content[:150]}...")
print(f"Metadata: {chunks[0].metadata}")

Created 5 chunks from 3 documents

Chunk example:
Content: Machine Learning Fundamentals
    
    Machine learning is a subset of artificial intelligence that enables systems to learn 
    and improve from exp...
Metadata: {'source': 'C:\\Users\\LSHIVA~1\\AppData\\Local\\Temp\\tmplfq91iub\\doc_0.txt'}


In [15]:
chunks

[Document(metadata={'source': 'C:\\Users\\LSHIVA~1\\AppData\\Local\\Temp\\tmplfq91iub\\doc_0.txt'}, page_content='Machine Learning Fundamentals\n    \n    Machine learning is a subset of artificial intelligence that enables systems to learn \n    and improve from experience without being explicitly programmed. There are three main \n    types of machine learning: supervised learning, unsupervised learning, and reinforcement \n    learning. Supervised learning uses labeled data to train models, while unsupervised \n    learning finds patterns in unlabeled data. Reinforcement learning learns through'),
 Document(metadata={'source': 'C:\\Users\\LSHIVA~1\\AppData\\Local\\Temp\\tmplfq91iub\\doc_0.txt'}, page_content='interaction with an environment using rewards and penalties.'),
 Document(metadata={'source': 'C:\\Users\\LSHIVA~1\\AppData\\Local\\Temp\\tmplfq91iub\\doc_1.txt'}, page_content='Deep Learning and Neural Networks\n    \n    Deep learning is a subset of machine learning based on 

### Embeddings models

In [16]:
os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY")

In [17]:
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

### Initialize chromadb vector store and store chunks in vector representation

In [18]:
# create chromadb vector store
persist_dir = "./chroma_db"

# initialize chromadb with openai embeddings
vectorstore = Chroma.from_documents(
    documents=chunks,
    embedding=OpenAIEmbeddings(),
    persist_directory=persist_dir,
    collection_name="rag_collection"
)

print(f"Vector store created with {vectorstore._collection.count()} vectors")
print(f"Persisted to: {persist_dir}")

Vector store created with 17 vectors
Persisted to: ./chroma_db


### Test similarity search

In [19]:
query = "What are the types of machine learning?"

similar_docs = vectorstore.similarity_search(query=query, k=3)
similar_docs

[Document(metadata={'source': 'C:\\Users\\LSHIVA~1\\AppData\\Local\\Temp\\tmpx6ltr9sk\\doc_0.txt'}, page_content='Machine Learning Fundamentals\n    \n    Machine learning is a subset of artificial intelligence that enables systems to learn \n    and improve from experience without being explicitly programmed. There are three main \n    types of machine learning: supervised learning, unsupervised learning, and reinforcement \n    learning. Supervised learning uses labeled data to train models, while unsupervised \n    learning finds patterns in unlabeled data. Reinforcement learning learns through'),
 Document(metadata={'source': 'C:\\Users\\LSHIVA~1\\AppData\\Local\\Temp\\tmplfq91iub\\doc_0.txt'}, page_content='Machine Learning Fundamentals\n    \n    Machine learning is a subset of artificial intelligence that enables systems to learn \n    and improve from experience without being explicitly programmed. There are three main \n    types of machine learning: supervised learning, unsup

In [20]:
print(f"Query: {query}")
print(f"\nTop {len(similar_docs)} similar chunks: ")
for i, doc in enumerate(similar_docs):
    print(f"\n--- Chunk {i+1} ---")
    print(doc.page_content[:200]+"...")
    print(f"Source: {doc.metadata.get('source', 'Unknown')}")

Query: What are the types of machine learning?

Top 3 similar chunks: 

--- Chunk 1 ---
Machine Learning Fundamentals
    
    Machine learning is a subset of artificial intelligence that enables systems to learn 
    and improve from experience without being explicitly programmed. There...
Source: C:\Users\LSHIVA~1\AppData\Local\Temp\tmpx6ltr9sk\doc_0.txt

--- Chunk 2 ---
Machine Learning Fundamentals
    
    Machine learning is a subset of artificial intelligence that enables systems to learn 
    and improve from experience without being explicitly programmed. There...
Source: C:\Users\LSHIVA~1\AppData\Local\Temp\tmplfq91iub\doc_0.txt

--- Chunk 3 ---
Machine learning is a subset of artificial intelligence that enables systems to learn 
    and improve from experience without being explicitly programmed. There are three main 
    types of machine l...
Source: C:\Users\LSHIVA~1\AppData\Local\Temp\tmp9cr5i63a\doc_0.txt


### Advanced similarity search with scores

In [21]:
results = vectorstore.similarity_search_with_score(query, k=3)
results

[(Document(metadata={'source': 'C:\\Users\\LSHIVA~1\\AppData\\Local\\Temp\\tmpx6ltr9sk\\doc_0.txt'}, page_content='Machine Learning Fundamentals\n    \n    Machine learning is a subset of artificial intelligence that enables systems to learn \n    and improve from experience without being explicitly programmed. There are three main \n    types of machine learning: supervised learning, unsupervised learning, and reinforcement \n    learning. Supervised learning uses labeled data to train models, while unsupervised \n    learning finds patterns in unlabeled data. Reinforcement learning learns through'),
  0.24102914333343506),
 (Document(metadata={'source': 'C:\\Users\\LSHIVA~1\\AppData\\Local\\Temp\\tmplfq91iub\\doc_0.txt'}, page_content='Machine Learning Fundamentals\n    \n    Machine learning is a subset of artificial intelligence that enables systems to learn \n    and improve from experience without being explicitly programmed. There are three main \n    types of machine learning: 

#### Understanding Similarity Scores
The similarity score represents how closely related a document chunk is to your query. The scoring depends on the distance metric used:

ChromaDB default: Uses L2 distance (Euclidean distance)

- Lower scores = MORE similar (closer in vector space)
- Score of 0 = identical vectors
- Typical range: 0 to 2 (but can be higher)


Cosine similarity (if configured):

- Higher scores = MORE similar
- Range: -1 to 1 (1 being identical)

### Initialize the LLM (OpenAI), RAG chain, Prompt template, query the RAG system

In [22]:
from langchain_openai import ChatOpenAI
import os
from dotenv import load_dotenv

load_dotenv()

os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY")


llm = ChatOpenAI(
    model_name = "gpt-3.5-turbo",
    temperature=0.2,
    max_tokens=500
)

In [23]:
test_response = llm.invoke("What is Large language models?")
test_response

AIMessage(content='Large language models are a type of artificial intelligence model that are trained on vast amounts of text data in order to generate human-like text. These models are capable of understanding and generating natural language text, and are used in a variety of applications such as language translation, text generation, and natural language processing tasks. Some examples of large language models include GPT-3 (Generative Pre-trained Transformer 3) and BERT (Bidirectional Encoder Representations from Transformers).', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 93, 'prompt_tokens': 13, 'total_tokens': 106, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_provider': 'openai', 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'id': 'chatcmpl-Ce3aqVYB

In [24]:
from langchain.chat_models.base import init_chat_model

llm = init_chat_model(
    model="openai:gpt-3.5-turbo"
)

# llm = init_chat_model(
#     model="groq:"
# )

test_response = llm.invoke("What is Large language models?")
test_response

AIMessage(content='Large language models are artificial intelligence algorithms that are trained on massive amounts of text data to understand and generate human language. These models have the ability to process and generate human-like text, often indistinguishable from text written by humans. Some well-known examples of large language models include GPT-3 (Generative Pre-trained Transformer 3) and BERT (Bidirectional Encoder Representations from Transformers). These models have a wide range of applications, such as natural language processing, machine translation, and text generation.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 102, 'prompt_tokens': 13, 'total_tokens': 115, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_provider': 'openai', 'model_name': 'gpt-3.5-turbo-0

### Modern RAG chain

In [25]:
from langchain_classic.chains import create_retrieval_chain
from langchain_core.prompts import ChatPromptTemplate
from langchain_classic.chains.combine_documents import create_stuff_documents_chain

In [26]:
## Convert vector store to retiever
retriever = vectorstore.as_retriever(
    search_kwarg = {'k':3} ## Retrieve top 3 relevant chunks
)
retriever

VectorStoreRetriever(tags=['Chroma', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.chroma.Chroma object at 0x0000019ADCC11C30>, search_kwargs={})

In [27]:
## Create a prompt template
system_prompt = """
You are an assistant for question-answering tasks.
Use the following pieces of retrived context to answer the question.
If you dont know the answer, just say that you dont know.
Use three sentences maximum and keep the answer concise.

Context: {context}
"""

prompt = ChatPromptTemplate.from_messages(
    [("system", system_prompt),
     ("human", "{input}")]
)

In [28]:
prompt.to_json()

{'lc': 1,
 'type': 'constructor',
 'id': ['langchain', 'prompts', 'chat', 'ChatPromptTemplate'],
 'kwargs': {'input_variables': ['context', 'input'],
  'messages': [SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context'], input_types={}, partial_variables={}, template='\nYou are an assistant for question-answering tasks.\nUse the following pieces of retrived context to answer the question.\nIf you dont know the answer, just say that you dont know.\nUse three sentences maximum and keep the answer concise.\n\nContext: {context}\n'), additional_kwargs={}),
   HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], input_types={}, partial_variables={}, template='{input}'), additional_kwargs={})]},
 'name': 'ChatPromptTemplate'}

### Create document chain

In [29]:
document_chain = create_stuff_documents_chain(llm, prompt)
document_chain

RunnableBinding(bound=RunnableBinding(bound=RunnableAssign(mapper={
  context: RunnableLambda(format_docs)
}), kwargs={}, config={'run_name': 'format_inputs'}, config_factories=[])
| ChatPromptTemplate(input_variables=['context', 'input'], input_types={}, partial_variables={}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context'], input_types={}, partial_variables={}, template='\nYou are an assistant for question-answering tasks.\nUse the following pieces of retrived context to answer the question.\nIf you dont know the answer, just say that you dont know.\nUse three sentences maximum and keep the answer concise.\n\nContext: {context}\n'), additional_kwargs={}), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], input_types={}, partial_variables={}, template='{input}'), additional_kwargs={})])
| ChatOpenAI(client=<openai.resources.chat.completions.completions.Completions object at 0x0000019ADD6AE290>, async_client=<openai.resou

This chain:

- Takes retrieved documents
- "Stuffs" them into the prompt's {context} placeholder
- Sends the complete prompt to the LLM
- Returns the LLM's response

#### What is create_retrieval_chain?
create_retrieval_chain is a function that combines a retriever (which fetches relevant documents) with a document chain (which processes those documents with an LLM) to create a complete RAG pipeline.

In [30]:
rag_chain = create_retrieval_chain(retriever, document_chain)
rag_chain

RunnableBinding(bound=RunnableAssign(mapper={
  context: RunnableBinding(bound=RunnableLambda(lambda x: x['input'])
           | VectorStoreRetriever(tags=['Chroma', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.chroma.Chroma object at 0x0000019ADCC11C30>, search_kwargs={}), kwargs={}, config={'run_name': 'retrieve_documents'}, config_factories=[])
})
| RunnableAssign(mapper={
    answer: RunnableBinding(bound=RunnableBinding(bound=RunnableAssign(mapper={
              context: RunnableLambda(format_docs)
            }), kwargs={}, config={'run_name': 'format_inputs'}, config_factories=[])
            | ChatPromptTemplate(input_variables=['context', 'input'], input_types={}, partial_variables={}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context'], input_types={}, partial_variables={}, template='\nYou are an assistant for question-answering tasks.\nUse the following pieces of retrived context to answer the question.\nIf you dont 

In [31]:
response = rag_chain.invoke({"input": "What is machine learning?"})
response

{'input': 'What is machine learning?',
 'context': [Document(metadata={'source': 'C:\\Users\\LSHIVA~1\\AppData\\Local\\Temp\\tmp9cr5i63a\\doc_0.txt'}, page_content='Machine learning is a subset of artificial intelligence that enables systems to learn \n    and improve from experience without being explicitly programmed. There are three main \n    types of machine learning: supervised learning, unsupervised learning, and reinforcement \n    learning. Supervised learning uses labeled data to train models, while unsupervised \n    learning finds patterns in unlabeled data. Reinforcement learning learns through'),
  Document(metadata={'source': 'C:\\Users\\LSHIVA~1\\AppData\\Local\\Temp\\tmp9cr5i63a\\doc_0.txt'}, page_content='Machine Learning Fundamentals'),
  Document(metadata={'source': 'C:\\Users\\LSHIVA~1\\AppData\\Local\\Temp\\tmpx6ltr9sk\\doc_0.txt'}, page_content='Machine Learning Fundamentals\n    \n    Machine learning is a subset of artificial intelligence that enables systems t

In [32]:
response["answer"]

'Machine learning is a subset of artificial intelligence that allows systems to learn and improve from experience without explicit programming. It involves three main types: supervised learning, unsupervised learning, and reinforcement learning, each serving different purposes in data analysis and pattern recognition. Supervised learning uses labeled data, unsupervised learning finds patterns in unlabeled data, and reinforcement learning learns through a trial-and-error process.'

In [33]:
# Function to query the modern RAG system
def query_rag_modern(question):
    print(f"Question: {question}")
    print("-" * 50)
    
    # Using create_retrieval_chain approach
    result = rag_chain.invoke({"input": question})
    
    print(f"Answer: {result['answer']}")
    print("\nRetrieved Context:")
    for i, doc in enumerate(result['context']):
        print(f"\n--- Source {i+1} ---")
        print(doc.page_content[:200] + "...")
    
    return result

# Test queries
test_questions = [
    "What are the three types of machine learning?",
    "What is deep learning and how does it relate to neural networks?",
    "What are CNNs best used for?"
]

for question in test_questions:
    result = query_rag_modern(question)
    print("\n" + "="*80 + "\n")

Question: What are the three types of machine learning?
--------------------------------------------------
Answer: The three types of machine learning are supervised learning, unsupervised learning, and reinforcement learning.

Retrieved Context:

--- Source 1 ---
Machine Learning Fundamentals
    
    Machine learning is a subset of artificial intelligence that enables systems to learn 
    and improve from experience without being explicitly programmed. There...

--- Source 2 ---
Machine Learning Fundamentals
    
    Machine learning is a subset of artificial intelligence that enables systems to learn 
    and improve from experience without being explicitly programmed. There...

--- Source 3 ---
Machine learning is a subset of artificial intelligence that enables systems to learn 
    and improve from experience without being explicitly programmed. There are three main 
    types of machine l...

--- Source 4 ---
Machine Learning Fundamentals...


Question: What is deep learning an

# Create RAG chain alternative: LCEL (LangChain Expression Language)

In [34]:
# Even a more flexible approach using LECL
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough, RunnableParallel
from langchain_core.prompts import ChatPromptTemplate

In [35]:
# create a custom prompt
custom_prompt = ChatPromptTemplate.from_template("""Use the following context to answer the question.
                                                 If you dont know the answer based on the context, say you don't know.
                                                 Provide specific details from the context to support your answer.
                                                 
                                                 Context:
                                                 {context}

                                                 Question: {question}
                                                 
                                                 Answer:""")


In [36]:
# Format the output documents for the prompt
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

In [37]:
# Build chain using LCEL
rag_chain_lcel = (
    {
        'context': retriever| format_docs,
        'question': RunnablePassthrough()
    }
     | custom_prompt
     | llm
     | StrOutputParser()
)
rag_chain_lcel

{
  context: VectorStoreRetriever(tags=['Chroma', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.chroma.Chroma object at 0x0000019ADCC11C30>, search_kwargs={})
           | RunnableLambda(format_docs),
  question: RunnablePassthrough()
}
| ChatPromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, template="Use the following context to answer the question.\n                                                 If you dont know the answer based on the context, say you don't know.\n                                                 Provide specific details from the context to support your answer.\n                                                 \n                                                 Context:\n                                                 {context}\n\n                                   

In [38]:
rag_chain_lcel.invoke("What the hell is deep learning?")

'Deep learning is a subset of machine learning based on artificial neural networks. These networks consist of layers of interconnected nodes and are inspired by the human brain. Deep learning has revolutionized fields like computer vision, natural language processing, and speech recognition.'

In [41]:
retriever.invoke("What is deep learning?")

[Document(metadata={'source': 'C:\\Users\\LSHIVA~1\\AppData\\Local\\Temp\\tmp9cr5i63a\\doc_1.txt'}, page_content='Deep Learning and Neural Networks'),
 Document(metadata={'source': 'C:\\Users\\LSHIVA~1\\AppData\\Local\\Temp\\tmp9cr5i63a\\doc_1.txt'}, page_content='Deep learning is a subset of machine learning based on artificial neural networks. \n    These networks are inspired by the human brain and consist of layers of interconnected \n    nodes. Deep learning has revolutionized fields like computer vision, natural language \n    processing, and speech recognition. Convolutional Neural Networks (CNNs) are particularly \n    effective for image processing, while Recurrent Neural Networks (RNNs) and Transformers'),
 Document(metadata={'source': 'C:\\Users\\LSHIVA~1\\AppData\\Local\\Temp\\tmpx6ltr9sk\\doc_1.txt'}, page_content='Deep Learning and Neural Networks\n    \n    Deep learning is a subset of machine learning based on artificial neural networks. \n    These networks are inspire

In [42]:
def query_rag_lcel(question):
    print(f"Question: {question}")
    print("-"*50)

    # method 1: pass string directly (when using RunnablePassThrough)
    answer = rag_chain_lcel.invoke(question)
    print(f"Answer: {answer}")

    # get source documents seperately if needed
    docs = retriever.invoke(question)
    print("\nSOurce Documents: ")
    for i, doc in enumerate(docs):
        print(f"\n--- Source {i+1} ---")
        print(doc.page_content[:200]+"...")

In [43]:
# Test LCEL chain
query_rag_lcel("whats all this about?")

Question: whats all this about?
--------------------------------------------------
Answer: The context is discussing machine learning and deep learning fundamentals, specifically mentioning interaction with an environment using rewards and penalties. This is referring to reinforcement learning, a type of machine learning where an algorithm learns to perform a task through trial and error by receiving rewards for successful actions and penalties for unsuccessful actions.

SOurce Documents: 

--- Source 1 ---
Deep Learning and Neural Networks...

--- Source 2 ---
Machine Learning Fundamentals...

--- Source 3 ---
interaction with an environment using rewards and penalties....

--- Source 4 ---
interaction with an environment using rewards and penalties....


# Add New Documents to existing Vector Store

In [44]:
new_document = """
Reinforcement Learning in Detail

Reinforcement learning (RL) is a type of machine learning where an agent learns to make
decisions by interacting with an environment. The agnet receives rewards or penalties
based on its actions and learns to maximize cumulative rewards over time. Key concepts
in RL include: states, actions, rewards, policies, and value functions. Popular RL 
algorithms include Q-learning, Deep Q-Networks (DQN), Policy gradient methods, and
actor-critic methods. RL has been sucessfully applied to game playing (like AlphaGo),
robotics, and autonomous systems.
"""

In [45]:
new_doc = Document(
    page_content=new_document,
    metadata={
        "source":"manual_addition",
        "topic":"reinforcement learning"
    }
)
new_doc

Document(metadata={'source': 'manual_addition', 'topic': 'reinforcement learning'}, page_content='\nReinforcement Learning in Detail\n\nReinforcement learning (RL) is a type of machine learning where an agent learns to make\ndecisions by interacting with an environment. The agnet receives rewards or penalties\nbased on its actions and learns to maximize cumulative rewards over time. Key concepts\nin RL include: states, actions, rewards, policies, and value functions. Popular RL \nalgorithms include Q-learning, Deep Q-Networks (DQN), Policy gradient methods, and\nactor-critic methods. RL has been sucessfully applied to game playing (like AlphaGo),\nrobotics, and autonomous systems.\n')

In [46]:
new_chunks = text_splitter.split_documents([new_doc])
new_chunks

[Document(metadata={'source': 'manual_addition', 'topic': 'reinforcement learning'}, page_content='Reinforcement Learning in Detail'),
 Document(metadata={'source': 'manual_addition', 'topic': 'reinforcement learning'}, page_content='Reinforcement learning (RL) is a type of machine learning where an agent learns to make\ndecisions by interacting with an environment. The agnet receives rewards or penalties\nbased on its actions and learns to maximize cumulative rewards over time. Key concepts\nin RL include: states, actions, rewards, policies, and value functions. Popular RL \nalgorithms include Q-learning, Deep Q-Networks (DQN), Policy gradient methods, and'),
 Document(metadata={'source': 'manual_addition', 'topic': 'reinforcement learning'}, page_content='actor-critic methods. RL has been sucessfully applied to game playing (like AlphaGo),\nrobotics, and autonomous systems.')]

In [47]:
# add new documents to vector store
vectorstore.add_documents(new_chunks)

['a2726eab-d2bd-4a8f-b895-d4805aa4083e',
 '1be427a5-5cb2-480c-ab06-bca06f4ce00f',
 '5bbea8f1-6ed1-4e7e-9c43-4b70d79dd453']

In [49]:
vectorstore._collection.count()

20

In [50]:
new_question = "what are key concepts in reinforcement learning"
query_rag_lcel(new_question)

Question: what are key concepts in reinforcement learning
--------------------------------------------------
Answer: Key concepts in reinforcement learning include states, actions, rewards, policies, and value functions.

SOurce Documents: 

--- Source 1 ---
Reinforcement Learning in Detail...

--- Source 2 ---
Reinforcement learning (RL) is a type of machine learning where an agent learns to make
decisions by interacting with an environment. The agnet receives rewards or penalties
based on its actions and l...

--- Source 3 ---
actor-critic methods. RL has been sucessfully applied to game playing (like AlphaGo),
robotics, and autonomous systems....

--- Source 4 ---
Machine Learning Fundamentals...


# Advanced RAG Techniques: Conversational Memory

In [51]:
from langchain_classic.chains import create_history_aware_retriever
from langchain_core.prompts import MessagesPlaceholder
from langchain_core.messages import HumanMessage, AIMessage

In [54]:
# create a prompt that includes the chat history
contextualize_q_system_prompt = """Given a chat history and the latest user question
which might reference context in the chat history, formulate a standalone question
which can be understood without the chat history. DO not answer the question,
just reformulate it if needed and otherwise return as it is."""

contextualize_q_prompt = ChatPromptTemplate.from_messages([
    ('system', contextualize_q_system_prompt),
    MessagesPlaceholder("Chat_history"),
    ('human', "{input}")
])

In [55]:
history_aware_retiever = create_history_aware_retriever(
    llm, retriever, contextualize_q_prompt
)

history_aware_retiever

RunnableBinding(bound=RunnableBranch(branches=[(RunnableLambda(lambda x: not x.get('chat_history', False)), RunnableLambda(lambda x: x['input'])
| VectorStoreRetriever(tags=['Chroma', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.chroma.Chroma object at 0x0000019ADCC11C30>, search_kwargs={}))], default=ChatPromptTemplate(input_variables=['Chat_history', 'input'], input_types={'Chat_history': list[typing.Annotated[typing.Union[typing.Annotated[langchain_core.messages.ai.AIMessage, Tag(tag='ai')], typing.Annotated[langchain_core.messages.human.HumanMessage, Tag(tag='human')], typing.Annotated[langchain_core.messages.chat.ChatMessage, Tag(tag='chat')], typing.Annotated[langchain_core.messages.system.SystemMessage, Tag(tag='system')], typing.Annotated[langchain_core.messages.function.FunctionMessage, Tag(tag='function')], typing.Annotated[langchain_core.messages.tool.ToolMessage, Tag(tag='tool')], typing.Annotated[langchain_core.messages.ai.AIMessageChunk, Tag(tag='AIM

In [58]:
qa_system_prompt = """You are an assistant for question-answering tasks
Use the following pieces of retrieved context to answer the question.
If you dont know the answer, just say that you dont know.
Use three sentences maximum and keep the answer concise.

Context: {context}"""

qa_prompt = ChatPromptTemplate.from_messages([
    ("system", qa_system_prompt),
    MessagesPlaceholder("chat_history"),
    ("human", "{input}")
])

question_answer_chain = create_stuff_documents_chain(llm, qa_prompt)

In [59]:
conversational_rag_chain = create_retrieval_chain(
    history_aware_retiever,
    question_answer_chain
)

In [60]:
chat_history = []
result1 = conversational_rag_chain.invoke({
    "chat_history": chat_history,
    "input": "What is machine learning?"}
)

print(f"Q: What is machine learning?")
print(f"A: {result1['answer']}")

Q: What is machine learning?
A: Machine learning is a subset of artificial intelligence that allows systems to learn and improve from experience without being explicitly programmed. There are three main types of machine learning: supervised learning, unsupervised learning, and reinforcement learning. Supervised learning uses labeled data, unsupervised learning finds patterns in unlabeled data, and reinforcement learning learns through a trial-and-error process.


In [62]:
chat_history.extend([HumanMessage(content="What is machine learning?"),
                     AIMessage(content=result1['answer'])])
chat_history

[HumanMessage(content='What is machine learning?', additional_kwargs={}, response_metadata={}),
 AIMessage(content='Machine learning is a subset of artificial intelligence that allows systems to learn and improve from experience without being explicitly programmed. There are three main types of machine learning: supervised learning, unsupervised learning, and reinforcement learning. Supervised learning uses labeled data, unsupervised learning finds patterns in unlabeled data, and reinforcement learning learns through a trial-and-error process.', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='What is machine learning?', additional_kwargs={}, response_metadata={}),
 AIMessage(content='Machine learning is a subset of artificial intelligence that allows systems to learn and improve from experience without being explicitly programmed. There are three main types of machine learning: supervised learning, unsupervised learning, and reinforcement learning. Supervised learni

In [63]:
result2 = conversational_rag_chain.invoke({
    "chat_history": chat_history,
    "input": ""}
)

print(f"Q: ")
print(f"A: {result1['answer']}")

KeyError: "Input to ChatPromptTemplate is missing variables {'Chat_history'}.  Expected: ['Chat_history', 'input'] Received: ['chat_history', 'input']\nNote: if you intended {Chat_history} to be part of the string and not a variable, please escape it with double curly braces like: '{{Chat_history}}'.\nFor troubleshooting, visit: https://docs.langchain.com/oss/python/langchain/errors/INVALID_PROMPT_INPUT "