# **Introduction to LangChain**

In [1]:
from dotenv import load_dotenv
load_dotenv()

True

### 1. Chat Models

- Basic Conversations
- Alternatives
- Conversation with Message History

In [2]:
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o-mini") # load the model

How to simply call the model and ask a question

In [3]:
# invoke the model with a simple message
result = model.invoke("Hey, tell me about yourself in a single line.")
print(result.content)

I'm an AI language model designed to assist with information and creative tasks across a wide range of topics.


LangChain has different message types; a `messages` list stores the conversation so far.

In [4]:
# import various types of messages
from langchain_core.messages import SystemMessage, HumanMessage, AIMessage

messages = [
    SystemMessage("You are a helpful assistant."), # message to guide the AI
    HumanMessage("What is 2 + 2?"), # message from the user
    AIMessage("2 + 2 = 4"), # output message from AI
    HumanMessage("Now tell me what is 5 + 5")
]

result = model.invoke(messages) # invoke the model with multiple messages / a conversation
print(result.content)

5 + 5 = 10


The LangChain API exposes the same interface for every chat model, which makes it easy to migrate from one model to another.

```Python
# this is markdown code, not runnable
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_anthropic import ChatAnthropic

model_gemini = ChatGoogleGenerativeAI(model="gemini-1.5-flash-8b")
model_claude = ChatAnthropic(model="claude-3-sonnet-20240229")

result_openai = model.invoke("What is 5+5?")
print("ChatGPT:",result_openai.content)

result_gemini = model_gemini.invoke("What is 5+5?")
print("Gemini:", result_gemini.content)

result_claude = model_claude.invoke("What is 5+5?")
print("Claude:", result_claude.content)
```

How to have real-time conversations

In [5]:
chat_history = [
    SystemMessage("You are a helpful assistant."),
]

while True:
    query=input()
    print("You:",query)
    if query.lower() in ("exit","quit"): break
    chat_history.append(HumanMessage(query))

    result = model.invoke(chat_history)
    print("AI:",result.content)
    chat_history.append(AIMessage(result.content))

print("--- MESSAGE HISTORY ---")
for msg in chat_history: print(msg)

You: hey im pranav
AI: Hi Pranav! How can I assist you today?
You: what is my name?
AI: Your name is Pranav. How can I help you today?
You: exit
--- MESSAGE HISTORY ---
content='You are a helpful assistant.' additional_kwargs={} response_metadata={}
content='hey im pranav' additional_kwargs={} response_metadata={}
content='Hi Pranav! How can I assist you today?' additional_kwargs={} response_metadata={}
content='what is my name?' additional_kwargs={} response_metadata={}
content='Your name is Pranav. How can I help you today?' additional_kwargs={} response_metadata={}


**Using `ConversationChain` to implement conversations and use different types of 'Conversation Memory' to add memory to the AI**

In [6]:
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory # explicit import instead of `import *`

memory1 = ConversationBufferMemory() # simple buffer memory
conversation = ConversationChain(
    llm=model,
    verbose=True,
    memory=memory1
)



In [7]:
conversation.predict(input="Hey, good evening, how are you?")
conversation.predict(input="What are you good at? Tell me in one line")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

H: Hey, good evening, how are you?
AI:

> Finished chain.


> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hey, good evening, how are you?
AI: Good evening! I'm doing well, thank you for asking. How about you? What’s on your mind tonight?
Human: What are you good at? Tell me in one line
AI:

> Finished chain.


"I'm good at providing information, answering questions, and engaging in friendly conversations on a wide range of topics!"

In [8]:
print(conversation.memory.buffer)

Human: Hey, good evening, how are you?
AI: Good evening! I'm doing well, thank you for asking. How about you? What’s on your mind tonight?
Human: What are you good at? Tell me in one line
AI: I'm good at providing information, answering questions, and engaging in friendly conversations on a wide range of topics!


**Here, we can see that the messages are stored verbatim in a buffer memory.**

There are a few other memory classes:
- `ConversationSummaryMemory` : Uses the LLM to keep a running summary of the conversation instead of the full transcript, which keeps the prompt short.
- `ConversationBufferWindowMemory` : Just like buffer memory, with a `k=<count>` parameter so that only the last `k` exchanges are kept.
- `ConversationSummaryBufferMemory` : A hybrid memory that keeps the most recent messages verbatim and summarizes older ones once the buffer exceeds `max_token_limit` tokens.

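```python
# Try these out as well
from langchain.memory import (
    ConversationBufferWindowMemory,
    ConversationSummaryBufferMemory,
    ConversationSummaryMemory,
)

ConversationSummaryMemory(llm=model) # needs an LLM to write the summary

ConversationBufferWindowMemory(k=3) # keeps the last 3 exchanges; no LLM needed

ConversationSummaryBufferMemory(llm=model, max_token_limit=100)

```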
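To make the windowing idea concrete, here is a minimal pure-Python sketch of a memory that keeps only the last `k` exchanges. The `WindowMemory` class and its method names are made up for illustration; this is not how LangChain implements `ConversationBufferWindowMemory`, only the same idea in a few lines.

```python
from collections import deque

class WindowMemory:
    """Toy stand-in for a windowed buffer: keeps only the last k exchanges."""

    def __init__(self, k):
        self.turns = deque(maxlen=k)  # older exchanges fall off automatically

    def save(self, human, ai):
        self.turns.append((human, ai))

    def buffer(self):
        lines = []
        for human, ai in self.turns:
            lines.append(f"Human: {human}")
            lines.append(f"AI: {ai}")
        return "\n".join(lines)

memory = WindowMemory(k=2)
memory.save("Hi, I'm Pranav", "Hello Pranav!")
memory.save("What is 2 + 2?", "4")
memory.save("And 5 + 5?", "10")
print(memory.buffer())  # the first exchange has been dropped
```

The trade-off is visible right away: with `k=2` the model would no longer be able to answer "what is my name?", because that exchange has left the window.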

**Two more advanced memory classes are `ConversationEntityMemory` and `ConversationKGMemory`**
- `ConversationEntityMemory` makes the LLM retain history as facts about the entities (people, objects) mentioned in the conversation.

For example, if you send "John ate an apple", it will recognize 'John' and 'apple' as entities.

- `ConversationKGMemory` stores the conversation as a knowledge graph, so the LLM can also use the relationships between entities.

For example, if you send "John ate an apple", then "An apple is a fruit", then ask "What kind of food did John eat?", the LLM can follow the graph:
"John -> [eats] -> Apple -> [is] -> Fruit" 
        
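The knowledge-graph example can be sketched in plain Python as a list of (subject, relation, object) triples. The triples below are written by hand and the `follow` helper is hypothetical; a real knowledge-graph memory would ask the LLM to extract the triples from the conversation.

```python
# Hand-written triples for illustration only
triples = [
    ("John", "eats", "apple"),
    ("apple", "is", "fruit"),
]

def follow(subject, relations, triples):
    """Walk from `subject` along the given relations, one hop per relation."""
    current = subject
    for relation in relations:
        matches = [o for s, r, o in triples if s == current and r == relation]
        if not matches:
            return None  # the graph has no such edge
        current = matches[0]
    return current

# "What kind of food did John eat?" -> John -[eats]-> apple -[is]-> fruit
answer = follow("John", ["eats", "is"], triples)
print(answer)
```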

### 2. Prompt Templates

- Basic Chat Prompt Template with template strings
- Prompts with Multiple Placeholders
- Prompts with System and Human Messages

In [9]:
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import SystemMessage, HumanMessage, AIMessage

# Using template strings
template_str = "Tell me a good one-liner joke about {topic}" # adding more variables makes the prompt use multiple placeholders
prompt_template = ChatPromptTemplate.from_template(template_str)
prompt = prompt_template.invoke(input={"topic":"AI"})
result = model.invoke(prompt)
print(result.content) # The joke probably won't be funny, but at least it works

Why did the AI go broke? Because it couldn't find its cache!


In [10]:
# using different type of messages
messages = [
    SystemMessage("You are a helpful assistant."),
    HumanMessage("Hello"),
    AIMessage("Hello, how can I help you?"),
    ("human","Tell me a joke about {topic}") # use tuples if string interpolation is needed (for prompt variables)
]

prompt_template = ChatPromptTemplate.from_messages(messages)
prompt = prompt_template.invoke(input={"topic":"Nature"})
result = model.invoke(prompt)
print(result.content) # once again, not funny, and yes again, it works

Sure! Here’s a nature joke for you:

Why did the tree go to therapy?

Because it couldn't stop "leafing" its problems behind!
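The cells above use a single `{topic}` placeholder. As a rough sketch of what happens with several placeholders, the snippet below fills `(role, template)` tuples with `str.format`. The `fill_messages` helper and the `{style}` variable are assumptions made for this example; `ChatPromptTemplate` does more than this (validation, message objects), so treat it as an illustration of the idea only.

```python
def fill_messages(messages, **variables):
    """Fill {placeholders} in (role, template) tuples using str.format."""
    return [(role, template.format(**variables)) for role, template in messages]

template_messages = [
    ("system", "You are a helpful assistant."),
    ("human", "Tell me a joke about {topic} in {style} style"),
]

filled = fill_messages(template_messages, topic="Nature", style="pirate")
for role, text in filled:
    print(f"{role}: {text}")
```

With the real API, the equivalent would be passing a dict with both keys to `prompt_template.invoke(...)`.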


### 3. Chains

- Basic Chains, learning to use Parsers
- Parallel Chain Execution
- Branching

In [11]:
from langchain_core.output_parsers import StrOutputParser # extracts the message content as a plain string

messages = [
    SystemMessage("You are a helpful assistant."),
    HumanMessage("Hello"),
    AIMessage("Hello, how can I help you?"),
    ("human","Tell me one line about {topic}") # use tuples if string interpolation is needed (for prompt variables)
]

prompt_template = ChatPromptTemplate.from_messages(messages)

# USE PIPE NOTATION TO CREATE CHAINS
chain = prompt_template | model # yay, your very first chain! 

# you don't need to generate the prompt from the template, the chain does it for you... make sure you pass in the dict in "invoke()"
result = chain.invoke(input={"topic":"Nature"}) 
result #unreadable

AIMessage(content='Nature is a breathtaking tapestry of life, showcasing the beauty and interconnectedness of all living things.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 20, 'prompt_tokens': 40, 'total_tokens': 60, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_06737a9306', 'finish_reason': 'stop', 'logprobs': None}, id='run-5d6fde96-6846-483f-bad5-5ad56e17f29a-0', usage_metadata={'input_tokens': 40, 'output_tokens': 20, 'total_tokens': 60, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

In [12]:
chain = prompt_template | model | StrOutputParser()

result = chain.invoke(input={"topic":"Nature"}) 
result #readable

'Nature is a breathtaking tapestry of life, showcasing the intricate balance and beauty of ecosystems that sustain our planet.'

**RUNNABLES - The fundamentals of chains**

- Everything in LangChain revolves around the `Runnable` interface.
- `Runnable` is the base class of every component that can be chained (prompts, models, parsers), and `RunnableSequence` is a chain of Runnables executed in order.
- Every `Runnable` object has an `invoke()` method, something we're already familiar with.
- In a sequence, each Runnable's `invoke()` is called in turn, and its output is passed as the input to the next Runnable.
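The pipe notation is less magical than it looks. Below is a minimal sketch, assuming nothing from LangChain, of a toy class that supports `invoke()` and `|`. The `Step` class and the fake model are invented for this example; the real `Runnable` adds batching, streaming, async support and more.

```python
class Step:
    """Toy Runnable: wraps a function and supports `|` composition."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # a | b -> a new Step that runs a, then feeds its output to b
        return Step(lambda value: other.invoke(self.invoke(value)))

make_prompt = Step(lambda d: f"Tell me one line about {d['topic']}")
fake_model = Step(lambda prompt: prompt.upper())  # stand-in for an LLM call
count_words = Step(lambda text: len(text.split()))

chain = make_prompt | fake_model | count_words
result = chain.invoke({"topic": "Nature"})
print(result)
```

Because `a | b` returns another `Step`, chains of any length compose the same way, which is the property that makes longer chains easy to build.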

In [13]:
from langchain_core.runnables import RunnableSequence, RunnableLambda

# Let's take the previous nature example 

messages = [
    SystemMessage("You are a helpful assistant."),
    HumanMessage("Hello"),
    AIMessage("Hello, how can I help you?"),
    ("human","Tell me one line about {topic}") # use tuples if string interpolation is needed (for prompt variables)
]

prompt_template = ChatPromptTemplate.from_messages(messages)

runnable_prompt = RunnableLambda(lambda x: prompt_template.format_prompt(**x))
runnable_model = RunnableLambda(lambda x: model.invoke(x.to_messages()))
runnable_parser = RunnableLambda(lambda x: StrOutputParser().invoke(x))

runnable_sequence = RunnableSequence(first = runnable_prompt, middle=[runnable_model], last=runnable_parser)
result = runnable_sequence.invoke(input={"topic":"Nature"})
print(result)

Nature is a beautiful and intricate tapestry of life, showcasing the harmony and balance of the ecosystems that sustain our planet.


In [14]:
# How to create longer chains? Nothing much, just add more runnables

# Now we have a runnable sequence (which is a chain) that gives us a line on a given topic
# Let's split the output into words, then count the number of words

word_splitter = RunnableLambda(lambda x: x.split())
word_counter = RunnableLambda(lambda x: len(x))

final_chain = runnable_sequence | word_splitter | word_counter
result = final_chain.invoke(input={"topic":"Nature"})
print(result)

19


There are a few other Runnables, which can be quite useful:

- `RunnablePassthrough` : returns its input unchanged
- `RunnablePick` : picks one or more keys from a dict input and returns their values
- `RunnableAssign` : adds new keys to a dict input, each computed from that input
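As a sketch of what these three do to a dict input, here are plain-function equivalents. The function names are made up for this example and only mirror the behaviour described above; they are not the LangChain classes themselves.

```python
def passthrough(data):
    """Return the input unchanged."""
    return data

def pick(key):
    """Build a function that returns the value stored under `key`."""
    return lambda data: data[key]

def assign(**makers):
    """Build a function that adds new keys, each computed from the original input."""
    return lambda data: {**data, **{k: make(data) for k, make in makers.items()}}

inputs = {"topic": "Nature", "length": 1}

print(passthrough(inputs))
print(pick("topic")(inputs))
enriched = assign(shout=lambda d: d["topic"].upper())(inputs)
print(enriched)
```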

Now we'll be looking at how to create parallel chains and branched chains!

**Parallel Chains**

In [15]:
from langchain_core.runnables import RunnableParallel, RunnableBranch

messages = [
    SystemMessage("You are an expert product reviewer."),
    ("human","Give me a brief but general review about the product: {product}")
]

prompt_template = ChatPromptTemplate.from_messages(messages)

# write functions for the parallel chains

def analyze_pros(features):
    pros_template = ChatPromptTemplate.from_messages([
        ("system","You are an expert product reviewer."),
        ("human","Given these features: {features}, what are 2-3 pros of the product?")
    ])
    return pros_template.format_prompt(features=features)

def analyze_cons(features):
    cons_template = ChatPromptTemplate.from_messages([
        ("system","You are an expert product reviewer."),
        ("human","Given these features: {features}, what are 2-3 cons of the product?")
    ])
    return cons_template.format_prompt(features=features)

def final_output(pros, cons): return f"\nPros: \n{pros}\n\nCons: \n{cons}"

# create the chains

pros_branch = RunnableLambda(lambda x: analyze_pros(x)) | model | StrOutputParser()
cons_branch = RunnableLambda(lambda x: analyze_cons(x)) | model | StrOutputParser()

# NOTICE HOW A PARALLEL CHAIN IS CREATED (it is also a runnable, obviously)
final_chain = (
    prompt_template | model | StrOutputParser() | 
    RunnableParallel(branches={"pros":pros_branch,"cons":cons_branch}) | 
    RunnableLambda(lambda x: final_output(x["branches"]["pros"],x["branches"]["cons"]))
)

result = final_chain.invoke(input={"product": "Samsung Galaxy M14 5G"})
print(result)


Pros: 
Based on the features outlined for the Samsung Galaxy M14 5G, here are 2-3 pros of the product:

1. **Impressive Battery Life**: The 6000 mAh battery capacity is a standout feature, allowing users to enjoy extended usage without the need for frequent recharging. This is particularly beneficial for those who are heavy users or often on the go.

2. **Strong Display Quality**: The 6.6-inch FHD+ display provides vibrant colors and decent viewing angles, making it ideal for media consumption, whether you're streaming videos or browsing social media.

3. **5G Connectivity**: With support for 5G networks, the Galaxy M14 5G is a future-proof option for users looking to benefit from faster internet speeds, ensuring that the device remains relevant as network technologies evolve.

Cons: 
While the Samsung Galaxy M14 5G has many strengths, there are a few cons to consider:

1. **Camera Performance in Low Light**: Although the 50 MP main sensor performs well in good lighting conditions, it
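The core of a parallel chain is "send the same input to several branches and collect the results in a dict". A minimal sketch of that pattern with the standard library follows; `run_parallel` and the two stand-in branches are assumptions for illustration and do not call any model.

```python
from concurrent.futures import ThreadPoolExecutor

def run_parallel(branches, value):
    """Call every branch with the same input; return {name: result}."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(func, value) for name, func in branches.items()}
        return {name: future.result() for name, future in futures.items()}

# stand-ins for the pros/cons chains; a real branch would call the model
branches = {
    "pros": lambda features: f"Pros of: {features}",
    "cons": lambda features: f"Cons of: {features}",
}

results = run_parallel(branches, "6000 mAh battery, 5G")
print(results["pros"])
print(results["cons"])
```

Threads suit this case because each branch spends most of its time waiting on a network call to the model.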

**Branched Chains**

In [16]:
messages = [
    SystemMessage("You are an expert in classifying customer feedback. Classify the feedback and return only the category name!"),
    ("human","Categorize this feedback: {feedback} - into one of the following categories: {categories}.")
]

prompt_template = ChatPromptTemplate.from_messages(messages)

main_chain = prompt_template | model | StrOutputParser()

categories = ["positive","negative","neutral"]

categorical_prompt_templates = {
    "positive": ChatPromptTemplate.from_messages([
        ("system", "You are a helpful assistant."),
        ("human", "Generate a thank you response for this positive feedback: {feedback}")
    ]),
    "negative": ChatPromptTemplate.from_messages([
        ("system", "You are a helpful assistant."),
        ("human", "Generate a response addressing this negative feedback: {feedback}")
    ]),
    "neutral": ChatPromptTemplate.from_messages([
        ("system", "You are a helpful assistant."),
        ("human", "Generate a request response for this neutral feedback: {feedback}")
    ])
}

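# create branches for different categories
from langchain_core.runnables import RunnablePassthrough

# Each branch receives a dict holding the predicted category AND the original feedback,
# so the response prompts are filled with the customer's text rather than the category label.
branches = RunnableBranch(
    (
        lambda x: "positive" in x["category"].lower(),
        categorical_prompt_templates["positive"] | model | StrOutputParser()
    ),
    (
        lambda x: "negative" in x["category"].lower(),
        categorical_prompt_templates["negative"] | model | StrOutputParser()
    ),
    categorical_prompt_templates["neutral"] | model | StrOutputParser() # don't forget this default case!
)

# keep the input keys and add the classifier's output under "category"
final_chain = RunnablePassthrough.assign(category=main_chain) | branches # create the final chain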

f = "This was a great experience and I loved the product!"
result = final_chain.invoke({"feedback":f,"categories":categories})
print(result)

Subject: Thank You for Your Kind Words!

Dear [Recipient's Name],

Thank you so much for your positive feedback! We truly appreciate you taking the time to share your thoughts. It’s always wonderful to hear that our efforts are making a difference.

Your support motivates us to continue striving for excellence. If you have any suggestions or further feedback, please don’t hesitate to reach out. We’re here to help!

Thanks again, and have a fantastic day!

Best regards,  
[Your Name]  
[Your Position]  
[Your Company]  
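Branching boils down to trying `(condition, handler)` pairs in order and falling back to a default. Here is a minimal pure-Python sketch of that routing. The `make_branch` helper is hypothetical, and passing a dict with both `category` and `feedback` is an assumption made here so that each handler can see the original feedback text.

```python
def make_branch(*cases, default):
    """cases are (condition, handler) pairs; the first matching condition wins."""
    def route(value):
        for condition, handler in cases:
            if condition(value):
                return handler(value)
        return default(value)
    return route

respond = make_branch(
    (lambda x: "positive" in x["category"], lambda x: f"Thanks! Glad to hear: {x['feedback']}"),
    (lambda x: "negative" in x["category"], lambda x: f"Sorry about that: {x['feedback']}"),
    default=lambda x: f"Could you tell us more about: {x['feedback']}",
)

print(respond({"category": "positive", "feedback": "Loved the product!"}))
print(respond({"category": "neutral", "feedback": "It was okay."}))
```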


### 4. RAG - Retrieval Augmented Generation 

- RAG Basics, Vector Stores, RAG Metadata
- Text Splitting, Embedding
- One-off Question
- Conversational RAG
- Web Scraping

In [17]:
import os
from langchain.text_splitter import CharacterTextSplitter
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

file_path = r"data/animals.txt"
persist_directory = r"db/chroma_db"
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# No need to create if already exists
if not os.path.exists(persist_directory):
    print("--- Initializing Vector Store ---")

    if not os.path.exists(file_path):
        raise FileNotFoundError(f"File not found at {file_path}")
    
    # initialize the text loader, load the documents, split them into chunks
    loader = TextLoader(file_path)
    documents = loader.load()

    text_splitter = CharacterTextSplitter(chunk_size=500,chunk_overlap=50)
    docs = text_splitter.split_documents(documents)

    print(f"Number of chunks: {len(docs)}")

    # create embeddings, initialize the vector store
    embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
    db = Chroma.from_documents(docs, embeddings, persist_directory=persist_directory)

    print("Embeddings ready, vector store initialized.")
else: print("Vector store already exists.")

Vector store already exists.


In [18]:
# Now let's perform RAG 
db = Chroma(persist_directory=persist_directory,embedding_function=embeddings)
query = "Tell me about 'amphibians'"

# we need a 'retriever' to perform the R in RAG
retriever = db.as_retriever(
    search_type="similarity_score_threshold",
    search_kwargs={"k":1,"score_threshold":0.3} # give me the most relevant document
)
relevant_docs = retriever.invoke(query)

print("Relevant docs:\n")
for i,doc in enumerate(relevant_docs,1):
    print(f"Document: {i}\n{doc.page_content}\n")




Relevant docs:

Document: 1
#### c) Reptiles  
- Cold-blooded animals with scales or bony plates.  
- Most lay eggs, while some give birth to live young.  
- Examples: Snakes, Lizards, Turtles, and Crocodiles.  

#### d) Amphibians  
- Cold-blooded animals that live both in water and on land.  
- Have moist skin that helps in respiration.  
- Examples: Frogs, Salamanders, and Toads.



The retriever returned one single document chunk, and it contains the section on "amphibians". 
(The similarity score is a bit lower than expected, but the right chunk is retrieved.)
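To show what `k` and `score_threshold` do, here is a small sketch of threshold-based retrieval using cosine similarity over made-up 3-dimensional vectors. The `retrieve` function and the vectors are assumptions for illustration; the score Chroma actually reports depends on its configured distance metric, so the numbers will not match.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, store, k=1, score_threshold=0.3):
    """Return up to k (score, text) pairs whose score meets the threshold."""
    scored = [(cosine(query_vec, vec), text) for text, vec in store]
    scored = [pair for pair in scored if pair[0] >= score_threshold]
    return sorted(scored, reverse=True)[:k]

# made-up "embeddings"; real ones have hundreds of dimensions
store = [
    ("Amphibians live in water and on land.", [0.9, 0.1, 0.0]),
    ("Rock music emerged in the 1950s.", [0.0, 0.2, 0.9]),
]

hits = retrieve([1.0, 0.0, 0.0], store)
print(hits)
```

If no chunk reaches the threshold, the result is an empty list, which is why a threshold that is too strict can leave the model with no context at all.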

In [19]:
# Now let's see why adding metadata to RAG is important

data_folder = r"data"
persist_directory = r"db/chroma_db_with_metadata"

# No need to create if already exists
if not os.path.exists(persist_directory):
    print("--- Initializing Vector Store ---")

    if not os.path.exists(data_folder):
        raise FileNotFoundError(f"Folder not found at {data_folder}")
    
    files = [f for f in os.listdir(data_folder) if f.endswith(".txt")]
    documents = []

    for file in files:
        file_path = os.path.join(data_folder,file)
        loader = TextLoader(file_path)
        text_docs = loader.load()
        for doc in text_docs:
            doc.metadata = {"Source":file} # add metadata for the document
            documents.append(doc)

    text_splitter = CharacterTextSplitter(chunk_size=500,chunk_overlap=50)
    docs = text_splitter.split_documents(documents)

    print(f"Number of chunks: {len(docs)}")

    # create embeddings, initialize the vector store
    embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
    db = Chroma.from_documents(docs, embeddings, persist_directory=persist_directory)

    print("Embeddings ready, vector store initialized.")
else: print("Vector store already exists.")

Vector store already exists.


In [20]:
db = Chroma(persist_directory=persist_directory,embedding_function=embeddings)
query = "Give me a two-liner on hip-hop music"

retriever = db.as_retriever(
    search_type="similarity_score_threshold",
    search_kwargs={"k":1,"score_threshold":0.2}
)
relevant_docs = retriever.invoke(query)

print("Relevant docs:\n")
for i,doc in enumerate(relevant_docs,1):
    print(f"Document: {i}\n{doc.page_content}\n")
    if doc.metadata:
        print("Source: ",doc.metadata["Source"])

Relevant docs:

Document: 1
### **3. Rock and Pop**
- Rock emerged in the 1950s, evolving into subgenres like punk, metal, and alternative rock.
- Pop music is characterized by catchy melodies and broad appeal.
- Icons: The Beatles, Michael Jackson, Madonna.

### **4. Hip-Hop and Rap**
- Originated in the 1970s as a voice for social issues and storytelling.
- Features rhythmic beats, poetry, and rap lyrics.
- Influencers: Tupac Shakur, The Notorious B.I.G., Kendrick Lamar.

Source:  music.txt


Here, we can see "Source: music.txt", which tells us which document the retrieved chunk came from.

**There are many kinds of text splitters, each suited to a different situation**

- `CharacterTextSplitter` - The usual one; splits on a single separator and merges the pieces into chunks of up to `chunk_size` characters, regardless of context
- `SentenceTransformersTokenTextSplitter` - Splits by token count using a sentence-transformers tokenizer, so that chunks fit the embedding model's input window
- `TokenTextSplitter` - Used when there are strict token limits
- `RecursiveCharacterTextSplitter` - Tries progressively smaller separators (paragraphs, then lines, then words), balancing semantic coherence against the chunk size limit

This is how you can subclass `TextSplitter` to create a custom text splitter:

```python
from langchain.text_splitter import TextSplitter

class MyTextSplitter(TextSplitter):
    def split_text(self, text):
        # text split logic
        return text.split("\n\n") # split by paragraphs
```
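The `chunk_size` and `chunk_overlap` parameters used earlier are easier to picture with a tiny example. The sketch below cuts fixed-size character chunks where each chunk repeats the tail of the previous one. It is a deliberate simplification with a made-up `chunk_text` function; `CharacterTextSplitter` splits on a separator first and then merges pieces, so its output will differ.

```python
def chunk_text(text, chunk_size, chunk_overlap):
    """Fixed-size character chunks; each chunk repeats the tail of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = chunk_text("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=3)
for chunk in chunks:
    print(chunk)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.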

Similarly, even for Embeddings, we can use different OpenAI embedding models, or even use `HuggingFaceEmbeddings` and so on...

*Now let's see how to get a proper response from the model, like having a conversation*

In [21]:
db = Chroma(persist_directory=persist_directory,embedding_function=embeddings)
query = "Give me a two-liner on hip-hop music"

retriever = db.as_retriever(
    search_type="similarity_score_threshold",
    search_kwargs={"k":1,"score_threshold":0.2}
)
relevant_docs = retriever.invoke(query)

print("Relevant docs:\n")
for i,doc in enumerate(relevant_docs,1):
    print(f"Document: {i}\n{doc.page_content}\n")
    if doc.metadata:
        print("Source: ",doc.metadata["Source"])

# join the documents first: a backslash inside an f-string expression is a SyntaxError before Python 3.12
docs_text = "\n\n".join(doc.page_content for doc in relevant_docs)

combined_input = (
    f"""
    Here are some documents that will help you answer the user queries:

    Documents:
    {docs_text}

    Provide answer only based on the documents provided. If the answer is not available, say that you don't know. Don't make up answers.
    """
)

messages = [
    ("system",combined_input),
    ("human",query)
]

result = model.invoke(messages) # you can create a prompt template, add parsers, chain it together, blah blah blah.. but this is enough
print("\n\n",result.content)

Relevant docs:

Document: 1
### **3. Rock and Pop**
- Rock emerged in the 1950s, evolving into subgenres like punk, metal, and alternative rock.
- Pop music is characterized by catchy melodies and broad appeal.
- Icons: The Beatles, Michael Jackson, Madonna.

### **4. Hip-Hop and Rap**
- Originated in the 1970s as a voice for social issues and storytelling.
- Features rhythmic beats, poetry, and rap lyrics.
- Influencers: Tupac Shakur, The Notorious B.I.G., Kendrick Lamar.

Source:  music.txt


 Hip-hop originated in the 1970s as a voice for social issues and storytelling, featuring rhythmic beats and rap lyrics. Influencers include Tupac Shakur, The Notorious B.I.G., and Kendrick Lamar.


That was a "one-off question": each query is answered independently, with no memory of earlier ones. Now let's have proper conversations.

In [22]:
from langchain.chains.combine_documents.stuff import create_stuff_documents_chain
from langchain.chains.retrieval import create_retrieval_chain
from langchain.chains import create_history_aware_retriever

retriever = db.as_retriever(
    search_type="similarity",
    search_kwargs={"k":1}
) # we're using the same "chroma_db_with_metadata" vector store db

context_prompt = """
    Use the chat history to contextualize the user prompt as a standalone question WITHOUT ANSWERING THE QUESTION!
"""

context_prompt_template = ChatPromptTemplate.from_messages([
    ("system",context_prompt),
    MessagesPlaceholder("chat_history"),
    ("human","{input}") # usually can be 'user_input' or 'user_query', but has to be 'input' here for history_aware_retriever
])

system_prompt = """
    Use the given context to answer the user query. If the answer is not available, say that you don't know.
    {context}

    Keep the answer short and simple.
"""

prompt_template = ChatPromptTemplate.from_messages([
    ("system",system_prompt),
    MessagesPlaceholder("chat_history"),
    ("human","{input}") # usually can be 'user_input' or 'user_query', but has to be 'input' here for history_aware_retriever
])

history_aware_retriever = create_history_aware_retriever(model, retriever, context_prompt_template)

qa_chain = create_stuff_documents_chain(model, prompt_template)
rag_chain = create_retrieval_chain(history_aware_retriever, qa_chain)

In [23]:
def start_conversation():
    print("Start chatting! Type 'exit'/'quit' to end the conversation.\n\n")
    
    chat_history = []

    while True:
        query  = input()
        print(f"You: {query}")
        if query.lower() in ("exit","quit"): break
        result = rag_chain.invoke({
            "input":query,
            "chat_history":chat_history
        })
        print(f"AI: {result['answer']}")

        chat_history.append(HumanMessage(content=query))
        chat_history.append(AIMessage(content=result['answer']))

start_conversation()

Start chatting! Type 'exit'/'quit' to end the conversation.


You: tell me about reptiles
AI: Reptiles are cold-blooded animals with scales or bony plates. Most of them lay eggs, although some give birth to live young. Examples include snakes, lizards, turtles, and crocodiles.
You: what class do they belong to
AI: Reptiles belong to the class Reptilia.
You: exit


**Web-Scraping RAG**

In [24]:
from langchain_community.document_loaders import FireCrawlLoader

def create_vector_store():
    """Use web content to create a vector store"""

    persist_directory = r"db/chroma_db_web"

    if not os.path.exists(persist_directory):
        print("--- Initializing Vector Store ---")
        # don't use any private/personal/protected sites here - I've used a public sandbox site!!
        loader = FireCrawlLoader(url="https://www.scrapethissite.com/pages/simple/", mode="scrape")
        documents = loader.load()

        print(f"Number of documents: {len(documents)}")

        # Convert metadata into strings (web metadata can be lists)
        for doc in documents:
            for k,v in doc.metadata.items():
                if isinstance(v, list):
                    doc.metadata[k] = ", ".join(map(str,v))

        text_splitter = CharacterTextSplitter(chunk_size=500,chunk_overlap=0)
        docs = text_splitter.split_documents(documents)

        print(f"Number of document chunks: {len(docs)}")

        embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
        db = Chroma.from_documents(docs, embeddings, persist_directory=persist_directory)

        print("Embeddings ready, vector store initialized.")
    else: print("Vector store already exists")

create_vector_store()

Vector store already exists


In [25]:
def query_vector_store(query):
    db = Chroma(persist_directory=r"db/chroma_db_web",embedding_function=embeddings)
    retriever = db.as_retriever(
        search_type="similarity",
        search_kwargs={"k":1}
    )

    relevant_docs = retriever.invoke(query)

    print("Relevant docs:\n")
    for i,doc in enumerate(relevant_docs,1):
        print(f"Document: {i}\n{doc.page_content}\n")

query_vector_store("What is the population of India?")

Relevant docs:

Document: 1
**Capital:** Douglas

**Population:** 75049

**Area (km2):** 572.0

### India

**Capital:** New Delhi

**Population:** 1173108018

**Area (km2):** 3287590.0

### British Indian Ocean Territory

**Capital:** None

**Population:** 4000

**Area (km2):** 60.0

### Iraq

**Capital:** Baghdad

**Population:** 29671605

**Area (km2):** 437072.0

### Iran

**Capital:** Tehran

**Population:** 76923300

**Area (km2):** 1648000.0

### Iceland

**Capital:** Reykjavik

**Population:** 308910



Here, we see that the relevant chunk has been retrieved: it contains the entry for India.

### 5. Agents and Tools

- Agent and Tool Basics
- ReAct Prompts
- Tool Decorator

**Agent Basics**

In [38]:
from langchain import hub
from langchain.agents import AgentExecutor, create_react_agent, create_structured_chat_agent, create_tool_calling_agent
from langchain_core.tools import StructuredTool, tool
from pydantic import BaseModel
from typing import Any

In [27]:
# create a function that can be used as a tool
@tool
def get_current_time():
    """Returns current time in HH:MM:SS format""" # docstrings needed
    from datetime import datetime
    return datetime.now().strftime("%H:%M:%S")

# set list of tools
tools = [get_current_time]

# react prompt
prompt = hub.pull("hwchase17/react")

# let's create our react agent
agent = create_react_agent(
    llm=model,
    tools=tools,
    prompt=prompt,
    stop_sequence=True
)



In [28]:
# we need an agent executor to run the agent, combine with tools
agent_executor = AgentExecutor.from_agent_and_tools(
    agent=agent,
    tools=tools,
    verbose=True
)

response = agent_executor.invoke({"input":"What is the current time?"})
print(response)



> Entering new AgentExecutor chain...
I need to find out the current time.
Action: get_current_time
Action Input: None
19:32:13
I now know the final answer
Final Answer: The current time is 19:32:13.

> Finished chain.
{'input': 'What is the current time?', 'output': 'The current time is 19:32:13.'}


So now we know how to create a simple agent, give it a tool, and use "ReAct prompting" to invoke the model and get a response.
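The trace above follows a loop: the model writes a thought and an action, the executor runs the tool, and the observation is fed back until a final answer appears. Here is a minimal sketch of that loop with a scripted stand-in for the model. Everything in it (`fake_llm`, `run_agent`, the fixed time string) is invented for illustration and is not how `AgentExecutor` is implemented.

```python
import re

def get_current_time(_):
    return "19:32:13"  # fixed value so the example is reproducible

TOOLS = {"get_current_time": get_current_time}

# scripted stand-in for the LLM: first reply requests a tool, second gives the answer
SCRIPT = iter([
    "I need to find out the current time.\nAction: get_current_time\nAction Input: None",
    "I now know the final answer\nFinal Answer: The current time is 19:32:13.",
])

def fake_llm(scratchpad):
    return next(SCRIPT)

def run_agent(question, max_steps=3):
    scratchpad = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = fake_llm(scratchpad)
        final = re.search(r"Final Answer:\s*(.*)", reply)
        if final:
            return final.group(1).strip()
        action = re.search(r"Action:\s*(.*)", reply).group(1).strip()
        action_input = re.search(r"Action Input:\s*(.*)", reply).group(1).strip()
        observation = TOOLS[action](action_input)
        # feed the tool result back so the next LLM call can see it
        scratchpad += f"{reply}\nObservation: {observation}\n"
    return "Stopped: step limit reached"

answer = run_agent("What is the current time?")
print(answer)
```

The `max_steps` limit matters in practice: without it, a model that never produces a final answer would loop forever.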

Next we'll add a few more tools and create a ReAct Chat conversation agent.

In [29]:
# NOTE: the @tool decorator turns the function into a 'StructuredTool' object

# let's create a class for our arguments (args_schema) and create our own StructuredTools
class DefaultQuerySchema(BaseModel):
    query: str

def search_wikipedia(query):
    """Returns content from wikipedia""" 
    from wikipedia import summary
    try: return summary(query, sentences=2)
    except Exception: return "Couldn't find any information on that." # a bare `except:` would also swallow KeyboardInterrupt

search_wiki_tool = StructuredTool.from_function(
    name="Wiki",
    func=search_wikipedia,
    description="Returns Wikipedia content",
    args_schema=DefaultQuerySchema
)

def calculate_expression(query):
    """Calculates the expression"""
    # WARNING: eval() executes arbitrary Python; only acceptable for a local demo with trusted input
    return str(eval(query))

calculate_tool = StructuredTool.from_function(
    name="Calc",
    func=calculate_expression,
    description="Returns evaluated result",
    args_schema=DefaultQuerySchema
)

tools = [get_current_time, search_wiki_tool, calculate_tool]

# react prompt for chat agent
prompt = hub.pull("hwchase17/structured-chat-agent")

# we need chat history memory for the agent
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

# let's create our structured chat agent
agent = create_structured_chat_agent(
    llm=model,
    tools=tools,
    prompt=prompt
)
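
The `calculate_expression` function above hands the model-supplied string straight to `eval`, which will run any Python expression it is given. One possible alternative is to walk the parsed syntax tree and allow only arithmetic. The sketch below is an assumption-laden illustration, not part of the original notebook: `safe_calculate` is a hypothetical name, and it supports only `+`, `-`, `*`, `/` and unary signs on numbers.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculate(expression: str) -> str:
    """Evaluate a basic arithmetic expression without calling eval()."""

    def _eval(node):
        # Plain numbers (bool is excluded even though it subclasses int)
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError("Unsupported expression")

    try:
        tree = ast.parse(expression, mode="eval")
        return str(_eval(tree.body))
    except (SyntaxError, ValueError, ZeroDivisionError) as exc:
        return f"Could not evaluate expression: {exc}"


print(safe_calculate("10+20"))
print(safe_calculate("__import__('os').getcwd()"))
```

If this fits your use case, it could be passed as `func=safe_calculate` to `StructuredTool.from_function` in place of `calculate_expression`.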



In [30]:
agent_executor = AgentExecutor.from_agent_and_tools(
    agent=agent,
    tools=tools,
    verbose=True,
    memory=memory,
    handle_parsing_errors=True
)

initial_system_message = """
    You are a helpful AI assistant. Use ONLY the tools to provide answers. 
    For tools that require a query input, provide ONLY a string for the 'query' parameter.
    Do not use dictionaries with 'title' or other keys as input.
    Run the tool not more than thrice to decide on an output.
    If the tools are not capable of providing necessary responses, then say that you can't provide a response.
"""
memory.chat_memory.add_message(SystemMessage(content=initial_system_message))

In [31]:
# Begin chatting!
while True:
    user_input = input()
    if user_input.lower() in ("exit", "quit"):
        break
    print(f"You: {user_input}")
    # the executor's memory saves each human/AI turn itself, so adding them manually would store duplicates
    response = agent_executor.invoke({"input": user_input})
    # single quotes inside the f-string keep this valid on Python versions before 3.12
    print(f"AI: {response['output']}")

You: one line about john cena


> Entering new AgentExecutor chain...
```
{
  "action": "Wiki",
  "action_input": "John Cena"
}
```
Couldn't find any information on that.
```
{
  "action": "Final Answer",
  "action_input": "John Cena is an American professional wrestler, actor, and television presenter known for his time in WWE."
}
```

> Finished chain.
AI: John Cena is an American professional wrestler, actor, and television presenter known for his time in WWE.
You: give me the exact current time


> Entering new AgentExecutor chain...
```
{
  "action": "get_current_time",
  "action_input": {}
}
```
19:32:59
```
{
  "action": "Final Answer",
  "action_input": "The current time is 19:32:59."
}
```

> Finished chain.
AI: The current time is 19:32:59.
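
Unlike the plain-text ReAct format, the structured chat agent in the traces above emits each step as a fenced JSON blob with an `action` and an `action_input`. As a rough sketch of how such a blob could be extracted (the function name `parse_json_action` is hypothetical, and this is not LangChain's own output parser):

```python
import json
import re

# Built programmatically so this example can itself sit inside a fenced block
FENCE = "`" * 3


def parse_json_action(text: str) -> tuple:
    """Extract (action, action_input) from a fenced JSON blob (illustrative only)."""
    pattern = re.escape(FENCE) + r"(?:json)?\s*(\{.*?\})\s*" + re.escape(FENCE)
    match = re.search(pattern, text, re.DOTALL)
    if match is None:
        raise ValueError("No JSON action blob found")
    blob = json.loads(match.group(1))
    return blob["action"], blob.get("action_input")


wiki_step = FENCE + '\n{\n  "action": "Wiki",\n  "action_input": "John Cena"\n}\n' + FENCE
time_step = FENCE + '\n{\n  "action": "get_current_time",\n  "action_input": {}\n}\n' + FENCE

print(parse_json_action(wiki_step))
print(parse_json_action(time_step))
```

Because `action_input` is JSON, it can carry an object with several fields, which is what lets this agent style work with multi-argument tools.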


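Great, it works! Note that in the first exchange the Wiki tool returned its fallback message, so the model answered from its own knowledge rather than from the tool output.
Now let's combine what we've learned and create a RAG agent.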

**Creating a RAG Agent**

In [32]:
# for the vector store, we'll be using the "chroma_db_with_metadata" which we already have
db = Chroma(persist_directory=r"db/chroma_db_with_metadata", embedding_function=embeddings)
retriever = db.as_retriever(
    search_type="similarity",
    search_kwargs={"k":3}
)

# we'll also use the same context and system prompts, history aware retrievers, etc.
context_prompt = """
    Use the chat history to contextualize the user prompt as a standalone question WITHOUT ANSWERING THE QUESTION!
"""

context_prompt_template = ChatPromptTemplate.from_messages([
    ("system",context_prompt),
    MessagesPlaceholder("chat_history"),
    ("human","{input}") # usually can be 'user_input' or 'user_query', but has to be 'input' here for history_aware_retriever
])

system_prompt = """
    Use the given context to answer the user query. If the answer is not available, say that you don't know.
    {context}

    Keep the answer short and simple.
"""

prompt_template = ChatPromptTemplate.from_messages([
    ("system",system_prompt),
    MessagesPlaceholder("chat_history"),
    ("human","{input}") # usually can be 'user_input' or 'user_query', but has to be 'input' here for history_aware_retriever
])

history_aware_retriever = create_history_aware_retriever(model, retriever, context_prompt_template)

qa_chain = create_stuff_documents_chain(model, prompt_template)
rag_chain = create_retrieval_chain(history_aware_retriever, qa_chain)

# this will be a ReAct RAG Agent, so let's get our prompt
react_prompt = hub.pull("hwchase17/react")
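
The retriever above is configured with `search_type="similarity"` and `k=3`, meaning it returns the three stored chunks most similar to the query. The toy sketch below illustrates that ranking idea only. It makes loud assumptions: word counts stand in for real embeddings, cosine similarity is assumed as the metric, and `embed`, `cosine`, `top_k`, and the sample documents are all hypothetical rather than anything from Chroma or LangChain.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, documents: list, k: int = 3) -> list:
    """Return the k documents most similar to the query, best match first."""
    query_vec = embed(query)
    ranked = sorted(documents, key=lambda doc: cosine(query_vec, embed(doc)), reverse=True)
    return ranked[:k]


# Hypothetical sample chunks
docs = [
    "birds belong to the class aves",
    "reptiles are cold-blooded and most lay eggs",
    "mammals feed milk to their young",
    "fish breathe through gills",
]

results = top_k("do reptiles lay eggs", docs, k=3)
print(results[0])
```

A real vector store replaces the word counts with learned embeddings, so chunks can match on meaning even when they share no words with the query.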



In [34]:
# Now let's create the functions and tools necessary

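def qafn(query: str, **kwargs) -> str:
    """Runs the RAG chain on the query and returns only the answer text."""
    # 'query' matches the field in DefaultQuerySchema and avoids shadowing the built-in input()
    result = rag_chain.invoke({
        "input": query,
        "chat_history": kwargs.get("chat_history", [])
    })
    return result["answer"]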

qa_tool = StructuredTool.from_function(
    name="QA",
    func=qafn,
    description="Useful when questions have to be answered only from the given context",
    args_schema=DefaultQuerySchema # the same args_schema as before
)

tools = [qa_tool]

rag_agent = create_react_agent(
    llm=model,
    tools=tools,
    prompt=react_prompt
)

rag_agent_executor = AgentExecutor.from_agent_and_tools(
    agent=rag_agent,
    tools=tools,
    handle_parsing_errors=True
)

In [37]:
# Let's now talk with the RAG Agent

chat_history = []
while True:
    query = input()
    print(f"You: {query}")
    if query.lower() in ("exit","quit"): break
    response = rag_agent_executor.invoke({"input":query,"chat_history":chat_history})
    print(f"AI: {response['output']}")
    chat_history.append(HumanMessage(content=query))
    chat_history.append(AIMessage(content=response['output']))

You: tell me one line about reptiles
AI: Reptiles are cold-blooded animals with scales or bony plates, most of which lay eggs.
You: what class do birds belong to?
AI: Birds belong to the class Aves.
You: what is the recent film that was released?
AI: I don't know.
You: quit


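We now have an agent that performs RAG tasks for us. Finally, we'll see how tool-calling agents are built: instead of parsing ReAct-style text, they rely on the model's native tool-calling interface to choose a tool and its arguments.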

In [39]:
# We'll use the same 2 tools we had created already - the current time one, and the calculator one

tools = [get_current_time, calculate_tool]
tc_prompt = hub.pull("hwchase17/openai-tools-agent")

tc_agent = create_tool_calling_agent(
    llm=model,
    tools=tools,
    prompt=tc_prompt
)

tc_agent_executor = AgentExecutor.from_agent_and_tools(
    agent=tc_agent,
    tools=tools,
    verbose=True,
    handle_parsing_errors=True
)



In [None]:
# Let's test it out now! 
while True:
    query = input()
    print(f"You: {query}")
    if query.lower() in ("exit","quit"): break
    response = tc_agent_executor.invoke({"input":query})
    print(f"AI: {str(response['output'])}")

You: what is 10+20


> Entering new AgentExecutor chain...

Invoking: `Calc` with `{'query': '10+20'}`


30
The result of \( 10 + 20 \) is \( 30 \).

> Finished chain.
AI: The result of \( 10 + 20 \) is \( 30 \).
You: what is the time right now?


> Entering new AgentExecutor chain...

Invoking: `get_current_time` with `{}`


20:00:00
The current time is 20:00:00.

> Finished chain.
AI: The current time is 20:00:00.
You: exit
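
The `Invoking: ... with {...}` lines in the traces above show the shape of a tool call: a tool name plus a dictionary of arguments. Once the model has produced that structure, running it amounts to a lookup and a function call. The sketch below illustrates that dispatch step under stated assumptions: `TOOL_REGISTRY`, `dispatch`, `add_numbers`, and the `{"name": ..., "args": ...}` dictionary shape are hypothetical stand-ins, not LangChain's internals.

```python
from datetime import datetime


def get_current_time() -> str:
    """Stand-in for the notebook's time tool."""
    return datetime.now().strftime("%H:%M:%S")


def add_numbers(a: float, b: float) -> str:
    """Hypothetical two-argument tool, to show structured arguments."""
    return str(a + b)


# Map tool names to callables; the executor looks tools up by name
TOOL_REGISTRY = {
    "get_current_time": get_current_time,
    "add_numbers": add_numbers,
}


def dispatch(tool_call: dict) -> str:
    """Run the tool named in a structured call with its keyword arguments."""
    name = tool_call["name"]
    if name not in TOOL_REGISTRY:
        return f"Unknown tool: {name}"
    return TOOL_REGISTRY[name](**tool_call.get("args", {}))


print(dispatch({"name": "add_numbers", "args": {"a": 10, "b": 20}}))
print(dispatch({"name": "get_current_time", "args": {}}))
```

Since the arguments arrive as structured data, there is no free-text parsing step that can fail, which is the main practical difference from the ReAct and structured chat agents shown earlier.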
