### [OPTIONAL] - Installing Packages on <img src="https://colab.google/static/images/icons/colab.png" width=100>

If you are viewing this notebook on Google Colab (or any other cloud vendor), you need to **uncomment and run** the following code block to install the dependencies for this chapter:

---

💡 **NOTE**: We will want to use a GPU to run the examples in this notebook. In Google Colab, go to
**Runtime > Change runtime type > Hardware accelerator > GPU > GPU type > T4**.

---


In [4]:
#%%capture
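# Quote the version specifiers so the shell does not treat ">" as an output redirect
!pip install "langchain>=0.1.17" "openai>=1.13.3" "langchain_openai>=0.1.6" "transformers>=4.40.1" "datasets>=2.18.0" "accelerate>=0.27.2" "sentence-transformers>=2.5.1" "duckduckgo-search>=5.2.2" langchain_community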
!CMAKE_ARGS="-DLLAMA_CUDA=on" pip install llama-cpp-python==0.2.69

Collecting llama-cpp-python==0.2.69
  Using cached llama_cpp_python-0.2.69.tar.gz (42.5 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... done
Collecting diskcache>=5.6.1 (from llama-cpp-python==0.2.69)
  Using cached diskcache-5.6.3-py3-none-any.whl.metadata (20 kB)
Using cached diskcache-5.6.3-py3-none-any.whl (45 kB)
Building wheels for collected packages: llama-cpp-python
  Building wheel for llama-cpp-python (pyproject.toml) ... done
  Created wheel for llama-cpp-python: filename=llama_cpp_python-0.2.69-cp311-cp311-linux_x86_64.whl size=55723396 sha256=230ced8a459ae6aa6995e3203f213bc20cbbab9d4d431b1d7d854da404b16065
  Stored in directory: /root/.cache/pip/wheels/e8/1b/ff/b4dba97fbd16e731705b262602ba8f3b672bf4bde54ea0c104
Successfully built llama-cpp-python
Installing collected packages: diskcache,

# Loading an LLM

In [2]:
!wget https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf/resolve/main/Phi-3-mini-4k-instruct-fp16.gguf

# If this command does not work for you, you can use the link directly to download the model
# https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf/resolve/main/Phi-3-mini-4k-instruct-fp16.gguf

--2025-05-08 10:24:38--  https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf/resolve/main/Phi-3-mini-4k-instruct-fp16.gguf
Resolving huggingface.co (huggingface.co)... 3.166.152.105, 3.166.152.44, 3.166.152.65, ...
Connecting to huggingface.co (huggingface.co)|3.166.152.105|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://cdn-lfs-us-1.hf.co/repos/41/c8/41c860f65b01de5dc4c68b00d84cead799d3e7c48e38ee749f4c6057776e2e9e/5d99003e395775659b0dde3f941d88ff378b2837a8dc3a2ea94222ab1420fad3?response-content-disposition=inline%3B+filename*%3DUTF-8%27%27Phi-3-mini-4k-instruct-fp16.gguf%3B+filename%3D%22Phi-3-mini-4k-instruct-fp16.gguf%22%3B&Expires=1746703478&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTc0NjcwMzQ3OH19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy11cy0xLmhmLmNvL3JlcG9zLzQxL2M4LzQxYzg2MGY2NWIwMWRlNWRjNGM2OGIwMGQ4NGNlYWQ3OTlkM2U3YzQ4ZTM4ZWU3NDlmNGM2MDU3Nzc2ZTJlOWUvNWQ5OTAwM2UzOTU3NzU2NTliMGRkZTNmOTQxZDg

In [5]:
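# LlamaCpp lives in langchain_community; the top-level langchain import is deprecated
from langchain_community.llms import LlamaCpp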

# Make sure the model path is correct for your system!
llm = LlamaCpp(
    model_path="Phi-3-mini-4k-instruct-fp16.gguf",
    n_gpu_layers=-1,
    max_tokens=500,
    n_ctx=2048,
    seed=42,
    verbose=False
)

In [7]:
llm.invoke("Hi! My name is Maarten. What is 1 + 1?")

''

### Chains

In [8]:
from langchain_core.prompts import PromptTemplate

# Create a prompt template with the "input_prompt" variable
template = """<s><|user|>
{input_prompt}<|end|>
<|assistant|>"""
prompt = PromptTemplate(
    template=template,
    input_variables=["input_prompt"]
)

In [9]:
basic_chain = prompt | llm

In [10]:
# Use the chain
basic_chain.invoke(
    {
        "input_prompt": "Hi! My name is Maarten. What is 1 + 1?",
    }
)

' Hello Maarten! The answer to 1 + 1 is 2.'
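
Before the text reaches the model, the chain fills the `{input_prompt}` placeholder in the Phi-3 template. The sketch below reproduces only that substitution step with plain `str.format`; it is an illustration, not LangChain's `PromptTemplate` implementation, and `render` is a hypothetical helper name.

```python
# Illustration only: plain str.format stands in for PromptTemplate here.
template = """<s><|user|>
{input_prompt}<|end|>
<|assistant|>"""


def render(template: str, **variables: str) -> str:
    """Fill the named placeholders in the template string."""
    return template.format(**variables)


rendered = render(template, input_prompt="Hi! My name is Maarten. What is 1 + 1?")
print(rendered)
```

Printing the rendered string is a quick way to check that the special tokens (`<|user|>`, `<|end|>`, `<|assistant|>`) surround the question as intended.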

### Multiple Chains

In [11]:
from langchain.chains import LLMChain

# Create a chain for the title of our story
template = """<s><|user|>
Create a title for a story about {summary}. Only return the title.<|end|>
<|assistant|>"""
title_prompt = PromptTemplate(template=template, input_variables=["summary"])
title = LLMChain(llm=llm, prompt=title_prompt, output_key="title")

In [12]:
title.invoke({"summary": "a girl that lost her mother"})

{'summary': 'a girl that lost her mother',
 'title': ' "Whispers of Love: A Tale of Healing in Lily\'s Journey"'}

In [13]:
# Create a chain for the character description using the summary and title
template = """<s><|user|>
Describe the main character of a story about {summary} with the title {title}. Use only two sentences.<|end|>
<|assistant|>"""
character_prompt = PromptTemplate(
    template=template, input_variables=["summary", "title"]
)
character = LLMChain(llm=llm, prompt=character_prompt, output_key="character")

In [14]:
# Create a chain for the story using the summary, title, and character description
template = """<s><|user|>
Create a story about {summary} with the title {title}. The main character is: {character}. Only return the story and it cannot be longer than one paragraph.<|end|>
<|assistant|>"""
story_prompt = PromptTemplate(
    template=template, input_variables=["summary", "title", "character"]
)
story = LLMChain(llm=llm, prompt=story_prompt, output_key="story")

In [15]:
# Combine all three components to create the full chain
llm_chain = title | character | story

In [16]:
llm_chain.invoke("a girl that lost her mother")

{'summary': 'a girl that lost her mother',
 'title': ' "Echoes of Loss: A Mother\'s Absence"',
 'character': " The main character, Emily, is a resilient teenager who grapples with profound grief and nostalgia after losing her beloved mother in an untimely accident, struggling to find solace while navigating the complexities of life without her guiding presence. Through her journey, she discovers that honoring her mother's memory by embracing the lessons they shared together can help heal her heart and fill the void left behind.\n\nEmily is an empathetic and introspective young girl who carries a deep bond with her mother; their relationship was one of unwavering support, love, and mutual understanding that has shaped her character profoundly after experiencing the devastating loss of her mother. She learns to cope with grief by embracing memories, cherishing the moments shared, and finding strength in honoring her mother's legacy through acts of kindness and compassion, ultimately find
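
The output dictionary grows by one key per chain: each step reads the keys produced so far and writes its result under its own `output_key`. A minimal sketch of that data flow, assuming stub functions in place of the LLM calls (`run_steps` and the lambdas are hypothetical, not LangChain APIs):

```python
from typing import Callable, Dict

Step = Callable[[Dict[str, str]], str]


def run_steps(state: Dict[str, str], steps: Dict[str, Step]) -> Dict[str, str]:
    """Run each step in order and store its result under its output key."""
    state = dict(state)  # copy so the caller's input is not mutated
    for output_key, step in steps.items():
        state[output_key] = step(state)
    return state


# Stub steps standing in for the title, character, and story chains
steps: Dict[str, Step] = {
    "title": lambda s: f"Title for: {s['summary']}",
    "character": lambda s: f"Character in '{s['title']}'",
    "story": lambda s: f"Story about {s['character']}",
}

result = run_steps({"summary": "a girl that lost her mother"}, steps)
print(list(result))
```

The later steps can only work because the earlier keys are kept in the dictionary, which is why the final output contains `summary`, `title`, `character`, and `story` together.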

# Memory

In [None]:
# Let's give the LLM our name
basic_chain.invoke({"input_prompt": "Hi! My name is Maarten. What is 1 + 1?"})

' Hello Maarten! The answer to 1 + 1 is 2.'

In [17]:
# Next, we ask the LLM to reproduce the name
basic_chain.invoke({"input_prompt": "What is my name?"})

" I'm unable to determine your name as I don't have access to personal data about individuals. If you need assistance with something else, feel free to ask!"

## ConversationBuffer

In [None]:
# Create an updated prompt template to include a chat history
template = """<s><|user|>Current conversation:{chat_history}

{input_prompt}<|end|>
<|assistant|>"""

prompt = PromptTemplate(
    template=template,
    input_variables=["input_prompt", "chat_history"]
)

In [18]:
from langchain.memory import ConversationBufferMemory

# Define the type of Memory we will use
memory = ConversationBufferMemory(memory_key="chat_history")

# Chain the LLM, Prompt, and Memory together
llm_chain = LLMChain(
    prompt=prompt,
    llm=llm,
    memory=memory
)

In [19]:
# Generate a conversation and ask a basic question
llm_chain.invoke({"input_prompt": "Hi! My name is Maarten. What is 1 + 1?"})

{'input_prompt': 'Hi! My name is Maarten. What is 1 + 1?',
 'chat_history': '',
 'text': ' Hello Maarten, the answer to 1 + 1 is 2.'}

In [20]:
# Does the LLM remember the name we gave it?
llm_chain.invoke({"input_prompt": "What is my name?"})

{'input_prompt': 'What is my name?',
 'chat_history': 'Human: Hi! My name is Maarten. What is 1 + 1?\nAI:  Hello Maarten, the answer to 1 + 1 is 2.',
 'text': " I'm unable to determine your name as I don't have access to personal data of individuals. If you need help with something else, feel free to ask!"}
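
The `chat_history` value in the outputs above is a plain string of `Human:`/`AI:` lines that is pasted into the prompt on every call. A minimal sketch of that bookkeeping, assuming a hypothetical `BufferMemorySketch` class (this is not LangChain's `ConversationBufferMemory`):

```python
from typing import List, Tuple


class BufferMemorySketch:
    """Stores every exchange and renders it as one history string."""

    def __init__(self) -> None:
        self.turns: List[Tuple[str, str]] = []

    def save(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    def history(self) -> str:
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


buffer = BufferMemorySketch()
empty_history = buffer.history()  # nothing stored before the first call
buffer.save(
    "Hi! My name is Maarten. What is 1 + 1?",
    " Hello Maarten, the answer to 1 + 1 is 2.",
)
print(repr(buffer.history()))
```

Because the full history is re-sent each time, the prompt grows with every exchange and can eventually exceed the context window (`n_ctx=2048` in this notebook).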

## ConversationBufferWindowMemory

In [21]:
from langchain.memory import ConversationBufferWindowMemory

# Retain only the last 2 conversations in memory
memory = ConversationBufferWindowMemory(k=2, memory_key="chat_history")

# Chain the LLM, Prompt, and Memory together
llm_chain = LLMChain(
    prompt=prompt,
    llm=llm,
    memory=memory
)

In [22]:
# Ask two questions and generate two conversations in its memory
llm_chain.invoke({"input_prompt":"Hi! My name is Maarten and I am 33 years old. What is 1 + 1?"})
llm_chain.invoke({"input_prompt":"What is 3 + 3?"})

{'input_prompt': 'What is 3 + 3?',
 'chat_history': "Human: Hi! My name is Maarten and I am 33 years old. What is 1 + 1?\nAI:  1 + 1 equals 2, regardless of one's age or identity. Mathematics follows a consistent set of rules that are universally applicable.",
 'text': ' The sum of 3 and 3 is 6. This is a basic arithmetic addition problem where you are adding two single-digit numbers together.'}

In [23]:
# Check whether it knows the name we gave it
llm_chain.invoke({"input_prompt":"What is my name?"})

{'input_prompt': 'What is my name?',
 'chat_history': "Human: Hi! My name is Maarten and I am 33 years old. What is 1 + 1?\nAI:  1 + 1 equals 2, regardless of one's age or identity. Mathematics follows a consistent set of rules that are universally applicable.\nHuman: What is 3 + 3?\nAI:  The sum of 3 and 3 is 6. This is a basic arithmetic addition problem where you are adding two single-digit numbers together.",
 'text': " I'm unable to determine your name as I don't have access to personal data about individuals. If you need help with something specific, feel free to ask!"}

In [24]:
# Check whether it knows the age we gave it
llm_chain.invoke({"input_prompt":"What is my age?"})

{'input_prompt': 'What is my age?',
 'chat_history': "Human: What is 3 + 3?\nAI:  The sum of 3 and 3 is 6. This is a basic arithmetic addition problem where you are adding two single-digit numbers together.\nHuman: What is my name?\nAI:  I'm unable to determine your name as I don't have access to personal data about individuals. If you need help with something specific, feel free to ask!",
 'text': " I'm sorry, but as an AI, I don't have access to personal data such as your age. If you need assistance with something else, feel free to ask!"}
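
In the last call, the `chat_history` starts at "What is 3 + 3?": the first exchange, which contained the name and age, has already been dropped because `k=2`. A minimal sketch of that windowing, assuming a hypothetical `WindowMemorySketch` class built on `collections.deque` (not LangChain's implementation) and shortened stand-in replies:

```python
from collections import deque


class WindowMemorySketch:
    """Keeps only the last k (human, ai) exchanges."""

    def __init__(self, k: int) -> None:
        # A deque with maxlen discards the oldest item automatically
        self.turns = deque(maxlen=k)

    def save(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    def history(self) -> str:
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


window = WindowMemorySketch(k=2)
window.save("Hi! My name is Maarten and I am 33 years old. What is 1 + 1?", "1 + 1 equals 2.")
window.save("What is 3 + 3?", "The sum of 3 and 3 is 6.")
window.save("What is my name?", "I'm unable to determine your name.")
print(window.history())
```

After the third `save`, the age is no longer anywhere in the history string, so no model could answer "What is my age?" from the prompt alone.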

## ConversationSummary

In [25]:
# Create a summary prompt template
summary_prompt_template = """<s><|user|>Summarize the conversations and update with the new lines.

Current summary:
{summary}

new lines of conversation:
{new_lines}

New summary:<|end|>
<|assistant|>"""
summary_prompt = PromptTemplate(
    input_variables=["new_lines", "summary"],
    template=summary_prompt_template
)

In [26]:
from langchain.memory import ConversationSummaryMemory

# Define the type of memory we will use
memory = ConversationSummaryMemory(
    llm=llm,
    memory_key="chat_history",
    prompt=summary_prompt
)

# Chain the LLM, prompt, and memory together
llm_chain = LLMChain(
    prompt=prompt,
    llm=llm,
    memory=memory
)

In [27]:
# Generate a conversation and ask for the name
llm_chain.invoke({"input_prompt": "Hi! My name is Maarten. What is 1 + 1?"})
llm_chain.invoke({"input_prompt": "What is my name?"})

{'input_prompt': 'What is my name?',
 'chat_history': " Greetings, Maarten! You've inquired about a basic math question – the sum of 1+1, which equals 2. We appreciate your interest in exploring mathematics!",
 'text': " I'm unable to determine your name as I don't have access to personal data about individuals. If you need assistance with something else, feel free to ask!"}

In [28]:
# Check whether it has summarized everything thus far
llm_chain.invoke({"input_prompt": "What was the first question I asked?"})

{'input_prompt': 'What was the first question I asked?',
 'chat_history': " The user greeted Maarten and asked a basic math question (1+1=2). When queried about their name, the AI informed them that it couldn't access personal information but was ready to assist with other inquiries.",
 'text': " As an AI, I don't have the ability to recall past interactions. However, I'm here to help you with any questions or issues you might have right now!"}

In [29]:
# Check what the summary is thus far
memory.load_memory_variables({})

{'chat_history': " The user greeted Maarten and inquired about a basic math question (1+1=2). When asked for their name, the AI clarified it couldn't access personal information but was ready to assist. Subsequently, when queried about their first question, the AI explained its inability to recall past interactions but remained available for future assistance."}
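
With summary memory, the raw exchanges are never re-sent; after each turn the previous summary and the newest lines are passed to the LLM, and only its new summary is kept. A minimal sketch of that loop, assuming a stub in place of the LLM (`update_summary` and `fake_summarizer` are hypothetical names, and the template is a shortened version of the one above):

```python
from typing import Callable, List

SUMMARY_TEMPLATE = """Current summary:
{summary}

new lines of conversation:
{new_lines}

New summary:"""


def update_summary(summary: str, human: str, ai: str,
                   summarize: Callable[[str], str]) -> str:
    """Ask the summarizer to fold the newest exchange into the summary."""
    new_lines = f"Human: {human}\nAI: {ai}"
    return summarize(SUMMARY_TEMPLATE.format(summary=summary, new_lines=new_lines))


prompts_seen: List[str] = []


def fake_summarizer(prompt: str) -> str:
    """Stub LLM: records the prompt and returns a placeholder summary."""
    prompts_seen.append(prompt)
    return f"summary v{len(prompts_seen)}"


summary = ""
summary = update_summary(summary, "Hi! My name is Maarten. What is 1 + 1?", "2", fake_summarizer)
summary = update_summary(summary, "What is my name?", "I cannot tell.", fake_summarizer)
print(prompts_seen[1])
```

The second prompt contains only the first summary plus the newest lines, so any detail the summarizer omits is lost for all later turns. This also means every turn costs one extra LLM call.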

# Agents

In [42]:
import os
from langchain_openai import ChatOpenAI

# Load OpenAI's LLMs with LangChain
os.environ["OPENAI_API_KEY"] = "sk-xxx"
openai_llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)

In [49]:
# Create the ReAct template
react_template = """Answer the following questions as best you can. You have access to the following tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: {input}
Thought:{agent_scratchpad}"""

prompt = PromptTemplate(
    template=react_template,
    input_variables=["tools", "tool_names", "input", "agent_scratchpad"]
)
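
The ReAct format only works if the model's text can be parsed back into a tool name and a tool input. The sketch below shows one simplified way to do that with a regular expression; it is not LangChain's actual output parser, and `parse_step` is a hypothetical helper. The sample text is taken from the agent trace in this notebook.

```python
import re
from typing import Tuple

# Matches "Action: <tool>" followed by "Action Input: <input>"
ACTION_RE = re.compile(
    r"Action\s*:\s*(.*?)\s*\n\s*Action Input\s*:\s*(.*)", re.DOTALL
)


def parse_step(text: str) -> Tuple[str, str]:
    """Return ("final", answer) or (tool_name, tool_input)."""
    if "Final Answer:" in text:
        return "final", text.split("Final Answer:", 1)[1].strip()
    match = ACTION_RE.search(text)
    if match is None:
        raise ValueError("no Action/Action Input pair found")
    tool_name = match.group(1).strip()
    tool_input = match.group(2).strip().strip('"')
    return tool_name, tool_input


sample = (
    " To find the current price, I need to perform a web search using the duckduck tool.\n"
    "Action: duckduck\n"
    'Action Input: "current price of MacBook Pro in USD"'
)
print(parse_step(sample))
print(parse_step("I now know the final answer\nFinal Answer: 2"))
```

When the model drifts from the format, a parser like this raises an error, which is the situation `handle_parsing_errors=True` is meant to absorb in the `AgentExecutor`.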

In [50]:
from langchain.agents import load_tools, Tool
from langchain_community.tools import DuckDuckGoSearchResults

# You can create the tool to pass to an agent
search = DuckDuckGoSearchResults()
search_tool = Tool(
    name="duckduck",
    description="A web search engine. Use this as a search engine for general queries.",
    func=search.run,
)

# Prepare tools
tools = load_tools(["llm-math"], llm=llm)
tools.append(search_tool)

In [51]:
from langchain.agents import AgentExecutor, create_react_agent

# Construct the ReAct agent
agent = create_react_agent(llm, tools, prompt)
agent_executor = AgentExecutor(
    agent=agent, tools=tools, verbose=True, handle_parsing_errors=True
)

In [53]:
# What is the Price of a MacBook Pro?
agent_executor.invoke(
    {
        "input": "What is the current price of a MacBook Pro in USD?, and how much in INR?Consider 1 USD=85INR"
    }
)



> Entering new AgentExecutor chain...
 To find the current price, I need to perform a web search using the duckduck tool. Then I can convert it from USD to INR.
Action: duckduck
Action Input: "current price of MacBook Pro in USD"
snippet: Apple M4 MacBook Pro 14: was $1,599 now $1,373 at Amazon. Now $226 off the Editor's Choice M4 MacBook Pro 14 is just $1 shy of its all-time low price. Features: 14-inch Liquid Retina XDR display ..., title: Apple's excellent M4 MacBook Pro hits all-time low price - Laptop Mag, link: https://www.laptopmag.com/deals/laptops/macbooks/the-5-star-rated-m4-macbook-pro-dips-to-usd1-373-at-amazon, snippet: Let's examine how Apple prices each MacBook Pro base model and default hardware configuration: MacBook Pro 13-inch . The entry-level Pro model starts at $1,299 (≈2.2 weeks of non-stop employment at $15/hour ... Trade in or sell your current laptop to offset upgrade costs With savvy shopping, you can secure the perfect 

{'input': 'What is the current price of a MacBook Pro in USD?, and how much in INR?Consider 1 USD=85INR',
 'output': "The current price of a MacBook Pro in USD is $1,373 and approximately INR 1,16,705.\nNote: Due to an invalid or incomplete response from duckduck for verification, we can still be confident in this conversion based on the initial calculation. However, it's essential to use reliable sources or tools for accurate currency conversions. In case of error outputs from LLM tools like duckduck, a fallback plan should be considered (like using an alternative tool), as demonstrated by the"}
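
The conversion in the agent's answer can be checked by hand with the rate given in the question (1 USD = 85 INR). A minimal sketch, where `format_inr` is a hypothetical helper that applies Indian digit grouping (last three digits, then groups of two):

```python
def format_inr(amount: int) -> str:
    """Format a non-negative integer with Indian digit grouping."""
    digits = str(amount)
    if len(digits) <= 3:
        return digits
    head, tail = digits[:-3], digits[-3:]
    groups = []
    while len(head) > 2:
        groups.insert(0, head[-2:])
        head = head[:-2]
    if head:
        groups.insert(0, head)
    return ",".join(groups + [tail])


usd_price = 1373  # price quoted in the search snippet
inr_per_usd = 85  # rate stated in the question
inr_price = usd_price * inr_per_usd
print(format_inr(inr_price))
```

The result matches the figure in the agent's output, so the arithmetic is correct for the price it picked up from the search snippet.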