# Build a Chatbot

### Let's Cover Some Prerequisites for Building a Chatbot

#### Chat Models

Modern LLMs are typically accessed through a chat model interface that takes a list of messages as input and returns a message as output.

#### Tool Calling
Chat models can call tools to perform tasks such as fetching data from a database, making API requests, or running custom code.
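
To make the idea concrete, here is a minimal sketch in plain Python of what the application side of tool calling might look like. The tool names (`get_weather`, `add`), the `TOOLS` registry, and the JSON shape of the request are all hypothetical and chosen only for illustration; real chat model integrations define their own request format.

```python
import json


# Hypothetical tools: the names and behavior are illustrative only.
def get_weather(city: str) -> str:
    """Pretend to look up the weather for a city."""
    return f"It is sunny in {city}."


def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b


# Registry that maps a tool name to the Python function implementing it.
TOOLS = {"get_weather": get_weather, "add": add}


def run_tool_call(tool_call_json: str):
    """Parse a tool request and dispatch it to the matching function."""
    call = json.loads(tool_call_json)
    tool = TOOLS[call["name"]]
    return tool(**call["args"])


# A model that supports tool calling would emit a structured request
# like this instead of a plain-text answer (the shape is an assumption).
result = run_tool_call('{"name": "add", "args": {"a": 2, "b": 3}}')
print(result)
```

The model only decides which tool to call and with which arguments; the application runs the function and can pass the result back to the model as another message.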

#### Structured outputs
Chat models can be requested to respond in a particular format (e.g., JSON or matching a particular schema).
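
As a rough illustration, the sketch below checks a JSON reply against a small schema using only the standard library. The `Joke` schema, the `parse_joke` helper, and the sample reply are hypothetical; in practice the reply would come from the chat model.

```python
import json
from dataclasses import dataclass


@dataclass
class Joke:
    """Hypothetical schema the model is asked to follow."""
    setup: str
    punchline: str


def parse_joke(raw: str) -> Joke:
    """Parse a JSON reply and check that it matches the Joke schema."""
    data = json.loads(raw)
    missing = {"setup", "punchline"} - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return Joke(setup=str(data["setup"]), punchline=str(data["punchline"]))


# Stand-in for a model reply that follows the requested format.
raw_reply = '{"setup": "Why did the cat sit on the laptop?", "punchline": "To keep an eye on the mouse."}'
joke = parse_joke(raw_reply)
print(joke.punchline)
```

Validating the reply before using it lets the application fail early when the model ignores the requested format.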

#### Multimodality
Large Language Models (LLMs) are not limited to processing text. They can also be used to process other types of data, such as images, audio, and video. This is known as multimodality.

#### Context window
A chat model's context window refers to the maximum size of the input sequence the model can process at one time.

#### Prompt templates
Prompt templates help to translate user input and parameters into instructions for a language model. This can be used to guide a model's response, helping it understand the context and generate relevant and coherent language-based output.

In [1]:
from langchain_core.prompts import PromptTemplate

prompt_template = PromptTemplate.from_template("Tell me a joke about {topic}")

prompt_template.invoke({"topic": "cats"})

StringPromptValue(text='Tell me a joke about cats')

#### ChatPromptTemplates
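A ChatPromptTemplate formats a list of messages rather than a single string. Each message has a role (such as system or user) and can contain its own template variables.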

In [3]:
from langchain_core.prompts import ChatPromptTemplate

prompt_temp = ChatPromptTemplate.from_messages(
    [
        ('system','Translate the following to urdu:'),
        ('user','{text}'),
    ]
)

prompt_temp.invoke({'text':"Where are you going?"})

ChatPromptValue(messages=[SystemMessage(content='Translate the following to urdu:', additional_kwargs={}, response_metadata={}), HumanMessage(content='Where are you going?', additional_kwargs={}, response_metadata={})])

In [4]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import HumanMessage

prompt_template = ChatPromptTemplate([
    ("system", "You are a helpful assistant"),
    MessagesPlaceholder("msgs")
])

prompt_template.invoke({"msgs": [HumanMessage(content="hi!")]})

ChatPromptValue(messages=[SystemMessage(content='You are a helpful assistant', additional_kwargs={}, response_metadata={}), HumanMessage(content='hi!', additional_kwargs={}, response_metadata={})])

# Build a Chatbot

In [3]:
import getpass
import os

os.environ['GROQ_API_KEY']=getpass.getpass()

In [4]:
from langchain_groq import ChatGroq
model = ChatGroq(model='llama3-8b-8192', api_key=os.environ["GROQ_API_KEY"])

In [5]:
from langchain_core.messages import HumanMessage, SystemMessage

message = [
    SystemMessage(content='Translate the following into Urdu:'),
    HumanMessage(content='Where are you going bro?'),
]


model.invoke(message)

AIMessage(content='کہاں جائے گا بھائی?\n\n(Kahaan jaaye ga bhai?)', additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 26, 'prompt_tokens': 27, 'total_tokens': 53, 'completion_time': 0.021666667, 'prompt_time': 0.001156882, 'queue_time': 0.012016616, 'total_time': 0.022823549}, 'model_name': 'llama3-8b-8192', 'system_fingerprint': 'fp_179b0f92c9', 'finish_reason': 'stop', 'logprobs': None}, id='run-e4268066-3007-4e0e-b15c-53fdac32ea72-0', usage_metadata={'input_tokens': 27, 'output_tokens': 26, 'total_tokens': 53})

In [6]:
from langchain_core.output_parsers import StrOutputParser

parser = StrOutputParser()

# response = model.invoke(message)
# parser.invoke(response)

In [7]:
chain = model | parser

chain.invoke(message)

'کہاں جارہے بھائی؟ (Kahaan jaareh bhai?)'

In [15]:
from langchain_core.messages import HumanMessage

model.invoke([HumanMessage(content="Hi! I'm Bob")])

AIMessage(content="Hi Bob! It's nice to meet you. Is there something I can help you with or would you like to chat?", additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 26, 'prompt_tokens': 15, 'total_tokens': 41, 'completion_time': 0.021666667, 'prompt_time': 0.000142649, 'queue_time': 0.014490630000000001, 'total_time': 0.021809316}, 'model_name': 'llama3-8b-8192', 'system_fingerprint': 'fp_179b0f92c9', 'finish_reason': 'stop', 'logprobs': None}, id='run-b9da5b5d-fcf9-4ad3-a2c5-be3c30c2b049-0', usage_metadata={'input_tokens': 15, 'output_tokens': 26, 'total_tokens': 41})

In [16]:
model.invoke([HumanMessage(content="What's my name?")])

AIMessage(content="I apologize, but I'm a large language model, I don't have any information about your name. I'm a new conversation every time you interact with me, and I don't retain any personal information about individuals. If you'd like to share your name with me, I'd be happy to know it!", additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 64, 'prompt_tokens': 15, 'total_tokens': 79, 'completion_time': 0.053333333, 'prompt_time': 0.002308046, 'queue_time': 0.011451444, 'total_time': 0.055641379}, 'model_name': 'llama3-8b-8192', 'system_fingerprint': 'fp_6a6771ae9c', 'finish_reason': 'stop', 'logprobs': None}, id='run-b52b4066-d60d-4d41-850d-e1d4897348c8-0', usage_metadata={'input_tokens': 15, 'output_tokens': 64, 'total_tokens': 79})

In [18]:
from langchain_core.messages import AIMessage

model.invoke(
    [
        HumanMessage(content="Hello! I am Salman"),
        AIMessage(content="Hello Salman! Nice to meet you. How can i assist you today?"),
        HumanMessage(content="What's my name?")
    ]
)

AIMessage(content='I remember! Your name is Salman!', additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 9, 'prompt_tokens': 44, 'total_tokens': 53, 'completion_time': 0.0075, 'prompt_time': 0.00557453, 'queue_time': 0.00770922, 'total_time': 0.01307453}, 'model_name': 'llama3-8b-8192', 'system_fingerprint': 'fp_6a6771ae9c', 'finish_reason': 'stop', 'logprobs': None}, id='run-83c84f9c-2a99-4546-b0f5-70c1a415292e-0', usage_metadata={'input_tokens': 44, 'output_tokens': 9, 'total_tokens': 53})

### Message persistence

LangGraph implements a built-in persistence layer, making it ideal for chat applications that support multiple conversational turns.

Wrapping our chat model in a minimal LangGraph application allows us to automatically persist the message history, simplifying the development of multi-turn applications.

In [8]:
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import MessagesState, StateGraph, START


In [14]:
workflow = StateGraph(state_schema=MessagesState)

def call_model(state: MessagesState):
    response = model.invoke(state['messages'])
    return {'messages':response}

workflow.add_edge(START, "model")
workflow.add_node("model", call_model)

memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

### A Quick Look At async 

In [23]:
import time

def task1():
    print("Task 1 started")
    time.sleep(2)  # Simulate long-running task
    print("Task 1 finished")

def task2():
    print("Task 2 started")
    time.sleep(3)  # Simulate long-running task
    print("Task 2 finished")

task1()
task2()


Task 1 started
Task 1 finished
Task 2 started
Task 2 finished


In [26]:
import asyncio
import time
async def task1():
    print("Task 1 started")
    await asyncio.sleep(2)  # Simulate long-running task
    print("Task 1 finished")

async def task2():
    print("Task 2 started")
    await asyncio.sleep(3)  # Simulate long-running task
    print("Task 2 finished")

async def main():
    start_time = time.time()
    await asyncio.gather(task1(), task2())
    print(f"Total time: {time.time() - start_time} seconds")

# In a script, call asyncio.run(main()) instead.
# A notebook already runs an event loop, so we can await directly.
await main()



Task 1 started
Task 2 started
Task 1 finished
Task 2 finished
Total time: 3.000969171524048 seconds
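
Besides running tasks concurrently, `asyncio.gather` returns their results in the order the awaitables were passed in, not the order they finish. The sketch below shows this with a hypothetical `fake_model_call` coroutine that stands in for a slow network request; the delays are shortened so it runs quickly.

```python
import asyncio


async def fake_model_call(prompt: str, delay: float) -> str:
    """Hypothetical stand-in for a slow network call to a model."""
    await asyncio.sleep(delay)
    return f"reply to {prompt!r}"


async def ask_all() -> list:
    # Both calls wait concurrently; results keep the argument order,
    # even though the second call finishes first.
    return await asyncio.gather(
        fake_model_call("first", 0.02),
        fake_model_call("second", 0.01),
    )


# In a script use asyncio.run(...); in a notebook cell, use
# `replies = await ask_all()` instead.
replies = asyncio.run(ask_all())
print(replies)
```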


### Building the Chat Model, Continued

In [16]:
config = {'configurable':{'thread_id':'abc123'}}


query = "Hi! I am Salman."
input_messages = [HumanMessage(content=query)]
output = app.invoke(
    {'messages':input_messages}, config
)

output['messages'][-1].pretty_print() # output contains all messages in state


Hi Salman! Nice to meet you!


In [17]:
query = "what is my name?"

input_messages = [HumanMessage(content=query)]

output = app.invoke(
    {'messages':input_messages}, config
)

output['messages'][-1].pretty_print()


According to our conversation, your name is Salman!


In [18]:
config = {"configurable": {"thread_id": "abc234"}}

input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()


I apologize, but I'm a large language model, I don't have the ability to know your name. I'm a new conversation each time you interact with me, and I don't retain any information about you or your identity. If you'd like to share your name with me, I'd be happy to chat with you and remember it for our conversation!


In [19]:
config = {"configurable": {"thread_id": "abc234"}}
query = "tell me a joke"
input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()


Here's one:

Why couldn't the bicycle stand up by itself?

(Wait for it...)

Because it was two-tired!

Hope that made you laugh!


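However, we can always go back to the original conversation, because the MemorySaver checkpointer keeps each thread's messages in memory under its thread_id. This history lasts only as long as the Python process runs.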

In [20]:
config = {"configurable": {"thread_id": "abc123"}}
query = "What is my name?"
input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()


Your name is Salman!
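
Conceptually, each `thread_id` selects its own message history. The toy class below is a plain-Python sketch of that idea, not how LangGraph's checkpointer is implemented; `InMemoryThreads` and its methods are hypothetical names.

```python
from collections import defaultdict


class InMemoryThreads:
    """Toy stand-in for a checkpointer: one message list per thread_id."""

    def __init__(self):
        self._threads = defaultdict(list)

    def append(self, thread_id: str, role: str, content: str) -> None:
        """Record a message under the given thread."""
        self._threads[thread_id].append((role, content))

    def history(self, thread_id: str) -> list:
        # Return a copy so callers cannot mutate stored state by accident.
        return list(self._threads[thread_id])


store = InMemoryThreads()
store.append("abc123", "human", "Hi! I am Salman.")
store.append("abc234", "human", "tell me a joke")
store.append("abc123", "human", "What is my name?")

# Only the messages from thread abc123 are returned.
print(store.history("abc123"))
```

Because the two threads never share a list, the model sees the name only when the request uses the thread where it was introduced.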


### ChatPrompts

In [26]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages(
    [
        ('system', "You're a comedian. You will be roasting everyone with jokes."),
        MessagesPlaceholder(variable_name='messages')
    ]
)

In [27]:
workflow = StateGraph(state_schema=MessagesState)

def call_model(state: MessagesState):
    chain = prompt | model
    response = chain.invoke(state)
    return {'messages':response}

workflow.add_edge(START, "model")
workflow.add_node("model", call_model)

memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

In [29]:
config = {"configurable": {"thread_id": "abc345"}}
query = "Hi! I'm Jim."

input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()


Great to meet you, Jim! I noticed you introduced yourself with a lot of enthusiasm... or was that just the excitement of knowing you're about to be roasted?


Making our chat prompt more complex

In [30]:
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability in {language}.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

### Some Notes on typing and typing_extensions

In [43]:
from typing import Callable, Dict, List, Optional, Set, Tuple

# Callable: a function that takes two ints and returns an int
def execute_operation(operation: Callable[[int, int], int], a: int, b: int) -> int:
    return operation(a, b)

print(execute_operation(lambda x, y: x + y, 5, 7))

# Dict, List, Tuple, Set, and Optional describe the shape of container arguments
def analyze_data(data: Dict[str, int], items: List[int], points: Tuple[int, int], flags: Optional[Set[str]] = None) -> None:
    print(data, items, points, flags)

analyze_data({'age': 22}, [295, 310], (1062, 1035), {'age', 'college_rn', 'uni_rn'})

12
{'age': 22} [295, 310] (1062, 1035) {'age', 'college_rn', 'uni_rn'}


In [49]:
from typing_extensions import Final

API_URL: Final = "https://linkedin.com/in/msalmanai62/"
API_URL
# Final tells type checkers that this name must not be reassigned or overridden.
# Python itself does not enforce this at runtime.


'https://linkedin.com/in/msalmanai62/'

In [54]:
from typing_extensions import Annotated

def process_data(data: Annotated[int, "Must be positive"]) -> int:
    return data * 2

process_data(2)

4
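
Python ignores `Annotated` metadata at runtime unless some code reads it. The sketch below shows one way a library could read a function stored in that metadata and use it to merge state updates. This is similar in spirit to how a reducer such as `add_messages` is attached to a state field, but it is not LangGraph's implementation. `append_items`, `ToyState`, and `apply_update` are hypothetical names, and the example assumes Python 3.9+ so that `Annotated` can be imported from `typing`.

```python
from typing import Annotated, TypedDict, get_type_hints


def append_items(old: list, new: list) -> list:
    """Hypothetical reducer: merge an update into the existing value."""
    return old + new


class ToyState(TypedDict):
    messages: Annotated[list, append_items]  # metadata carries the reducer
    language: str                            # no metadata: plain overwrite


def apply_update(state: dict, update: dict) -> dict:
    """Merge an update, using the reducer from Annotated metadata if present."""
    hints = get_type_hints(ToyState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        metadata = getattr(hints[key], "__metadata__", ())
        if metadata and key in merged:
            merged[key] = metadata[0](merged[key], value)  # call the reducer
        else:
            merged[key] = value  # overwrite the old value
    return merged


state = {"messages": ["hi"], "language": "urdu"}
state = apply_update(state, {"messages": ["hello!"]})
print(state)
```

The update to `messages` is appended rather than replacing the list, which is the behavior a chat history needs.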

In [57]:
from typing import Sequence

from langchain_core.messages import BaseMessage
from langgraph.graph.message import add_messages
from typing_extensions import Annotated, TypedDict


class State(TypedDict):
    messages: Annotated[Sequence[BaseMessage], add_messages]
    language: str


workflow = StateGraph(state_schema=State)


def call_model(state: State):
    chain = prompt | model
    response = chain.invoke(state)
    return {"messages": [response]}


workflow.add_edge(START, "model")
workflow.add_node("model", call_model)

memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

In [58]:
config = {"configurable": {"thread_id": "abc456"}}
query = "Hi! I'm salman."
language = "urdu"

input_messages = [HumanMessage(query)]
output = app.invoke(
    {"messages": input_messages, "language": language},
    config,
)
output["messages"][-1].pretty_print()


Asalamu alaikum Salman! Main Salman ki madad karnay ki taiyari mein hoon. Kya aap mera madad chah rahi hain?


In [59]:
query = "What is my name?"

input_messages = [HumanMessage(query)]
output = app.invoke(
    {"messages": input_messages},
    config,
)
output["messages"][-1].pretty_print()


Salman yaar, aapki nam Salman hai!


### Managing Conversation History

In [63]:
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage, trim_messages

trimmer = trim_messages(
    max_tokens=65,
    strategy="last",
    token_counter=model,
    include_system=True,
    allow_partial=False,
    start_on="human",
)

messages = [
    SystemMessage(content="you're a good assistant"),
    HumanMessage(content="hi! I'm bob"),
    AIMessage(content="hi!"),
    HumanMessage(content="I like vanilla ice cream"),
    AIMessage(content="nice"),
    HumanMessage(content="whats 2 + 2"),
    AIMessage(content="4"),
    HumanMessage(content="thanks"),
    AIMessage(content="no problem!"),
    HumanMessage(content="having fun?"),
    AIMessage(content="yes!"),
]

trimmer.invoke(messages)

[SystemMessage(content="you're a good assistant", additional_kwargs={}, response_metadata={}),
 HumanMessage(content="hi! I'm bob", additional_kwargs={}, response_metadata={}),
 AIMessage(content='hi!', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='I like vanilla ice cream', additional_kwargs={}, response_metadata={}),
 AIMessage(content='nice', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='whats 2 + 2', additional_kwargs={}, response_metadata={}),
 AIMessage(content='4', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='thanks', additional_kwargs={}, response_metadata={}),
 AIMessage(content='no problem!', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='having fun?', additional_kwargs={}, response_metadata={}),
 AIMessage(content='yes!', additional_kwargs={}, response_metadata={})]
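
To see what the "last" strategy does when the history exceeds the budget, here is a simplified plain-Python sketch. It is not the `trim_messages` implementation: `count_tokens` is a crude assumption (one token per whitespace-separated word), `trim_last` is a hypothetical helper, and messages are represented as `(role, content)` tuples.

```python
def count_tokens(text: str) -> int:
    """Crude assumption: one token per whitespace-separated word."""
    return len(text.split())


def trim_last(messages: list, max_tokens: int) -> list:
    """Keep the system message plus the most recent messages that fit."""
    system = [m for m in messages if m[0] == "system"][:1]
    rest = [m for m in messages if m[0] != "system"]
    budget = max_tokens - sum(count_tokens(m[1]) for m in system)

    kept = []
    for role, content in reversed(rest):  # walk from newest to oldest
        cost = count_tokens(content)
        if cost > budget:
            break  # whole messages only, like allow_partial=False
        kept.append((role, content))
        budget -= cost
    kept.reverse()

    # Like start_on="human": drop leading messages until a human one.
    while kept and kept[0][0] != "human":
        kept.pop(0)
    return system + kept


history = [
    ("system", "you're a good assistant"),
    ("human", "hi! I'm bob"),
    ("ai", "hi!"),
    ("human", "whats 2 + 2"),
    ("ai", "4"),
    ("human", "having fun?"),
    ("ai", "yes!"),
]
print(trim_last(history, max_tokens=10))
```

With a budget of 10 words, only the system message and the final human/AI exchange survive, so the introduction with the name is dropped. A tight budget therefore trades memory of early turns for a smaller prompt.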

In [64]:
workflow = StateGraph(state_schema=State)


def call_model(state: State):
    chain = prompt | model
    trimmed_messages = trimmer.invoke(state["messages"])
    response = chain.invoke(
        {"messages": trimmed_messages, "language": state["language"]}
    )
    return {"messages": [response]}


workflow.add_edge(START, "model")
workflow.add_node("model", call_model)

memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

In [65]:
config = {"configurable": {"thread_id": "abc567"}}
query = "What is my name?"
language = "English"

input_messages = messages + [HumanMessage(query)]
output = app.invoke(
    {"messages": input_messages, "language": language},
    config,
)
output["messages"][-1].pretty_print()


Your name is Bob!


In [69]:
config = {"configurable": {"thread_id": "abc789"}}
query = "Hi I'm Todd, please tell me a joke."
language = "English"

input_messages = [HumanMessage(query)]
for chunk, metadata in app.stream(
    {"messages": input_messages, "language": language},
    config,
    stream_mode="messages",
):
    if isinstance(chunk, AIMessage):  # Filter to just model responses
        print(chunk.content, end="")

Hi Todd! Here's one for you:

Why couldn't the bicycle stand up by itself?

(Wait for it...)

Because it was two-tired!

Hope that made you laugh! Do you want to hear another one?