In [1]:
import os
from sam_sk import api_key
os.environ["GROQ_API_KEY"] = api_key

In [2]:
%%capture
!pip install langchain_groq

In [4]:
from langchain_groq import ChatGroq
model = ChatGroq(model="llama3-8b-8192")

In [5]:
from langchain_core.messages import HumanMessage

model.invoke([HumanMessage(content = "Hi! Im Somendra")])

AIMessage(content="Nice to meet you, Somendra! It's great to have you here. Is there something I can help you with or would you like to chat?", additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 32, 'prompt_tokens': 15, 'total_tokens': 47, 'completion_time': 0.026666667, 'prompt_time': 0.002979843, 'queue_time': 0.010566805, 'total_time': 0.02964651}, 'model_name': 'llama3-8b-8192', 'system_fingerprint': 'fp_a97cfe35ae', 'finish_reason': 'stop', 'logprobs': None}, id='run-a767020c-f5a8-43bf-a061-e33f0d8a894f-0', usage_metadata={'input_tokens': 15, 'output_tokens': 32, 'total_tokens': 47})

The model on its own does not have any concept of state. For example, if you ask a follow-up question:

In [6]:
model.invoke([HumanMessage(content = "What is my name?")])

AIMessage(content="I'm just an AI, I don't have any information about your personal identity, including your name. Each time you interact with me, it's a new conversation and I don't retain any information from previous conversations. If you'd like to share your name with me, I'm happy to chat with you!", additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 64, 'prompt_tokens': 15, 'total_tokens': 79, 'completion_time': 0.053333333, 'prompt_time': 0.001901039, 'queue_time': 0.012653681, 'total_time': 0.055234372}, 'model_name': 'llama3-8b-8192', 'system_fingerprint': 'fp_179b0f92c9', 'finish_reason': 'stop', 'logprobs': None}, id='run-f3925237-16c2-4802-9331-87a9a8155eb0-0', usage_metadata={'input_tokens': 15, 'output_tokens': 64, 'total_tokens': 79})

The model does not take the previous conversation turn into account, so it cannot answer the question. This makes for a terrible chatbot experience!

* To get around this, we need to pass the entire conversation history into the model.

In [7]:
from langchain_core.messages import AIMessage

model.invoke(
    [
        HumanMessage(content = "Hi! I'm Somendra"),
        AIMessage(content="Hello Somendra! how can i help you?"),
        HumanMessage(content="What is my name?"),
    ]
)

AIMessage(content="I'm glad you asked! Your name is Somendra!", additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 13, 'prompt_tokens': 41, 'total_tokens': 54, 'completion_time': 0.010833333, 'prompt_time': 0.00524617, 'queue_time': 0.009045539, 'total_time': 0.016079503}, 'model_name': 'llama3-8b-8192', 'system_fingerprint': 'fp_a97cfe35ae', 'finish_reason': 'stop', 'logprobs': None}, id='run-18d9e320-53f1-4b7a-b3e3-ab17bada276b-0', usage_metadata={'input_tokens': 41, 'output_tokens': 13, 'total_tokens': 54})

And now we can see that we get a good response!
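
Doing this by hand means keeping a list and appending every turn to it. The sketch below illustrates the idea with a hypothetical stand-in model (`fake_model`) and plain `(role, text)` tuples so that it runs offline; neither is part of LangChain, and a real chat model would take the place of `fake_model`.

```python
# Hypothetical stand-in for a chat model: it only "knows" what is in the
# messages it receives, just like a stateless LLM call.
def fake_model(messages):
    for role, text in messages:
        if role == "human" and "I'm " in text:
            name = text.split("I'm ", 1)[1].strip(" .!")
            return ("ai", f"Your name is {name}.")
    return ("ai", "I don't know your name.")


def chat_turn(history, user_text, model=fake_model):
    """Append the user's message, call the model on the full history,
    then append the reply so the next turn can see it."""
    history.append(("human", user_text))
    reply = model(history)
    history.append(reply)
    return reply[1]


history = []
chat_turn(history, "Hi! I'm Somendra")
answer = chat_turn(history, "What is my name?")
print(answer)

# Without the history, the same question cannot be answered.
stateless_answer = fake_model([("human", "What is my name?")])[1]
print(stateless_answer)
```

The only difference between the two calls is whether the earlier turns are included in the input.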

# Message persistence

`LangGraph` implements a built-in persistence layer, making it ideal for chat applications that support multiple conversational turns.

Wrapping our chat model in a minimal LangGraph application allows us to automatically persist the message history, simplifying the development of multi-turn applications.

* LangGraph comes with a simple in-memory checkpointer, `MemorySaver`, which we use below.

In [8]:
%%capture
!pip install langgraph

* `START`: The special entry-point node of the graph. An edge from `START` to a node tells LangGraph which node runs first.

* `MessagesState`: A prebuilt state schema with a single `messages` key. It stores the messages exchanged during the conversation and appends new ones rather than overwriting the list.

* `StateGraph`: The graph builder. Nodes are functions that read the shared state and return updates to it, and edges define the order in which nodes run.

In [10]:
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import START, MessagesState, StateGraph

#Define a new graph
workflow = StateGraph(state_schema=MessagesState)

#Define a function that calls the model
def call_model(state: MessagesState):
  response = model.invoke(state["messages"])
  return {'messages': response}

# Define the (single) node in the graph and connect it to the entry point
workflow.add_node("model", call_model)
workflow.add_edge(START, "model")

#add memory
memory = MemorySaver()
app = workflow.compile(checkpointer = memory)

We now need to create a `config` that we pass into the runnable every time. Here it carries the `thread_id` that identifies the conversation.

In [11]:
config = {"configurable" : {"thread_id" : "abc123"}}

This enables us to support multiple conversation threads with a single application, a common requirement when your application has multiple users.



We can then invoke the application:



In [12]:
query = "Hi! I'm Somendra"

input_messages = [HumanMessage(query)]
output = app.invoke({"messages" : input_messages},config=config)
output["messages"][-1].pretty_print()


Nice to meet you, Somendra! How are you doing today?


In [13]:
output["messages"]

[HumanMessage(content="Hi! I'm Somendra", additional_kwargs={}, response_metadata={}, id='397596a5-58ec-4fa2-8dc2-323d896c5254'),
 AIMessage(content='Nice to meet you, Somendra! How are you doing today?', additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 15, 'prompt_tokens': 16, 'total_tokens': 31, 'completion_time': 0.0125, 'prompt_time': 0.002547536, 'queue_time': 0.031641934, 'total_time': 0.015047536}, 'model_name': 'llama3-8b-8192', 'system_fingerprint': 'fp_af05557ca2', 'finish_reason': 'stop', 'logprobs': None}, id='run-3425c303-7ab2-4a6b-a4f7-bd1b7807b569-0', usage_metadata={'input_tokens': 16, 'output_tokens': 15, 'total_tokens': 31})]

In [14]:
from langchain_core.output_parsers import StrOutputParser
parser = StrOutputParser()

In [16]:
parser.parse(output["messages"][-1].content)

'Nice to meet you, Somendra! How are you doing today?'

In [17]:
query ="What is my name?"

input_messages = [HumanMessage(query)]
output = app.invoke({"messages" : input_messages},config)
output["messages"][-1].pretty_print()


I remember! Your name is Somendra!


Great! Our chatbot now remembers things about us. If we change the config to reference a different `thread_id`, we can see that it starts the conversation fresh.

In [18]:
config = {"configurable" : {"thread_id" : "def456"}}


input_messages = [HumanMessage(query)]
output = app.invoke({"messages" : input_messages},config)
output["messages"][-1].pretty_print()



I'm just an AI, I don't have any information about your personal identity, including your name. I'm a new conversation every time you interact with me, so I don't retain any information about you from previous conversations. If you'd like, you can tell me your name and I'll remember it for our conversation.


However, we can always go back to the original conversation, since the `MemorySaver` checkpointer keeps it in memory under its `thread_id`:

In [19]:
config = {"configurable" : {"thread_id" : "abc123"}}


input_messages = [HumanMessage(query)]
output = app.invoke({"messages" : input_messages},config)
output["messages"][-1].pretty_print()


Your name is Somendra!
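
One way to picture what happened above is a dictionary of message lists keyed by `thread_id`. The sketch below is only a toy illustration of that idea; `InMemoryThreads` is a hypothetical class, not LangGraph's checkpointer, and it stores plain strings instead of message objects.

```python
from collections import defaultdict


class InMemoryThreads:
    """Toy illustration: one message list per thread_id."""

    def __init__(self):
        self._threads = defaultdict(list)

    def append(self, config, message):
        # Use the same config shape as the notebook to pick the thread.
        thread_id = config["configurable"]["thread_id"]
        self._threads[thread_id].append(message)

    def history(self, config):
        thread_id = config["configurable"]["thread_id"]
        return list(self._threads[thread_id])


store = InMemoryThreads()
first = {"configurable": {"thread_id": "abc123"}}
second = {"configurable": {"thread_id": "def456"}}

store.append(first, "Hi! I'm Somendra")
store.append(second, "What is my name?")
store.append(first, "What is my name?")

print(store.history(first))   # both messages sent on thread abc123
print(store.history(second))  # only the message sent on thread def456
```

Because each thread has its own list, switching `thread_id` starts from an empty history, and switching back picks up where that thread left off.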


# Prompt templates

To add in a system message, we will create a `ChatPromptTemplate`. We will utilize `MessagesPlaceholder` to pass all the messages in.



In [20]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "you talk like a pirate. Answer all the questions in pirate language",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)



We can now update our application to incorporate this template:



In [25]:
#Define a new graph
workflow = StateGraph(state_schema=MessagesState)

# Define a function that calls the model
def call_model(state: MessagesState):
  chain = prompt | model            # pipe the prompt template into the model
  response = chain.invoke(state)    # pass the whole state; the template reads state["messages"]
  return {'messages': response}

# Define the (single) node in the graph and connect it to the entry point
workflow.add_node("model", call_model)
workflow.add_edge(START, "model")

#add memory
memory = MemorySaver()
app = workflow.compile(checkpointer = memory)

In [26]:
config = {"configurable": {"thread_id": "abc345"}}
query = "Hi! I'm Somendra."

input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config=config)
output["messages"][-1].pretty_print()


Arrrr, shiver me timbers! 'Tis a pleasure to make yer acquaintance, Somendra me hearty! What be bringin' ye to these fair waters o' conversation?


In [27]:
query = "What is my name?"

input_messages = [HumanMessage(query)]
output = app.invoke({"messages": input_messages}, config)
output["messages"][-1].pretty_print()


Shiver me spyglass! Ye be askin' about yer own name, eh? Alright then, matey! Yer name be Somendra, and I be rememberin' it fer ye!


Awesome! Let's now make our prompt a little bit more complicated. Let's assume that the prompt template now looks something like this:



In [28]:
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "you are the most popular poet in the world. Answer all questions to the best of your ability in {language}.",

        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

### Note:
We have added a new `language` input to the prompt. Our application now has two parameters: the input `messages` and `language`. We should update our application's state to reflect this:
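
To make the role of the two inputs concrete, here is a rough sketch of what filling such a template amounts to. `render_prompt` is a hypothetical helper written for illustration with plain tuples; it is not how `ChatPromptTemplate` is implemented.

```python
def render_prompt(system_template, messages, **variables):
    """Fill the system template, then place the conversation after it."""
    system_text = system_template.format(**variables)
    return [("system", system_text)] + list(messages)


rendered = render_prompt(
    "Answer all questions to the best of your ability in {language}.",
    [("human", "Hi! I'm Somendra.")],
    language="Italian",
)
for role, text in rendered:
    print(f"{role}: {text}")

# Leaving out a template variable fails in this sketch.
try:
    render_prompt("Answer in {language}.", [])
    missing_variable_error = False
except KeyError:
    missing_variable_error = True
```

Both inputs are needed to produce the final list of messages, which is why the state has to carry `language` alongside `messages`.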

### Imports:
* `Sequence` is the abstract type for ordered collections such as lists and tuples.
* `BaseMessage` is the base class for LangChain message types such as `HumanMessage` and `AIMessage`.
* `add_messages` is a reducer function: it tells LangGraph to merge new messages into the existing list instead of overwriting it.
* `Annotated` attaches metadata (here, the reducer) to a type, and `TypedDict` defines a dictionary with fixed keys and value types.

In [29]:
from typing import Sequence

from langchain_core.messages import BaseMessage
from langgraph.graph.message import add_messages
from typing_extensions import Annotated, TypedDict


# Updating the application state.
# State is a TypedDict: a dictionary with specific keys and their expected types.
class State(TypedDict):
  # messages: a sequence of BaseMessage objects. The Annotated metadata tells
  # LangGraph to merge updates to this key with add_messages (append, not overwrite).
  messages : Annotated[Sequence[BaseMessage], add_messages]
  # language: a plain string
  language : str
workflow = StateGraph(state_schema=State)

def call_model(state : State):
  chain = prompt | model
  response = chain.invoke(state)
  return {"messages": [response]}


workflow.add_edge(START, "model")
workflow.add_node("model",call_model)

memory = MemorySaver()
app = workflow.compile(checkpointer = memory)
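
The `add_messages` annotation is what lets `call_model` return only the new response while the earlier messages stay in the state. The sketch below imitates that merge behaviour with plain dictionaries. `merge_messages` is a hypothetical, simplified stand-in and not the library function: it appends new messages and replaces any existing message that has the same `id`.

```python
def merge_messages(existing, update):
    """Simplified reducer: append new messages, replace ones whose id matches."""
    merged = list(existing)
    # Remember where each identified message sits so it can be replaced.
    positions = {m["id"]: i for i, m in enumerate(merged) if m.get("id")}
    for message in update:
        message_id = message.get("id")
        if message_id in positions:
            merged[positions[message_id]] = message
        else:
            if message_id:
                positions[message_id] = len(merged)
            merged.append(message)
    return merged


state = [{"id": "1", "role": "human", "content": "Hi! I'm Somendra."}]

# A node returns only the new message; the reducer appends it.
state = merge_messages(state, [{"id": "2", "role": "ai", "content": "Ciao Somendra!"}])

# Returning a message with an existing id replaces it instead of duplicating it.
state = merge_messages(state, [{"id": "2", "role": "ai", "content": "Ciao, Somendra!"}])

print(len(state))
print(state[-1]["content"])
```

A key without a reducer, such as `language`, is simply overwritten when a new value is supplied.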

In [34]:
config = {"configurable": {"thread_id": "abc456"}}
query = "Hi! I'm Somendra."
language = "Italian"

input_messages = [HumanMessage(query)]
output = app.invoke(
    {"messages": input_messages, "language": language},
    config,
)
output["messages"][-1].pretty_print()


Ciao Somendra!


In [35]:
query = "What is my name?"

input_messages = [HumanMessage(query)]
output = app.invoke(
    {"messages": input_messages},
    config,
)
output["messages"][-1].pretty_print()


Il tuo nome è Somendra!


In [36]:
query = "describe me moon! in a very poetic way?"

input_messages = [HumanMessage(query)]
output = app.invoke(
    {"messages": input_messages},
    config,
)
output["messages"][-1].pretty_print()


La luna! È una sfumatura di crema sulla superficie del cielo, una luna che si sdilia lentamente sulla notte, donando un velo di mistero ai nostri desideri. È un'opera d'arte celeste, una pietra preziosa che splende di luce argentata, come se le stelle avessero deciso di porla nel cielo come un dono per noi, i mortali. La sua bellezza è come un fiore che si apre lentamente, rilasciando il profumo dei nostri sogni e dei nostri desideri. È la casa dei nostri sentimenti, dove la nostalgia e la speranza si incontrano per creare un'arpa di emozioni che risuonano nel nostro animo.


In [37]:
query = "Translate in english, what you just decribed"

input_messages = [HumanMessage(query)]
output = app.invoke(
    {"messages": input_messages},
    config,
)
output["messages"][-1].pretty_print()


The moon! It's a haze of cream on the sky's surface, a moon that slowly drifts across the night, veiling our desires with mystery. It's a celestial work of art, a precious stone that shines with silver light, as if the stars had decided to place it in the sky as a gift for us mortals. Its beauty is like a flower that slowly opens, releasing the scent of our dreams and desires. It's the home of our feelings, where nostalgia and hope meet to create a harp of emotions that resonate in our soul.


# Managing Conversation History

One important concept to understand when building chatbots is how to manage conversation history. If left unmanaged, the list of messages grows without bound and can overflow the context window of the LLM.

* It is important to add a step that limits the size of the messages you are passing in.

* Importantly, you will want to do this BEFORE the prompt template but AFTER you load previous messages from the message history.

* We can do this by adding a simple step in front of the prompt that modifies the `messages` key appropriately. In a LangGraph application, that step goes inside the node function, which receives the persisted messages as part of its state.

* In this case we'll use the `trim_messages` helper to reduce how many messages we're sending to the model. The trimmer lets us specify how many tokens to keep, along with other parameters such as whether to always keep the system message and whether to allow partial messages.
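
To build intuition for what a "keep the last N tokens" strategy does, here is a simplified sketch in plain Python. It is not the implementation of `trim_messages`: `count_tokens` is a crude hypothetical counter (one token per word), messages are `(role, text)` tuples, and only whole messages are kept.

```python
def count_tokens(message):
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(message[1].split())


def trim_last(messages, max_tokens, include_system=True, start_on="human"):
    """Keep the newest messages that fit in max_tokens, optionally keeping
    the leading system message and starting the kept window on start_on."""
    system = []
    rest = list(messages)
    if include_system and rest and rest[0][0] == "system":
        system = [rest.pop(0)]

    budget = max_tokens - sum(count_tokens(m) for m in system)
    kept = []
    for message in reversed(rest):  # walk from newest to oldest
        cost = count_tokens(message)
        if cost > budget:
            break
        kept.insert(0, message)
        budget -= cost

    # Drop leading messages until the window starts on the requested role.
    while kept and kept[0][0] != start_on:
        kept.pop(0)
    return system + kept


sample = [
    ("system", "you're a good assistant"),
    ("human", "hi! I'm Somendra"),
    ("ai", "hi!"),
    ("human", "whats 2 + 2"),
    ("ai", "4"),
    ("human", "having fun?"),
    ("ai", "yes!"),
]

trimmed = trim_last(sample, max_tokens=10)
for role, text in trimmed:
    print(f"{role}: {text}")
```

With this budget the system message and the most recent exchange survive, while the message containing the name is dropped, which is why a trimmed chatbot can forget early details.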

In [38]:
from langchain_core.messages import SystemMessage, trim_messages

trimmer = trim_messages(
    max_tokens = 65,
    strategy = "last",
    token_counter = model,
    include_system = True,
    allow_partial = False,
    start_on = "human",
)

messages = [
    SystemMessage(content="you're a good assistant"),
    HumanMessage(content="hi! I'm Somendra"),
    AIMessage(content="hi!"),
    HumanMessage(content="I like vanilla ice cream"),
    AIMessage(content="nice"),
    HumanMessage(content="whats 2 + 2"),
    AIMessage(content="4"),
    HumanMessage(content="thanks"),
    AIMessage(content="no problem!"),
    HumanMessage(content="having fun?"),
    AIMessage(content="yes!"),
]

trimmer.invoke(messages)



[SystemMessage(content="you're a good assistant", additional_kwargs={}, response_metadata={}),
 HumanMessage(content="hi! I'm Somendra", additional_kwargs={}, response_metadata={}),
 AIMessage(content='hi!', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='I like vanilla ice cream', additional_kwargs={}, response_metadata={}),
 AIMessage(content='nice', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='whats 2 + 2', additional_kwargs={}, response_metadata={}),
 AIMessage(content='4', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='thanks', additional_kwargs={}, response_metadata={}),
 AIMessage(content='no problem!', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='having fun?', additional_kwargs={}, response_metadata={}),
 AIMessage(content='yes!', additional_kwargs={}, response_metadata={})]

To use it in our chain, we just need to run the trimmer before we pass the `messages` input to our prompt.

In [40]:
workflow  = StateGraph(state_schema=State)

def call_model(state : State):
  chain = prompt | model
  trimmed_messages = trimmer.invoke(state["messages"])
  response = chain.invoke(
      {"messages" : trimmed_messages,"language": state["language"]}
      )
  return {"messages" : [response]}

workflow.add_edge(START, "model")
workflow.add_node("model",call_model)

memory = MemorySaver()
app = workflow.compile(checkpointer = memory)

Now if we try asking the model our name, it won't know it since we trimmed that part of the chat history:

In [42]:
config = {"configurable": {"thread_id": "abc567"}}
query = "What is my name?"
language = "English"

input_messages = messages + [HumanMessage(query)]  # prepend the sample history so there is something to trim
output = app.invoke(
    {"messages": input_messages, "language": language},
    config,
)
output["messages"][-1].pretty_print()


I apologize, but I don't know your name. As the world's most popular poet, I'm more familiar with the nuances of language and the rhythm of words, but I don't have personal knowledge of individual names.


In [43]:
config = {"configurable": {"thread_id": "abc678"}}
query = "What math problem did I ask?"
language = "English"

input_messages = messages + [HumanMessage(query)]
output = app.invoke(
    {"messages": input_messages, "language": language},
    config,
)
output["messages"][-1].pretty_print()


You asked: "whats 2 + 2"


# Streaming

Now we've got a functioning chatbot. However, one really important UX consideration for chatbot applications is streaming. LLMs can take a while to respond, so streaming each token back as it is generated lets the user see progress.

By default, `.stream` in a LangGraph application streams application steps, in this case the single step of the model response. Setting `stream_mode="messages"` streams output tokens instead:

In [44]:
config = {"configurable": {"thread_id": "abc789"}}
query = "Hi I'm Todd, please tell me a joke."
language = "English"

input_messages = [HumanMessage(query)]
for chunk, metadata in app.stream(
    {"messages": input_messages, "language": language},
    config,
    stream_mode="messages",
):
    if isinstance(chunk, AIMessage):  # Filter to just model responses
        print(chunk.content, end="|")

|Todd|,| my| dear| friend|,| I|'m| glad| you| asked|!| Here|'s| one| that|'s| sure| to| tick|le| your| fancy|:

|Why| did| the| poet|'s| son| bring| a| ladder| to| school|?

|Because| he| wanted| to| elevate| his| learning|!

|Hope| that| brought| a| smile| to| your| face|,| Todd|!||
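
The `|` separators in the output above mark chunk boundaries. An application would usually display each chunk as it arrives and also join the chunks to store the full reply. The sketch below shows that pattern with a hypothetical generator, `fake_token_stream`, standing in for the model stream so it runs without an API key; real chunks will not necessarily split on word boundaries.

```python
def fake_token_stream(text):
    """Hypothetical stand-in for a streaming model: yield word-sized chunks."""
    words = text.split(" ")
    for index, word in enumerate(words):
        # Re-attach the leading space so joining the chunks restores the text.
        yield word if index == 0 else " " + word


chunks = []
for chunk in fake_token_stream("Why did the poet bring a ladder?"):
    print(chunk, end="|")   # show each chunk as soon as it arrives
    chunks.append(chunk)    # keep it so the full reply can be rebuilt
print()

full_reply = "".join(chunks)
print(full_reply)
```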