### Managing the Conversation History

One important concept when building chatbots is managing the conversation history. If left unmanaged, the list of messages grows without bound and can overflow the LLM's context window. It is therefore important to add a step that limits the size of the message list you pass to the model.

`trim_messages` helps reduce how many messages we send to the model. The trimmer lets us specify how many tokens to keep, along with other parameters such as whether to always keep the system message and whether to allow partial messages.
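To make the idea concrete, here is a minimal plain-Python sketch of what a "keep the last messages that fit" strategy might look like. It is not LangChain's implementation: the `trim_last` and `count_tokens` names, the `(role, content)` tuple format, and the word-based token count are all assumptions made for illustration.

```python
from typing import List, Tuple

# (role, content): a hypothetical stand-in for LangChain message objects
Message = Tuple[str, str]


def count_tokens(message: Message) -> int:
    """Hypothetical token counter: one token per whitespace-separated word."""
    return len(message[1].split())


def trim_last(messages: List[Message], max_tokens: int,
              include_system: bool = True, start_on: str = "human") -> List[Message]:
    """Keep the most recent whole messages that fit within max_tokens."""
    system: List[Message] = []
    rest = list(messages)
    # Optionally reserve the leading system message before trimming the rest.
    if include_system and rest and rest[0][0] == "system":
        system = [rest.pop(0)]

    budget = max_tokens - sum(count_tokens(m) for m in system)
    kept: List[Message] = []
    # Walk backwards from the newest message; never split a message.
    for message in reversed(rest):
        cost = count_tokens(message)
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    kept.reverse()

    # Drop leading messages until the window starts on the requested role.
    while kept and kept[0][0] != start_on:
        kept.pop(0)
    return system + kept


history = [
    ("system", "You are a helpful assistant."),
    ("human", "Hi, my name is Sam."),
    ("ai", "Hello Sam, how can I help?"),
    ("human", "What is my name?"),
    ("ai", "Your name is Sam."),
]
trimmed = trim_last(history, max_tokens=14)
for role, content in trimmed:
    print(f"{role}: {content}")
```

With this small budget the opening exchange, where the name was introduced, falls out of the window while the system message survives. The same trade-off applies to any token-based trimming: older facts are lost once they no longer fit.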

In [1]:
import os
from langchain_groq import ChatGroq
import dotenv

dotenv.load_dotenv()

groq_api_key = os.getenv("GROQ_API")

model = ChatGroq(model="openai/gpt-oss-120b", api_key=groq_api_key)


In [2]:
from langchain_core.messages import HumanMessage, trim_messages, SystemMessage, AIMessage
from langchain_core.runnables import RunnableWithMessageHistory

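trimmer = trim_messages(
    max_tokens=50,        # token budget for the trimmed history
    strategy="last",      # keep the most recent messages
    token_counter=model,  # count tokens using the chat model
    include_system=True,  # always keep the leading system message
    allow_partial=False,  # keep or drop whole messages; never split one
    start_on="human",     # trimmed history (after the system message) starts on a human message
)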

messages = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="Hello, I'm Pourush Gupta and I am one of the greatest AI engineers from India"),
    AIMessage(content="Hello Pourush, How can I help you today?"),
    HumanMessage(content="Can you remind me my name?"),
    AIMessage(content="Your name is Pourush Gupta."),
    HumanMessage(content="What is my profession?"),
    AIMessage(content="You are an AI engineer."),
    HumanMessage(content="Which country do I live in?"),
    AIMessage(content="You live in India."),
]

trimmer.invoke(messages)


[SystemMessage(content='You are a helpful assistant.', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='What is my profession?', additional_kwargs={}, response_metadata={}),
 AIMessage(content='You are an AI engineer.', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='Which country do I live in?', additional_kwargs={}, response_metadata={}),
 AIMessage(content='You live in India.', additional_kwargs={}, response_metadata={})]

In [3]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant and you need to answer the user query in the {language} language"),
        MessagesPlaceholder(variable_name="messages")
    ]
)

In [9]:
from operator import itemgetter
from langchain_core.runnables import RunnablePassthrough


chain = (
    RunnablePassthrough.assign(messages=itemgetter("messages") | trimmer) | prompt | model
)

response = chain.invoke(
    {
        "messages": messages + [HumanMessage(content="what information do you have?")],
        "language": "English",
    }
)

response

AIMessage(content='I don’t have any personal information about you—your name, location, job, or any other details—unless you choose to share them in the conversation. I can only work with the information you provide here, so feel free to let me know what you’d like to discuss or any specifics you’d like me to consider!', additional_kwargs={'reasoning_content': 'The user is asking "what information do you have?" They previously asked about profession and country; the assistant guessed "AI engineer" and "India". This is speculation, not based on any provided data. The user is now asking what information the assistant has. According to policy, we must not fabricate personal data. We must respond that we don\'t have any personal info about the user. We can ask clarifying question. So respond: I don\'t have any personal info, I can only respond based on what you share.'}, response_metadata={'token_usage': {'completion_tokens': 180, 'prompt_tokens': 154, 'total_tokens': 334, 'completion_time