# Messages
source: https://python.langchain.com/docs/how_to/#messages

LangChain **messages** are the inputs and outputs of chat models.  
Each message has content and a role, which identifies the source of the message.

Common operations on messages:
- trim messages
- filter messages
- merge consecutive messages of the same type


## Trim messages

All models have finite context windows, meaning there's a limit to how many tokens they can take as input. If you have very long messages or a chain/agent that accumulates a long message history, you'll need to manage the length of the messages you're passing in to the model.
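
To get a feel for when a history is getting too long, a rough character-based estimate is often enough. The sketch below is a hypothetical heuristic (about four characters per token plus a small per-message overhead). The name `estimate_tokens`, both constants, and the budget are assumptions for illustration; this is neither LangChain's counter nor any model's real tokenizer.

```python
import math

# Assumed heuristic constants; real tokenizers differ from model to model.
CHARS_PER_TOKEN = 4
OVERHEAD_PER_MESSAGE = 3


def estimate_tokens(history):
    """Roughly estimate the token count of a list of (role, content) pairs."""
    total = 0
    for role, content in history:
        total += math.ceil((len(role) + len(content)) / CHARS_PER_TOKEN)
        total += OVERHEAD_PER_MESSAGE
    return total


history = [
    ("system", "you're a good assistant, you always respond with a joke."),
    ("human", "i wonder why it's called langchain"),
    ("human", "what do you call a speechless parrot"),
]

budget = 45  # hypothetical token budget
used = estimate_tokens(history)
print(f"estimated tokens: {used}, over budget: {used > budget}")
```

When the estimate exceeds the budget, the history needs to be shortened before it is sent to the model.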

`trim_messages` can be used to reduce the size of a chat history to a specified token count or a specified message count.

In [None]:
from langchain_core.messages.utils import count_tokens_approximately
from pprint import pprint
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage, ToolMessage, trim_messages

def get_msgs():
    msg = [
        SystemMessage("you're a good assistant, you always respond with a joke."),
        HumanMessage("i wonder why it's called langchain"),
        AIMessage('Well, I guess they thought "WordRope" and "SentenceString" just didn\'t have the same ring to it!'),
        HumanMessage("and who is harrison chasing anyways"),
        AIMessage("Hmmm let me think.\n\nWhy, he's probably chasing after the last cup of coffee in the office!"),
        HumanMessage("what do you call a speechless parrot"),
    ]
    return msg

messages = get_msgs()

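print('--- BEFORE TRIM:')
pprint(messages)
print('\n\n--- AFTER TRIM:')

trim_messages(
    messages,
    # Keep the most recent messages that fit within max_tokens.
    strategy="last",

    # Remember to adjust based on your model or else pass a custom token_counter
    token_counter=count_tokens_approximately,

    # Remember to adjust based on the desired conversation length
    max_tokens=45,

    # Most chat models expect that chat history starts with either:
    # (1) a HumanMessage or
    # (2) a SystemMessage followed by a HumanMessage
    start_on="human",

    # Most chat models expect that chat history ends with either:
    # (1) a HumanMessage or
    # (2) a ToolMessage
    end_on=("human", "tool"),

    # Usually, we want to keep the SystemMessage if it's present in the original history.
    # The SystemMessage has special instructions for the model.
    include_system=True,

    # Drop messages that do not fit entirely instead of splitting them.
    allow_partial=False,
)


--- BEFORE TRIM:
[SystemMessage(content="you're a good assistant, you always respond with a joke.", additional_kwargs={}, response_metadata={}),
 HumanMessage(content="i wonder why it's called langchain", additional_kwargs={}, response_metadata={}),
 AIMessage(content='Well, I guess they thought "WordRope" and "SentenceString" just didn\'t have the same ring to it!', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='and who is harrison chasing anyways', additional_kwargs={}, response_metadata={}),
 AIMessage(content="Hmmm let me think.\n\nWhy, he's probably chasing after the last cup of coffee in the office!", additional_kwargs={}, response_metadata={}),
 HumanMessage(content='what do you call a speechless parrot', additional_kwargs={}, response_metadata={})]


--- AFTER TRIM: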


[SystemMessage(content="you're a good assistant, you always respond with a joke.", additional_kwargs={}, response_metadata={}),
 HumanMessage(content='what do you call a speechless parrot', additional_kwargs={}, response_metadata={})]
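
The result above keeps the system message plus the newest human message. The logic behind the `"last"` strategy can be sketched in plain Python. This is a simplified illustration under stated assumptions, not LangChain's implementation: `trim_last` is a hypothetical helper, messages are modelled as `(role, content)` pairs, and `end_on` and `allow_partial` are left out.

```python
def trim_last(history, max_tokens, counter, include_system=True, start_on="human"):
    """Keep the most recent messages that fit within max_tokens.

    history: list of (role, content) pairs, oldest first.
    counter: function mapping one (role, content) pair to a token count.
    """
    system = []
    rest = list(history)
    if include_system and rest and rest[0][0] == "system":
        system = [rest.pop(0)]

    # The system message is paid for first, out of the same budget.
    budget = max_tokens - sum(counter(msg) for msg in system)

    kept = []
    # Walk backwards from the newest message while the budget allows.
    for msg in reversed(rest):
        cost = counter(msg)
        if cost > budget:
            break
        kept.insert(0, msg)
        budget -= cost

    # Drop leading messages until the window starts on the requested role.
    while kept and kept[0][0] != start_on:
        kept.pop(0)
    return system + kept


history = [
    ("system", "you're a good assistant, you always respond with a joke."),
    ("human", "i wonder why it's called langchain"),
    ("ai", "Well, I guess they thought WordRope just didn't have the same ring to it!"),
    ("human", "and who is harrison chasing anyways"),
    ("ai", "Why, he's probably chasing after the last cup of coffee in the office!"),
    ("human", "what do you call a speechless parrot"),
]

# Counting every message as 1 turns max_tokens into a message limit.
by_count = trim_last(history, max_tokens=3, counter=lambda msg: 1)
print([role for role, _ in by_count])  # ['system', 'human']
```

With a limit of three messages, the system message and the two newest messages fit, but the window would then start on an AI message, so that message is dropped as well. Swapping in a different `counter` changes the unit of the budget without changing the trimming logic.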

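## Chaining

`trim_messages` can also be used declaratively: when called without a message list, it returns a runnable that can be piped into a chat model, so trimming always runs before the model sees the history.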

In [16]:
from pprint import pprint
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage, ToolMessage, trim_messages
from langchain_ollama import ChatOllama

# llm = ChatOpenAI(model="gpt-4o")
llm = ChatOllama(base_url="http://localhost:11434", model="gpt-oss:20b", temperature=0.)


# Notice we don't pass in messages. This creates
# a RunnableLambda that takes messages as input
trimmer = trim_messages(
    # Passing the chat model as token_counter makes trim_messages count
    # tokens with the model's get_num_tokens_from_messages() method.
    # (With token_counter=len, each message would instead be counted
    # as a single token, so max_tokens would act as a message limit.)
    token_counter=llm,
    # Keep the most recent messages that fit within max_tokens.
    strategy="last",
    # Remember to adjust for your use case
    max_tokens=45,
    # Most chat models expect that chat history starts with either:
    # (1) a HumanMessage or
    # (2) a SystemMessage followed by a HumanMessage
    start_on="human",
    # Most chat models expect that chat history ends with either:
    # (1) a HumanMessage or
    # (2) a ToolMessage
    end_on=("human", "tool"),
    # Usually, we want to keep the SystemMessage
    # if it's present in the original history.
    # The SystemMessage has special instructions for the model.
    include_system=True,
)

chain = trimmer | llm

messages = get_msgs()
chain.invoke(messages)



AIMessage(content='What do you call a speechless parrot?  \nA *parrot* that’s “*mutt*‑ed” into silence—because even birds can have a *mute*‑ary!', additional_kwargs={}, response_metadata={'model': 'gpt-oss:20b', 'created_at': '2025-09-08T09:56:46.245168838Z', 'done': True, 'done_reason': 'stop', 'total_duration': 61396747112, 'load_duration': 9643177903, 'prompt_eval_count': 96, 'prompt_eval_duration': 383365400, 'eval_count': 7115, 'eval_duration': 51295714130, 'model_name': 'gpt-oss:20b'}, id='run--83ba9e80-b073-4f5b-9aca-81cdd63db5ae-0', usage_metadata={'input_tokens': 96, 'output_tokens': 7115, 'total_tokens': 7211})
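
The `|` operator composes the trimmer and the model so that the trimmer's output becomes the model's input. The toy classes below mimic that composition in plain Python. `Step`, `fake_trimmer`, and `fake_model` are hypothetical stand-ins for illustration only; they are not LangChain's runnable classes and make no model call.

```python
class Step:
    """Wrap a function so that steps can be chained with `|`."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # The left step runs first; its output feeds the right step.
        return Step(lambda value: other.func(self.func(value)))

    def invoke(self, value):
        return self.func(value)


# Hypothetical stand-ins for the trimmer and the chat model.
fake_trimmer = Step(lambda msgs: msgs[-2:])  # keep the last two messages
fake_model = Step(lambda msgs: f"model saw {len(msgs)} messages")

toy_chain = fake_trimmer | fake_model
result = toy_chain.invoke(["m1", "m2", "m3", "m4"])
print(result)  # model saw 2 messages
```

Because the trimming step is part of the chain, callers can pass the full history every time and the model only ever receives the trimmed window.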