## Chat models

LangChain provides a consistent interface for working with chat models from different providers while offering additional features for monitoring, debugging, and optimizing the performance of applications that use LLMs.

### Standard Parameters

Many chat models have standardized parameters that can be used to configure the model:

| **Parameter**   | **Description** |
|-----------------|-----------------|
| `model`         | The name or identifier of the specific AI model you want to use (e.g., `"gpt-3.5-turbo"` or `"gpt-4"`). |
| `temperature`   | Controls the randomness of the model's output. A higher value (e.g., `1.0`) makes responses more creative, while a lower value (e.g., `0.0`) makes them more deterministic and focused. |
| `timeout`       | The maximum time (in seconds) to wait for a response from the model before canceling the request. Ensures the request doesn’t hang indefinitely. |
| `max_tokens`    | Limits the total number of tokens in the response. Tokens are sub-word units, so the limit corresponds only roughly to a word count. This controls how long the output can be. |
| `stop`          | Specifies stop sequences that indicate when the model should stop generating tokens. For example, you might use specific strings to signal the end of a response. |
| `max_retries`   | The maximum number of attempts the system will make to resend a request if it fails due to issues like network timeouts or rate limits. |
| `api_key`       | The API key required for authenticating with the model provider. This is usually issued when you sign up for access to the model. |
| `base_url`      | The URL of the API endpoint where requests are sent. This is typically provided by the model's provider and is necessary for directing your requests. |
| `rate_limiter`  | An optional `BaseRateLimiter` to space out requests to avoid exceeding rate limits. See rate-limiting below for more details. |
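
As a rough sketch of how several of these parameters might be set together, the snippet below collects them in a plain dictionary that can be unpacked into a chat model constructor. The specific values are illustrative assumptions, not recommendations, and `stop` and `rate_limiter` are left out to keep the example short. The commented-out lines assume `langchain_openai` is installed and an OpenAI key is available.

```python
import os

# Illustrative values only; tune them for your provider and workload.
standard_params = {
    "model": "gpt-3.5-turbo",  # model identifier
    "temperature": 0.0,        # low value for more deterministic output
    "timeout": 30,             # seconds to wait before giving up
    "max_tokens": 256,         # upper bound on response length, in tokens
    "max_retries": 2,          # resend attempts after a failed request
}

# Pass the key explicitly only when the environment variable is set.
api_key = os.environ.get("OPENAI_API_KEY")
if api_key:
    standard_params["api_key"] = api_key

# With langchain_openai installed and a valid key, the dict can be unpacked:
#   from langchain_openai import ChatOpenAI
#   model = ChatOpenAI(**standard_params)
print(sorted(standard_params))
```

Keeping the settings in one dictionary makes it easy to reuse the same configuration across several model instances.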


In [1]:
from dotenv import load_dotenv
load_dotenv()
from langchain_openai import ChatOpenAI

In [2]:
model = ChatOpenAI(temperature=0.0)

result = model.invoke('Hi! How are you?')
print(result)
print(type(result))

content="Hello! I'm just a computer program, so I don't have feelings, but I'm here to help you. How can I assist you today?" additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 32, 'prompt_tokens': 13, 'total_tokens': 45, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'id': 'chatcmpl-BEmmK799zYy0EurV9lWFfF5HEtO9h', 'finish_reason': 'stop', 'logprobs': None} id='run-2ae2725a-d9db-4f01-96ef-0f03075f6a44-0' usage_metadata={'input_tokens': 13, 'output_tokens': 32, 'total_tokens': 45, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}}
<class 'langchain_core.messages.ai.AIMessage'>


## Messages

Messages are the unit of communication in chat models. They are used to represent the input and output of a chat model, as well as any additional context or metadata that may be associated with a conversation.

Each message has a role (e.g., "user", "assistant") and content (e.g., text, multimodal data) with additional metadata that varies depending on the chat model provider.

LangChain provides a unified message format that can be used across chat models, allowing users to work with different chat models without worrying about the specific details of the message format used by each model provider.

### What is inside a message?
A message typically consists of the following pieces of information:

- **Role**: The role of the message (e.g., "user", "assistant").
- **Content**: The content of the message (e.g., text, multimodal data).
- **Additional metadata**: id, name, token usage and other model-specific metadata.

### Role

Roles are used to distinguish between different types of messages in a conversation and help the chat model understand how to respond to a given sequence of messages.

| **Role**         | **Description** |
|------------------|-----------------|
| `system`         | Used to tell the chat model how to behave and provide additional context. Not supported by all chat model providers. |
| `user`           | Represents input from a user interacting with the model, usually in the form of text or other interactive input. |
| `assistant`      | Represents a response from the model, which can include text or a request to invoke tools. |
| `tool`           | A message used to pass the results of a tool invocation back to the model after external data or processing has been retrieved. Used with chat models that support tool calling. |
| `function (legacy)` | A legacy role corresponding to OpenAI's legacy function-calling API. Use the `tool` role instead. |


### HumanMessage
The HumanMessage corresponds to the "user" role. A human message represents input from a user interacting with the model.

In [3]:
from langchain_core.messages import HumanMessage

model.invoke([HumanMessage(content="Hello, how are you?")])

AIMessage(content="Hello! I'm just a computer program, so I don't have feelings, but I'm here to help you. How can I assist you today?", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 32, 'prompt_tokens': 13, 'total_tokens': 45, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'id': 'chatcmpl-BEmnKPB7KoWfw9ZFgBNMZrrS4m5Ib', 'finish_reason': 'stop', 'logprobs': None}, id='run-ec6b2027-ac90-44c1-be9d-52a7262c59d5-0', usage_metadata={'input_tokens': 13, 'output_tokens': 32, 'total_tokens': 45, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

In [4]:
from langchain_core.messages import HumanMessage
ai_message = model.invoke([HumanMessage("Tell me a joke")])
ai_message # <-- AIMessage

AIMessage(content="Why couldn't the bicycle stand up by itself?\n\nBecause it was two tired!", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 11, 'total_tokens': 28, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'id': 'chatcmpl-BEmnL8qhmNtldsaKAItz5sMvrvbhS', 'finish_reason': 'stop', 'logprobs': None}, id='run-acbb3db4-721a-48c5-a09d-20002f052c79-0', usage_metadata={'input_tokens': 11, 'output_tokens': 17, 'total_tokens': 28, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

### AIMessage
**AIMessage** is used to represent a message with the role **assistant**: this is the response from the model, which can include text or a request to invoke tools.

#### AIMessage Attributes

An `AIMessage` has the following attributes. The attributes marked as **Standardized** are ones that LangChain attempts to unify across different chat model providers. **Raw** fields are specific to the model provider and may vary.

| **Attribute**         | **Standardized/Raw** | **Description** |
|------------------------|----------------------|-----------------|
| `content`              | Raw                  | Usually a string, but can be a list of content blocks. See content for details. |
| `tool_calls`           | Standardized         | Tool calls associated with the message. See tool calling for details. |
| `invalid_tool_calls`   | Standardized         | Tool calls with parsing errors associated with the message. See tool calling for details. |
| `usage_metadata`       | Standardized         | Usage metadata for a message, such as token counts. See Usage Metadata API Reference. |
| `id`                   | Standardized         | An optional unique identifier for the message, ideally provided by the provider/model that created the message. |
| `response_metadata`    | Raw                  | Response metadata, e.g., response headers, logprobs, token counts. |


In [5]:
from langchain_core.messages import HumanMessage
ai_message = model.invoke([HumanMessage("Tell me a joke")])
ai_message # <-- AIMessage

AIMessage(content="Why couldn't the bicycle stand up by itself?\n\nBecause it was two tired!", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 11, 'total_tokens': 28, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'id': 'chatcmpl-BEmnPutG9QDCbaqAqM3E5PRxrNUEL', 'finish_reason': 'stop', 'logprobs': None}, id='run-2cb510ad-a117-4adb-a006-bbf0bc1fac11-0', usage_metadata={'input_tokens': 11, 'output_tokens': 17, 'total_tokens': 28, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

### AIMessageChunk
Chat model responses are commonly streamed as they are generated, so the user sees output in real time instead of waiting for the complete response.

Each streamed piece is an `AIMessageChunk`. This is the type returned from the `stream`, `astream` and `astream_events` methods of the chat model.

In [6]:
chunks = []
for chunk in model.stream([HumanMessage("what color is the sky?")]):
    chunks.append(chunk)
    print(chunk)

content='' additional_kwargs={} response_metadata={} id='run-d95651aa-0efa-4150-ab6c-614db45f3545'
content='The' additional_kwargs={} response_metadata={} id='run-d95651aa-0efa-4150-ab6c-614db45f3545'
content=' color' additional_kwargs={} response_metadata={} id='run-d95651aa-0efa-4150-ab6c-614db45f3545'
content=' of' additional_kwargs={} response_metadata={} id='run-d95651aa-0efa-4150-ab6c-614db45f3545'
content=' the' additional_kwargs={} response_metadata={} id='run-d95651aa-0efa-4150-ab6c-614db45f3545'
content=' sky' additional_kwargs={} response_metadata={} id='run-d95651aa-0efa-4150-ab6c-614db45f3545'
content=' can' additional_kwargs={} response_metadata={} id='run-d95651aa-0efa-4150-ab6c-614db45f3545'
content=' vary' additional_kwargs={} response_metadata={} id='run-d95651aa-0efa-4150-ab6c-614db45f3545'
content=' depending' additional_kwargs={} response_metadata={} id='run-d95651aa-0efa-4150-ab6c-614db45f3545'
content=' on' additional_kwargs={} response_metadata={} id='run-d95651

#### Aggregating
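`AIMessageChunk` objects support the `+` operator, which merges them into a single message. The result is itself an `AIMessageChunk` (a subclass of `AIMessage`), so it can be used wherever an `AIMessage` is expected. This is useful when you want to display the final response to the user.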

In [7]:
ai_message = chunks[0] + chunks[1] + chunks[2] + chunks[3] + chunks[4] + chunks[5]
print(ai_message)

content='The color of the sky' additional_kwargs={} response_metadata={} id='run-d95651aa-0efa-4150-ab6c-614db45f3545'


### ToolMessage
A `ToolMessage` represents a message with the role `tool`, which contains the result of calling a tool.
We'll look at it in more detail later.