[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pinecone-io/examples/blob/master/learn/generation/langchain/handbook/04-langchain-chat.ipynb) [![Open nbviewer](https://raw.githubusercontent.com/pinecone-io/examples/master/assets/nbviewer-shield.svg)](https://nbviewer.org/github/pinecone-io/examples/blob/master/learn/generation/langchain/handbook/04-langchain-chat.ipynb)

In [1]:
!pip install -qU \
  langchain==0.3.25 \
  langchain-openai==0.3.22




In [2]:
import os
from getpass import getpass
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser
from langchain_core.messages import (
    SystemMessage,
    HumanMessage,
    AIMessage
)
from langchain_core.prompts import (
    HumanMessagePromptTemplate,
    ChatPromptTemplate,
    SystemMessagePromptTemplate
)


We'll start by initializing the `ChatOpenAI` object. For this we'll need an [OpenAI API key](https://platform.openai.com/account/api-keys). Note that running this notebook incurs a small cost because OpenAI's API is a paid service.

In [3]:
os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY") \
    or getpass("Enter your OpenAI API key: ")

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

Initialize the `ChatOpenAI` object. We'll set `temperature=0` to minimize randomness and make outputs more repeatable.

In [4]:
# Initialize the chat model
chat = ChatOpenAI(
    openai_api_key=OPENAI_API_KEY,
    temperature=0,
    model='gpt-4.1-mini'
)

Chats with OpenAI chat models such as `gpt-4.1-mini` are typically structured like so:

```
System: You are a helpful assistant.

User: Hi AI, how are you today?

Assistant: I'm great thank you. How can I help you?

User: I'd like to understand string theory.
```

A final `"Assistant:"` line with no response, added after the last user message, is what would prompt the model to continue the conversation. In OpenAI's official Chat Completions endpoint, these messages are passed to the model in a format like:

```json
[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hi AI, how are you today?"},
    {"role": "assistant", "content": "I'm great thank you. How can I help you?"},
    {"role": "user", "content": "I'd like to understand string theory."}
]
```
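To make the link between the two views concrete, here is a minimal sketch that flattens a role/content list into the transcript form shown earlier and appends the trailing `Assistant:` cue. The `render_transcript` helper and its `ROLE_LABELS` mapping are illustrative assumptions, not part of the OpenAI or LangChain APIs:

```python
# Hypothetical helper: renders role/content dicts as a plain-text transcript.
ROLE_LABELS = {"system": "System", "user": "User", "assistant": "Assistant"}


def render_transcript(messages: list[dict]) -> str:
    """Join messages as 'Label: content' and end with an empty assistant turn."""
    lines = [f"{ROLE_LABELS[m['role']]}: {m['content']}" for m in messages]
    lines.append("Assistant:")  # the cue that asks the model to continue
    return "\n\n".join(lines)


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hi AI, how are you today?"},
    {"role": "assistant", "content": "I'm great thank you. How can I help you?"},
    {"role": "user", "content": "I'd like to understand string theory."},
]

transcript = render_transcript(messages)
print(transcript)
```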

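In LangChain the format is slightly different. Each message is written as a `(role, content)` tuple inside a `ChatPromptTemplate`, which LangChain turns into *message* objects (`SystemMessage`, `HumanMessage`, and `AIMessage`) like so: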

In [5]:


# Create the message templates
system_template = "You are a helpful assistant."
human_template = "{input}"

# Create the prompt template
prompt = ChatPromptTemplate.from_messages([
    ("system", system_template),
    ("human", human_template)
])

# Create the chain
chain = prompt | chat | StrOutputParser()

# Test the chain
result = chain.invoke({"input": "Hi AI, how are you today?"})
print(result)

Hello! I'm doing great, thank you for asking. How can I assist you today?


The format is very similar. We're just swapping the `"user"` role for `"human"` (a `HumanMessage`) and the `"assistant"` role for `"ai"` (an `AIMessage`).
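As a rough sketch of that role swap, the snippet below converts OpenAI-style message dicts into `(role, content)` tuples. The `OPENAI_TO_LANGCHAIN` mapping and `to_langchain_tuples` helper are hypothetical names introduced for illustration only:

```python
# Hypothetical mapping from OpenAI role names to LangChain tuple role names.
OPENAI_TO_LANGCHAIN = {"system": "system", "user": "human", "assistant": "ai"}


def to_langchain_tuples(openai_messages):
    """Convert [{'role': ..., 'content': ...}] into [(role, content), ...]."""
    return [
        (OPENAI_TO_LANGCHAIN[m["role"]], m["content"])
        for m in openai_messages
    ]


openai_messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hi AI, how are you today?"},
    {"role": "assistant", "content": "I'm great thank you. How can I help you?"},
]

converted = to_langchain_tuples(openai_messages)
print(converted)
```

The resulting tuples have the same shape as the ones passed to `ChatPromptTemplate.from_messages` in this notebook.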

We generate the next response from the AI by piping a prompt that holds these messages into the `ChatOpenAI` object.

In [6]:
# Create initial messages
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("human", "Hi AI, how are you today?"),
    ("ai", "I'm great thank you. How can I help you?"),
    ("human", "I'd like to understand string theory.")
])

# Create the chain using LCEL pipe syntax
chain = prompt | chat | StrOutputParser()

# Get response using LCEL
res = chain.invoke({})
res

'Certainly! String theory is a theoretical framework in physics that attempts to reconcile quantum mechanics and general relativity, aiming to provide a unified description of all fundamental forces and particles.\n\nHere’s a basic overview:\n\n1. **Fundamental Idea**: Instead of viewing the basic building blocks of the universe as point-like particles (like electrons or quarks), string theory proposes that these are tiny, one-dimensional "strings" that can vibrate at different frequencies. Each vibration mode corresponds to a different particle.\n\n2. **Dimensions**: While we experience the universe in 3 spatial dimensions plus time, string theory requires additional spatial dimensions—typically 10 or 11 total dimensions—to be mathematically consistent. These extra dimensions are thought to be compactified or curled up at very small scales.\n\n3. **Types of Strings**: Strings can be open (with two endpoints) or closed (forming loops). Different string types and their vibrations give r

Because the chain ends with `StrOutputParser`, the response is a plain string rather than an `AIMessage` object. We can print it more clearly like so:

In [7]:
print(res)

Certainly! String theory is a theoretical framework in physics that attempts to reconcile quantum mechanics and general relativity, aiming to provide a unified description of all fundamental forces and particles.

Here’s a basic overview:

1. **Fundamental Idea**: Instead of viewing the basic building blocks of the universe as point-like particles (like electrons or quarks), string theory proposes that these are tiny, one-dimensional "strings" that can vibrate at different frequencies. Each vibration mode corresponds to a different particle.

2. **Dimensions**: While we experience the universe in 3 spatial dimensions plus time, string theory requires additional spatial dimensions—typically 10 or 11 total dimensions—to be mathematically consistent. These extra dimensions are thought to be compactified or curled up at very small scales.

3. **Types of Strings**: Strings can be open (with two endpoints) or closed (forming loops). Different string types and their vibrations give rise to di

Because `res` is the text of the AI's reply, we can add it to the prompt as an AI message, add another human message, and generate the next response in the conversation.

In [8]:
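# For the follow-up question, we can extend the existing prompt.
# Wrapping the reply in AIMessage keeps any braces in it from being read as template variables.
prompt.extend([
    AIMessage(content=res),  # Previous AI response
    ("human", "Why do physicists believe it can produce a 'unified theory'?")
])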

# Create the chain using LCEL pipe syntax
chain = prompt | chat | StrOutputParser()

# Get response
result = chain.invoke({})
result

'Great question! Physicists believe string theory has the potential to produce a "unified theory"—often called a "Theory of Everything"—because of several key reasons:\n\n1. **Incorporation of Gravity**:  \n   Traditional quantum field theories successfully describe three of the four fundamental forces (electromagnetic, weak, and strong nuclear forces) but struggle to include gravity in a consistent quantum framework. String theory naturally includes a particle that behaves like the graviton—the hypothetical quantum particle that mediates gravity—making it a promising candidate to unify gravity with the other forces.\n\n2. **Single Fundamental Entity**:  \n   Instead of treating particles and forces as fundamentally different, string theory models everything as different vibrational modes of the same fundamental object: the string. This means particles and forces emerge from one underlying principle, rather than being separate phenomena.\n\n3. **Mathematical Consistency**:  \n   String
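One caveat when a model reply is added back to a prompt as an `("ai", text)` tuple: the text is treated as a template, so with the default f-string style formatting any literal `{` or `}` in the reply would be read as a variable placeholder. A minimal sketch of the usual workaround, doubling the braces, using plain `str.format` as a stand-in for the template engine (the `escape_braces` helper and the sample reply are hypothetical):

```python
def escape_braces(text: str) -> str:
    """Double every brace so str.format-style templating keeps it literal."""
    return text.replace("{", "{{").replace("}", "}}")


# A made-up reply that happens to contain braces.
reply = "A JSON example: {'theory': 'string'}"

# Formatting the raw reply fails because the braces look like a placeholder.
try:
    reply.format()
    unescaped_ok = True
except (KeyError, IndexError, ValueError):
    unescaped_ok = False

# After escaping, formatting gives back the original text unchanged.
round_trip = escape_braces(reply).format()

print(unescaped_ok)
print(round_trip == reply)
```

Passing the reply as an `AIMessage` object instead of a template string avoids the issue altogether, since message objects are not formatted.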

## New Prompt Templates

Alongside what we've seen so far there are also three new prompt templates that we can use. Those are the `SystemMessagePromptTemplate`, `AIMessagePromptTemplate`, and `HumanMessagePromptTemplate`.

These are simply an extension of [LangChain's prompt templates](https://www.pinecone.io/learn/langchain-prompt-templates/) that modify the returned "prompt" to be a `SystemMessage`, `AIMessage`, or `HumanMessage` object respectively.

For now, there are not a huge number of use-cases for these objects. However, they can be useful if:
- You want different types of response; AND
- The types of response should depend on a set of pre-determined input values; AND
- You want to save tokens by not explicitly specifying every possible type of input value in the prompts.

This will make more sense with an example. Suppose you want to tailor responses to people from a wide variety of countries, for example with an LLM-powered worldwide translator!

Some of the languages listed have been commented out as this is just an illustrative example, but the idea is that we can have many languages and dynamically alter the `HumanMessage` prompt so that we don't have to list all of them every time.

In [9]:
languages = [
    "English",
    "Esperanto",
    "Spanish",
    # "French",
    # "German",
    # "Italian",
    # "Portuguese",
    # "Dutch",
    # "Russian",
    # "Chinese (Simplified)",
    # "Chinese (Traditional)",
    # "Japanese",
    # "Korean",
    # "Arabic",
    # "Hindi",
    # "Turkish",
    # "Swedish",
    # "Danish",
    # "Norwegian",
    # "Finnish",
    # "Polish",
    # "Czech",
    # "Hungarian",
    # "Greek",
    # "Hebrew",
    # "Vietnamese",
    # "Thai"
]

First let's see what the prompt looks like with a single example.

In [10]:


# Create the prompt template
human_template = HumanMessagePromptTemplate.from_template(
    "Translate this input <INPUT_START> {input} <INPUT_END>  into {language}. Do not include any other text in your response."
)
chat_prompt = ChatPromptTemplate.from_messages([human_template])

# Format with dynamic input
chat_prompt_value = chat_prompt.format_prompt(
    input="I hope when you come the weather will be clement.", # Extra points if you get the reference.
    language="Esperanto"
)

chat_prompt_value


ChatPromptValue(messages=[HumanMessage(content='Translate this input <INPUT_START> I hope when you come the weather will be clement. <INPUT_END>  into Esperanto. Do not include any other text in your response.', additional_kwargs={}, response_metadata={})])

Note that to use `HumanMessagePromptTemplate` as a typical prompt template with the `.format_prompt` method, we needed to pass it through a `ChatPromptTemplate` object. This is the case for all of the new chat-based prompt templates.

This returns a `ChatPromptValue` object, which can be formatted into a list or string like so:

In [11]:
chat_prompt_value.to_messages()

[HumanMessage(content='Translate this input <INPUT_START> I hope when you come the weather will be clement. <INPUT_END>  into Esperanto. Do not include any other text in your response.', additional_kwargs={}, response_metadata={})]

In [12]:
chat_prompt_value.to_string()

'Human: Translate this input <INPUT_START> I hope when you come the weather will be clement. <INPUT_END>  into Esperanto. Do not include any other text in your response.'

Okay, let's see this new approach in action with our list of languages.

In [13]:
# Create the prompt template
human_template = HumanMessagePromptTemplate.from_template(
    "Translate this input '{input}' into {language}. Do not include any other text in your response."
)
system_template = SystemMessagePromptTemplate.from_template("You are a helpful assistant.")

# Create the chain using LCEL pipe syntax
chain = (
    ChatPromptTemplate.from_messages([system_template, human_template]) 
    | chat 
    | StrOutputParser()
)

# Loop through each language
for language in languages:
    print(f"\n=== Response in {language} ===")
    
    # Invoke the chain with our inputs
    result = chain.invoke({
        "input": "I hope when you come the weather will be clement.",
        "language": language
    })
    
    print(result)
    print("=" * 50)  # Separator for readability


=== Response in English ===
I hope when you come the weather will be mild.

=== Response in Esperanto ===
Mi esperas, ke kiam vi venos, la vetero estos milda.

=== Response in Spanish ===
Espero que cuando vengas el clima sea benigno.


Excellent! 

As you can see, it's successfully translated into different languages based on our inputs, *and we didn't have to use unnecessary tokens by inserting the entire language list into the prompt.*
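The saving comes from each request carrying only the one language it needs. As a rough illustration, using plain `str.format` in place of LangChain's template classes and character counts as a crude proxy for tokens (both are assumptions made for this sketch), we can compare per-language prompts against a prompt that embeds the whole list:

```python
languages = ["English", "Esperanto", "Spanish"]
template = (
    "Translate this input '{input}' into {language}. "
    "Do not include any other text in your response."
)
text = "I hope when you come the weather will be clement."

# One prompt per language, each naming only its own target language.
per_language_prompts = [
    template.format(input=text, language=lang) for lang in languages
]

# A naive alternative that lists every supported language in the prompt.
all_in_one = (
    f"Supported languages: {', '.join(languages)}. "
    + template.format(input=text, language=languages[0])
)

for prompt_text in per_language_prompts:
    print(len(prompt_text), prompt_text)
print(len(all_in_one), all_in_one)
```

The gap grows with the length of the language list, since the per-language prompts stay roughly the same size.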

What if the outputs we need are more complicated? For example, what if the input is technical information that must be formatted in a very specific way in the output?

E.g. Say that we want to:
1. Input technical information. 
2. Only translate part of the technical information, not all of the text. 
3. Maintain the same input structure in the output structure.

We can use the prompt templates approach to build an initial system message with a few examples for the chatbot to follow, that is, few-shot prompting via examples. Let's see what that looks like.

In [14]:
# Create few-shot examples for technical content formatting
system_template = SystemMessagePromptTemplate.from_template(
    """You are a technical translator. You must maintain the exact same format and structure in your translations.
    Only translate the explanatory text, keeping all technical terms, numbers, and formatting unchanged.
    
    Example input and output pairs:
    
    Input: "Error 404: Page not found"
    Output: "Error 404: Página no encontrada"
    
    Input: "Status: 200 OK
    Response: {{
        'data': 'success',
        'message': 'Operation completed'
    }}"
    Output: "Status: 200 OK
    Response: {{
        'data': 'success',
        'message': 'Operación completada'
    }}"
    """
)

# Example of a technical input
human_template = HumanMessagePromptTemplate.from_template(
    """Translate this technical information to {language}:
    
    Status: 500 Internal Server Error
    Response: {{
        'error': 'Database connection failed',
        'code': 'DB_001',
        'timestamp': '2024-03-20T10:30:00Z'
    }}
    
    Technical Note: This error occurs when the application cannot connect to the database.
    """
)

# Create the chain using LCEL pipe syntax
chain = (
    ChatPromptTemplate.from_messages([system_template, human_template])
    | chat
    | StrOutputParser()
)

# Loop through each language
for language in languages:
    print(f"\n=== Technical Translation in {language} ===")
    
    # Invoke the chain with our input
    result = chain.invoke({"language": language})
    
    print(result)
    print("=" * 80)  # Separator for readability


=== Technical Translation in English ===
Status: 500 Internal Server Error
Response: {
    'error': 'Database connection failed',
    'code': 'DB_001',
    'timestamp': '2024-03-20T10:30:00Z'
}

Technical Note: This error occurs when the application cannot connect to the database.

=== Technical Translation in Esperanto ===
Status: 500 Internal Server Error
Response: {
    'error': 'Database connection failed',
    'code': 'DB_001',
    'timestamp': '2024-03-20T10:30:00Z'
}

Technical Note: Ĉi tiu eraro okazas kiam la aplikaĵo ne povas konekti al la datumbazo.

=== Technical Translation in Spanish ===
Status: 500 Internal Server Error
Response: {
    'error': 'Database connection failed',
    'code': 'DB_001',
    'timestamp': '2024-03-20T10:30:00Z'
}

Nota técnica: Este error ocurre cuando la aplicación no puede conectarse a la base de datos.


Perfect, we seem to get a good response!
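Note the doubled braces (`{{` and `}}`) around the JSON in the templates above. With f-string style formatting, single braces mark input variables, so literal braces must be escaped by doubling them. A small sketch with Python's standard `string.Formatter`, used here as a stand-in for the template parsing (an assumption for illustration, with a shortened payload), shows which names would be picked up as variables:

```python
from string import Formatter

template = """Translate this technical information to {language}:

Status: 500 Internal Server Error
Response: {{
    'error': 'Database connection failed',
    'code': 'DB_001'
}}
"""

# Collect the field names found in the template; escaped braces yield none.
variables = sorted(
    {name for _, name, _, _ in Formatter().parse(template) if name is not None}
)

# Formatting fills in {language} and collapses each doubled brace to one.
rendered = template.format(language="Spanish")

print(variables)
print(rendered)
```

Only `language` is detected as a variable, and the rendered text contains single braces, which matches the single-brace JSON in the model output above.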

Now, it's arguable whether all of the above is better than building the messages directly with simple f-strings, like so:

In [15]:

# Create the system message with examples
system_message = SystemMessage(content="""You are a technical translator. You must maintain the exact same format and structure in your translations.
Only translate the explanatory text, keeping all technical terms, numbers, and formatting unchanged.

Example input and output pairs:

Input: "Error 404: Page not found"
Output: "Error 404: Página no encontrada"

Input: "Status: 200 OK
Response: {
    'data': 'success',
    'message': 'Operation completed'
}"
Output: "Status: 200 OK
Response: {
    'data': 'success',
    'message': 'Operación completada'
}"
""")

# Loop through each language
for language in languages:
    print(f"\n=== Technical Translation in {language} ===")
    
    # Create the human message using f-string
    human_message = HumanMessage(content=f"""Translate this technical information to {language}:
    
    Status: 500 Internal Server Error
    Response: {{
        'error': 'Database connection failed',
        'code': 'DB_001',
        'timestamp': '2024-03-20T10:30:00Z'
    }}
    
    Technical Note: This error occurs when the application cannot connect to the database.
    """)
    
    # Create messages list
    messages = [system_message, human_message]
    
    # Get response
    res = chat.invoke(messages)
    
    print(res.content)
    print("=" * 80)  # Separator for readability


=== Technical Translation in English ===
Status: 500 Internal Server Error
Response: {
    'error': 'Database connection failed',
    'code': 'DB_001',
    'timestamp': '2024-03-20T10:30:00Z'
}

Technical Note: This error occurs when the application cannot connect to the database.

=== Technical Translation in Esperanto ===
    Status: 500 Internal Server Error
    Response: {
        'error': 'Database connection failed',
        'code': 'DB_001',
        'timestamp': '2024-03-20T10:30:00Z'
    }

    Technical Note: Ĉi tiu eraro okazas kiam la aplikaĵo ne povas konekti al la datumbazo.

=== Technical Translation in Spanish ===
    Status: 500 Internal Server Error
    Response: {
        'error': 'Database connection failed',
        'code': 'DB_001',
        'timestamp': '2024-03-20T10:30:00Z'
    }

    Nota técnica: Este error ocurre cuando la aplicación no puede conectarse a la base de datos.
    


In this example, the above is far simpler. So we wouldn't necessarily recommend using prompt templates over f-strings in all scenarios. 
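One side effect is visible in the output above: the Esperanto and Spanish responses are indented, most likely because the triple-quoted f-string inside the loop carries its source indentation into the prompt. A minimal sketch of one way to avoid this with the standard library's `textwrap.dedent` (the `build_human_message` helper is hypothetical and the payload is shortened):

```python
import textwrap


def build_human_message(language: str) -> str:
    """Build the prompt text with the source-code indentation stripped."""
    # The backslash after the opening quotes keeps the first line indented
    # like the rest, so dedent can remove the common margin from every line.
    body = f"""\
        Translate this technical information to {language}:

        Status: 500 Internal Server Error
        Response: {{
            'error': 'Database connection failed',
            'code': 'DB_001'
        }}

        Technical Note: This error occurs when the application cannot connect to the database.
        """
    return textwrap.dedent(body)


message = build_human_message("Spanish")
print(message)
```

The nested JSON keys keep their relative four-space indent, while the shared left margin is removed.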

One example where prompt templates might prove useful is in interpreting specific template formats. For example, suppose a project uses lots of `jinja` templates. Rather than writing our own functions to handle the input values and render the Jinja template, we can let LangChain's prompt templates do all of this for us:

In [16]:
# Create few-shot examples for technical content formatting
system_template = SystemMessagePromptTemplate.from_template(
    """You are a technical translator. You must maintain the exact same format and structure in your translations.
    Only translate the explanatory text, keeping all technical terms, numbers, and formatting unchanged.
    
    Example input and output pairs:
    
    Input: "Error 404: Page not found"
    Output: "Error 404: Página no encontrada"
    
    Input: "Status: 200 OK
    Response: {% raw %}{{
        'data': 'success',
        'message': 'Operation completed'
    }}{% endraw %}"
    Output: "Status: 200 OK
    Response: {% raw %}{{
        'data': 'success',
        'message': 'Operación completada'
    }}{% endraw %}"
    """,
    template_format="jinja2"
)

# Example of a technical input using Jinja2's control structures and filters
human_template = HumanMessagePromptTemplate.from_template(
    """Translate this technical information to {{ language|upper }}:
    
    Status: 500 Internal Server Error
    Response: {% raw %}{{
        'error': 'Database connection failed',
        'code': 'DB_001',
        'timestamp': '2024-03-20T10:30:00Z'
    }}{% endraw %}
    
    Technical Note: This error occurs when the application cannot connect to the database.
    
    {% if language == 'spanish' %}
    Note: Please use formal Spanish for technical documentation.
    {% elif language == 'french' %}
    Note: Please use formal French for technical documentation.
    {% else %}
    Note: Please maintain a formal tone in the translation.
    {% endif %}
    
    {% for term in technical_terms %}
    Keep the term "{{ term }}" unchanged in the translation.
    {% endfor %}
    """,
    template_format="jinja2"
)

# Create the chain using LCEL pipe syntax
chain = (
    ChatPromptTemplate.from_messages([system_template, human_template])
    | chat
    | StrOutputParser()
)

# Loop through each language
for language in languages:
    print(f"\n=== Technical Translation in {language} ===")
    
    # Invoke the chain with our inputs
    result = chain.invoke({
        "language": language,
        "technical_terms": ['DB_001', 'Internal Server Error', 'Database connection']
    })
    
    print(result)
    print("=" * 80)  # Separator for readability


=== Technical Translation in English ===
Status: 500 Internal Server Error
Response: {{
    'error': 'Database connection failed',
    'code': 'DB_001',
    'timestamp': '2024-03-20T10:30:00Z'
}}

Technical Note: This error occurs when the application cannot connect to the database.

=== Technical Translation in Esperanto ===
Status: 500 Internal Server Error
Response: {{
    'error': 'Database connection failed',
    'code': 'DB_001',
    'timestamp': '2024-03-20T10:30:00Z'
}}

Technical Note: Ĉi tiu eraro okazas kiam la aplikaĵo ne povas konekti al la datumbazo.

=== Technical Translation in Spanish ===
Status: 500 Internal Server Error
Response: {{
    'error': 'Fallo en la conexión a la Database',
    'code': 'DB_001',
    'timestamp': '2024-03-20T10:30:00Z'
}}

Nota Técnica: Este error ocurre cuando la aplicación no puede conectarse a la Database.


Let's see what the prompts look like after LangChain interprets the Jinja2 templates. This demonstrates how LangChain automatically handles the template interpretation for us:

In [17]:
# Get the formatted prompt for Spanish
print("\n=== Formatted Prompt for Spanish ===")

# Format the prompts with our inputs
formatted_prompt = ChatPromptTemplate.from_messages([
    system_template,
    human_template
]).format_prompt(
    language='spanish',
    technical_terms=['DB_001', 'Internal Server Error', 'Database connection']
)

# Print the formatted messages
for message in formatted_prompt.to_messages():
    print(f"\n{message.type.upper()} MESSAGE:")
    print("-" * 40)
    print(message.content)
    print("=" * 80)


=== Formatted Prompt for Spanish ===

SYSTEM MESSAGE:
----------------------------------------
You are a technical translator. You must maintain the exact same format and structure in your translations.
    Only translate the explanatory text, keeping all technical terms, numbers, and formatting unchanged.

    Example input and output pairs:

    Input: "Error 404: Page not found"
    Output: "Error 404: Página no encontrada"

    Input: "Status: 200 OK
    Response: {{
        'data': 'success',
        'message': 'Operation completed'
    }}"
    Output: "Status: 200 OK
    Response: {{
        'data': 'success',
        'message': 'Operación completada'
    }}"
    

HUMAN MESSAGE:
----------------------------------------
Translate this technical information to SPANISH:

    Status: 500 Internal Server Error
    Response: {{
        'error': 'Database connection failed',
        'code': 'DB_001',
        'timestamp': '2024-03-20T10:30:00Z'
    }}

    Technical Note: This error oc

Let's break down how LangChain automatically interpreted the Jinja2 templates in our prompts:

1. **Language Filter and Variable**:
   - Original: `{{ language|upper }}`
   - Interpreted as: `SPANISH`
   - The `|upper` filter automatically converted the language to uppercase

2. **Conditional Logic**:
   - Original:
     ```jinja2
     {% if language == 'spanish' %}
     Note: Please use formal Spanish for technical documentation.
     {% elif language == 'french' %}
     Note: Please use formal French for technical documentation.
     {% else %}
     Note: Please maintain a formal tone in the translation.
     {% endif %}
     ```
   - Interpreted as: `Note: Please use formal Spanish for technical documentation.`
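   - The `if` statement selected the Spanish-specific note because we passed `language='spanish'` in lowercase. The comparison is case-sensitive, so the capitalized `"Spanish"` from the `languages` list falls through to the `else` branch instead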

3. **Loop Structure**:
   - Original:
     ```jinja2
     {% for term in technical_terms %}
     Keep the term "{{ term }}" unchanged in the translation.
     {% endfor %}
     ```
   - Interpreted as three separate lines, one for each technical term:
     ```
     Keep the term "DB_001" unchanged in the translation.
     Keep the term "Internal Server Error" unchanged in the translation.
     Keep the term "Database connection" unchanged in the translation.
     ```
   - The `for` loop automatically iterated through our list of technical terms

4. **Raw JSON Blocks**:
   - Original: `{% raw %}{{ ... }}{% endraw %}`
   - Interpreted as: `{{ ... }}`
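   - The `raw` tags were removed and their contents kept verbatim, so the doubled braces survive as `{{` and `}}` in the prompt, which is likely why the model echoes doubled braces in its output. Jinja2 treats single braces as literal text, so single braces would also work here without `raw` tags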
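
Two details of the interpretation above are easy to check directly with the `jinja2` package, which LangChain relies on for `template_format="jinja2"`. The sketch below uses `jinja2.Template` on shortened, illustrative templates rather than LangChain's own rendering path. It checks how the `==` comparison treats letter case and how single and doubled braces are rendered:

```python
from jinja2 import Template

# Exact comparison, as in the prompt above.
note = Template(
    "{% if language == 'spanish' %}formal Spanish{% else %}formal tone{% endif %}"
)
# Variant that lowercases the value before comparing.
note_ci = Template(
    "{% if language|lower == 'spanish' %}formal Spanish{% else %}formal tone{% endif %}"
)

exact = note.render(language="Spanish")
lowered = note_ci.render(language="Spanish")

# Single braces are plain text in Jinja2; doubled braces need a raw block.
single = Template("Response: { 'code': 'DB_001' }").render()
raw_double = Template(
    "Response: {% raw %}{{ 'code': 'DB_001' }}{% endraw %}"
).render()

print(exact)
print(lowered)
print(single)
print(raw_double)
```

Adding the `|lower` filter to the condition is one way to make the language check independent of capitalization.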