# LangChain Chat History Management
In the [LangChain Expression Language (LCEL)](./lcel.ipynb) notebook, we covered LCEL at a high level, demonstrating specifically how to chain a prompt-engineered chat prompt with an LLM, namely MLX. What I failed to demonstrate in that notebook was how to think about memory (aka chat conversation management), and to be honest, I underestimated how "involved" this would get! 😅 To be clear, what we will be covering in this notebook is less of a technical concern and more of a business logic concern.

While LangChain offers many mechanisms for handling chat conversations (aka memory) correctly, I found some of the higher level ones to not be satisfactory for our purposes. Specifically, since I want us to adhere to a fixed schema, the high level abstraction objects provided by LangChain simply don't operate in the ideal way in which we need them to. No worries! We can still work around this without having to abandon LangChain. We're just going to need to do some special stuff throughout this notebook!

## High Level Flow
Before we get into the code itself, let's talk about how we want to think about the flow. For simplicity's sake, we are going to be ultimately saving this chat history as a JSON file. This JSON file should look like the schema that we've defined in the file `data/schema.json`.

Let's say that the user is loading the MLX Gradio UI interface, either for the first time ever or as a returning user. Here is the flow of how we should be thinking about our data:

1. **Loading the chat history from file**: Just as it sounds, we will want to load the chat history from file so that the user can interact with their historical conversations if they would like. Now, it's possible that this is the user's first time interacting with the chatbot, so it may be that we need to create this file from scratch!
2. **Setting a new conversation ID**: Regardless of whether the user is new or returning, we are going to assume that the user will want to begin with a new conversation. This means that we will need to instantiate a new conversation ID so that we can keep appending new interactions to that same conversation thread.
3. **Managing conversation back-and-forth**: As the conversation proceeds, we will want to continually update our conversation schema with any new human and AI interactions. This will also include autosaving them to file for the user's convenience.
4. **Starting a new conversation / loading an existing conversation**: At any point, the user may want to pivot from their current conversation to either a new conversation or to continue another historical conversation loaded from our file as part of step 1. If this is the case, we will need to ensure that our backend system is referencing the correct conversation interaction.

To really drive home the point, we will actually jump back and forth between each of these use cases to ensure that everything works seamlessly!
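
Steps 1 and 2 above can be sketched with nothing but the standard library. This is a minimal sketch, not the notebook's final implementation: the helper names `load_or_create_user_history` and `new_conversation_id` are hypothetical, and the file layout assumes the `user_id` / `chat_history` structure used later in this notebook.

```python
import json
import os
import uuid


def load_or_create_user_history(path, user_id = 'default_username'):
    '''Loads the user history from file, or creates a fresh one if no file exists yet (step 1).'''
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)

    # First-time user: start from an empty history and write it to disk
    user_history = { 'user_id': user_id, 'chat_history': [] }
    os.makedirs(os.path.dirname(path) or '.', exist_ok = True)
    with open(path, 'w') as f:
        json.dump(user_history, f, indent = 2)
    return user_history


def new_conversation_id():
    '''Generates a fresh conversation ID (step 2).'''
    return 'conv_id_' + str(uuid.uuid4()).replace('-', '_')
```

Calling `load_or_create_user_history` once at startup and `new_conversation_id` right after gives every session, new or returning, the same starting state.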

## Notebook Setup
In this section, we'll do all our usual setup. We'll also set up the LangChain MLX model using the new ChatMLX implementation. All of these are things we've already explored in other notebooks.

In [1]:
# Importing the necessary Python libraries
import os
import json
import uuid
import pandas as pd
from langchain_core.prompts import ChatPromptTemplate, HumanMessagePromptTemplate, MessagesPlaceholder
from langchain_community.llms.mlx_pipeline import MLXPipeline
from langchain_community.chat_models.mlx import ChatMLX
from langchain_core.runnables import RunnableLambda
from langchain_community.chat_message_histories.in_memory import ChatMessageHistory
from langchain_core.messages.human import HumanMessage
from langchain_core.messages.ai import AIMessage
from langchain_core.messages.system import SystemMessage
from langchain_core.prompt_values import ChatPromptValue

In [2]:
# Setting a default system prompt
DEFAULT_SYSTEM_PROMPT = 'You are a helpful assistant.'

class MLXModelParameters():

    def __init__(self, temp = 0.7, max_tokens = 1000, system_prompt = DEFAULT_SYSTEM_PROMPT):
        self.temp = temp
        self.max_tokens = max_tokens
        self.system_prompt = system_prompt

    def __str__(self):
        return f'Temperature: {self.temp}\nMax Tokens: {self.max_tokens}\nSystem Prompt: {self.system_prompt}'
    
    def __repr__(self):
        return f'Temperature: {self.temp}\nMax Tokens: {self.max_tokens}\nSystem Prompt: {self.system_prompt}'

    def update_temp(self, new_temp):
        self.temp = new_temp

    def update_max_tokens(self, new_max_tokens):
        self.max_tokens = new_max_tokens

    def update_system_prompt(self, new_system_prompt):
        self.system_prompt = new_system_prompt

    def to_json(self):
        return { 'temp': self.temp, 'max_tokens': self.max_tokens }
    
mlx_model_parameters = MLXModelParameters()

In [3]:
# Creating a starter conversation history object
STARTER_CONVERSATION_HISTORY = ChatMessageHistory(messages = [ SystemMessage(content = 'You are a helpful assistant.') ])

In [4]:
# Setting a list of models that we'll need to check against
NO_SYSTEM_MODEL_PROVIDERS = ['mistralai', 'meta-llama']

In [5]:
# Setting constant values to represent model name and directory
MODEL_NAME = 'mistralai/Mistral-7B-Instruct-v0.2'
BASE_DIRECTORY = '../models'
MLX_DIRECTORY = f'{BASE_DIRECTORY}/mlx'
mlx_model_directory = f'{MLX_DIRECTORY}/{MODEL_NAME}'

# Setting a constant value to represent where to place the chat history data
CHAT_HISTORY_DIRECTORY = '../data/'

In [6]:
# Setting up the LangChain MLX LLM
llm = MLXPipeline.from_model_id(
    model_id = mlx_model_directory,
    pipeline_kwargs = {
        'temp': mlx_model_parameters.temp,
        'max_tokens': mlx_model_parameters.max_tokens,
    }
)

# Setting up the LangChain MLX Chat Model with the LLM above
chat_model = ChatMLX(llm = llm)


## Setting up the LangChain inference pipeline
In order to use LangChain's preferred implementation of memory management, we're first going to need to establish our LangChain pipeline. We've done similar things to this in other notebooks, but with this particular implementation, we are going to make a specific adjustment. Namely, since we are now going to make use of the LangChain Community implementation of MLX, we are going to need to manually add our own metadata. To seamlessly do this, we are going to make use of LCEL's **RunnableLambda**, which essentially allows us to define our own custom function.

Also note that when we set up our chat prompt, we are going to need to slide in an extra entry referred to as **MessagesPlaceholder**. As the name implies, that will serve as a placeholder so that we can keep passing the history back through the model.

In [7]:
# Setting up the Chat prompt template
human_message_prompt = HumanMessagePromptTemplate.from_template(template = "{input}")
chat_prompt = ChatPromptTemplate.from_messages(messages = [
    MessagesPlaceholder(variable_name = 'history'),
    human_message_prompt
])

In [8]:
def correct_for_no_system_models(chat_history):
    '''
    Precorrects for the issue where certain models (e.g. Llama, Mistral) are unable to accept system messages

    Inputs:
        - chat_history (LangChain ChatPromptValue): The current chat history with no alterations

    Returns:
        - chat_history (LangChain ChatPromptValue): The new chat history with alterations (if needed)
    '''
    # Referencing global variables
    global MODEL_NAME
    global NO_SYSTEM_MODEL_PROVIDERS


    # Checking if the correction needs to be made if the model is Llama or Mistral
    if MODEL_NAME.split('/')[0] in NO_SYSTEM_MODEL_PROVIDERS:

        # Getting the system message content
        system_message_content = chat_history.messages[0].content

        # Replacing the System Message with a Human Message
        chat_history.messages[0] = HumanMessage(content = system_message_content)

        # Adds a dummy AI Message
        chat_history.messages.insert(1, AIMessage(content = ''))

    return chat_history
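
To make the rewrite easier to see outside of LangChain, here is a minimal sketch of the same idea applied to plain role/content dictionaries. The function name `correct_for_no_system_dicts` is hypothetical, the `user` / `assistant` role names follow the JSON structure used later in this notebook, and the sketch adds two things the function above does not have: it returns a new list instead of modifying its input, and it skips the rewrite when the first message is not a system message.

```python
def correct_for_no_system_dicts(messages, model_name, no_system_providers = ('mistralai', 'meta-llama')):
    '''Returns a new list of role/content dicts with a leading system message rewritten as a user turn.'''
    # Leave the messages untouched if the provider supports system messages
    if model_name.split('/')[0] not in no_system_providers:
        return list(messages)

    # Nothing to rewrite if the conversation does not start with a system message
    if not messages or messages[0]['role'] != 'system':
        return list(messages)

    # Swap the system message for a user message followed by an empty assistant turn
    return [
        { 'role': 'user', 'content': messages[0]['content'] },
        { 'role': 'assistant', 'content': '' },
        *messages[1:]
    ]
```

The empty assistant turn keeps the strict user/assistant alternation that these chat templates expect once the real user prompt is appended.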

In [9]:
def update_ai_response_metadata(ai_message):
    '''
    Updates the metadata on the AI response

    Inputs:
        - ai_message (LangChain AIMessage): The AI message produced by the model

    Returns:
        - ai_message (LangChain AIMessage): The AI message produced by the model, except now with the appropriate metadata intact
    '''

    # Referencing global variables
    global mlx_model_parameters
    global MODEL_NAME

    # Creating a dictionary of the metadata that we will be adding to the AI message
    metadata = {
        'model_name': MODEL_NAME,
        'timestamp': str(pd.Timestamp.utcnow()),
        'like_data': None,
        'hyperparameters': mlx_model_parameters.to_json()
    }

    # Applying the metadata to the AI response
    ai_message.response_metadata = metadata

    return ai_message

In [10]:
def add_to_chat_history(ai_message):
    '''
    Adds messages to chat history

    Inputs:
        - ai_message (LangChain AIMessage): The AI message produced by the AI model

    Returns:
        - ai_message (LangChain AIMessage): Passing the AI message through to the end of the inference pipeline
    '''

    # Referencing global variables
    global chat_history
    global prompt_text

    # Adding the human prompt and the AI message to the chat history
    chat_history.add_messages(
        messages = [
            HumanMessage(content = prompt_text),
            ai_message
        ]
    )

    return ai_message

In [11]:
# Creating the inference chain by chaining together the chat prompt, the system message correction, the chat model, and the custom functions that update the metadata and chat history
inference_chain = chat_prompt | RunnableLambda(correct_for_no_system_models) | chat_model | RunnableLambda(update_ai_response_metadata) | RunnableLambda(add_to_chat_history)

In [12]:
# Instantiating a simple chat history
chat_history = ChatMessageHistory(messages = [
    SystemMessage(content = 'You are a helpful assistant.')
])

# Generating the response with the first prompt
prompt_text = 'What is the capital of Illinois?'
response = inference_chain.invoke({
    'history': chat_history.messages,
    'input': prompt_text
})
print(f'First prompt: "{prompt_text}"')
print(f'AI Response: {response.content}')
print(f'AI Response Metadata: {response.response_metadata}')
print('Current chat history:')
print(chat_history.messages)

print('\n-------------\n')

# Generating the response with the second prompt
prompt_text = 'What is the largest city in that state?'
response = inference_chain.invoke({
    'history': chat_history.messages,
    'input': prompt_text
})
print(f'Second prompt: "{prompt_text}"')
print(f'AI Response: {response.content}')
print(f'AI Response Metadata: {response.response_metadata}')
print('Current chat history:')
print(chat_history.messages)

First prompt: "What is the capital of Illinois?"
AI Response: The capital city of Illinois is Springfield. It's located in the central part of the state and is the county seat of Sangamon County. Springfield is known for its rich history, including being the site of Abraham Lincoln's presidential library and his former home, which is now a National Historic Site.
AI Response Metadata: {'model_name': 'mistralai/Mistral-7B-Instruct-v0.2', 'timestamp': '2024-04-21 04:49:47.984104+00:00', 'like_data': None, 'hyperparameters': {'temp': 0.7, 'max_tokens': 1000}}
Current chat history:
[SystemMessage(content='You are a helpful assistant.'), HumanMessage(content='What is the capital of Illinois?'), AIMessage(content="The capital city of Illinois is Springfield. It's located in the central part of the state and is the county seat of Sangamon County. Springfield is known for its rich history, including being the site of Abraham Lincoln's presidential library and his former home, which is now a Na

In [13]:
chat_history.messages

[SystemMessage(content='You are a helpful assistant.'),
 HumanMessage(content='What is the capital of Illinois?'),
 AIMessage(content="The capital city of Illinois is Springfield. It's located in the central part of the state and is the county seat of Sangamon County. Springfield is known for its rich history, including being the site of Abraham Lincoln's presidential library and his former home, which is now a National Historic Site.", response_metadata={'model_name': 'mistralai/Mistral-7B-Instruct-v0.2', 'timestamp': '2024-04-21 04:49:47.984104+00:00', 'like_data': None, 'hyperparameters': {'temp': 0.7, 'max_tokens': 1000}}, id='run-2b97edc4-2510-492a-af12-d63ad4a95e62-0'),
 HumanMessage(content='What is the largest city in that state?'),
 AIMessage(content="The largest city in Illinois is Chicago. Chicago is located in the northeastern part of the state and is the third most populous city in the United States. It's known for its iconic skyline, major industries, cultural institution

## (Optional) Generating a Summary Title
In certain interfaces including OpenAI's ChatGPT, the chat history will represent your conversation with what I like to call a "summary title." For example, let's say I ask it for a recipe for chocolate chip cookies, then the ChatGPT interface will represent my conversation in the chat history window as something like "A Recipe for Chocolate Chip Cookies". While this is not necessary, I thought it might be a fun touch to add!

Let's demonstrate by starting a conversation asking it for a chocolate chip cookie recipe.

In [14]:
# Instantiating a simple chat history
chat_history = ChatMessageHistory(messages = [
    SystemMessage(content = 'You are a helpful assistant.')
])

# Generating the response with the prompt
prompt_text = 'Please give me a recipe for delicions chocolate chip cookies.'
response = inference_chain.invoke({
    'history': chat_history.messages,
    'input': prompt_text
})
print(f'Prompt: "{prompt_text}"')
print(f'AI Response: {response.content}')
print(f'AI Response Metadata: {response.response_metadata}')
print('Current chat history:')
print(chat_history.messages)

Prompt: "Please give me a recipe for delicions chocolate chip cookies."
AI Response: I'd be happy to help you make delicious chocolate chip cookies! Here's a simple and popular recipe that you can try at home.

Ingredients:
- 1 cup (2 sticks) unsalted butter, softened
- 1 cup white sugar
- 1 cup packed brown sugar
- 2 eggs
- 2 teaspoons vanilla extract
- 3 1/2 cups all-purpose flour
- 1
AI Response Metadata: {'model_name': 'mistralai/Mistral-7B-Instruct-v0.2', 'timestamp': '2024-04-21 04:49:55.055835+00:00', 'like_data': None, 'hyperparameters': {'temp': 0.7, 'max_tokens': 1000}}
Current chat history:
[SystemMessage(content='You are a helpful assistant.'), HumanMessage(content='Please give me a recipe for delicions chocolate chip cookies.'), AIMessage(content="I'd be happy to help you make delicious chocolate chip cookies! Here's a simple and popular recipe that you can try at home.\n\nIngredients:\n- 1 cup (2 sticks) unsalted butter, softened\n- 1 cup white sugar\n- 1 cup packed brown

Very much like how we created the inference chain, we can use LangChain again here to create a new chain that will produce the summary as we please! In the code below, you will see the prompt engineering that I use to generate this summary heading.

(Note: You'll notice I'm using the `HumanMessagePromptTemplate` object for placing my prompt engineering. Ideally, we would place this prompt engineering in something more representative of the system message, but as you already know, some of these models don't play well with the `SystemMessage` object! In this particular case, we are not saving any metadata about how we generated this summary, so we can sneak by using the human message prompt template.)

In [15]:
# Creating the summary title prompt engineering
summary_title_prompt = '''The text delineated by the triple backticks below contains the beginning of a conversation between a human and a large language model (LLM). Please provide a brief summary to serve as a title for this conversation. Do not use any system messages. Place more emphasis on the human's prompt over the AI's response. Please ensure the summary title does not exceed more than ten words. Please format the summary title as one would any formal title, like that of a book. Do not give any extra words except the summary title. (Example: Do not show "Title: ")

```
{history}
```
'''

# Creating the LangChain chat prompt
summary_title_prompt_template = HumanMessagePromptTemplate.from_template(template = summary_title_prompt)

summary_title_chat_prompt = ChatPromptTemplate.from_messages(messages = [
    summary_title_prompt_template
])

# Creating the summary title chain
summary_title_chain = summary_title_chat_prompt | chat_model


In [16]:
# Generating the summary title based on the sample chat history
summary_title_response = summary_title_chain.invoke({
    'history': chat_history.messages
})

print(f'Summary Title: {summary_title_response.content}')

Summary Title: Title: Delicious Chocolate Chip Cookies Recipe

(Note: The title focuses on the human's request for a chocolate chip cookies recipe and does not include any details about the AI's response.)


In [17]:
# Stripping out "Title: " (Because no idea why it wants to keep doing that...)
summary_title = summary_title_response.content.replace("Title: ", "").replace("title: ", "")
print(summary_title)

Delicious Chocolate Chip Cookies Recipe

(Note: The title focuses on the human's request for a chocolate chip cookies recipe and does not include any details about the AI's response.)
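
Notice that the cleaned title above still carries the model's trailing "(Note: ...)" paragraph, which would end up in the saved history. Here is a minimal sketch of a stricter cleanup, assuming the actual title is always on the first non-empty line of the response. The function name `clean_summary_title` is hypothetical.

```python
def clean_summary_title(raw_title):
    '''Keeps only the first non-empty line of the model output and strips a leading "Title:" label.'''
    # Take the first non-empty line so trailing notes or explanations are dropped
    lines = [line.strip() for line in raw_title.splitlines() if line.strip()]
    if not lines:
        return ''
    title = lines[0]

    # Remove a leading "Title:" label regardless of capitalization
    if title.lower().startswith('title:'):
        title = title[len('title:'):].strip()

    # Remove surrounding quotation marks, if any
    return title.strip('"\'')
```

With this in place, `clean_summary_title(summary_title_response.content)` could replace the chained `replace` calls.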


## Starting the Conversation History Schema
As mentioned before, we are going to be emulating the structure of the schema as defined in `data/schema.json`. In this notebook, we are going to pretend as if the user is a brand new user, so we will need to set up the conversation history schema from scratch.

In [18]:
# Generating a current conversation ID
current_conversation_id = 'conv_id_' + str.replace(str(uuid.uuid4()), '-', '_')
current_conversation_id

'conv_id_91496830_918e_4934_a7ba_1420346912ac'

In [19]:
# Creating the base conversation history schema per a single user
BASE_CONVERSATION_HISTORY_SCHEMA = {
    'user_id': 'default_username',
    'chat_history': []
}

BASE_CONVERSATION_HISTORY_SCHEMA

{'user_id': 'default_username', 'chat_history': []}

In [20]:
# Creating the user history as a deep copy so that later appends do not also modify the base schema
import copy
user_history = copy.deepcopy(BASE_CONVERSATION_HISTORY_SCHEMA)

## Converting the LangChain Messages to and from JSON
One great thing about LangChain is the many integrations it offers for connecting to your favorite online services. To keep things simple, I want to save my chat history as a JSON file. Now as you saw in the previous section, I'm a bit picky with my choice of schema. This means that we are going to need to figure out a way to convert the LangChain messages to and from JSON.

In [21]:
def lc_to_json_list(lc_messages):
    '''
    Converts LangChain messages to a JSON-like structure in a list

    Inputs:
        - lc_messages (List[BaseMessage]): A list of LangChain message objects

    Returns:
        - conversation_json_list (list): The LangChain messages now represented in our JSON structure
    '''

    # Instantiating a list to hold the JSON output
    conversation_json_list = []

    # Iterating over each of the LangChain messages
    for message in lc_messages:

        # Determining the action based on if message is System type
        if message.type == 'system':

            conversation_json = {
                'role': 'system',
                'content': message.content
            }
            conversation_json_list.append(conversation_json)

        # Determining the action based on if message is Human type
        elif message.type == 'human':

            conversation_json = {
                'role': 'user',
                'content': message.content,
            }
            conversation_json_list.append(conversation_json)
        
        # Determining the action based on if message is AI type
        elif message.type == 'ai':

            conversation_json = {
                'role': 'assistant',
                'content': message.content,
                'metadata': message.response_metadata
            }
            conversation_json_list.append(conversation_json)
        
    return conversation_json_list

In [22]:
# Converting the LangChain history to a JSON-like list of messages
conversation_json_list = lc_to_json_list(chat_history.messages)
conversation_json_list

[{'role': 'system', 'content': 'You are a helpful assistant.'},
 {'role': 'user',
  'content': 'Please give me a recipe for delicions chocolate chip cookies.'},
 {'role': 'assistant',
  'content': "I'd be happy to help you make delicious chocolate chip cookies! Here's a simple and popular recipe that you can try at home.\n\nIngredients:\n- 1 cup (2 sticks) unsalted butter, softened\n- 1 cup white sugar\n- 1 cup packed brown sugar\n- 2 eggs\n- 2 teaspoons vanilla extract\n- 3 1/2 cups all-purpose flour\n- 1",
  'metadata': {'model_name': 'mistralai/Mistral-7B-Instruct-v0.2',
   'timestamp': '2024-04-21 04:49:55.055835+00:00',
   'like_data': None,
   'hyperparameters': {'temp': 0.7, 'max_tokens': 1000}}}]

In [23]:
# Adding the conversation to the user's overall history
user_history['chat_history'].append({
    'summary_title': summary_title,
    'conversation_id': current_conversation_id,
    'conversation': conversation_json_list
})
user_history

{'user_id': 'default_username',
 'chat_history': [{'summary_title': "Delicious Chocolate Chip Cookies Recipe\n\n(Note: The title focuses on the human's request for a chocolate chip cookies recipe and does not include any details about the AI's response.)",
   'conversation_id': 'conv_id_91496830_918e_4934_a7ba_1420346912ac',
   'conversation': [{'role': 'system',
     'content': 'You are a helpful assistant.'},
    {'role': 'user',
     'content': 'Please give me a recipe for delicions chocolate chip cookies.'},
    {'role': 'assistant',
     'content': "I'd be happy to help you make delicious chocolate chip cookies! Here's a simple and popular recipe that you can try at home.\n\nIngredients:\n- 1 cup (2 sticks) unsalted butter, softened\n- 1 cup white sugar\n- 1 cup packed brown sugar\n- 2 eggs\n- 2 teaspoons vanilla extract\n- 3 1/2 cups all-purpose flour\n- 1",
     'metadata': {'model_name': 'mistralai/Mistral-7B-Instruct-v0.2',
      'timestamp': '2024-04-21 04:49:55.055835+00:00'

In [24]:
# Saving the chat history to file
with open(os.path.join(CHAT_HISTORY_DIRECTORY, 'chat_history.json'), 'w') as f:
    json.dump(user_history, f, indent = 2)
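
Since the plan is to autosave after every turn, the file gets rewritten often, and an interrupted write could leave behind truncated JSON. Here is a minimal sketch of a safer save that writes to a temporary file and then swaps it into place. The function name `save_user_history` is hypothetical and not something defined elsewhere in this notebook.

```python
import json
import os
import tempfile


def save_user_history(user_history, path):
    '''Writes the user history to a temporary file first, then swaps it into place.'''
    directory = os.path.dirname(path) or '.'
    os.makedirs(directory, exist_ok = True)

    # Write to a temporary file in the same directory so the final rename stays on one filesystem
    fd, temp_path = tempfile.mkstemp(dir = directory, suffix = '.tmp')
    try:
        with os.fdopen(fd, 'w') as f:
            json.dump(user_history, f, indent = 2)
        # os.replace overwrites the destination in a single step
        os.replace(temp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise
```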

In [25]:
def json_list_to_lc(conversation_json_list):
    '''
    Converts the JSON-like list of conversation messages into a formalized LangChain chat history

    Inputs:
        - conversation_json_list (List[dict]): A JSON-like list of the conversation messages

    Returns:
        - chat_history (LangChain ChatMessageHistory): The input list but now in LangChain form
    '''

    # Instantiating the chat history
    chat_history = ChatMessageHistory()

    # Iterating over all the messages
    for message in conversation_json_list:

        # Taking action based on the message's role type
        if message['role'] == 'system':
            chat_history.add_message(SystemMessage(content = message['content']))
        elif message['role'] == 'user':
            chat_history.add_message(HumanMessage(content = message['content']))
        elif message['role'] == 'assistant':
            chat_history.add_message(AIMessage(content = message['content'], response_metadata = message['metadata']))
    
    return chat_history

In [26]:
# Loading user history from file
with open(os.path.join(CHAT_HISTORY_DIRECTORY, 'chat_history.json')) as f:
    user_history = json.load(f)

In [27]:
# Loading the chat history from the first conversation of the user history
chat_history = json_list_to_lc(user_history['chat_history'][0]['conversation'])

In [28]:
# Displaying the re-loaded messages
chat_history.messages

[SystemMessage(content='You are a helpful assistant.'),
 HumanMessage(content='Please give me a recipe for delicions chocolate chip cookies.'),
 AIMessage(content="I'd be happy to help you make delicious chocolate chip cookies! Here's a simple and popular recipe that you can try at home.\n\nIngredients:\n- 1 cup (2 sticks) unsalted butter, softened\n- 1 cup white sugar\n- 1 cup packed brown sugar\n- 2 eggs\n- 2 teaspoons vanilla extract\n- 3 1/2 cups all-purpose flour\n- 1", response_metadata={'model_name': 'mistralai/Mistral-7B-Instruct-v0.2', 'timestamp': '2024-04-21 04:49:55.055835+00:00', 'like_data': None, 'hyperparameters': {'temp': 0.7, 'max_tokens': 1000}})]

## A Real Life Use Case
Okay, we now have the essential basic framework to start using our chatbot and saving our chat history to a local JSON file. We will jump back and forth between different use cases to ensure that what we are building is resilient to everyday use.
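
Jumping between conversations means that saving should replace an existing entry rather than append a duplicate, and loading should look an entry up by its ID rather than by list position. Here is a minimal sketch of both operations against the `user_history` structure built earlier. The helper names `upsert_conversation` and `get_conversation` are hypothetical.

```python
def upsert_conversation(user_history, conversation_id, summary_title, conversation):
    '''Replaces the conversation with a matching ID, or appends it if the ID is new.'''
    entry = {
        'summary_title': summary_title,
        'conversation_id': conversation_id,
        'conversation': conversation
    }

    # Overwrite the existing entry in place if this conversation was saved before
    for index, existing in enumerate(user_history['chat_history']):
        if existing['conversation_id'] == conversation_id:
            user_history['chat_history'][index] = entry
            return user_history

    # Otherwise this is a brand new conversation
    user_history['chat_history'].append(entry)
    return user_history


def get_conversation(user_history, conversation_id):
    '''Returns the conversation entry with the given ID, or None if there is no match.'''
    for existing in user_history['chat_history']:
        if existing['conversation_id'] == conversation_id:
            return existing
    return None
```

The `conversation` list returned by `get_conversation` can be handed straight to `json_list_to_lc` to resume that thread.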

### Simulating a First Conversation
Let's begin

### Starting a New (Second) Conversation

### Jumping Back to First Conversation

### Loading It All Back In from File

### Starting a Third (Final) Conversation

### Returning to the First Conversation