# Keeping track of information
LLM calls are inherently stateless: they have no memory of any previous interactions. Every call is an independent event, and **YOU** must manage any information that needs to be carried over between calls.

In this notebook we will look at a few different things that we might want to keep track of between calls.

## Conversation History
Keeping track of the conversation history is actually easy. First, remember that conversations with chat LLMs usually follow this pattern:

```
-> system prompt
-> user prompt
-> model response

-> user prompt
-> model response

-> user prompt
-> model response

-> etc.
```

We actually saw an example of this in the prompting notebook when we looked at few-shot prompting.

Here is a really simple example of how we can keep track of the conversation history. We can first define a `system_state` dictionary that will store important information for us. We can give it a `conversation_history` key that will store the conversation history.

In [3]:
system_state = {
    "conversation_history": []
}

In [4]:
system_prompt = (
    "You are a helpful philosophical assistant. "
    "You will help me think about philosophical questions. "
    "Please keep your answers concise and to the point."
)

system_state["conversation_history"].append({
    "role": "system",
    "content": system_prompt
})

user_prompt = "What is the meaning of life?"

system_state["conversation_history"].append({
    "role": "user",
    "content": user_prompt
})

In [5]:
for message in system_state["conversation_history"]:
    print(f"{message['role']}: {message['content']}\n")

system: You are a helpful philosophical assistant. You will help me think about philosophical questions. Please keep your answers concise and to the point.

user: What is the meaning of life?



Now we can use this conversation history to generate a response.

In [12]:
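import os

import dotenv
from openai import OpenAI
from rich.pretty import pprint

# Load OPENAI_API_KEY from a local .env file *before* creating the client,
# because OpenAI() reads the key from the environment when it is constructed.
dotenv.load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

client = OpenAI(api_key=OPENAI_API_KEY)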

In [34]:
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=system_state["conversation_history"],
    max_tokens=512,
    temperature=1.0
)

print(response.choices[0].message.content)

The meaning of life is a deeply personal and subjective question. Various philosophical, spiritual, and existential perspectives offer different answers: 

1. **Existentialism** suggests that life has no inherent meaning, and individuals must create their own purpose.
2. **Religious perspectives** often provide a framework where life's meaning is linked to a divine purpose or adherence to spiritual teachings.
3. **Humanism** focuses on the meaning derived from human relationships, personal fulfillment, and contributing to the greater good.
4. **Buddhism** emphasizes the pursuit of enlightenment and understanding the nature of suffering as key to a meaningful life.

Ultimately, the meaning of life may depend on one's values, beliefs, and experiences.


Great, but what happens if I ask a follow-up question without sending the conversation history?

In [35]:
follow_up_prompt = "Can you tell me more about point 1?"

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{
        "role": "user",
        "content": follow_up_prompt
    }],
    max_tokens=512,
    temperature=1.0
)

print(response.choices[0].message.content)


Of course! However, I need more context to provide a detailed response. Could you please specify what "point 1" you are referring to? This could relate to a list, an article, or a particular topic. Let me know so I can assist you better!


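As expected, the model has no memory of the previous exchange, because we never sent it. The fix is to keep every turn in the conversation history, including the model's own responses. Here we call the model again with the history and append its answer with the `assistant` role (this is a fresh call, so the answer differs from the one above). A follow-up prompt can then be appended as the next `user` message.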

In [36]:
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=system_state["conversation_history"],
    max_tokens=512,
    temperature=1.0
)

system_state["conversation_history"].append({
    "role": "assistant",
    "content": response.choices[0].message.content
})

for message in system_state["conversation_history"]:
    print(f"{message['role'].upper()}: {message['content']}\n")

SYSTEM: You are a helpful philosophical assistant. You will help me think about philosophical questions. Please keep your answers concise and to the point.

USER: What is the meaning of life?

ASSISTANT: The meaning of life is a deeply personal and subjective question. Some philosophical perspectives suggest it's about seeking happiness, fulfilling potential, or contributing to others. Existentialists argue it’s up to each individual to create their own meaning. Others find purpose in spiritual beliefs or connections with nature. Ultimately, it varies for each person based on their values, experiences, and beliefs.
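
With the assistant's answer stored, a follow-up question is just one more `user` message on the same list. The sketch below shows the shape of the `messages` payload for such a follow-up without calling the API. `with_follow_up` is a hypothetical helper name, and the assistant text is a shortened placeholder rather than real model output.

```python
def with_follow_up(history, prompt):
    """Return a new message list: the full history plus one new user turn."""
    return history + [{"role": "user", "content": prompt}]


# Placeholder history in the same format as system_state["conversation_history"].
history = [
    {"role": "system", "content": "You are a helpful philosophical assistant."},
    {"role": "user", "content": "What is the meaning of life?"},
    {"role": "assistant", "content": "(numbered list of perspectives, shortened here)"},
]

messages = with_follow_up(history, "Can you tell me more about point 1?")

for message in messages:
    print(f"{message['role'].upper()}: {message['content']}")
```

Passing a list like `messages` to `client.chat.completions.create` gives the model the context it needs to work out what "point 1" refers to.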



And now we can keep the conversation going in a simple loop. If you ask a few questions (type `exit`, `quit`, or `bye` to stop), you will see that the conversation history is correctly maintained.

In [None]:
while True:
    user_input = input("You: ")
    
    if user_input.lower() in ['exit', 'quit', 'bye']:
        print("Assistant: Goodbye!")
        break
    
    system_state["conversation_history"].append({
        "role": "user",
        "content": user_input
    })
    
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=system_state["conversation_history"],
        max_tokens=512,
        temperature=1.0
    )
    
    assistant_response = response.choices[0].message.content
    print(f"Assistant: {assistant_response}\n")
    
    system_state["conversation_history"].append({
        "role": "assistant",
        "content": assistant_response
    })

In [45]:
from rich.console import Console
from rich.text import Text

console = Console()

In [48]:
colors = {
    "system": "green",
    "user": "cyan",
    "assistant": "magenta"
}

for message in system_state["conversation_history"]:
    role = message["role"]
    content = message["content"]
    color = colors[role]
    # Use Text so square brackets in the content are not parsed as rich markup.
    console.print(Text(f"{role.upper()}: {content}", style=color))

## Tracking tokens
We should probably also track token usage. This is useful for a few reasons: we can track costs, and we can trim the conversation history when we get too close to our limit.
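
For cost tracking, the chat completions response reports exact counts in `response.usage` (`prompt_tokens`, `completion_tokens`, `total_tokens`). A minimal sketch of accumulating them might look like this. `UsageTracker` is a hypothetical helper, and the token counts and prices in the example are placeholders, not real figures.

```python
from dataclasses import dataclass


@dataclass
class UsageTracker:
    """Running totals of prompt and completion tokens across calls."""

    prompt_tokens: int = 0
    completion_tokens: int = 0

    def record(self, prompt_tokens, completion_tokens):
        self.prompt_tokens += prompt_tokens
        self.completion_tokens += completion_tokens

    @property
    def total_tokens(self):
        return self.prompt_tokens + self.completion_tokens

    def cost(self, input_price_per_million, output_price_per_million):
        """Estimated cost, given caller-supplied prices per million tokens."""
        return (
            self.prompt_tokens * input_price_per_million
            + self.completion_tokens * output_price_per_million
        ) / 1_000_000


tracker = UsageTracker()
# In the notebook these numbers would come from response.usage.prompt_tokens
# and response.usage.completion_tokens; the values here are made up.
tracker.record(prompt_tokens=40, completion_tokens=120)
tracker.record(prompt_tokens=180, completion_tokens=95)

print(tracker.total_tokens)
# Placeholder prices, not real rates: check the provider's pricing page.
print(tracker.cost(input_price_per_million=1.0, output_price_per_million=2.0))
```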

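We can make this as simple or as complicated as we want. A `Conversation` class is a natural place to keep track of things like this. To keep things simple, the class below uses the character count of each message as a crude stand-in for its token count.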

In [114]:
class Conversation:
    """Chat history plus a rough running size count used to trim old messages."""

    def __init__(self, system_prompt):
        self.system_prompt = system_prompt
        self.history = []
        # NOTE: "tokens" is approximated by character count (len(content)),
        # so token_limit is effectively a character budget here.
        self.tokens = 0
        self.token_limit = 300

        self.add_message("system", system_prompt)

    def add_message(self, role, content):
        self.history.append({"role": role, "content": content})
        self.tokens += len(content)

    def check_token_limit(self):
        # Drop the oldest non-system messages until we are back under the limit.
        while self.tokens > self.token_limit and len(self.history) > 1:
            for i in range(1, len(self.history)):
                if self.history[i]["role"] != "system":
                    removed_message = self.history.pop(i)
                    self.tokens -= len(removed_message["content"])
                    break
            else:
                # Only system messages remain, so there is nothing left to trim.
                break

    def response(self, user_input):
        self.add_message("user", user_input)
        # Trim before the call so the request stays within the budget.
        self.check_token_limit()
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=self.history,
            max_tokens=512,
            temperature=1.0
        ).choices[0].message.content

        # The reply is added after trimming, so the count can exceed the limit
        # until the next call.
        self.add_message("assistant", response)

        return response
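
Note that `len(content)` in the class above counts characters, not tokens. A common rule of thumb for English text is roughly four characters per token, so a slightly closer estimate can be sketched as below. `estimate_tokens` and the `chars_per_token` default are assumptions for illustration. A tokenizer library, or the `usage` field of the API response, gives exact counts.

```python
import math


def estimate_tokens(text, chars_per_token=4):
    """Rough token estimate from character count (a heuristic, not a tokenizer)."""
    if not text:
        return 0
    return math.ceil(len(text) / chars_per_token)


def estimate_history_tokens(history):
    """Sum the estimates over a list of {"role", "content"} messages."""
    return sum(estimate_tokens(m["content"]) for m in history)


history = [
    {"role": "system", "content": "Please keep your answers concise."},
    {"role": "user", "content": "What is my name?"},
]
print(estimate_history_tokens(history))
```

Swapping `len(content)` for a function like this would change the numbers printed in the cells below, so the limit would need to be adjusted to match.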
                

Let's see how this works, and whether we can get the model to forget things we mention at the start of a conversation.

In [119]:
conversation = Conversation(system_prompt)

In [120]:
print(conversation.response("Hello, my name is Bob and I am 25 years old!"))
print(f"Tokens: {conversation.tokens}")

Hello, Bob! It's nice to meet you. What philosophical question or topic would you like to explore today?
Tokens: 295


In [121]:
print(conversation.response("What is my name?"))
print(f"Tokens: {conversation.tokens}")


You mentioned your name is Bob. How can I assist you further?
Tokens: 328


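Great, so now we are over our limit of 300. In fact, trimming has already started: adding "What is my name?" pushed the count from 295 to 311, so the first user message was dropped before the call. The model could still answer because its own greeting ("Hello, Bob!") was still in the history. On the next call the greeting will be trimmed too.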

In [122]:
print(conversation.response("What is my age?"))
print(f"Tokens: {conversation.tokens}")

I don't have access to personal information, so I can't know your age. You could share it if you'd like to discuss it further!
Tokens: 365


In [123]:
pprint(conversation.history, expand_all=True)
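
The trimming can also be replayed offline. The sketch below re-implements the same oldest-first trimming as a standalone function (`trim_history` is a hypothetical name, not part of the class above) and applies it to the messages recorded in the outputs above, using character counts as the class does.

```python
def trim_history(history, limit, size=len):
    """Drop the oldest non-system messages until the total size fits the limit."""
    trimmed = list(history)
    total = sum(size(m["content"]) for m in trimmed)
    while total > limit:
        # Index of the oldest message that is not a system message, if any.
        oldest = next(
            (i for i, m in enumerate(trimmed) if m["role"] != "system"), None
        )
        if oldest is None:
            break  # Only system messages are left; nothing more to drop.
        total -= size(trimmed.pop(oldest)["content"])
    return trimmed


history = [
    {"role": "system", "content": (
        "You are a helpful philosophical assistant. "
        "You will help me think about philosophical questions. "
        "Please keep your answers concise and to the point."
    )},
    {"role": "user", "content": "Hello, my name is Bob and I am 25 years old!"},
    {"role": "assistant", "content": (
        "Hello, Bob! It's nice to meet you. What philosophical question "
        "or topic would you like to explore today?"
    )},
    {"role": "user", "content": "What is my name?"},
    {"role": "assistant", "content": (
        "You mentioned your name is Bob. How can I assist you further?"
    )},
    {"role": "user", "content": "What is my age?"},
]

trimmed = trim_history(history, limit=300)
for message in trimmed:
    print(f"{message['role'].upper()}: {message['content']}")
```

The only message that stated the age is among those dropped, which is why the model could no longer answer the last question.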

This is a good start, but there is a problem. What if something very important was mentioned at the start of the conversation, and it has now been cut off?