In [None]:
# Implementing self-editing memory from scratch

This module demonstrates how to build long-term memory capabilities in LLM agents using a technique called self-editing memory, inspired by the MemGPT paper.

Traditionally, methods like RAG (Retrieval-Augmented Generation) or recursive summarization are used to maintain memory in LLMs. MemGPT introduces a powerful alternative: delegating memory management to the LLM itself, treating it as the most intelligent component of the system.

💡 Instead of hard-coding memory updates or writing manual memory pruning logic, we use OpenAI's tool-calling to let the LLM autonomously decide:

What to remember

When to update memory

How to format and store memory internally

In [None]:
!pip install -r requirements.txt


In [None]:
## Set up OpenAI

In [None]:
from helper import get_openai_api_key
openai_api_key = get_openai_api_key()

In [None]:
from openai import OpenAI
import os

client = OpenAI(
    api_key=openai_api_key
)

In [None]:
# send a one-off request to confirm the connection works
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)

In [None]:
## Breaking down the LLM context window


### Context Window Design in MemoryAgent
In LLM-based agents, context is everything. Since large language models have a limited context window (a fixed number of tokens the model can "see" at once), it's essential to manage that space intelligently.

This project implements a structured context window for the agent, drawing on architectural insights from the MemGPT framework.

###  Agent Context Layout
MemoryAgent structures its prompt using the following key components:

System Prompt: Defines the personality and behavior rules of the agent.

Conversation History: Recent messages exchanged between the user and the assistant.

[MEMORY] Section: A custom-defined core memory area that contains structured knowledge (e.g., facts about the user, prior events, goals). This section is read-writeable by the agent via tool-calling.

Recursive Summary (Optional): A synthesized history summary that compresses old interactions, useful when message history exceeds the token limit.
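
The layout above can be sketched as a function that assembles the message list for one request. This is a minimal illustration, not MemGPT's actual prompt format: the `build_context` name, the `[SUMMARY]` label, and the ordering (system prompt, memory, summary, then history) are assumptions made for the example.

```python
import json

def build_context(system_prompt, memory, history, summary=None):
    """Assemble the message list: system prompt, memory, summary, history."""
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "system", "content": "[MEMORY]\n" + json.dumps(memory)},
    ]
    if summary:
        # Optional compressed view of older messages (label is hypothetical).
        messages.append({"role": "system", "content": "[SUMMARY]\n" + summary})
    messages.extend(history)
    return messages

context = build_context(
    system_prompt="You are a chatbot.",
    memory={"human": "Name: Bob"},
    history=[{"role": "user", "content": "What is my name?"}],
    summary="The user greeted the agent earlier.",
)
for message in context:
    print(message["role"], "|", message["content"][:40])
```

Keeping the fixed sections at the front means only the tail of the list changes as the conversation grows.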

###  Why This Matters
By explicitly dividing the context window, MemoryAgent can:

Maintain a consistent personality and role

Recall essential facts even across long conversations

Reduce token usage through smart summarization

Store persistent knowledge in the core memory, while keeping the conversation responsive

This structured design is crucial for building scalable, intelligent agents that don’t “forget” over time.
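
To make the summarization idea concrete, here is a rough sketch that keeps the newest messages within a budget and hands the evicted ones to a summarizer. It assumes character counts as a stand-in for tokens, and `compress_history` and `naive_summarize` are hypothetical helpers; a real agent would likely count tokens and ask the LLM to write the summary.

```python
def compress_history(history, max_chars, summarize):
    """Keep the newest messages within max_chars; summarize the evicted ones."""
    kept, used = [], 0
    for message in reversed(history):
        size = len(message["content"])
        if used + size > max_chars:
            break
        kept.append(message)
        used += size
    kept.reverse()
    evicted = history[: len(history) - len(kept)]
    summary = summarize(evicted) if evicted else None
    return summary, kept

def naive_summarize(messages):
    # Placeholder only: truncates each evicted message instead of calling an LLM.
    return " / ".join(m["content"][:20] for m in messages)

history = [
    {"role": "user", "content": "My name is Bob and I live in Boston."},
    {"role": "assistant", "content": "Nice to meet you, Bob!"},
    {"role": "user", "content": "What is my name?"},
]
summary, recent = compress_history(history, max_chars=40, summarize=naive_summarize)
print("summary:", summary)
print("recent:", [m["content"] for m in recent])
```

Note that the summary alone may drop facts the agent needs later, which is one reason to keep essential facts in the core memory section as well.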

In [None]:
### A simple agent's context window
In the code below, you can see how we can define an agent with a system prompt. The system prompt will be included in every chat completions request as the first message, while later messages will change over time as the user and assistant exchange messages. 

In [None]:
model = "gpt-4o-mini"

In [None]:
system_prompt = "You are a chatbot."

In [None]:
# Make a completion request with only the system prompt and one user message
chat_completion = client.chat.completions.create(
    model=model,
    messages=[
        # system prompt: always included in the context window 
        {"role": "system", "content": system_prompt}, 
        # chat history (evolves over time)
        {"role": "user", "content": "What is my name?"}, 
    ]
)
chat_completion.choices[0].message.content

In [None]:
### Adding memory to the context 
Similar to how we always start a chat completions request with a system prompt, we can also prefix the list of messages with a prompt containing important memories. Let's see how we can add a memory section to the context window, and have the agent use that memory to respond to user messages.

In [None]:
agent_memory = {"human": "Name: Bob"}
system_prompt = "You are a chatbot. " \
+ "You have a section of your context called [MEMORY] " \
+ "that contains information relevant to your conversation"

In [None]:
import json


chat_completion = client.chat.completions.create(
    model=model,
    messages=[
        # system prompt 
        {"role": "system", "content": system_prompt + "\n[MEMORY]\n" + json.dumps(agent_memory)},
        # chat history 
        {"role": "user", "content": "What is my name?"},
    ],
)
chat_completion.choices[0].message.content

In [None]:
Now we will make this memory read-writeable by the agent.

In [None]:
## Modifying the memory with tools

In [None]:
We first need to define a memory save tool. In order to allow the agent to save new memory, we implement a simple tool that appends to a section of the memory dictionary which we will pass to the agent.  

In [None]:
### Defining a memory editing tool 
Instead of directly providing the name in the agent's memory, we'll start with a blank memory object and provide a function that can edit it.

In [None]:
agent_memory = {"human": "", "agent": ""}

def core_memory_save(section: str, memory: str): 
    agent_memory[section] += '\n' 
    agent_memory[section] += memory 
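
In [None]:
Before wiring the tool to the model, it can help to call it by hand and look at the resulting format. The sketch below repeats the function so it runs on its own; the saved strings and the `"pets"` section are made-up inputs for illustration.

```python
agent_memory = {"human": "", "agent": ""}

def core_memory_save(section: str, memory: str):
    agent_memory[section] += "\n"
    agent_memory[section] += memory

core_memory_save("human", "Name: Bob")
core_memory_save("human", "Likes cats")
core_memory_save("agent", "Persona: friendly")
print(agent_memory)
# {'human': '\nName: Bob\nLikes cats', 'agent': '\nPersona: friendly'}

# An unknown section raises KeyError, which is why the tool schema
# should restrict `section` to keys that exist in the dictionary.
try:
    core_memory_save("pets", "Has a dog")
except KeyError as error:
    print("unknown section:", error)
```

Each call appends a new line, so the tool as written can only add to memory; it never replaces or removes an entry.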

In [None]:
To inform our agent of this tool and how to use it, we need to create a tool schema that OpenAI can process. This includes a description of how to use the tool, and the parameters the agent must generate to input into the tool. 

In [None]:

# tool description 
core_memory_save_description = "Save important information about you, " \
+ "the agent, or the human you are chatting with."

# arguments into the tool (generated by the LLM)
# defines what the agent must generate to input into the tool 
core_memory_save_properties = \
{
    # arg 1: section of memory to edit
    "section": {
        "type": "string",
        "enum": ["human", "agent"],
        "description": "Must be either 'human' " \
        + "(to save information about the human) or 'agent' " \
        + "(to save information about yourself)",
    },
    # arg 2: memory to save
    "memory": {
        "type": "string",
        "description": "Memory to save in the section",
    },
}

# tool schema (passed to OpenAI)
core_memory_save_metadata = \
    {
        "type": "function",
        "function": {
            "name": "core_memory_save",
            "description": core_memory_save_description,
            "parameters": {
                "type": "object",
                "properties": core_memory_save_properties,
                "required": ["section", "memory"],
            },
        }
    }
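
In [None]:
The schema tells the model what to generate, but the arguments still arrive as a JSON string that our own code has to trust. As a hedged sketch, the hypothetical `validate_arguments` helper below checks a raw argument string against a `parameters` object of the same shape as the one above. It only covers required keys, string types, and enums; it is not a full JSON Schema validator.

```python
import json

# Same shape as the `parameters` object in the tool schema above.
parameters = {
    "type": "object",
    "properties": {
        "section": {"type": "string", "enum": ["human", "agent"]},
        "memory": {"type": "string"},
    },
    "required": ["section", "memory"],
}

def validate_arguments(parameters, raw_arguments):
    """Return (arguments, errors) for a JSON string of tool arguments."""
    arguments = json.loads(raw_arguments)
    errors = []
    for name in parameters["required"]:
        if name not in arguments:
            errors.append(f"missing required argument: {name}")
    for name, value in arguments.items():
        spec = parameters["properties"].get(name)
        if spec is None:
            errors.append(f"unexpected argument: {name}")
        elif spec["type"] == "string" and not isinstance(value, str):
            errors.append(f"{name} must be a string")
        elif "enum" in spec and value not in spec["enum"]:
            errors.append(f"{name} must be one of {spec['enum']}")
    return arguments, errors

good = validate_arguments(parameters, '{"section": "human", "memory": "Name: Bob"}')
bad = validate_arguments(parameters, '{"section": "pets"}')
print(good[1])
print(bad[1])
```

Returning the errors as strings makes it easy to send them back to the model as the tool result so it can retry.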

In [None]:
Now, we can pass the tool schema to the agent!

In [None]:


# include both sections allowed by the tool schema to avoid a KeyError
agent_memory = {"human": "", "agent": ""}
system_prompt = "You are a chatbot. " \
+ "You have a section of your context called [MEMORY] " \
+ "that contains information relevant to your conversation"

chat_completion = client.chat.completions.create(
    model=model,
    messages=[
        # system prompt 
        {"role": "system", "content": system_prompt}, 
        # memory 
        {"role": "system", "content": "[MEMORY]\n" + json.dumps(agent_memory)},
        # chat history 
        {"role": "user", "content": "My name is Bob"},
    ],
    # tool schemas 
    tools=[core_memory_save_metadata]
)
response = chat_completion.choices[0]
response

In [None]:
### Executing the tool 
OpenAI only returns the tool call; it does not execute the tool, so we have to do this ourselves. Let's take the arguments specified in the tool call response we just got and use them to run the tool.

In [None]:
arguments = json.loads(response.message.tool_calls[0].function.arguments)
arguments

In [None]:
# run the function with the specified arguments 
core_memory_save(**arguments)
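
In [None]:
With more than one tool, a small dispatch table keeps execution in one place. The following is a sketch under assumptions: `TOOLS` and `execute_tool_call` are hypothetical names, and the name and argument strings stand in for `tool_call.function.name` and `tool_call.function.arguments` from a real response.

```python
import json

agent_memory = {"human": "", "agent": ""}

def core_memory_save(section: str, memory: str):
    agent_memory[section] += "\n" + memory

# Hypothetical registry mapping tool names to Python callables.
TOOLS = {"core_memory_save": core_memory_save}

def execute_tool_call(name: str, raw_arguments: str) -> str:
    """Run a registered tool and return a string result for the model."""
    if name not in TOOLS:
        return f"Error: unknown tool {name}"
    try:
        arguments = json.loads(raw_arguments)
        TOOLS[name](**arguments)
    except (json.JSONDecodeError, TypeError, KeyError) as error:
        return f"Error: {type(error).__name__}: {error}"
    return f"Updated memory: {json.dumps(agent_memory)}"

# Stand-ins for the name and arguments found in a tool call response.
result = execute_tool_call(
    "core_memory_save", '{"section": "human", "memory": "Name: Bob"}'
)
print(result)
print(execute_tool_call("delete_everything", "{}"))
```

Returning errors as strings rather than raising lets the loop feed the failure back to the model instead of crashing.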

In [None]:
Now, we can view the memory object that has been updated: 

In [None]:
agent_memory

In [None]:
### Running the next agent step 
Now, we can see how the agent responds differently as the memory has been updated. 

In [None]:
chat_completion = client.chat.completions.create(
    model=model,
    messages=[
        # system prompt 
        {"role": "system", "content": system_prompt}, 
        # memory 
        {"role": "system", "content": "[MEMORY]\n" + json.dumps(agent_memory)},
        # chat history 
        {"role": "user", "content": "what is my name"},
    ],
    tools=[core_memory_save_metadata]
)
response = chat_completion.choices[0]
response.message

In [None]:
## Implementing an agentic loop
In our current implementation, the agent can only take one step at a time: it can either edit memory or respond to the user. Ideally, we want our agent to support *multi-step reasoning*, so it can combine multiple actions. For example, if we tell the agent our name is "Bob", we want the agent to both edit its memory and return a message to us.

We can implement multi-step reasoning by calling `client.chat.completions.create(...)` in a loop, and allowing the agent to choose whether to continue its reasoning steps or to break out of the loop. For simplicity, we will assume that an agent response that is *not* a tool call breaks out of the reasoning loop and yields control back to the user.
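
The control flow can be sketched without any API calls by replacing the model with a scripted stand-in. Everything here is an assumption made for illustration: `fake_model` hard-codes one tool call followed by a reply, and `max_steps` is an added safeguard against a model that never stops calling tools.

```python
import json

agent_memory = {"human": ""}

def core_memory_save(section, memory):
    agent_memory[section] += "\n" + memory

def fake_model(messages):
    """Scripted stand-in for the chat completions call (an assumption)."""
    already_saved = any(m["role"] == "tool" for m in messages)
    if not already_saved:
        return {"tool_call": {"section": "human", "memory": "Name: Bob"}}
    return {"content": "Nice to meet you, Bob!"}

def run_loop(user_message, max_steps=5):
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_steps):
        reply = fake_model(messages)
        if "tool_call" not in reply:
            # A plain message ends the loop and returns control to the user.
            return reply["content"]
        core_memory_save(**reply["tool_call"])
        messages.append({"role": "tool", "content": json.dumps(agent_memory)})
    return None  # step budget exhausted

answer = run_loop("My name is Bob")
print(answer)
print(agent_memory)
```

The loop runs twice: the first pass saves the name, and the second pass produces the reply that ends the step.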


In [None]:
# include both sections allowed by the tool schema to avoid a KeyError
agent_memory = {"human": "", "agent": ""}

In [None]:
system_prompt_os = system_prompt \
+ ".\nYou must either call a tool (core_memory_save) or " \
+ "write a response to the user. " \
+ "Do not take the same actions multiple times! " \
+ "When you learn new information, make sure to always " \
+ "call the core_memory_save tool."

In [None]:
We implement a simple step function for the agent. The function responds to the user message but allows the agent to take multiple actions in sequence, and returns to the user when a message (that does not call a tool) is sent. 

In [None]:
def agent_step(user_message, chat_history=None):

    # prefix messages with system prompt and memory
    messages = [
        # system prompt
        {"role": "system", "content": system_prompt_os},
        # memory
        {
            "role": "system",
            "content": "[MEMORY]\n" + json.dumps(agent_memory)
        },
    ]

    # append the chat history (None avoids a shared mutable default)
    messages += chat_history or []

    # append the most recent message
    messages.append({"role": "user", "content": user_message})

    # agentic loop
    while True:
        chat_completion = client.chat.completions.create(
            model=model,
            messages=messages,
            tools=[core_memory_save_metadata]
        )
        response = chat_completion.choices[0]

        # update the messages with the agent's response
        messages.append(response.message)

        # if NOT calling a tool (responding to the user), return
        if not response.message.tool_calls:
            return response.message.content

        # if calling tools, execute each one (every tool call needs a reply)
        for tool_call in response.message.tool_calls:
            print("TOOL CALL:", tool_call.function)

            # parse the arguments from the LLM function call
            arguments = json.loads(tool_call.function.arguments)

            # run the function with the specified arguments
            core_memory_save(**arguments)

            # add the tool result (with the updated memory) to the message history
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "name": tool_call.function.name,
                "content": f"Updated memory: {json.dumps(agent_memory)}"
            })

In [None]:
Now, we can observe the agent take multiple actions when we send it a message:

In [None]:
agent_step("my name is bob.")

In [None]:
Now, the agent is able to both edit its memory *and* generate a response to the user that uses the updated memory in a single step. 

Although this example supports only a single tool plus responding to the user, the same structure can be used to implement more complex reasoning loops that combine many tools. In MemGPT, all actions (even a response to the user) are tools: some tools (such as sending a message) break the agent reasoning loop, while others (such as searching the archival memory store or editing memory) do not.
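
The idea that every action is a tool, with some tools ending the loop, can be sketched with a table that marks each tool as loop-breaking or not. This is an illustration only, not MemGPT's implementation: the `breaks_loop` flag and the `archival_insert` name are hypothetical, and the scripted action list stands in for successive model outputs.

```python
# Hypothetical tool table: each entry says whether the tool ends the loop.
sent_messages, archive = [], []

TOOLS = {
    "send_message": {"run": sent_messages.append, "breaks_loop": True},
    "archival_insert": {"run": archive.append, "breaks_loop": False},
}

def run_actions(actions):
    """Execute (tool_name, argument) pairs until a loop-breaking tool runs."""
    steps = 0
    for name, argument in actions:
        tool = TOOLS[name]
        tool["run"](argument)
        steps += 1
        if tool["breaks_loop"]:
            break
    return steps

# Scripted actions standing in for successive model outputs.
steps = run_actions([
    ("archival_insert", "Bob loves cats"),
    ("send_message", "Got it, I'll remember that."),
    ("archival_insert", "never reached"),
])
print(steps, sent_messages, archive)
```

The third action never runs because `send_message` hands control back to the user.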

We have now implemented self-editing memory and multi-step reasoning :)

In [None]:
#  Building Agents with memory

In [None]:
!pip install letta letta-client openai python-dotenv


In [None]:
## Set up a Letta client

In [None]:
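import os
from dotenv import load_dotenv

def get_letta_token():
    # read LETTA_API_TOKEN from the environment (or a local .env file)
    load_dotenv()
    return os.getenv("LETTA_API_TOKEN")

def get_letta_base_url():
    # read LETTA_BASE_URL from the environment (or a local .env file)
    load_dotenv()
    return os.getenv("LETTA_BASE_URL")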


In [None]:
from letta_client import Letta
from helper import get_letta_token, get_letta_base_url

client = Letta(
    token=get_letta_token(),
    base_url=get_letta_base_url()
)


In [None]:
# to test it
client.agents.list()  # Should return empty or list of agents


In [None]:
def print_message(message):
    if message.message_type == "reasoning_message":
        print("🧠 Reasoning: " + message.reasoning)
    elif message.message_type == "assistant_message":
        print("🤖 Agent: " + message.content)
    elif message.message_type == "tool_call_message":
        print("🔧 Tool Call: " + message.tool_call.name + "\n" + message.tool_call.arguments)
    elif message.message_type == "tool_return_message":
        print("🔧 Tool Return: " + message.tool_return)
    elif message.message_type == "user_message":
        print("👤 User Message: " + message.content)

In [None]:
## Creating a simple agent with memory

In [None]:
### Creating an agent

In [None]:
agent_state = client.agents.create(
    name="simple_agent",
    memory_blocks=[
        {
          "label": "human",
          "value": "My name is Charles",
          "limit": 10000 # character limit
        },
        {
          "label": "persona",
          "value": "You are a helpful assistant and you always use emojis"
        }
    ],
    model="openai/gpt-4o-mini-2024-07-18",
    embedding="openai/text-embedding-3-small"
)
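
In [None]:
To build intuition for what a memory block with a `limit` does, here is a toy stand-in written in plain Python. It is not Letta's implementation: the `MemoryBlock` class, its `append` method, and the tag-style rendering in `render_blocks` are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class MemoryBlock:
    """Toy stand-in for a labeled memory block with a character limit."""
    label: str
    value: str = ""
    limit: int = 10000

    def append(self, text: str) -> None:
        new_value = f"{self.value}\n{text}" if self.value else text
        if len(new_value) > self.limit:
            raise ValueError(
                f"block '{self.label}' would exceed {self.limit} characters"
            )
        self.value = new_value

def render_blocks(blocks):
    """Render blocks into one string, roughly as a prompt section might."""
    return "\n".join(f"<{b.label}>\n{b.value}\n</{b.label}>" for b in blocks)

human = MemoryBlock(label="human", value="My name is Charles", limit=40)
human.append("Likes emojis")
print(render_blocks([human]))
try:
    human.append("x" * 100)
except ValueError as error:
    print(error)
```

A hard limit keeps the block from crowding out the rest of the context window, which forces the agent to rewrite or condense memory rather than append forever.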

In [None]:
### Messaging an agent

In [None]:
# send a message to the agent
response = client.agents.messages.create(
    agent_id=agent_state.id,
    messages=[
        {
            "role": "user",
            "content": "hows it going????"
        }
    ]
)

# if we want to print the messages
for message in response.messages:
    print_message(message)

In [None]:
### Viewing usage information

In [None]:
# if we want to print the usage stats
print(response.usage.completion_tokens)
print(response.usage.prompt_tokens)
print(response.usage.step_count)

In [None]:
### Understanding agent state

In [None]:
print(agent_state.system)

In [None]:
[t.name for t in agent_state.tools]

In [None]:
Memory blocks are returned as an unordered list.

In [None]:
agent_state.memory

In [None]:
for message in client.agents.messages.list(agent_id=agent_state.id):
    print_message(message)

In [None]:
passages = client.agents.passages.list(
    agent_id=agent_state.id,
)
passages

In [None]:
##  Understanding core memory

In [None]:
### Giving agent new information

In [None]:
# send a message to the agent
response = client.agents.messages.create(
    agent_id=agent_state.id,
    messages=[
        {
            "role": "user",
            "content": "my name actually Sarah "
        }
    ]
)

# if we want to print the messages
for message in response.messages:
    print_message(message)

In [None]:
print(response.usage.completion_tokens)
print(response.usage.prompt_tokens)
print(response.usage.step_count)

In [None]:
### Retrieving new values

In [None]:
client.agents.blocks.retrieve(
    agent_id=agent_state.id,
    block_label="human"
).value

In [None]:
##  Understanding archival memory

In [None]:
### Saving information to archival memory

In [None]:
passages = client.agents.passages.list(
    agent_id=agent_state.id,
)
passages

In [None]:
response = client.agents.messages.create(
    agent_id=agent_state.id,
    messages=[
        {
            "role": "user",
            "content": "Save the information that 'bob loves cats' to archival"
        }
    ]
)

# if we want to print the messages
for message in response.messages:
    print_message(message)

In [None]:
passages = client.agents.passages.list(
    agent_id=agent_state.id,
)
[passage.text for passage in passages]

In [None]:
### Explicitly creating archival memories

In [None]:
client.agents.passages.create(
    agent_id=agent_state.id,
    text="Bob loves Boston terriers",
)

In [None]:
# send a message to the agent
response = client.agents.messages.create(
    agent_id=agent_state.id,
    messages=[
        {
            "role": "user",
            "content": "What animals do I like? Search archival."
        }
    ]
)

for message in response.messages:
    print_message(message)
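
In [None]:
Archival memory lives outside the context window and is searched on demand. The toy store below illustrates that insert-then-search pattern only. It ranks passages by keyword overlap, which is a deliberate simplification standing in for embedding-based search, and the `ToyArchive` class is hypothetical.

```python
class ToyArchive:
    """Toy archival store: keyword overlap stands in for embedding search."""

    def __init__(self):
        self.passages = []

    def insert(self, text: str) -> None:
        self.passages.append(text)

    def search(self, query: str, top_k: int = 2):
        query_words = set(query.lower().split())
        scored = []
        for text in self.passages:
            overlap = len(query_words & set(text.lower().split()))
            if overlap:
                scored.append((overlap, text))
        # Highest overlap first; ties keep insertion order (sort is stable).
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:top_k]]

archive = ToyArchive()
archive.insert("Bob loves cats")
archive.insert("Bob loves Boston terriers")
archive.insert("The meeting is on Friday")
print(archive.search("what does bob love about cats"))
```

Only the passages returned by the search are placed into the context, so the store can grow far beyond what the context window could hold.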

In [None]:
def print_message(message):  
    if message.message_type == "reasoning_message": 
        print("🧠 Reasoning: " + message.reasoning) 
    elif message.message_type == "assistant_message": 
        print("🤖 Agent: " + message.content) 
    elif message.message_type == "tool_call_message": 
        print("🔧 Tool Call: " + message.tool_call.name + "\n" + message.tool_call.arguments)
    elif message.message_type == "tool_return_message": 
        print("🔧 Tool Return: " + message.tool_return)
    elif message.message_type == "user_message": 
        print("👤 User Message: " + message.content)
    elif message.message_type == "usage_statistics": 
        # for streaming specifically, we send the final chunk that contains the usage statistics 
        print(f"Usage: [{message}]")
    else: 
        print(message)
    print("-----------------------------------------------------")

In [None]:
##  Memory Blocks

In [None]:
### Creating an agent

In [None]:
agent_state = client.agents.create(
    memory_blocks=[
        {
          "label": "human",
          "value": "The human's name is Bob the Builder."
        },
        {
          "label": "persona",
          "value": "My name is Sam, the all-knowing sentient AI."
        }
    ],
    model="openai/gpt-4o-mini",
    embedding="openai/text-embedding-3-small"
)

In [None]:
### Accessing blocks

In [None]:
blocks = client.agents.blocks.list(
    agent_id=agent_state.id,
)

In [None]:
Memory blocks are returned as an unordered list

In [None]:
# Use the id of the first block returned above (or paste any other block id)
block_id = blocks[0].id

In [None]:
client.blocks.retrieve(block_id)

In [None]:
human_block = client.agents.blocks.retrieve(
    agent_id=agent_state.id,
    block_label="human",
)
human_block

In [None]:
### Accessing block prompt template

In [None]:
client.agents.core_memory.retrieve(
    agent_id=agent_state.id
).prompt_template

In [None]:
##  Accessing `AgentState` with Tools

In [None]:
### Creating tools

In [None]:
def get_agent_id(agent_state: "AgentState"):
    """
    Query your agent ID field
    """
    return agent_state.id

In [None]:
get_id_tool = client.tools.upsert_from_function(func=get_agent_id)

In [None]:
### Creating agents that use tools

In [None]:
agent_state = client.agents.create(
    memory_blocks=[],
    model="openai/gpt-4o-mini",
    embedding="openai/text-embedding-3-small",
    tool_ids=[get_id_tool.id]
)

In [None]:
response_stream = client.agents.messages.create_stream(
    agent_id=agent_state.id,
    messages=[
        {
            "role": "user",
            "content": "What is your agent id?" 
        }
    ]
)

for chunk in response_stream:
    print_message(chunk)

In [None]:
##  Custom Task Queue Memory

In [None]:
### Creating custom memory management tools

In [None]:
def task_queue_push(agent_state: "AgentState", task_description: str):
    """
    Push to a task queue stored in core memory.

    Args:
        task_description (str): A description of the next task you must accomplish.

    Returns:
        Optional[str]: None is always returned as this function
        does not produce a response.
    """

    from letta_client import Letta
    import json

    client = Letta(base_url="http://localhost:8283")

    block = client.agents.blocks.retrieve(
        agent_id=agent_state.id,
        block_label="tasks",
    )
    tasks = json.loads(block.value)
    tasks.append(task_description)

    # update the block value
    client.agents.blocks.modify(
        agent_id=agent_state.id,
        value=json.dumps(tasks),
        block_label="tasks"
    )
    return None

In [None]:
def task_queue_pop(agent_state: "AgentState"):
    """
    Get the next task from the task queue

    Returns:
        Optional[str]: The next task and the remaining tasks in the queue,
        or None if the queue is empty
    """

    from letta_client import Letta
    import json

    client = Letta(base_url="http://localhost:8283")

    # get the block
    block = client.agents.blocks.retrieve(
        agent_id=agent_state.id,
        block_label="tasks",
    )
    tasks = json.loads(block.value)
    if len(tasks) == 0:
        return None
    task = tasks[0]

    # update the block value
    remaining_tasks = json.dumps(tasks[1:])
    client.agents.blocks.modify(
        agent_id=agent_state.id,
        value=remaining_tasks,
        block_label="tasks"
    )
    # report the popped task as well as what is left in the queue
    return f"Next task: {task}. Remaining tasks: {remaining_tasks}"
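
In [None]:
The queue logic can be checked without a running server by swapping the block for an in-memory object. This is a sketch under assumptions: `FakeBlock`, `push`, and `pop` are hypothetical stand-ins that mirror the JSON round trip in the tools above, not part of the Letta client.

```python
import json

class FakeBlock:
    """In-memory stand-in for a core memory block holding a JSON list."""
    def __init__(self):
        self.value = json.dumps([])

def push(block, task_description):
    tasks = json.loads(block.value)
    tasks.append(task_description)
    block.value = json.dumps(tasks)

def pop(block):
    tasks = json.loads(block.value)
    if not tasks:
        return None
    # First in, first out: remove and return the oldest task.
    block.value = json.dumps(tasks[1:])
    return tasks[0]

block = FakeBlock()
push(block, "start calling me Charles")
push(block, "tell me a haiku about my name")
first = pop(block)
print(first)
print(block.value)
```

Storing the queue as a JSON string keeps it readable by the model, since the block value appears verbatim in the agent's context.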

In [None]:
### Upserting tools into Letta

In [None]:
task_queue_pop_tool = client.tools.upsert_from_function(
    func=task_queue_pop
)
task_queue_push_tool = client.tools.upsert_from_function(
    func=task_queue_push
)

In [None]:
import json

task_agent = client.agents.create(
    system=open("task_queue_system_prompt.txt", "r").read(),
    memory_blocks=[
        {
          "label": "tasks",
          "value": json.dumps([])
        }
    ],
    model="openai/gpt-4o-mini-2024-07-18",
    embedding="openai/text-embedding-3-small", 
    tool_ids=[task_queue_pop_tool.id, task_queue_push_tool.id], 
    include_base_tools=False, 
    tools=["send_message"]
)

In [None]:
[tool.name for tool in task_agent.tools]

In [None]:
client.agents.blocks.retrieve(task_agent.id, block_label="tasks").value

In [None]:
### Using task agent

In [None]:
response_stream = client.agents.messages.create_stream(
    agent_id=task_agent.id,
    messages=[
        {
            "role": "user",
            "content": "Add 'start calling me Charles' and "
            + "'tell me a haiku about my name' as two separate tasks."
        }
    ]
)

for chunk in response_stream:
    print_message(chunk)

In [None]:
response_stream = client.agents.messages.create_stream(
    agent_id=task_agent.id,
    messages=[
        {
            "role": "user",
            "content": "Complete your tasks"
        }
    ]
)

for chunk in response_stream:
    print_message(chunk)

In [None]:
### Retrieving task list

In [None]:
client.agents.blocks.retrieve(block_label="tasks", agent_id=task_agent.id).value