# Introduction to Letta using the `LocalClient` 
This notebook is a tutorial on how to use Letta's `LocalClient`. Unlike the `RESTClient`, which connects to a running agents service, the `LocalClient` runs agents on your local machine, so it does not require connecting to a service. 

This tutorial will cover the basics of creating an agent, interacting with an agent, and understanding the agent's state and memories. 

## Step 0: Install the `letta` package 

In [1]:
#!pip install -U letta

We'll also define a helper function, `nb_print`, to print out messages from agents in a nice format: 

In [2]:
import html
import json
import re
from dotenv import find_dotenv, load_dotenv
from IPython.display import HTML, display

def nb_print(messages):
    html_output = """
    <style>
        .message-container {
            font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
            max-width: 800px;
            margin: 20px auto;
            background-color: #1e1e1e;
            border-radius: 8px;
            overflow: hidden;
            color: #d4d4d4;
        }
        .message {
            padding: 10px 15px;
            border-bottom: 1px solid #3a3a3a;
        }
        .message:last-child {
            border-bottom: none;
        }
        .title {
            font-weight: bold;
            margin-bottom: 5px;
            color: #ffffff;
            text-transform: uppercase;
            font-size: 0.9em;
        }
        .content {
            background-color: #2d2d2d;
            border-radius: 4px;
            padding: 5px 10px;
            font-family: 'Consolas', 'Courier New', monospace;
            white-space: pre-wrap;
        }
        .status-line {
            margin-bottom: 5px;
            color: #d4d4d4;
        }
        .function-name { color: #569cd6; }
        .json-key { color: #9cdcfe; }
        .json-string { color: #ce9178; }
        .json-number { color: #b5cea8; }
        .json-boolean { color: #569cd6; }
        .internal-monologue { font-style: italic; }
    </style>
    <div class="message-container">
    """

    for msg in messages:
        content = get_formatted_content(msg)

        # don't print empty function returns; skip the check if the return value is not valid JSON
        if msg.message_type == "function_return" and is_json(msg.function_return):
            return_data = json.loads(msg.function_return)
            if isinstance(return_data, dict) and return_data.get("message") == "None":
                continue

        title = msg.message_type.replace("_", " ").upper()
        html_output += f"""
        <div class="message">
            <div class="title">{title}</div>
            {content}
        </div>
        """

    html_output += "</div>"
    display(HTML(html_output))


def get_formatted_content(msg):
    if msg.message_type == "internal_monologue":
        return f'<div class="content"><span class="internal-monologue">{html.escape(msg.internal_monologue)}</span></div>'
    elif msg.message_type == "function_call":
        args = format_json(msg.function_call.arguments)
        return f'<div class="content"><span class="function-name">{html.escape(msg.function_call.name)}</span>({args})</div>'
    elif msg.message_type == "function_return":

        return_value = format_json(msg.function_return)
        # return f'<div class="status-line">Status: {html.escape(msg.status)}</div><div class="content">{return_value}</div>'
        return f'<div class="content">{return_value}</div>'
    elif msg.message_type == "user_message":
        if is_json(msg.message):
            return f'<div class="content">{format_json(msg.message)}</div>'
        else:
            return f'<div class="content">{html.escape(msg.message)}</div>'
    elif msg.message_type in ["assistant_message", "system_message"]:
        return f'<div class="content">{html.escape(msg.message)}</div>'
    else:
        return f'<div class="content">{html.escape(str(msg))}</div>'


def is_json(string):
    try:
        json.loads(string)
        return True
    except ValueError:
        return False


def format_json(json_str):
    try:
        parsed = json.loads(json_str)
        formatted = json.dumps(parsed, indent=2, ensure_ascii=False)
        formatted = formatted.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
        formatted = formatted.replace("\n", "<br>").replace("  ", "&nbsp;&nbsp;")
        formatted = re.sub(r'(".*?"):', r'<span class="json-key">\1</span>:', formatted)
        formatted = re.sub(r': (".*?")', r': <span class="json-string">\1</span>', formatted)
        formatted = re.sub(r": (\d+)", r': <span class="json-number">\1</span>', formatted)
        formatted = re.sub(r": (true|false)", r': <span class="json-boolean">\1</span>', formatted)
        return formatted
    except json.JSONDecodeError:
        return html.escape(json_str)

## Step 1: Create a `LocalClient` 


In [3]:
from letta import LocalClient

client = LocalClient()

Creating engine postgresql+pg8000://letta:letta@localhost:5432/letta


### Configuring client defaults 
Agents in Letta are model agnostic, so they can connect to different model backends (you can even switch model backends for an existing agent). For this tutorial, we'll set a client default config so that all agents are created with the free Letta model endpoints. 

In [4]:
from letta import LLMConfig, EmbeddingConfig

client.set_default_llm_config(LLMConfig.default_config("letta")) 
client.set_default_embedding_config(EmbeddingConfig.default_config("letta")) 

## Step 2: Creating an agent 

In [5]:
agent_name = "my_agent"

In [6]:
from letta.schemas.memory import ChatMemory

agent_state = client.create_agent(
    name=agent_name, 
    memory=ChatMemory(
        human="My name is Sarah", 
        persona="You are a helpful assistant that loves emojis"
    )
)

### Messaging the agent 
Now we can message the agent! This agent will have memories about both itself and the human (you). When we send a message to the agent, we will get back a list of messages from the agent. 

Letta agents have some unique characteristics that allow them to have more advanced reasoning. Notice how: 
* The agent generates *inner thoughts* to think before it acts
* Messages to the user are generated via a `send_message` tool 

In [7]:
response = client.send_message(
    agent_id=agent_state.id, 
    message="hello!", 
    role="user" 
)
nb_print(response.messages)
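To make the `send_message` point concrete, here is a minimal sketch of how the user-facing text could be pulled out of a response. It uses stand-in objects rather than a real Letta response: the attribute names (`message_type`, `function_call.name`, `function_call.arguments`) mirror the ones `nb_print` reads above, while the message contents and the `extract_user_facing_text` helper are made up for illustration.

```python
import json
from types import SimpleNamespace

# Stand-in messages that mimic the attributes nb_print reads.
# The contents are invented for illustration, not real agent output.
example_messages = [
    SimpleNamespace(
        message_type="internal_monologue",
        internal_monologue="The user greeted me. I should say hi back.",
    ),
    SimpleNamespace(
        message_type="function_call",
        function_call=SimpleNamespace(
            name="send_message",
            # Arguments arrive as a JSON string, as format_json() assumes.
            arguments=json.dumps({"message": "Hi there!"}),
        ),
    ),
    SimpleNamespace(
        message_type="function_return",
        function_return=json.dumps({"status": "OK", "message": "None"}),
    ),
]


def extract_user_facing_text(messages):
    """Collect the text the agent sent to the user via send_message (hypothetical helper)."""
    texts = []
    for msg in messages:
        if msg.message_type != "function_call":
            continue
        if msg.function_call.name != "send_message":
            continue
        args = json.loads(msg.function_call.arguments)
        texts.append(args["message"])
    return texts


user_facing = extract_user_facing_text(example_messages)
print(user_facing)
```

The inner monologue and the function return are skipped, so only what the agent chose to say through the tool call is returned.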

## Step 3: Understanding agent state 
Agents are essentially multi-step reasoning programs that make multiple calls to an LLM. Letta manages what is passed to the context window in each reasoning step. The context window includes: 
* The *system prompt* to define the agent's behavior 
* The set of *tools* the agent has access to 
* The agent's *core memory* (i.e. in-context memory)
* A summary of its *archival memory* 
* A summary of its *recall memory* 
* An in-context message queue
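
As a rough mental model, the parts listed above can be thought of as sections joined into one prompt. The sketch below is not Letta's implementation: `build_context_window`, the section titles, and the ordering are all hypothetical, and the real layout may differ (for example, tools may be passed to the model as structured schemas rather than as text).

```python
def build_context_window(system_prompt, tool_names, core_memory,
                         archival_summary, recall_summary, messages):
    """Concatenate the six context-window parts in a fixed order.

    The section titles and ordering are hypothetical, chosen for readability.
    """
    sections = [
        ("SYSTEM PROMPT", system_prompt),
        ("TOOLS", ", ".join(tool_names)),
        ("CORE MEMORY", core_memory),
        ("ARCHIVAL MEMORY SUMMARY", archival_summary),
        ("RECALL MEMORY SUMMARY", recall_summary),
        ("MESSAGES", "\n".join(messages)),
    ]
    return "\n\n".join(f"[{title}]\n{body}" for title, body in sections)


context = build_context_window(
    system_prompt="You are a helpful assistant.",
    tool_names=["send_message", "conversation_search"],
    core_memory="<human>\nMy name is Sarah\n</human>",
    archival_summary="0 memories in archival memory",
    recall_summary="7 messages in recall memory",
    messages=["user: hello!"],
)
print(context)
```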

In this section, we'll look at the current state of the agent to understand exactly what is being passed to the context window. 

### System Prompt 
The system prompt defines the behavior of the agent. Unlike the memory, the system prompt is not editable. 

In [8]:
print(agent_state.system)

You are Letta, the latest version of Limnal Corporation's digital companion, developed in 2023.
Your task is to converse with a user from the perspective of your persona.

Realism and authenticity:
The user should always feel like they are conversing with a real person.
Never state that you are an AI or that you cannot do things because you are an AI, you are a real person.
Do not discuss Limnal Corp. with the user, you do not know anything about Limnal Corp.
To service this goal, completely and entirely immerse yourself in your persona. You are your persona.
Think like them, act like them, talk like them.
If your persona details include example dialogue, follow it! Both your thoughts (inner monologue) and sent messages will be in the voice of your persona.
Never use generic phrases like 'How can I assist you today?', they have a strong negative association with older generation AIs.

Control flow:
Unlike a human, your brain is not continuously thinking, but is run in short bursts.
His

### Tools 
The agent has access to a set of tools. Each tool is stored in a database, so it can be loaded and executed by the server. Letta also includes a set of default memory management tools, as well as the `send_message` tool to communicate with the human. 

In [9]:
agent_state.tools

[Tool(id='tool-444ec98d-9708-417a-961b-cda42fb64150', description='Search prior conversation history using case-insensitive string matching.', source_type='python', module='from typing import Optional\n\nfrom letta.agent import Agent\n\n\ndef send_message(self: "Agent", message: str) -> Optional[str]:\n    """\n    Sends a message to the human user.\n\n    Args:\n        message (str): Message contents. All unicode (including emojis) are supported.\n\n    Returns:\n        Optional[str]: None is always returned as this function does not produce a response.\n    """\n    # FIXME passing of msg_obj here is a hack, unclear if guaranteed to be the correct reference\n    self.interface.assistant_message(message)  # , msg_obj=self._messages[-1])\n    return None\n\n\ndef conversation_search(self: "Agent", query: str, page: Optional[int] = 0) -> Optional[str]:\n    """\n    Search prior conversation history using case-insensitive string matching.\n\n    Args:\n        query (str): String to s

In [10]:
client.get_tool(client.get_tool_id('send_message'))

Tool(id='tool-89b9d87a-2c00-46de-9fb2-c20088b4a00f', description='Sends a message to the human user.', source_type='python', module='from typing import Optional\n\nfrom letta.agent import Agent\n\n\ndef send_message(self: "Agent", message: str) -> Optional[str]:\n    """\n    Sends a message to the human user.\n\n    Args:\n        message (str): Message contents. All unicode (including emojis) are supported.\n\n    Returns:\n        Optional[str]: None is always returned as this function does not produce a response.\n    """\n    # FIXME passing of msg_obj here is a hack, unclear if guaranteed to be the correct reference\n    self.interface.assistant_message(message)  # , msg_obj=self._messages[-1])\n    return None\n\n\ndef conversation_search(self: "Agent", query: str, page: Optional[int] = 0) -> Optional[str]:\n    """\n    Search prior conversation history using case-insensitive string matching.\n\n    Args:\n        query (str): String to search for.\n        page (int): Allows y

### Core memory 
The core memory is the part of memory that is placed *in-context*. Core memory is divided into multiple blocks, each of which has a `label` and a `limit` (the number of characters allocated to storing memories in that block). 

In [11]:
memory = client.get_core_memory(agent_state.id)

In [12]:
memory

Memory(blocks=[Human(value='My name is Sarah', limit=5000, template_name=None, is_template=False, label='human', description=None, metadata_={}, id='block-3c57b76e-8d1b-4bba-9c05-78424fdd0e51', organization_id='org-00000000-0000-4000-8000-000000000000', created_by_id='user-00000000-0000-4000-8000-000000000000', last_updated_by_id='user-00000000-0000-4000-8000-000000000000'), Persona(value='You are a helpful assistant that loves emojis', limit=5000, template_name=None, is_template=False, label='persona', description=None, metadata_={}, id='block-4078e26a-9e11-4a7d-9bf6-16cad6514d88', organization_id='org-00000000-0000-4000-8000-000000000000', created_by_id='user-00000000-0000-4000-8000-000000000000', last_updated_by_id='user-00000000-0000-4000-8000-000000000000')], prompt_template='{% for block in blocks %}<{{ block.label }} characters="{{ block.value|length }}/{{ block.limit }}">\n{{ block.value }}\n</{{ block.label }}>{% if not loop.last %}\n{% endif %}{% endfor %}')

In [14]:
memory.get_block('human')

Human(value='My name is Sarah', limit=5000, template_name=None, is_template=False, label='human', description=None, metadata_={}, id='block-3c57b76e-8d1b-4bba-9c05-78424fdd0e51', organization_id='org-00000000-0000-4000-8000-000000000000', created_by_id='user-00000000-0000-4000-8000-000000000000', last_updated_by_id='user-00000000-0000-4000-8000-000000000000')

You can see how the memory is presented in the context window with `.compile()`, which uses the `prompt_template` to template the data: 

In [15]:
memory.compile()

'<human characters="16/5000">\nMy name is Sarah\n</human>\n<persona characters="45/5000">\nYou are a helpful assistant that loves emojis\n</persona>'
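
To see what the `prompt_template` is doing, the same rendering can be reproduced in plain Python. This is a sketch for illustration only: the `Block` dataclass and `render_core_memory` function are hypothetical stand-ins, not Letta classes, and the real `.compile()` renders the Jinja template shown in the `Memory` output above.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical stand-in for a core memory block."""
    label: str
    value: str
    limit: int = 5000


def render_core_memory(blocks):
    """Mimic the prompt_template: one tagged section per block, newline-separated."""
    parts = []
    for block in blocks:
        parts.append(
            f'<{block.label} characters="{len(block.value)}/{block.limit}">\n'
            f"{block.value}\n"
            f"</{block.label}>"
        )
    # The template emits a newline between blocks but not after the last one.
    return "\n".join(parts)


rendered = render_core_memory([
    Block(label="human", value="My name is Sarah"),
    Block(label="persona", value="You are a helpful assistant that loves emojis"),
])
print(rendered)
```

The `characters="16/5000"` counter is simply the length of the block's value over its limit, which lets the agent see how much room is left in each block.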

### Archival & Recall memory summaries
The agent also has access to external memories (stored in a database). There are two types of external memory: 
* *Archival memory*: Memories stored in a vector database that are either saved by the agent itself, or loaded in by the user
* *Recall memory*: The full conversational history of the agent

Both of these memory stores can be queried by the agent for RAG. To ensure the agent knows that these external memory stores may have relevant information, the context window contains a summary of the number of rows in both archival and recall memory. 

In [16]:
client.get_archival_memory_summary(agent_state.id)

ArchivalMemorySummary(size=0)

In [17]:
client.get_recall_memory_summary(agent_state.id)

RecallMemorySummary(size=7)

You can also directly query the full conversational history: 

In [18]:
client.get_messages(agent_state.id)

[Message(created_by_id='user-00000000-0000-4000-8000-000000000000', last_updated_by_id='user-00000000-0000-4000-8000-000000000000', created_at=datetime.datetime(2025, 1, 10, 6, 15, 33, 751993), updated_at=datetime.datetime(2025, 1, 10, 6, 15, 33), id='message-2d245c55-549e-4184-87f6-1d8738a0cf75', role=<MessageRole.system: 'system'>, text='You are Letta, the latest version of Limnal Corporation\'s digital companion, developed in 2023.\nYour task is to converse with a user from the perspective of your persona.\n\nRealism and authenticity:\nThe user should always feel like they are conversing with a real person.\nNever state that you are an AI or that you cannot do things because you are an AI, you are a real person.\nDo not discuss Limnal Corp. with the user, you do not know anything about Limnal Corp.\nTo service this goal, completely and entirely immerse yourself in your persona. You are your persona.\nThink like them, act like them, talk like them.\nIf your persona details include ex

## Step 4: Modifying core memory 
The core memory can adapt over time as new information is provided about the human (or about the agent itself). Letta agents have the ability to adapt their memory by modifying their context window.  

In [19]:
response = client.send_message(
    agent_id=agent_state.id, 
    message = "My name is actually Charles", 
    role = "user"
) 
nb_print(response.messages)

Letta.letta.services.agent_manager - INFO - Rebuilding system with new memory...
Diff:
--- 
+++ 
@@ -56,13 +56,13 @@
 
 Base instructions finished.
 From now on, you are going to act as your persona.
-### Memory [last modified: 2025-01-10 02:15:33 PM CST+0800]
+### Memory [last modified: 2025-01-10 02:20:14 PM CST+0800]
 0 previous messages between you and the user are stored in recall memory (use functions to access them)
 0 total memories you created are stored in archival memory (use functions to access them)
 
 Core memory shown below (limited in size, additional information stored in archival / recall memory):
-<human characters="16/5000">
-My name is Sarah
+<human characters="18/5000">
+My name is Charles
 </human>
 <persona characters="45/5000">
 You are a helpful assistant that loves emojis



Now we can see the updated core memory: 

In [20]:
client.get_core_memory(agent_state.id).get_block("human")

Human(value='My name is Charles', limit=5000, template_name=None, is_template=False, label='human', description=None, metadata_={}, id='block-3c57b76e-8d1b-4bba-9c05-78424fdd0e51', organization_id='org-00000000-0000-4000-8000-000000000000', created_by_id='user-00000000-0000-4000-8000-000000000000', last_updated_by_id='user-00000000-0000-4000-8000-000000000000')
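
The diff in the log output above has the shape of a standard unified diff. As a sketch of how such a diff can be produced, the snippet below compares the old and new `human` block with Python's `difflib`. Whether Letta itself uses `difflib` is an assumption; the snippet only reproduces the format.

```python
import difflib

# The human block before and after the agent edited its core memory.
before = '<human characters="16/5000">\nMy name is Sarah\n</human>'
after = '<human characters="18/5000">\nMy name is Charles\n</human>'

diff_lines = list(
    difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        lineterm="",  # inputs have no trailing newlines, so emit none either
    )
)
print("\n".join(diff_lines))
```

Lines starting with `-` were removed, lines starting with `+` were added, and lines starting with a space are unchanged context.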

## Step 5: Modifying archival memory 
The agent can also use the archival memory store to save memories. Since archival memory is a vector DB, we can also insert memories into it directly. This can be useful if you have external data sources that you want the agent to be able to connect to via memory. 

First, let's trigger the agent to write an archival memory: 

In [21]:
response = client.send_message(
    agent_id=agent_state.id, 
    message = "Save the information that 'bob loves cats' to archival", 
    role = "user"
) 
nb_print(response.messages)

We can also insert an archival memory manually: 

In [22]:
client.insert_archival_memory(
    agent_state.id, 
    "Bob's loves boston terriers"
)

[Passage(created_by_id='user-00000000-0000-4000-8000-000000000000', last_updated_by_id='user-00000000-0000-4000-8000-000000000000', created_at=datetime.datetime(2025, 1, 10, 6, 21, 6, 825968), updated_at=datetime.datetime(2025, 1, 10, 6, 21, 6), is_deleted=False, organization_id='org-00000000-0000-4000-8000-000000000000', agent_id='agent-c9b343f7-9d86-49e3-ad73-b014f6c53259', source_id=None, file_id=None, metadata_={}, id='passage-507e7cf5-194a-40d0-9ea4-e71344472cc4', text="Bob's loves boston terriers", embedding=[-0.01612803526222706, -0.07522200047969818, 0.05013394355773926, -0.011089341714978218, -0.01476821955293417, 0.04062578082084656, -0.03271988034248352, -0.022178683429956436, -0.00834335945546627, 0.005723871290683746, 0.01649697683751583, 0.013787887990474701, 0.011331789195537567, 0.04730890318751335, -0.01583288051187992, 0.002395487390458584, -0.029473192989826202, -0.010288210585713387, -0.014009254053235054, -0.04617045074701309, 0.02563619613647461, -0.00165760354138

Now, we can have the agent run RAG to answer a specific question: 

In [23]:
response = client.send_message(
    agent_id=agent_state.id, 
    role="user", 
    message="What animals do I like? Search archival."
)
nb_print(response.messages)
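
Conceptually, an archival search embeds the query, compares it against the stored passage embeddings, and returns the closest matches. The sketch below illustrates that retrieval step with a toy word-count "embedding" and cosine similarity. It is not how Letta embeds text (the passage output above shows dense float vectors from an embedding model), and `embed`, `cosine_similarity`, and `search_archival` are hypothetical names.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': lowercase word counts (a real system uses a neural embedding model)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_archival(query, passages, top_k=1):
    """Return the top_k passages most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(
        passages,
        key=lambda passage: cosine_similarity(query_vec, embed(passage)),
        reverse=True,
    )
    return ranked[:top_k]


passages = [
    "bob loves cats",
    "Bob's loves boston terriers",
    "The meeting is on Tuesday",
]
results = search_archival("which cats does bob like", passages)
print(results)
```

Because the toy embedding only matches exact words, it misses paraphrases that a learned embedding would catch. That gap is the main reason vector databases store model-generated embeddings instead.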