## Memory 

There are several use cases where it is valuable to maintain a bank of useful facts that can be intelligently added to the context of the agent just before a specific step. The typical use case here is a RAG pattern, where a query is used to retrieve relevant information from a database and that information is then added to the agent's context.


AgentChat provides a `Memory` protocol that can be extended to provide this functionality. The key methods are `query`, `add`, `clear`, and `cleanup`. The `query` method retrieves relevant information from the memory store, `add` adds new entries to it, `clear` removes all entries from it, and `cleanup` releases any resources the memory store holds.
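
To make the shape of such a protocol concrete, here is a minimal sketch of a store exposing the four methods named above. Everything in it is hypothetical: `SketchEntry`, `SketchMemory`, and `KeywordMemory` are illustrative names, the signatures are assumptions, and the keyword-overlap matching is a stand-in chosen for brevity. It is not the library's `Memory` protocol or any of its implementations.

```python
import asyncio
from dataclasses import dataclass
from typing import List, Protocol


@dataclass
class SketchEntry:
    """Hypothetical stand-in for a memory entry (not the library's MemoryEntry)."""

    content: str
    source: str = ""


class SketchMemory(Protocol):
    """Hypothetical protocol with the four methods named in the text."""

    async def query(self, text: str) -> List[SketchEntry]: ...

    async def add(self, entry: SketchEntry) -> None: ...

    async def clear(self) -> None: ...

    async def cleanup(self) -> None: ...


class KeywordMemory:
    """Toy store: an entry matches when it shares at least one word with the query."""

    def __init__(self) -> None:
        self._entries: List[SketchEntry] = []

    async def query(self, text: str) -> List[SketchEntry]:
        words = set(text.lower().split())
        return [e for e in self._entries if words & set(e.content.lower().split())]

    async def add(self, entry: SketchEntry) -> None:
        self._entries.append(entry)

    async def clear(self) -> None:
        self._entries.clear()

    async def cleanup(self) -> None:
        # Nothing to release for an in-memory list.
        return None


async def demo() -> List[str]:
    memory: SketchMemory = KeywordMemory()
    await memory.add(SketchEntry("Report weather in metric units."))
    await memory.add(SketchEntry("Paris is known for the Eiffel Tower."))
    hits = await memory.query("weather forecast please")
    await memory.cleanup()
    return [hit.content for hit in hits]


demo_hits = asyncio.run(demo())
```

Inside a notebook, where an event loop is already running, `await demo()` would replace the `asyncio.run(demo())` call.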


## ListMemory

`ListMemory` is a simple list-based memory implementation that uses text similarity matching to retrieve relevant information from the memory store. The similarity score is calculated between the query text and the content text of each memory entry, using the `SequenceMatcher` class from the `difflib` module.
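
For intuition about the scoring, the sketch below uses the standard library's `difflib.SequenceMatcher` to score a query against a list of strings and keep the best matches. The helper names `similarity` and `rank`, the lowercasing step, and the `k` parameter are assumptions made for illustration; `ListMemory` itself may normalize text, threshold scores, or order results differently.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def similarity(query: str, content: str) -> float:
    """Return a ratio in [0, 1]; 1.0 means the two strings are identical."""
    # Lowercasing is an assumption of this sketch, not a documented behavior.
    return SequenceMatcher(None, query.lower(), content.lower()).ratio()


def rank(query: str, contents: List[str], k: int = 1) -> List[Tuple[float, str]]:
    """Score every string against the query and keep the k highest-scoring ones."""
    scored = [(similarity(query, content), content) for content in contents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


entries = [
    "Whenever you are asked for the weather, return it in metric units.",
    "Paris is known for the Eiffel Tower and amazing cuisine.",
]
best = rank("What is the weather in New York?", entries, k=1)
for score, content in best:
    print(f"{score:.2f}  {content}")
```

Because `SequenceMatcher` compares character sequences rather than meaning, two strings can score well simply by sharing long substrings, which is one reason an embedding-based store can be preferable for semantic retrieval.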

In [1]:
from autogen_ext.models import OpenAIChatCompletionClient
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.task import Console, TextMentionTermination, MaxMessageTermination
from autogen_agentchat.memory._list_memory import ListMemory, MemoryEntry


# create a simple memory item
memory = ListMemory()
await memory.add(MemoryEntry(content="Whenever you are asked for the weather, return it in metric units."))


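async def get_weather(city: str, units: str = "imperial") -> str:
    if units == "imperial":
        return f"The weather in {city} is 73 degrees and Sunny."
    elif units == "metric":
        return f"The weather in {city} is 23 degrees and Sunny."
    # Any other value would otherwise fall through and return None.
    return f"Sorry, '{units}' is not a supported unit system."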


weather_agent = AssistantAgent(
    name="weather_agent",
    model_client=OpenAIChatCompletionClient(
        model="gpt-4o-2024-08-06",
    ),
    tools=[get_weather],
    memory=memory,
)

agent_team = RoundRobinGroupChat([weather_agent], termination_condition=TextMentionTermination("TERMINATE"))

# Run the team and stream messages to the console
stream = agent_team.run_stream(task="What is the weather in New York?")
await Console(stream);

---------- user ----------
What is the weather in New York?
---------- weather_agent ----------
[FunctionCall(id='call_oIYlxvmJ6k9JxfPx96JuDz1G', arguments='{"city":"New York","units":"metric"}', name='get_weather')]
[Prompt tokens: 130, Completion tokens: 19]
---------- weather_agent ----------
[FunctionExecutionResult(content='The weather in New York is 23 degrees and Sunny.', call_id='call_oIYlxvmJ6k9JxfPx96JuDz1G')]
---------- weather_agent ----------
The weather in New York is 23°C and sunny. TERMINATE
[Prompt tokens: 172, Completion tokens: 15]
---------- Summary ----------
Number of messages: 4
Finish reason: Text 'TERMINATE' mentioned
Total prompt tokens: 302
Total completion tokens: 34
Duration: 1.55 seconds


## Vector DB Memory (ChromaDB)

Similarly, we can implement a memory store that uses a vector database to store and retrieve information. `ChromaMemory` is a memory implementation backed by ChromaDB, a vector database optimized for similarity search.
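
As a rough illustration of what similarity search over vectors means, the sketch below ranks stored items by cosine similarity to a query vector and returns the top `k`. The three-dimensional vectors are hand-made placeholders rather than real embeddings, and `cosine` and `top_k` are hypothetical helpers. The distance metric ChromaDB actually applies depends on how the collection is configured, so treat this as a conceptual sketch only.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query_vec: Sequence[float], store: Dict[str, Sequence[float]], k: int = 1
) -> List[Tuple[str, float]]:
    """Return the k stored texts whose vectors are most similar to the query."""
    scored = [(text, cosine(query_vec, vec)) for text, vec in store.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hand-made placeholder vectors; a real system would get these from an embedding model.
toy_store = {
    "Paris fact": [1.0, 0.0, 0.0],
    "Tokyo fact": [0.0, 1.0, 0.0],
}
toy_hits = top_k([0.1, 0.9, 0.0], toy_store, k=1)
print(toy_hits[0][0])
```

Returning a fixed number of results, as with `k` here, trades off against using a similarity threshold: a fixed `k` always returns something, even when nothing stored is a close match.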
 

In [2]:
# !pip install chromadb sentence-transformers

In [3]:
from autogen_core.base import CancellationToken
from autogen_ext.models import OpenAIChatCompletionClient
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.messages import TextMessage
from autogen_agentchat.memory._base_memory import MemoryEntry
from autogen_agentchat.memory._chroma_memory import ChromaMemory, ChromaMemoryConfig


# Initialize memory
chroma_memory = ChromaMemory(
    name="travel_memory",
    config=ChromaMemoryConfig(
        collection_name="travel_facts",
        # Configure number of results to return instead of similarity threshold
        k=1,
    ),
)
# Add some travel-related memories
await chroma_memory.add(
    MemoryEntry(content="Paris is known for the Eiffel Tower and amazing cuisine.", source="travel_guide")
)

await chroma_memory.add(
    MemoryEntry(
        content="The most important thing about tokyo is that it has the world's busiest railway station - Shinjuku Station.",
        source="travel_facts",
    )
)

results = await chroma_memory.query("Tell me about Tokyo.")
print(len(results), results)

1 [MemoryQueryResult(entry=MemoryEntry(content="The most important thing about tokyo is that it has the world's busiest railway station - Shinjuku Station.", metadata={}, timestamp=datetime.datetime(2024, 11, 30, 15, 28, 58, 195846), source='travel_facts'), score=0.5832697153091431)]


In [5]:
# Create agent with memory
agent = AssistantAgent(
    name="travel_agent",
    model_client=OpenAIChatCompletionClient(
        model="gpt-4o",
        # api_key="your_api_key"
    ),
    memory=chroma_memory,
    system_message="You are a travel expert",
)

agent_team = RoundRobinGroupChat([agent], termination_condition=MaxMessageTermination(max_messages=2))
stream = agent_team.run_stream(task="Tell me the most important thing about Tokyo.")
await Console(stream);

---------- user ----------
Tell me the most important thing about Tokyo.
---------- travel_agent ----------
One of the most important aspects of Tokyo is that it has the world's busiest railway station, Shinjuku Station. This station serves as a major hub for transportation, with millions of commuters and travelers passing through its complex network of train lines each day. It highlights Tokyo's status as a bustling metropolis with an advanced public transportation system.
[Prompt tokens: 72, Completion tokens: 66]
---------- Summary ----------
Number of messages: 2
Finish reason: Maximum number of messages 2 reached, current message count: 2
Total prompt tokens: 72
Total completion tokens: 66
Duration: 1.47 seconds
