https://devblogs.microsoft.com/foundry/azure-ai-mem0-integration/


TL;DR
Learn how to integrate Mem0 with Azure AI Search and Azure OpenAI to create AI applications with persistent memory. This tutorial provides code examples for setting up a memory layer using Azure services and demonstrates how to build a travel planning assistant that remembers user preferences across conversations.

Introduction
One of the key limitations of most AI systems is their inability to maintain context beyond a single session. This lack of memory significantly impacts the quality of user interactions, often requiring users to repeat information they’ve already provided. Enter Mem0, a powerful memory layer designed specifically for AI applications.

In this guide, we’ll explore how to integrate Mem0 with Azure AI services to create AI applications with persistent memory. We’ll cover:

Setting up Mem0 with Azure AI Search and Azure OpenAI
Basic memory operations (storing, retrieving, and searching memories)
Building a practical travel planning assistant that remembers user preferences
Prerequisites
Azure OpenAI account with access to model deployments
Azure AI Search service
Python environment with required packages
First, let’s install the necessary packages:

In [24]:
%pip install mem0ai python-dotenv openai azure-identity azure_search azure_search_documents

Note: you may need to restart the kernel to use updated packages.


Setting Up Your Azure Environment
To get started, you’ll need to configure your Azure environment variables:

In [25]:
import os
from mem0 import Memory
from openai import AzureOpenAI

# Load Azure OpenAI configuration
AZURE_OPENAI_ENDPOINT = os.getenv("AZURE_OPENAI_ENDPOINT")
AZURE_OPENAI_API_KEY = os.getenv("AZURE_OPENAI_API_KEY")
AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME = os.getenv("AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME")
AZURE_OPENAI_EMBEDDING_DEPLOYED_MODEL_NAME = os.getenv("AZURE_OPENAI_EMBEDDING_DEPLOYED_MODEL_NAME")

# Load Azure AI Search configuration
SEARCH_SERVICE_ENDPOINT = os.getenv("AZURE_SEARCH_SERVICE_ENDPOINT")
SEARCH_SERVICE_API_KEY = os.getenv("AZURE_SEARCH_ADMIN_KEY")
SEARCH_SERVICE_NAME = "kmpzzcj6ybivvm-search"

# Create Azure OpenAI client
azure_openai_client = AzureOpenAI(
    azure_endpoint=AZURE_OPENAI_ENDPOINT,
    api_key=AZURE_OPENAI_API_KEY,
    api_version="2024-10-21"
)

Basic Memory Operations with Mem0 and Azure AI Search
Let’s start with a simple example demonstrating how to store and retrieve memories:

In [26]:
# Configure Mem0 with Azure AI Search
memory_config = {
    "vector_store": {
        "provider": "azure_ai_search",
        "config": {
            "service_name": SEARCH_SERVICE_NAME,
            "api_key": SEARCH_SERVICE_API_KEY,
            "collection_name": "memories",
            "embedding_model_dims": 3072,
        },
    },
    "embedder": {
        "provider": "azure_openai",
        "config": {
            "model": AZURE_OPENAI_EMBEDDING_DEPLOYED_MODEL_NAME,
            "embedding_dims": 3072,
            "azure_kwargs": {
                "api_version": "2024-10-21",
                "azure_deployment": AZURE_OPENAI_EMBEDDING_DEPLOYED_MODEL_NAME,
                "azure_endpoint": AZURE_OPENAI_ENDPOINT,
                "api_key": AZURE_OPENAI_API_KEY,
            },
        },
    },
    "llm": {
        "provider": "azure_openai",
        "config": {
            "model": AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME,
            "temperature": 0.1,
            "max_tokens": 2000,
            "azure_kwargs": {
                "azure_deployment": AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME,
                "api_version": "2024-10-21",
                "azure_endpoint": AZURE_OPENAI_ENDPOINT,
                "api_key": AZURE_OPENAI_API_KEY,
            },
        },
    },
    "version": "v1.1",
}

# Initialize memory
memory = Memory.from_config(memory_config)
print("Mem0 initialized with Azure AI Search")

Mem0 initialized with Azure AI Search


The configuration above sets up:

1) Vector Store: Using Azure AI Search to store and retrieve vectors
2) Embedder: Using Azure OpenAI to generate embeddings for semantic search
3) LLM: Using Azure OpenAI for language model capabilities
Storing Memories
With Mem0, you can store three types of memories:

1. Simple statements:

In [27]:
memory.add(
    "I enjoy hiking in national parks and taking landscape photos.",
    user_id="demo_user"
)

{'results': []}

2. Conversations:


In [28]:
conversation = [
    {"role": "user", "content": "Hello, I'm interested in planning a relaxing holiday."},
    {"role": "assistant", "content": "Great! Do you have a destination in mind or would you like some suggestions?"},
    {"role": "user", "content": "I'd like some suggestions for peaceful places to visit and local attractions."}
]
memory.add(conversation, user_id="demo_user")

{'results': []}

3. Memories with metadata:

In [29]:
memory.add(
    "I prefer window seats on long flights and usually bring my own headphones.",
    user_id="demo_user",
    metadata={"category": "travel_preferences", "importance": "medium"}
)

{'results': []}

**Searching Memories**

When you need to retrieve relevant memories, you can use the **search** method:

In [30]:
search_results = memory.search(
    "What are this user's travel plans?",
    user_id="demo_user",
    limit=3
)

for i, result in enumerate(search_results['results'], 1):
    print(f"{i}. {result['memory']} (Score: {result['score']:.4f})")

1. Interested in planning a relaxing holiday (Score: 0.6246)
2. Prefers window seats on long flights (Score: 0.5978)
3. Usually brings own headphones on flights (Score: 0.5915)


This will return memories sorted by relevance to the query, along with their similarity scores.

Retrieving All Memories
You can also retrieve all memories for a user:

In [31]:
all_memories = memory.get_all(user_id="demo_user")
print(f"Total memories: {len(all_memories['results'])}")

Total memories: 6


**Building a Travel Planning Assistant with Memory**

Now, let’s create a more practical example: a London travel planning assistant that remembers user preferences across conversations.

In [32]:
class TravelAssistant:
    def __init__(self, user_id):
        """Initialize travel assistant with memory for a specific user"""
        self.user_id = user_id

        # Configure memory for travel planning
        memory_config = {
            "vector_store": {
                "provider": "azure_ai_search",
                "config": {
                    "service_name": SEARCH_SERVICE_NAME,
                    "api_key": SEARCH_SERVICE_API_KEY,
                    "collection_name": "travel_memories",
                    "embedding_model_dims": 3072,
                    "compression_type": "binary",
                },
            },
            "llm": {
                "provider": "azure_openai",
                "config": {
                    "model": AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME,
                    "temperature": 0.7,
                    "max_tokens": 800,
                    "azure_kwargs": {
                        "azure_deployment": AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME,
                        "api_version": "2024-10-21",
                        "azure_endpoint": AZURE_OPENAI_ENDPOINT,
                        "api_key": AZURE_OPENAI_API_KEY,
                    },
                },
            },
            "embedder": {
                "provider": "azure_openai",
                "config": {
                    "model": AZURE_OPENAI_EMBEDDING_DEPLOYED_MODEL_NAME,
                    "embedding_dims": 3072,
                    "azure_kwargs": {
                        "api_version": "2024-10-21",
                        "azure_deployment": AZURE_OPENAI_EMBEDDING_DEPLOYED_MODEL_NAME,
                        "azure_endpoint": AZURE_OPENAI_ENDPOINT,
                        "api_key": AZURE_OPENAI_API_KEY,
                    },
                },
            },
            "version": "v1.1",
        }

        self.memory = Memory.from_config(memory_config)
        self.azure_client = azure_openai_client
        print(f"Travel Assistant initialized for user {user_id}")

    def get_response(self, query, memory_context=True):
        """Get response from Azure OpenAI with memory context"""
        # Retrieve relevant memories if enabled
        memory_text = ""
        if memory_context:
            memories = self.memory.search(query, user_id=self.user_id)
            if 'results' in memories and memories['results']:
                memory_text = "\n\nRelevant information from previous conversations:\n"
                for i, mem in enumerate(memories['results'][:5], 1):
                    memory_text += f"{i}. {mem['memory']}\n"
                print(f"Including {len(memories['results'][:5])} memories in context")
            else:
                print("No relevant memories found")

        # Construct messages with system prompt and memory context
        messages = [
            {
                "role": "system",
                "content": "You are a helpful travel assistant for London travel planning. "
                           "Be concise, specific, and helpful. Refer to the user by name when appropriate. "
                           "Recommend specific places when relevant."
            },
            {
                "role": "user",
                "content": f"{query}\n{memory_text}"
            }
        ]

        # Get response from Azure OpenAI
        response = self.azure_client.chat.completions.create(
            model=AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME,
            messages=messages,
            temperature=0.7,
            max_tokens=800,
        )

        # Extract response content
        response_content = response.choices[0].message.content

        # Store the conversation in memory
        conversation = [
            {"role": "user", "content": query},
            {"role": "assistant", "content": response_content}
        ]
        self.memory.add(conversation, user_id=self.user_id)

        return response_content

    def verify_memories(self):
        """Verify what memories have been stored"""
        all_memories = self.memory.get_all(user_id=self.user_id)
        print(f"\n===== STORED MEMORIES ({len(all_memories['results'])}) =====")
        for i, memory in enumerate(all_memories['results'], 1):
            print(f"{i}. {memory['memory']}")
        print("==============================\n")
        return all_memories

**Using the Travel Assistant**

Now, let’s put our travel assistant to work:

In [33]:
# Create travel assistant for a user
assistant = TravelAssistant(user_id="suresh_london_2025")

# First interaction - Initial inquiry
query1 = "Hi, my name is Suresh. I'm planning a business trip to London next month for about 5 days."
print(f"User: {query1}")
response1 = assistant.get_response(query1, memory_context=False)  # No memories yet
print(f"Assistant: {response1}\n")

# Second interaction - Specific question about fish and chips
query2 = "I need recommendations for fish and chips restaurants near London Bridge cause I love the taste!"
print(f"User: {query2}")
response2 = assistant.get_response(query2)  # Should use memory context
print(f"Assistant: {response2}\n")

# Verify what memories have been stored
assistant.verify_memories()

Travel Assistant initialized for user suresh_london_2025
User: Hi, my name is Suresh. I'm planning a business trip to London next month for about 5 days.
Assistant: Hi Suresh! That sounds exciting. For your business trip to London, here are a few tips and recommendations:

1. **Accommodation**: Consider staying in areas like the City of London or Canary Wharf if your meetings are business-oriented. These areas have excellent transport links and plenty of business-friendly hotels like The Ned or Hilton London Canary Wharf.

2. **Transport**: The London Underground (Tube) is efficient for getting around, but consider a contactless payment method or an Oyster card for convenience. For short distances, walking can be pleasant, especially in central areas.

3. **Dining**: If you're looking to impress clients or colleagues, try dining at places like The Ivy or Hawksmoor for a classic British dining experience. For more casual meetings, Dishoom offers a lively atmosphere with delicious Indian

{'results': [{'id': '3bab7819-17ef-4617-858e-abf6181f7258',
   'memory': 'Planning a business trip to London next month for about 5 days',
   'hash': '031c26def9826fc727a9376b558310ec',
   'metadata': None,
   'created_at': '2025-06-04T22:23:20.001106-07:00',
   'updated_at': None,
   'user_id': 'suresh_london_2025'},
  {'id': 'ab642a64-3f4d-4c58-9352-d5b62d4b7922',
   'memory': 'Name is Suresh',
   'hash': '1afd89fe2fb284397171eaab71ec42c1',
   'metadata': None,
   'created_at': '2025-06-04T22:23:19.730070-07:00',
   'updated_at': None,
   'user_id': 'suresh_london_2025'},
  {'id': 'ac063b30-e6db-412b-9a25-9a09f0e13f56',
   'memory': 'Needs recommendations for fish and chips restaurants near London Bridge',
   'hash': '1052f62dd8cb41ece8e968ce0c604aaf',
   'metadata': None,
   'created_at': '2025-06-04T22:23:26.382988-07:00',
   'updated_at': None,
   'user_id': 'suresh_london_2025'},
  {'id': '5f1617f0-5a1b-4eb0-b78f-aca5c3f9cbbc',
   'memory': 'Loves the taste of fish and chips',
  

This demonstration shows how the assistant:

1. Stores Suresh’s name and travel plans from the first interaction
2. Remembers these details in the second interaction
3. Uses the memory context to provide a personalized response

**Searching for Specific Preferences**

You can also directly search for specific user preferences:

In [34]:
search_query = "What are Farzad's preferences for food in London?"
search_results = assistant.memory.search(search_query, user_id="suresh_london_2025")
print(f"Found {len(search_results['results'])} relevant memories:")
for i, result in enumerate(search_results['results'][:3], 1):
    print(f"{i}. {result['memory']} (Score: {result['score']:.4f})")

Found 4 relevant memories:
1. Needs recommendations for fish and chips restaurants near London Bridge (Score: 0.6556)
2. Loves the taste of fish and chips (Score: 0.6482)
3. Planning a business trip to London next month for about 5 days (Score: 0.5847)


This might return results like:

Found 3 relevant memories:
1. Name is Suresh (Score: 0.6696)
2. Needs recommendations for fish and chips restaurants near London Bridge (Score: 0.6564)
3. Loves the taste of fish and chips (Score: 0.6324)

**The Power of Persistent Memory**

The key advantage of this approach is that the assistant maintains context across multiple interactions. By leveraging Mem0 with Azure AI services, we’ve created a system that:

1. Remembers user details: The assistant stores information like names, preferences, and travel plans
2. Personalizes responses: By retrieving relevant memories, the assistant can refer to the user by name and tailor recommendations
3. Improves over time: As more interactions occur, the system builds a more comprehensive understanding of the user
This persistent memory dramatically improves the user experience by eliminating the need to repeat information in every conversation.

**Conclusion**

Integrating Mem0 with Azure AI services opens up a world of possibilities for creating more personalized and context-aware AI applications. By maintaining user memories across interactions, we can build assistants that feel more intelligent and responsive to user needs.

This tutorial has shown you how to:

* Configure Mem0 with Azure AI Search and Azure OpenAI
* Store and retrieve different types of memories
* Build a practical travel assistant that remembers user preferences
As you implement this in your own applications, consider the different types of memories you might want to store and how they can be used to enhance user experiences. With Mem0’s flexible memory system, you can create AI applications that truly understand and adapt to individual users over time.

**Next Steps**

* Explore using different types of metadata to categorize memories
* Implement memory expiration or importance scoring
* Combine memories with other Azure services like Azure AI Services for more advanced features
For more information on Mem0 and its capabilities, visit the Mem0 documentation.
