# Neptune as Graph Memory

In this notebook, we connect to an Amazon Neptune Analytics instance and use it as the graph memory store for Mem0.

The Graph Memory store persists memories as a graph of entities and relationships when you call `m.add`. During an `m.search` operation, it uses vector distance algorithms to find related memories. Relationships are returned in the result and add context to the memories.

Reference: [Vector Similarity using Neptune Analytics](https://docs.aws.amazon.com/neptune-analytics/latest/userguide/vector-similarity.html)
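
To make "vector distance" concrete, here is a minimal pure-Python sketch of cosine similarity with a threshold filter. The three-dimensional vectors, the node names, and the 0.7 threshold are made up for illustration; in this notebook the embeddings come from Amazon Bedrock and the comparison runs inside Neptune Analytics, not in Python.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def nodes_above_threshold(query_vec, node_vecs, threshold):
    """Score every node against the query and keep those at or above the threshold."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in node_vecs.items()]
    kept = [(name, score) for name, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings (real ones have 1024 dimensions in this notebook).
node_vecs = {
    "gremlin": [0.9, 0.1, 0.0],
    "traversal": [0.7, 0.6, 0.1],
    "sci-fi": [0.0, 0.2, 0.9],
}
query_vec = [1.0, 0.2, 0.0]

matches = nodes_above_threshold(query_vec, node_vecs, threshold=0.7)
for name, score in matches:
    print(f"{name}: {score:.3f}")
```

With these toy values, `gremlin` and `traversal` pass the threshold and `sci-fi` is filtered out, which is the same kind of cut the graph search applies before it collects relationships.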

## Prerequisites

### 1. Install Mem0 with Graph Memory support

To use Mem0 with Graph Memory support (as well as other Amazon services), use pip install:

```bash
pip install "mem0ai[graph,extras]"
```

This command installs Mem0 along with the necessary dependencies for graph functionality (`graph`) and other Amazon dependencies (`extras`).

### 2. Connect to Amazon services

For this sample notebook, configure `mem0ai` with [Amazon Neptune Analytics](https://docs.aws.amazon.com/neptune-analytics/latest/userguide/what-is-neptune-analytics.html) as the graph store, [Amazon OpenSearch Serverless](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless-overview.html) as the vector store, and [Amazon Bedrock](https://docs.aws.amazon.com/bedrock/latest/userguide/what-is-bedrock.html) for generating embeddings.

Use the following guide for setup details: [Setup AWS Bedrock, AOSS, and Neptune](https://docs.mem0.ai/examples/aws_example#aws-bedrock-and-aoss)

Your configuration should look similar to:

```python
config = {
    "embedder": {
        "provider": "aws_bedrock",
        "config": {
            "model": "amazon.titan-embed-text-v2:0"
        }
    },
    "llm": {
        "provider": "aws_bedrock",
        "config": {
            "model": "us.anthropic.claude-3-7-sonnet-20250219-v1:0",
            "temperature": 0.1,
            "max_tokens": 2000
        }
    },
    "vector_store": {
        "provider": "opensearch",
        "config": {
            "collection_name": "mem0",
            "host": "your-opensearch-domain.us-west-2.es.amazonaws.com",
            "port": 443,
            "http_auth": auth,
            "connection_class": RequestsHttpConnection,
            "pool_maxsize": 20,
            "use_ssl": True,
            "verify_certs": True,
            "embedding_model_dims": 1024,
        }
    },
    "graph_store": {
        "provider": "neptune",
        "config": {
            "endpoint": "neptune-graph://my-graph-identifier",
        },
    },
}
```

## Setup

Import all packages and set up logging:

In [1]:
from mem0 import Memory
import os
import logging
import sys
import boto3
from opensearchpy import RequestsHttpConnection, AWSV4SignerAuth
from dotenv import load_dotenv

load_dotenv()

# logging.getLogger("mem0.graphs.neptune.main").setLevel(logging.DEBUG)
# logging.getLogger("mem0.graphs.neptune.base").setLevel(logging.DEBUG)
# logger = logging.getLogger(__name__)
# logger.setLevel(logging.DEBUG)

logging.basicConfig(
    format="%(levelname)s - %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
    stream=sys.stdout,  # Explicitly set output to stdout
)

Set up the Mem0 configuration using:
- Amazon Bedrock as the embedder and LLM provider
- An Amazon Neptune Analytics instance as the graph store
- OpenSearch as the vector store

In [9]:
bedrock_embedder_model = "amazon.titan-embed-text-v2:0"
bedrock_llm_model = "us.anthropic.claude-3-7-sonnet-20250219-v1:0"
embedding_model_dims = 1024

graph_identifier = os.environ.get("GRAPH_ID")

opensearch_host = os.environ.get("OS_HOST")
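# Environment variables are strings; convert the port to an integer (443 matches the sample config above)
opensearch_port = int(os.environ.get("OS_PORT", "443"))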

credentials = boto3.Session().get_credentials()
region = os.environ.get("AWS_REGION")
auth = AWSV4SignerAuth(credentials, region)

config = {
    "embedder": {
        "provider": "aws_bedrock",
        "config": {
            "model": bedrock_embedder_model,
        }
    },
    "llm": {
        "provider": "aws_bedrock",
        "config": {
            "model": bedrock_llm_model,
            "temperature": 0.1,
            "max_tokens": 2000
        }
    },
    "vector_store": {
        "provider": "opensearch",
        "config": {
            "collection_name": "mem0ai_vector_demo",
            "host": opensearch_host,
            "port": opensearch_port,
            "http_auth": auth,
            "embedding_model_dims": embedding_model_dims,
            "use_ssl": True,
            "verify_certs": True,
            "connection_class": RequestsHttpConnection,
        },
    },
    "graph_store": {
        "provider": "neptune",
        "config": {
            "endpoint": f"neptune-graph://{graph_identifier}",
        },
    },
}
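
The cell above reads `GRAPH_ID`, `OS_HOST`, `OS_PORT`, and `AWS_REGION` from the environment. Because `os.environ.get` returns `None` for an unset variable, a missing `GRAPH_ID` would silently produce the endpoint `neptune-graph://None`. A small guard can catch this before the configuration is built. The helper name `missing_env_vars` and the example values are hypothetical and not part of Mem0.

```python
import os

# The variable names used by the configuration cell in this notebook.
REQUIRED_VARS = ("GRAPH_ID", "OS_HOST", "OS_PORT", "AWS_REGION")


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the given mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# An explicit mapping with made-up values, so the sketch runs without real settings.
example_env = {"GRAPH_ID": "g-example", "OS_HOST": "", "AWS_REGION": "us-west-2"}

missing = missing_env_vars(REQUIRED_VARS, example_env)
print(missing)  # ['OS_HOST', 'OS_PORT']
```

In the notebook itself, you could call `missing_env_vars(REQUIRED_VARS)` right after `load_dotenv()` and raise an error if the returned list is not empty.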

## Graph Memory initialization

Initialize Mem0 with Neptune Analytics as the Graph Memory store, then clear any memories left over for this user from earlier runs:

In [10]:
m = Memory.from_config(config_dict=config)

user_id = "Andrew"

m.delete_all(user_id=user_id)



{'message': 'Memories deleted successfully!'}

## Store memories

Create memories and store them one at a time:

In [12]:
messages = [
    {
        "role": "user",
        "content": "Gremlin is the graph traversal language of Apache TinkerPop. Gremlin is a functional, data-flow language that enables users to succinctly express complex traversals on (or queries of) their application's property graph. Every Gremlin traversal is composed of a sequence of (potentially nested) steps. A step performs an atomic operation on the data stream. Every step is either a map-step (transforming the objects in the stream), a filter-step (removing objects from the stream), or a sideEffect-step (computing statistics about the stream). The Gremlin step library extends on these 3-fundamental operations to provide users a rich collection of steps that they can compose in order to ask any conceivable question they may have of their data for Gremlin is Turing Complete.",
    },
]

# Store inferred memories (default behavior)
result = m.add(messages, user_id=user_id, metadata={"category": "gremlin"})

all_results = m.get_all(user_id=user_id)
print(f"all_results: {all_results}")
# for n in all_results["results"]:
#     print(f"node \"{n['memory']}\": [hash: {n['hash']}]")
#
# for e in all_results["relations"]:
#     print(f"edge \"{e['source']}\" --{e['relationship']}--> \"{e['target']}\"")

all_results: {'results': [], 'relations': [{'source': 'map-step', 'relationship': 'transforms', 'target': 'property_graph'}, {'source': 'map-step', 'relationship': 'is_type_of', 'target': 'step'}, {'source': 'traversal', 'relationship': 'composed_of', 'target': 'step'}, {'source': 'filter-step', 'relationship': 'removes_objects_from', 'target': 'property_graph'}, {'source': 'filter-step', 'relationship': 'is_type_of', 'target': 'step'}, {'source': 'step', 'relationship': 'operates_on', 'target': 'traversal'}, {'source': 'step', 'relationship': 'performs_operation_on', 'target': 'property_graph'}, {'source': 'gremlin', 'relationship': 'is', 'target': 'turing_complete'}, {'source': 'gremlin', 'relationship': 'is_turing_complete', 'target': 'gremlin'}, {'source': 'gremlin', 'relationship': 'queries', 'target': 'property_graph'}, {'source': 'gremlin', 'relationship': 'is', 'target': 'graph_traversal'}, {'source': 'gremlin', 'relationship': 'enables', 'target': 'graph_traversal'}, {'source'
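
The `relations` list in the output above is easier to read when edges are grouped by their source node. The sketch below does that with plain dictionaries. The sample relations are copied from the output above, and `group_by_source` is a hypothetical helper, not a Mem0 function.

```python
from collections import defaultdict


def group_by_source(relations):
    """Map each source node to its list of (relationship, target) pairs."""
    grouped = defaultdict(list)
    for rel in relations:
        grouped[rel["source"]].append((rel["relationship"], rel["target"]))
    return dict(grouped)


# A few relations taken from the get_all output shown above.
sample_relations = [
    {"source": "map-step", "relationship": "is_type_of", "target": "step"},
    {"source": "filter-step", "relationship": "is_type_of", "target": "step"},
    {"source": "traversal", "relationship": "composed_of", "target": "step"},
    {"source": "gremlin", "relationship": "queries", "target": "property_graph"},
    {"source": "gremlin", "relationship": "is", "target": "turing_complete"},
]

grouped = group_by_source(sample_relations)
for source, edges in grouped.items():
    for relationship, target in edges:
        print(f'"{source}" --{relationship}--> "{target}"')
```

The same function should work on `all_results["relations"]` directly, assuming each entry carries the `source`, `relationship`, and `target` keys seen in the output above.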

In [None]:
all_results = m.get_all(user_id=user_id)
print(f"all_results: {all_results}")

In [None]:
messages = [
    {
        "role": "user",
        "content": "Gremlin is the graph traversal language of Apache TinkerPop. Gremlin is a functional, data-flow language that enables users to succinctly express complex traversals on (or queries of) their application's property graph. Every Gremlin traversal is composed of a sequence of (potentially nested) steps. A step performs an atomic operation on the data stream. Every step is either a map-step (transforming the objects in the stream), a filter-step (removing objects from the stream), or a sideEffect-step (computing statistics about the stream). The Gremlin step library extends on these 3-fundamental operations to provide users a rich collection of steps that they can compose in order to ask any conceivable question they may have of their data for Gremlin is Turing Complete.",
    },
    {
        "role": "assistant",
        "content": "You're absolutely right — that’s a solid summary of Gremlin and how it operates within the Apache TinkerPop graph computing framework."
    },
    {
        "role": "user",
        "content": "Gremlin was designed according to the 'write once, run anywhere'-philosophy. This means that not only can all TinkerPop-enabled graph systems execute Gremlin traversals, but also, every Gremlin traversal can be evaluated as either a real-time database query or as a batch analytics query. The former is known as an online transactional process (OLTP) and the latter as an online analytics process (OLAP). This universality is made possible by the Gremlin traversal machine. This distributed, graph-based virtual machine understands how to coordinate the execution of a multi-machine graph traversal. Moreover, not only can the execution either be OLTP or OLAP, it is also possible for certain subsets of a traversal to execute OLTP while others via OLAP. The benefit is that the user does not need to learn both a database query language and a domain-specific BigData analytics language (e.g. Spark DSL, MapReduce, etc.). Gremlin is all that is required to build a graph-based application because the Gremlin traversal machine will handle the rest."
    },
    {
        "role": "assistant",
        "content": "Exactly — what you’ve described here is one of Gremlin’s most powerful and distinctive design features: its platform-agnostic, dual-execution capability enabled by the Gremlin traversal machine."
    },
    {
        "role": "user",
        "content": "A Gremlin traversal can be written in either an imperative (procedural) manner, a declarative (descriptive) manner, or in a hybrid manner containing both imperative and declarative aspects. An imperative Gremlin traversal tells the traversers how to proceed at each step in the traversal. For instance, the imperative traversal in the first box first places a traverser at the vertex denoting Gremlin. That traverser then splits itself across all of Gremlin's collaborators that are not Gremlin himself. Next, the traversers walk to the managers of those collaborators to ultimately be grouped into a manager name count distribution. This traversal is imperative in that it tells the traversers to 'go here and then go there' in an explicit, procedural manner."
    },
    {
        "role": "user",
        "content": "A declarative Gremlin traversal does not tell the traversers the order in which to execute their walk, but instead, allows each traverser to select a pattern to execute from a collection of (potentially nested) patterns. The declarative traversal in the second box yields the same result as the imperative traversal above. However, the declarative traversal has the added benefit that it leverages not only a compile-time query planner (like imperative traversals), but also a runtime query planner that chooses which traversal pattern to execute next based on the historic statistics of each pattern -- favoring those patterns which tend to reduce/filter the most data."
    },
    {
        "role": "assistant",
        "content": "Yes — exactly. Gremlin's flexibility in supporting imperative, declarative, and hybrid styles is one of its key strengths, especially when working with graph-based systems where both traversal logic and expressive querying are important."
    },
    {
        "role": "user",
        "content": "The user can write their traversals in any way they choose. However, ultimately when their traversal is compiled, and depending on the underlying execution engine (i.e. an OLTP graph database or an OLAP graph processor), the user's traversal is rewritten by a set of traversal strategies which do their best to determine the most optimal execution plan based on an understanding of graph data access costs as well as the underlying data systems's unique capabilities (e.g. fetch the Gremlin vertex from the graph database's 'name'-index). Gremlin has been designed to give users flexibility in how they express their queries and graph system providers flexibility in how to efficiently evaluate traversals against their TinkerPop-enabled data system."
    },
    {
        "role": "assistant",
        "content": "Exactly — this is a key strength of Gremlin and the TinkerPop architecture."
    },
]

# Store inferred memories (default behavior)
result = m.add(messages, user_id=user_id, metadata={"category": "gremlin"})

all_results = m.get_all(user_id=user_id)
for n in all_results["results"]:
    print(f"node \"{n['memory']}\": [hash: {n['hash']}]")

for e in all_results["relations"]:
    print(f"edge \"{e['source']}\" --{e['relationship']}--> \"{e['target']}\"")

In [None]:
messages = [
    {
        "role": "user",
        "content": "I'm not a big fan of thriller movies but I love sci-fi movies.",
    },
]

# Store inferred memories (default behavior)
result = m.add(messages, user_id=user_id, metadata={"category": "movie_recommendations"})

all_results = m.get_all(user_id=user_id)
for n in all_results["results"]:
    print(f"node \"{n['memory']}\": [hash: {n['hash']}]")

for e in all_results["relations"]:
    print(f"edge \"{e['source']}\" --{e['relationship']}--> \"{e['target']}\"")

In [None]:
messages = [
    {
        "role": "assistant",
        "content": "Got it! I'll avoid thriller recommendations and suggest sci-fi movies in the future.",
    },
]

# Store inferred memories (default behavior)
result = m.add(messages, user_id=user_id, metadata={"category": "movie_recommendations"})

all_results = m.get_all(user_id=user_id)
for n in all_results["results"]:
    print(f"node \"{n['memory']}\": [hash: {n['hash']}]")

for e in all_results["relations"]:
    print(f"edge \"{e['source']}\" --{e['relationship']}--> \"{e['target']}\"")

## Search memories

Search all memories for "what does Gremlin do?". Because the stored conversation describes Gremlin, the search should surface relationships around the `gremlin` entity, such as its links to `graph_traversal` and `property_graph`.

In [15]:
search_results = m.search("what does Gremlin do?", user_id=user_id)
for result in search_results["results"]:
    print(f"\"{result['memory']}\" [score: {result['score']}]")
for relation in search_results["relations"]:
    print(f"{relation}")

DEBUG - _search_graph_db
  query=
            MATCH (n )
            WHERE n.user_id = $user_id
            WITH n, $n_embedding as n_embedding
            CALL neptune.algo.vectors.distanceByEmbedding(
                n_embedding,
                n,
                {metric:"CosineSimilarity"}
            ) YIELD distance
            WITH n, distance as similarity
            WHERE similarity >= $threshold
            CALL {
                WITH n
                MATCH (n)-[r]->(m) 
                RETURN n.name AS source, id(n) AS source_id, type(r) AS relationship, id(r) AS relation_id, m.name AS destination, id(m) AS destination_id
                UNION ALL
                WITH n
                MATCH (m)-[r]->(n) 
                RETURN m.name AS source, id(m) AS source_id, type(r) AS relationship, id(r) AS relation_id, n.name AS destination, id(n) AS destination_id
            }
            WITH distinct source, source_id, relationship, relation_id, destination, destination_id, si
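
The visible part of the logged query does two things after the similarity filter: for each matched node it collects outgoing edges and incoming edges (the two branches of `UNION ALL`), and then it keeps only distinct rows. The sketch below mimics that logic on an in-memory edge list. It is an illustration only: `edges_touching` is a hypothetical helper, the edges are taken from the `get_all` output earlier in this notebook, and the rest of the query is cut off in the log, so it is not modeled here.

```python
def edges_touching(matched_nodes, edges):
    """Collect distinct (source, relationship, destination) triples that start
    or end at one of the matched nodes, preserving first-seen order."""
    seen = set()
    ordered = []
    for node in matched_nodes:
        outgoing = [edge for edge in edges if edge[0] == node]  # (n)-[r]->(m)
        incoming = [edge for edge in edges if edge[2] == node]  # (m)-[r]->(n)
        for edge in outgoing + incoming:  # UNION ALL of both directions
            if edge not in seen:  # DISTINCT
                seen.add(edge)
                ordered.append(edge)
    return ordered


edges = [
    ("gremlin", "queries", "property_graph"),
    ("traversal", "composed_of", "step"),
    ("step", "operates_on", "traversal"),
    ("map-step", "is_type_of", "step"),
]

# Pretend the similarity filter matched these two nodes.
result = edges_touching(["traversal", "step"], edges)
for source, relationship, destination in result:
    print(f"{source} --{relationship}--> {destination}")
```

The edge between `traversal` and `step` is reachable from both matched nodes, which is why the deduplication step matters.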

In [7]:
m.delete_all(user_id=user_id)
m.reset()



## Conclusion

In this example, we demonstrated how an AWS tech stack can store and retrieve memory context. Amazon Bedrock models interpret the given conversations, OpenSearch stores text chunks with their vector embeddings, and Neptune Analytics stores the memories in graph form, as entities connected by relationships.