[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Hawksight-AI/semantica/blob/main/cookbook/introduction/19_Context_Module.ipynb)

# Context Module: Practical Guide

Semantica's `context` module is the layer that makes an agent "stateful". It combines:

- **Memory** (short-term + long-term) via `AgentMemory`
- **Graph context** via `ContextGraph`
- **Hybrid retrieval** (vector + memory + graph) via `ContextRetriever`
- **High-level UX** via `AgentContext` (recommended entry point)
- **Entity linking** via `EntityLinker`
- **Extensibility + config** via `registry` and `config`

This notebook focuses on small, runnable examples and keeps imports scoped to each cell.

In [1]:
!pip install -q semantica



## 1) Vector store (for long-term memory)

The `VectorStore` can generate embeddings via its internal embedder. If no embedder is available in your environment, it falls back to random vectors so the API stays usable for demos.

In [2]:
from semantica.vector_store import VectorStore

vs = VectorStore(backend="inmemory", dimension=384)

if getattr(vs, "embedder", None) and hasattr(vs.embedder, "set_text_model"):
    vs.embedder.set_text_model(method="fastembed", model_name="BAAI/bge-small-en-v1.5")

vs.backend, vs.dimension

fastembed not available. Install with: pip install fastembed. Using fallback embedding method.


('inmemory', 384)
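To make the fallback behaviour concrete, here is a minimal sketch of an in-memory similarity search over random vectors. It is not Semantica's implementation: `random_embedding`, `cosine_similarity`, and `search` are hypothetical helpers, and the seeded random vectors only stand in for a fallback embedder. The point is that random vectors keep the pipeline running but carry no semantic meaning, and that a cosine computation should guard against zero-length vectors.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity that returns 0.0 instead of dividing by a zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def random_embedding(dimension, seed):
    """Hypothetical stand-in for a fallback embedder: a seeded random vector."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(dimension)]


def search(index, query_vector, top_k=3):
    """Rank stored (text, vector) pairs by similarity to the query vector."""
    scored = [(text, cosine_similarity(vec, query_vector)) for text, vec in index]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


index = [
    ("python", random_embedding(384, seed=1)),
    ("pytorch", random_embedding(384, seed=2)),
]

# Reusing seed=1 reproduces the "python" vector, so it ranks first.
hits = search(index, random_embedding(384, seed=1), top_k=1)
```

With a real embedder in place, the same ranking logic would reflect semantic closeness rather than seed identity.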

## 2) Quick start with `AgentContext` (recommended)

`AgentContext` is the user-friendly interface that ties memory, vector store, and graph together. If you pass a `ContextGraph`, the system can do GraphRAG-style retrieval.

In [3]:
from semantica.context import AgentContext, ContextGraph

kg = ContextGraph()
context = AgentContext(vector_store=vs, knowledge_graph=kg)

context.config

{'retention_days': 30,
 'max_memories': 10000,
 'use_graph_expansion': True,
 'max_expansion_hops': 2,
 'hybrid_alpha': 0.5}
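A parameter named `hybrid_alpha` is commonly used as a linear weight between two score sources. The sketch below shows that reading, under the assumption (not confirmed by this notebook) that the value balances a vector score against a graph score. `blend_scores` is a hypothetical helper, not part of Semantica.

```python
def blend_scores(vector_score, graph_score, alpha=0.5):
    """Linear interpolation: alpha=1.0 is vector-only, alpha=0.0 is graph-only.

    Assumption: this mirrors one common meaning of a "hybrid alpha";
    the library's actual formula may differ.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be within [0, 1]")
    return alpha * vector_score + (1.0 - alpha) * graph_score


balanced = blend_scores(0.8, 0.4, alpha=0.5)
vector_only = blend_scores(0.8, 0.4, alpha=1.0)
graph_only = blend_scores(0.8, 0.4, alpha=0.0)
```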

## 3) Store and retrieve memory

A single string is treated as one memory item. Pass `conversation_id` and `user_id` as keyword arguments; as the output below shows, they are merged into the item's `metadata` alongside any keys you supply.

In [4]:
memory_id = context.store(
    "User prefers short answers about Python.",
    conversation_id="conv_1",
    user_id="user_1",
    metadata={"type": "preference"},
)

context.get_memory(memory_id)

| Status | Action | Module | Submodule | File | Time |
|---|---|---|---|---|---|
| ✅ | Semantica is processing | 🔗 context | AgentMemory | - | 0.01s |
| ✅ | Semantica is embedding | 💾 embeddings | TextEmbedder | - | 0.00s |
| ✅ | Semantica is indexing | 📊 vector_store | VectorStore | - | 0.00s |
| ✅ | Semantica is processing | 🔗 context | ContextRetriever | - | 0.02s |
| ✅ | Semantica is processing | 🔗 context | EntityLinker | - | 0.00s |


{'id': None,
 'content': 'User prefers short answers about Python.',
 'timestamp': '2025-12-18T20:46:17.265703',
 'metadata': {'type': 'preference',
  'conversation_id': 'conv_1',
  'user_id': 'user_1'}}
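The stored record above carries `conversation_id` and `user_id` inside `metadata`. A minimal sketch of how such a merge could work is shown here; `build_memory_record` is a hypothetical helper written for illustration, and the real `store` method may assemble its records differently.

```python
from datetime import datetime


def build_memory_record(content, conversation_id=None, user_id=None, metadata=None):
    """Copy the caller's metadata and fold the conversation and user IDs into it."""
    merged = dict(metadata or {})  # copy so the caller's dict is not mutated
    if conversation_id is not None:
        merged["conversation_id"] = conversation_id
    if user_id is not None:
        merged["user_id"] = user_id
    return {
        "content": content,
        "timestamp": datetime.now().isoformat(),
        "metadata": merged,
    }


original_metadata = {"type": "preference"}
record = build_memory_record(
    "User prefers short answers about Python.",
    conversation_id="conv_1",
    user_id="user_1",
    metadata=original_metadata,
)
```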

In [5]:
context.store(
    "User is working on Semantica context module examples.",
    conversation_id="conv_1",
    user_id="user_1",
    metadata={"type": "note"},
)

context.retrieve("Python answers", max_results=3)



[{'content': 'User prefers short answers about Python.',
  'score': 1.0,
  'source': 'short_term',
  'metadata': {'type': 'preference',
   'conversation_id': 'conv_1',
   'user_id': 'user_1'},
  'related_entities': []}]

In [6]:
context.conversation("conv_1", max_items=10)

[]

## 4) Export, save, load

`AgentContext` includes simple persistence helpers. This example uses a temporary directory.

In [7]:
export_json = context.export(conversation_id="conv_1", format="json")
export_json[:300]

'{\n  "exported_at": "2025-12-18T20:46:27.455602",\n  "count": 2,\n  "memories": [\n    {\n      "memory_id": "mem_5c7dba9a7373",\n      "content": "User prefers short answers about Python.",\n      "timestamp": "2025-12-18T20:46:17.265703",\n      "metadata": {\n        "type": "preference",\n        "convers'

In [8]:
import tempfile

with tempfile.TemporaryDirectory() as d:
    context.save(d)
    context.load(d)

context.conversation_summary("conv_1")

{'conversation_id': 'conv_1',
 'message_count': 0,
 'first_message': None,
 'last_message': None}
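The export shown earlier has three top-level keys: `exported_at`, `count`, and `memories`. The sketch below reproduces that shape with the standard library and round-trips it through a temporary directory. The file name `memories.json`, the ID `mem_demo`, and both helper functions are assumptions made for this example; they do not describe what `context.save` writes to disk.

```python
import json
import tempfile
from datetime import datetime
from pathlib import Path


def export_memories(memories):
    """Serialize memories using the same top-level keys as the export above."""
    payload = {
        "exported_at": datetime.now().isoformat(),
        "count": len(memories),
        "memories": memories,
    }
    return json.dumps(payload, indent=2)


def save_and_reload(memories, directory):
    """Write the export to a (hypothetical) memories.json and read it back."""
    path = Path(directory) / "memories.json"
    path.write_text(export_memories(memories), encoding="utf-8")
    return json.loads(path.read_text(encoding="utf-8"))


memories = [
    {
        "memory_id": "mem_demo",  # hypothetical ID for illustration
        "content": "User prefers short answers about Python.",
        "metadata": {"type": "preference", "conversation_id": "conv_1"},
    }
]

with tempfile.TemporaryDirectory() as d:
    reloaded = save_and_reload(memories, d)
```

Note that anything read back must be loaded before the `with` block exits, because the temporary directory is deleted at that point.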

## 5) Store documents and build a context graph

If you store a list, `AgentContext.store(...)` treats it as documents. To keep this notebook lightweight and deterministic, we pass pre-extracted entities and relationships per document.

In [9]:
documents = [
    {
        "id": "doc_1",
        "content": "Python is used for machine learning.",
        "metadata": {"source": "docs"},
        "entities": [
            {"id": "e_python", "text": "Python", "type": "PROGRAMMING_LANGUAGE"},
            {"id": "e_ml", "text": "Machine Learning", "type": "CONCEPT"},
        ],
        "relationships": [
            {
                "source_id": "e_python",
                "target_id": "e_ml",
                "type": "used_for",
                "confidence": 0.9,
            }
        ],
    },
    {
        "id": "doc_2",
        "content": "PyTorch is a machine learning framework.",
        "metadata": {"source": "docs"},
        "entities": [
            {"id": "e_pytorch", "text": "PyTorch", "type": "FRAMEWORK"},
            {"id": "e_ml", "text": "Machine Learning", "type": "CONCEPT"},
        ],
        "relationships": [
            {
                "source_id": "e_pytorch",
                "target_id": "e_ml",
                "type": "implements",
                "confidence": 0.95,
            }
        ],
    },
]

stats = context.store(
    documents,
    extract_entities=False,
    extract_relationships=False,
    link_entities=True,
)

stats

{'stored_count': 2,
 'memory_ids': ['mem_b467da01b18a', 'mem_e3dbd310355d'],
 'graph_nodes': 0,
 'graph_edges': 0}

In [None]:
kg.stats()
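When you supply pre-extracted entities and relationships by hand, a relationship can easily point at an entity ID that no document declares. The following sketch checks for that before storing. `find_dangling_relationships` is a hypothetical helper, and it assumes the same document layout used in the cell above (`entities` with `id`, `relationships` with `source_id` and `target_id`).

```python
def find_dangling_relationships(documents):
    """Return (doc_id, field, entity_id) for endpoints no document declares."""
    known_ids = {
        entity["id"]
        for doc in documents
        for entity in doc.get("entities", [])
    }
    dangling = []
    for doc in documents:
        for rel in doc.get("relationships", []):
            for field in ("source_id", "target_id"):
                if rel[field] not in known_ids:
                    dangling.append((doc["id"], field, rel[field]))
    return dangling


sample_documents = [
    {
        "id": "doc_1",
        "entities": [{"id": "e_python"}, {"id": "e_ml"}],
        "relationships": [{"source_id": "e_python", "target_id": "e_ml"}],
    },
    {
        "id": "doc_2",
        "entities": [{"id": "e_pytorch"}],
        # "e_missing" is deliberately undeclared to trigger the check.
        "relationships": [{"source_id": "e_pytorch", "target_id": "e_missing"}],
    },
]

problems = find_dangling_relationships(sample_documents)
```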

## 6) Explore the graph with `ContextGraph`

The graph supports keyword querying and neighbor expansion.

In [10]:
kg.query("machine learning")

[{'node': {'id': 'e_ml',
   'type': 'CONCEPT',
   'properties': {'content': 'Machine Learning',
    'id': 'e_ml',
    'text': 'Machine Learning',
    'type': 'CONCEPT'}},
  'score': 1.0,
  'content': 'Machine Learning'}]

In [11]:
kg.get_neighbors("e_python", hops=2)

[{'id': 'e_ml',
  'type': 'CONCEPT',
  'content': 'Machine Learning',
  'relationship': 'used_for',
  'weight': 0.9,
  'hop': 1}]
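Neighbor expansion with a hop limit is typically a breadth-first search. The sketch below follows edges in their stored direction only, which is an assumption chosen because it reproduces the result above (`e_python` reaches `e_ml` but not `e_pytorch`). `neighbors_within` is a hypothetical function, not the `ContextGraph` implementation.

```python
from collections import deque


def neighbors_within(edges, start, hops):
    """Breadth-first expansion along directed edges, up to `hops` steps."""
    adjacency = {}
    for source, target, rel_type, weight in edges:
        adjacency.setdefault(source, []).append((target, rel_type, weight))

    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, depth = queue.popleft()
        if depth == hops:
            continue  # hop budget exhausted for this branch
        for target, rel_type, weight in adjacency.get(node, []):
            if target in seen:
                continue
            seen.add(target)
            found.append(
                {"id": target, "relationship": rel_type, "weight": weight, "hop": depth + 1}
            )
            queue.append((target, depth + 1))
    return found


edges = [
    ("e_python", "e_ml", "used_for", 0.9),
    ("e_pytorch", "e_ml", "implements", 0.95),
]
result = neighbors_within(edges, "e_python", hops=2)
```

Treating edges as undirected would instead surface `e_pytorch` at hop 2, so the traversal direction is worth checking against your own graph.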

## 7) Entity linking with `EntityLinker`

`EntityLinker` assigns stable URIs and can link related or duplicate entities across sources.

In [12]:
from semantica.context import EntityLinker

linker = EntityLinker(knowledge_graph={"entities": [{"id": "e_py", "text": "Python", "type": "PROGRAMMING_LANGUAGE"}]})

entities = [
    {"id": "e1", "text": "Python", "type": "PROGRAMMING_LANGUAGE"},
    {"id": "e2", "text": "PyTorch", "type": "FRAMEWORK"},
]

linked = linker.link("Python and PyTorch", entities=entities)
[(e.entity_id, e.uri, len(e.linked_entities)) for e in linked]

[('e1', 'https://semantica.dev/entity/python#programming_language', 1),
 ('e2', 'https://semantica.dev/entity/pytorch#framework', 0)]
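The URIs above follow a visible pattern: a base URL, the lowercased entity text, then `#` and the lowercased type. A small sketch that reproduces those two outputs is given below. The slug rule for text containing spaces or punctuation is an assumption, and `entity_uri` is a hypothetical helper, so treat it as an illustration of stable, deterministic identifiers and not as the linker's actual algorithm.

```python
import re

BASE_URI = "https://semantica.dev/entity"


def slugify(text):
    """Lowercase and collapse non-alphanumeric runs to underscores (assumed rule)."""
    return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")


def entity_uri(text, entity_type):
    """Build a deterministic URI from an entity's text and type."""
    return f"{BASE_URI}/{slugify(text)}#{slugify(entity_type)}"


python_uri = entity_uri("Python", "PROGRAMMING_LANGUAGE")
pytorch_uri = entity_uri("PyTorch", "FRAMEWORK")
```

Because the URI depends only on text and type, two sources that mention the same entity map to the same identifier, which is what makes cross-source linking possible.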

In [13]:
linker.link_entities("e1", "e2", link_type="related_to", confidence=0.8)
linker.get_entity_links("e1")[:2]

[EntityLink(source_entity_id='e1', target_entity_id='e_py', link_type='same_as', confidence=1.0, source=None, metadata={'similarity': 1.0}),
 EntityLink(source_entity_id='e1', target_entity_id='e2', link_type='related_to', confidence=0.8, source=None, metadata={})]

In [None]:
linker.build_entity_web()["statistics"]

## 8) Low-level building blocks: `AgentMemory` and `ContextRetriever`

If you want more control than `AgentContext`, you can wire the parts directly.

In [14]:
from semantica.context import AgentMemory, ContextRetriever

memory = AgentMemory(vector_store=vs, knowledge_graph=kg, retention_policy="unlimited")
memory.store("Python powers Semantica.", metadata={"type": "fact", "conversation_id": "conv_2"})

retriever = ContextRetriever(memory_store=memory, knowledge_graph=kg, vector_store=vs)
results = retriever.retrieve("Python Semantica", max_results=5)

[(r.content, r.source, round(r.score, 3)) for r in results]

[('Python powers Semantica.', 'short_term', 1.0),
 ('Python', 'graph:e_python', 0.5)]
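The result list above mixes a memory hit with a graph hit. One simple way to combine several sources is to deduplicate by content, keep the highest-scoring copy, and sort. The sketch below does that; `merge_results` and the `"vector"` source label are hypothetical, and the retriever's real merging and scoring logic may be more involved.

```python
def merge_results(*result_lists, max_results=5):
    """Deduplicate by content, keep the best score per item, sort descending."""
    best = {}
    for results in result_lists:
        for item in results:
            current = best.get(item["content"])
            if current is None or item["score"] > current["score"]:
                best[item["content"]] = item
    ranked = sorted(best.values(), key=lambda item: item["score"], reverse=True)
    return ranked[:max_results]


memory_hits = [
    {"content": "Python powers Semantica.", "source": "short_term", "score": 1.0},
]
graph_hits = [
    {"content": "Python", "source": "graph:e_python", "score": 0.5},
]
vector_hits = [
    # Same content as the memory hit, lower score: should be dropped.
    {"content": "Python powers Semantica.", "source": "vector", "score": 0.7},
]

merged = merge_results(memory_hits, graph_hits, vector_hits, max_results=5)
```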

## 9) Methods, registry, and configuration

The `methods` layer exposes convenience functions, while `registry` lets you plug in your own implementations. `config` provides runtime configuration.

In [15]:
from semantica.context.config import context_config

context_config.set("retention_policy", "7_days")
context_config.get("retention_policy")

'7_days'
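This notebook uses two retention policy strings, `"7_days"` and `"unlimited"`. The sketch below shows one plausible interpretation of them as a time window for pruning old memories. The parsing rule and both helpers are assumptions for illustration; the notebook does not show how Semantica interprets these values internally.

```python
from datetime import datetime, timedelta


def parse_retention_policy(policy):
    """Return a timedelta for '<n>_days', or None for 'unlimited' (assumed format)."""
    if policy == "unlimited":
        return None
    count, separator, unit = policy.partition("_")
    if separator != "_" or unit != "days" or not count.isdigit():
        raise ValueError(f"unsupported retention policy: {policy!r}")
    return timedelta(days=int(count))


def prune(memories, policy, now):
    """Keep only memories whose timestamp falls inside the retention window."""
    window = parse_retention_policy(policy)
    if window is None:
        return list(memories)
    cutoff = now - window
    return [m for m in memories if datetime.fromisoformat(m["timestamp"]) >= cutoff]


now = datetime(2025, 12, 18, 20, 0, 0)
stored = [
    {"content": "recent note", "timestamp": "2025-12-17T09:00:00"},
    {"content": "old note", "timestamp": "2025-12-01T09:00:00"},
]
kept = prune(stored, "7_days", now)
```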

In [16]:
from semantica.context.methods import build_context_graph
from semantica.context.registry import method_registry

def custom_graph_method(entities, relationships, conversations=None, **kwargs):
    return {
        "nodes": [],
        "edges": [],
        "statistics": {"node_count": 0, "edge_count": 0},
    }

method_registry.register("graph", "custom_demo", custom_graph_method)
method_registry.list_all("graph")

{'graph': ['entities_relationships', 'conversations', 'hybrid', 'custom_demo']}

In [17]:
build_context_graph(
    entities=[{"id": "e1", "text": "Python", "type": "PROGRAMMING_LANGUAGE"}],
    relationships=[{"source_id": "e1", "target_id": "e2", "type": "related_to"}],
    method="custom_demo",
)

{'nodes': [], 'edges': [], 'statistics': {'node_count': 0, 'edge_count': 0}}
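The registry pattern used above can be sketched in a few lines: a nested mapping from task name to method name to callable. The `MethodRegistry` class below is a hypothetical stand-in written for this guide; it mimics the `register` and `list_all` calls shown in the cells, while `get` is an assumed lookup method that the notebook does not demonstrate.

```python
class MethodRegistry:
    """Minimal task -> name -> callable registry (illustrative only)."""

    def __init__(self):
        self._methods = {}

    def register(self, task, name, func):
        """Store a callable under a task and a method name."""
        self._methods.setdefault(task, {})[name] = func

    def get(self, task, name):
        """Look up a registered callable, failing loudly if it is missing."""
        methods = self._methods.get(task, {})
        if name not in methods:
            raise KeyError(f"no method {name!r} registered for task {task!r}")
        return methods[name]

    def list_all(self, task=None):
        """List method names for one task, or for every task."""
        tasks = [task] if task is not None else list(self._methods)
        return {t: list(self._methods.get(t, {})) for t in tasks}


def empty_graph(entities, relationships, **kwargs):
    """Custom method that ignores its inputs and returns an empty graph."""
    return {"nodes": [], "edges": [], "statistics": {"node_count": 0, "edge_count": 0}}


registry = MethodRegistry()
registry.register("graph", "custom_demo", empty_graph)
graph = registry.get("graph", "custom_demo")(entities=[], relationships=[])
```

A dispatcher such as `build_context_graph(..., method="custom_demo")` can then resolve the name through the registry and call whatever it finds, which is why the custom method's return value appears unchanged in the output above.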