# Advanced Context Engineering: The Agent's Brain

Welcome to the **Master Class** on Semantica Context Engineering. This notebook demonstrates how to build a production-grade memory system for your AI agents.

Unlike simple chatbots that forget everything after a session, a **Context-Aware Agent** needs:
*   **Long-term Memory**: To recall facts from weeks ago.
*   **Structured Knowledge**: To understand how entities (People, Projects, Topics) are connected.
*   **Hybrid Retrieval**: To combine fuzzy text search with precise graph traversal.

## Learning Objectives

In this walkthrough, we will:
1.  **Initialize Production Stores**: Replace toy examples with real **Vector Stores** (FAISS) and **Graph Stores** (Neo4j).
2.  **Build the Agent Context**: Configure the central brain that orchestrates memory.
3.  **Ingest Knowledge**: Store complex documents and auto-extract entities.
4.  **Inject Relationships**: Manually teach the agent about connections in the world.
5.  **Perform GraphRAG**: Execute advanced queries that "hop" through the knowledge graph to find answers standard RAG misses.
6.  **Manage Lifecycle**: Learn to prune old memories and keep the system healthy.

---

## 1. Installation

To get started, simply install the package:

```bash
pip install semantica
```

In [None]:
!pip install -qU semantica 

In [None]:
import sys
import os
import time
from typing import Any, List, Dict, Optional

# Only needed when running from a source checkout rather than the pip install:
# add the project root to the path so `semantica` can be imported
sys.path.append(os.path.abspath(os.path.join(os.getcwd(), "../../")))

# Core Imports
from semantica.context import AgentContext, ContextGraph, AgentMemory
from semantica.vector_store import VectorStore
from semantica.graph_store import GraphStore

print("Libraries imported successfully.")

---

## 2. Initialize Storage Backends

We will now connect to our persistent storage layers. Semantica abstracts these behind unified interfaces, so you can swap backends (e.g., switch from FAISS to Weaviate) without changing your application logic.

### Vector Store (The Library)
Holds the *content* of memories and documents, indexed by semantic meaning.
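
To make "indexed by semantic meaning" concrete, here is a minimal sketch of what a vector store does at query time: it ranks stored vectors by cosine similarity to a query vector. The three-dimensional vectors and the `top_k` helper are hypothetical stand-ins, not part of Semantica. A real backend such as FAISS works on model-generated embeddings (768 dimensions in this notebook) and uses optimized indexes rather than a linear scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)

def top_k(query_vec, index, k=2):
    """Return the k (doc_id, score) pairs most similar to query_vec."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Toy "embeddings" (hypothetical values, hand-picked for illustration)
toy_index = {
    "apollo_doc": [0.9, 0.1, 0.0],
    "python_doc": [0.1, 0.9, 0.0],
    "lunch_menu": [0.0, 0.0, 1.0],
}

ranked = top_k([1.0, 0.2, 0.0], toy_index, k=2)
print(ranked)
```

The query vector points mostly along the first axis, so `apollo_doc` ranks first and the unrelated `lunch_menu` entry is cut off by `k=2`.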

In [None]:
try:
    # Initialize FAISS Vector Store
    # You can also use: backend="weaviate", backend="qdrant", etc.
    vs = VectorStore(backend="faiss", dimension=768)
    print("VectorStore initialized (Backend: FAISS)")
except ImportError:
    print("FAISS not installed. Using in-memory fallback (not persistent).")
    vs = VectorStore(backend="inmemory", dimension=768)
except Exception as e:
    print(f"VectorStore Error: {e}")
    vs = None

### Graph Store (The Map)
Holds the *connections* between entities. This is crucial for reasoning.

In [None]:
try:
    # Initialize Neo4j Graph Store
    # Ensure your Docker container is running!
    gs = GraphStore(
        backend="neo4j",
        uri="bolt://localhost:7687",
        user="neo4j",
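        password=os.environ.get("NEO4J_PASSWORD", "password")  # Env var name is an assumption; avoid hardcoding real credentials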
    )
    
    # Test connection
    if gs.connect():
        print("GraphStore connected (Backend: Neo4j)")
    else:
        raise ConnectionError("Could not connect to Neo4j")

except Exception as e:
    print(f"GraphStore Connection Failed: {e}")
    print("   Switching to in-memory ContextGraph (Non-persistent fallback)")
    gs = ContextGraph() # Fallback implementation

---

## 3. The Agent Context

The `AgentContext` is the high-level orchestrator. It sits on top of the Vector and Graph stores and manages the flow of information.

**Configuration for GraphRAG:**
*   `use_graph_expansion=True`: When retrieving, don't just look at the doc, look at its neighbors.
*   `max_expansion_hops=2`: How far to traverse? (e.g., A -> B -> C).
*   `hybrid_alpha=0.6`: Weighting. 0.0 is pure Vector, 1.0 is pure Graph. 0.6 favors graph slightly.
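
One plausible reading of `hybrid_alpha` is a linear blend of a vector-similarity score and a graph-relevance score. The sketch below assumes that formula; the notebook does not show how Semantica actually combines the two scores, and `hybrid_score` is a hypothetical helper written only to illustrate the weighting convention described above.

```python
def hybrid_score(vector_score, graph_score, alpha=0.6):
    """Blend two scores; alpha=0.0 is pure vector, alpha=1.0 is pure graph.

    Assumed formula for illustration, not Semantica's documented scoring.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0.0 and 1.0")
    return (1.0 - alpha) * vector_score + alpha * graph_score

# A candidate with strong text similarity but weak graph support ...
text_heavy = hybrid_score(vector_score=0.9, graph_score=0.2)
# ... versus one with weaker text similarity but strong graph support.
graph_heavy = hybrid_score(vector_score=0.4, graph_score=0.9)

print(f"text-heavy candidate:  {text_heavy:.2f}")
print(f"graph-heavy candidate: {graph_heavy:.2f}")
```

Under this assumption, `alpha=0.6` ranks the graph-heavy candidate above the text-heavy one, which is the behavior the "favors graph slightly" description suggests.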

In [None]:
if vs:
    context = AgentContext(
        vector_store=vs,
        knowledge_graph=gs,
        retention_days=90,          # Remember things for 3 months
        use_graph_expansion=True,   # Enable GraphRAG
        max_expansion_hops=2,       # 2-Hop reasoning
        hybrid_alpha=0.6            # Slightly favors graph over vector
    )
    print("Agent Context is online and ready.")
else:
    print("Cannot proceed without VectorStore.")

---

## 4. Ingestion: Teaching the Agent

We can store different types of information. The same `context.store` call handles both a single conversational memory and a batch of factual documents.

### A. Episodic Memory (Conversations)
These are raw logs of interactions. They provide the "personal" history.

In [None]:
user_id = "user_123"
session_id = "session_alpha"

# Store a user preference
mem_id = context.store(
    content="I am working on a new project called 'Project Apollo' which uses Python and React.",
    conversation_id=session_id,
    user_id=user_id,
    metadata={"type": "user_preference"}
)
print(f"Memory Stored: {mem_id}")

### B. Semantic Knowledge (Documents)
When we feed documents, we want to **extract entities** and **link them**. 

*(Note: In a real setup, this uses an LLM to parse entities. Here we use the context module's native extraction capabilities.)*

In [None]:
documents = [
    {
        "content": "Project Apollo is a next-gen web framework designed for high scalability.",
        "metadata": {"source": "internal_wiki", "category": "projects"}
    },
    {
        "content": "Python 3.12 introduces significant performance improvements for async workloads.",
        "metadata": {"source": "tech_news", "category": "languages"}
    }
]

# Store documents and trigger graph build
stats = context.store(
    documents,
    extract_entities=True,      # Extract entities from text
    extract_relationships=True, # Infer relationships
    link_entities=True          # Connect to existing graph nodes
)

print("Knowledge Ingestion Stats:", stats)

---

## 5. Graph Engineering: Manual Injection

Sometimes automatic extraction isn't enough. You want to enforce specific business logic or relationships. We can use `build_graph` to manually inject nodes and edges.

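**We will define:**
*   **User**: Alice, with the role Admin stored as a node property.
*   **Project**: Project Apollo.
*   **Technologies**: Python and React.
*   **Relationships**: Alice *MANAGES* Project Apollo, and Project Apollo *USES_TECH* Python and React.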
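
When nodes and edges are written by hand, a typo in an id can silently produce an edge that points at nothing. A small pre-flight check can catch that before anything is injected. The sketch below is plain Python and assumes only the dictionary shapes used in this section; `find_dangling_edges` and the undefined `"bob"` id are hypothetical, added for illustration.

```python
def find_dangling_edges(entities, relationships):
    """Return relationships whose source or target is not a defined entity id."""
    known_ids = {entity["id"] for entity in entities}
    return [
        rel for rel in relationships
        if rel["source"] not in known_ids or rel["target"] not in known_ids
    ]

sample_entities = [
    {"id": "alice", "type": "PERSON", "text": "Alice"},
    {"id": "project_apollo", "type": "PROJECT", "text": "Project Apollo"},
]
sample_relationships = [
    {"source": "alice", "target": "project_apollo", "type": "MANAGES"},
    {"source": "bob", "target": "project_apollo", "type": "MANAGES"},  # "bob" is never defined
]

dangling = find_dangling_edges(sample_entities, sample_relationships)
print(dangling)
```

Running the check on your own `entities` and `relationships` lists before calling `build_graph` gives you a chance to fix mismatched ids first.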

In [None]:
# 1. Define Nodes
entities = [
    {"id": "alice", "type": "PERSON", "text": "Alice", "properties": {"role": "Admin"}},
    {"id": "project_apollo", "type": "PROJECT", "text": "Project Apollo"},
    {"id": "python", "type": "TECH", "text": "Python"},
    {"id": "react", "type": "TECH", "text": "React"}
]

# 2. Define Edges (The Knowledge)
relationships = [
    {"source": "alice", "target": "project_apollo", "type": "MANAGES", "weight": 1.0},
    {"source": "project_apollo", "target": "python", "type": "USES_TECH", "weight": 1.0},
    {"source": "project_apollo", "target": "react", "type": "USES_TECH", "weight": 1.0}
]

# 3. Inject into Graph
graph_stats = context.build_graph(
    entities=entities,
    relationships=relationships
)

print("Manual Graph Build Complete:", graph_stats)

### Visualizing the Graph Logic
Let's query the graph directly to see what "Project Apollo" looks like.

In [None]:
# Helper to print graph neighbors
def inspect_node(node_id):
    if hasattr(gs, "get_neighbors"):
        neighbors = gs.get_neighbors(node_id)
        print(f"\nNeighbors of '{node_id}':")
        for n in neighbors:
            # Handle different return formats between stores
            rel_type = n.get('relationship') or n.get('type') or 'linked'
            target = n.get('id') or n.get('node_id')
            print(f"   └── [{rel_type}] ──> {target}")
    else:
        print("Graph store does not support neighbor inspection.")

inspect_node("project_apollo")

---

## 6. Hybrid Retrieval (GraphRAG)

Now for the magic. We ask a question that requires connecting the dots.

**Query**: *"Who is responsible for the Python web framework project?"*

**Logic Flow:**
1.  **Vector Search**: Finds "Project Apollo" (described as web framework).
2.  **Graph Expansion**: Looks at "Project Apollo" in the graph.
3.  **Discovery**: Sees `(Alice)-[MANAGES]->(Project Apollo)`.
4.  **Result**: Returns Alice, even though her name wasn't in the project description text!
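
The logic flow above can be sketched in plain Python as a breadth-first expansion from the node that vector search matched. This is an illustration of the idea, not Semantica's implementation: the `build_adjacency` and `expand` helpers are hypothetical, and treating edges as undirected during expansion is an assumption made so that the walk can move from the project back to its manager.

```python
from collections import deque

# Edges mirror the relationships injected in Section 5: (source, type, target).
EDGES = [
    ("alice", "MANAGES", "project_apollo"),
    ("project_apollo", "USES_TECH", "python"),
    ("project_apollo", "USES_TECH", "react"),
]

def build_adjacency(edges):
    """Undirected adjacency map so expansion can follow edges in both directions."""
    adjacency = {}
    for source, rel_type, target in edges:
        adjacency.setdefault(source, []).append((rel_type, target))
        adjacency.setdefault(target, []).append((rel_type, source))
    return adjacency

def expand(seed_ids, adjacency, max_hops=2):
    """Breadth-first expansion: map each reachable node to its hop distance."""
    distances = {seed: 0 for seed in seed_ids}
    queue = deque(seed_ids)
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # Do not walk past the hop limit
        for _rel_type, neighbor in adjacency.get(node, []):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances

adjacency = build_adjacency(EDGES)
# Pretend vector search matched the "Project Apollo" document.
reachable = expand(["project_apollo"], adjacency, max_hops=1)
print(reachable)
```

A single hop from `project_apollo` already reaches `alice`. Starting from `python` instead, it takes two hops to reach her, which is where a setting like `max_expansion_hops=2` matters.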

In [None]:
query = "Who is responsible for the Python web framework project?"
print(f"Asking: '{query}'...\n")

results = context.retrieve(
    query,
    max_results=3,
    use_graph=True,         # Vital for finding Alice
    expand_graph=True,      # Hop to neighbors
    include_entities=True   # Return structured entity data
)

print(f"Retrieved {len(results)} context items:\n")

for i, res in enumerate(results, 1):
    print(f"{i}. [Score: {res['score']:.2f}] {res['content'][:120]}...")
    
    # Did we find graph connections?
    if 'related_entities' in res and res['related_entities']:
        print("   Graph Insights:")
        for ent in res['related_entities'][:3]:
            print(f"      - {ent.get('text', 'Entity')} ({ent.get('type', 'Unknown')})")
    print("")

---

## 7. Lifecycle Management

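A production system needs maintenance. This section shows how to query conversation history and check system stats. How long memories are kept is governed by the `retention_days` value set in Section 3.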
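
This notebook does not call a pruning method directly, so the sketch below only illustrates the retention rule implied by `retention_days=90`: anything older than the cutoff is eligible for removal. The `prune_expired` helper and the `timestamp` field are hypothetical and do not describe Semantica's internal storage format.

```python
from datetime import datetime, timedelta, timezone

def prune_expired(memories, retention_days, now=None):
    """Split memories into (kept, expired) based on their 'timestamp' field."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    kept = [m for m in memories if m["timestamp"] >= cutoff]
    expired = [m for m in memories if m["timestamp"] < cutoff]
    return kept, expired

fixed_now = datetime(2024, 6, 1, tzinfo=timezone.utc)  # Fixed clock for a repeatable demo
sample_memories = [
    {"id": "m1", "timestamp": fixed_now - timedelta(days=10)},
    {"id": "m2", "timestamp": fixed_now - timedelta(days=120)},
]

kept, expired = prune_expired(sample_memories, retention_days=90, now=fixed_now)
print([m["id"] for m in kept], [m["id"] for m in expired])
```

Passing `now` explicitly keeps the function easy to test; in a scheduled job you would omit it and let the current UTC time apply.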

### Conversation History

In [None]:
# Get recent chat history for context window
history = context.conversation(
    conversation_id=session_id,
    limit=5
)

print(f"Chat History for {session_id}:")
for msg in history:
    print(f" - {msg['content']}")

### System Health & Stats

In [None]:
stats = context.stats()
print("System Vital Signs:")
print(f"   - Total Memories: {stats.get('total_items', 0)}")
print(f"   - Graph Nodes:    {stats.get('graph_stats', {}).get('node_count', 'N/A')}")
print(f"   - Graph Edges:    {stats.get('graph_stats', {}).get('edge_count', 'N/A')}")

## Summary

You have successfully built a **Context-Aware Agent** using Semantica's production modules.

**Key Achievements:**
1.  **Persistence**: Swapped in FAISS and Neo4j for real-world storage.
2.  **GraphRAG**: Demonstrated how graph relationships improve retrieval accuracy.
3.  **Entity Injection**: Manually taught the agent about business relationships.

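With persistent backends in place, the same code path can grow well beyond the toy dataset used here. How far it scales depends on how the FAISS index and the Neo4j instance are provisioned.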