# LangGraph Neo4j MCP Agent (SageMaker)

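This notebook relies on packages that are **pre-installed** in SageMaker Studio, so `pip install` is normally not needed. Section 1 verifies the installed versions and includes an install cell to run only if something is missing.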

SageMaker Studio already has:
- langchain 1.2.6
- langgraph 1.0.6
- langchain-mcp-adapters 0.2.1
- mcp 1.25.0

## 1. Verify Pre-installed Packages

In [None]:
import importlib.metadata

packages = [
    "langchain",
    "langchain-core",
    "langgraph",
    "langchain-aws",
    "langchain-mcp-adapters",
    "mcp",
    "httpx",
    "boto3",
]

print("Pre-installed packages:")
print("-" * 50)
for pkg in packages:
    try:
        version = importlib.metadata.version(pkg)
        print(f"{pkg:30} {version}")
    except importlib.metadata.PackageNotFoundError:
        print(f"{pkg:30} NOT INSTALLED")
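
The check above prints whatever version is installed but does not compare it with the versions listed at the top of this notebook. A minimal sketch of such a comparison follows. It treats the listed versions as minimums, which is an assumption, and `MINIMUMS`, `version_tuple`, `meets_minimum`, and `check_minimums` are illustrative names that are not part of the notebook. The parser only handles plain dotted numeric versions such as `1.0.6`.

```python
import importlib.metadata

# Assumption: the versions listed at the top of the notebook are treated as minimums.
MINIMUMS = {
    "langchain": "1.2.6",
    "langgraph": "1.0.6",
    "langchain-mcp-adapters": "0.2.1",
    "mcp": "1.25.0",
}


def version_tuple(version: str) -> tuple:
    """Turn '1.25.0' into (1, 25, 0); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two dotted versions numerically rather than as strings."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_minimums(minimums=MINIMUMS) -> dict:
    """Return a {package: status} report for each package in `minimums`."""
    report = {}
    for pkg, minimum in minimums.items():
        try:
            installed = importlib.metadata.version(pkg)
        except importlib.metadata.PackageNotFoundError:
            report[pkg] = "missing"
            continue
        if meets_minimum(installed, minimum):
            report[pkg] = "ok"
        else:
            report[pkg] = f"too old ({installed} < {minimum})"
    return report


for pkg, status in check_minimums().items():
    print(f"{pkg:30} {status}")
```

Comparing tuples of integers avoids the string-comparison trap where `"1.9.0"` sorts after `"1.25.0"`.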

In [None]:
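# Install missing packages (run only if the check above reports NOT INSTALLED).
# The version specifiers are quoted so the shell does not treat ">" as a redirect.
%pip install "langgraph>=1.0.6" "langchain-mcp-adapters>=0.2.1" -q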

## 2. Imports

In [None]:
import asyncio
import nest_asyncio

# Enable nested event loops for Jupyter
nest_asyncio.apply()

from langchain_aws import ChatBedrockConverse
from langgraph.prebuilt import create_react_agent
from langchain_mcp_adapters.client import MultiServerMCPClient
from langchain_mcp_adapters.tools import load_mcp_tools

print("All imports successful!")

## 3. Configuration

### **ACTION REQUIRED**

Open `.mcp-credentials.json` and copy its values into the configuration cell below. The file has this shape:

```json
{
  "gateway_url": "<-- copy to GATEWAY_URL",
  "access_token": "<-- copy to ACCESS_TOKEN"
}
```
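
Instead of copying the values by hand, the file could be read directly. The sketch below assumes `.mcp-credentials.json` sits in the notebook's working directory and contains the two keys shown above. `load_mcp_credentials` is a hypothetical helper, not part of the notebook.

```python
import json
from pathlib import Path


def load_mcp_credentials(path: str = ".mcp-credentials.json") -> tuple:
    """Hypothetical helper: return (gateway_url, access_token) from the credentials file."""
    data = json.loads(Path(path).read_text())
    # Fail early with a clear message if either key is absent or empty.
    missing = [key for key in ("gateway_url", "access_token") if not data.get(key)]
    if missing:
        raise KeyError(f"Missing keys in {path}: {missing}")
    return data["gateway_url"], data["access_token"]


# Usage in the configuration cell (assumes the file exists in the working directory):
# GATEWAY_URL, ACCESS_TOKEN = load_mcp_credentials()
```

Reading the file keeps the token out of the notebook source, which matters if the notebook is shared or committed.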

In [None]:
# =============================================================================
# REPLACE THESE VALUES with your credentials from .mcp-credentials.json
# =============================================================================

GATEWAY_URL = "YOUR_GATEWAY_URL_HERE"
ACCESS_TOKEN = "YOUR_ACCESS_TOKEN_HERE"

# AWS Bedrock settings
AWS_REGION = "us-west-2"

# Use inference profile ARN from Bedrock IDE (required for SageMaker Unified Studio)
# See MODEL.md for how to get this ARN
INFERENCE_PROFILE_ARN = "arn:aws:bedrock:us-west-2:159878781974:application-inference-profile/hsl5b7kh1279"

# Validate
if "YOUR_" in GATEWAY_URL or "YOUR_" in ACCESS_TOKEN:
    print("ERROR: Replace GATEWAY_URL and ACCESS_TOKEN above!")
    print("       Get values from .mcp-credentials.json")
else:
    print(f"Gateway: {GATEWAY_URL[:60]}...")
    # Show only a short prefix so the token does not end up in saved notebook output
    print(f"Token:   {ACCESS_TOKEN[:8]}... (truncated)")
    print(f"Region:  {AWS_REGION}")
    print(f"Profile: {INFERENCE_PROFILE_ARN}")
    print("\nConfiguration OK!")
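
One mismatch the validation above does not catch is an inference profile ARN that belongs to a different region than `AWS_REGION`. ARNs follow the `arn:partition:service:region:account-id:resource` layout, so the region can be read from the fourth field. A small sketch, where `arn_region` is an illustrative helper name and the two constants are copied from the configuration cell:

```python
def arn_region(arn: str) -> str:
    """Return the region field of an ARN (arn:partition:service:region:account-id:resource)."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"Not a valid ARN: {arn!r}")
    return parts[3]


# Values copied from the configuration cell.
AWS_REGION = "us-west-2"
INFERENCE_PROFILE_ARN = "arn:aws:bedrock:us-west-2:159878781974:application-inference-profile/hsl5b7kh1279"

profile_region = arn_region(INFERENCE_PROFILE_ARN)
if profile_region != AWS_REGION:
    print(f"WARNING: profile region {profile_region} differs from AWS_REGION {AWS_REGION}")
else:
    print(f"Profile region matches AWS_REGION ({AWS_REGION})")
```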

## 4. System Prompt

In [None]:
SYSTEM_PROMPT = """You are a helpful Neo4j database assistant with access to tools that let you query a Neo4j graph database.

Your capabilities include:
- Retrieve the database schema to understand node labels, relationship types, and properties
- Execute read-only Cypher queries to answer questions about the data

Restriction: never execute Cypher queries that write to the database.

When answering questions about the database:
1. First retrieve the schema to understand the database structure
2. Formulate appropriate Cypher queries based on the actual schema
3. If a query returns no results, explain what you looked for and suggest alternatives
4. Format results in a clear, human-readable way
5. Cite the actual data returned in your response

Important Cypher notes:
- Use MATCH patterns that align with the actual schema
- For counting, use MATCH (n:Label) RETURN count(n)
- For listing items, add LIMIT to avoid overwhelming results
- Handle potential NULL values gracefully

Be concise but thorough in your responses."""
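
The prompt asks the model not to run write queries, but a prompt is an instruction rather than an enforcement mechanism. If an extra safeguard is wanted, a conservative keyword check is one option. The sketch below is an assumption about how such a check might look, not something the notebook or the MCP server provides. `looks_read_only` is a hypothetical name, and a plain keyword scan will also reject harmless queries that mention these words inside string literals.

```python
import re

# Cypher clauses that modify data or schema. CALL is not covered: procedures
# can also write, so this is a coarse filter rather than a guarantee.
_WRITE_CLAUSES = re.compile(
    r"\b(CREATE|MERGE|DELETE|SET|REMOVE|DROP|FOREACH|LOAD\s+CSV)\b",
    re.IGNORECASE,
)


def looks_read_only(cypher: str) -> bool:
    """Return True if no write clause keyword appears in the query text."""
    return _WRITE_CLAUSES.search(cypher) is None


print(looks_read_only("MATCH (n:Label) RETURN count(n)"))
print(looks_read_only("MATCH (n) DETACH DELETE n"))
```

Restricting the database user to read-only permissions on the server side remains the stronger control; a check like this only catches obvious cases early.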

## 5. Initialize LLM

In [None]:
# Initialize LLM (can be reused across queries)
print(f"Initializing LLM: {INFERENCE_PROFILE_ARN}")
llm = ChatBedrockConverse(
    model=INFERENCE_PROFILE_ARN,
    provider="anthropic",  # Required when using ARN
    region_name=AWS_REGION,
    temperature=0,
)
print("LLM ready!")

## 6. Query Helper

Each query creates a fresh MCP connection and agent, which avoids session lifecycle issues in Jupyter at the cost of reconnecting and reloading the tools on every call.

In [None]:
async def query_async(question: str) -> str:
    """Ask the agent a question about the Neo4j database.
    
    Uses explicit session context manager to properly manage MCP connection lifecycle.
    """
    client = MultiServerMCPClient({
        "neo4j": {
            "transport": "streamable_http",
            "url": GATEWAY_URL,
            "headers": {"Authorization": f"Bearer {ACCESS_TOKEN}"},
        }
    })
    
    # Use explicit session context manager
    async with client.session("neo4j") as session:
        tools = await load_mcp_tools(session)
        
        # Create agent with tools
        agent = create_react_agent(
            model=llm,
            tools=tools,
            prompt=SYSTEM_PROMPT,
        )
        
        # Run the query within the session context
        result = await agent.ainvoke({"messages": [("human", question)]})
        messages = result.get("messages", [])
        if not messages:
            return "No response"
        # The final message in the list carries the agent's answer
        last = messages[-1]
        return getattr(last, "content", str(last))


def query(question: str) -> str:
    """Ask the agent a question about the Neo4j database."""
    print("=" * 70)
    print(f"Q: {question}")
    print("=" * 70)
    answer = asyncio.get_event_loop().run_until_complete(query_async(question))
    print(f"\nA: {answer}")
    return answer
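
`query_async` returns the last message's `content` as-is. LangChain message content can be either a plain string or a list of content blocks, so the printed answer may occasionally appear as a list of dictionaries rather than text. A hedged sketch of a normalizer follows, assuming text blocks are dictionaries with `type` set to `"text"` and the text stored under a `text` key. `content_to_text` is an illustrative name, not part of the notebook.

```python
def content_to_text(content) -> str:
    """Flatten message content (a str or a list of blocks) into plain text."""
    if isinstance(content, str):
        return content
    pieces = []
    for block in content:
        if isinstance(block, str):
            pieces.append(block)
        elif isinstance(block, dict) and block.get("type") == "text":
            pieces.append(block.get("text", ""))
        # Any other block type (for example a tool call) is skipped.
    return "".join(pieces)


print(content_to_text("plain answer"))
print(content_to_text([{"type": "text", "text": "block "}, {"type": "text", "text": "answer"}]))
```

If it proves useful, the helper could wrap the value returned by `query_async` before it is printed.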

## 7. Demo Queries

In [None]:
_ = query("What is the database schema? Give me a brief summary.")

In [None]:
_ = query("How many nodes are in the database by label?")

In [None]:
_ = query("What types of relationships exist in the database?")

## 8. Your Queries

In [None]:
_ = query("List 5 sample records from the most populated node type.")

In [None]:
# Your custom query
# _ = query("Your question here")