# 📓 The GenAI Revolution Cookbook

**Title:** How to Build Explainability AI by Design with OPA, MCP, and Neo4j

**Description:** Build production-ready control layers for agent systems using OPA, MCP, and Neo4j. Enforce policies, capture auditable traces, and insert human approvals without slowing teams down.

---

*This Jupyter notebook contains executable code examples. Run the cells below to try out the code yourself!*



Full E2E prototype:

* CrewAI agent
* MCP enforcement server (FastAPI)
* OPA ingress and egress policies
* Small ML (spaCy) PII classifier
* Human approval
* Audit trail

# Enterprise AI Agent. Full E2E in Colab

Use this notebook to stand up a governed, auditable agent in Colab. You will run a CrewAI agent behind an MCP enforcement boundary. You will use OPA for policy decisions, a spaCy model for PII detection, and a human approval step. Each cell explains what it does and how it fits into the flow.

## Cell 1\. Install dependencies

Install the core packages for the runtime: FastAPI and Uvicorn for the MCP boundary, requests for calling OPA over HTTP, spaCy and its English model for PII classification, and CrewAI for the agent.

In [None]:
!pip install crewai fastapi uvicorn requests spacy
!python -m spacy download en_core_web_sm

## Cell 2\. Download and start OPA. Binary and Colab safe

Fetch the OPA binary and make it executable. This gives you a local policy engine for both ingress and egress decisions. The binary is saved as `opa_bin` so that it does not collide with the `opa/` folder created next.

In [None]:
!wget -q https://openpolicyagent.org/downloads/latest/opa_linux_amd64 -O opa_bin
!chmod +x opa_bin

Create OPA folders:

In [None]:
!mkdir -p opa/policies opa/data

## Cell 3\. OPA tool catalog. Facts only

Define a static catalog of tools and capabilities that OPA will use as facts. This keeps policy checks deterministic. It also separates facts from decisions.

In [None]:
%%writefile opa/data/tool_catalog.json
{
  "tools": {
    "initiate_refund": {
      "risk_level": "high",
      "data_classification": "financial",
      "max_auto_amount": 500,
      "approval_threshold": 2000
    }
  }
}

## Cell 4\. OPA ingress policy. Tool access

Authorize which tools the agent can call. Use clear rules that reference the tool catalog. You can allow or deny access before any tool executes.

In [None]:
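%%writefile opa/policies/tool_access.rego
package tool_access

# Rego v1 syntax: current OPA releases require `if` before a rule body.
import rego.v1

# Deny unless one of the rules below matches.
default decision := "deny"

# Catalog entry for the requested tool. Unknown tools leave this undefined,
# so no rule matches and the default "deny" applies.
tool := data.tools[input.tool.id]

# Small refunds are released automatically.
decision := "allow" if {
  input.agent.role == "support_agent"
  input.arguments.amount <= tool.max_auto_amount
}

# Mid-sized refunds need a human approval first.
decision := "require_approval" if {
  input.agent.role == "support_agent"
  input.arguments.amount > tool.max_auto_amount
  input.arguments.amount <= tool.approval_threshold
}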
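
The thresholds in this policy are easy to get wrong at the boundaries. The sketch below is a hypothetical pure-Python mirror of the rules, useful for checking what decision to expect before querying OPA. The `CATALOG` dict and the `expected_decision` helper are illustrative names, not part of the notebook, and OPA remains the source of truth.

```python
# Values copied from opa/data/tool_catalog.json.
CATALOG = {
    "initiate_refund": {"max_auto_amount": 500, "approval_threshold": 2000}
}

def expected_decision(role: str, tool_id: str, amount: float) -> str:
    """Hypothetical mirror of the tool_access policy, for sanity checks only."""
    tool = CATALOG.get(tool_id)
    if tool is None or role != "support_agent":
        return "deny"  # no rule matches, so the policy default applies
    if amount <= tool["max_auto_amount"]:
        return "allow"
    if amount <= tool["approval_threshold"]:
        return "require_approval"
    return "deny"  # above the approval threshold

for amount in (250, 500, 501, 2000, 2001):
    print(amount, expected_decision("support_agent", "initiate_refund", amount))
```

Both comparisons are inclusive at the upper bound, so a refund of exactly 500 is allowed and one of exactly 2000 still qualifies for approval.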

## Cell 5\. OPA egress policy. PII control

Control sensitive output before it leaves the boundary. The policy evaluates the ML classifier result and the agent's clearance. It returns `allow`, `redact`, or `block`, and defaults to `block` when no rule matches.

In [None]:
%%writefile opa/policies/egress_control.rego
package egress_control

# Rego v1 syntax: current OPA releases require `if` before a rule body.
import rego.v1

default action := "block"

# No PII → allow
action := "allow" if {
  not input.egress.contains_pii
}

# PII present, not cleared → redact
action := "redact" if {
  input.egress.contains_pii
  input.agent.clearance != "pii_allowed"
}

# PII present, cleared → allow
action := "allow" if {
  input.egress.contains_pii
  input.agent.clearance == "pii_allowed"
}
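
The egress rules form a small decision table. As a rough sketch, the hypothetical `expected_action` helper below mirrors them in plain Python so you can see which combination yields which action. It is an illustration of the policy above, not a replacement for it.

```python
from typing import Optional

def expected_action(contains_pii: bool, clearance: Optional[str]) -> str:
    """Hypothetical mirror of the egress_control policy, for sanity checks only."""
    if not contains_pii:
        return "allow"
    if clearance == "pii_allowed":
        return "allow"
    if clearance is not None:
        return "redact"
    # PII present but no clearance field: neither PII rule matches,
    # so the policy default applies.
    return "block"

for pii in (False, True):
    for clearance in ("no_pii", "pii_allowed", None):
        print(pii, clearance, expected_action(pii, clearance))
```

One detail worth keeping in mind: in Rego, `not input.egress.contains_pii` also succeeds when the field is missing entirely, so the server should always attach the classifier result before querying this policy.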

## Cell 6\. Start OPA server. Background

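Start OPA in the background so you can query it from the client and the MCP server. `nohup` and the log redirect keep the process alive after the cell returns and send its output to `opa.log`.

In [None]:
!nohup ./opa_bin run --server opa/policies opa/data > opa.log 2>&1 &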

OPA now runs at <http://localhost:8181>
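
A background cell returns before the server is actually listening, so an immediate query can fail with a connection error. The following is a minimal sketch of a readiness poll using only the standard library. It assumes OPA's `/health` endpoint on the default port, and `wait_until_ready` is a hypothetical helper name.

```python
import time
import urllib.error
import urllib.request

def wait_until_ready(url: str, timeout: float = 10.0, interval: float = 0.25) -> bool:
    """Poll url until it answers HTTP 200 or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval + 1) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not up yet; retry after a short pause
        time.sleep(interval)
    return False

# Example usage in a notebook cell (assumed URL):
# wait_until_ready("http://localhost:8181/health")
```

The same helper can be pointed at any other local HTTP service that exposes a cheap GET route.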

## Cell 7\. Audit logger

Capture every decision and event in a single audit log. Include inputs, policy outcomes, redaction decisions, and timestamps. This gives you traceability across the entire flow.

In [None]:
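%%writefile audit.py
import json
import uuid
from datetime import datetime, timezone

def audit_log(event: dict) -> str:
    """Append one event to audit.log as a JSON line and return its event_id."""
    # Reuse the caller's event_id so every phase of one request shares an ID.
    event["event_id"] = event.get("event_id") or str(uuid.uuid4())
    # Timezone-aware UTC timestamp; datetime.utcnow() is deprecated since Python 3.12.
    event["timestamp"] = datetime.now(timezone.utc).isoformat()
    with open("audit.log", "a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")
    return event["event_id"]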
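
Because every phase of a request is written with the same `event_id`, the log can be regrouped into one trail per request. Here is a small sketch of a reader for the JSON-lines format written above. `load_trail` and `phases` are hypothetical helpers, not part of the notebook.

```python
import json
from collections import defaultdict

def load_trail(path: str = "audit.log") -> dict:
    """Group audit events by event_id, preserving the order they were written."""
    trail = defaultdict(list)
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                event = json.loads(line)
                trail[event["event_id"]].append(event)
    return dict(trail)

def phases(events: list) -> list:
    """Return the sequence of phases recorded for one request."""
    return [e.get("phase") for e in events]

# Example usage once the pipeline has produced audit.log:
# for event_id, events in load_trail().items():
#     print(event_id, phases(events))
```

A request that was executed and filtered should show an `access` phase followed by an `egress` phase, while a request held for review should show `access` followed by `approval_requested`.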

## Cell 8\. Approval store

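Provide a simple in-memory store for pending human approvals. When the ingress policy returns `require_approval`, the MCP server parks the request here and returns a one-time token that a reviewer can later consume. The store lives in process memory, so it is lost when the server restarts.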

In [None]:
%%writefile approvals.py
import uuid

PENDING_APPROVALS = {}

def create_approval(facts: dict) -> str:
    token = str(uuid.uuid4())
    PENDING_APPROVALS[token] = facts
    return token

def consume_approval(token: str):
    return PENDING_APPROVALS.pop(token, None)
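
The notebook creates approval tokens but leaves the reviewer side open. As a sketch of how that step could look, the self-contained example below re-creates the store and adds a hypothetical `review` function that consumes a token exactly once. The function name and the returned status strings are assumptions for illustration.

```python
import uuid

PENDING = {}  # stand-in for PENDING_APPROVALS in approvals.py

def create_approval(facts: dict) -> str:
    token = str(uuid.uuid4())
    PENDING[token] = facts
    return token

def consume_approval(token: str):
    return PENDING.pop(token, None)

def review(token: str, approved: bool) -> dict:
    """Hypothetical reviewer step: consume the token once and report the outcome."""
    facts = consume_approval(token)
    if facts is None:
        return {"status": "unknown_or_used_token"}
    if not approved:
        return {"status": "rejected", "tool": facts["tool"]["id"]}
    # The caller can now execute the tool with the stored arguments.
    return {"status": "approved", "tool": facts["tool"]["id"],
            "arguments": facts["arguments"]}

facts = {"tool": {"id": "initiate_refund"},
         "arguments": {"customer_id": "123", "amount": 1200}}
token = create_approval(facts)
print(review(token, approved=True))
print(review(token, approved=True))  # second use finds no pending entry
```

Popping the entry on first use is what makes the token single-use, so a replayed approval cannot trigger the refund twice.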

## Cell 9\. OPA client

Create a lightweight client to send structured queries to OPA. Keep the input schema consistent. This ensures policies receive the data they need.

In [None]:
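%%writefile opa_client.py
import requests

OPA = "http://localhost:8181/v1/data"

def _query(path: str, facts: dict, fallback: str) -> str:
    """POST facts to an OPA rule and return its value, or the fallback if undefined."""
    r = requests.post(f"{OPA}/{path}", json={"input": facts}, timeout=5)
    r.raise_for_status()
    # OPA omits "result" when the rule is undefined, so fail closed.
    return r.json().get("result", fallback)

def tool_access_decision(facts: dict) -> str:
    return _query("tool_access/decision", facts, "deny")

def egress_action(facts: dict) -> str:
    return _query("egress_control/action", facts, "block")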
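
A missing input field does not raise an error in Rego. The rule that references it simply stays undefined and the request is denied without explanation. One way to keep the schema consistent is to check the facts before sending them, as in this sketch. `REQUIRED` and `missing_fields` are hypothetical names, and the field list reflects only what the ingress policy in this notebook reads.

```python
# Fields the tool_access policy reads from its input.
REQUIRED = {
    "agent": ("role",),
    "tool": ("id",),
    "arguments": ("amount",),
}

def missing_fields(facts: dict) -> list:
    """Return dotted paths of required fields that are absent from facts."""
    missing = []
    for section, keys in REQUIRED.items():
        value = facts.get(section)
        if not isinstance(value, dict):
            missing.append(section)
            continue
        missing.extend(f"{section}.{key}" for key in keys if key not in value)
    return missing

facts = {
    "agent": {"id": "agent-1", "role": "support_agent"},
    "tool": {"id": "initiate_refund"},
    "arguments": {"customer_id": "123"},
}
print(missing_fields(facts))
```

Logging the missing paths alongside the decision makes a silent `deny` much easier to diagnose.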

## Cell 10\. Small ML PII classifier. spaCy, not regex

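Use a spaCy model to detect PII in tool outputs. Entity labels such as PERSON and GPE stand in for PII types, which reduces reliance on fragile pattern matching, although the stock model does not label items such as email addresses or phone numbers. You can extend the model later with custom entities.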

In [None]:
%%writefile pii_classifier.py
import json
import spacy

nlp = spacy.load("en_core_web_sm")

LABEL_MAP = {
    "PERSON": "person_name",
    "GPE": "location",
    "LOC": "location",
    "ORG": "organization",
    "DATE": "date"
}

def classify_pii(result):
    text = result if isinstance(result, str) else json.dumps(result, ensure_ascii=False)
    doc = nlp(text)

    findings = []
    for ent in doc.ents:
        if ent.label_ in LABEL_MAP:
            findings.append({
                "type": LABEL_MAP[ent.label_],
                "text": ent.text
            })

    return {
        "contains_pii": len(findings) > 0,
        "pii_types": sorted({f["type"] for f in findings}),
        "findings": findings
    }
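
The `findings` list carries the exact text of each entity, so it can also drive finer-grained masking than dropping the whole payload. The sketch below shows one naive way to do that with plain string replacement. `mask_findings` is a hypothetical helper, and the sample findings are written by hand in the classifier's output shape rather than produced by spaCy.

```python
def mask_findings(text: str, findings: list) -> str:
    """Replace each detected entity with a placeholder such as [PERSON_NAME]."""
    # Mask longer strings first so "John Doe" is handled before a bare "John".
    for finding in sorted(findings, key=lambda f: len(f["text"]), reverse=True):
        placeholder = f"[{finding['type'].upper()}]"
        text = text.replace(finding["text"], placeholder)
    return text

sample_findings = [
    {"type": "person_name", "text": "John Doe"},
    {"type": "location", "text": "Montreal"},
]
print(mask_findings("Contact: John Doe, Montreal.", sample_findings))
```

String replacement masks every occurrence of the entity text, including ones the model did not tag. A stricter variant would record character offsets in the classifier and replace by span instead.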

## Cell 11\. Tool implementation. Returns PII internally

Implement a sample tool that may return sensitive fields internally. The MCP boundary and policies will control what gets released to the agent.

In [None]:
%%writefile tools.py
def initiate_refund(customer_id: str, amount: int):
    return {
        "status": "executed",
        "customer_id": customer_id,
        "message": (
            f"Refund of ${amount} issued to customer {customer_id}. "
            f"Contact: John Doe, Montreal."
        )
    }

## Cell 12\. MCP server. Enforcement boundary

Run a FastAPI server that enforces ingress and egress decisions. The server calls tools, runs the ML classifier, applies OPA policies, and writes to the audit log.

In [None]:
%%writefile mcp_server.py
from fastapi import FastAPI
from audit import audit_log
from approvals import create_approval
from opa_client import tool_access_decision, egress_action
from pii_classifier import classify_pii
from tools import initiate_refund

app = FastAPI()

def execute_tool(name, args):
    if name == "initiate_refund":
        return initiate_refund(**args)
    raise ValueError("Unknown tool")

@app.post("/mcp/tool/{tool_name}")
def call_tool(tool_name: str, payload: dict):
    facts = {
        "agent": payload["agent"],
        "tool": {"id": tool_name},
        "arguments": payload["arguments"],
        "environment": payload.get("environment", "production")
    }

    decision = tool_access_decision(facts)
    event_id = audit_log({
        "phase": "access",
        "agent": facts["agent"],
        "tool": tool_name,
        "arguments": facts["arguments"],
        "decision": decision
    })
    facts["event_id"] = event_id

    if decision == "deny":
        return {"error": "Denied by policy"}

    if decision == "require_approval":
        token = create_approval(facts)
        audit_log({"event_id": event_id, "phase": "approval_requested", "token": token})
        return {"status": "pending_approval", "approval_token": token}

    raw = execute_tool(tool_name, facts["arguments"])
    egress = classify_pii(raw)
    facts["egress"] = egress

    action = egress_action(facts)
    audit_log({
        "event_id": event_id,
        "phase": "egress",
        "action": action,
        "egress": egress
    })

    if action == "allow":
        return raw
    if action == "redact":
        return {"status": "redacted", "pii_types": egress["pii_types"]}
    return {"error": "Blocked by egress policy"}
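
The endpoint returns four different response shapes: an error, a pending approval, a redacted summary, or the raw tool result. A caller may want to tell them apart explicitly. The following sketch shows one way to do so. `classify_response` and its labels are hypothetical and simply follow the return statements in the server code above.

```python
def classify_response(resp: dict) -> str:
    """Map an MCP server response to a coarse outcome label."""
    if "error" in resp:
        return "refused"  # denied at ingress or blocked at egress
    status = resp.get("status")
    if status == "pending_approval":
        return "pending_approval"
    if status == "redacted":
        return "redacted"
    return "released"  # raw tool output passed the egress policy

examples = [
    {"error": "Denied by policy"},
    {"status": "pending_approval", "approval_token": "abc"},
    {"status": "redacted", "pii_types": ["location", "person_name"]},
    {"status": "executed", "customer_id": "123", "message": "..."},
]
for resp in examples:
    print(classify_response(resp))
```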

## Cell 13\. Start MCP server. Background

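Start the MCP server in the background. The agent will call this server instead of calling tools directly. Output goes to `mcp.log`, which is the first place to look if a request fails.

In [None]:
!nohup uvicorn mcp_server:app --port 3333 > mcp.log 2>&1 &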

## Cell 14\. CrewAI tool

Expose an MCP-backed tool to the agent. The agent interacts with the tool through the enforcement boundary. This keeps all calls governed and auditable.

In [None]:
%%writefile crew_tools.py
import requests
from crewai.tools import tool

MCP = "http://localhost:3333/mcp/tool/initiate_refund"

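@tool("initiate_refund")
def initiate_refund_tool(customer_id: str, amount: int):
    """Issue a refund to a customer through the MCP enforcement server."""
    # The docstring above is required: CrewAI uses it as the tool description.
    payload = {
        "agent": {
            "id": "agent-1",
            "role": "support_agent",
            "clearance": "no_pii"
        },
        "arguments": {
            "customer_id": customer_id,
            "amount": amount
        }
    }
    return requests.post(MCP, json=payload, timeout=30).json()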

## Cell 15\. CrewAI agent execution

Run the agent against the MCP tool. The pipeline will evaluate access, classify content, enforce policy, and return a safe response.

In [None]:
from crewai import Agent, Task, Crew
from crew_tools import initiate_refund_tool

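# CrewAI needs an LLM to run; this assumes an API key (for example OPENAI_API_KEY) is already set.
agent = Agent(
    role="support_agent",
    goal="Handle refunds safely",
    backstory="You process customer refunds through governed tools.",  # required by CrewAI
    tools=[initiate_refund_tool],
    verbose=True
)

task = Task(
    description="Refund customer 123 for $250",
    expected_output="A short confirmation of the refund outcome.",  # required by CrewAI
    agent=agent
)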

crew = Crew(agents=[agent], tasks=[task])
print(crew.kickoff())

## What you should see

* The tool executes internally
* PII is detected by the ML classifier
* The OPA egress policy fires to control disclosure
* The agent receives a redacted response
* audit.log contains a full decision trail

## Final mental model

Reasoning, enforcement, classification, policy, audit: this is the end-to-end path your data and decisions follow. You now have a governed, auditable agent system running entirely in Colab.