# Casys MCP Gateway - The Story

Welcome to the Casys MCP Gateway playground. This series of notebooks will teach you not just *how* to use the gateway, but *why* it exists and what problems it solves.

---

## Learning Objectives

After completing this notebook series, you will:
- [ ] Understand the MCP context explosion problem
- [ ] Know how to reduce context usage from 30-50% to <5%
- [ ] Execute workflows up to 5x faster with DAG parallelization
- [ ] Run agent-generated code safely in a sandbox
- [ ] Build adaptive workflows that learn from execution

## The Problem: MCP Doesn't Scale

The Model Context Protocol (MCP) is amazing. It lets AI agents like Claude connect to tools - GitHub, databases, file systems, browsers, and more.

But there's a problem. A big one.

### The "Invisible Tax"

The **entire tool catalog** of every connected MCP server is loaded into the LLM's context at the start of each conversation.

```
You have 8 MCP servers installed:
- GitHub (15 tools)      ~12,000 tokens
- Filesystem (8 tools)    ~6,000 tokens  
- Database (12 tools)     ~9,000 tokens
- Playwright (20 tools)  ~18,000 tokens
- Slack (10 tools)        ~8,000 tokens
- Notion (14 tools)      ~11,000 tokens
- Jira (16 tools)        ~13,000 tokens
- Custom API (5 tools)    ~4,000 tokens
─────────────────────────────────────────
TOTAL: ~81,000 tokens (just for tool schemas!)
```

With a 200K token context window, **40% is already consumed** before you even say "Hello".
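The arithmetic behind that figure can be checked in a few lines. This is a minimal sketch that reuses the illustrative token estimates from the table above; the numbers are not measurements, and the variable names are made up for this example.

```typescript
// Token estimates copied from the table above (illustrative, not measured)
const servers = [
  { name: "GitHub", tools: 15, tokens: 12_000 },
  { name: "Filesystem", tools: 8, tokens: 6_000 },
  { name: "Database", tools: 12, tokens: 9_000 },
  { name: "Playwright", tools: 20, tokens: 18_000 },
  { name: "Slack", tools: 10, tokens: 8_000 },
  { name: "Notion", tools: 14, tokens: 11_000 },
  { name: "Jira", tools: 16, tokens: 13_000 },
  { name: "Custom API", tools: 5, tokens: 4_000 },
];

const CONTEXT_WINDOW = 200_000;

const totalTokens = servers.reduce((sum, s) => sum + s.tokens, 0);
const totalTools = servers.reduce((sum, s) => sum + s.tools, 0);
const share = (totalTokens / CONTEXT_WINDOW) * 100;

// prints "100 tools, 81000 tokens, 40.5% of the window"
console.log(`${totalTools} tools, ${totalTokens} tokens, ${share.toFixed(1)}% of the window`);
```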

### The Latency Problem

Now suppose Claude needs to:
1. Read a config file
2. Parse it as JSON
3. Fetch related GitHub issues
4. Search Slack for discussions
5. Create a Jira ticket

The tool calls run **sequentially**: 5 calls × 1 second each = **5 seconds of waiting**.

But steps 3 and 4 don't depend on each other. They could run in parallel!
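To see how much that buys in this example, the sketch below models the five steps with explicit dependencies and compares the sequential total against the earliest possible finish time. The dependency edges are an assumption made for illustration (the text only states that steps 3 and 4 are independent of each other), and the step names are hypothetical.

```typescript
interface Step {
  id: string;
  dependsOn: string[];
  seconds: number;
}

// Assumed dependency chain: 1 -> 2 -> (3, 4) -> 5, with 1 second per step
const steps: Step[] = [
  { id: "read_config", dependsOn: [], seconds: 1 },
  { id: "parse_json", dependsOn: ["read_config"], seconds: 1 },
  { id: "github_issues", dependsOn: ["parse_json"], seconds: 1 },
  { id: "slack_search", dependsOn: ["parse_json"], seconds: 1 },
  { id: "jira_ticket", dependsOn: ["github_issues", "slack_search"], seconds: 1 },
];

// Sequential execution: every step waits for the previous one
const sequentialSeconds = steps.reduce((sum, s) => sum + s.seconds, 0);

// Parallel execution: a step starts as soon as all of its dependencies finish.
// This loop relies on `steps` being listed in dependency order.
const finish: Record<string, number> = {};
let parallelSeconds = 0;
for (const step of steps) {
  const start = Math.max(0, ...step.dependsOn.map((d) => finish[d] ?? 0));
  finish[step.id] = start + step.seconds;
  parallelSeconds = Math.max(parallelSeconds, finish[step.id]);
}

// prints "sequential: 5s, parallel: 4s"
console.log(`sequential: ${sequentialSeconds}s, parallel: ${parallelSeconds}s`);
```

Under these assumptions only one second is saved, because most of this workflow is a chain. The gain grows with the number of independent branches.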

## The Solution: Casys MCP Gateway

Casys MCP Gateway is an intelligent MCP gateway that solves both problems:

### 1. Context Optimization (30-50% → <5%)

Instead of loading ALL tool schemas upfront, the gateway:
- **Indexes** all tools using vector embeddings (semantic search)
- **Loads on-demand** only the relevant tools for each query
- **Result**: 100 tools available, but only 3-5 loaded per request
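The idea of loading only the relevant tools can be sketched with a toy example. The code below is not the gateway's implementation: it swaps real embeddings for simple word counts, and the catalog entries, `embed`, `cosine`, and `search` are hypothetical names chosen for illustration.

```typescript
interface Tool {
  name: string;
  description: string;
}

// Hypothetical catalog entries, for illustration only
const catalog: Tool[] = [
  { name: "read_file", description: "read a file from the local filesystem" },
  { name: "create_issue", description: "create a new issue in a github repository" },
  { name: "search_messages", description: "search slack messages and channels" },
  { name: "run_query", description: "run a sql query against the database" },
];

// Toy "embedding": word counts. A real system would call an embedding model.
function embed(text: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (const word of text.toLowerCase().split(/\W+/)) {
    if (word) counts.set(word, (counts.get(word) ?? 0) + 1);
  }
  return counts;
}

// Cosine similarity between two sparse vectors
function cosine(a: Map<string, number>, b: Map<string, number>): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((value, key) => {
    dot += value * (b.get(key) ?? 0);
    normA += value * value;
  });
  b.forEach((value) => {
    normB += value * value;
  });
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Index once, then return only the k best-matching tools per query
const index = catalog.map((tool) => ({ tool, vector: embed(tool.description) }));

function search(query: string, k: number): Tool[] {
  const q = embed(query);
  return index
    .map((entry) => ({ tool: entry.tool, score: cosine(q, entry.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.tool);
}

console.log(search("create github issue", 1).map((t) => t.name)); // [ "create_issue" ]
```

Only the schemas returned by `search` would be placed in the LLM's context; the rest of the catalog stays in the index.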

### 2. DAG Execution (Sequential → Parallel)

The gateway analyzes tool dependencies and:
- **Builds a DAG** (Directed Acyclic Graph) of the workflow
- **Parallelizes** independent branches
- **Streams results** as they complete
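A minimal sketch of that idea, assuming the workflow is given as a map from task name to its dependencies. `toLayers` and `runDag` are hypothetical helpers, not the gateway's API, and running layer by layer is a simplification: a full scheduler would start each task as soon as its own dependencies finish.

```typescript
type Deps = Record<string, string[]>;

// Group tasks into layers: every task in a layer depends only on earlier layers
function toLayers(deps: Deps): string[][] {
  const remaining = new Set(Object.keys(deps));
  const done = new Set<string>();
  const layers: string[][] = [];
  while (remaining.size > 0) {
    const ready = Array.from(remaining).filter((id) =>
      deps[id].every((d) => done.has(d))
    );
    if (ready.length === 0) throw new Error("cycle or unknown dependency");
    for (const id of ready) {
      remaining.delete(id);
      done.add(id);
    }
    layers.push(ready);
  }
  return layers;
}

// Run the tasks of each layer concurrently; layers run one after another
async function runDag(deps: Deps, run: (id: string) => Promise<unknown>): Promise<void> {
  for (const layer of toLayers(deps)) {
    await Promise.all(layer.map((id) => run(id)));
  }
}

// Assumed dependencies for the five-step example from the latency section
const workflow: Deps = {
  read_config: [],
  parse_json: ["read_config"],
  github_issues: ["parse_json"],
  slack_search: ["parse_json"],
  jira_ticket: ["github_issues", "slack_search"],
};

console.log(toLayers(workflow));
// [ ["read_config"], ["parse_json"], ["github_issues", "slack_search"], ["jira_ticket"] ]

// Stand-in for a real tool call: logs each task as it completes
runDag(workflow, async (id) => {
  console.log(`finished ${id}`);
}).catch(console.error);
```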

### 3. Sandbox Execution (Safe code running)

For complex data processing, agents can:
- **Write TypeScript code** to process data locally
- **Execute in isolation** with strict permissions
- **Return summaries** instead of raw data (1MB → 1KB)
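The third point is the one that saves context. The sketch below shows the kind of code an agent might write for the sandbox: it reduces a large result to a small summary before anything is returned to the LLM. The data is synthetic, `LogEntry` and `summarize` are hypothetical names, and the isolation itself is not shown here.

```typescript
interface LogEntry {
  level: "info" | "warn" | "error";
  message: string;
}

// Reduce a large result set to counts plus a few sample errors
function summarize(entries: LogEntry[]) {
  const counts = { info: 0, warn: 0, error: 0 };
  for (const entry of entries) counts[entry.level] += 1;
  const firstErrors = entries
    .filter((entry) => entry.level === "error")
    .slice(0, 3)
    .map((entry) => entry.message);
  return { total: entries.length, counts, firstErrors };
}

// Synthetic data standing in for a large tool result
const entries: LogEntry[] = Array.from({ length: 10_000 }, (_, i): LogEntry => ({
  level: i % 100 === 0 ? "error" : i % 10 === 0 ? "warn" : "info",
  message: `event ${i}`,
}));

const summary = summarize(entries);
const rawSize = JSON.stringify(entries).length;
const summarySize = JSON.stringify(summary).length;

console.log(summary);
console.log(`raw: ${rawSize} characters, summary: ${summarySize} characters`);
```

Only `summary` would cross the sandbox boundary; the raw entries never reach the model's context.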

### 4. GraphRAG Learning (Gets smarter over time)

The gateway remembers which tools are used together:
- **Tracks patterns**: "read_file" is followed by "parse_json" 85% of the time
- **Recommends tools** based on graph relationships
- **Predicts next actions** for speculative execution
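Pattern tracking of this kind can be illustrated with plain frequency counting. This is a deliberately simplified stand-in, not the gateway's GraphRAG implementation; `ToolGraph`, `observe`, and `predictNext` are hypothetical names, and the observed sequences are invented.

```typescript
class ToolGraph {
  // edges.get(a).get(b) = how often tool b was called right after tool a
  private edges = new Map<string, Map<string, number>>();

  // Record one observed sequence of tool calls
  observe(sequence: string[]): void {
    for (let i = 0; i + 1 < sequence.length; i++) {
      const from = sequence[i];
      const to = sequence[i + 1];
      const next = this.edges.get(from) ?? new Map<string, number>();
      next.set(to, (next.get(to) ?? 0) + 1);
      this.edges.set(from, next);
    }
  }

  // Most frequent follower of `tool` with its observed share, if any
  predictNext(tool: string): { tool: string; probability: number } | undefined {
    const next = this.edges.get(tool);
    if (!next) return undefined;
    let total = 0;
    let best = "";
    let bestCount = 0;
    next.forEach((count, name) => {
      total += count;
      if (count > bestCount) {
        best = name;
        bestCount = count;
      }
    });
    return { tool: best, probability: bestCount / total };
  }
}

const graph = new ToolGraph();
for (let i = 0; i < 3; i++) {
  graph.observe(["read_file", "parse_json", "create_issue"]);
}
graph.observe(["read_file", "search_messages"]);

console.log(graph.predictNext("read_file")); // { tool: "parse_json", probability: 0.75 }
```

A prediction like this is what would let the gateway start the likely next tool call speculatively, before the agent asks for it.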

## The Journey

This playground takes you through the gateway step by step:

| Notebook | What You'll Learn |
|----------|-------------------|
| **01-the-problem** | See the context explosion and latency issues live |
| **02-context-optimization** | Reduce context usage with vector search |
| **03-dag-execution** | Parallelize workflows with DAG |
| **04-sandbox-security** | Execute code safely in isolation |
| **05-context-injection** | Pass data into the sandbox |
| **06-mcp-gateway** | Connect to MCP servers through the gateway |
| **07-graphrag-learning** | See how the system learns patterns |
| **08-adaptive-workflows** | Build workflows that adapt at runtime |
| **09-speculative-execution** | Achieve near-zero perceived latency |

Each notebook:
- Starts with **why** (the problem it solves)
- Shows **how** (with runnable examples)
- Ends with a **checkpoint** to verify understanding

## Quick Start

Before diving in, let's verify your environment is ready:

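```typescript
// Check the Deno runtime and Casys MCP Gateway availability
console.log(`Deno version: ${Deno.version.deno}`);
console.log(`TypeScript version: ${Deno.version.typescript}`);
console.log(`V8 version: ${Deno.version.v8}`);

// Verify we can import gateway modules
try {
  const { DenoSandboxExecutor } = await import("../../src/sandbox/executor.ts");
  // A successful import is not enough: make sure the expected export exists
  if (!DenoSandboxExecutor) {
    throw new Error("DenoSandboxExecutor export not found");
  }
  console.log("\n✅ Casys MCP Gateway modules loaded successfully");
  console.log("\nYou're ready to start the journey!");
} catch (e) {
  // `e` is typed `unknown` in strict TypeScript, so narrow it before reading `.message`
  const message = e instanceof Error ? e.message : String(e);
  console.error("\n❌ Error loading gateway:", message);
  console.log("\nMake sure you're running from the project root.");
}
```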

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────┐
│                    Claude Code (Client)                      │
└────────────────────────────┬────────────────────────────────┘
                             │ MCP Protocol
                             ▼
┌─────────────────────────────────────────────────────────────┐
│                  Casys MCP Gateway                           │
│  ┌─────────────────────────────────────────────────────┐    │
│  │  Vector Search    │  Load only relevant tools       │    │
│  ├─────────────────────────────────────────────────────┤    │
│  │  DAG Executor     │  Parallelize independent tasks  │    │
│  ├─────────────────────────────────────────────────────┤    │
│  │  Sandbox          │  Safe code execution            │    │
│  ├─────────────────────────────────────────────────────┤    │
│  │  GraphRAG         │  Learn and predict patterns     │    │
│  └─────────────────────────────────────────────────────┘    │
│                                                              │
│  Storage: PGlite (single portable .db file)                  │
└────────────────────────────┬────────────────────────────────┘
                             │ MCP Protocol
                             ▼
┌─────────────────────────────────────────────────────────────┐
│           MCP Servers (GitHub, Slack, DB, ...)              │
└─────────────────────────────────────────────────────────────┘
```

## Ready?

Let's see the problem in action.

**Next:** [01-the-problem.ipynb](./01-the-problem.ipynb) - Watch context explode and latency accumulate