Architecture

System Overview

graph TB
    Agent["AI Agent<br/>(Claude, Cursor, Copilot)"]
    
    Agent -->|MCP Protocol<br/>stdio or SSE| Server["MCP Server<br/>(FastMCP)"]
    
    Server -->|Register Tools| Tools["20 Built-in Tools<br/>+ Custom Tools"]
    Server -->|Route Requests| Executor["Job Executor<br/>(Sync/Async Hybrid)"]
    Server -->|Track Sessions| Sessions["Session Manager<br/>(Per-user Isolation)"]
    Server -->|Validate Code| Security["Security Validator<br/>(Blocklist, Sanitize)"]
    Server -->|Format Results| Formatter["Result Formatter<br/>(Text, Plotly, Images)"]
    
    Executor -->|Acquire/Release| Pool["Engine Pool Manager<br/>(Elastic Scaling)"]
    Pool -->|Start/Stop/Execute| Engines["MATLAB Engines<br/>(2020b+)"]
    
    Executor -->|Track Jobs| Tracker["Job Tracker<br/>(Metadata Store)"]
    
    Server -->|Optional| Monitor["Monitoring Stack<br/>(Collector, Store, Dashboard)"]
    
    Engines -->|Figure Props| PlotlyConvert["Plotly Converter<br/>(Style Mapping, WebGL)"]
    PlotlyConvert -->|JSON + PNG| Formatter

Major Components

MCP Server (`server.py`)

The entry point orchestrating all subsystems:

Tool Registration: Exposes 20 built-in tools + dynamic custom tools from YAML
Lifecycle Management: Server startup (pool init, monitoring setup), graceful shutdown, job draining
Request Routing: Delegates tool calls to implementation modules
Background Tasks: Health checks, session cleanup, metrics collection
Transport Support: stdio (single session) and SSE (multi-session with HTTP)

Engine Pool Manager (`pool/manager.py`)

Manages an elastic pool of MATLAB engine instances:

Elastic Scaling: Starts with min_engines (default 2), scales up to max_engines (default 8) under load
Proactive Warmup: When pool utilization exceeds proactive_warmup_threshold (80%), starts a new engine preemptively
Idle Scale-Down: Engines idle longer than scale_down_idle_timeout (15 minutes) are stopped, down to min_engines
Health Checks: Periodic 1+1 eval every 30 seconds to verify engines are responsive; unhealthy engines replaced
Work Queue: Async queue for requests when all engines are busy; configurable max queue depth
Workspace Reset: Optional clear all between sessions for isolation
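
The scaling rules above can be read as two small decisions: when to start an extra engine, and how many idle engines may be stopped. The sketch below is illustrative only; `PoolConfig`, `should_warm_up`, and `engines_to_stop` are hypothetical names, and only the default values come from the list above.

```python
from dataclasses import dataclass


@dataclass
class PoolConfig:
    # Defaults mirror the values quoted above; the field layout is an assumption.
    min_engines: int = 2
    max_engines: int = 8
    proactive_warmup_threshold: float = 0.8
    scale_down_idle_timeout: float = 15 * 60  # seconds


def should_warm_up(busy: int, total: int, cfg: PoolConfig) -> bool:
    """Return True when a new engine should be started preemptively."""
    if total >= cfg.max_engines:
        return False  # already at the ceiling
    if total == 0:
        return True  # empty pool: always start one
    return busy / total > cfg.proactive_warmup_threshold


def engines_to_stop(idle_seconds: list[float], total: int, cfg: PoolConfig) -> int:
    """Count idle engines that can be stopped without dropping below min_engines."""
    expired = sum(1 for s in idle_seconds if s > cfg.scale_down_idle_timeout)
    return max(0, min(expired, total - cfg.min_engines))
```

Keeping both decisions as pure functions of counts and timings makes them easy to test without starting a MATLAB engine.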

Engine Wrapper (`pool/engine.py`)

Wraps a single matlab.engine instance with lifecycle and state management:

States: STOPPED, STARTING, IDLE, BUSY (tracked via EngineState enum)
Lifecycle: start() applies config (paths, startup commands), stop() calls quit()
Execution: Sync eval or background eval with concurrent.futures.Future wrapping
Health: health_check() runs a trivial eval; is_alive property
Workspace: Optional reset to clear variables between sessions
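
As a rough illustration of the state tracking described above, the four states can be modeled with an enum and a transition table. The state names come from the list above; the set of allowed transitions and the `EngineStateMachine` class are assumptions made for this sketch, and the actual wrapper may permit other transitions.

```python
from enum import Enum, auto


class EngineState(Enum):
    STOPPED = auto()
    STARTING = auto()
    IDLE = auto()
    BUSY = auto()


# Assumed legal transitions; the real wrapper may allow others.
ALLOWED = {
    EngineState.STOPPED: {EngineState.STARTING},
    EngineState.STARTING: {EngineState.IDLE, EngineState.STOPPED},
    EngineState.IDLE: {EngineState.BUSY, EngineState.STOPPED},
    EngineState.BUSY: {EngineState.IDLE, EngineState.STOPPED},
}


class EngineStateMachine:
    """Tracks one engine's state and rejects transitions not listed in ALLOWED."""

    def __init__(self):
        self.state = EngineState.STOPPED

    def transition(self, new_state):
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"illegal transition {self.state.name} -> {new_state.name}"
            )
        self.state = new_state
```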

Job Executor (`jobs/executor.py`)

Orchestrates the full lifecycle of MATLAB code execution with a hybrid sync/async model:

Security Check: Scans code for blocked functions (handled by SecurityValidator)
Job Creation: Registers job in tracker with initial PENDING state
Context Injection: Adds __mcp_job_id__ and __mcp_temp_dir__ to workspace
Synchronous Execution: Runs code, waits up to sync_timeout (default 30 seconds)
Async Promotion: If timeout exceeded, promotes to background execution, returns job_id
Result Capture: Serializes workspace variables, stdout/stderr, execution time, figures
Monitoring Events: Emits events for metrics collection
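
The sync-then-promote step can be sketched with the standard library alone. Here a thread pool stands in for the MATLAB engine's background eval, and `execute`, `get_job_result`, and the `_background` registry are hypothetical names for illustration, not the project's API.

```python
import time
import uuid
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

_workers = ThreadPoolExecutor(max_workers=4)
_background = {}  # job_id -> Future for jobs promoted to the background


def execute(fn, sync_timeout=30.0):
    """Run fn; return its result, or a job_id if it outlives sync_timeout."""
    job_id = uuid.uuid4().hex[:8]
    future = _workers.submit(fn)
    try:
        # Wait synchronously first; most calls finish well inside the window.
        output = future.result(timeout=sync_timeout)
        return {"status": "completed", "job_id": job_id, "output": output}
    except FutureTimeout:
        # Promotion: keep the future running and hand back an ID to poll.
        _background[job_id] = future
        return {"status": "running", "job_id": job_id}


def get_job_result(job_id):
    """Poll a promoted job; returns the output once the future has finished."""
    future = _background[job_id]
    if not future.done():
        return {"status": "running", "job_id": job_id}
    return {"status": "completed", "job_id": job_id, "output": future.result()}
```

The work is always submitted as a future, so promotion does not restart anything: the caller simply stops waiting and polls later.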

Job Tracker (`jobs/tracker.py`)

Thread-safe in-memory registry of all jobs:

CRUD: Create, get, list jobs filtered by session
Lifecycle States: PENDING → RUNNING → COMPLETED/FAILED/CANCELLED
Metadata: Job ID, session ID, code, engine assignment, timestamps, elapsed time
Pruning: Periodic cleanup removes jobs older than job_retention_seconds (default 86400 = 24h)
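
A minimal sketch of such a registry, assuming a single lock around a dict is sufficient. The class and method names are illustrative, and the pruning rule here looks only at creation time, which may differ from the real tracker.

```python
import threading
import time
import uuid
from dataclasses import dataclass, field
from enum import Enum


class JobStatus(Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"


@dataclass
class Job:
    session_id: str
    code: str
    job_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    status: JobStatus = JobStatus.PENDING
    created_at: float = field(default_factory=time.time)


class JobTracker:
    def __init__(self, job_retention_seconds=86400):
        self._jobs = {}
        self._lock = threading.Lock()  # guards every access to _jobs
        self._retention = job_retention_seconds

    def create_job(self, session_id, code):
        job = Job(session_id=session_id, code=code)
        with self._lock:
            self._jobs[job.job_id] = job
        return job

    def get_job(self, job_id):
        with self._lock:
            return self._jobs.get(job_id)

    def list_jobs(self, session_id):
        with self._lock:
            return [j for j in self._jobs.values() if j.session_id == session_id]

    def prune(self, now=None):
        """Drop jobs older than the retention window; return how many were removed."""
        cutoff = (time.time() if now is None else now) - self._retention
        with self._lock:
            stale = [jid for jid, j in self._jobs.items() if j.created_at < cutoff]
            for jid in stale:
                del self._jobs[jid]
        return len(stale)
```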

Session Manager (`session/manager.py`)

Per-user session isolation for multi-user (SSE) deployments:

Session Creation: Each session gets a unique ID and its own temp directory
Default Session: stdio transport uses a singleton "default" session
Timeout: Sessions expire after session_timeout (default 3600s); cleanup on access
Limits: Enforces max_sessions (default 10) per server
Workspace Isolation: Optional clear all between sessions
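
The expiry-on-access and session limit can be sketched as follows. This is an assumption-laden illustration: `SessionManager.get_or_create`, the `now` parameter (added to make the timing testable), and the temp directory prefix are all hypothetical.

```python
import shutil
import tempfile
import time
import uuid
from dataclasses import dataclass, field


@dataclass
class Session:
    session_id: str
    temp_dir: str
    last_active: float = field(default_factory=time.monotonic)


class SessionManager:
    def __init__(self, max_sessions=10, session_timeout=3600.0):
        self.max_sessions = max_sessions
        self.session_timeout = session_timeout
        self._sessions = {}

    def _expire(self, now):
        expired = [sid for sid, s in self._sessions.items()
                   if now - s.last_active > self.session_timeout]
        for sid in expired:
            # Remove the session together with its private temp directory.
            shutil.rmtree(self._sessions.pop(sid).temp_dir, ignore_errors=True)

    def get_or_create(self, session_id=None, now=None):
        now = time.monotonic() if now is None else now
        self._expire(now)  # cleanup happens on access, not on a timer
        sid = session_id or uuid.uuid4().hex
        session = self._sessions.get(sid)
        if session is None:
            if len(self._sessions) >= self.max_sessions:
                raise RuntimeError("max_sessions reached")
            session = Session(sid, tempfile.mkdtemp(prefix="mcp_session_"))
            self._sessions[sid] = session
        session.last_active = now
        return session
```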

Security Validator (`security/validator.py`)

Pre-execution security hardening:

Function Blocklist: Detects dangerous patterns: system(), unix(), dos(), ! (shell escape), eval(), feval(), evalc(), evalin(), assignin(), perl(), python()
Smart Scanning: Strips string literals ('...' and "...") and comments before regex matching to avoid false positives
Filename Sanitization: Validates upload filenames to prevent path traversal (../../../etc/passwd)
Size Limits: Enforces max_upload_size_mb (default 100MB)
Customizable: Blocklist and enabled/disabled modes configurable via YAML
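
To illustrate the strip-then-scan idea, the sketch below removes string literals and `%` comments in one left-to-right pass and then searches for blocked calls. It is a simplified heuristic, not the validator's actual regexes: it treats a single quote that follows an identifier, closing bracket, dot, or quote as MATLAB's transpose operator, it ignores `...` continuation comments, and it does not catch command syntax such as `system ls`. The function names are hypothetical.

```python
import re

BLOCKED = ["system", "unix", "dos", "eval", "feval", "evalc",
           "evalin", "assignin", "perl", "python"]

# Double-quoted strings, single-quoted char arrays, and % comments.
# The lookbehind treats a quote right after an identifier, closing bracket,
# dot, or quote as the transpose operator rather than a string start.
_LITERALS = re.compile(
    r'"(?:[^"\n]|"")*"'
    r"|(?<![\w)\]}.'])'(?:[^'\n]|'')*'"
    r"|%[^\n]*"
)
_CALL = re.compile(r"\b(" + "|".join(BLOCKED) + r")\s*\(")
_SHELL = re.compile(r"^\s*!", re.MULTILINE)


def strip_literals(code):
    """Replace string literals with '' and drop comments."""
    def repl(match):
        return "" if match.group(0).startswith("%") else "''"
    return _LITERALS.sub(repl, code)


def find_blocked(code):
    """Return the blocked names called in code, plus '!' for a shell escape."""
    stripped = strip_literals(code)
    hits = sorted({m.group(1) for m in _CALL.finditer(stripped)})
    if _SHELL.search(stripped):
        hits.append("!")
    return hits
```

Stripping in a single pass matters: a `%` inside a string is not a comment, and a quote inside a comment does not open a string, so whichever construct starts first wins.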

Result Formatter (`output/formatter.py`)

Structures tool responses for the MCP protocol:

Text Formatting: Truncates output exceeding max_output_chars (default 50000), saves overflow to file
Variable Display: Summarizes workspace variables with type/shape info, hides large arrays for brevity
Error Handling: Builds error response dicts with exception messages and stack traces
Delegates: Hands off to Plotly converter and thumbnail generator for rich media
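
Truncation with overflow-to-file might look like the following sketch. `truncate_output` and its return shape are assumptions for illustration; only the `max_output_chars` default comes from the list above.

```python
import os
import tempfile


def truncate_output(text, max_output_chars=50000, overflow_dir=None):
    """Return (visible_text, overflow_path); overflow_path is None if nothing was cut."""
    if len(text) <= max_output_chars:
        return text, None
    fd, path = tempfile.mkstemp(suffix=".txt", dir=overflow_dir, text=True)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.write(text)  # keep the full output so nothing is lost
    cut = len(text) - max_output_chars
    notice = f"\n... [truncated {cut} chars; full output saved to {path}]"
    return text[:max_output_chars] + notice, path
```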

Plotly Converter (`output/plotly_convert.py`, `output/plotly_style_mapper.py`)

Converts MATLAB figures to interactive Plotly JSON:

MATLAB-side (mcp_extract_props.m): Extracts raw figure structure (axes, traces, styling, limits, labels)
Python-side: Maps MATLAB line styles, marker symbols, colormaps, fonts to Plotly equivalents
WebGL Optimization: Automatically uses WebGL for line/scatter traces with >10,000 points
Output: Plotly JSON + static PNG thumbnail + optional base64-encoded image
Subplot Support: Handles single axes, rectangular grids, and tiled layouts with domain splitting
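
The Python-side mapping and the WebGL switch can be illustrated with plain dicts. `XData`, `YData`, `LineStyle`, and `Marker` are standard MATLAB line properties and `scatter`/`scattergl` are Plotly trace types, but the input dict layout, the `line_to_trace` helper, and the partial mapping tables are assumptions about what `mcp_extract_props.m` produces.

```python
LINE_STYLES = {"-": "solid", "--": "dash", ":": "dot", "-.": "dashdot"}
MARKERS = {"o": "circle", "s": "square", "d": "diamond",
           "^": "triangle-up", "v": "triangle-down", "x": "x"}
WEBGL_THRESHOLD = 10_000


def line_to_trace(props):
    """Convert one extracted MATLAB line (a dict) to a Plotly trace dict."""
    x, y = props["XData"], props["YData"]
    has_line = props.get("LineStyle", "-") != "none"
    marker = MARKERS.get(props.get("Marker", "none"))
    parts = [name for name, on in (("lines", has_line), ("markers", marker)) if on]
    trace = {
        # Switch to the WebGL trace type only above the point threshold.
        "type": "scattergl" if len(x) > WEBGL_THRESHOLD else "scatter",
        "mode": "+".join(parts) or "none",
        "x": list(x),
        "y": list(y),
    }
    if has_line:
        trace["line"] = {"dash": LINE_STYLES.get(props.get("LineStyle", "-"), "solid")}
    if marker:
        trace["marker"] = {"symbol": marker}
    return trace
```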

Monitoring Stack (Optional)

When monitoring.enabled: true:

MetricsCollector (monitoring/collector.py): Aggregates counters (jobs, sessions, errors), execution time ring buffer (p95), pool/system metrics
MetricsStore (monitoring/store.py): Async SQLite database for time-series metrics and structured events
Health Evaluator (monitoring/health.py): Classifies server health (healthy/degraded/unhealthy) based on utilization, error rate, capacity
HTTP Dashboard (monitoring/dashboard.py, static/): Starlette sub-app serving /health, /metrics, /dashboard with real-time Plotly charts
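
Two of these pieces are small enough to sketch: a ring buffer that yields a p95 execution time, and a health classifier. The class and function names are hypothetical, the percentile uses the nearest-rank method (the real collector may compute it differently), and the thresholds are illustrative assumptions rather than the project's values.

```python
from collections import deque


class ExecutionTimes:
    """Fixed-size ring buffer of recent execution times."""

    def __init__(self, capacity=1000):
        self._samples = deque(maxlen=capacity)  # oldest samples fall off

    def record(self, seconds):
        self._samples.append(seconds)

    def p95(self):
        if not self._samples:
            return None
        ordered = sorted(self._samples)
        # Nearest-rank percentile, using integer math for ceil(0.95 * n).
        rank = (95 * len(ordered) + 99) // 100
        return ordered[rank - 1]


def classify_health(utilization, error_rate):
    # Thresholds are illustrative assumptions, not the project's actual values.
    if utilization >= 0.95 or error_rate >= 0.20:
        return "unhealthy"
    if utilization >= 0.80 or error_rate >= 0.05:
        return "degraded"
    return "healthy"
```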

Data Flow Diagrams

Synchronous Code Execution

sequenceDiagram
    participant Agent
    participant Server
    participant Pool
    participant Engine
    participant Tracker
    participant Executor
    
    Agent->>Server: execute_code("x = magic(3)")
    Server->>Executor: execute(session_id, code)
    
    Executor->>Tracker: create_job()
    Tracker-->>Executor: job_id
    
    Executor->>Pool: acquire_engine()
    Pool-->>Executor: engine
    Executor->>Engine: workspace["__mcp_job_id__"] = job_id
    Executor->>Engine: eval(code, timeout=30s)
    Engine-->>Executor: result
    
    Executor->>Tracker: mark_completed(result)
    Executor->>Pool: release_engine()
    
    Executor-->>Server: {status: "completed", output: "...", time: 0.23}
    Server-->>Agent: MCP response

Async Job Promotion

sequenceDiagram
    participant Agent
    participant Server
    participant Executor
    participant Pool
    participant Engine
    participant Tracker
    
    Agent->>Server: execute_code("long_simulation()")
    Server->>Executor: execute(session_id, code)
    
    Executor->>Tracker: create_job()
    Executor->>Pool: acquire_engine()
    Executor->>Engine: eval(code, background=True)
    Engine-->>Executor: Future (running)
    
    Note over Executor: Timeout (30s) exceeded
    
    Executor->>Tracker: mark_running(job)
    Executor->>Pool: release_engine()
    Executor-->>Server: {status: "running", job_id: "abc123"}
    Server-->>Agent: MCP response
    
    Note over Engine: Code continues in background
    
    rect rgb(200, 200, 200)
    Agent->>Server: get_job_status("abc123")
    Server->>Tracker: get_job("abc123")
    Tracker-->>Server: {status: "running", progress: 45%}
    Server-->>Agent: {progress: 45%, elapsed: 120s}
    end
    
    rect rgb(200, 200, 200)
    Agent->>Server: get_job_result("abc123")
    Server->>Executor: fetch_result("abc123")
    Executor->>Engine: result
    Engine-->>Executor: workspace snapshot
    Executor-->>Server: full result
    Server-->>Agent: {status: "completed", output: "...", ...}
    end

Multi-User SSE Flow

sequenceDiagram
    participant Agent1
    participant Agent2
    participant Server
    participant SessionMgr
    participant Pool
    
    Agent1->>Server: POST /mcp with session_id="user1"
    Server->>SessionMgr: get_session("user1")
    SessionMgr-->>Server: Session(temp_dir="/tmp/user1_abc123")
    
    Agent2->>Server: POST /mcp with session_id="user2"
    Server->>SessionMgr: get_session("user2")
    SessionMgr-->>Server: Session(temp_dir="/tmp/user2_def456")
    
    Agent1->>Server: execute_code("x = 1", session_id="user1")
    Agent2->>Server: execute_code("y = 2", session_id="user2")
    
    Server->>Pool: acquire_engine()
    Pool-->>Server: engine1
    
    Server->>Pool: acquire_engine()
    Pool-->>Server: engine2
    
    Note over Server: Both run in parallel<br/>in separate engines

Design Decisions & Trade-offs

1. Sync-First with Async Fallback

Decision: Execute code synchronously by default; promote to async on timeout.

Rationale:

Simplicity: Most code completes fast (< 5s); sync results are simpler for agents
Responsiveness: No polling overhead for typical queries
Async Support: Long-running code (Monte Carlo, large matrix ops) still supported via promotion

Trade-off: Timeout must be tuned per deployment. Too short → unnecessary promotion; too long → slow response to agents.

2. Elastic Pool with Proactive Warmup

Decision: Scale engines on-demand up to max_engines, with proactive startup at 80% utilization.

Rationale:

Resource Efficiency: Don't pre-allocate every engine up to max_engines if only 2 are needed
Responsiveness: Proactive warmup avoids startup latency when load spikes
Stability: Idle scale-down prevents resource bloat over long runs

Trade-off: Adds complexity (pool state machine); startup latency for first request in a cold pool (~2-3s).

3. Per-Session Temp Directories

Decision: Each session gets a unique temp directory; files isolated across sessions.

Rationale:

Multi-User Safety: Prevents agent A from reading agent B's uploaded files
Cleanup: Temp dirs deleted on session timeout; no manual cleanup needed

Trade-off: Requires session management in SSE mode; stdio mode is simpler (single "default" session).

4. Plotly Over MATLAB Native Figures

Decision: Convert MATLAB figures to Plotly JSON for agent-side rendering.

Rationale:

Interactivity: Agents can zoom/pan/hover; better for exploration
Portability: Plotly JSON is lightweight and renders on the client, so the server does not need to ship a full-size PNG for every figure
WebGL: Automatic acceleration for large datasets (10k+ points)

Trade-off: Some MATLAB figure features (custom uicontrols, special objects) don't convert cleanly. Fallback to PNG.

5. Configuration-Driven Security

Decision: Blocklist configurable in YAML; can be customized or disabled.

Rationale:

Flexibility: Different deployments have different risk tolerances
Auditable: Blocklist visible in config; easy to review what's blocked

Trade-off: Operators must actively decide on security posture; no "safe by default."

6. Optional Monitoring with SQLite Backend

Decision: Metrics are optional; when enabled, stored in async SQLite.

Rationale:

Zero Overhead: If disabled, no background threads or I/O
Low Footprint: SQLite requires no external service (Prometheus, InfluxDB)
Async: Database writes don't block execution threads

Trade-off: SQLite handles on the order of thousands of metric writes per second; high-volume deployments may need an external time-series database.

Languages & Technology Stack

Python 3.10+: Core server, async I/O with asyncio, FastMCP framework
MATLAB R2020b+: Code execution via the matlab.engine Python API
JavaScript: Client-side dashboard with Plotly charting
MATLAB Helper Scripts: mcp_extract_props.m for figure conversion, mcp_checkcode.m for linting, mcp_progress.m for progress reporting
SQLite (Optional): Metrics storage when monitoring enabled
FastMCP: MCP protocol abstraction (stdio/SSE)
Starlette (Optional): HTTP sub-app for monitoring dashboard
Pydantic: Configuration validation and serialization
