
Architecture

github-actions[bot] edited this page Mar 22, 2026 · 20 revisions


System Overview

graph TB
    Agent["AI Agent<br/>(Claude, Cursor, Copilot)"]
    
    Agent -->|MCP Protocol<br/>stdio or SSE| Server["MCP Server<br/>(FastMCP)"]
    
    Server -->|Register Tools| Tools["20 Built-in Tools<br/>+ Custom Tools"]
    Server -->|Route Requests| Executor["Job Executor<br/>(Sync/Async Hybrid)"]
    Server -->|Track Sessions| Sessions["Session Manager<br/>(Per-user Isolation)"]
    Server -->|Validate Code| Security["Security Validator<br/>(Blocklist, Sanitize)"]
    Server -->|Format Results| Formatter["Result Formatter<br/>(Text, Plotly, Images)"]
    
    Executor -->|Acquire/Release| Pool["Engine Pool Manager<br/>(Elastic Scaling)"]
    Pool -->|Start/Stop/Execute| Engines["MATLAB Engines<br/>(2020b+)"]
    
    Executor -->|Track Jobs| Tracker["Job Tracker<br/>(Metadata Store)"]
    
    Server -->|Optional| Monitor["Monitoring Stack<br/>(Collector, Store, Dashboard)"]
    
    Engines -->|Figure Props| PlotlyConvert["Plotly Converter<br/>(Style Mapping, WebGL)"]
    PlotlyConvert -->|JSON + PNG| Formatter

Major Components

MCP Server (server.py)

The entry point orchestrating all subsystems:

  • Tool Registration: Exposes 20 built-in tools + dynamic custom tools from YAML
  • Lifecycle Management: Server startup (pool init, monitoring setup), graceful shutdown, job draining
  • Request Routing: Delegates tool calls to implementation modules
  • Background Tasks: Health checks, session cleanup, metrics collection
  • Transport Support: stdio (single session) and SSE (multi-session with HTTP)

Engine Pool Manager (pool/manager.py)

Manages an elastic pool of MATLAB engine instances:

  • Elastic Scaling: Starts with min_engines (default 2), scales up to max_engines (default 8) under load
  • Proactive Warmup: When pool utilization exceeds proactive_warmup_threshold (80%), starts a new engine preemptively
  • Idle Scale-Down: Engines idle longer than scale_down_idle_timeout (15 minutes) are stopped, down to min_engines
  • Health Checks: Evaluates 1+1 on each engine every 30 seconds to verify it is responsive; unhealthy engines are replaced
  • Work Queue: Async queue for requests when all engines are busy; configurable max queue depth
  • Workspace Reset: Optional clear all between sessions for isolation
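
The scaling rules above can be sketched as a pure decision function. This is a minimal illustration, not the code in pool/manager.py: `PoolSnapshot` and `plan_scaling` are hypothetical names, and the rule of starting at most one engine per pass is an assumption. Only the parameter names and defaults come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolSnapshot:
    """Hypothetical view of the pool at one instant."""
    total: int            # engines currently started
    busy: int             # engines executing code
    idle_seconds: tuple   # idle time of each non-busy engine, in seconds


def plan_scaling(snap, min_engines=2, max_engines=8,
                 proactive_warmup_threshold=0.8,
                 scale_down_idle_timeout=15 * 60):
    """Return (engines_to_start, engines_to_stop) for one scaling pass."""
    utilization = snap.busy / snap.total if snap.total else 1.0

    # Scale up: warm one engine once utilization exceeds the threshold,
    # unless the pool is already at max_engines.
    if utilization > proactive_warmup_threshold and snap.total < max_engines:
        return 1, 0

    # Scale down: stop engines idle past the timeout, never below min_engines.
    expired = sum(1 for s in snap.idle_seconds if s > scale_down_idle_timeout)
    removable = max(0, snap.total - min_engines)
    return 0, min(expired, removable)
```

With the defaults, a pool of 5 engines with all 5 busy yields `(1, 0)`, while a pool of 4 idle engines, 3 of them idle for more than 15 minutes, yields `(0, 2)` because the pool never shrinks below `min_engines`.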

Engine Wrapper (pool/engine.py)

Wraps a single matlab.engine instance with lifecycle and state management:

  • States: STOPPED, STARTING, IDLE, BUSY (tracked via EngineState enum)
  • Lifecycle: start() applies config (paths, startup commands), stop() calls quit()
  • Execution: Sync eval or background eval with concurrent.futures.Future wrapping
  • Health: health_check() runs a trivial eval; is_alive property
  • Workspace: Optional reset to clear variables between sessions
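
A rough sketch of the state handling described above. The `EngineState` members come from the list; everything else is an assumption for illustration: `engine_factory` stands in for whatever starts MATLAB, `execute` is a hypothetical method name, and `FakeEngine` is a test double so the sketch runs without MATLAB installed.

```python
import enum


class EngineState(enum.Enum):
    STOPPED = "stopped"
    STARTING = "starting"
    IDLE = "idle"
    BUSY = "busy"


class EngineWrapper:
    """Illustrative wrapper; not the class in pool/engine.py."""

    def __init__(self, engine_factory):
        self._factory = engine_factory
        self._engine = None
        self.state = EngineState.STOPPED

    def start(self):
        self.state = EngineState.STARTING
        try:
            self._engine = self._factory()
        except Exception:
            self.state = EngineState.STOPPED
            raise
        self.state = EngineState.IDLE

    def execute(self, code):
        if self.state is not EngineState.IDLE:
            raise RuntimeError(f"engine is {self.state.name}, not IDLE")
        self.state = EngineState.BUSY
        try:
            return self._engine.eval(code)
        finally:
            # Return to IDLE whether the eval succeeded or raised.
            self.state = EngineState.IDLE

    def health_check(self):
        """Run a trivial eval. Only meaningful for an IDLE engine:
        a BUSY or STOPPED engine reports False in this sketch."""
        try:
            return self.execute("1+1") == 2
        except Exception:
            return False

    def stop(self):
        if self._engine is not None:
            self._engine.quit()
            self._engine = None
        self.state = EngineState.STOPPED


class FakeEngine:
    """Stand-in engine that only understands the health-check expression."""

    def eval(self, code):
        return 2 if code == "1+1" else None

    def quit(self):
        pass
```

Usage: `w = EngineWrapper(FakeEngine); w.start(); w.health_check()` walks the wrapper through STOPPED, STARTING, and IDLE before running the probe.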

Job Executor (jobs/executor.py)

Orchestrates the full lifecycle of MATLAB code execution with a hybrid sync/async model:

  1. Security Check: Scans code for blocked functions (handled by SecurityValidator)
  2. Job Creation: Registers job in tracker with initial PENDING state
  3. Context Injection: Adds __mcp_job_id__ and __mcp_temp_dir__ to workspace
  4. Synchronous Execution: Runs code, waits up to sync_timeout (default 30 seconds)
  5. Async Promotion: If timeout exceeded, promotes to background execution, returns job_id
  6. Result Capture: Serializes workspace variables, stdout/stderr, execution time, figures
  7. Monitoring Events: Emits events for metrics collection

Job Tracker (jobs/tracker.py)

Thread-safe in-memory registry of all jobs:

  • CRUD: Create, get, list jobs filtered by session
  • Lifecycle States: PENDING → RUNNING → COMPLETED/FAILED/CANCELLED
  • Metadata: Job ID, session ID, code, engine assignment, timestamps, elapsed time
  • Pruning: Periodic cleanup removes jobs older than job_retention_seconds (default 86400 = 24h)
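
A minimal sketch of such a registry, assuming a single lock guards a dictionary of jobs. The state names and the `job_retention_seconds` default come from the list above. The `Job` fields shown, the `set_status` method, and the `now` argument to `prune` are illustrative choices, not the tracker's real interface.

```python
import threading
import time
import uuid
from dataclasses import dataclass, field
from enum import Enum


class JobStatus(Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"


@dataclass
class Job:
    session_id: str
    code: str
    job_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    status: JobStatus = JobStatus.PENDING
    created_at: float = field(default_factory=time.time)


class JobTracker:
    def __init__(self, job_retention_seconds=86400):
        self._lock = threading.Lock()
        self._jobs = {}
        self._retention = job_retention_seconds

    def create_job(self, session_id, code):
        job = Job(session_id=session_id, code=code)
        with self._lock:
            self._jobs[job.job_id] = job
        return job

    def get_job(self, job_id):
        with self._lock:
            return self._jobs.get(job_id)

    def list_jobs(self, session_id):
        with self._lock:
            return [j for j in self._jobs.values() if j.session_id == session_id]

    def set_status(self, job_id, status):
        with self._lock:
            self._jobs[job_id].status = status

    def prune(self, now=None):
        """Drop jobs created before the retention window; return the count."""
        cutoff = (time.time() if now is None else now) - self._retention
        with self._lock:
            stale = [jid for jid, j in self._jobs.items() if j.created_at < cutoff]
            for jid in stale:
                del self._jobs[jid]
        return len(stale)
```

Passing `now` explicitly keeps pruning deterministic in tests. A periodic background task would call `prune()` with no argument.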

Session Manager (session/manager.py)

Per-user session isolation for multi-user (SSE) deployments:

  • Session Creation: Each session gets unique ID and temp directory
  • Default Session: stdio transport uses a singleton "default" session
  • Timeout: Sessions expire after session_timeout (default 3600s); expired sessions are cleaned up on access
  • Limits: Enforces max_sessions (default 10) per server
  • Workspace Isolation: Optional clear all between sessions
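
The timeout and limit rules above might look roughly like this. It is a single-threaded sketch under assumptions: the dictionary layout, `SessionLimitError`, and the injectable `clock` are hypothetical, and session IDs are assumed to be safe to use in a directory name.

```python
import shutil
import tempfile
import time
import uuid
from pathlib import Path


class SessionLimitError(RuntimeError):
    """Raised when max_sessions live sessions already exist."""


class SessionManager:
    def __init__(self, session_timeout=3600, max_sessions=10,
                 clock=time.monotonic):
        self._timeout = session_timeout
        self._max = max_sessions
        self._clock = clock
        self._sessions = {}  # session_id -> {"temp_dir": Path, "last_used": float}

    def _expire(self):
        """Remove sessions idle longer than the timeout, with their temp dirs."""
        now = self._clock()
        expired = [sid for sid, s in self._sessions.items()
                   if now - s["last_used"] > self._timeout]
        for sid in expired:
            shutil.rmtree(self._sessions.pop(sid)["temp_dir"], ignore_errors=True)

    def get_session(self, session_id=None):
        self._expire()  # cleanup happens on access
        session_id = session_id or uuid.uuid4().hex
        if session_id not in self._sessions:
            if len(self._sessions) >= self._max:
                raise SessionLimitError(f"max_sessions={self._max} reached")
            self._sessions[session_id] = {
                "temp_dir": Path(tempfile.mkdtemp(prefix=f"{session_id}_")),
                "last_used": self._clock(),
            }
        self._sessions[session_id]["last_used"] = self._clock()
        return self._sessions[session_id]
```

Each call refreshes `last_used`, so an active session never expires. An idle one is removed, along with its temp directory, the next time any session is requested.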

Security Validator (security/validator.py)

Pre-execution security hardening:

  • Function Blocklist: Detects dangerous patterns: system(), unix(), dos(), ! (shell escape), eval(), feval(), evalc(), evalin(), assignin(), perl(), python()
  • Smart Scanning: Strips string literals ('...' and "...") and comments before regex matching to avoid false positives
  • Filename Sanitization: Validates upload filenames to prevent path traversal (../../../etc/passwd)
  • Size Limits: Enforces max_upload_size_mb (default 100MB)
  • Customizable: Blocklist and enabled/disabled modes configurable via YAML
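
The strip-then-match approach can be sketched with two regular expressions. This is a heuristic illustration, not the validator's real patterns. The names `BLOCKED` and `find_blocked` are hypothetical, the transpose check is an assumption about how a quote after an identifier should be treated, and command syntax without parentheses (such as `system ls`) is not caught here.

```python
import re

BLOCKED = ("system", "unix", "dos", "eval", "feval", "evalc",
           "evalin", "assignin", "perl", "python")

# Comments, "double-quoted" strings, and 'char vectors', tried left to right.
# The lookbehind is a heuristic: a quote directly after an identifier,
# closing bracket, dot, or quote is treated as the transpose operator.
_STRIP = re.compile(
    r"%[^\n]*"                            # comment to end of line
    r'|"(?:[^"\n]|"")*"'                  # "string" with "" escapes
    r"|(?<![\w)\]}.'])'(?:[^'\n]|'')*'"   # 'char vector' with '' escapes
)
_CALL = re.compile(r"\b(" + "|".join(BLOCKED) + r")\s*\(")
_SHELL = re.compile(r"^\s*!", re.MULTILINE)  # "!" shell escape at line start


def find_blocked(code):
    """Return sorted blocked names used in `code` outside strings/comments."""
    stripped = _STRIP.sub(" ", code)
    hits = {m.group(1) for m in _CALL.finditer(stripped)}
    if _SHELL.search(stripped):
        hits.add("!")
    return sorted(hits)
```

Stripping first means `disp('system(1)')` produces no hits, while `y = x'; system('ls')` still reports `system` because the quote after `x` is read as a transpose rather than a string opener.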

Result Formatter (output/formatter.py)

Structures tool responses for the MCP protocol:

  • Text Formatting: Truncates output exceeding max_output_chars (default 50000), saves overflow to file
  • Variable Display: Summarizes workspace variables with type/shape info, hides large arrays for brevity
  • Error Handling: Builds error response dicts with exception messages and stack traces
  • Delegates: Hands off to Plotly converter and thumbnail generator for rich media
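
The text-truncation rule can be illustrated as follows. The response keys and the `format_text` name are hypothetical, and the sketch assumes the complete output (not only the overflow) is written to the file; the list above does not specify which.

```python
import tempfile
from pathlib import Path


def format_text(output, max_output_chars=50000, overflow_dir=None):
    """Return a response dict; long output is truncated and saved to a file."""
    if len(output) <= max_output_chars:
        return {"text": output, "truncated": False}

    directory = Path(overflow_dir or tempfile.gettempdir())
    # delete=False keeps the file on disk after the handle is closed.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", prefix="mcp_output_", dir=directory,
        delete=False, encoding="utf-8",
    ) as fh:
        fh.write(output)

    return {
        "text": output[:max_output_chars],
        "truncated": True,
        "total_chars": len(output),
        "full_output_path": fh.name,
    }
```

An agent that needs the remainder can read `full_output_path`. In a per-session setup, `overflow_dir` would presumably point at the session's temp directory so the file is cleaned up with it.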

Plotly Converter (output/plotly_convert.py, output/plotly_style_mapper.py)

Converts MATLAB figures to interactive Plotly JSON:

  1. MATLAB-side (mcp_extract_props.m): Extracts raw figure structure (axes, traces, styling, limits, labels)
  2. Python-side: Maps MATLAB line styles, marker symbols, colormaps, fonts to Plotly equivalents
  3. WebGL Optimization: Automatically uses WebGL for line/scatter traces with >10,000 points
  4. Output: Plotly JSON + static PNG thumbnail + optional base64-encoded image
  5. Subplot Support: Handles single axes, rectangular grids, and tiled layouts with domain splitting

Monitoring Stack (Optional)

When monitoring.enabled: true:

  • MetricsCollector (monitoring/collector.py): Aggregates counters (jobs, sessions, errors), execution time ring buffer (p95), pool/system metrics
  • MetricsStore (monitoring/store.py): Async SQLite database for time-series metrics and structured events
  • Health Evaluator (monitoring/health.py): Classifies server health (healthy/degraded/unhealthy) based on utilization, error rate, capacity
  • HTTP Dashboard (monitoring/dashboard.py, static/): Starlette sub-app serving /health, /metrics, /dashboard with real-time Plotly charts

Data Flow Diagrams

Synchronous Code Execution

sequenceDiagram
    participant Agent
    participant Server
    participant Pool
    participant Engine
    participant Tracker
    participant Executor
    
    Agent->>Server: execute_code("x = magic(3)")
    Server->>Executor: execute(session_id, code)
    
    Executor->>Tracker: create_job()
    Tracker-->>Executor: job_id
    
    Executor->>Pool: acquire_engine()
    Pool-->>Executor: engine
    Executor->>Engine: workspace["__mcp_job_id__"] = job_id
    Executor->>Engine: eval(code, timeout=30s)
    Engine-->>Executor: result
    
    Executor->>Tracker: mark_completed(result)
    Executor->>Pool: release_engine()
    
    Executor-->>Server: {status: "completed", output: "...", time: 0.23}
    Server-->>Agent: MCP response

Async Job Promotion

sequenceDiagram
    participant Agent
    participant Server
    participant Executor
    participant Pool
    participant Engine
    participant Tracker
    
    Agent->>Server: execute_code("long_simulation()")
    Server->>Executor: execute(session_id, code)
    
    Executor->>Tracker: create_job()
    Executor->>Pool: acquire_engine()
    Executor->>Engine: eval(code, background=True)
    Engine-->>Executor: Future (running)
    
    Note over Executor: Timeout (30s) exceeded
    
    Executor->>Tracker: mark_running(job)
    Executor->>Pool: release_engine()
    Executor-->>Server: {status: "running", job_id: "abc123"}
    Server-->>Agent: MCP response
    
    Note over Engine: Code continues in background
    
    rect rgb(200, 200, 200)
    Agent->>Server: get_job_status("abc123")
    Server->>Tracker: get_job("abc123")
    Tracker-->>Server: {status: "running", progress: 45%}
    Server-->>Agent: {progress: 45%, elapsed: 120s}
    end
    
    rect rgb(200, 200, 200)
    Agent->>Server: get_job_result("abc123")
    Server->>Executor: fetch_result("abc123")
    Executor->>Engine: result
    Engine-->>Executor: workspace snapshot
    Executor-->>Server: full result
    Server-->>Agent: {status: "completed", output: "...", ...}
    end

Multi-User SSE Flow

sequenceDiagram
    participant Agent1
    participant Agent2
    participant Server
    participant SessionMgr
    participant Pool
    
    Agent1->>Server: POST /mcp with session_id="user1"
    Server->>SessionMgr: get_session("user1")
    SessionMgr-->>Server: Session(temp_dir="/tmp/user1_abc123")
    
    Agent2->>Server: POST /mcp with session_id="user2"
    Server->>SessionMgr: get_session("user2")
    SessionMgr-->>Server: Session(temp_dir="/tmp/user2_def456")
    
    Agent1->>Server: execute_code("x = 1", session_id="user1")
    Agent2->>Server: execute_code("y = 2", session_id="user2")
    
    Server->>Pool: acquire_engine()
    Pool-->>Server: engine1
    
    Server->>Pool: acquire_engine()
    Pool-->>Server: engine2
    
    Note over Server: Both run in parallel<br/>in separate engines

Design Decisions & Trade-offs

1. Sync-First with Async Fallback

Decision: Execute code synchronously by default; promote to async on timeout.

Rationale:

  • Simplicity: Most code completes fast (< 5s); sync results are simpler for agents
  • Responsiveness: No polling overhead for typical queries
  • Async Support: Long-running code (Monte Carlo, large matrix ops) still supported via promotion

Trade-off: Timeout must be tuned per deployment. Too short → unnecessary promotion; too long → slow response to agents.

2. Elastic Pool with Proactive Warmup

Decision: Scale engines on-demand up to max_engines, with proactive startup at 80% utilization.

Rationale:

  • Resource Efficiency: Don't pre-allocate the full max_engines if only 2 are needed
  • Responsiveness: Proactive warmup avoids startup latency when load spikes
  • Stability: Idle scale-down prevents resource bloat over long runs

Trade-off: Adds complexity (pool state machine); startup latency for first request in a cold pool (~2-3s).

3. Per-Session Temp Directories

Decision: Each session gets a unique temp directory; files isolated across sessions.

Rationale:

  • Multi-User Safety: Prevents agent A from reading agent B's uploaded files
  • Cleanup: Temp dirs deleted on session timeout; no manual cleanup needed

Trade-off: Requires session management in SSE mode; stdio mode is simpler (single "default" session).

4. Plotly Over MATLAB Native Figures

Decision: Convert MATLAB figures to Plotly JSON for agent-side rendering.

Rationale:

  • Interactivity: Agents can zoom/pan/hover; better for exploration
  • Portability: The interactive view needs no server-rendered image; JSON is lightweight
  • WebGL: Automatic acceleration for large datasets (10k+ points)

Trade-off: Some MATLAB figure features (custom uicontrols, special objects) don't convert cleanly. In those cases the output falls back to PNG.

5. Configuration-Driven Security

Decision: Blocklist configurable in YAML; can be customized or disabled.

Rationale:

  • Flexibility: Different deployments have different risk tolerances
  • Auditable: Blocklist visible in config; easy to review what's blocked

Trade-off: Operators must actively decide on a security posture; there is no "safe by default" guarantee.

6. Optional Monitoring with SQLite Backend

Decision: Metrics are optional; when enabled, stored in async SQLite.

Rationale:

  • Zero Overhead: If disabled, no background threads or I/O
  • Low Footprint: SQLite requires no external service (Prometheus, InfluxDB)
  • Async: Database writes don't block execution threads

Trade-off: SQLite scales to roughly thousands of metrics per second; high-volume deployments may need an external time-series DB.

Languages & Technology Stack

  • Python 3.10+: Core server, async I/O with asyncio, FastMCP framework
  • MATLAB R2020b+: Code execution via matlab.engine Python API
  • JavaScript: Client-side dashboard with Plotly charting
  • MATLAB Helper Scripts: mcp_extract_props.m for figure conversion, mcp_checkcode.m for linting, mcp_progress.m for progress reporting
  • SQLite (Optional): Metrics storage when monitoring enabled
  • FastMCP: MCP protocol abstraction (stdio/SSE)
  • Starlette (Optional): HTTP sub-app for monitoring dashboard
  • Pydantic: Configuration validation and serialization
