Architecture

github-actions[bot] edited this page Apr 3, 2026 · 20 revisions

System Overview

The MATLAB MCP Server bridges AI agents (Claude, Cursor, Codex CLI) to a shared MATLAB installation through the Model Context Protocol. It manages an elastic pool of MATLAB engines, executes code with security validation, handles async jobs, and serves interactive Plotly visualizations.

graph TB
    Agent["AI Agent<br/>(Claude, Cursor, Codex CLI)"]
    Transport["MCP Transport<br/>(stdio, SSE, streamable HTTP)"]
    Server["FastMCP Server<br/>(20+ Tools)"]
    SessionMgr["Session Manager<br/>(Workspace Isolation)"]
    Executor["Job Executor<br/>(Sync/Async)"]
    PoolMgr["Engine Pool Manager<br/>(Elastic 2-10+ engines)"]
    Engine["MATLAB Engines<br/>(R2022b+)"]
    
    Agent -->|MCP Protocol| Transport
    Transport -->|Tool Calls| Server
    Server -->|Session ID| SessionMgr
    Server -->|Execute Code| Executor
    Executor -->|Acquire/Release| PoolMgr
    PoolMgr -->|Run Code| Engine
    PoolMgr -->|Health Checks| Engine
    Engine -->|Results| Executor
    Executor -->|Format| Server
    Server -->|Response| Agent

Major Components

1. MCP Server (src/matlab_mcp/server.py)

The main FastMCP server instance that registers and routes all tools. Responsibilities:

  • Register 20 built-in tools (execute_code, check_code, list_toolboxes, etc.)
  • Load and register custom tools from YAML config
  • Manage server lifespan (startup, shutdown, graceful drain)
  • Route tool calls to implementation modules
  • Handle transport modes (stdio, SSE, streamable HTTP)
  • Apply security validation and HITL approval gates before code reaches the executor

Key methods:

  • main() — CLI entry point; parses args, loads config, starts server
  • _get_session_id() — Extracts session ID from transport context
  • _get_temp_dir() — Resolves session temp directory for workspace isolation
  • Tool implementations delegate to JobExecutor, SecurityValidator, ResultFormatter

2. Engine Pool Manager (src/matlab_mcp/pool/manager.py)

Manages an elastic pool of MATLAB engine instances with dynamic scaling:

  • Min/max sizing: Starts with min_engines (default 2), scales up to max_engines (default 10) under demand
  • Proactive warmup: When utilization exceeds 80% (proactive_warmup_threshold), starts a new engine before saturation
  • Idle scale-down: Engines idle for more than 15 minutes (scale_down_idle_timeout) are stopped until the pool is back at min_engines
  • Health checks: Every 60 seconds, pings all engines with a trivial 1+1 eval; unhealthy engines are replaced
  • Queueing: Requests wait in an async queue (max 50 items) when all engines are busy
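
The scaling rules above can be sketched as a pure decision function. This is an illustrative sketch, not the server's actual code: `PoolConfig` and `plan_scaling` are hypothetical names, and only the field names and defaults (min_engines, max_engines, proactive_warmup_threshold, scale_down_idle_timeout) come from the list above.

```python
from dataclasses import dataclass


@dataclass
class PoolConfig:
    min_engines: int = 2
    max_engines: int = 10
    proactive_warmup_threshold: float = 0.8    # fraction of engines that are busy
    scale_down_idle_timeout: float = 15 * 60   # seconds


def plan_scaling(cfg: PoolConfig, busy: int, idle_seconds: list) -> tuple:
    """Return ("scale_up" | "scale_down" | "hold", count).

    busy: number of engines currently running a job.
    idle_seconds: idle duration, in seconds, of each idle engine.
    """
    total = busy + len(idle_seconds)
    utilization = busy / total if total else 1.0
    # Proactive warmup: start one more engine before the pool saturates.
    if utilization > cfg.proactive_warmup_threshold and total < cfg.max_engines:
        return ("scale_up", 1)
    # Idle scale-down: stop expired engines, but never drop below min_engines.
    expired = sum(1 for s in idle_seconds if s > cfg.scale_down_idle_timeout)
    removable = min(expired, max(total - cfg.min_engines, 0))
    if removable:
        return ("scale_down", removable)
    return ("hold", 0)
```

Keeping the decision separate from engine start/stop side effects makes the thresholds easy to unit-test without a MATLAB installation.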

Key methods:

  • acquire() — Reserves an engine (waits if none available)
  • release() — Returns an engine to the pool
  • get_status() — Returns pool metrics (total, available, busy, max)
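
A minimal sketch of the acquire/release contract, assuming an asyncio queue of idle engines. `TinyPool` is a hypothetical stand-in with a fixed size: engines are plain strings here, the status dict omits the `max` field, and the real manager's scaling and health checks are left out.

```python
import asyncio


class TinyPool:
    """Toy fixed-size pool; a real pool would hold engine wrappers, not strings."""

    def __init__(self, engines, max_waiters=50):
        self._idle = asyncio.Queue()
        for engine in engines:
            self._idle.put_nowait(engine)
        self._total = len(engines)
        self._waiters = 0
        self._max_waiters = max_waiters

    async def acquire(self):
        # Reject instead of queueing once the wait queue is full.
        if self._idle.empty() and self._waiters >= self._max_waiters:
            raise RuntimeError("request queue is full")
        self._waiters += 1
        try:
            return await self._idle.get()   # suspends until an engine is free
        finally:
            self._waiters -= 1

    def release(self, engine):
        self._idle.put_nowait(engine)       # wakes the oldest waiter, if any

    def get_status(self):
        available = self._idle.qsize()
        return {"total": self._total, "available": available,
                "busy": self._total - available}
```

Callers would typically pair `acquire()` with `release()` in a `try`/`finally` so an engine is returned even when the job raises.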

3. MATLAB Engine Wrapper (src/matlab_mcp/pool/engine.py)

Wraps a single matlab.engine instance with lifecycle and state tracking:

  • Lifecycle: States are STOPPED → STARTING → IDLE → BUSY
  • Workspace isolation: Can reset workspace between jobs
  • Health checks: eval("1+1") to verify responsiveness
  • Code execution: Both sync eval() and background eval(..., background=True)
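
The lifecycle can be modelled as a small state machine. The four state names come from the list above; the transition table (including BUSY returning to IDLE and any state falling back to STOPPED) is an assumption made for illustration, and `EngineState` and `transition` are hypothetical names.

```python
from enum import Enum


class EngineState(Enum):
    STOPPED = "stopped"
    STARTING = "starting"
    IDLE = "idle"
    BUSY = "busy"


# Assumed transition table: the forward path plus release and shutdown edges.
ALLOWED = {
    EngineState.STOPPED: {EngineState.STARTING},
    EngineState.STARTING: {EngineState.IDLE, EngineState.STOPPED},
    EngineState.IDLE: {EngineState.BUSY, EngineState.STOPPED},
    EngineState.BUSY: {EngineState.IDLE, EngineState.STOPPED},
}


def transition(current: EngineState, target: EngineState) -> EngineState:
    """Return the new state, or raise if the move is not in the table."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target
```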

4. Job Executor (src/matlab_mcp/jobs/executor.py)

Orchestrates the full execution lifecycle with hybrid sync/async behavior:

sequenceDiagram
    participant Agent
    participant Server
    participant Executor
    participant Pool
    participant Engine
    participant Tracker
    
    Agent->>Server: execute_code("x=1")
    Server->>Executor: execute(code, session_id)
    Executor->>Tracker: Create Job (PENDING)
    Executor->>Pool: Acquire engine
    Pool->>Engine: Reserve
    Executor->>Executor: Inject context (__mcp_job_id__)
    Executor->>Engine: eval(code)
    alt Completes <30s
        Engine->>Executor: stdout, variables
        Executor->>Tracker: Mark COMPLETED
        Executor->>Pool: Release
        Executor->>Server: Return result
        Server->>Agent: {"status":"completed","output":"x = 1"}
    else >30s timeout
        Executor->>Tracker: Mark RUNNING (promote to async)
        Executor->>Server: Return job_id
        Server->>Agent: {"status":"running","job_id":"abc123"}
        activate Engine
        Engine->>Engine: Continue execution
        deactivate Engine
        Engine->>Executor: Complete (background)
        Executor->>Tracker: Mark COMPLETED
        Executor->>Pool: Release
    end

Key methods:

  • execute() — Main entry point; creates job, acquires engine, injects context, runs code
  • _inject_job_context() — Sets __mcp_job_id__ and __mcp_temp_dir__ in workspace
  • _safe_serialize() — Converts Python objects to JSON-safe forms (tuples→lists, objects→.tolist())
  • _build_result() — Structures stdout, variables, figures into MCP response dict

Sync/Async promotion logic:

  1. Execution starts synchronously (blocking)
  2. If timeout (sync_timeout, default 30s) is exceeded, code is promoted to background task
  3. Agent receives job_id and can poll get_job_status(), get_job_result() for completion
  4. Engine is held until background task completes, then released
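
One way to express this promotion in asyncio is to wait on the work with a timeout and, if it has not finished, hand the still-running task back to the caller. The sketch below is an assumption about the shape of the logic, not the executor's code; `run_with_promotion` is a hypothetical name.

```python
import asyncio


async def run_with_promotion(work, sync_timeout: float = 30.0):
    """Wait up to sync_timeout seconds for `work` (a coroutine or future).

    Returns ("completed", result) if it finished in time, otherwise
    ("running", task) with the task still executing in the background.
    """
    task = asyncio.ensure_future(work)
    # asyncio.wait() does not cancel the task when the timeout expires.
    done, _pending = await asyncio.wait({task}, timeout=sync_timeout)
    if task in done:
        return ("completed", task.result())   # re-raises if the work failed
    return ("running", task)                  # caller files it under a job_id
```

Unlike `asyncio.wait_for()`, which cancels the awaited work on timeout, `asyncio.wait()` leaves it running, which is the behaviour promotion needs.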

5. Job Tracker (src/matlab_mcp/jobs/tracker.py)

In-memory registry of MATLAB code execution jobs:

  • Concurrency-safe storage guarded by asyncio locks (these serialize coroutines on the event loop, not OS threads)
  • CRUD operations: create_job(), get_job(), list_jobs(), cancel_job()
  • Auto-prune completed jobs older than job_retention_seconds (default 24 hours)
  • Tracks job states: PENDING → RUNNING → COMPLETED/FAILED/CANCELLED
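
A compact sketch of such a registry, assuming a dict guarded by one `asyncio.Lock`. `MiniTracker`, `Job`, and `finish()` are hypothetical names; only `create_job()`, `get_job()`, the state names, and the 24-hour retention default are taken from the list above.

```python
import asyncio
import time
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    job_id: str
    status: str = "PENDING"
    finished_at: Optional[float] = None


class MiniTracker:
    def __init__(self, retention_seconds: float = 24 * 3600):
        self._jobs = {}
        self._lock = asyncio.Lock()
        self._retention = retention_seconds

    async def create_job(self) -> Job:
        job = Job(job_id=uuid.uuid4().hex[:12])
        async with self._lock:
            self._jobs[job.job_id] = job
        return job

    async def get_job(self, job_id: str) -> Optional[Job]:
        async with self._lock:
            return self._jobs.get(job_id)

    async def finish(self, job_id: str, status: str, now: Optional[float] = None):
        async with self._lock:
            job = self._jobs[job_id]
            job.status = status
            job.finished_at = time.time() if now is None else now

    async def prune(self, now: Optional[float] = None) -> int:
        """Drop finished jobs older than the retention window; return the count."""
        now = time.time() if now is None else now
        async with self._lock:
            stale = [jid for jid, job in self._jobs.items()
                     if job.finished_at is not None
                     and now - job.finished_at > self._retention]
            for jid in stale:
                del self._jobs[jid]
        return len(stale)
```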

6. Session Manager (src/matlab_mcp/session/manager.py)

Manages user session lifecycles with per-session workspace isolation:

  • Each session gets a unique temp directory created under tempfile.gettempdir() for cross-platform compatibility
  • For stdio transport: single "default" session shared by one agent
  • For SSE/streamable HTTP: one session per connected agent, identified by session_id from context
  • Idle timeout cleanup: sessions inactive >1 hour are destroyed (if no active jobs)
  • Max session limit: configurable (default 50)

Session object contains:

  • session_id — Unique identifier
  • temp_dir — Temporary working directory (isolated from other sessions)
  • created_at, last_active — Timestamps for TTL and idle detection
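
As a sketch of how these fields might fit together, the snippet below creates a per-session directory with `tempfile.mkdtemp()` (which places it under `tempfile.gettempdir()` by default) and applies the idle rule from above. The helper names `create_session`, `is_expired`, and `destroy_session`, and the directory prefix, are assumptions.

```python
import shutil
import tempfile
import time
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Session:
    session_id: str
    temp_dir: Path
    created_at: float = field(default_factory=time.time)
    last_active: float = field(default_factory=time.time)


def create_session(session_id: str) -> Session:
    # mkdtemp() creates a fresh, uniquely named directory readable only by its owner.
    # A real implementation should sanitize session_id before using it in a path.
    temp_dir = Path(tempfile.mkdtemp(prefix=f"matlab_mcp_{session_id}_"))
    return Session(session_id=session_id, temp_dir=temp_dir)


def is_expired(session: Session, idle_timeout: float = 3600.0,
               now=None, active_jobs: int = 0) -> bool:
    """Idle longer than the timeout and no job still running."""
    now = time.time() if now is None else now
    return active_jobs == 0 and (now - session.last_active) > idle_timeout


def destroy_session(session: Session) -> None:
    shutil.rmtree(session.temp_dir, ignore_errors=True)
```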

7. Security Validator (src/matlab_mcp/security/validator.py)

Pre-execution security checks:

  • Function blocklist: Detects dangerous functions (system, unix, dos, eval, evalc, evalin, assignin, feval, perl, python) using smart string/comment stripping (avoids false positives from string literals)
  • Filename sanitization: Prevents path traversal (removes .., ~/, absolute paths)
  • Upload size limits: Enforces max_upload_size_mb (default 100 MB)
  • Can be disabled via config for trusted deployments
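
To illustrate the strip-then-scan idea, here is a simplified heuristic. It is a sketch under stated assumptions, not the validator's implementation: the helper names are hypothetical, block comments (`%{ ... %}`) and `...` continuation comments are not handled, and a single quote is treated as the transpose operator only when it directly follows an identifier character, a closing bracket, a dot, or another quote.

```python
import re

BLOCKED = {"system", "unix", "dos", "eval", "evalc", "evalin",
           "assignin", "feval", "perl", "python"}


def _is_transpose(out: list) -> bool:
    """True if a quote at this position reads as transpose, not a string start."""
    if not out:
        return False
    prev = out[-1][-1]
    return prev.isalnum() or prev in "_)]}.'"


def strip_strings_and_comments(code: str) -> str:
    out, i, n = [], 0, len(code)
    while i < n:
        ch = code[i]
        if ch == "%":                                   # comment: skip to end of line
            while i < n and code[i] != "\n":
                i += 1
            continue
        if ch == '"' or (ch == "'" and not _is_transpose(out)):
            quote, i = ch, i + 1
            while i < n and code[i] != "\n":
                if code[i] == quote:
                    if i + 1 < n and code[i + 1] == quote:  # doubled quote is an escape
                        i += 2
                        continue
                    i += 1                              # consume the closing quote
                    break
                i += 1
            out.append(quote * 2)                       # leave an empty literal behind
            continue
        out.append(ch)
        i += 1
    return "".join(out)


def find_blocked(code: str) -> list:
    """Return the blocked identifiers that appear outside strings and comments."""
    cleaned = strip_strings_and_comments(code)
    names = set(re.findall(r"[A-Za-z_]\w*", cleaned))
    return sorted(names & BLOCKED)
```

For example, `find_blocked('disp("call system here")')` returns an empty list, while `find_blocked("system('ls')")` reports `system`.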

8. Result Formatter (src/matlab_mcp/output/formatter.py)

Structures MATLAB execution results into MCP response dictionaries:

  • Text handling: Truncates long output to max_inline_text_length (default 50 KB); optionally saves to file
  • Variables: Formats workspace variables with type, size, and value
  • Figures: Passes to Plotly converter for interactive JSON + static PNG
  • Success/error responses: Builds complete MCP response objects
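
The text-handling rule can be sketched as follows. The function name and the returned keys are hypothetical, and the limit is counted in characters here, whereas the real setting may be measured in bytes.

```python
def truncate_text(text: str, max_inline_text_length: int = 50_000) -> dict:
    """Return the text inline, cut to the limit, plus a flag saying whether it was cut."""
    if len(text) <= max_inline_text_length:
        return {"text": text, "truncated": False}
    return {
        "text": text[:max_inline_text_length],
        "truncated": True,
        "original_length": len(text),   # lets the agent see how much was dropped
    }
```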

9. Plotly Converter (src/matlab_mcp/output/plotly_convert.py, output/plotly_style_mapper.py)

Converts MATLAB figures to interactive Plotly JSON:

graph LR
    MATLAB["MATLAB Figure<br/>(plot, scatter, bar)"]
    Extract["mcp_extract_props.m<br/>(Extract properties)"]
    JSON["JSON File<br/>(Lines, colors, axes)"]
    Convert["plotly_style_mapper.py<br/>(Style mapping)"]
    Plotly["Plotly JSON<br/>(Interactive)"]
    PNG["Static PNG<br/>(Fallback)"]
    
    MATLAB -->|Helper function| Extract
    Extract -->|Save| JSON
    JSON -->|Read & parse| Convert
    Convert -->|Render| Plotly
    Convert -->|savefig| PNG

Style mapping includes:

  • Line styles (-, --, :, -. → Plotly solid, dash, dot, dashdot)
  • Markers (o, s, ^, *, x → circle, square, triangle, star, cross)
  • Colors (MATLAB RGB → CSS hex)
  • Colormaps (parula, viridis, hot, cool)
  • Axis scales (linear, log)
  • Legend positioning (north, south, east, west, best)
  • WebGL rendering for large datasets (>10,000 points)
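
A minimal sketch of three of these mappings: line style, color, and the WebGL switch. The function names are hypothetical; the dash names and the `scatter`/`scattergl` trace types are standard Plotly values, and MATLAB colors are assumed to arrive as RGB triplets in the range 0 to 1. Marker mapping is omitted.

```python
LINE_STYLES = {"-": "solid", "--": "dash", ":": "dot", "-.": "dashdot"}


def rgb_to_hex(rgb) -> str:
    """Convert a MATLAB RGB triplet (floats in 0..1) to a CSS hex color."""
    r, g, b = (max(0, min(255, round(c * 255))) for c in rgb)
    return f"#{r:02x}{g:02x}{b:02x}"


def line_to_plotly(style: str, rgb, n_points: int) -> dict:
    """Build the style part of a Plotly line trace."""
    return {
        # Switch to the WebGL trace type for large datasets.
        "type": "scattergl" if n_points > 10_000 else "scatter",
        "mode": "lines",
        "line": {"dash": LINE_STYLES.get(style, "solid"),
                 "color": rgb_to_hex(rgb)},
    }
```

For instance, MATLAB's default blue `[0 0.4470 0.7410]` maps to `#0072bd`.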

10. HITL Approval Gate (src/matlab_mcp/hitl/gate.py)

Human-in-the-loop approval for sensitive operations:

  • Protected functions: Operators can mark certain functions (fopen, delete, system) as requiring human approval
  • All execute: Optional mode to require approval for all code execution
  • File operations: Upload/delete can be gated separately
  • Uses FastMCP's ctx.elicit() API to prompt user; returns approval/denial
  • Disabled by default (hitl.enabled=False)

11. Bearer Token Auth Middleware (src/matlab_mcp/auth/middleware.py)

HTTP-level authentication for SSE and streamable HTTP transports:

  • Pure ASGI middleware — validates Authorization: Bearer <token> before FastMCP processes any MCP message
  • Token source: MATLAB_MCP_AUTH_TOKEN environment variable (disabled if unset)
  • Constant-time comparison: Uses hmac.compare_digest() to prevent timing attacks
  • Bypasses: /health path and OPTIONS requests (CORS pre-flight)
  • Rejection: Returns HTTP 401 with WWW-Authenticate: Bearer header and JSON error body
  • stdio transport: No middleware (local, trusted context)
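
The behaviour listed above fits in a few lines of plain ASGI. The class below is an illustrative sketch, not the project's middleware: the class name and the JSON error body are assumptions, and non-HTTP scopes (such as lifespan) are simply passed through.

```python
import hmac
import json


class BearerAuthMiddleware:
    """Reject HTTP requests that lack the expected bearer token."""

    def __init__(self, app, token: str):
        self.app = app
        self._expected = f"Bearer {token}".encode()

    async def __call__(self, scope, receive, send):
        # Bypass: non-HTTP scopes, the health endpoint, and CORS pre-flight.
        if (scope["type"] != "http" or scope["path"] == "/health"
                or scope["method"] == "OPTIONS"):
            await self.app(scope, receive, send)
            return
        # ASGI delivers headers as (name, value) byte pairs with lowercase names.
        supplied = dict(scope["headers"]).get(b"authorization", b"")
        if hmac.compare_digest(supplied, self._expected):   # constant-time compare
            await self.app(scope, receive, send)
            return
        body = json.dumps({"error": "unauthorized"}).encode()
        await send({
            "type": "http.response.start",
            "status": 401,
            "headers": [(b"content-type", b"application/json"),
                        (b"www-authenticate", b"Bearer"),
                        (b"content-length", str(len(body)).encode())],
        })
        await send({"type": "http.response.body", "body": body})
```

Because the check runs before the wrapped app is called, an unauthenticated request never reaches MCP message handling.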

12. Configuration System (src/matlab_mcp/config.py)

Pydantic-based configuration with YAML + environment variable overrides:

  • Sections: server, pool, execution, security, sessions, output, workspace, monitoring, auth, hitl
  • Env overrides: MATLAB_MCP_* prefix; e.g., MATLAB_MCP_POOL_MAX_ENGINES=20
  • Validation: Type-safe Pydantic models; rejects invalid values at load time
  • Paths: Relative paths in YAML are resolved to absolute paths based on config file location
  • Defaults: Full defaults provided; minimal config.yaml sufficient for typical use
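
The override precedence can be sketched with the standard library alone. The naming rule used here (the first word after the prefix selects the section, the rest is the key) is an assumption inferred from the `MATLAB_MCP_POOL_MAX_ENGINES` example, and `apply_env_overrides` is a hypothetical helper; values stay as strings because type coercion would be left to the Pydantic models.

```python
import os

SECTIONS = ("server", "pool", "execution", "security", "sessions",
            "output", "workspace", "monitoring", "auth", "hitl")


def apply_env_overrides(config: dict, environ=None,
                        prefix: str = "MATLAB_MCP_") -> dict:
    """Return a copy of `config` with MATLAB_MCP_* variables layered on top."""
    environ = os.environ if environ is None else environ
    merged = {section: dict(values) for section, values in config.items()}
    for name, raw in environ.items():
        if not name.startswith(prefix):
            continue
        # MATLAB_MCP_POOL_MAX_ENGINES -> section "pool", key "max_engines"
        section, _, key = name[len(prefix):].lower().partition("_")
        if section in SECTIONS and key:
            merged.setdefault(section, {})[key] = raw   # env value always wins
    return merged
```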

13. Monitoring & Health (src/matlab_mcp/monitoring/)

Server metrics, health status, and dashboard:

  • Metrics collector (collector.py) — Tracks counters (jobs, sessions, errors) and execution stats; fires events to persistent store
  • Health evaluation (health.py) — Assesses status (healthy/degraded/unhealthy) from pool utilization, error rates, uptime
  • Metrics store (store.py) — Async SQLite persistence for historical time-series and event logs
  • Dashboard (dashboard.py) — Starlette sub-app with live gauge charts, event log, metrics API
  • Routes (routes.py) — HTTP endpoints /health, /metrics for monitoring systems

Data Flow

Example: Execute MATLAB Code (Sync Path)

1. Agent calls: execute_code("x = [1 2 3]; y = x.^2")
2. Server.execute_code_impl():
   a. Validate syntax (optional checkcode)
   b. Check security (no blocked functions)
   c. Check HITL (if gated, elicit approval)
   d. Create job with status PENDING
   e. Call executor.execute(code, session_id)

3. Executor.execute():
   a. Acquire engine from pool (may wait if busy)
   b. Inject __mcp_job_id__ and __mcp_temp_dir__ into workspace
   c. Mark job RUNNING
   d. Call engine.eval(code) synchronously
   e. Capture stdout, stderr, exception
   f. Read updated workspace variables
   g. Check for figures (*.fig.json files)
   h. Release engine back to pool
   i. Mark job COMPLETED

4. ResultFormatter.format_result():
   a. Truncate output if >50KB
   b. Format variables (x, y: arrays)
   c. Convert figures if present
   d. Generate thumbnail if large
   e. Build response dict

5. Server returns to agent:
   {
     "status": "completed",
     "output": "y =\n     1     4     9",
     "variables": [
       {"name": "x", "value": "[1 2 3]", "size": "1x3"},
       {"name": "y", "value": "[1 4 9]", "size": "1x3"}
     ],
     "execution_time": 0.123
   }

Example: Long-Running Job (Async Promotion)

1. Agent calls: execute_code("monte_carlo_pi(1e7)")
   (Monte Carlo with 10M samples, roughly 45 seconds)

2. Executor starts synchronously:
   a. Acquire engine
   b. Inject context
   c. Call engine.eval(...) blocking

3. At 30s (sync_timeout):
   a. Still running → return immediately to agent
   b. Job state: RUNNING
   c. Response: { "job_id": "abc123", "status": "running" }

4. Background task continues in pool:
   a. Engine stays reserved
   b. MATLAB code continues computing

5. Agent polls periodically:
   a. get_job_status("abc123")
   b. Response: { "status": "running", "progress": 45.2, ... }

6. When MATLAB completes (~45s):
   a. Background task finishes
   b. Job state: COMPLETED
   c. Result stored in tracker
   d. Engine released to pool

7. Agent calls: get_job_result("abc123")
   a. Response: full result dict with output, variables, etc.
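
From the agent's side, steps 5 to 7 amount to a polling loop. The sketch below assumes a generic `call_tool(name, arguments)` coroutine supplied by whatever MCP client is in use; that callable, the `job_id` argument name, and the lowercase terminal status strings are assumptions.

```python
import asyncio

TERMINAL = {"completed", "failed", "cancelled"}


async def wait_for_job(call_tool, job_id: str,
                       interval: float = 2.0, max_polls: int = 100) -> dict:
    """Poll get_job_status until the job ends, then fetch the full result."""
    for _ in range(max_polls):
        status = await call_tool("get_job_status", {"job_id": job_id})
        if status["status"] in TERMINAL:
            return await call_tool("get_job_result", {"job_id": job_id})
        await asyncio.sleep(interval)
    raise TimeoutError(f"job {job_id} still running after {max_polls} polls")
```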

Session Isolation Example (Multi-User SSE)

Session A (Alice)               Session B (Bob)
└─ session_id: "sess_alice"    └─ session_id: "sess_bob"
   ├─ engine pool: shared         ├─ engine pool: shared
   ├─ temp_dir: /tmp/abc123       ├─ temp_dir: /tmp/xyz789
   └─ workspace:                  └─ workspace:
      ├─ x = [1 2 3]                 ├─ x = ["foo" "bar"]
      └─ results = {...}             └─ data = {...}

Alice executes:                Bob executes:
  x = 100                        x = "changed"
  
Alice's workspace:             Bob's workspace:
  x = 100 (unaffected by Bob)  x = "changed" (unaffected by Alice)
  (session isolated)           (session isolated)

Alice uploads file:            Bob lists files:
  upload_data("data.csv")      list_files()
  → /tmp/abc123/data.csv       → [] (empty, own temp dir)

Key Design Decisions

1. Hybrid Sync/Async Execution with Auto-Promotion

  • Why: Agents expect fast responses (<1s) for quick queries, but MATLAB simulations can run for hours
  • How: Execute synchronously for first sync_timeout (30s). If still running, promote to async job, return immediately with job_id, continue in background
  • Trade-off: Agents must poll for long-running jobs; prevents response-timeout failures

2. Elastic Engine Pool with Proactive Warmup

  • Why: MATLAB engines are expensive (~30s startup); shared pool amortizes cost across users
  • How: Start with min_engines (2), scale up to max_engines (10) under demand. When utilization >80%, pre-start next engine before saturation
  • Trade-off: Memory overhead of idle engines; high responsiveness under load

3. Per-Session Temp Directories for Isolation

  • Why: Multiple agents share one pool; accidental file collisions or workspace pollution would break isolation
  • How: Each session gets unique temp_dir. Code executes in isolated workspace. Session cleanup removes directory on timeout
  • Trade-off: Filesystem churn; requires session-aware file operations

4. Bearer Token Auth at Middleware Layer (Not Business Logic)

  • Why: Auth must happen before MCP protocol processing; placing it in tool code causes security gaps
  • How: Pure ASGI middleware validates Authorization: Bearer <token> before FastMCP touches the request
  • Trade-off: Requires HTTP/SSE transports; stdio transport cannot authenticate (local, trusted)

5. MATLAB Function Blocklist with Smart String Stripping

  • Why: Prevent accidents (system("rm -rf /")) and malicious injection; but false positives would break legitimate code
  • How: Strip string literals and comments first, then scan for function names
  • Trade-off: Heuristic approach; sufficiently obfuscated code could still bypass it (e.g., function names assembled at run time). The check can be disabled for trusted agents

6. Plotly Figures Over MATLAB Native Format

  • Why: MATLAB .fig files require MATLAB to render; agents and browsers cannot display them
  • How: Extract figure properties with mcp_extract_props.m, map MATLAB styles to Plotly equivalents, return interactive JSON + static PNG
  • Trade-off: Style fidelity loss for complex figures (3D, custom renderers); sufficient for 95% of use cases

7. Configuration via YAML + Environment Variables

  • Why: Containers and CI/CD prefer env vars; developers prefer YAML files
  • How: Load config.yaml if present; env vars (MATLAB_MCP_*) override any value
  • Trade-off: Two sources of truth; env vars always win (can be surprising if misconfigured)

Component Dependencies

graph LR
    FastMCP["FastMCP 3.2.0"]
    Config["Config"]
    Security["SecurityValidator"]
    SessionMgr["SessionManager"]
    JobExecutor["JobExecutor"]
    JobTracker["JobTracker"]
    EnginePool["EnginePoolManager"]
    ResultFormatter["ResultFormatter"]
    PlotlyConverter["PlotlyConverter"]
    Monitoring["Monitoring"]
    HITLGate["HITLGate"]
    
    FastMCP --> Config
    FastMCP --> Security
    FastMCP --> SessionMgr
    FastMCP --> JobExecutor
    JobExecutor --> JobTracker
    JobExecutor --> EnginePool
    JobExecutor --> Security
    JobExecutor --> SessionMgr
    JobExecutor --> ResultFormatter
    ResultFormatter --> PlotlyConverter
    FastMCP --> Monitoring
    FastMCP --> HITLGate

Transport Modes

stdio (Default, Local)

  • One agent, one session
  • No network overhead
  • No authentication (local context trusted)
  • Simplest for development and single-user setups

SSE (Server-Sent Events, Deprecated in v2.0)

  • Multiple agents, each over a long-lived HTTP event stream
  • One session per client (client-side tracking)
  • Requires reverse proxy for production auth
  • Kept for backward compatibility; migration encouraged to streamable HTTP

streamable-http (New in v2.0, Recommended for Remote)

  • MCP-native HTTP transport
  • Multiple agents with session isolation
  • Built-in bearer token auth via middleware
  • Codex CLI, Claude Code, and other HTTP-capable agents connect here
  • Stateless HTTP mode available for multi-instance deployments

Monitoring & Health Status

The server exposes health and metrics via HTTP endpoints:

graph TB
    Health["GET /health"]
    Metrics["GET /metrics"]
    Events["GET /dashboard/api/events"]
    
    Health -->|Status| Healthy["healthy<br/>(OK, no issues)"]
    Health -->|Status| Degraded["degraded<br/>(High utilization,<br/>elevated errors)"]
    Health -->|Status| Unhealthy["unhealthy<br/>(All engines max,<br/>startup failures)"]
    
    Metrics -->|Counters| Jobs["Jobs completed/failed"]
    Metrics -->|Counters| Sessions["Sessions created"]
    Metrics -->|Counters| Errors["Blocked attempts"]
    Metrics -->|Stats| Execution["Execution time<br/>percentiles"]
    Metrics -->|System| System["Memory, CPU,<br/>uptime"]
    
    Events -->|Log| EventLog["Recent events<br/>(errors, jobs)"]

HTTP status codes:

  • 200 OK — healthy or degraded
  • 503 Service Unavailable — unhealthy (cannot accept new jobs)
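
How a health status might be derived and mapped to these codes can be sketched as below. The 80% utilization and 10% error-rate thresholds, along with both function names, are assumptions chosen for illustration; only the three status names and the 200/503 mapping come from this page.

```python
def evaluate_health(busy: int, total: int, max_engines: int,
                    error_rate: float) -> str:
    """Classify the server as healthy, degraded, or unhealthy."""
    # No engines at all, or every engine busy with no room left to scale.
    if total == 0 or (busy == total and total >= max_engines):
        return "unhealthy"
    if busy / total > 0.8 or error_rate > 0.1:   # assumed thresholds
        return "degraded"
    return "healthy"


def http_status(health: str) -> int:
    """Only an unhealthy server answers 503; degraded still accepts jobs."""
    return 503 if health == "unhealthy" else 200
```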

Summary

The MATLAB MCP Server is a layered architecture that keeps concerns separated:

  1. Transport layer (stdio, SSE, streamable HTTP) — handles agent connectivity
  2. API layer (FastMCP tools) — defines what agents can do
  3. Security layer (blocklist, sanitization, HITL) — enforces policies
  4. Execution layer (executor, pool, tracker) — runs code reliably
  5. Output layer (formatter, Plotly converter) — structures results for agents
  6. Monitoring layer (collector, health, dashboard) — observability

Each layer is independent and testable. Changes to one layer (e.g., adding a new transport) do not require changes to others.
