
Architecture

github-actions[bot] edited this page Apr 3, 2026 · 20 revisions

System Overview

The MATLAB MCP Server connects AI agents to MATLAB through a layered architecture with elastic resource pooling, session isolation, and async job orchestration.

graph TB
    Agent["AI Agent<br/>(Claude, Cursor, Codex)"]
    Transport["Transport Layer<br/>(stdio/SSE/HTTP)"]
    MCP["FastMCP Server<br/>(Tool Registration)"]
    Auth["Auth Middleware<br/>(Bearer Token)"]
    Tools["MCP Tools Layer<br/>(execute, discover, files, jobs)"]
    Executor["Job Executor<br/>(Sync/Async Orchestration)"]
    Security["Security Validator<br/>(Code Blocking, Filename Sanitization)"]
    Pool["Engine Pool Manager<br/>(Elastic Scaling, Health Checks)"]
    Session["Session Manager<br/>(Workspace Isolation)"]
    Monitor["Monitoring Layer<br/>(Metrics, Health, Logs)"]
    Output["Output Formatter<br/>(Results, Plotly Conversion)"]
    MATLAB["MATLAB Engine API<br/>(R2022b+)"]
    
    Agent -->|MCP Protocol| Transport
    Transport -->|stdio/SSE/HTTP| Auth
    Auth -->|Validated| MCP
    MCP -->|Routes| Tools
    Tools -->|Execution| Executor
    Tools -->|Workspace| Session
    Executor -->|Validate| Security
    Executor -->|Acquire/Release| Pool
    Pool -->|Execute| MATLAB
    Executor -->|Format| Output
    Tools -->|Query| Monitor
    Monitor -->|Health/Metrics| Tools
    
    style Agent fill:#e1f5ff
    style Transport fill:#fff3e0
    style Auth fill:#f3e5f5
    style MCP fill:#e3f2fd
    style Tools fill:#e8f5e9
    style Executor fill:#fce4ec
    style Security fill:#fff9c4
    style Pool fill:#f0f4c3
    style Session fill:#e0f2f1
    style Monitor fill:#ede7f6
    style Output fill:#fbe9e7
    style MATLAB fill:#c8e6c9

Component Responsibilities

Server Layer (server.py)

Purpose: FastMCP server initialization, tool registration, and lifespan management.

  • Loads configuration from YAML and environment variables
  • Registers all 20+ MCP tools (core, discovery, file management, jobs, admin, custom, monitoring)
  • Manages server lifecycle (startup initialization, graceful shutdown, engine pool draining)
  • Routes tool invocations to implementation modules
  • Runs background tasks (health checks, session cleanup, metrics sampling)

Engine Pool Layer (pool/)

Purpose: Manage lifecycle and scaling of MATLAB engine instances.

Components:

  • EnginePoolManager — Maintains a queue of available engines, handles scale-up/down, health checks
  • MatlabEngineWrapper — Wraps individual engine with state tracking (STOPPED → STARTING → IDLE ↔ BUSY)
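The state tracking described for the wrapper can be pictured as a small transition table. This is a minimal sketch, not the project's code: `EngineState`, `ALLOWED`, and `transition` are hypothetical names, and the transitions back to STOPPED are an assumption, since the lifecycle above only shows STOPPED → STARTING → IDLE ↔ BUSY.

```python
from enum import Enum


class EngineState(Enum):
    STOPPED = "stopped"
    STARTING = "starting"
    IDLE = "idle"
    BUSY = "busy"


# Allowed moves, following STOPPED -> STARTING -> IDLE <-> BUSY.
# Assumption: any running engine may also be stopped (scale-down or a
# failed health check); the source does not spell this out.
ALLOWED = {
    EngineState.STOPPED: {EngineState.STARTING},
    EngineState.STARTING: {EngineState.IDLE, EngineState.STOPPED},
    EngineState.IDLE: {EngineState.BUSY, EngineState.STOPPED},
    EngineState.BUSY: {EngineState.IDLE, EngineState.STOPPED},
}


def transition(current: EngineState, target: EngineState) -> EngineState:
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.name} -> {target.name}")
    return target
```

Keeping the table explicit makes it cheap to reject a move such as STOPPED → BUSY, which would hand out an engine that was never started.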

Behavior:

  • Starts with min_engines, scales up to max_engines under load
  • Proactive warmup: when utilization exceeds proactive_warmup_threshold (80%), pre-starts an engine
  • Scale-down: idle engines beyond min_engines are stopped after scale_down_idle_timeout (15 min)
  • Health checks: periodically evaluates 1+1 on each engine to detect and replace unresponsive engines
  • Queue: requests wait when all engines are busy
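The warmup and scale-down rules above reduce to two small decisions. The sketch below is illustrative only: `PoolSnapshot`, `should_warm_up`, and `engines_to_stop` are hypothetical names, and the defaults simply mirror the 80% threshold and 15-minute idle timeout quoted in the list.

```python
from dataclasses import dataclass


@dataclass
class PoolSnapshot:
    total: int         # engines that are started (idle + busy)
    busy: int          # engines currently executing code
    starting: int = 0  # engines still warming up


def should_warm_up(snap: PoolSnapshot, max_engines: int, threshold: float = 0.8) -> bool:
    """True when utilization exceeds the threshold and the pool may still grow."""
    if snap.total + snap.starting >= max_engines:
        return False  # already at (or heading to) the ceiling
    if snap.total == 0:
        return snap.starting == 0  # nothing running and nothing on the way
    return snap.busy / snap.total > threshold


def engines_to_stop(idle_seconds, total, min_engines, idle_timeout=15 * 60):
    """Indices of idle engines eligible for scale-down, never dropping below min_engines."""
    surplus = max(total - min_engines, 0)
    expired = [i for i, idle in enumerate(idle_seconds) if idle >= idle_timeout]
    return expired[:surplus]
```

Counting engines that are still starting against `max_engines` is a design choice in this sketch: it avoids launching a second warmup while the first is in flight.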

Job Execution Layer (jobs/)

Purpose: Orchestrate code execution with hybrid sync/async semantics.

Components:

  • JobExecutor — Coordinates pool acquisition, context injection, execution, result building
  • JobTracker — Thread-safe registry of all jobs with retention-based cleanup
  • Job model — State machine (PENDING → RUNNING → COMPLETED/FAILED/CANCELLED)
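A thread-safe registry with retention-based cleanup might look roughly like the following. All names here (`JobStatus`, `Job`, `JobTracker`, `set_status`, `prune`, `retention_seconds`) are hypothetical and chosen for illustration; the real tracker's interface may differ.

```python
import threading
import time
import uuid
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class JobStatus(Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"


TERMINAL = {JobStatus.COMPLETED, JobStatus.FAILED, JobStatus.CANCELLED}


@dataclass
class Job:
    code: str
    session_id: str
    job_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    status: JobStatus = JobStatus.PENDING
    finished_at: Optional[float] = None


class JobTracker:
    def __init__(self, retention_seconds: float = 3600.0):
        self._jobs = {}
        self._lock = threading.Lock()  # guards every access to _jobs
        self._retention = retention_seconds

    def create_job(self, code: str, session_id: str) -> Job:
        job = Job(code, session_id)
        with self._lock:
            self._jobs[job.job_id] = job
        return job

    def set_status(self, job_id: str, status: JobStatus, now: Optional[float] = None) -> None:
        with self._lock:
            job = self._jobs[job_id]
            if job.status in TERMINAL:
                raise ValueError(f"job {job_id} already finished")
            job.status = status
            if status in TERMINAL:
                job.finished_at = time.time() if now is None else now

    def prune(self, now: Optional[float] = None) -> int:
        """Drop finished jobs older than the retention window; return how many."""
        now = time.time() if now is None else now
        with self._lock:
            old = [jid for jid, job in self._jobs.items()
                   if job.finished_at is not None and now - job.finished_at > self._retention]
            for jid in old:
                del self._jobs[jid]
        return len(old)
```

Only finished jobs carry a `finished_at` timestamp, so pending and running jobs are never pruned regardless of age.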

Flow:

  1. Security validation (blocked function scan)
  2. Create PENDING job in tracker
  3. Acquire engine from pool
  4. Inject job context (__mcp_job_id__, __mcp_temp_dir__) into workspace
  5. Start execution (sync or background)
  6. Sync path: if execution completes within sync_timeout (30s default), return the result inline and mark the job COMPLETED
  7. Async path: if execution exceeds sync_timeout, return the job_id immediately; a background task monitors completion
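Steps 5 to 7 can be illustrated with plain `asyncio`. This is a sketch under stated assumptions rather than the executor's actual code: an ordinary coroutine stands in for the MATLAB engine call, and `run_hybrid` and the `background` dict are hypothetical names.

```python
import asyncio


async def run_hybrid(work, job_id, sync_timeout=30.0, background=None):
    """Await `work` for up to sync_timeout seconds, then hand it off.

    `work` is any awaitable standing in for the engine call. Exceptions
    raised by `work` within the timeout simply propagate in this sketch.
    """
    task = asyncio.ensure_future(work)
    try:
        # shield() keeps the underlying task alive when wait_for times out;
        # only the outer wait is cancelled.
        result = await asyncio.wait_for(asyncio.shield(task), timeout=sync_timeout)
    except asyncio.TimeoutError:
        if background is not None:
            background[job_id] = task  # a monitor would await this later
        return {"job_id": job_id, "status": "running"}
    return {"job_id": job_id, "status": "completed", "result": result}
```

The important detail is `asyncio.shield`: without it, `wait_for` would cancel the work itself on timeout instead of promoting it to a background job.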

Session Layer (session/manager.py)

Purpose: Provide per-user workspace isolation.

Responsibilities:

  • Manage Session objects: unique ID, per-session temp directory, activity timestamps
  • Enforce session limits and idle timeouts
  • Auto-cleanup of expired sessions
  • Uses a single "default" session for stdio transport and a unique session per SSE client

Security Layer (security/validator.py)

Purpose: Validate MATLAB code and sanitize filenames before execution.

Mechanisms:

  • Function blocklist: Precompiled regex patterns for system, unix, dos, shell escape (!), eval, evalc, etc.
  • String-literal stripping: Smart removal of string content and comments to avoid false positives
  • Filename sanitization: Rejects path traversal (..) and invalid characters; allows only alphanumeric characters and safe punctuation
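The blocklist and string-literal stripping can be approximated with a few precompiled patterns. The following is a heuristic sketch, not the validator's real patterns: the names `strip_literals` and `find_blocked` are hypothetical, and the blocked set is just the functions named above. One known gap of this naive approach is that MATLAB's transpose operator (`a'`) can be mistaken for the start of a string.

```python
import re

# String literals and comments are matched in a single left-to-right pass,
# so a % inside a string or a quote inside a comment is consumed with its
# enclosing token instead of starting a new one.
_DQ = r'"(?:[^"\n]|"")*"'      # "double-quoted", with "" as an escaped quote
_SQ = r"'(?:[^'\n]|'')*'"      # 'single-quoted', with '' as an escaped quote
_COMMENT = r"%[^\n]*"          # % to end of line
_LITERALS = re.compile("|".join((_DQ, _SQ, _COMMENT)))

_BLOCKED = re.compile(r"\b(system|unix|dos|evalc|eval)\b")
_SHELL_ESCAPE = re.compile(r"^\s*!", re.MULTILINE)


def strip_literals(code: str) -> str:
    """Remove string contents and comments so they cannot trigger matches."""
    return _LITERALS.sub("", code)


def find_blocked(code: str) -> list:
    """Return blocked names found in executable code, in order of appearance."""
    stripped = strip_literals(code)
    hits = [m.group(1) for m in _BLOCKED.finditer(stripped)]
    if _SHELL_ESCAPE.search(stripped):
        hits.append("!")
    return hits
```

For example, `find_blocked("disp('call system(1) later')")` returns an empty list because the only mention of `system` sits inside a string literal, while `find_blocked("system('ls')")` reports `system`.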

Tools Layer (tools/)

Purpose: Implement all 20+ MCP tool handlers.

Tool Categories:

  • core.py: execute_code, check_code, get_workspace
  • discovery.py: list_toolboxes, list_functions, get_help
  • files.py: upload_data, delete_file, list_files, read_script, read_data, read_image
  • jobs.py: get_job_status, get_job_result, cancel_job, list_jobs
  • admin.py: get_pool_status, get_server_metrics
  • custom.py: loads and exposes user-defined MATLAB functions from YAML
  • monitoring.py: get_server_metrics, get_server_health, get_error_log

Output Formatting Layer (output/)

Purpose: Format MATLAB execution results for MCP responses.

Components:

  • formatter.py — Text truncation/saving, variable summarization, response building
  • plotly_convert.py & plotly_style_mapper.py — MATLAB figure → Plotly JSON conversion
  • thumbnail.py — Static PNG generation for figures

Monitoring Layer (monitoring/)

Purpose: Collect metrics and provide health diagnostics.

Components:

  • collector.py — In-memory metrics (counters, execution time ring buffer, event recording)
  • store.py — SQLite persistence with periodic pruning
  • health.py — Status classification (healthy/degraded/unhealthy) based on pool utilization and error rates
  • dashboard.py — Starlette HTTP app with /health, /metrics, /dashboard endpoints
  • routes.py — HTTP response builders for monitoring endpoints

Authentication Middleware (auth/) — NEW in v2.0

Purpose: Validate bearer token authentication on HTTP/SSE transports.

Mechanism:

  • BearerAuthMiddleware — Pure ASGI middleware that checks Authorization: Bearer <token> header
  • Reads token from MATLAB_MCP_AUTH_TOKEN environment variable
  • Returns HTTP 401 with WWW-Authenticate header on validation failure
  • Bypasses /health endpoint and OPTIONS pre-flight requests
  • Disabled for stdio transport (no HTTP layer)
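A pure ASGI middleware needs no framework: it is a callable that receives `scope`, `receive`, and `send`. The sketch below illustrates the mechanism listed above using only the standard library. `BearerAuthSketch` is a hypothetical name, and the token is passed to the constructor here for testability, whereas the server reads it from MATLAB_MCP_AUTH_TOKEN.

```python
import hmac


class BearerAuthSketch:
    """Pure ASGI middleware: wraps `app` and checks the Authorization header."""

    def __init__(self, app, token: str, bypass_paths=("/health",)):
        self.app = app
        self.expected = b"Bearer " + token.encode()
        self.bypass_paths = bypass_paths

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http" or self._is_bypassed(scope):
            await self.app(scope, receive, send)
            return
        # ASGI delivers headers as (name, value) byte pairs with lowercase names.
        headers = dict(scope.get("headers", []))
        supplied = headers.get(b"authorization", b"")
        # Constant-time comparison avoids leaking the token through timing.
        if hmac.compare_digest(supplied, self.expected):
            await self.app(scope, receive, send)
            return
        await send({
            "type": "http.response.start",
            "status": 401,
            "headers": [(b"www-authenticate", b"Bearer"),
                        (b"content-type", b"text/plain")],
        })
        await send({"type": "http.response.body", "body": b"Unauthorized"})

    def _is_bypassed(self, scope) -> bool:
        return scope.get("method") == "OPTIONS" or scope.get("path") in self.bypass_paths
```

This sketch compares the whole header value exactly, so it treats the `Bearer` scheme as case-sensitive; a production middleware may want to be more lenient.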

Data Flow Diagrams

Synchronous Code Execution

sequenceDiagram
    participant Agent as AI Agent
    participant Server as MCP Server
    participant Job as JobTracker
    participant Pool as EnginePoolManager
    participant Engine as MATLAB Engine
    participant Security as SecurityValidator
    participant Output as OutputFormatter
    
    Agent->>Server: execute_code("x = magic(3)")
    Server->>Security: check_code(code)
    Security-->>Server: OK
    Server->>Job: create_job(code, session_id)
    Job-->>Server: job_id, PENDING
    Server->>Pool: acquire(timeout=5s)
    Pool-->>Server: engine (IDLE→BUSY)
    Server->>Engine: eval(code, background=False)
    Engine->>Engine: Execute synchronously
    Engine-->>Server: stdout, stderr, status
    Server->>Output: format_result(stdout, variables)
    Output-->>Server: formatted_result
    Server->>Job: mark_completed(job_id, result)
    Server->>Pool: release(engine)
    Pool-->>Server: engine (BUSY→IDLE)
    Server->>Agent: {job_id, status: "completed", result}
    
    style Server fill:#e3f2fd
    style Engine fill:#c8e6c9
    style Job fill:#fce4ec
    style Pool fill:#f0f4c3

Asynchronous Job Promotion

sequenceDiagram
    participant Agent as AI Agent
    participant Server as MCP Server
    participant Job as JobTracker
    participant Pool as EnginePoolManager
    participant Engine as MATLAB Engine
    participant BG as Background Task
    
    Agent->>Server: execute_code("long_simulation()")
    Server->>Job: create_job(code, session_id)
    Server->>Pool: acquire()
    Server->>Engine: eval(code, background=True)
    Engine-->>Server: Future<result>
    Server->>Server: Wait sync_timeout (30s)
    Note over Server: sync_timeout exceeded
    Server-->>Agent: {job_id: "abc123", status: "running"}
    
    par Background Monitoring
        BG->>Engine: Poll future.result()
        Engine->>Engine: Execute simulation
        Engine-->>BG: (still running)
        BG->>Job: Update progress if available
    end
    
    Agent->>Server: get_job_status("abc123")
    Server->>Job: get_job("abc123")
    Job-->>Server: {status: "running", elapsed_seconds: 45}
    Server-->>Agent: {status: "running", progress: 60%}
    
    Engine-->>BG: result (complete)
    BG->>Server: Format result
    BG->>Job: mark_completed("abc123", result)
    BG->>Pool: release(engine)
    
    Agent->>Server: get_job_result("abc123")
    Server->>Job: get_job("abc123")
    Job-->>Server: {status: "completed", result}
    Server-->>Agent: {status: "completed", result: {...}}

Session Cleanup Flow

sequenceDiagram
    participant Server as MCP Server
    participant SessionMgr as SessionManager
    participant Tracker as JobTracker
    participant FS as File System
    
    loop Every cleanup_interval (60s)
        Server->>SessionMgr: cleanup_expired()
        SessionMgr->>SessionMgr: List all sessions
        
        alt Session idle_seconds > session_timeout
            SessionMgr->>Tracker: list_jobs(session_id)
            
            alt Active jobs exist
                Note over SessionMgr: Skip cleanup (active work)
            else No active jobs
                SessionMgr->>FS: Remove temp_dir
                SessionMgr->>SessionMgr: Delete session
            end
        end
    end
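The cleanup loop in the diagram above can be sketched as a single pass over the session table. This is an illustration under assumptions: `Session`, `cleanup_expired`, and the `has_active_jobs` callback are hypothetical stand-ins for the SessionManager and JobTracker interfaces.

```python
import shutil
import tempfile
import time
from dataclasses import dataclass, field


@dataclass
class Session:
    session_id: str
    temp_dir: str = field(default_factory=lambda: tempfile.mkdtemp(prefix="mcp_session_"))
    last_active: float = field(default_factory=time.time)


def cleanup_expired(sessions, has_active_jobs, session_timeout, now=None):
    """Remove idle sessions that have no active jobs; return the removed IDs."""
    now = time.time() if now is None else now
    removed = []
    # Iterate over a copy because entries are deleted during the loop.
    for sid, session in list(sessions.items()):
        if now - session.last_active <= session_timeout:
            continue  # still within the idle window
        if has_active_jobs(sid):
            continue  # skip cleanup while work is in flight
        shutil.rmtree(session.temp_dir, ignore_errors=True)
        del sessions[sid]
        removed.append(sid)
    return removed
```

A server would call this from its background task once per cleanup_interval, passing a callback that asks the job tracker whether the session still has pending or running jobs.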

Bearer Token Authentication Flow (HTTP/SSE)

sequenceDiagram
    participant Agent as AI Agent
    participant Middleware as BearerAuthMiddleware
    participant MCP as MCP Server
    participant Pool as EnginePoolManager
    
    Agent->>Agent: Read MATLAB_MCP_AUTH_TOKEN="token123"
    Agent->>Middleware: POST /mcp<br/>Authorization: Bearer token123
    
    Middleware->>Middleware: Read env MATLAB_MCP_AUTH_TOKEN
    Middleware->>Middleware: Verify token (constant-time compare)
    
    alt Token valid
        Middleware->>MCP: Forward request (validated)
        MCP->>Pool: Process tool call
        Pool-->>MCP: Result
        MCP-->>Middleware: Response
        Middleware-->>Agent: 200 + body
    else Token invalid/missing
        Middleware-->>Agent: 401 + WWW-Authenticate header
        Agent->>Agent: (Retry with correct token)
    end
    
    Note over Middleware: /health endpoint bypasses auth
    Note over Middleware: OPTIONS pre-flight bypassed
    Note over Middleware: stdio transport has no middleware

Key Design Decisions

1. Hybrid Sync/Async Execution

  • Decision: Jobs start synchronously; if they exceed sync_timeout, they're promoted to async background tracking
  • Rationale: Fast jobs return their result inline with low latency, while long simulations do not block the agent
  • Trade-off: Adds complexity to the executor state machine, but serves both quick queries and long computations well

2. Elastic Engine Pool

  • Decision: Pool starts at min_engines, scales up to max_engines, scales down idle engines
  • Rationale: Handles variable load without over-committing resources (MATLAB engines are heavyweight)
  • Trade-off: Scale-up has ~1-2s startup latency, which is amortized over job execution time

3. Session Isolation

  • Decision: Each session gets a unique temp directory and optional workspace clearing
  • Rationale: Multi-user deployments need workspace separation; SSE transports require session IDs
  • Trade-off: Added complexity in session manager, but essential for shared-server setups

4. Pure ASGI Middleware for Auth

  • Decision: Bearer token validation via pure ASGI middleware (not Starlette BaseHTTPMiddleware) wrapping the MCP app
  • Rationale: Avoids Starlette's double-send bug with streaming responses; clean separation of concerns
  • Trade-off: Requires manual ASGI scope handling, but necessary for correctness with HTTP streaming

5. Precompiled Regex for Function Blocking

  • Decision: Security validator uses precompiled regex patterns + string-literal stripping
  • Rationale: Fast, deterministic, avoids false positives for string literals
  • Trade-off: Heuristic-based (not a full MATLAB parser); rare edge cases may slip through, but these limitations are documented

6. Stateless HTTP Mode (v2.0+)

  • Decision: Optional stateless_http: true mode forwards session ID from headers, no sticky sessions
  • Rationale: Enables horizontal load-balancing without session affinity (critical for shared corporate infra)
  • Trade-off: Requires agent to send consistent headers; small overhead per request

Transport Comparison

| Feature | stdio | SSE | HTTP/Streamable |
|---|---|---|---|
| Users | 1 | Multiple | Multiple |
| Sessions | Default | By client ID | By header |
| Auth | None | Optional (reverse proxy) | Bearer token (built-in) |
| Network | Local only | Remote (CORS issues) | Remote (CORS ready) |
| Upgrade-friendly | No | SSE deprecated | Recommended |
| Stateless mode | N/A | No | Yes |
| Monitoring dashboard | No | Yes | Yes |

Monitoring & Observability

The monitoring layer continuously collects:

  • Metrics: Engine utilization, jobs completed/failed, execution time percentiles, error rates
  • Health status: HEALTHY (low utilization, <1% errors), DEGRADED (>80% utilization or >5% errors), UNHEALTHY (no engines available, or pool at max capacity)
  • Events: Job completion, errors, auth failures, engine restarts
  • Storage: In-memory snapshots + SQLite persistence with 24-hour retention (configurable)
  • Dashboard: Real-time Plotly charts at /dashboard endpoint

Health evaluation runs on every /health request and on a background sampling loop, providing agents with visibility into server capacity and reliability.
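The status classification can be expressed as a small pure function. The sketch below uses the thresholds quoted in the list above, but two points are assumptions and not taken from the source: "max capacity" is read as "the pool has grown to max_engines and every engine is busy", and error rates between the healthy and degraded thresholds are treated as healthy. `classify_health` is a hypothetical name.

```python
def classify_health(total_engines: int, busy_engines: int,
                    max_engines: int, error_rate: float) -> str:
    """Return "healthy", "degraded", or "unhealthy" for a pool snapshot.

    error_rate is a fraction (0.05 means 5%).
    """
    if total_engines == 0:
        return "unhealthy"  # no engines available
    if total_engines >= max_engines and busy_engines >= total_engines:
        # Assumption: "max capacity" means fully grown and fully busy.
        return "unhealthy"
    utilization = busy_engines / total_engines
    if utilization > 0.80 or error_rate > 0.05:
        return "degraded"
    return "healthy"
```

Because the function depends only on its arguments, the same logic can serve both the /health request path and the background sampling loop.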
