Architecture

github-actions[bot] edited this page Mar 23, 2026 · 20 revisions

System Overview

graph TB
    Agent["AI Agent<br/>(Claude, Cursor, etc.)"]
    
    subgraph MCP["MCP Server<br/>(FastMCP)"]
        Tools["20 Built-in Tools<br/>+ Custom Tools"]
        SessionMgr["Session Manager"]
        Security["Security Validator"]
        Formatter["Result Formatter"]
    end
    
    subgraph Exec["Execution Layer"]
        JobExec["Job Executor<br/>(Sync/Async)"]
        JobTracker["Job Tracker"]
        Plotly["Plotly Converter<br/>+ Thumbnails"]
    end
    
    subgraph Pool["Engine Pool"]
        PoolMgr["Pool Manager<br/>(Elastic Scaling)"]
        Engine1["Engine 1"]
        Engine2["Engine 2"]
        EngineN["Engine N"]
    end
    
    Monitor["Monitoring<br/>(Metrics, Health)"]
    
    Agent -->|MCP Protocol| MCP
    
    MCP -->|Route Calls| Exec
    MCP -->|Security| Security
    MCP -->|Sessions| SessionMgr
    MCP -->|Format Results| Formatter
    
    Exec -->|Create/Track| JobTracker
    Exec -->|Acquire/Release| Pool
    Exec -->|Convert Figures| Plotly
    
    Pool -->|Manage| PoolMgr
    PoolMgr -->|Distribute| Engine1
    PoolMgr -->|Distribute| Engine2
    PoolMgr -->|Distribute| EngineN
    
    PoolMgr -->|Health Check| Monitor
    JobExec -->|Metrics| Monitor
    SessionMgr -->|Events| Monitor

Major Components

MCP Server (server.py)

The main FastMCP server entry point that orchestrates all subsystems:

  • Tool Registration: Registers 20 built-in tools across 7 categories (Code Execution, Job Management, Discovery, File Operations, Admin, File Reading, Monitoring) plus any custom tools from custom_tools.yaml
  • Lifecycle Management: Handles startup, shutdown, and graceful drain (waits for pending jobs before exit)
  • Session Context: Extracts the session ID from each request and routes the call to the appropriate handler
  • Background Tasks: Runs periodic health checks and idle session cleanup
  • Transport Support: Routes to stdio or SSE transport backends

Key Exports:

  • MatlabMCPServer — state object holding pool, executor, tracker, sessions, security, monitoring
  • create_server() — factory function assembling and configuring FastMCP
  • main() — CLI entry point with argument parsing

Engine Pool Manager (pool/manager.py)

Elastically scales MATLAB engines based on demand:

  • Elastic Scaling: Maintains min_engines to max_engines (default 2–10)
  • Proactive Warmup: When utilization exceeds proactive_warmup_threshold (80%), starts a new engine before a queue builds up
  • Scale-Down: Stops idle engines beyond min_engines after scale_down_idle_timeout (15 minutes)
  • Health Checks: Periodically pings each engine with a trivial eval (1+1) and replaces dead engines
  • Queue Management: Async queue bounded by queue_max_size; requests block when all engines are busy and the queue is full

Key Methods:

  • acquire() — async get an engine (waits if needed)
  • release(engine) — return engine to pool
  • get_status() — pool metrics (total, available, busy, max)
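
The scaling rules above can be sketched with a toy pool. This is a minimal illustration, not the code in pool/manager.py: FakeEngine and ElasticPool are hypothetical names, the pool hands out placeholder objects instead of MATLAB engines, and scale-down and health checks are left out.

```python
import asyncio


class FakeEngine:
    """Stand-in for a MATLAB engine wrapper (hypothetical; no MATLAB needed)."""

    def __init__(self, engine_id: int) -> None:
        self.engine_id = engine_id


class ElasticPool:
    """Toy pool that grows from min_engines up to max_engines on demand."""

    def __init__(self, min_engines: int = 2, max_engines: int = 10,
                 warmup_threshold: float = 0.8) -> None:
        self.max_engines = max_engines
        self.warmup_threshold = warmup_threshold
        self._total = 0
        self._idle: asyncio.Queue = asyncio.Queue()
        for _ in range(min_engines):
            self._add_engine()

    def _add_engine(self) -> None:
        self._total += 1
        self._idle.put_nowait(FakeEngine(self._total))

    async def acquire(self) -> FakeEngine:
        # Scale up when nothing is idle and there is still headroom.
        if self._idle.empty() and self._total < self.max_engines:
            self._add_engine()
        engine = await self._idle.get()  # waits if every engine is busy
        # Proactive warmup: add one engine once utilization crosses the threshold.
        busy = self._total - self._idle.qsize()
        if busy / self._total > self.warmup_threshold and self._total < self.max_engines:
            self._add_engine()
        return engine

    async def release(self, engine: FakeEngine) -> None:
        await self._idle.put(engine)

    def get_status(self) -> dict:
        available = self._idle.qsize()
        return {"total": self._total, "available": available,
                "busy": self._total - available, "max": self.max_engines}
```

Create the pool inside a running event loop (for example within a coroutine passed to asyncio.run) so the queue binds to the right loop on older Python versions.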

Engine Wrapper (pool/engine.py)

Wraps a single matlab.engine instance with lifecycle states:

  • States: STOPPED → STARTING → IDLE → BUSY
  • Lazy Startup: Engine only starts on first use (imports matlab.engine dynamically)
  • Workspace Setup: Applies default paths and startup commands (e.g., format long)
  • Workspace Reset: Optionally clears all variables, resets paths, and closes open files between sessions
  • Health Check: is_alive property checks if engine is responsive

Key Methods:

  • start() / stop() — lifecycle
  • execute(code, background=False) — run code (returns result or Future)
  • mark_busy() / mark_idle() — state transitions
  • reset_workspace() — isolation
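
The lifecycle states can be modelled as a small state machine. The sketch below is an assumption about how such a wrapper might guard its transitions: EngineSketch and the ALLOWED table are hypothetical, and no MATLAB engine is actually started.

```python
from enum import Enum


class EngineState(Enum):
    STOPPED = "stopped"
    STARTING = "starting"
    IDLE = "idle"
    BUSY = "busy"


# Allowed transitions: an assumption for illustration, not the project's table.
ALLOWED = {
    EngineState.STOPPED: {EngineState.STARTING},
    EngineState.STARTING: {EngineState.IDLE, EngineState.STOPPED},
    EngineState.IDLE: {EngineState.BUSY, EngineState.STOPPED},
    EngineState.BUSY: {EngineState.IDLE, EngineState.STOPPED},
}


class EngineSketch:
    """Tracks lifecycle state only; a real wrapper would also hold the engine."""

    def __init__(self) -> None:
        self.state = EngineState.STOPPED

    def _transition(self, new_state: EngineState) -> None:
        if new_state not in ALLOWED[self.state]:
            raise RuntimeError(
                f"invalid transition {self.state.name} -> {new_state.name}"
            )
        self.state = new_state

    def start(self) -> None:
        self._transition(EngineState.STARTING)
        # A real wrapper would import matlab.engine here and start MATLAB.
        self._transition(EngineState.IDLE)

    def stop(self) -> None:
        self._transition(EngineState.STOPPED)

    def mark_busy(self) -> None:
        self._transition(EngineState.BUSY)

    def mark_idle(self) -> None:
        self._transition(EngineState.IDLE)
```

Rejecting invalid transitions early (for example mark_busy() on a stopped engine) makes pool bugs surface as exceptions rather than as jobs sent to an engine that is not running.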

Job Executor (jobs/executor.py)

Orchestrates hybrid sync/async code execution:

  1. Create Job: Tracker records job metadata (session, code, timestamps)
  2. Acquire Engine: Gets engine from pool (waits if needed)
  3. Inject Context: Sets __mcp_job_id__ and __mcp_temp_dir__ in workspace
  4. Execute Background: Starts code execution with background=True to capture stdout/stderr
  5. Sync Timeout Logic:
    • If execution completes within sync_timeout (default 30s) → return the result inline with status "completed"
    • If it times out → promote the job to an async background task and return job_id with status "pending"
  6. Async Monitoring: Background task waits for completion, stores result, releases engine

Key Methods:

  • execute(session_id, code, temp_dir) — main entry point
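
The sync-timeout promotion in step 5 can be sketched with plain asyncio. This is an illustration under assumptions: run_with_promotion and the results dict are hypothetical, and an asyncio task stands in for the future that a background MATLAB call would return.

```python
import asyncio
from typing import Any, Awaitable, Dict


async def run_with_promotion(work: Awaitable[Any], sync_timeout: float,
                             results: Dict[str, dict], job_id: str) -> dict:
    """Wait up to sync_timeout for `work`; otherwise keep it running in the background."""
    task = asyncio.ensure_future(work)
    # asyncio.wait does not cancel the task when the timeout expires.
    done, _pending = await asyncio.wait({task}, timeout=sync_timeout)
    if done:
        return {"status": "completed", "output": task.result()}

    def _store(finished: asyncio.Future) -> None:
        # Runs once the promoted job finishes; records the outcome for later polling.
        if finished.cancelled():
            results[job_id] = {"status": "cancelled"}
        elif finished.exception() is not None:
            results[job_id] = {"status": "failed", "error": str(finished.exception())}
        else:
            results[job_id] = {"status": "completed", "output": finished.result()}

    task.add_done_callback(_store)
    return {"status": "pending", "job_id": job_id}
```

The key design point is that the timeout only bounds how long the caller waits, not how long the job may run. In a fuller version the done callback would also be the place to release the engine back to the pool.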

Job Tracker (jobs/tracker.py)

Thread-safe in-memory registry of all jobs:

  • Create: create_job(session_id, code) → generates unique ID, initial PENDING status
  • Get: get_job(job_id) → retrieve by ID
  • List: list_jobs(session_id=None) → filter by session
  • Prune: Automatically removes terminal jobs older than job_retention_seconds (default 24 hours)
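
A lock-protected dict is enough to sketch the registry described above. JobTrackerSketch, the Job fields, and the explicit prune() call are assumptions for illustration; the real tracker may prune on a timer or on access.

```python
import threading
import time
import uuid
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class JobStatus(Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


TERMINAL = {JobStatus.COMPLETED, JobStatus.FAILED}


@dataclass
class Job:
    job_id: str
    session_id: str
    code: str
    status: JobStatus = JobStatus.PENDING
    finished_at: Optional[float] = None


class JobTrackerSketch:
    def __init__(self, job_retention_seconds: float = 24 * 3600) -> None:
        self._jobs: Dict[str, Job] = {}
        self._lock = threading.Lock()
        self._retention = job_retention_seconds

    def create_job(self, session_id: str, code: str) -> Job:
        job = Job(job_id=uuid.uuid4().hex, session_id=session_id, code=code)
        with self._lock:
            self._jobs[job.job_id] = job
        return job

    def get_job(self, job_id: str) -> Optional[Job]:
        with self._lock:
            return self._jobs.get(job_id)

    def list_jobs(self, session_id: Optional[str] = None) -> List[Job]:
        with self._lock:
            return [j for j in self._jobs.values()
                    if session_id is None or j.session_id == session_id]

    def mark_completed(self, job_id: str, now: Optional[float] = None) -> None:
        with self._lock:
            job = self._jobs[job_id]
            job.status = JobStatus.COMPLETED
            job.finished_at = time.time() if now is None else now

    def prune(self, now: Optional[float] = None) -> int:
        """Drop terminal jobs older than the retention window; return the count."""
        now = time.time() if now is None else now
        with self._lock:
            stale = [jid for jid, j in self._jobs.items()
                     if j.status in TERMINAL and j.finished_at is not None
                     and now - j.finished_at > self._retention]
            for jid in stale:
                del self._jobs[jid]
        return len(stale)
```

Only terminal jobs are eligible for pruning, so a long-running job is never dropped while it is still pending or running.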

Session Manager (session/manager.py)

Per-user session isolation:

  • Session Creation: Each user gets a unique temp directory (.../temp/sess-xxx/)
  • Workspace Isolation: Optionally runs clear all and restoredefaultpath between sessions
  • Expiration: Sessions idle longer than session_timeout (default 1 hour) are destroyed
  • Limits: Enforces max_sessions (default 50)
  • Default Session: stdio transport uses a single persistent "default" session

Key Methods:

  • create_session() → Session object with ID and temp dir
  • get_session(session_id) → retrieve
  • destroy_session(session_id) → cleanup temp files
  • touch(session_id) → reset idle timer
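
The session lifecycle (create, touch, expire, destroy) can be sketched as follows. SessionManagerSketch and cleanup_expired() are hypothetical names, and the `now` parameters exist only to make the idle-timeout logic easy to exercise; the "sess-" prefix mirrors the directory naming mentioned above.

```python
import shutil
import time
import uuid
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, List, Optional


@dataclass
class Session:
    session_id: str
    temp_dir: Path
    last_active: float


class SessionManagerSketch:
    def __init__(self, base_dir: Path, session_timeout: float = 3600,
                 max_sessions: int = 50) -> None:
        self._base = Path(base_dir)
        self._timeout = session_timeout
        self._max = max_sessions
        self._sessions: Dict[str, Session] = {}

    def create_session(self, now: Optional[float] = None) -> Session:
        if len(self._sessions) >= self._max:
            raise RuntimeError("max_sessions reached")
        sid = "sess-" + uuid.uuid4().hex[:8]
        temp_dir = self._base / sid
        temp_dir.mkdir(parents=True)
        session = Session(sid, temp_dir, time.time() if now is None else now)
        self._sessions[sid] = session
        return session

    def get_session(self, session_id: str) -> Optional[Session]:
        return self._sessions.get(session_id)

    def touch(self, session_id: str, now: Optional[float] = None) -> None:
        # Reset the idle timer for an active session.
        self._sessions[session_id].last_active = time.time() if now is None else now

    def destroy_session(self, session_id: str) -> None:
        session = self._sessions.pop(session_id, None)
        if session is not None:
            shutil.rmtree(session.temp_dir, ignore_errors=True)

    def cleanup_expired(self, now: Optional[float] = None) -> List[str]:
        """Destroy sessions idle longer than session_timeout; return their IDs."""
        now = time.time() if now is None else now
        expired = [sid for sid, s in self._sessions.items()
                   if now - s.last_active > self._timeout]
        for sid in expired:
            self.destroy_session(sid)
        return expired
```

A background task like the idle session cleanup mentioned for the server could call cleanup_expired() periodically.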

Security Validator (security/validator.py)

Pre-execution security checks:

  • Function Blocklist: Detects calls to dangerous functions (system, unix, dos, !, eval, feval, evalc, evalin, assignin, perl, python). Smart scanning strips string literals and comments first to avoid false positives
  • Configurable: The blocklist can be disabled entirely or customized via security.blocked_functions
  • Filename Sanitization: Prevents path traversal (../../etc/passwd rejected) and invalid characters
  • Upload Limits: Enforces max_upload_size_mb (default 100MB)

Key Methods:

  • check_code(code) → raises BlockedFunctionError if unsafe
  • sanitize_filename(name) → safe filename or error dict
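
The strip-then-scan idea can be sketched with a few regular expressions. This is a simplified illustration, not the validator's actual rules: the patterns, the allowed filename characters, and the error dict shape are assumptions. It also shares the known weakness of regex stripping, for example a transpose apostrophe can be mistaken for the start of a string, and block comments are not handled.

```python
import re
from typing import Iterable, Union

BLOCKED = ("system", "unix", "dos", "eval", "feval", "evalc",
           "evalin", "assignin", "perl", "python")

_DOUBLE_QUOTED = re.compile(r'"(?:[^"\n]|"")*"')   # "text", with "" as escape
_SINGLE_QUOTED = re.compile(r"'(?:[^'\n]|'')*'")   # 'text', with '' as escape
_COMMENT = re.compile(r"%.*")                      # % to end of line


class BlockedFunctionError(Exception):
    pass


def strip_literals(code: str) -> str:
    """Blank out string literals first, then drop line comments."""
    code = _DOUBLE_QUOTED.sub('""', code)
    code = _SINGLE_QUOTED.sub("''", code)
    return _COMMENT.sub("", code)


def check_code(code: str, blocked: Iterable[str] = BLOCKED) -> None:
    stripped = strip_literals(code)
    # A line starting with "!" is MATLAB's shell escape.
    if re.search(r"^\s*!", stripped, flags=re.MULTILINE):
        raise BlockedFunctionError("shell escape (!) is blocked")
    for name in blocked:
        if re.search(rf"\b{re.escape(name)}\b", stripped):
            raise BlockedFunctionError(f"blocked function: {name}")


def sanitize_filename(name: str) -> Union[str, dict]:
    """Return the name if it looks safe, otherwise an error dict."""
    if "/" in name or "\\" in name or ".." in name:
        return {"error": "path traversal is not allowed"}
    if not re.fullmatch(r"[A-Za-z0-9._-]+", name):
        return {"error": "invalid characters in filename"}
    return name
```

With this sketch, `disp("system() is a function")` passes because the string is blanked before scanning, while `system('ls')` and `!ls` raise BlockedFunctionError.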

Result Formatter (output/formatter.py)

Structures tool responses into MCP-compatible dicts:

  • Text Output: Truncates to max_inline_text_length (default 50KB), saves excess to file
  • Variable Formatting: Summarizes workspace variables (type, size, preview)
  • JSON Serialization: Converts numpy arrays and similar objects to JSON-safe forms
  • Large Results: Flags results larger than large_result_threshold (default 10KB) for handling

Key Methods:

  • format_result(job, execution_result, temp_dir) → success dict
  • format_error(job, error_type, message) → error dict
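
Text truncation with spill-over to a file might look like the sketch below. format_text, the dict keys, and the choice to save the full output (rather than only the excess) are assumptions; the 50KB default is taken from the description above and measured here in characters for simplicity.

```python
from pathlib import Path
from typing import Union


def format_text(output: str, temp_dir: Union[str, Path],
                max_inline: int = 50 * 1024) -> dict:
    """Return output inline if short; otherwise truncate and save the full text."""
    if len(output) <= max_inline:
        return {"output": output, "truncated": False}
    # Hypothetical file name; the real formatter may name files per job.
    path = Path(temp_dir) / "full_output.txt"
    path.write_text(output, encoding="utf-8")
    return {
        "output": output[:max_inline],
        "truncated": True,
        "full_output_file": str(path),
    }
```

Returning the file path alongside the truncated text lets an agent fetch the rest through a file-reading tool only when it actually needs it.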

Plotly Converter (output/plotly_convert.py, output/plotly_style_mapper.py)

Converts MATLAB figures to interactive Plotly JSON:

MATLAB Side (mcp_extract_props.m):

  • Extracts raw figure properties: axes, line/scatter/bar/histogram/surface plots
  • Detects layout type (single, multi-row/column subplots, tiledlayout)
  • Exports properties JSON to temp file

Python Side:

  • plotly_style_mapper.py — maps MATLAB styles to Plotly: line styles, marker types, colormaps, fonts, colors, axis ranges
  • plotly_convert.py — loads JSON, validates schema, builds Plotly JSON
  • WebGL Support: Automatically uses scattergl / bar with WebGL for datasets >10,000 points
  • Static Images: Generates PNG/JPG at configurable DPI for non-interactive display
  • Thumbnails: Resizes to 400px width for preview
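
The style mapping and WebGL switch can be illustrated for a single line series. The input keys (x, y, line_style, marker) are a hypothetical property layout, not the schema written by mcp_extract_props.m, and the marker table is partial. The output is a plain dict in the shape of a Plotly trace, so no Plotly import is needed.

```python
from typing import Any, Dict

# MATLAB LineStyle -> Plotly line.dash
LINE_STYLES = {"-": "solid", "--": "dash", ":": "dot", "-.": "dashdot"}
# MATLAB Marker -> Plotly marker.symbol (partial list)
MARKERS = {"o": "circle", "s": "square", "d": "diamond", "x": "x",
           "+": "cross", "^": "triangle-up"}
WEBGL_THRESHOLD = 10_000


def line_to_trace(props: Dict[str, Any]) -> Dict[str, Any]:
    """Convert one extracted line series into a Plotly trace dict."""
    x, y = props["x"], props["y"]
    style = props.get("line_style", "-")
    marker = props.get("marker", "none")
    has_line = style != "none"
    has_marker = marker != "none"
    mode = "+".join(name for name, on in (("lines", has_line),
                                          ("markers", has_marker)) if on)
    trace: Dict[str, Any] = {
        # Switch to the WebGL trace type for large series.
        "type": "scattergl" if len(x) > WEBGL_THRESHOLD else "scatter",
        "mode": mode or "none",
        "x": list(x),
        "y": list(y),
    }
    if has_line:
        trace["line"] = {"dash": LINE_STYLES.get(style, "solid")}
    if has_marker:
        trace["marker"] = {"symbol": MARKERS.get(marker, "circle")}
    return trace
```

Unknown styles fall back to a solid line or a circle marker here, which is a choice of this sketch rather than documented behavior.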

Monitoring (monitoring/collector.py, monitoring/store.py, monitoring/health.py, monitoring/dashboard.py)

Real-time system metrics and health evaluation:

  • MetricsCollector: Gathers counters (jobs, errors, executions) and execution time statistics in memory
  • MetricsStore: Async SQLite backend for persistent time-series data (metrics snapshots, events)
  • Health Evaluator: Assesses system health ("healthy", "degraded", "unhealthy") based on pool utilization, error rates, capacity
  • Dashboard: Starlette sub-application serving /health, /metrics, and interactive web dashboard with Plotly charts

Key Endpoints:

  • /health → JSON health status (HTTP 200/503)
  • /metrics → metrics snapshot
  • /dashboard → HTML dashboard with real-time charts
  • /dashboard/api/* → JSON API for chart data
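
A health evaluation along these lines could map two inputs to a status and an HTTP code. Everything specific here is an assumption for illustration: the thresholds, the function name, and the choice to return 503 only for "unhealthy".

```python
from typing import Tuple


def evaluate_health(pool_utilization: float, error_rate: float) -> Tuple[str, int]:
    """Return (status, http_code) from pool utilization and error rate in [0, 1]."""
    # Thresholds below are illustrative, not the project's defaults.
    if error_rate >= 0.5:
        status = "unhealthy"
    elif error_rate >= 0.1 or pool_utilization >= 0.9:
        status = "degraded"
    else:
        status = "healthy"
    http_code = 503 if status == "unhealthy" else 200
    return status, http_code
```

Keeping "degraded" on HTTP 200 means a load balancer would keep routing traffic while the dashboard still surfaces the warning; a stricter deployment might prefer 503 for both non-healthy states.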

Data Flow

Synchronous Execution (< 30s)

sequenceDiagram
    participant Agent
    participant Server as MCP Server
    participant Security as Security
    participant Pool as Engine Pool
    participant Engine as MATLAB Engine
    participant Tracker as Job Tracker
    participant Formatter as Formatter

    Agent->>Server: execute_code("x = magic(3)")
    Server->>Security: check_code(code)
    Security-->>Server: OK
    Server->>Tracker: create_job(session, code)
    Tracker-->>Server: Job {id, status=PENDING}
    Server->>Pool: acquire()
    Pool-->>Server: Engine (waits if busy)
    Server->>Engine: workspace.__mcp_job_id__ = job_id
    Server->>Engine: eval(code, background=True)
    Engine-->>Server: Future (completes in 2s)
    Server->>Tracker: mark_completed(job)
    Server->>Formatter: format_result(job)
    Formatter-->>Server: {status: "completed", output: "ans = ..."}
    Server->>Pool: release(Engine)
    Server-->>Agent: result JSON

Asynchronous Promotion (> 30s)

sequenceDiagram
    participant Agent
    participant Server as MCP Server
    participant Pool as Engine Pool
    participant Engine as MATLAB Engine
    participant Tracker as Job Tracker

    Agent->>Server: execute_code("long_simulation()")
    Server->>Tracker: create_job(session, code)
    Server->>Pool: acquire()
    Server->>Engine: eval(code, background=True)
    Engine-->>Server: Future
    Note over Server: Wait 30s timeout...
    Server-->>Agent: {status: "pending", job_id: "abc123"}
    
    par Background Monitoring
        Server->>Engine: Poll future.result() (blocks)
        Engine-->>Server: Result (after 120s total)
    end
    
    Agent->>Server: get_job_status("abc123")
    Server->>Tracker: get_job(abc123)
    Tracker-->>Server: Job {status: RUNNING, ...}
    Server-->>Agent: {status: "running", progress: 45%}
    
    Note over Server: Background job completes
    Server->>Tracker: mark_completed(job)
    Server->>Pool: release(Engine)
    
    Agent->>Server: get_job_result("abc123")
    Server->>Tracker: get_job(abc123)
    Tracker-->>Server: Job {status: COMPLETED, result: {...}}
    Server-->>Agent: {status: "completed", output: "..."}

File Upload & Reading

graph LR
    Agent["Agent"]
    Upload["upload_data"]
    SessionMgr["Session Manager"]
    TempDir["Temp Directory<br/>(sess-xxx/)"]
    ReadFile["read_data/<br/>read_image"]
    
    Agent -->|base64 file| Upload
    Upload -->|sanitize filename| SessionMgr
    SessionMgr -->|save to| TempDir
    Agent -->|read filename| ReadFile
    ReadFile -->|load from| TempDir
    ReadFile -->|encode/render| Agent

Figure Conversion Pipeline

graph LR
    Code["MATLAB Code<br/>(plot, scatter, etc.)"]
    MatlabHelper["mcp_extract_props.m"]
    JSONFile["Figure JSON<br/>(temp dir)"]
    StyleMapper["plotly_style_mapper.py"]
    PlotlyJSON["Plotly JSON"]
    StaticImage["PNG/JPG Image<br/>(via kaleido)"]
    Thumbnail["Thumbnail<br/>(400px)"]
    Result["MCP Response"]
    
    Code -->|figure created| MatlabHelper
    MatlabHelper -->|save properties| JSONFile
    JSONFile -->|load & convert| StyleMapper
    StyleMapper -->|MATLAB→Plotly| PlotlyJSON
    PlotlyJSON -->|render| StaticImage
    StaticImage -->|shrink| Thumbnail
    PlotlyJSON -->|include in| Result
    StaticImage -->|include in| Result
    Thumbnail -->|include in| Result

Key Design Decisions & Trade-offs

1. Hybrid Sync/Async Execution

Decision: Auto-promote to async if code exceeds sync_timeout (default 30s)

Rationale:

  • Agents expect quick responses (< 30s ideal for low-latency interaction)
  • Long-running simulations must not block the agent
  • Auto-promotion means agents don't need to pre-declare job type

Trade-off:

  • Agent must poll get_job_status() for long jobs (not ideal for fire-and-forget)
  • Sync timeout adds latency for jobs that barely exceed threshold

2. In-Memory Job Tracker

Decision: Jobs stored in dict with automatic pruning of old terminal jobs

Rationale:

  • Fast lookups (no DB query for get_job_status)
  • Sufficient for typical session durations (1–24 hours)
  • Reduces deployment complexity (no external DB needed)

Trade-off:

  • Jobs lost on server restart (acceptable for MCP use case)
  • Memory grows unbounded without pruning (mitigated by job_retention_seconds)

3. Session-Based Workspace Isolation

Decision: Each user session gets a separate temp directory; workspace cleared on demand

Rationale:

  • Prevents accidental data leakage between users
  • Allows agents to make assumptions about clean state
  • Fits MCP's session-per-request model

Trade-off:

  • Repeated clear all adds per-request overhead (~100ms)
  • Disabling isolation may improve performance but requires trust

4. Elastic Engine Pool

Decision: Dynamic scaling min → max with proactive warmup and idle scale-down

Rationale:

  • Handles bursty agent traffic without pre-allocating many engines
  • MATLAB engines are expensive (memory, startup time ~2–5s)
  • Proactive warmup prevents cold starts during traffic spikes

Trade-off:

  • Pool manager adds complexity
  • Scale-up latency: new engines take time to start
  • Scale-down may kill engines needed again moments later

5. Separate Monitoring Subsystem

Decision: Optional metrics collection + dashboard, disabled by default

Rationale:

  • Monitoring adds overhead (SQLite writes, sampling)
  • Not all deployments need it (stdio personal use)
  • Keeps core execution path lean

Trade-off:

  • Metrics lag real-time (sampled every 10s by default)
  • Dashboard adds HTTP service complexity

6. Smart Security Scanning

Decision: Strip string literals and comments before checking for blocked functions

Rationale:

  • Reduces false positives (e.g., "system() is a function" should be allowed)
  • Users can document/reason about code without triggering blocks

Trade-off:

  • Regex-based stripping is not foolproof (e.g., clever string escapes)
  • Adds overhead to every check_code() call

7. Plotly Conversion with WebGL

Decision: Auto-switch to WebGL for datasets > 10,000 points

Rationale:

  • WebGL rendering keeps browser responsive for large scatter/bar plots
  • Automatic detection avoids user configuration

Trade-off:

  • WebGL rendering differs slightly from SVG in visual quality (anti-aliasing, color precision)
  • Adds complexity to style mapper

Configuration Impact on Architecture

| Setting | Impact |
| --- | --- |
| pool.min_engines / max_engines | Affects resource usage and queueing behavior |
| execution.sync_timeout | Changes when async promotion happens |
| execution.workspace_isolation | Tradeoff between isolation and performance |
| security.blocked_functions | Security vs. functionality |
| output.plotly_conversion | Disabling saves MATLAB helper overhead |
| monitoring.enabled | Enables metrics collection & dashboard |
| server.transport | stdio (single agent) vs. SSE (multi-agent) |
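
The dotted setting names suggest a nested configuration. The sketch below shows one way to represent defaults and apply overrides by dotted key. The helper names are hypothetical, the "sse" value is an assumed spelling, and only defaults stated on this page are included; the project's actual config loader may differ.

```python
from typing import Any, Dict

# Defaults mentioned on this page; other settings are omitted on purpose.
DEFAULTS: Dict[str, Dict[str, Any]] = {
    "pool": {"min_engines": 2, "max_engines": 10},
    "execution": {"sync_timeout": 30},
    "monitoring": {"enabled": False},
}


def get_setting(config: Dict[str, Dict[str, Any]], dotted_key: str) -> Any:
    """Look up a value such as 'execution.sync_timeout'."""
    section, _, name = dotted_key.partition(".")
    return config[section][name]


def with_overrides(defaults: Dict[str, Dict[str, Any]],
                   overrides: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Return a copy of defaults with dotted-key overrides applied."""
    merged = {section: dict(values) for section, values in defaults.items()}
    for dotted_key, value in overrides.items():
        section, _, name = dotted_key.partition(".")
        merged.setdefault(section, {})[name] = value
    return merged
```

For example, `with_overrides(DEFAULTS, {"execution.sync_timeout": 60})` yields a config in which async promotion happens after 60 seconds while DEFAULTS itself stays unchanged.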
