Architecture
```mermaid
graph TB
Agent["AI Agent<br/>(Claude, Cursor, Copilot)"]
Agent -->|MCP Protocol<br/>stdio or SSE| Server["MCP Server<br/>(FastMCP)"]
Server -->|Register Tools| Tools["20 Built-in Tools<br/>+ Custom Tools"]
Server -->|Route Requests| Executor["Job Executor<br/>(Sync/Async Hybrid)"]
Server -->|Track Sessions| Sessions["Session Manager<br/>(Per-user Isolation)"]
Server -->|Validate Code| Security["Security Validator<br/>(Blocklist, Sanitize)"]
Server -->|Format Results| Formatter["Result Formatter<br/>(Text, Plotly, Images)"]
Executor -->|Acquire/Release| Pool["Engine Pool Manager<br/>(Elastic Scaling)"]
Pool -->|Start/Stop/Execute| Engines["MATLAB Engines<br/>(2020b+)"]
Executor -->|Track Jobs| Tracker["Job Tracker<br/>(Metadata Store)"]
Server -->|Optional| Monitor["Monitoring Stack<br/>(Collector, Store, Dashboard)"]
Engines -->|Figure Props| PlotlyConvert["Plotly Converter<br/>(Style Mapping, WebGL)"]
PlotlyConvert -->|JSON + PNG| Formatter
```
The entry point orchestrating all subsystems:
- Tool Registration: Exposes 20 built-in tools + dynamic custom tools from YAML
- Lifecycle Management: Server startup (pool init, monitoring setup), graceful shutdown, job draining
- Request Routing: Delegates tool calls to implementation modules
- Background Tasks: Health checks, session cleanup, metrics collection
- Transport Support: stdio (single session) and SSE (multi-session with HTTP)
Manages an elastic pool of MATLAB engine instances:
- Elastic Scaling: Starts with `min_engines` (default 2), scales up to `max_engines` (default 8) under load
- Proactive Warmup: When pool utilization exceeds `proactive_warmup_threshold` (80%), starts a new engine preemptively
- Idle Scale-Down: Engines idle longer than `scale_down_idle_timeout` (15 minutes) are stopped, down to `min_engines`
- Health Checks: Periodic `1+1` eval every 30 seconds to verify engines are responsive; unhealthy engines are replaced
- Work Queue: Async queue for requests when all engines are busy; configurable max queue depth
- Workspace Reset: Optional `clear all` between sessions for isolation
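The scaling rules above can be sketched as two pure decision functions. This is a minimal illustration, not the project's pool manager: `PoolConfig`, `should_warm_up`, and `engines_to_stop` are hypothetical names, and only the defaults (2, 8, 80%, 15 minutes) come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolConfig:
    # Defaults mirror the values quoted in the list above.
    min_engines: int = 2
    max_engines: int = 8
    proactive_warmup_threshold: float = 0.8
    scale_down_idle_timeout: float = 15 * 60  # seconds


def should_warm_up(busy: int, total: int, cfg: PoolConfig) -> bool:
    """Return True when a new engine should be started preemptively."""
    if total >= cfg.max_engines:
        return False  # already at the ceiling
    if total == 0:
        return True  # nothing running yet
    return busy / total > cfg.proactive_warmup_threshold


def engines_to_stop(idle_seconds: list[float], total: int, cfg: PoolConfig) -> int:
    """Count idle engines that can be stopped without going below min_engines."""
    expired = sum(1 for s in idle_seconds if s > cfg.scale_down_idle_timeout)
    return max(0, min(expired, total - cfg.min_engines))
```

Keeping the decisions free of side effects makes them easy to unit test; an actual pool would call them from its background health-check loop and then start or stop engines accordingly.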
Wraps a single matlab.engine instance with lifecycle and state management:
- States: STOPPED, STARTING, IDLE, BUSY (tracked via `EngineState` enum)
- Lifecycle: `start()` applies config (paths, startup commands), `stop()` calls `quit()`
- Execution: Sync eval or background eval with `concurrent.futures.Future` wrapping
- Health: `health_check()` runs a trivial eval; `is_alive` property
- Workspace: Optional reset to clear variables between sessions
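A rough sketch of the state handling described above. `FakeEngine` is a stand-in so the example runs without MATLAB; a real wrapper would presumably hold a session from `matlab.engine.start_matlab()` instead. The `EngineWrapper` class and its method bodies are assumptions for illustration, only the state names come from the list.

```python
from enum import Enum


class EngineState(Enum):
    STOPPED = "stopped"
    STARTING = "starting"
    IDLE = "idle"
    BUSY = "busy"


class FakeEngine:
    """Stand-in for a matlab.engine session (assumption, not the real API)."""

    def eval(self, code: str, nargout: int = 0) -> None:
        pass

    def quit(self) -> None:
        pass


class EngineWrapper:
    def __init__(self, factory=FakeEngine):
        self._factory = factory
        self._engine = None
        self.state = EngineState.STOPPED

    def start(self) -> None:
        self.state = EngineState.STARTING
        self._engine = self._factory()
        self.state = EngineState.IDLE

    def execute(self, code: str) -> None:
        if self.state is not EngineState.IDLE:
            raise RuntimeError(f"engine not idle: {self.state.name}")
        self.state = EngineState.BUSY
        try:
            self._engine.eval(code, nargout=0)
        finally:
            # Return to IDLE even if the eval raised.
            self.state = EngineState.IDLE

    def health_check(self) -> bool:
        """Run a trivial eval; any failure counts as unhealthy."""
        try:
            self.execute("1+1;")
            return True
        except Exception:
            return False

    def stop(self) -> None:
        if self._engine is not None:
            self._engine.quit()
        self._engine = None
        self.state = EngineState.STOPPED
```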
Orchestrates the full lifecycle of MATLAB code execution with hybrid sync/async model:
- Security Check: Scans code for blocked functions (handled by `SecurityValidator`)
- Job Creation: Registers job in tracker with initial PENDING state
- Context Injection: Adds `__mcp_job_id__` and `__mcp_temp_dir__` to workspace
- Synchronous Execution: Runs code, waits up to `sync_timeout` (default 30 seconds)
- Async Promotion: If timeout exceeded, promotes to background execution, returns `job_id`
- Result Capture: Serializes workspace variables, stdout/stderr, execution time, figures
- Monitoring Events: Emits events for metrics collection
Thread-safe in-memory registry of all jobs:
- CRUD: Create, get, list jobs filtered by session
- Lifecycle States: PENDING → RUNNING → COMPLETED/FAILED/CANCELLED
- Metadata: Job ID, session ID, code, engine assignment, timestamps, elapsed time
- Pruning: Periodic cleanup removes jobs older than `job_retention_seconds` (default 86400 = 24h)
Per-user session isolation for multi-user (SSE) deployments:
- Session Creation: Each session gets unique ID and temp directory
- Default Session: stdio transport uses a singleton "default" session
- Timeout: Sessions expire after `session_timeout` (default 3600s); cleanup on access
- Limits: Enforces `max_sessions` (default 10) per server
- Workspace Isolation: Optional `clear all` between sessions
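One way to picture the expiry-on-access and session limit is the sketch below. It is an assumption-laden illustration: `SessionManager.get_session`, the injectable `clock`, and the `mcp_session_` directory prefix are invented for the example; only the defaults (3600 s, 10 sessions, a "default" session) come from the list above.

```python
import shutil
import tempfile
import time
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Session:
    session_id: str
    temp_dir: Path
    last_access: float


class SessionManager:
    def __init__(self, session_timeout=3600.0, max_sessions=10, clock=time.monotonic):
        self._timeout = session_timeout
        self._max = max_sessions
        self._clock = clock  # injectable so expiry can be tested
        self._sessions: dict[str, Session] = {}

    def get_session(self, session_id: str = "default") -> Session:
        now = self._clock()
        self._expire(now)  # cleanup happens on access
        session = self._sessions.get(session_id)
        if session is None:
            if len(self._sessions) >= self._max:
                raise RuntimeError("max_sessions reached")
            temp_dir = Path(tempfile.mkdtemp(prefix="mcp_session_"))
            session = Session(session_id, temp_dir, now)
            self._sessions[session_id] = session
        session.last_access = now
        return session

    def _expire(self, now: float) -> None:
        for sid, s in list(self._sessions.items()):
            if now - s.last_access > self._timeout:
                shutil.rmtree(s.temp_dir, ignore_errors=True)
                del self._sessions[sid]
```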
Pre-execution security hardening:
- Function Blocklist: Detects dangerous patterns: `system()`, `unix()`, `dos()`, `!` (shell escape), `eval()`, `feval()`, `evalc()`, `evalin()`, `assignin()`, `perl()`, `python()`
- Smart Scanning: Strips string literals (`'...'` and `"..."`) and comments before regex matching to avoid false positives
- Filename Sanitization: Validates upload filenames to prevent path traversal (`../../../etc/passwd`)
- Size Limits: Enforces `max_upload_size_mb` (default 100MB)
- Customizable: Blocklist and enabled/disabled modes configurable via YAML
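The strip-then-match idea can be sketched with regular expressions. This is a heuristic illustration, not the project's `SecurityValidator`: `find_blocked` and `validate_filename` are hypothetical names, the lookbehind that tells a transpose (`b'`) from a string opener is an assumption of this sketch, and block comments (`%{ ... %}`) are not handled.

```python
import re

BLOCKED_FUNCTIONS = (
    "system", "unix", "dos", "eval", "feval",
    "evalc", "evalin", "assignin", "perl", "python",
)

# String literals and line comments, matched in one left-to-right pass so a
# '%' inside a string or a quote inside a comment is not misread.
_SKIP = re.compile("|".join([
    r"(?<![\w)\]}.'])'(?:[^'\n]|'')*'",  # 'char vector' (not a transpose)
    r'"(?:[^"\n]|"")*"',                 # "string"
    r"%[^\n]*",                          # % comment to end of line
]))
_BLOCKED_CALL = re.compile(r"\b(" + "|".join(BLOCKED_FUNCTIONS) + r")\s*\(")
_SHELL_ESCAPE = re.compile(r"^\s*!", re.MULTILINE)


def find_blocked(code: str) -> list[str]:
    """Return the sorted blocked names that appear as calls in the code."""
    stripped = _SKIP.sub(
        lambda m: "" if m.group().startswith("%") else "''", code
    )
    hits = sorted({m.group(1) for m in _BLOCKED_CALL.finditer(stripped)})
    if _SHELL_ESCAPE.search(stripped):
        hits.append("!")
    return hits


def validate_filename(name: str) -> str:
    """Accept only a bare filename; reject separators and '..' sequences."""
    if ".." in name or not re.fullmatch(r"[A-Za-z0-9][A-Za-z0-9._-]*", name):
        raise ValueError(f"unsafe filename: {name!r}")
    return name
```

For example, `find_blocked("disp('call system() later') % eval(x)")` finds nothing because both mentions sit inside a literal or a comment, while `find_blocked("system('ls')")` reports `system`.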
Structures tool responses for MCP protocol:
- Text Formatting: Truncates output exceeding `max_output_chars` (default 50000), saves overflow to file
- Variable Display: Summarizes workspace variables with type/shape info, hides large arrays for brevity
- Error Handling: Builds error response dicts with exception messages and stack traces
- Delegates: Hands off to Plotly converter and thumbnail generator for rich media
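A small sketch of the truncation behavior, under the assumption that the complete output is written to a file when it exceeds the limit. The function name `format_text`, the response keys, and the file name `full_output.txt` are all hypothetical.

```python
import tempfile
from pathlib import Path


def format_text(output: str, max_output_chars: int = 50000, overflow_dir=None) -> dict:
    """Return output as-is when short; otherwise truncate and save the full text."""
    if len(output) <= max_output_chars:
        return {"text": output, "truncated": False}
    directory = Path(overflow_dir or tempfile.mkdtemp(prefix="mcp_output_"))
    path = directory / "full_output.txt"
    path.write_text(output, encoding="utf-8")
    omitted = len(output) - max_output_chars
    return {
        "text": output[:max_output_chars] + f"\n... [{omitted} chars omitted]",
        "truncated": True,
        "overflow_file": str(path),
    }
```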
Converts MATLAB figures to interactive Plotly JSON:
- MATLAB-side (`mcp_extract_props.m`): Extracts raw figure structure (axes, traces, styling, limits, labels)
- Python-side: Maps MATLAB line styles, marker symbols, colormaps, fonts to Plotly equivalents
- WebGL Optimization: Automatically uses WebGL for line/scatter traces with >10,000 points
- Output: Plotly JSON + static PNG thumbnail + optional base64-encoded image
- Subplot Support: Handles single axes, rectangular grids, and tiled layouts with domain splitting
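The Python-side mapping might look roughly like this. The input dict shape (`x`, `y`, `line_style`, `marker`) is an assumption, since the structure produced by `mcp_extract_props.m` is not shown here; the Plotly values used (`scatter`/`scattergl` trace types, `dash` names, marker symbols) are standard Plotly attributes.

```python
# MATLAB LineStyle values mapped to Plotly line.dash values.
LINE_STYLES = {"-": "solid", "--": "dash", ":": "dot", "-.": "dashdot"}

# A few MATLAB Marker values mapped to Plotly marker.symbol values.
MARKERS = {
    "o": "circle", "square": "square", "diamond": "diamond",
    "x": "x", "+": "cross", "^": "triangle-up", "v": "triangle-down",
}

WEBGL_THRESHOLD = 10_000  # switch to WebGL above this many points


def line_to_trace(props: dict) -> dict:
    """Convert one extracted line object (hypothetical dict shape) to a trace."""
    x, y = list(props["x"]), list(props["y"])
    style = props.get("line_style", "-")
    symbol = MARKERS.get(props.get("marker", "none"))
    has_line = style != "none"
    parts = [name for name, on in (("lines", has_line), ("markers", symbol)) if on]
    trace = {
        "type": "scattergl" if len(x) > WEBGL_THRESHOLD else "scatter",
        "x": x,
        "y": y,
        "mode": "+".join(parts) or "none",
    }
    if has_line:
        trace["line"] = {"dash": LINE_STYLES.get(style, "solid")}
    if symbol:
        trace["marker"] = {"symbol": symbol}
    return trace
```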
When monitoring.enabled: true:
- MetricsCollector (`monitoring/collector.py`): Aggregates counters (jobs, sessions, errors), execution time ring buffer (p95), pool/system metrics
- MetricsStore (`monitoring/store.py`): Async SQLite database for time-series metrics and structured events
- Health Evaluator (`monitoring/health.py`): Classifies server health (healthy/degraded/unhealthy) based on utilization, error rate, capacity
- HTTP Dashboard (`monitoring/dashboard.py`, `static/`): Starlette sub-app serving `/health`, `/metrics`, `/dashboard` with real-time Plotly charts
```mermaid
sequenceDiagram
participant Agent
participant Server
participant Pool
participant Engine
participant Tracker
participant Executor
Agent->>Server: execute_code("x = magic(3)")
Server->>Executor: execute(session_id, code)
Executor->>Tracker: create_job()
Tracker-->>Executor: job_id
Executor->>Pool: acquire_engine()
Pool-->>Executor: engine
Executor->>Engine: workspace["__mcp_job_id__"] = job_id
Executor->>Engine: eval(code, timeout=30s)
Engine-->>Executor: result
Executor->>Tracker: mark_completed(result)
Executor->>Pool: release_engine()
Executor-->>Server: {status: "completed", output: "...", time: 0.23}
Server-->>Agent: MCP response
```
```mermaid
sequenceDiagram
participant Agent
participant Server
participant Executor
participant Pool
participant Engine
participant Tracker
Agent->>Server: execute_code("long_simulation()")
Server->>Executor: execute(session_id, code)
Executor->>Tracker: create_job()
Executor->>Pool: acquire_engine()
Executor->>Engine: eval(code, background=True)
Engine-->>Executor: Future (running)
Note over Executor: Timeout (30s) exceeded
Executor->>Tracker: mark_running(job)
Executor->>Pool: release_engine()
Executor-->>Server: {status: "running", job_id: "abc123"}
Server-->>Agent: MCP response
Note over Engine: Code continues in background
rect rgb(200, 200, 200)
Agent->>Server: get_job_status("abc123")
Server->>Tracker: get_job("abc123")
Tracker-->>Server: {status: "running", progress: 45%}
Server-->>Agent: {progress: 45%, elapsed: 120s}
end
rect rgb(200, 200, 200)
Agent->>Server: get_job_result("abc123")
Server->>Executor: fetch_result("abc123")
Executor->>Engine: result
Engine-->>Executor: workspace snapshot
Executor-->>Server: full result
Server-->>Agent: {status: "completed", output: "...", ...}
end
```
```mermaid
sequenceDiagram
participant Agent1
participant Agent2
participant Server
participant SessionMgr
participant Pool
Agent1->>Server: POST /mcp with session_id="user1"
Server->>SessionMgr: get_session("user1")
SessionMgr-->>Server: Session(temp_dir="/tmp/user1_abc123")
Agent2->>Server: POST /mcp with session_id="user2"
Server->>SessionMgr: get_session("user2")
SessionMgr-->>Server: Session(temp_dir="/tmp/user2_def456")
Agent1->>Server: execute_code("x = 1", session_id="user1")
Agent2->>Server: execute_code("y = 2", session_id="user2")
Server->>Pool: acquire_engine()
Pool-->>Server: engine1
Server->>Pool: acquire_engine()
Pool-->>Server: engine2
Note over Server: Both run in parallel<br/>in separate engines
```
Decision: Execute code synchronously by default; promote to async on timeout.
Rationale:
- Simplicity: Most code completes fast (< 5s); sync results are simpler for agents
- Responsiveness: No polling overhead for typical queries
- Async Support: Long-running code (Monte Carlo, large matrix ops) still supported via promotion
Trade-off: Timeout must be tuned per deployment. Too short → unnecessary promotion; too long → slow response to agents.
Decision: Scale engines on-demand up to max_engines, with proactive startup at 80% utilization.
Rationale:
- Resource Efficiency: Don't pre-allocate the full max_engines (default 8) if only 2 are needed
- Responsiveness: Proactive warmup avoids startup latency when load spikes
- Stability: Idle scale-down prevents resource bloat over long runs
Trade-off: Adds complexity (pool state machine); startup latency for first request in a cold pool (~2-3s).
Decision: Each session gets a unique temp directory; files isolated across sessions.
Rationale:
- Multi-User Safety: Prevents agent A from reading agent B's uploaded files
- Cleanup: Temp dirs deleted on session timeout; no manual cleanup needed
Trade-off: Requires session management in SSE mode; stdio mode is simpler (single "default" session).
Decision: Convert MATLAB figures to Plotly JSON for agent-side rendering.
Rationale:
- Interactivity: Agents can zoom/pan/hover; better for exploration
- Portability: No need to render/PNG on server; JSON is lightweight
- WebGL: Automatic acceleration for large datasets (10k+ points)
Trade-off: Some MATLAB figure features (custom uicontrols, special objects) don't convert cleanly; those figures fall back to PNG.
Decision: Blocklist configurable in YAML; can be customized or disabled.
Rationale:
- Flexibility: Different deployments have different risk tolerances
- Auditable: Blocklist visible in config; easy to review what's blocked
Trade-off: Operators must actively decide on a security posture; because the blocklist can be disabled, the server is not "safe by default."
Decision: Metrics are optional; when enabled, stored in async SQLite.
Rationale:
- Zero Overhead: If disabled, no background threads or I/O
- Low Footprint: SQLite requires no external service (Prometheus, InfluxDB)
- Async: Database writes don't block execution threads
Trade-off: SQLite scales to ~1000s of metrics/sec; high-volume deployments may need external time-series DB.
- Python 3.10+: Core server, async I/O with `asyncio`, FastMCP framework
- MATLAB 2020b+: Code execution via `matlab.engine` Python API
- JavaScript: Client-side dashboard with Plotly charting
- MATLAB Helper Scripts: `mcp_extract_props.m` for figure conversion, `mcp_checkcode.m` for linting, `mcp_progress.m` for progress reporting
- SQLite (Optional): Metrics storage when monitoring enabled
- FastMCP: MCP protocol abstraction (stdio/SSE)
- Starlette (Optional): HTTP sub-app for monitoring dashboard
- Pydantic: Configuration validation and serialization