Async Jobs

github-actions[bot] edited this page Mar 18, 2026 · 11 revisions

Long-running MATLAB code is automatically handled through the async job system.

How It Works

  1. You call execute_code with your MATLAB code
  2. The server starts executing the code and waits up to sync_timeout (default 30 seconds) for it to finish
  3. If execution finishes in time, the result is returned inline
  4. If execution exceeds sync_timeout, the job is promoted to async and you get back a job_id right away
  5. Poll get_job_status for progress updates
  6. Call get_job_result when the job completes
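
The steps above can be sketched from the client side as a small polling loop. This is a minimal sketch, not the project's client: `call_tool` is a hypothetical transport function (tool name and arguments in, decoded JSON out), and the `code` argument name for execute_code is an assumption.

```python
import time
from typing import Any, Callable, Dict

TERMINAL_STATES = {"completed", "failed", "cancelled"}

def run_matlab(call_tool: Callable[[str, Dict[str, Any]], Dict[str, Any]],
               code: str,
               poll_interval: float = 1.0) -> Dict[str, Any]:
    """Run code via execute_code and poll until the job reaches a terminal state.

    `call_tool` is a hypothetical transport; the "code" argument name is assumed.
    """
    response = call_tool("execute_code", {"code": code})
    if response["status"] in TERMINAL_STATES:
        return response  # finished within sync_timeout; the result is inline

    # Promoted to async: poll the status, then fetch the full result.
    job_id = response["job_id"]
    while True:
        status = call_tool("get_job_status", {"job_id": job_id})
        if status["status"] in TERMINAL_STATES:
            break
        time.sleep(poll_interval)
    return call_tool("get_job_result", {"job_id": job_id})
```

A fixed `poll_interval` keeps the sketch short; a real client might back off for jobs that run for hours.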

Job Lifecycle

PENDING → RUNNING → COMPLETED
                  → FAILED
                  → CANCELLED

State Transitions:

  • PENDING → RUNNING: An engine is acquired from the pool and the job begins execution
  • RUNNING → COMPLETED: Execution finishes successfully, either within sync_timeout or when the async background task completes
  • RUNNING → FAILED: Execution raises an exception
  • PENDING/RUNNING → CANCELLED: Job is explicitly cancelled via cancel_job
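
The transitions listed above can be written down as a small table. The following is an illustrative sketch only; the names `JobStatus`, `ALLOWED`, and `transition` are hypothetical and not taken from the server's code.

```python
from enum import Enum

class JobStatus(Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"

# Allowed moves, mirroring the State Transitions list; terminal states have none.
ALLOWED = {
    JobStatus.PENDING: {JobStatus.RUNNING, JobStatus.CANCELLED},
    JobStatus.RUNNING: {JobStatus.COMPLETED, JobStatus.FAILED, JobStatus.CANCELLED},
    JobStatus.COMPLETED: set(),
    JobStatus.FAILED: set(),
    JobStatus.CANCELLED: set(),
}

def transition(current: JobStatus, new: JobStatus) -> JobStatus:
    """Return the new status, or raise ValueError if the move is not allowed."""
    if new not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.value} -> {new.value}")
    return new
```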

Sync vs. Async Execution

Synchronous (< sync_timeout)

When code completes quickly:

  1. Engine is acquired from the pool
  2. Job context is injected (e.g., __mcp_job_id__, __mcp_temp_dir__)
  3. Code executes with stdout/stderr captured
  4. Result is returned inline with status="completed"
  5. Engine is released back to the pool

POST /execute_code
← {"status": "completed", "job_id": "j-...", "text": "ans = 42"}

Asynchronous (>= sync_timeout)

When code runs longer than sync_timeout:

  1. Engine is acquired and execution starts (same as sync)
  2. Server waits up to sync_timeout seconds
  3. If future completes in time: return result inline (sync path above)
  4. If timeout occurs: promote to async background task
    • Job is returned with status="pending"
    • Background task (_wait_for_completion) continues execution
    • Engine and future remain active
    • Server returns job_id immediately

POST /execute_code
← {"status": "pending", "job_id": "j-abc123"}

# Poll for completion
GET /get_job_status?job_id=j-abc123
← {"status": "running", "progress": 45, "message": "..."}

GET /get_job_status?job_id=j-abc123
← {"status": "completed", "progress": 100}

GET /get_job_result?job_id=j-abc123
← {"status": "completed", "job_id": "j-abc123", "text": "..."}
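
The promotion step can be illustrated with the standard library's `concurrent.futures`. This is a sketch under assumptions, not the server's implementation: `execute_with_promotion`, the `background` registry, and the job ID format are hypothetical, and a plain callable stands in for the MATLAB engine call.

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor, wait
from typing import Callable, Dict

def execute_with_promotion(executor: ThreadPoolExecutor,
                           run_code: Callable[[], str],
                           sync_timeout: float,
                           background: Dict[str, Future]) -> dict:
    """Wait up to sync_timeout for run_code; otherwise hand the future to `background`."""
    job_id = f"j-{uuid.uuid4().hex[:8]}"  # hypothetical ID format
    future = executor.submit(run_code)

    done, _ = wait([future], timeout=sync_timeout)
    if not done:
        # Timeout: the work keeps running; the caller gets a job_id to poll with.
        background[job_id] = future
        return {"status": "pending", "job_id": job_id}

    try:
        return {"status": "completed", "job_id": job_id, "text": future.result()}
    except Exception as exc:  # the executed code raised
        return {"status": "failed", "job_id": job_id, "error": {"message": str(exc)}}
```

Using `wait` rather than `future.result(timeout=...)` keeps a timeout of the wait itself distinct from a `TimeoutError` raised by the executed code.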

Job Status Polling

Use get_job_status to monitor a running job:

GET /get_job_status?job_id=j-abc123

Returns:

{
  "job_id": "j-abc123",
  "status": "running",
  "progress": 67,
  "message": "Processing item 67000/100000",
  "elapsed_seconds": 12.5
}

Fields:

  • status — Current state (pending, running, completed, failed, cancelled)
  • progress — Progress percentage (0–100), if reported via mcp_progress()
  • message — Status message, if reported via mcp_progress()
  • elapsed_seconds — Wall-clock time since job started

Progress Reporting

Use the mcp_progress() helper function in your MATLAB code to report progress back to the agent:

mcp_progress(__mcp_job_id__, percentage, message)

  • __mcp_job_id__ — automatically injected into the workspace by the server
  • percentage — number from 0 to 100
  • message — optional status message

Example

n = 1e6;
results = zeros(n, 1);
for i = 1:n
    results(i) = process_item(i);
    if mod(i, 1e5) == 0
        mcp_progress(__mcp_job_id__, i/n*100, ...
            sprintf('Processed %d/%d items', i, n));
    end
end
disp(mean(results));

The agent sees:

get_job_status → {status: "running", progress: 10, message: "Processed 100000/1000000 items"}
get_job_status → {status: "running", progress: 50, message: "Processed 500000/1000000 items"}
get_job_status → {status: "running", progress: 100, message: "Processed 1000000/1000000 items"}
get_job_result → {status: "completed", job_id: "j-...", text: "0.5023"}

How Progress Works Internally

  1. mcp_progress.m writes a JSON file to __mcp_temp_dir__/<job_id>.progress
  2. get_job_status reads this file and includes progress in the response
  3. The file is cleaned up when the job completes or is cancelled
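
The reading side of this mechanism might look like the sketch below. It assumes the progress file holds a JSON object with `progress` and `message` keys; the actual file schema is not documented here, and `read_progress` and `status_with_progress` are hypothetical names.

```python
import json
from pathlib import Path
from typing import Optional

def read_progress(temp_dir: str, job_id: str) -> Optional[dict]:
    """Read <temp_dir>/<job_id>.progress; return None if it is missing or unreadable."""
    path = Path(temp_dir) / f"{job_id}.progress"
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        # No progress reported yet, or the file was caught mid-write.
        return None

def status_with_progress(status: dict, temp_dir: str) -> dict:
    """Copy progress fields (assumed keys) into a status response, if available."""
    progress = read_progress(temp_dir, status["job_id"]) or {}
    merged = dict(status)
    for key in ("progress", "message"):
        if key in progress:
            merged[key] = progress[key]
    return merged
```

Treating an unparseable file as "no progress yet" avoids failing a status call when the read races with a write from MATLAB.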

Job Result Retrieval

Use get_job_result to fetch the complete result of a finished job:

GET /get_job_result?job_id=j-abc123

Returns (on success):

{
  "status": "completed",
  "job_id": "j-abc123",
  "text": "ans = 42",
  "variables": {...}
}

Returns (on failure):

{
  "status": "failed",
  "job_id": "j-abc123",
  "error": {
    "type": "MATLABExecutionException",
    "message": "Undefined variable 'x'",
    "matlab_id": "MATLAB:UndefinedFunction",
    "stack_trace": "Error in my_script (line 5)..."
  }
}
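
On the client side, the two response shapes can be folded into one helper that returns the output text or raises. This is a sketch: `MatlabJobError` and `unwrap_result` are hypothetical names, and only the fields shown in the responses above are used.

```python
class MatlabJobError(RuntimeError):
    """Raised when get_job_result reports status "failed" (hypothetical helper)."""

    def __init__(self, job_id: str, error: dict):
        self.job_id = job_id
        self.matlab_id = error.get("matlab_id")
        self.stack_trace = error.get("stack_trace")
        super().__init__(f"{job_id}: {error.get('message', 'unknown error')}")

def unwrap_result(result: dict) -> str:
    """Return the captured text of a completed job, or raise for any other status."""
    status = result["status"]
    if status == "completed":
        return result["text"]
    if status == "failed":
        raise MatlabJobError(result["job_id"], result["error"])
    raise RuntimeError(f"job {result['job_id']} has status {status!r}; no result available")
```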

Job Cancellation

Cancel a pending or running job with cancel_job:

POST /cancel_job
{"job_id": "j-abc123"}

Returns:

{
  "job_id": "j-abc123",
  "status": "cancelled"
}

Note: Cancellation takes effect immediately at the tracker level. If the underlying MATLAB execution has already begun, it may continue briefly; the engine is released once execution completes or times out.
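
The difference between tracker-level cancellation and stopping the underlying work can be seen with a standard `concurrent.futures.Future`, whose `cancel()` returns False once the call is already running. The sketch below is an assumption about structure, not the server's code: the `jobs` and `futures` dictionaries, the `engine_busy` flag, and the response for already-finished jobs are all hypothetical.

```python
from concurrent.futures import Future
from typing import Dict

def cancel_job(jobs: Dict[str, dict], futures: Dict[str, Future], job_id: str) -> dict:
    """Mark a job cancelled in the tracker and try to stop its future."""
    job = jobs[job_id]
    if job["status"] not in ("pending", "running"):
        # Already terminal: nothing to cancel (assumed behaviour).
        return {"job_id": job_id, "status": job["status"]}

    job["status"] = "cancelled"          # immediate, tracker-level
    stopped = futures[job_id].cancel()   # False if the call is already executing
    job["engine_busy"] = not stopped     # engine is freed later in that case
    return {"job_id": job_id, "status": "cancelled"}
```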

Job Listing

List all jobs in a session:

GET /list_jobs?session_id=s-xyz789

Returns:

{
  "jobs": [
    {
      "job_id": "j-abc123",
      "status": "completed",
      "created_at": 1699564800.5,
      "started_at": 1699564800.6,
      "completed_at": 1699564810.2,
      "elapsed_seconds": 9.6
    },
    ...
  ]
}
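
A client can turn a list_jobs entry into something more readable. The sketch below assumes the timestamps are Unix epoch seconds, which the sample values suggest but this page does not state; `summarize_job` is a hypothetical helper.

```python
from datetime import datetime, timezone

def summarize_job(job: dict) -> dict:
    """Add a UTC start time and the run duration to a list_jobs entry."""
    started = job.get("started_at")
    finished = job.get("completed_at")
    summary = {"job_id": job["job_id"], "status": job["status"]}
    if started is not None:
        # Assumes Unix epoch seconds.
        summary["started_utc"] = datetime.fromtimestamp(started, tz=timezone.utc).isoformat()
    if started is not None and finished is not None:
        summary["run_seconds"] = round(finished - started, 1)
    return summary
```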

Job Context Injection

The server automatically injects job-related variables into the MATLAB workspace before execution:

  • __mcp_job_id__ — The job's unique ID; pass to mcp_progress()
  • __mcp_temp_dir__ — Temporary directory for writing progress files, artifacts, etc.

Access these like any other variable:

disp(__mcp_job_id__)     % → "j-abc123"
disp(__mcp_temp_dir__)   % → "/tmp/mcp_session_xyz/jobs"

Retention & Cleanup

Completed, failed, and cancelled jobs are retained in memory for a configurable period before pruning:

  • Default retention: 86400 seconds (24 hours)
  • Pruning: Automatic; triggered periodically by the scheduler
  • Configuration: Set job_retention_seconds in sessions config

After a job is pruned, get_job_status and get_job_result will return "job not found".

sessions:
  job_retention_seconds: 86400  # 24 hours
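
The pruning rule can be sketched as a single pass over the job table. This is illustrative only: `prune_jobs` and the in-memory `jobs` dictionary are hypothetical, and measuring retention from `completed_at` is an assumption.

```python
import time
from typing import Dict, List, Optional

TERMINAL = {"completed", "failed", "cancelled"}

def prune_jobs(jobs: Dict[str, dict],
               retention_seconds: float = 86400,
               now: Optional[float] = None) -> List[str]:
    """Remove finished jobs older than retention_seconds; return the pruned IDs."""
    now = time.time() if now is None else now
    expired = [
        job_id for job_id, job in jobs.items()
        if job["status"] in TERMINAL
        and job.get("completed_at") is not None
        and now - job["completed_at"] > retention_seconds
    ]
    for job_id in expired:
        del jobs[job_id]  # later lookups for this ID would report "job not found"
    return expired
```

Pending and running jobs are never pruned here, regardless of age.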

Configuration

execution:
  sync_timeout: 30           # Seconds before async promotion (default)
  max_execution_time: 86400  # Hard limit on total execution time (24h)

sessions:
  job_retention_seconds: 86400  # Keep job metadata for 24h before pruning

Best Practices

  • Short code (< 30s): Results return inline; no polling needed
  • Medium code (30s to minutes): Auto-promoted to async; poll with get_job_status
  • Long code (hours): Add mcp_progress() calls so the agent can report live status
  • Cancel: Call cancel_job if you need to stop a running or pending job
  • Increase timeout: Set sync_timeout: 60 if most of your code takes 30–60 seconds
  • Progress granularity: Report progress at meaningful intervals (e.g., every 5% or every N items) to avoid file I/O overhead