Add worker version detection and on-demand update by gricha · Pull Request #79 · gricha/perry

gricha · 2026-01-10T17:46:17Z

Summary

Worker health endpoint now returns version info
Workspace get API includes workerVersion for running workspaces
New updateWorker API endpoint to update worker binary in a workspace
copyPerryWorker prefers installed binary (~/.perry/bin/perry) over dist

How it works

Worker /health endpoint returns { status: 'ok', version: '0.3.7', sessionCount: 5 }
When fetching workspace details, API queries worker health and includes workerVersion
UI can compare hostVersion (from /info) with workerVersion
If versions differ, UI shows "Update available" nudge with button
Button calls workspaces.updateWorker({ name }) which:
- Kills existing worker process
- Copies new binary from host
- Restarts worker server
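The comparison the UI performs can be sketched as below. This is a minimal illustration, not the PR's actual UI code: the helper name `needsWorkerUpdate` is hypothetical, and treating a missing `workerVersion` as "no nudge" is an assumption based on the note that `workerVersion` is only included for running workspaces.

```typescript
// Hypothetical helper: decides whether to show the "Update available" nudge.
// hostVersion comes from /info; workerVersion comes from the workspace get API.
function needsWorkerUpdate(
  hostVersion: string,
  workerVersion: string | undefined,
): boolean {
  // Assumption: no workerVersion means the workspace is not running,
  // so there is nothing to update yet.
  if (workerVersion === undefined) {
    return false;
  }
  // The PR describes a plain "versions differ" check, so no semver ordering is applied.
  return workerVersion !== hostVersion;
}
```

A plain inequality also flags a worker that is newer than the host, which matches the "if versions differ" wording above.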

Test plan

Start workspace, verify GET /rpc/workspaces/get returns workerVersion
Verify worker health endpoint returns version: curl http://<ip>:7392/health
Test updateWorker endpoint updates binary and restarts worker
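For the health-endpoint check, a small type guard can confirm that a parsed response has the shape shown in "How it works". This is a sketch under assumptions: the interface and function names are hypothetical, and the field types are inferred from the single example payload in this PR.

```typescript
// Shape inferred from the example payload in this PR:
// { status: 'ok', version: '0.3.7', sessionCount: 5 }
interface WorkerHealth {
  status: string;
  version: string;
  sessionCount: number;
}

// Hypothetical guard: true only for a healthy response that carries a version string.
function isHealthyWorkerResponse(body: unknown): body is WorkerHealth {
  if (typeof body !== "object" || body === null) {
    return false;
  }
  const fields = body as Record<string, unknown>;
  return (
    fields.status === "ok" &&
    typeof fields.version === "string" &&
    typeof fields.sessionCount === "number"
  );
}
```

The guard could be applied to the parsed JSON body returned by the curl command above; a response from a worker built before this change would lack `version` and fail the check.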

🤖 Generated with Claude Code

- Worker health endpoint now returns version info
- Workspace get API includes workerVersion for running workspaces
- New updateWorker API endpoint to update worker binary in a workspace
- copyPerryWorker prefers installed binary (~/.perry/bin/perry) over dist

This enables UI to detect version mismatch between host and workers, and provides a button to update outdated workers on demand.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

sentry · 2026-01-10T17:49:55Z

+      ['sh', '-c', 'pkill -f "perry worker serve" || true'],
+      { user: 'workspace' }
+    );
+
+    await this.copyPerryWorker(containerName);
+    await this.startWorkerServer(containerName);
+  }


Bug: The updateWorkerBinary function has a race condition. It kills the old worker but doesn't wait for it to terminate, which can prevent the new worker from starting correctly.
Severity: CRITICAL

🔍 Detailed Analysis

A race condition exists in the updateWorkerBinary function. The function calls pkill to terminate the existing worker process but does not wait for the process to fully shut down and release its port. This leads to two potential failure modes. First, if the old process is still shutting down but its health endpoint is responsive, startWorkerServer will detect a running worker and return early, never starting the new binary. Second, if the health check fails, the attempt to start the new worker may fail with an EADDRINUSE error if the port is still held by the dying process. This failure is silent because the startup command is backgrounded with &, preventing error propagation.

💡 Suggested Fix

Introduce a synchronization mechanism after killing the old worker process to ensure it has fully terminated and released the port before attempting to start the new one. This could be a short sleep, or a more robust solution like polling to confirm the port is free or the process ID is gone. Additionally, avoid backgrounding the startup command with & within execInContainer to ensure that startup errors like EADDRINUSE are properly propagated and handled, rather than failing silently.
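The polling variant of this fix could look roughly like the sketch below. It is not code from the PR: `waitUntilGone` is a hypothetical helper, the timeout and interval defaults are arbitrary, and the `isGone` callback is left to the caller because the exact signature of `execInContainer` is not shown here.

```typescript
// Hypothetical helper: polls `isGone` until it resolves true or the timeout elapses.
// Returns true if the condition was met, false if the deadline passed first.
async function waitUntilGone(
  isGone: () => Promise<boolean>,
  timeoutMs = 5000, // assumed default, tune for the real worker
  intervalMs = 100, // assumed default
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    if (await isGone()) {
      return true;
    }
    if (Date.now() >= deadline) {
      return false;
    }
    // Sleep between checks so the loop does not spin.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

In `updateWorkerBinary`, a call to this helper would sit between the `pkill` step and `copyPerryWorker`, with `isGone` checking that the old worker process has exited or that its port is no longer bound. A `false` result can then be surfaced as an error instead of letting the restart fail silently.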

🤖 Prompt for AI Agent

Review the code at the location below. A potential bug has been identified by an AI agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not valid.

Location: src/workspace/manager.ts#L312-L318

Potential issue: A race condition exists in the `updateWorkerBinary` function. The function calls `pkill` to terminate the existing worker process but does not wait for the process to fully shut down and release its port. This leads to two potential failure modes. First, if the old process is still shutting down but its health endpoint is responsive, `startWorkerServer` will detect a running worker and return early, never starting the new binary. Second, if the health check fails, the attempt to start the new worker may fail with an `EADDRINUSE` error if the port is still held by the dying process. This failure is silent because the startup command is backgrounded with `&`, preventing error propagation.

Reference ID: 8435010

sentry bot reviewed Jan 10, 2026


gricha merged commit 1d08258 into main Jan 10, 2026
8 checks passed

gricha deleted the worker-version-updates branch January 10, 2026 17:53
