
fix(sdk): markOffline stops auto-heartbeat to prevent re-registration #74

Merged
khaliqgant merged 5 commits into main from fix/mark-offline-stops-heartbeat
Mar 10, 2026
@khaliqgant khaliqgant commented Mar 10, 2026

Problem

The e2e staging test 'markOffline transitions agent to offline' was failing:

❌ markOffline transitions agent to offline: InfraAgent did not transition to offline after markOffline

Root cause: When an agent with an active WebSocket called markOffline(), it only POSTed to /v1/agents/disconnect. But the auto-heartbeat timer (started on WS open) kept running and immediately re-registered the agent as online via /v1/agents/heartbeat, making markOffline() effectively a no-op.

Fix

markOffline() now calls stopAutoHeartbeat() before posting the disconnect request. This ensures the heartbeat timer doesn't race against the offline transition.

The Rust SDK already handled this correctly: disconnect() closes the WS (which stops the heartbeat) in addition to making the REST call.
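The ordering described above can be sketched roughly as follows. This is a minimal illustration, not the SDK's actual code: the class name AgentClientSketch, the injected post function, the isHeartbeatActive() helper, and the default interval are all assumptions made for the example. Only the method names markOffline() and stopAutoHeartbeat() and the two endpoint paths come from the description.

```typescript
// Hypothetical transport: the SDK's real HTTP layer is not shown in this PR.
type Post = (path: string) => Promise<void>;

class AgentClientSketch {
  private readonly post: Post;
  private readonly intervalMs: number;
  private heartbeatTimer: ReturnType<typeof setInterval> | null = null;

  // The 30s default is an assumption, not the SDK's real interval.
  constructor(post: Post, intervalMs: number = 30_000) {
    this.post = post;
    this.intervalMs = intervalMs;
  }

  // Started on WebSocket open, per the PR description.
  startAutoHeartbeat(): void {
    this.stopAutoHeartbeat(); // never run two timers at once
    this.heartbeatTimer = setInterval(() => {
      // Individual heartbeat failures are ignored in this sketch.
      this.post("/v1/agents/heartbeat").catch(() => {});
    }, this.intervalMs);
  }

  stopAutoHeartbeat(): void {
    if (this.heartbeatTimer !== null) {
      clearInterval(this.heartbeatTimer);
      this.heartbeatTimer = null;
    }
  }

  // Hypothetical helper, added only so the sketch can be inspected.
  isHeartbeatActive(): boolean {
    return this.heartbeatTimer !== null;
  }

  async markOffline(): Promise<void> {
    // Stop the timer first so no new heartbeat is scheduled after the disconnect.
    this.stopAutoHeartbeat();
    await this.post("/v1/agents/disconnect");
  }
}
```

Note that clearing the timer only prevents future heartbeats. A heartbeat request already in flight when markOffline() is called is not cancelled by this sketch.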

Tests

  • markOffline stops auto-heartbeat timer — verifies no heartbeat fetches fire after markOffline
  • markOffline prevents auto-heartbeat from re-registering agent — full lifecycle: connect → markOffline → verify no re-registration

All 240 SDK tests pass.

Failed CI run

https://github.com/AgentWorkforce/relaycast/actions/runs/22896829279



When an agent called markOffline() while connected via WebSocket, the
auto-heartbeat timer kept firing and immediately re-registered the agent
as online, making markOffline() effectively a no-op.

Now markOffline() stops the auto-heartbeat timer before posting the
disconnect request. This fixes the e2e 'markOffline transitions agent
to offline' test that was failing on staging.

Added tests:
- markOffline stops auto-heartbeat timer
- markOffline prevents auto-heartbeat from re-registering agent

@devin-ai-integration devin-ai-integration Bot left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 3 additional findings.

The SDK fix alone wasn't enough — the server-side AgentDO refreshes
presence in PresenceDO on every WS ping, which re-registered the agent
as online even after REST /v1/agents/disconnect was called.

Changes:
- AgentDO: add presenceSuppressed flag, skip presence heartbeat on pings
  when suppressed
- POST /agents/disconnect: notify AgentDO to suppress presence
- POST /agents/heartbeat: notify AgentDO to unsuppress presence
- New WS connections clear the suppression flag

This ensures markOffline() actually transitions the agent to offline
even with an active WebSocket connection.
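A rough sketch of the suppression flag described in this commit, assuming a much simplified model. PresenceSketch and AgentSketch are hypothetical stand-ins for PresenceDO and AgentDO, and every method name is invented for illustration. Only the presenceSuppressed flag and the four behaviors in the change list come from the commit message.

```typescript
// Hypothetical stand-in for PresenceDO: tracks which agents count as online.
class PresenceSketch {
  private readonly online = new Set<string>();

  heartbeat(agentId: string): void {
    this.online.add(agentId);
  }

  disconnect(agentId: string): void {
    this.online.delete(agentId);
  }

  isOnline(agentId: string): boolean {
    return this.online.has(agentId);
  }
}

// Hypothetical stand-in for AgentDO.
class AgentSketch {
  private readonly agentId: string;
  private readonly presence: PresenceSketch;
  private presenceSuppressed = false;

  constructor(agentId: string, presence: PresenceSketch) {
    this.agentId = agentId;
    this.presence = presence;
  }

  // Triggered by POST /agents/disconnect.
  suppressPresence(): void {
    this.presenceSuppressed = true;
  }

  // Triggered by POST /agents/heartbeat.
  unsuppressPresence(): void {
    this.presenceSuppressed = false;
  }

  // A new WS connection clears the suppression flag.
  onWebSocketOpen(): void {
    this.presenceSuppressed = false;
  }

  // Every WS ping refreshes presence unless suppression is active.
  onPing(): void {
    if (this.presenceSuppressed) return;
    this.presence.heartbeat(this.agentId);
  }
}
```

With this shape, a ping that arrives after a disconnect is a no-op for presence until either an explicit heartbeat or a fresh WebSocket connection clears the flag.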
devin-ai-integration[bot]

This comment was marked as resolved.

@github-actions

Preview deployed!

Environment  URL
API          https://pr74-api.relaycast.dev
Health       https://pr74-api.relaycast.dev/health
Observer     https://pr74-observer.relaycast.dev/observer

This preview shares the staging database and will be cleaned up when the PR is merged or closed.

Run E2E tests

npm run e2e -- https://pr74-api.relaycast.dev --ci

Open observer dashboard

https://pr74-observer.relaycast.dev/observer

Addresses Devin review: fire-and-forget suppress call left a window
where a WS ping could re-register the agent before suppression took
effect. Now awaited so the AgentDO processes suppression before the
disconnect response returns.
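The difference between the fire-and-forget call and the awaited call can be illustrated with a small sketch. Everything here is hypothetical: SlowAgentSketch simulates the delay of reaching the AgentDO with a 10 ms timer, and the two handler functions are simplified stand-ins for the disconnect route, not its real code.

```typescript
// Hypothetical agent object whose suppress call takes time to land,
// standing in for a request to another object.
class SlowAgentSketch {
  suppressed = false;

  async suppressPresence(): Promise<void> {
    // Simulated latency; the real delay is unknown.
    await new Promise<void>((resolve) => setTimeout(resolve, 10));
    this.suppressed = true;
  }
}

// Before: the handler responds while suppression may still be pending,
// so a WS ping arriving in that window can still refresh presence.
async function disconnectFireAndForget(agent: SlowAgentSketch): Promise<string> {
  void agent.suppressPresence();
  return "disconnected";
}

// After: suppression has been applied by the time the response is returned.
async function disconnectAwaited(agent: SlowAgentSketch): Promise<string> {
  await agent.suppressPresence();
  return "disconnected";
}
```

The awaited version trades a slightly slower disconnect response for the guarantee that the caller never sees a response before suppression is in place.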
devin-ai-integration[bot]

This comment was marked as resolved.

…nnect

Addresses two Devin review comments:

1. presenceSuppressed was in-memory only — lost on DO hibernation,
   allowing WS pings to re-register the agent after eviction. Now
   persisted to DO storage following the same pattern as agentSeq.

2. Suppress was called AFTER PresenceDO disconnect, leaving a race
   window. Reordered: suppress AgentDO first, then disconnect from
   PresenceDO.
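Both points can be sketched together under some assumptions. FlagStorage is a hypothetical key-value interface loosely modeled on promise-based Durable Object storage, MemoryFlagStorage simulates storage that outlives an evicted instance, and PersistentAgentSketch and handleDisconnect are illustrative names rather than the project's real code.

```typescript
// Hypothetical key-value interface, loosely modeled on Durable Object storage.
interface FlagStorage {
  get(key: string): Promise<boolean | undefined>;
  put(key: string, value: boolean): Promise<void>;
}

// In-memory storage that outlives any single agent instance in this sketch.
class MemoryFlagStorage implements FlagStorage {
  private readonly data = new Map<string, boolean>();

  async get(key: string): Promise<boolean | undefined> {
    return this.data.get(key);
  }

  async put(key: string, value: boolean): Promise<void> {
    this.data.set(key, value);
  }
}

const SUPPRESSED_KEY = "presenceSuppressed";

class PersistentAgentSketch {
  private readonly storage: FlagStorage;
  // undefined means "not loaded yet since this instance was created".
  private presenceSuppressed: boolean | undefined = undefined;

  constructor(storage: FlagStorage) {
    this.storage = storage;
  }

  // Lazily reload the flag so it survives eviction of the in-memory state.
  private async isSuppressed(): Promise<boolean> {
    let value = this.presenceSuppressed;
    if (value === undefined) {
      value = (await this.storage.get(SUPPRESSED_KEY)) ?? false;
      this.presenceSuppressed = value;
    }
    return value;
  }

  async setSuppressed(value: boolean): Promise<void> {
    this.presenceSuppressed = value;
    await this.storage.put(SUPPRESSED_KEY, value);
  }

  // Returns true when a WS ping should refresh presence.
  async shouldRefreshOnPing(): Promise<boolean> {
    return !(await this.isSuppressed());
  }
}

// Ordering after the fix: suppress the agent first, then mark it offline.
async function handleDisconnect(
  agent: PersistentAgentSketch,
  markPresenceOffline: () => Promise<void>,
): Promise<void> {
  await agent.setSuppressed(true);
  await markPresenceOffline();
}
```

Creating a second PersistentAgentSketch over the same storage stands in for an instance waking up after hibernation: it reads the persisted flag instead of defaulting to unsuppressed.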
devin-ai-integration[bot]

This comment was marked as resolved.

@khaliqgant khaliqgant merged commit 0ef75ee into main Mar 10, 2026
4 checks passed
@khaliqgant khaliqgant deleted the fix/mark-offline-stops-heartbeat branch March 10, 2026 11:01