Skip to content

fix up command#537

Merged
khaliqgant merged 12 commits intomainfrom
fix-up
Mar 10, 2026
Merged

fix up command#537
khaliqgant merged 12 commits intomainfrom
fix-up

Conversation

@khaliqgant
Copy link
Copy Markdown
Member

@khaliqgant khaliqgant commented Mar 10, 2026

Summary

Collection of CLI and broker fixes addressing entrypoint symlink resolution, agent management improvements, and deduplication correctness.

Changes

CLI Entrypoint

  • Fix symlink resolution in CLI entrypoint guard — uses fs.realpathSync to resolve symlinks before comparing paths, with path.resolve fallback (src/cli/index.ts)
  • Update cliScript fallback from bootstrap.js to index.js to match the single-entrypoint model (src/cli/commands/core.ts)
  • New test suite validating entrypoint behavior and package.json bin mapping (src/cli/entrypoint.test.ts)

Agent Management

  • Forward release reason in HTTP agent management client — sends { reason } in DELETE body (src/cli/commands/agent-management.ts)
  • Refactor agent management to use HTTP-based client with broker auto-discovery, port scanning, and auto-start capability
  • Always resolve API port in broker lifecycle, even without dashboard (src/cli/lib/broker-lifecycle.ts)

Broker (Rust)

  • Fix dedup key priority for spawn requests — prefers event_id before falling back to agent name, preventing false-positive deduplication when multiple spawns share a name (src/main.rs)
  • Auto-format Rust code with cargo fmt

CI/CD

  • Add standalone macOS smoke test to package-validation workflow
  • Add ci-standalone-smoke.sh script for standalone binary validation
  • Update publish workflow

SDK

  • Various test fixes and updates for lifecycle, workflow runner, and integration tests
  • Python SDK client improvements

Known limitation

The release reason is now sent from the TS HTTP client, but the Rust listen_api.rs DELETE handler does not yet parse the request body — so the reason is silently dropped for HTTP-based releases. The WebSocket-based release path (release_agent in main.rs) correctly handles the reason field. A follow-up PR should add body parsing to the Rust HTTP endpoint.

Commits

  • d1c770e7 fix up command
  • ca63089e test fixes and updates for lifecycle
  • 4fde8ed5 style: auto-format Rust code with cargo fmt
  • 738cb7c7 fix: resolve symlinks in CLI entrypoint guard
  • 89ca732a fix: forward release reason in HTTP agent management client
  • 2d2bf7f4 fix: dedup key priority — event_id before agent name for spawn requests

Breaking changes

None.

Testing done

  • All 584 TypeScript tests pass
  • All 233 Rust tests pass
  • CLI --version and --help work correctly
  • Symlink resolution verified manually (ln -sf ... && node /tmp/test-relay-symlink --version)
  • who command works against running broker
  • release --help confirms command structure

🤖 Generated with Claude Code

devin-ai-integration[bot]

This comment was marked as resolved.

@khaliqgant
Copy link
Copy Markdown
Member Author

Bug report from the latest full integration rerun:

The original agent-relay up / installer / stale-broker fixes are in, but the full broker integration suite is still not green because workflow "local mode" is incomplete.

Root cause:

  • WorkflowRunner sets AGENT_RELAY_WORKFLOW_DISABLE_RELAYCAST=1
  • but it still constructs AgentRelay with channels: [channel]
  • AgentRelayClient always launches the broker with init --channels ...
  • the broker then still attempts Relaycast registration / workspace calls
  • so tests that should be purely local still fail against https://api.relaycast.dev/...

Observed failure pattern:

  • after moving some workflow suites to useRelaycast: false, they still fail with broker exit / Relaycast workspace errors
  • this shows the disable flag currently stops JS-side channel setup, but does not stop broker-side Relaycast usage

Representative failing suites after the rerun:

  • workflow-dag
  • workflow-lifecycle
  • workflow-verification
  • parts of workflow-runner
  • parts of workflow-patterns

There are also a couple of likely real behavior mismatches behind the infrastructure noise:

  • workflow-dag: detects dependency cycle expects a rejected execution path, but the runner currently returns a failed run instead
  • workflow-lifecycle: abort cancels a running workflow currently completes instead of cancelling

Repro:

RELAY_INTEGRATION_CLI_LIFECYCLE=1 node --test tests/integration/broker/dist/*.test.js

Recommended follow-up:

  1. Make workflow local mode truly local end-to-end: when Relaycast is disabled, do not start the broker with Relaycast channels / workspace registration.
  2. Rerun the workflow integration slice in local mode.
  3. Fix the remaining semantic regressions (dependency cycle, abort cancels) once the Relaycast dependency is removed from those tests.

github-actions Bot and others added 3 commits March 10, 2026 21:04
Use fs.realpathSync to resolve symlinks when checking if index.ts is the
entrypoint. Without this, npm global installs create a symlink that
causes path.resolve(argv[1]) to differ from import.meta.url, silently
skipping runCli().

Falls back to path.resolve on error (e.g., non-existent paths).
devin-ai-integration[bot]

This comment was marked as resolved.

The release method was dropping the reason parameter. Now sends it
as JSON body in the DELETE request when provided.
Reorder dedup key resolution: spawn_token → event_id → agent_name.
Previously agent_name was checked before event_id, causing the dedup
key to use the agent name even when a unique event_id was available.

Also fix test data: falls_back_to_agent_name test should not include
event_id (it tests the fallback path when no event_id is present).
devin-ai-integration[bot]

This comment was marked as resolved.

khaliqgant and others added 4 commits March 10, 2026 22:37
The waitForExit helper used Promise.race with setTimeout but never
cleared the timer when the exit promise won. This leaked timers that
could delay Node.js process exit.
- Added reason field to ListenApiRequest::Release enum
- Parse optional { reason } JSON body from DELETE request
- Log release reason in handler when provided
When AGENT_RELAY_WORKFLOW_DISABLE_RELAYCAST=1 is set, the broker was
still receiving --channels which triggered Relaycast registration. Now:

- Skip --channels arg when channel list is empty (client.ts)
- Pass empty channels to broker when Relaycast is disabled (runner.ts)
- Validate config in execute() to catch DAG cycles before run starts
- Initialize abortController early so abort() works during setup
- Mark pending/running steps as failed on cancellation
- Fix continue strategy: treat failed steps as done for run completion
- Fix verification test: avoid anti-injection false positive

All 28 tests across workflow-dag, workflow-lifecycle, workflow-runner,
workflow-verification, and workflow-patterns now pass.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@khaliqgant
Copy link
Copy Markdown
Member Author

Fixed in 79b1616. Here's what was wrong and what changed:

Root cause: When AGENT_RELAY_WORKFLOW_DISABLE_RELAYCAST=1 was set, the runner still passed channels: [channel] to the AgentRelay constructor, which caused the broker to run init --channels <channel> and attempt Relaycast registration.

Fixes applied (all in one commit on fix-up):

  1. Don't pass channels when Relaycast is disabledrunner.ts now passes channels: [] when relaycastDisabled is true
  2. Skip --channels arg when emptyclient.ts no longer emits --channels '' which confused the broker
  3. Validate config in execute() — DAG cycle detection (detectCycles) was only called via validateConfig during YAML parsing, not in execute(). Added this.validateConfig(resolved) so assert.rejects() works correctly
  4. Initialize abortController early — it was set in runWorkflowCore but abort() could be called before that method ran (race condition). Moved to execute()/resume() so abort works immediately
  5. Mark pending steps as failed on cancel — when abort fires, pending/running steps now emit step:failed events
  6. Fix continue error strategyallCompleted check now treats failed steps as done when using continue/skip strategy, so the run completes instead of failing
  7. Fix verification test false positive — changed task text to avoid triggering the anti-injection check (taskHasToken path)

All 28 tests across workflow-dag, workflow-lifecycle, workflow-runner, workflow-verification, and workflow-patterns pass.

@khaliqgant khaliqgant merged commit 231c45d into main Mar 10, 2026
45 of 48 checks passed
@khaliqgant khaliqgant deleted the fix-up branch March 10, 2026 22:38
khaliqgant added a commit that referenced this pull request Mar 25, 2026
* fix up command

* test fixes and updates for lifecycle

* style: auto-format Rust code with cargo fmt

* fix: resolve symlinks in CLI entrypoint guard

Use fs.realpathSync to resolve symlinks when checking if index.ts is the
entrypoint. Without this, npm global installs create a symlink that
causes path.resolve(argv[1]) to differ from import.meta.url, silently
skipping runCli().

Falls back to path.resolve on error (e.g., non-existent paths).

* fix: forward release reason in HTTP agent management client

The release method was dropping the reason parameter. Now sends it
as JSON body in the DELETE request when provided.

* fix: dedup key priority — event_id before agent name for spawn requests

Reorder dedup key resolution: spawn_token → event_id → agent_name.
Previously agent_name was checked before event_id, causing the dedup
key to use the agent name even when a unique event_id was available.

Also fix test data: falls_back_to_agent_name test should not include
event_id (it tests the fallback path when no event_id is present).

* update wording

* fix: clear timeout timers in SDK client shutdown

The waitForExit helper used Promise.race with setTimeout but never
cleared the timer when the exit promise won. This leaked timers that
could delay Node.js process exit.

* fix: thread release reason through Rust HTTP API handler

- Added reason field to ListenApiRequest::Release enum
- Parse optional { reason } JSON body from DELETE request
- Log release reason in handler when provided

* style: auto-format Rust code with cargo fmt

* fix: workflow local mode — don't connect to Relaycast when disabled

When AGENT_RELAY_WORKFLOW_DISABLE_RELAYCAST=1 is set, the broker was
still receiving --channels which triggered Relaycast registration. Now:

- Skip --channels arg when channel list is empty (client.ts)
- Pass empty channels to broker when Relaycast is disabled (runner.ts)
- Validate config in execute() to catch DAG cycles before run starts
- Initialize abortController early so abort() works during setup
- Mark pending/running steps as failed on cancellation
- Fix continue strategy: treat failed steps as done for run completion
- Fix verification test: avoid anti-injection false positive

All 28 tests across workflow-dag, workflow-lifecycle, workflow-runner,
workflow-verification, and workflow-patterns now pass.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant