Skip to content

test(cli): speed up startup auto-pair tests#4913

Merged
cv merged 1 commit into
mainfrom
codex/cli-start-unit-speedups
Jun 7, 2026
Merged

test(cli): speed up startup auto-pair tests#4913
cv merged 1 commit into
mainfrom
codex/cli-start-unit-speedups

Conversation

@cv

@cv cv commented Jun 7, 2026

Copy link
Copy Markdown
Collaborator

Summary

Speeds up the next CLI runtime target by replacing the auto-pair watcher's test-only wall-clock waits with a local fake clock. The PR keeps the real subprocess timeout regression coverage, shortens those timeout windows, and ratchets the test/nemoclaw-start.test.ts legacy size budget down after trimming lines.

Related Issue

Part of #4892

Changes

  • Redirect time.time() and time.sleep() inside the embedded auto-pair watcher harness to deterministic test helpers, without monkey-patching Python subprocess internals.
  • Shorten the two real subprocess timeout regression tests from multi-second waits to sub-second watcher deadlines while preserving timeout assertions.
  • Ratchet test/nemoclaw-start.test.ts in ci/test-file-size-budget.json from 5310 to 5300 lines.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Carlos Villela cvillela@nvidia.com

Summary by CodeRabbit

  • Tests
    • Improved auto-pairing device watcher timeout and deadline handling
    • Enhanced test clock simulation for more realistic deadline behavior validation
    • Updated test file size budget configuration

@cv cv self-assigned this Jun 7, 2026
@coderabbitai

coderabbitai Bot commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 6d61120d-7c58-4430-b43f-134b0d495602

📥 Commits

Reviewing files that changed from the base of the PR and between 34cac80 and a1ed750.

📒 Files selected for processing (2)
  • ci/test-file-size-budget.json
  • test/nemoclaw-start.test.ts

📝 Walkthrough

Walkthrough

This PR improves deadline-aware testing in the auto-pair watcher by introducing a bounded test clock that advances simulated time on sleep calls, and tightens two existing watcher timeout tests to exercise deadline logic with stricter parameters. The test file size budget is updated to reflect the net reduction in lines.

Changes

Watcher Timeout Testing and Time Mocking

Layer / File(s) Summary
Test clock implementation for time mocking
test/nemoclaw-start.test.ts
The autoPairPythonScript generator now patches Python's time.time() and time.sleep() with a bounded test clock that advances simulated time on each sleep call (capped at 0.25s) instead of fully disabling sleep, enabling deadline logic to execute while keeping wall-clock behavior controlled.
CLI invocation timeout test hardening
test/nemoclaw-start.test.ts
The "bounds the openclaw CLI invocation" test shortens the simulated wedged devices list sleep from 5s to 2s, tightens deadline and interval parameters, sets per-subprocess run timeout to 0.25s, and adds a wall-clock assertion (elapsedMs < 1800) to verify the watcher exits via deadline rather than subprocess completion.
Transient approve timeout test hardening
test/nemoclaw-start.test.ts
The "retries a transient approve timeout" test reduces the simulated devices approve hang from 5s to 2s and tightens the watcher timing parameters (fast/overall deadlines, interval, and 0.25s run timeout) to match stricter deadline expectations.
Test file size budget adjustment
ci/test-file-size-budget.json
The test/nemoclaw-start.test.ts line budget is reduced from 5310 to 5300 in legacyMaxLines, reflecting the net 10-line reduction from the test file changes.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#4786: Both PRs update test/nemoclaw-start.test.ts to adjust startup auto-pair watcher hardening around deadlines and timeouts (the main PR refines time/deadline simulation, while the retrieved PR adds tightened deadline logic for scope rejections).
  • NVIDIA/NemoClaw#4905: The main PR's update to ci/test-file-size-budget.json ties into the retrieved PR's introduction and enforcement of test-file size budgets.

Suggested labels

area: cli, area: ci, chore

Suggested reviewers

  • prekshivyas
  • cjagwani

Poem

A rabbit bounds through time with care,
Where clocks advance through testing air—
Deadlines checked with walls so tight,
Five seconds shortened, tests burn bright! ⏱️✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the main change: optimizing test performance by speeding up auto-pair startup tests through implementation of a fake clock mechanism.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch codex/cli-start-unit-speedups

Comment @coderabbitai help to get the list of available commands and usage tips.

@cv cv mentioned this pull request Jun 7, 2026
2 tasks
@github-actions

github-actions Bot commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

E2E Advisor Recommendation

Required E2E: None
Optional E2E: None

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • None. No E2E is recommended because this PR is tests-only plus CI budget metadata. It adjusts test timing/fake-clock mechanics and line-count budget for an existing Vitest suite, with no changes to runtime start scripts, OpenClaw sandbox lifecycle, credentials, security boundaries, network policy, inference routing, deployment, or real assistant user flows.

Optional E2E

  • None.

New E2E recommendations

  • None.

@github-actions

github-actions Bot commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

E2E Scenario Advisor Recommendation

Required scenario E2E: None
Optional scenario E2E: None

Workflow run

Full scenario advisor summary

E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required scenario E2E

  • None. Changed files are CI test-size budget metadata and a non-scenario unit/integration test under test/, with no changes under test/e2e-scenario/ or scenario workflows; scenario E2E is not affected.

Optional scenario E2E

  • None.

Relevant changed files

  • None.

@github-actions

github-actions Bot commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

PR Review Advisor

Findings: 0 needs attention, 0 worth checking, 0 nice ideas
Top item: No actionable code findings

Consider writing more tests for
  • **Runtime validation** — Validate that the fake-clock auto-pair slow-mode test still approves a late CLI scope upgrade after browser convergence. The code changes are test-harness-only, but they affect timing around sandbox/startup auto-pair behavior. The logic appears covered by existing assertions; runtime validation is useful mainly to catch CI scheduling flakes from the tightened wall-clock timeout windows.
  • **Runtime validation** — Validate that the fake-clock auto-pair tests still reject unknown, malformed-scope, and disallowed-scope requests without recording approvals. The code changes are test-harness-only, but they affect timing around sandbox/startup auto-pair behavior. The logic appears covered by existing assertions; runtime validation is useful mainly to catch CI scheduling flakes from the tightened wall-clock timeout windows.
  • **Runtime validation** — Validate that the real subprocess timeout test still emits the `devices list` timeout log without using the fake clock. The code changes are test-harness-only, but they affect timing around sandbox/startup auto-pair behavior. The logic appears covered by existing assertions; runtime validation is useful mainly to catch CI scheduling flakes from the tightened wall-clock timeout windows.
  • **Runtime validation** — Validate that a transient `devices approve` timeout is retried and records exactly one successful approval without using the fake clock. The code changes are test-harness-only, but they affect timing around sandbox/startup auto-pair behavior. The logic appears covered by existing assertions; runtime validation is useful mainly to catch CI scheduling flakes from the tightened wall-clock timeout windows.
  • **Runtime validation** — Validate that the file-size budget check accepts the ratcheted 5300-line cap for `test/nemoclaw-start.test.ts`. The code changes are test-harness-only, but they affect timing around sandbox/startup auto-pair behavior. The logic appears covered by existing assertions; runtime validation is useful mainly to catch CI scheduling flakes from the tightened wall-clock timeout windows.
  • **Acceptance clause:** Part of ci: safely split slow CLI coverage suites #4892 — add test evidence or identify existing coverage. The PR body references ci: safely split slow CLI coverage suites #4892, but the deterministic context provided no linked issue body or comments to extract literal acceptance clauses from.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

@cv cv merged commit e2edaad into main Jun 7, 2026
39 checks passed
@cv cv deleted the codex/cli-start-unit-speedups branch June 7, 2026 15:45
@cv cv mentioned this pull request Jun 7, 2026
12 tasks
cv added a commit that referenced this pull request Jun 7, 2026
## Summary
This PR trims the next slow subprocess-heavy CLI target after #4913 by
removing avoidable test-time waits from sandbox connect route repair and
the stuck-sandbox timeout fixture. In the profiled six-file slow bucket,
wall-clock runtime dropped from 28.07s to 12.14s locally.

## Related Issue
Part of #4892

## Changes
- Skip `sleepSync` route-repair backoff waits when CLI subprocesses run
under Vitest or `NEMOCLAW_TEST_NO_SLEEP=1`, matching the existing
inference probe test-wait convention.
- Keep the route-repair production retry contract intact; the unit tests
still assert the real 3-attempt, 2s-delay probe options.
- Reduce the stuck Provisioning subprocess fixture from a 3s connect
timeout to the shortest accepted 1s timeout.

## Type of Change
- [x] Code change (feature, bug fix, or refactor)
- [ ] Code change with doc updates
- [ ] Doc only (prose changes, no code sample modifications)
- [ ] Doc only (includes code sample changes)

## Verification
- [x] `npx prek run --all-files` passes
- [x] `npm test` passes
- [x] Tests added or updated for new or changed behavior
- [x] No secrets, API keys, or credentials committed
- [ ] Docs updated for user-facing behavior changes
- [ ] `npm run docs` builds without warnings (doc changes only)
- [ ] Doc pages follow the [style
guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md)
(doc changes only)
- [ ] New doc pages include SPDX header and frontmatter (new pages only)

---
Signed-off-by: Carlos Villela <cvillela@nvidia.com>

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Tests**
* Improved test execution performance by skipping sleep operations
during test runs.
* Reduced test execution time for sandbox connectivity validation tests.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
@cv cv added the v0.0.61 Release target label Jun 8, 2026
@wscurran wscurran added the chore Build, CI, dependency, or tooling maintenance label Jun 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

chore Build, CI, dependency, or tooling maintenance v0.0.61 Release target

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants