test(e2e): retry empty TUI chat event captures by cv · Pull Request #4255 · NVIDIA/NemoClaw

cv · 2026-05-26T16:38:03Z

Summary

Retries the OpenClaw TUI chat correlation live repro only when an accepted-send attempt captures zero chat events, which matches the dominant harness flake. The strict correlation assertions remain unchanged for empty finals, duplicate turns, missing replies with observed events, and uncorrelated replies.

Changes

Add a one-time retry path for zero-event live TUI/webchat correlation captures with a fresh session key.
Add per-attempt diagnostics to the failure summary, including sent runs, event counts, and compact history messages.
Add a unit regression check that the retry predicate applies only to the zero-chat-events case and does not mask empty final events.
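
A minimal sketch of the retry predicate described above, assuming a simplified trace shape. The `ReproTrace` and `ChatEvent` types and their field names are hypothetical; only the function name `looksLikeEventCaptureFailure` appears in this PR, and its real body may differ.

```typescript
// Hypothetical shapes for illustration; the real trace type is not shown in this PR.
interface ChatEvent {
  state: "delta" | "final" | "error";
  text: string;
}

interface ReproTrace {
  sendAccepted: boolean; // the chat send was acknowledged
  chatEvents: ChatEvent[]; // chat events captured during the attempt
}

// Retry only when the send was accepted and no chat events were captured at all.
function looksLikeEventCaptureFailure(trace: ReproTrace): boolean {
  return trace.sendAccepted && trace.chatEvents.length === 0;
}

// Zero captured events after an accepted send: eligible for one retry.
const zeroEvents: ReproTrace = { sendAccepted: true, chatEvents: [] };

// An empty final event is still an observed event, so it must not be retried away.
const emptyFinal: ReproTrace = {
  sendAccepted: true,
  chatEvents: [{ state: "final", text: "" }],
};
```

Under these assumptions the predicate is true for `zeroEvents` and false for `emptyFinal`, so the strict assertion on empty finals still fires.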

Type of Change

Code change (feature, bug fix, or refactor)
Code change with doc updates
Doc only (prose changes, no code sample modifications)
Doc only (includes code sample changes)

Verification

npx prek run --all-files passes
npm test passes
Tests added or updated for new or changed behavior
No secrets, API keys, or credentials committed
Docs updated for user-facing behavior changes
make docs builds without warnings (doc changes only)
Doc pages follow the style guide (doc changes only)
New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

Summary by CodeRabbit

Tests
- Improved test infrastructure with per-run unique session identifiers.
- Added detection for a specific event-capture failure mode and a retry wrapper to rerun repros when that condition occurs.
- Enhanced failure reporting to include richer diagnostic context and aggregated attempt data.
- Increased live-test timeout and added a unit test validating the event-capture failure classifier.

coderabbitai · 2026-05-26T16:38:16Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 20a72d7e-3745-41ff-b5b9-0496b79f7a4f

📥 Commits

Reviewing files that changed from the base of the PR and between 76bd808 and 749bc90.

📒 Files selected for processing (1)

test/openclaw-tui-chat-correlation.test.ts

📝 Walkthrough

This test update adds a conditional retry for chat event capture failures in the Issue 2603 live reproduction. Session keys now carry a UUID suffix so that each run is unique, a classifier detects the zero-captured-events failure mode, and a retry wrapper re-executes the repro only when that mode is detected. Failure reporting now includes per-attempt histories and trace-derived event counts.
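
The per-run session key could be sketched as follows. The `makeSessionKey` helper and the prefix value are hypothetical, since the real key format is not shown here; `randomUUID` from `node:crypto` is the import the walkthrough names.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical helper: appends a random UUID so no two runs share a session key.
function makeSessionKey(prefix: string): string {
  return `${prefix}-${randomUUID()}`;
}

// Each call yields a distinct key, so a retry starts from a fresh session.
const firstKey = makeSessionKey("issue-2603");
const retryKey = makeSessionKey("issue-2603");
```

A fresh key per attempt keeps a retried run from observing history or events left behind by the first attempt.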

Changes

Issue 2603 Event Capture Resilience

Layer / File(s)	Summary
Dependencies and data model `test/openclaw-tui-chat-correlation.test.ts`	Import `randomUUID` from `node:crypto` and define `LiveIssue2603Run` type to bundle a single repro trace with an ordered list of all attempts.
Failure reporting pipeline `test/openclaw-tui-chat-correlation.test.ts`	Add `compactHistoryMessages` and `summarizeAttempt` helpers; extend `buildFailureSummary` to accept optional trace and attempt list so emitted failure JSON includes expectations, trace-derived event counts, and per-attempt summaries.
Event capture retry mechanism `test/openclaw-tui-chat-correlation.test.ts`	Update session key generation to include UUID suffix; introduce `looksLikeEventCaptureFailure` classifier to detect zero-captured-events scenarios; add `runLiveIssue2603ReproWithEventCaptureRetry` wrapper to conditionally retry on capture failure and return both final repro and all attempts.
Test coverage and integration `test/openclaw-tui-chat-correlation.test.ts`	Add unit test validating event-capture failure classifier behavior, and update live sandbox regression test to invoke retry wrapper, pass trace and attempts to failure summary builder, and increase timeout allowance.
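
One way the failure reporting pipeline in the table above might fit together, as a sketch only. The `Attempt` and `HistoryMessage` shapes, the truncation length, and the output field names are assumptions; the helper names come from the walkthrough, but their real signatures are not shown.

```typescript
// Hypothetical data shapes for illustration.
interface HistoryMessage {
  role: string;
  text: string;
}

interface Attempt {
  sessionKey: string;
  sentRunIds: string[];
  chatEventCount: number;
  history: HistoryMessage[];
}

// Clip each history message so the failure JSON stays readable.
function compactHistoryMessages(history: HistoryMessage[], maxChars = 80): string[] {
  return history.map(({ role, text }) => {
    const clipped = text.length > maxChars ? `${text.slice(0, maxChars)}...` : text;
    return `${role}: ${clipped}`;
  });
}

// Reduce one attempt to the counts and compact history needed for diagnosis.
function summarizeAttempt(attempt: Attempt, index: number) {
  return {
    attempt: index + 1,
    sessionKey: attempt.sessionKey,
    sentRuns: attempt.sentRunIds.length,
    chatEvents: attempt.chatEventCount,
    history: compactHistoryMessages(attempt.history),
  };
}

// The attempt list is optional so existing callers that pass only a reason keep working.
function buildFailureSummary(reason: string, attempts: Attempt[] = []): string {
  return JSON.stringify({ reason, attempts: attempts.map(summarizeAttempt) }, null, 2);
}
```

Keeping the per-attempt summaries in the emitted JSON means a failed run shows whether the first attempt captured nothing and whether the retry did any better.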

Sequence Diagram

sequenceDiagram
  participant LiveTest as Live Sandbox Test
  participant RetryWrapper as runLiveIssue2603ReproWithEventCaptureRetry
  participant ReproExec as Live Repro Execution
  participant Classifier as looksLikeEventCaptureFailure
  participant Reporting as buildFailureSummary

  LiveTest->>RetryWrapper: trigger with UUID session key
  RetryWrapper->>ReproExec: run repro (attempt 1)
  ReproExec-->>RetryWrapper: repro trace
  RetryWrapper->>Classifier: analyze trace for zero captured events
  alt Capture failure detected
    RetryWrapper->>ReproExec: run repro (attempt 2)
    ReproExec-->>RetryWrapper: repro trace
  end
  RetryWrapper-->>LiveTest: return {repro, attempts}
  LiveTest->>Reporting: pass trace + attempts
  Reporting-->>LiveTest: enriched failure JSON
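
The flow in the diagram can be sketched as a small wrapper that takes the repro runner and the classifier as parameters. The `Trace` shape and the `runWithEventCaptureRetry` name are hypothetical stand-ins for `runLiveIssue2603ReproWithEventCaptureRetry`, whose real signature is not shown.

```typescript
// Hypothetical trace shape; only the event count matters for this sketch.
interface Trace {
  chatEventCount: number;
}

interface RetryResult {
  repro: Trace; // the final attempt, which the strict assertions run against
  attempts: Trace[]; // every attempt in order, kept for failure diagnostics
}

// Run the repro once, then rerun only while the classifier flags a capture failure,
// up to maxAttempts total runs (two by default, i.e. a single retry).
async function runWithEventCaptureRetry(
  runRepro: (attempt: number) => Promise<Trace>,
  shouldRetry: (trace: Trace) => boolean,
  maxAttempts = 2,
): Promise<RetryResult> {
  let trace = await runRepro(1);
  const attempts: Trace[] = [trace];
  for (let attempt = 2; attempt <= maxAttempts && shouldRetry(trace); attempt += 1) {
    trace = await runRepro(attempt);
    attempts.push(trace);
  }
  return { repro: trace, attempts };
}
```

Because the wrapper returns the last trace unchanged, a second zero-event capture still reaches the strict assertions and fails the test instead of being swallowed.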

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I nibble logs and chase each missing trace,
UUID tails keep every run in place,
When captures vanish, I hop back to try,
Two runs recorded, then the summaries fly—
A rabbit's cheers for tests that don't say die.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 12.50% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'test(e2e): retry empty TUI chat event captures' directly and specifically describes the main change: adding retry logic for empty TUI chat event captures in end-to-end tests.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.

github-actions · 2026-05-26T16:39:10Z

E2E Advisor Recommendation

Required E2E: None
Optional E2E: openclaw-tui-chat-correlation-e2e

Dispatch hint: openclaw-tui-chat-correlation-e2e

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

None. No merge-blocking E2E is required because this PR changes only an existing test file/E2E harness and cannot affect NemoClaw runtime behavior or real user flows. The directly affected E2E is recommended as optional to validate the harness change itself.

Optional E2E

openclaw-tui-chat-correlation-e2e (high): Optional confidence check for the modified live E2E harness and retry behavior. This is the directly affected existing job and runs test/e2e/test-openclaw-tui-chat-correlation.sh, which invokes test/openclaw-tui-chat-correlation.test.ts with NEMOCLAW_ISSUE_2603_LIVE=1.

New E2E recommendations

None.

Dispatch hint

Workflow: nightly-e2e.yaml
jobs input: openclaw-tui-chat-correlation-e2e

github-actions · 2026-05-26T16:39:11Z

E2E Scenario Advisor Recommendation

Required scenario E2E: None
Optional scenario E2E: None

E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required scenario E2E

None. No scenario workflow, scenario metadata, scenario runtime, or validation-suite files changed.

Optional scenario E2E

None.

Relevant changed files

None.

github-actions · 2026-05-26T16:39:54Z

PR Review Advisor

Findings: 0 needs attention, 0 worth checking, 0 nice ideas
Since last review: 2 prior items resolved, 0 still apply, 0 new items found

This is an automated advisory review. A human maintainer must make the final merge decision.

cv · 2026-05-26T18:20:23Z

Addressed the review advisor note in 749bc90 by adding source-boundary and removal-criteria context next to the retry helper. The comment now documents that the tolerated zero-event state is a live repro observability race at the pinned OpenClaw gateway/websocket boundary, that this PR keeps the strict #2603/#3145 assertions intact, and when to remove the guard.

Verification re-run:

npx vitest run test/openclaw-tui-chat-correlation.test.ts --reporter=verbose
npx prek run --all-files

test(e2e): retry empty TUI chat event captures

76bd808

cv self-assigned this May 26, 2026

cv added the v0.0.51 Release target label May 26, 2026

wscurran added the E2E (End-to-end testing: Brev infrastructure, test cases, nightly failures, and coverage gaps) and fix labels May 26, 2026

test(e2e): document TUI event retry boundary

749bc90

cv requested a review from ericksoa May 26, 2026 18:21

cv added v0.0.52 Release target and removed v0.0.51 Release target labels May 26, 2026

cv merged commit 4bb00d3 into main May 26, 2026
28 checks passed

cv deleted the fix/tui-correlation-empty-event-retry branch May 27, 2026 21:16

2 participants
