test(e2e): classify quick tunnel flakes as external #4154
📝 Walkthrough

This PR improves test resilience by adding Cloudflare transient failure detection to the tunnel lifecycle test. Two new helper functions identify transient Cloudflare errors from log text and HTTP status codes. Error classification is updated to use these helpers, then applied across three failure paths to skip tests on known transient errors instead of failing.

Changes

Cloudflare Transient Error Detection
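The text-matching helper is quoted later in the review, but the status-code helper is not shown anywhere on this page. The following is a minimal sketch of what such a helper could look like. The function name `is_cloudflare_transient_status` and the exact status set are assumptions, chosen to mirror the 502/503/504 codes that the text matcher also looks for; the PR's actual implementation may differ.

```shell
# Hypothetical helper: name and status set are assumptions, not the PR's code.
# Returns 0 (true) when the HTTP status looks like a gateway-side transient error.
is_cloudflare_transient_status() {
  case "$1" in
    502 | 503 | 504) return 0 ;; # gateway errors, treated as transient
    *) return 1 ;;               # anything else stays a real failure
  esac
}
```

A caller would typically combine this with the log-text check and skip only when one of the two reports a transient condition.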
E2E Advisor Recommendation
Required E2E: None

E2E Scenario Advisor Recommendation
Required scenario E2E: None
PR Review Advisor
Findings: 0 needs attention, 0 worth checking, 0 nice ideas
This is an automated advisory review. A human maintainer must make the final merge decision.
…e-external-skips # Conflicts: # test/e2e/test-hermes-inference-switch.sh # test/e2e/test-openclaw-inference-switch.sh
🧹 Nitpick comments (1)
test/e2e/test-tunnel-lifecycle.sh (1)
172-174: 💤 Low value
Consider breaking the regex into documented patterns.
The single 300+ character regex is functional but hard to maintain. Consider refactoring to multiple grep calls or a documented pattern list for easier modification and review.
♻️ Example refactor (optional)
     is_cloudflare_transient_text() {
    -  grep -qiE 'failed to unmarshal quick Tunnel|quick tunnels? (are )?(temporarily )?disabled|failed to (dial|register)|tunnel server.*error|i/o timeout|EOF.*tunnel|couldn.?t start tunnel|tunnel creation failed|bad gateway|\b50[234]\b' <<<"$1"
    +  local patterns=(
    +    'failed to unmarshal quick Tunnel'
    +    'quick tunnels? (are )?(temporarily )?disabled'
    +    'failed to (dial|register)'
    +    'tunnel server.*error'
    +    'i/o timeout'
    +    'EOF.*tunnel'
    +    "couldn.?t start tunnel"
    +    'tunnel creation failed'
    +    'bad gateway'
    +    '\b50[234]\b'
    +  )
    +  local pattern
    +  pattern=$(IFS='|'; echo "${patterns[*]}")
    +  grep -qiE "$pattern" <<<"$1"
     }

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e/test-tunnel-lifecycle.sh` around lines 172 - 174, The is_cloudflare_transient_text() function currently uses a single long regex which is hard to maintain; refactor it into a list of named pattern variables or an array of simpler regexes (e.g., PATTERN_UNMARSHAL, PATTERN_DISABLED, PATTERN_DIAL_REGISTER, PATTERN_TIMEOUT, PATTERN_EOF, PATTERN_START_FAIL, PATTERN_HTTP_ERRORS) and then test the input against each pattern with multiple grep -qiE checks or a loop, preserving the same matching semantics and returning success if any pattern matches; update the function to include short comments above each pattern explaining what it detects so future reviewers can easily modify or extend the transient-error list.
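The prompt above describes a second variant: named pattern variables, each with a short comment, tested one at a time in a loop. A rough sketch of that variant follows. The regex fragments are copied from the existing single regex; `PATTERN_SERVER_ERROR` is a name added here for illustration, the grouping of fragments under each name is a guess, and the comments are this sketch's reading of each pattern rather than documentation from the PR.

```shell
# Sketch only: variable grouping and comments are assumptions.
# Quick-tunnel registration response could not be parsed.
PATTERN_UNMARSHAL='failed to unmarshal quick Tunnel'
# Quick tunnels reported as disabled.
PATTERN_DISABLED='quick tunnels? (are )?(temporarily )?disabled'
# Dialing or registering with the tunnel service failed.
PATTERN_DIAL_REGISTER='failed to (dial|register)'
# Error reported by the tunnel server (hypothetical name, not in the prompt).
PATTERN_SERVER_ERROR='tunnel server.*error'
# Network timeout.
PATTERN_TIMEOUT='i/o timeout'
# Connection closed unexpectedly.
PATTERN_EOF='EOF.*tunnel'
# Tunnel could not be started or created.
PATTERN_START_FAIL='couldn.?t start tunnel|tunnel creation failed'
# Gateway-class HTTP errors (\b is a GNU grep extension, as in the original).
PATTERN_HTTP_ERRORS='bad gateway|\b50[234]\b'

is_cloudflare_transient_text() {
  local pattern
  for pattern in \
    "$PATTERN_UNMARSHAL" "$PATTERN_DISABLED" "$PATTERN_DIAL_REGISTER" \
    "$PATTERN_SERVER_ERROR" "$PATTERN_TIMEOUT" "$PATTERN_EOF" \
    "$PATTERN_START_FAIL" "$PATTERN_HTTP_ERRORS"; do
    # Succeed as soon as any single pattern matches the log text.
    if printf '%s\n' "$1" | grep -qiE -- "$pattern"; then
      return 0
    fi
  done
  return 1
}
```

Because every pattern is still applied with `grep -qiE`, the loop should accept the same inputs as the single alternation, at the cost of up to eight `grep` invocations per call.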
📒 Files selected for processing (1)
test/e2e/test-tunnel-lifecycle.sh
Selective E2E Results: ✅ All requested jobs passed
Run: 26366332493
Prepared for review after #4153 merged:
Validation:
Latest |
## Summary

The post-#4154 nightly sweep still shows `hermes-inference-switch-e2e` failing after the route/config/hash checks pass, when live Hermes `inference.local` or API chat requests time out against the upstream model. This PR keeps route/config regressions blocking but classifies explicit post-switch live timeout/5xx probes as external/transient skips.

## Changes

- Capture HTTP status for post-switch Hermes `inference.local` and API chat probes.
- Track transient state structurally from curl exit 28 or HTTP 502/503/504, instead of matching arbitrary response text.
- Convert post-switch `inference.local` or Hermes API transient exhaustion to SKIP after earlier route/config checks have passed.
- Preserve FAIL for non-timeout wrong-content responses, unexpected HTTP statuses, and all route/config/hash/registry regressions.

## Type of Change

- [x] Code change (feature, bug fix, or refactor)
- [ ] Code change with doc updates
- [ ] Doc only (prose changes, no code sample modifications)
- [ ] Doc only (includes code sample changes)

## Verification

- [x] `npx prek run --all-files` passes
- [ ] `npm test` passes
- [x] Tests added or updated for new or changed behavior
- [x] No secrets, API keys, or credentials committed
- [ ] Docs updated for user-facing behavior changes
- [ ] `make docs` builds without warnings (doc changes only)
- [ ] Doc pages follow the [style guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md) (doc changes only)
- [ ] New doc pages include SPDX header and frontmatter (new pages only)

---

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

## Summary by CodeRabbit

* **Tests**
  * Improved live-inference test reliability with enhanced transient failure detection and retry behavior.
  * Upgraded HTTP response handling to separately capture status codes and body for clearer diagnostics and enriched failure messages.
  * Added conditional API authentication in test requests and skip-on-transient-failure behavior to reduce flaky failures.
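The structural classification described in the Changes list (curl exit 28 or HTTP 502/503/504, with the status captured separately from the body) could be sketched as below. The function names `run_probe` and `probe_is_transient`, the global variable names, and the 30-second timeout are all assumptions for illustration; only the curl exit code and the status codes come from the description.

```shell
# Hypothetical sketch; names and the timeout value are assumptions.

# Run one probe, keeping the body in a file and the status in a variable.
# Sets the globals PROBE_HTTP_STATUS and PROBE_CURL_EXIT.
run_probe() {
  local url="$1" body_file="$2"
  PROBE_HTTP_STATUS=$(curl -sS --max-time 30 -o "$body_file" -w '%{http_code}' "$url")
  PROBE_CURL_EXIT=$? # exit status of curl itself (28 means the operation timed out)
}

# Decide from structured signals only, never from the response text.
# $1: curl exit code, $2: HTTP status code.
probe_is_transient() {
  if [ "$1" = "28" ]; then
    return 0 # curl timed out
  fi
  case "$2" in
    502 | 503 | 504) return 0 ;; # upstream gateway errors
  esac
  return 1
}
```

With this split, a wrong-content response with HTTP 200 or an unexpected status such as 500 still falls through to FAIL, which matches the "Preserve FAIL" bullet above.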
Summary

The nightly flake sweep had a tunnel lifecycle failure caused by Cloudflare quick-tunnel registration/parsing instability (`failed to unmarshal quick Tunnel`). This PR keeps NemoClaw tunnel failures blocking, but classifies known Cloudflare quick-tunnel registration/edge instability as an external skip instead of a NemoClaw test failure.

Changes

- `failed to unmarshal quick Tunnel` messages.
- `nemoclaw tunnel start` non-zero exits caused by known quick-tunnel external failures as SKIP, with diagnostics and best-effort cleanup.

Type of Change

Verification

- `npx prek run --all-files` passes
- `npm test` passes
- `make docs` builds without warnings (doc changes only)

Signed-off-by: Carlos Villela <cvillela@nvidia.com>
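As an illustration of the skip path for `nemoclaw tunnel start`, the sketch below maps an exit code plus captured output to PASS, SKIP, or FAIL. The function name `classify_tunnel_start` and the PASS/SKIP/FAIL strings are hypothetical, and only the `failed to unmarshal quick Tunnel` message named in the summary is matched here; the real test presumably checks a broader pattern list.

```shell
# Hypothetical sketch; function name and result strings are assumptions.
# $1: exit code of `nemoclaw tunnel start`, $2: its captured output.
classify_tunnel_start() {
  local exit_code="$1" output="$2"
  if [ "$exit_code" = "0" ]; then
    echo "PASS"
  elif printf '%s\n' "$output" | grep -qiE 'failed to unmarshal quick Tunnel'; then
    # Known Cloudflare quick-tunnel instability: external, not a NemoClaw bug.
    echo "SKIP"
  else
    # Any other non-zero exit stays a blocking failure.
    echo "FAIL"
  fi
}
```

On SKIP, a caller would print the captured output as diagnostics and run its cleanup step with `|| true`, so that a cleanup error cannot turn the skip back into a failure.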