Skip to content

chore: run all tests even on failure#481

Merged
olivermeyer merged 1 commit intomainfrom
chore/fix-coverage
Mar 18, 2026
Merged

chore: run all tests even on failure#481
olivermeyer merged 1 commit intomainfrom
chore/fix-coverage

Conversation

@olivermeyer
Copy link
Copy Markdown
Collaborator

@olivermeyer olivermeyer commented Mar 17, 2026

Currently if one test suite fails, the following suites never run (e.g. if integration fails we never run E2E regular + long-running). This leads to under-reported code coverage.

Instead, we now run all tests regardless of whether the previous suite failed and then catch all failures in a single step.

Copilot AI review requested due to automatic review settings March 17, 2026 14:57
@olivermeyer olivermeyer added the claude Trigger Claude Code automation label Mar 17, 2026
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adjusts CI execution flow so coverage from long-running E2E tests is still produced on main/release tags even when regular E2E tests fail (e.g., due to transient 500/503s), helping keep Codecov coverage reporting stable for default-branch builds.

Changes:

  • Adds an explicit if: condition so the “E2E / long running” step runs on main and refs/tags/v* even after earlier step failures.
  • Preserves existing PR behavior by only running long-running E2E tests when earlier steps succeeded.

Comment thread .github/workflows/_test.yml Outdated
@claude
Copy link
Copy Markdown

claude Bot commented Mar 17, 2026

Claude finished @olivermeyer's task in 2m 11s —— View job


PR Review: chore: run all tests even on failure

I'll analyze this PR and provide a comprehensive review.

Checklist:

  • Read workflow changes
  • Review CI/CD documentation
  • Analyze conditional logic
  • Check for unintended side effects
  • Verify documentation needs
  • Provide final review feedback

Summary

Excellent change! This PR solves a real problem with CodeCov coverage reporting by ensuring all test suites run to completion even when earlier suites fail. The implementation is well-designed and follows best practices.

Review Findings

✅ Strengths

  1. Problem Clearly Identified: Under-reported coverage due to incomplete test execution is a legitimate issue that affects metrics accuracy.

  2. Well-Structured Solution: The continue-on-error: true + final assertion gate pattern is industry standard and appropriate for this use case.

  3. Comprehensive Coverage: All 5 test stages properly configured (unit, integration, e2e, long_running, very_long_running).

  4. Clear Documentation: Inline comments explain the rationale effectively.

  5. Proper Conditional Logic: Uses !cancelled() to allow manual cancellation while continuing on test failures.

🔍 Observations

1. Test Failures (Unrelated to PR Changes)

Two failing tests appear to be flaky/transient and not caused by this PR:

  • test_gui_cli_submit_to_run_result_delete - Assertion timing issue ("Application run cancelled!" found but assertion failed)
  • test_cli_run_submit_and_describe_and_cancel_and_download_and_delete - Timeout after 600s reading from SSL socket

These are environmental/timing issues, not workflow logic issues.

2. Conditional Logic is Correct

The PR description mentions ensuring E2E long-running tests run on main/tags even after regular E2E failures. Let me verify this is achieved:

Current logic (lines 207, 219):

# Long running step
if: ${{ !cancelled() }}

# Very long running step  
if: |
  !cancelled() && (
    contains(inputs.commit_message, 'enable:test:very_long_running') ||
    contains(github.event.pull_request.labels.*.name, 'enable:test:very_long_running')
  )

Analysis: ✅ The !cancelled() condition combined with continue-on-error: true on all preceding steps means:

  • Long-running tests will run on main/tags even if regular E2E fails
  • Very long-running tests will run (if enabled) even if all prior tests fail
  • This achieves the stated goal in the PR description

3. Documentation Update Needed

The .github/CLAUDE.md currently states (lines 215-222):

CI Behavior:

  • Draft PRs: Always skipped
  • Non-draft PRs: Run by default UNLESS: [skip conditions]
  • Main branch: Always run
  • Releases: Always run

This is still accurate but could be enhanced to clarify the new behavior that tests run even after failures on main/tags.

📝 Suggestions

1. Documentation Enhancement (Non-blocking)

Recommendation: Add a subsection to .github/CLAUDE.md around line 328 (after "Test Execution Flow in CI") to document this execution strategy:

### Test Execution Resilience

**All test steps use `continue-on-error: true`** to ensure every suite runs regardless of failures in earlier suites. This guarantees:

1. **Complete Coverage Reports**: CodeCov always receives data from all test suites, preventing under-reported coverage
2. **Full Visibility**: All test failures are visible in CI logs, not just the first failure
3. **Efficient Debugging**: Developers see all issues at once rather than fixing one at a time

**Final Gate**: A single "Assert no test failures" step at the end checks all test outcomes and fails the job if any suite failed. This preserves the expected CI behavior (job fails on test failure) while allowing reporting steps (Codecov, SonarQube, artifact upload) to complete.

**Execution Flow**:
```text
Unit Tests → Integration Tests → E2E Tests → Long Running → Very Long Running
     ↓              ↓                ↓             ↓              ↓
  (failures)   (failures)        (failures)    (failures)    (failures)
     ↓              ↓                ↓             ↓              ↓
     └──────────────┴────────────────┴─────────────┴──────────────┘

                       Coverage Upload (always runs)
                       SonarQube Upload (always runs)
                       Artifact Upload (always runs)

                       Assert no test failures
                         (fails job if any suite failed)

**Why this helps**: Future maintainers will understand the execution strategy's purpose and won't accidentally remove `continue-on-error` thinking it's a mistake.

#### 2. Consider Adding Job Summary Context (Optional Enhancement)

The final assertion step could optionally add context to the job summary showing which suites passed/failed:

```yaml
- name: Assert no test failures
  if: ${{ !cancelled() }}
  shell: bash
  run: |
    failed=()
    [[ "${{ steps.unit.outcome }}"              == "failure" ]] && failed+=("unit")
    [[ "${{ steps.integration.outcome }}"       == "failure" ]] && failed+=("integration")
    [[ "${{ steps.e2e.outcome }}"               == "failure" ]] && failed+=("e2e")
    [[ "${{ steps.long_running.outcome }}"      == "failure" ]] && failed+=("long-running")
    [[ "${{ steps.very_long_running.outcome }}" == "failure" ]] && failed+=("very-long-running")
    
    # Add summary to job output
    echo "## Test Suite Results" >> $GITHUB_STEP_SUMMARY
    echo "| Suite | Status |" >> $GITHUB_STEP_SUMMARY
    echo "|-------|--------|" >> $GITHUB_STEP_SUMMARY
    echo "| Unit | ${{ steps.unit.outcome }} |" >> $GITHUB_STEP_SUMMARY
    echo "| Integration | ${{ steps.integration.outcome }} |" >> $GITHUB_STEP_SUMMARY
    echo "| E2E | ${{ steps.e2e.outcome }} |" >> $GITHUB_STEP_SUMMARY
    echo "| Long Running | ${{ steps.long_running.outcome }} |" >> $GITHUB_STEP_SUMMARY
    echo "| Very Long Running | ${{ steps.very_long_running.outcome }} |" >> $GITHUB_STEP_SUMMARY
    
    if [[ ${#failed[@]} -gt 0 ]]; then
      echo "The following test suites failed: ${failed[*]}"
      exit 1
    fi
    echo "All test suites passed."

Benefit: Provides at-a-glance visibility of all suite results in the GitHub Actions summary.

Note: This is optional and not required for the PR to be approved.

Edge Cases Considered

Manual Cancellation: !cancelled() allows users to cancel workflow without triggering unwanted test runs

Skipped Steps: If a step is skipped (e.g., due to skip markers), outcome will be skipped not failure, so the assertion won't falsely fail

PR vs Main Behavior: The logic works identically on PRs and main/tags, which aligns with the goal (complete coverage on both)

Very Long Running Opt-in: Correctly preserves the enable-only behavior while applying the continue-on-error pattern

Workflow Validation

Commit Message: ✅ Follows conventional commits (chore: ...)

Changed Files: ✅ Only workflow file (.github/workflows/_test.yml)

No Breaking Changes: ✅ PR and main branch CI behavior preserved, only execution strategy changed

Verdict

APPROVED (with documentation suggestion)

Blocking Issues: None

Suggestions:

  1. 📚 Documentation enhancement (non-blocking) - Add execution resilience section to .github/CLAUDE.md
  2. 🎯 Optional improvement - Add test suite summary table to final assertion step

Ready to Merge: Yes, pending CI completion (the two failing tests appear to be flaky environmental issues unrelated to this change)

Next Steps

  1. For this PR: Monitor CI completion. If the two flaky tests continue failing, consider:

    • Adding @pytest.mark.flaky(retries=3) to both tests
    • Investigating root cause of timing/network issues
  2. Follow-up PR (optional): Add documentation enhancement to .github/CLAUDE.md as suggested above


Excellent work on improving CI reliability and coverage reporting accuracy! 🎉

Comment thread .github/workflows/_test.yml Outdated
@codecov
Copy link
Copy Markdown

codecov Bot commented Mar 17, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ All tests successful. No failed tests found.
see 8 files with indirect coverage changes

@olivermeyer olivermeyer changed the title chore: run long running tests even if regular tests fail on main chore: run all tests even on failure Mar 18, 2026
Copilot AI review requested due to automatic review settings March 18, 2026 07:17
Comment thread .github/workflows/_test.yml
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates the reusable CI test workflow to ensure all test suites run even if an earlier suite fails, improving the completeness of coverage/reporting uploads (e.g., to Codecov) while still failing the job at the end when any suite fails.

Changes:

  • Mark Unit/Integration/E2E/long-running/very-long-running test steps as continue-on-error: true and assign step IDs.
  • Add a final “Assert no test failures” gate step that fails the job if any suite’s step outcome is failure.
  • Refine the conditional for the “very long running” suite to include !cancelled() alongside the enable markers.

Comment thread .github/workflows/_test.yml
Comment thread .github/workflows/_test.yml
@sonarqubecloud
Copy link
Copy Markdown

@olivermeyer olivermeyer merged commit 485f86f into main Mar 18, 2026
35 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

claude Trigger Claude Code automation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants