# 🚀 Feature Request
Allow a custom Playwright reporter to influence the exit code by modifying `result.status` in `onTestEnd()` or in `onEnd()`. Currently, Playwright's reporter API is effectively read-only: mutating results has no effect on the runner's exit code, which is computed independently after all reporters run.
Specifically, we'd like one of:
**Option A (preferred):** Make `result.status` writable in `onTestEnd(test, result)`, with the runner respecting the final value when computing the exit code.
**Option B:** Make `result.status` writable in `onEnd(result)`, allowing a reporter to downgrade the overall suite status before the exit code is determined.
**Option C:** A built-in config option (e.g., `softFailFile: '.flaky-tests.json'`) that accepts a list of test names whose failures should not affect the exit code.
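For concreteness, here is a hypothetical sketch of what Option C could look like in `playwright.config.ts`. `softFailFile` does not exist today; the name and shape are the proposal, not current API:

```ts
// playwright.config.ts -- hypothetical sketch of Option C
import { defineConfig } from "@playwright/test";

export default defineConfig({
  // Proposed option (does not exist today): tests listed in this file still
  // run and report their failures, but do not affect the exit code.
  // @ts-expect-error -- softFailFile is the proposed option, not in current types
  softFailFile: ".flaky-tests.json",
});
```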
Additionally, for reliable test matching at scale, we'd benefit from a stable, deterministic, human-readable test identifier: something like `titlePath().slice(1).join(' › ')` (the describe chain + test name, without the file path). The current matching mechanisms (`grep`/`grepInvert` regex, `titlePath()` with file path, `testId` hash) are either fragile, path-dependent, or opaque.
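As an illustration, such an identifier is a one-line helper over the existing API (a sketch; `stableTestId` is our name for it, and `slice(1)` mirrors the proposal above):

```ts
import type { TestCase } from "@playwright/test/reporter";

// Hypothetical helper: a human-readable, path-independent identifier.
// How many leading titlePath() entries to drop depends on how the
// root/project/file levels are encoded; slice(1) mirrors the proposal.
function stableTestId(test: TestCase): string {
  return test.titlePath().slice(1).join(" › ");
}
```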
## Example
```ts
import type { Reporter, TestCase, TestResult } from "@playwright/test/reporter";

class SoftFailReporter implements Reporter {
  // Known-flaky test identifiers, from a JSON file generated by CI
  private flakyTests = loadFlakyTestList();

  onTestEnd(test: TestCase, result: TestResult) {
    if (
      result.status === "failed" &&
      this.flakyTests.has(test.titlePath().slice(1).join(" › "))
    ) {
      // Capture real failure data to a separate artifact for telemetry
      this.recordFlakyFailure(test, result);
      // Downgrade the failure so it doesn't affect the exit code
      result.status = "passed";
    }
  }
}
```
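For context, a reporter like this plugs in through the existing `reporter` config option (the file name is illustrative):

```ts
// playwright.config.ts
import { defineConfig } from "@playwright/test";

export default defineConfig({
  reporter: [["./soft-fail-reporter.ts"], ["list"]],
});
```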
This reporter would:
- Load a list of known-flaky test names from a JSON file (generated by a CI pipeline querying test telemetry)
- Let all tests run normally under real conditions
- When a known-flaky test fails, capture the real failure to a separate artifact (for telemetry and auto-re-enable decisions)
- Rewrite the status so the failure doesn't break the build or pollute console output
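For completeness, here is a sketch of the two helpers the example leaves undefined. `loadFlakyTestList` and `recordFlakyFailure` are hypothetical (shown as free functions), and the file paths and JSON shapes are assumptions:

```ts
import * as fs from "fs";
import type { TestCase, TestResult } from "@playwright/test/reporter";

// Assumed shape: a JSON array of identifiers emitted by a CI telemetry job,
// e.g. ["Checkout › applies discount code", "Search › paginates results"]
function loadFlakyTestList(path = ".flaky-tests.json"): Set<string> {
  if (!fs.existsSync(path)) return new Set();
  return new Set(JSON.parse(fs.readFileSync(path, "utf-8")) as string[]);
}

// Append the real failure to a JSONL artifact so telemetry still sees it.
function recordFlakyFailure(
  test: TestCase,
  result: TestResult,
  out = "flaky-failures.jsonl",
) {
  const record = {
    id: test.titlePath().slice(1).join(" › "),
    status: result.status,
    retry: result.retry,
    errors: result.errors.map((e) => e.message),
  };
  fs.appendFileSync(out, JSON.stringify(record) + "\n");
}
```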
## Motivation
We manage a large monorepo (~200 packages, thousands of tests, 50+ engineers) and have struggled with flaky test management at scale:
- Skipping flaky tests creates a blind spot: skipped tests aren't validated under real conditions. Our rate of disabling tests has outpaced the rate at which developers can fix and re-enable them.
- Running skipped tests in a quarantine pipeline doesn't work either: these tests run in isolation, pass in a vacuum, get re-enabled, and promptly fail again under real CI conditions (CPU pressure, parallel execution, real network calls, etc.).
- The approach that works (which we've built for Jest) is to always run all tests under real conditions, but suppress the build-breaking consequence of known-flaky failures. The test still executes, the failure is still recorded in artifacts, but the exit code stays 0. This gives us:
  - Ground-truth signal: flaky tests run under the same conditions as every other test
  - Confident auto-re-enabling: if a test passes N consecutive runs under full load, we can safely remove it from the flaky list
  - No build disruption: known-flaky failures don't block PRs or CI
In Jest, this works because the reporter API passes mutable `TestResult` and `AggregatedResult` objects. We'd love to have the same capability in Playwright.
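For reference, a minimal sketch of that Jest-side pattern (types from `@jest/test-result`; `isKnownFlaky` stands in for our flaky-list lookup, and real code also adjusts suite-level counters):

```ts
import type { AggregatedResult } from "@jest/test-result";

declare function isKnownFlaky(fullName: string): boolean; // stand-in for our lookup

class JestSoftFailReporter {
  // Jest passes the mutable aggregated result to reporter hooks.
  onRunComplete(_contexts: unknown, results: AggregatedResult): void {
    for (const suite of results.testResults) {
      for (const assertion of suite.testResults) {
        if (assertion.status === "failed" && isKnownFlaky(assertion.fullName)) {
          assertion.status = "passed"; // downgrade the known-flaky failure
          suite.numFailingTests--;
          results.numFailedTests--;
        }
      }
    }
    // Recompute overall success so the run exits 0 when only flaky tests failed.
    results.success = results.numFailedTests === 0;
  }
}
```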
## Prior art
- Jest: mutable `AggregatedResult` in reporter hooks (we use this today)
- pytest: `@pytest.mark.xfail(strict=False)` marks expected failures that don't fail the suite
- RSpec: `pending` blocks record failures without failing the suite
We are a team at Microsoft and would be happy to submit a PR implementing this if the team is aligned on the approach. Happy to discuss the best design.