Fix flaky E2E tests with API-based setup and data isolation #64024

Draft
choo121600 wants to merge 2 commits into apache:main from choo121600:refactor/e2e-data-isolation

@choo121600
Member

@choo121600 choo121600 commented Mar 21, 2026

Summary

E2E tests have been failing intermittently in CI, causing unreliable builds and frequent reruns. There are two root causes:

  • UI-based data setup in beforeAll
    • Tests created seed data by opening a browser and clicking through the UI. This made setup fragile: dialog animations, rendering delays, and element timing issues could cause the setup itself to fail before any assertions ran.
  • Insufficient data isolation between parallel workers
    • Test identifiers were generated with Date.now(), which can collide when multiple Playwright workers start at the same time. Leftover data from previous runs could also interfere with assertions that assumed exact row counts.
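
To illustrate the collision problem, here is a minimal sketch that contrasts timestamp-based IDs with UUID-based ones. The function bodies are assumptions for illustration (timestampId is a hypothetical name, and the real uniqueRunId in test-helpers.ts may be implemented differently); only randomUUID from Node's built-in crypto module is a real API.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical re-implementation of the helper named in this PR.
// randomUUID() returns a random (version 4) UUID, so two workers that
// start in the same millisecond still receive different identifiers.
function uniqueRunId(prefix: string): string {
  return `${prefix}_${randomUUID()}`;
}

// The old approach, for contrast: every call made within the same
// millisecond yields the same identifier.
function timestampId(prefix: string): string {
  return `${prefix}_${Date.now()}`;
}

// Generate 1000 IDs each way and count how many are distinct.
const uuidIds = new Set(Array.from({ length: 1000 }, () => uniqueRunId("e2e")));
const timestampIds = new Set(Array.from({ length: 1000 }, () => timestampId("e2e")));

console.log(`distinct UUID-based IDs: ${uuidIds.size}`);
console.log(`distinct Date.now()-based IDs: ${timestampIds.size}`);
```

In a tight loop the timestamp-based set collapses to a handful of values, which is the same effect several Playwright workers see when they start together.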

This PR makes E2E tests deterministic and reliable by removing UI-driven setup and enforcing strict data isolation.

Change

New shared utilities (tests/e2e/utils/test-helpers.ts)

Replaces fragile UI-based setup with reliable API-driven helpers:

  • uniqueRunId(prefix): UUID-based ID generation to prevent cross-worker collisions.
  • waitForDagReady(): Polls GET /api/v2/dags/{id} until the Dag is parsed and available.
  • apiTriggerDagRun() / apiCreateDagRun() / apiSetDagRunState(): Create and manipulate Dag runs via the API, with 409-conflict handling for parallel safety.
  • waitForDagRunStatus() / waitForTaskInstanceState(): Poll until a Dag run or task instance reaches the expected state.
  • apiRespondToHITL() / setupHITLFlowViaAPI(): Drive the full HITL operator flow via API, removing the need for browser-based setup.
  • apiCreateVariable() / apiDeleteVariable(): Variable CRUD.
  • apiCreateBackfill() / apiCancelBackfill(): Backfill lifecycle management with automatic retry on 409.
  • waitForTableLoad() / waitForStableRowCount(): DOM stability helpers for reliable table assertions.
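
The polling helpers above share one shape: probe, check, sleep, repeat until a deadline. A minimal sketch of that shape follows. pollUntil, PollOptions, and makeFakeDagEndpoint are hypothetical names, the defaults are arbitrary, and the fake endpoint only assumes that an unparsed Dag answers 404; the real helpers call the Airflow API through Playwright's request context.

```typescript
// Hypothetical options; the defaults are illustrative, not the PR's values.
interface PollOptions {
  timeoutMs?: number;
  intervalMs?: number;
}

// Calls probe() repeatedly until isReady() accepts the result,
// or throws once the next wait would exceed the deadline.
async function pollUntil<T>(
  probe: () => Promise<T>,
  isReady: (value: T) => boolean,
  { timeoutMs = 30_000, intervalMs = 500 }: PollOptions = {},
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const value = await probe();
    if (isReady(value)) return value;
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Condition not met within ${timeoutMs} ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Stand-in for GET /api/v2/dags/{id}: answers 404 twice, then 200.
function makeFakeDagEndpoint(): () => Promise<number> {
  let calls = 0;
  return async () => (++calls < 3 ? 404 : 200);
}

// Usage: wait for the fake endpoint to report 200, polling every 10 ms.
pollUntil(makeFakeDagEndpoint(), (status) => status === 200, { intervalMs: 10 })
  .then((status) => console.log(`Dag ready, last status ${status}`));
```

Taking the probe as a function keeps the loop independent of the HTTP client, so the same helper could back waitForDagRunStatus() or waitForTaskInstanceState() with a different predicate.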

Custom Playwright fixtures (tests/e2e/fixtures.ts)

Standardizes test setup and removes boilerplate across all specs:

  • All Page Object Models are now provided as Playwright test fixtures, removing manual new PageObject(page) usage in every spec.
  • authenticatedRequest (worker-scoped) provides an APIRequestContext with stored auth state, enabling API calls in beforeAll/afterAll without creating a browser context.

Spec and Page Object updates (22 specs, 18 page objects)

Every spec file follows the same migration pattern, ensuring consistency:

  • import { test, expect } from "@playwright/test" --> import { test, expect } from "tests/e2e/fixtures"
  • beforeAll setup: browser.newContext() + UI interactions --> authenticatedRequest + API helper calls
  • Date.now() identifiers --> uniqueRunId() for collision-free IDs
  • afterAll cleanup: tracks created resources and deletes them via the API (404 treated as success for idempotency)
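
The cleanup step of this pattern can be sketched as below. ResourceTracker, StatusResponse, and DeleteFn are hypothetical names rather than the PR's actual code; the only assumption about Playwright is that its APIResponse exposes status(), so a function such as (url) => authenticatedRequest.delete(url) would fit DeleteFn.

```typescript
// Minimal response shape, modelled on the status() method of
// Playwright's APIResponse.
interface StatusResponse {
  status(): number;
}

type DeleteFn = (url: string) => Promise<StatusResponse>;

// Hypothetical tracker: remembers what a spec created so that
// afterAll can delete it again.
class ResourceTracker {
  private readonly urls: string[] = [];

  track(url: string): void {
    this.urls.push(url);
  }

  // Deletes in reverse creation order. Any 2xx and 404 both mean the
  // resource is gone; every other status is collected and reported.
  async cleanup(del: DeleteFn): Promise<void> {
    const failures: string[] = [];
    while (this.urls.length > 0) {
      const url = this.urls.pop() as string;
      const status = (await del(url)).status();
      const gone = (status >= 200 && status < 300) || status === 404;
      if (!gone) failures.push(`${url} -> HTTP ${status}`);
    }
    if (failures.length > 0) {
      throw new Error(`Cleanup failed: ${failures.join(", ")}`);
    }
  }
}
```

Treating 404 as success means a retried afterAll, or one that runs after another worker already removed the resource, does not fail the suite.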

After

  • Setup failures eliminated: Data creation no longer depends on UI rendering, dialog animations, or element timing. API calls either succeed or return a well-defined status code.
  • Parallel-safe by design: UUID-based identifiers (uniqueRunId()) make collisions between concurrent Playwright workers practically impossible.
  • 409 conflict handling: All API helpers treat 409 Conflict as acceptable (resource already exists), making tests idempotent across retries and re-runs.
  • Faster simple setup: Straightforward data seeding (variables, connections) completes in seconds rather than tens of seconds, since it no longer requires browser rendering. Complex flows like HITL still require extended timeouts but are more reliable since they bypass UI interaction entirely.
  • No extra browser contexts: beforeAll no longer opens a separate browser.newContext() just for seeding data, reducing memory pressure and startup overhead per worker.
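
The 409 handling described above can be sketched as a thin wrapper around any create call. createIdempotent and makeFakeVariableApi are hypothetical names, and the in-memory fake stands in for the Airflow API; the actual helpers in this PR may structure this differently.

```typescript
// Minimal response shape, modelled on the status() method of
// Playwright's APIResponse.
interface StatusResponse {
  status(): number;
}

// Hypothetical wrapper: 2xx means created, 409 means another worker or
// an earlier attempt already created it, anything else is an error.
async function createIdempotent(
  create: () => Promise<StatusResponse>,
  what: string,
): Promise<"created" | "exists"> {
  const status = (await create()).status();
  if (status >= 200 && status < 300) return "created";
  if (status === 409) return "exists";
  throw new Error(`Creating ${what} failed with HTTP ${status}`);
}

// In-memory stand-in for a create endpoint: 201 on first use of a key,
// 409 when the key already exists.
function makeFakeVariableApi(): (key: string) => Promise<StatusResponse> {
  const keys = new Set<string>();
  return async (key) => {
    const code = keys.has(key) ? 409 : 201;
    keys.add(key);
    return { status: () => code };
  };
}

// Usage: the second attempt hits 409 and is reported as "exists".
const postVariable = makeFakeVariableApi();
createIdempotent(() => postVariable("e2e_var"), "variable e2e_var")
  .then((first) => createIdempotent(() => postVariable("e2e_var"), "variable e2e_var")
    .then((second) => console.log(first, second)));
```

Returning a distinct "exists" value, rather than silently succeeding, leaves room for a caller to decide whether pre-existing data is acceptable for a given test.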

related: #63036


Was generative AI tooling used to co-author this PR?
  • Yes (please specify the tool below)
    claude

  • Read the Pull Request Guidelines for more information. Note: commit author/co-author name and email in commits become permanently public when merged.
  • For fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
  • When adding dependency, check compliance with the ASF 3rd Party License Policy.
  • For significant user-facing changes create newsfragment: {pr_number}.significant.rst, in airflow-core/newsfragments. You can add this file in a follow-up commit after the PR is created so you know the PR number.

@boring-cyborg boring-cyborg bot added the area:UI label ("Related to UI/UX. For Frontend Developers.") Mar 21, 2026
@choo121600 choo121600 force-pushed the refactor/e2e-data-isolation branch 16 times, most recently from 2601f28 to 8b472a1 March 23, 2026 18:54
@choo121600 choo121600 marked this pull request as ready for review March 23, 2026 22:41
@choo121600 choo121600 force-pushed the refactor/e2e-data-isolation branch 2 times, most recently from eb7f956 to d909103 March 24, 2026 03:47
@choo121600 choo121600 marked this pull request as draft March 24, 2026 05:15
@choo121600 choo121600 force-pushed the refactor/e2e-data-isolation branch 3 times, most recently from b9a46f5 to 91fec17 March 24, 2026 11:02
@choo121600 choo121600 force-pushed the refactor/e2e-data-isolation branch 3 times, most recently from f7e1225 to ca0e788 March 24, 2026 13:37
@choo121600 choo121600 force-pushed the refactor/e2e-data-isolation branch from ca0e788 to e39499e March 24, 2026 15:26
@bbovenzi
Contributor

Nice! Could we possibly break this up into smaller PRs?

It is a lot of changes to review at once.

@choo121600
Member Author

choo121600 commented Mar 25, 2026

@bbovenzi Thanks!
UI E2E tests have been pretty unstable, and this PR helped validate some improvements in stability. (Finally!)

I agree it's quite large — it currently mixes structural changes with fixes for flaky tests.
Given the current test structure, it was also difficult to identify root causes, so I worked in a single branch.

I’ll split it into smaller PRs soon 😉

@choo121600 choo121600 force-pushed the refactor/e2e-data-isolation branch from 8fb0d2a to 42f0a5e March 25, 2026 07:27