Minor incident test flakyness #26237

Merged
TeddyCr merged 5 commits into open-metadata:main from TeddyCr:MINOR-IncidentTest-Flakyness
Mar 5, 2026

TeddyCr commented Mar 5, 2026

Describe your changes:

Fixes

I worked on ... because ...

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

Summary by Gitar

  • Test stability improvements:
    • Extracted waitForDataIndexed() method with automatic re-creation of incident statuses if data is lost during search index rebuilds
    • Added filtering by first test case FQN to detect data loss and trigger re-indexing within 60-second intervals
  • Polling configuration tuning:
    • Increased await timeout from 30s to 60s and adjusted poll delays/intervals in filtered search assertion
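The wait-and-recover behaviour summarized above can be sketched in plain Java. This is only an illustration under stated assumptions, not the PR's actual waitForDataIndexed(): the class name, the parameter names, and the use of a hand-written loop in place of Awaitility's await() are all hypothetical. The "first test case FQN" check is modelled as a supplier that reports whether the first test case is still searchable.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

/** Hypothetical stand-in for the PR's waitForDataIndexed(); names and signature are assumptions. */
final class IndexWaitSketch {
    private IndexWaitSketch() {}

    /**
     * Polls until allIndexed reports true or the timeout elapses.
     * If firstCasePresent reports false (data lost, e.g. after an index rebuild),
     * recreateStatuses is run, at most once per recreateInterval.
     */
    static boolean waitForDataIndexed(
            BooleanSupplier allIndexed,
            BooleanSupplier firstCasePresent,
            Runnable recreateStatuses,
            Duration timeout,
            Duration pollInterval,
            Duration recreateInterval) {
        long start = System.nanoTime();
        // Backdate so the first detected loss can trigger a re-creation immediately.
        long lastRecreate = start - recreateInterval.toNanos();
        while (true) {
            if (allIndexed.getAsBoolean()) {
                return true;
            }
            long now = System.nanoTime();
            boolean lost = !firstCasePresent.getAsBoolean();
            if (lost && now - lastRecreate >= recreateInterval.toNanos()) {
                recreateStatuses.run();
                lastRecreate = now;
            }
            if (System.nanoTime() - start >= timeout.toNanos()) {
                return false; // caller decides how to report the timeout
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

Rate-limiting the re-creation keeps the loop from hammering the server on every poll while the index is still catching up; the 60-second figure from the summary would be passed as recreateInterval.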

harshach previously approved these changes Mar 5, 2026

gitar-bot bot commented Mar 5, 2026

🔍 CI failure analysis for e919fd0: The maven-collate-ci failure is an unrelated Git submodule checkout infrastructure issue. A prior related failure (IncidentPaginationIT timeout) and unrelated Playwright flakiness are also present.

Issue

Three CI failures are present across different jobs: one infrastructure issue (new), one integration test timeout related to this PR (from a prior run), and Playwright flakiness (unrelated).


Failure 1: maven-collate-ci — Git Submodule Checkout Failure (Unrelated)

The maven-collate-ci job triggered a downstream Collate build (openmetadata-collate/actions/runs/22731411722), which failed during the "Checkout OpenMetadata Submodule" step:

git checkout e919fd0828e13a4084eb3b264c8218ec1c0ea836
fatal: unable to read tree (e919fd0828e13a4084eb3b264c8218ec1c0ea836)
Process completed with exit code 128.

The Collate workflow updates its OpenMetadata submodule to the pinned base commit, then attempts to force-checkout the PR head SHA (e919fd082...). Since the submodule fetch only retrieves the pinned commit (not unmerged PR branches), the PR head object is not found. No code was compiled or tested — the failure is purely a Git infrastructure problem with how the Collate CI resolves PR branch commits in the submodule context. This is unrelated to this PR's changes.


Failure 2: Integration Test Timeout (Related to this PR — from prior run)

The IncidentPaginationIT.setup test timed out in the await() block added by this PR:

org.awaitility.core.ConditionTimeoutException: Condition was not fulfilled within 2 minutes.
    at org.openmetadata.it.tests.IncidentPaginationIT.setup(IncidentPaginationIT.java:99)

The polling condition waits up to 2 minutes for a testCaseFQN-filtered search to return exactly 1 result (response.getData().size() == 1). In CI, this condition was not met — the filtered index lags significantly in CI environments.

The fix should either:

  • Increase the timeout on the second await() (e.g., from 2 to 3+ minutes), or
  • Relax the condition to size() >= 1 in case the query occasionally returns more than one result
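To make the difference between the two options concrete, here is a minimal plain-Java sketch. It is not the test's real code: the actual test uses Awaitility's await(), while pollUntil, EXACTLY_ONE, and AT_LEAST_ONE below are hypothetical names introduced for illustration. The result count is supplied by a lambda standing in for the testCaseFQN-filtered search.

```java
import java.time.Duration;
import java.util.function.IntPredicate;
import java.util.function.IntSupplier;

/** Hypothetical illustration of strict vs. relaxed polling conditions. */
final class AwaitConditionSketch {
    private AwaitConditionSketch() {}

    // The strict condition from the failing await(): exactly one hit.
    static final IntPredicate EXACTLY_ONE = n -> n == 1;
    // The relaxed alternative: tolerate extra hits.
    static final IntPredicate AT_LEAST_ONE = n -> n >= 1;

    /** Re-evaluates the condition on a fresh result count until it holds or the timeout elapses. */
    static boolean pollUntil(IntSupplier resultCount, IntPredicate condition,
                             Duration timeout, Duration pollInterval) {
        long start = System.nanoTime();
        while (true) {
            if (condition.test(resultCount.getAsInt())) {
                return true;
            }
            if (System.nanoTime() - start >= timeout.toNanos()) {
                return false;
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

With a search that keeps returning two hits, pollUntil with EXACTLY_ONE can never succeed no matter how long the timeout is, whereas AT_LEAST_ONE succeeds on the first poll. A longer timeout only helps in the other scenario, where the index is merely slow and the count eventually reaches one.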

Failure 3: Playwright E2E Tests (Unrelated)

The playwright-ci-postgresql (6, 6) job has 1 hard failure and 7 flaky tests unrelated to IncidentPaginationIT.java:

  • Strict mode violations (ExplorePageRightPanel.spec.ts:1773, 1971): Multiple [data-testid="loader"] elements violating Playwright's single-element requirement
  • Lineage element visibility timeout (Lineage.spec.ts:509): Node with special characters in ID not visible within 5000ms
  • Tooltip overlay blocking clicks (Lineage.spec.ts:106): Ant Design tooltip intercepting pointer events, causing 300s timeout
  • Glossary API 404 (Glossary.spec.ts:2339): Cleanup deletion failing with 404
  • SearchIndex undefined fields (EntityVersionPages.spec.ts:169): this.entityResponseData.fields[0] is undefined
  • Browser context closed (Users.spec.ts:570): Target page closed unexpectedly

These are pre-existing infrastructure/flakiness issues unrelated to this PR.

Code Review ✅ Approved

Addresses minor incident test flakiness by merging upstream main branch changes. No issues found.

TeddyCr merged commit afcf1a9 into open-metadata:main Mar 5, 2026
16 of 17 checks passed
Labels

Ingestion, safe to test (Add this label to run secure Github workflows on PRs)

3 participants