Fix flaky domain & data product rename by proceeding on search version conflicts #28580

Merged
sonika-shah merged 2 commits into 1.12.10 from sid/fix-domain-rename-conflicts-1.12.10 on Jun 2, 2026

@siddhant1
Member

Describe your changes:

Domains.spec.ts flakes in the nightly AUT runs (e.g. PostgreSQL 1.9.1 ➡ 1.12.10) when a domain or data product is renamed.

Root cause. On rename, the new FQN is propagated to every related search document with updateByQuery. When concurrent writes touch the same assets, Elasticsearch/OpenSearch raise version_conflict_engine_exception, which aborts the entire updateByQuery and leaves the rename half-applied in the index (stale FQNs). The UI then reads stale data and the spec assertions on the renamed entity/its subdomains/assets fail intermittently.

Fix. Set conflicts=proceed on the rename-propagation updateByQuery calls so a per-document version conflict is counted and skipped instead of aborting the batch. This mirrors the handling already present on main (#25751); this PR backports just that conflict handling to the 1.12 line.
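
To make the difference between the two conflict modes concrete, here is a minimal plain-Java simulation. It is only a sketch: it does not use the Elasticsearch or OpenSearch client, and every name in it (ConflictModeSketch, Doc, Result, renameAll, sampleDocs) is hypothetical and not taken from this PR. It assumes a conflict is signalled by a simple flag on the document, whereas the real engines compare document versions.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of update-by-query conflict handling; not the Elasticsearch or OpenSearch client. */
final class ConflictModeSketch {

  enum Conflicts { ABORT, PROCEED }

  /** Hypothetical stand-in for a search document that may be written concurrently. */
  static final class Doc {
    String domainFqn;
    final boolean concurrentlyModified;

    Doc(String domainFqn, boolean concurrentlyModified) {
      this.domainFqn = domainFqn;
      this.concurrentlyModified = concurrentlyModified;
    }
  }

  /** Outcome counters, loosely modelled on the "updated" and "version_conflicts" response fields. */
  static final class Result {
    int updated;
    int versionConflicts;
    boolean aborted;
  }

  /** Applies the new FQN to every document, handling conflicts according to the given mode. */
  static Result renameAll(List<Doc> docs, String newFqn, Conflicts mode) {
    Result result = new Result();
    for (Doc doc : docs) {
      if (doc.concurrentlyModified) {
        result.versionConflicts++;
        if (mode == Conflicts.ABORT) {
          // Earlier updates stay applied; later documents are never reached.
          result.aborted = true;
          return result;
        }
        continue; // PROCEED: count the conflict, skip this document, keep going.
      }
      doc.domainFqn = newFqn;
      result.updated++;
    }
    return result;
  }

  /** Three documents in the same domain; the middle one simulates a concurrent write. */
  static List<Doc> sampleDocs() {
    List<Doc> docs = new ArrayList<>();
    docs.add(new Doc("Sales", false));
    docs.add(new Doc("Sales", true));
    docs.add(new Doc("Sales", false));
    return docs;
  }

  public static void main(String[] args) {
    Result abort = renameAll(sampleDocs(), "Revenue", Conflicts.ABORT);
    Result proceed = renameAll(sampleDocs(), "Revenue", Conflicts.PROCEED);
    System.out.println("abort:   updated=" + abort.updated + " aborted=" + abort.aborted);
    System.out.println("proceed: updated=" + proceed.updated
        + " conflicts=" + proceed.versionConflicts);
  }
}
```

In this toy model the abort mode stops at the second document and never reaches the third, which corresponds to the half-applied rename described under Root cause. The proceed mode reaches every document and reports the skipped one in its conflict counter.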

Methods updated in both ElasticSearchEntityManager and OpenSearchEntityManager:

| Rename path | Methods |
| --- | --- |
| Domain rename | updateDomainFqnByPrefix, updateAssetDomainFqnByPrefix |
| Data product rename | updateDataProductReferences, updateAssetDomainsForDataProduct, updateAssetDomainsByIds |

updateAssetDomainFqnByPrefix additionally scopes its query to documents matching the domain FQN prefix (buildDomainFqnPrefixQuery) instead of match_all, so it only rewrites affected assets — fewer documents touched, fewer conflicts.

Surgical: search-layer only, +34/−6 across the two manager classes, no behavior change beyond conflict resilience.

Type of change:

  • Bug fix

Checklist:

  • I have read the CONTRIBUTING document.
  • Ran mvn spotless:check on openmetadata-service (BUILD SUCCESS) and verified the Conflicts enum / conflicts(...) builder resolve against the bundled ES (9.2.4) / OS (3.5.0) clients.

…n conflicts

Domains.spec.ts intermittently fails when renaming a domain or data
product. On rename, the FQN change is propagated to every related search
document via updateByQuery. With concurrent writes touching the same
assets, Elasticsearch/OpenSearch raise version_conflict_engine_exception,
which aborts the whole updateByQuery and leaves the rename half-applied in
the index (stale FQNs), so the assertions on the renamed entity flake.

Make the rename-propagation updateByQuery calls resilient by setting
conflicts=proceed, mirroring the handling already on main (#25751):

  Domain rename:
    - updateDomainFqnByPrefix
    - updateAssetDomainFqnByPrefix (also scope the query to matching
      domain documents instead of match_all)
  Data product rename:
    - updateDataProductReferences
    - updateAssetDomainsForDataProduct
    - updateAssetDomainsByIds

Applied to both ElasticSearchEntityManager and OpenSearchEntityManager.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions
Contributor

github-actions Bot commented Jun 1, 2026

Hi there 👋 Thanks for your contribution!

The OpenMetadata team will review the PR shortly! Once it has been labeled as safe to test, the CI workflows
will start executing and we'll be able to make sure everything is working as expected.

Let us know if you need any help!

@siddhant1 added the "UI" and "safe to test" labels Jun 1, 2026
@github-actions
Contributor

github-actions Bot commented Jun 1, 2026

The Python checkstyle failed.

Please run make py_format and py_format_check in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Python code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

@github-actions
Contributor

github-actions Bot commented Jun 1, 2026

🔴 Playwright Results — 1 failure(s), 26 flaky

✅ 3463 passed · ❌ 1 failed · 🟡 26 flaky · ⏭️ 97 skipped

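| Shard | Passed | Failed | Flaky | Skipped |
| --- | --- | --- | --- | --- |
| 🟡 Shard 2 | 705 | 0 | 5 | 9 |
| 🟡 Shard 3 | 713 | 0 | 10 | 6 |
| 🔴 Shard 4 | 726 | 1 | 3 | 19 |
| 🟡 Shard 5 | 672 | 0 | 1 | 35 |
| 🟡 Shard 6 | 647 | 0 | 7 | 28 |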

Genuine Failures (failed on all attempts)

Pages/Entity.spec.ts › Tag Add, Update and Remove for child entities (shard 4)
Error: expect(locator).toBeHidden() failed

Locator:  locator('.column-detail-panel').locator('.tags-list').getByTestId('tag-PersonalData.SpecialCategory')
Expected: hidden
Received: visible
Timeout:  5000ms

Call log:
  - Expect "toBeHidden" with timeout 5000ms
  - waiting for locator('.column-detail-panel').locator('.tags-list').getByTestId('tag-PersonalData.SpecialCategory')
    9 × locator resolved to <div class="tag-item" data-testid="tag-PersonalData.SpecialCategory">…</div>
      - unexpected value "visible"

🟡 26 flaky test(s) (passed on retry)
  • Features/CustomizeDetailPage.spec.ts › Dashboard - customization should work (shard 2, 1 retry)
  • Features/CustomizeDetailPage.spec.ts › Database - customization should work (shard 2, 1 retry)
  • Features/DataProductRename.spec.ts › should handle multiple consecutive renames and preserve assets (shard 2, 1 retry)
  • Features/Glossary/GlossaryHierarchy.spec.ts › should move term with children to different glossary (shard 2, 1 retry)
  • Features/LandingPageWidgets/DomainDataProductsWidgets.spec.ts › Data Product asset count should update when assets are removed (shard 2, 1 retry)
  • Features/Permissions/GlossaryPermissions.spec.ts › Team-based permissions work correctly (shard 3, 1 retry)
  • Features/Permissions/ServiceEntityPermissions.spec.ts › SearchIndex Service allow common operations permissions (shard 3, 1 retry)
  • Features/RestoreEntityInheritedFields.spec.ts › Validate restore with Inherited domain and data products assigned (shard 3, 1 retry)
  • Features/RestoreEntityInheritedFields.spec.ts › Validate restore with Inherited domain and data products assigned (shard 3, 1 retry)
  • Flow/ExploreDiscovery.spec.ts › Should display deleted assets when showDeleted is checked and deleted is not present in queryFilter (shard 3, 1 retry)
  • Flow/Metric.spec.ts › verify metric expression update (shard 3, 1 retry)
  • Flow/PersonaFlow.spec.ts › Set default persona for team should work properly (shard 3, 1 retry)
  • Flow/ServiceDocPanel.spec.ts › should only ever have one section highlighted at a time (shard 3, 1 retry)
  • Flow/ServiceForm.spec.ts › Verify SSL cert upload with long filename and UI overflow handling (shard 3, 1 retry)
  • Pages/CustomProperties.spec.ts › Entity Reference List (shard 3, 1 retry)
  • Pages/CustomThemeConfig.spec.ts › Update Hover and selected Color (shard 4, 1 retry)
  • Pages/DataContracts.spec.ts › Create Data Contract and validate for Api Collection (shard 4, 1 retry)
  • Pages/Domains.spec.ts › Domain Rbac (shard 4, 1 retry)
  • Pages/Entity.spec.ts › Delete Directory (shard 5, 1 retry)
  • Pages/Lineage/DataAssetLineage.spec.ts › Column lineage for container -> dashboard (shard 6, 1 retry)
  • Pages/Lineage/LineageFilters.spec.ts › Verify LineageSearchSelect in lineage mode (shard 6, 1 retry)
  • Pages/Lineage/LineageRightPanel.spec.ts › Verify custom properties tab IS visible for supported type: container (shard 6, 1 retry)
  • Pages/Lineage/LineageRightPanel.spec.ts › Verify custom properties tab is NOT visible for storageService in platform lineage (shard 6, 1 retry)
  • Pages/ServiceEntity.spec.ts › Delete Database Schema (shard 6, 1 retry)
  • Pages/Tag.spec.ts › Add and Remove Assets and Check Restricted Entity (shard 6, 1 retry)
  • Pages/Users.spec.ts › Should navigate to user profile from feed card avatar click (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

@github-actions
Contributor

github-actions Bot commented Jun 2, 2026

The Python checkstyle failed.

Please run make py_format and py_format_check in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Python code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

@sonika-shah sonika-shah merged commit ae6cf7d into 1.12.10 Jun 2, 2026
24 of 54 checks passed
@sonika-shah sonika-shah deleted the sid/fix-domain-rename-conflicts-1.12.10 branch June 2, 2026 05:01
@gitar-bot

gitar-bot Bot commented Jun 2, 2026

Code Review ✅ Approved 1 resolved / 1 findings

Improves the stability of domain and data product renames by letting search index updates proceed past version conflicts. Note that the refined prefix query may also match sibling domains that share an FQN prefix.

✅ 1 resolved
Performance: Prefix query may over-match sibling domains with shared prefix

📄 openmetadata-service/src/main/java/org/openmetadata/service/search/elasticsearch/ElasticSearchEntityManager.java:1258-1265
📄 openmetadata-service/src/main/java/org/openmetadata/service/search/opensearch/OpenSearchEntityManager.java:1354-1361
The buildDomainFqnPrefixQuery uses a raw prefix query with oldFqn (e.g., "Sales"), which will also match documents belonging to domains like "SalesOps" since "Sales" is a prefix of "SalesOps". The script itself correctly guards against this by checking fqn.equals(params.oldFqn) or fqn.startsWith(params.oldFqn + '.'), so no incorrect updates will occur. However, the query could be tightened to reduce unnecessary script executions on non-matching documents.

This is a minor efficiency concern — the previous code used match_all which was worse, and the script handles correctness. But appending a separator to the prefix query (or using a bool with exact-match OR dot-prefixed) would make the query more precise.
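
The over-match and the guard that contains it can be illustrated with a small plain-Java sketch. It only mirrors the two string checks quoted in the finding; the class and method names (DomainFqnPrefixSketch, rawPrefixMatch, isSelfOrDescendant, rename) are hypothetical, the real guard runs as a search script rather than Java, and the sketch assumes plain dot-separated FQNs with no quoting or escaping.

```java
import java.util.List;

/** Illustrates why a raw prefix match over-selects and how the equals/startsWith guard narrows it. */
final class DomainFqnPrefixSketch {

  /** Raw prefix test, analogous to a prefix query on oldFqn; also matches siblings such as "SalesOps". */
  static boolean rawPrefixMatch(String fqn, String oldFqn) {
    return fqn.startsWith(oldFqn);
  }

  /** Guard described in the review: exact match, or a descendant below a '.' separator. */
  static boolean isSelfOrDescendant(String fqn, String oldFqn) {
    return fqn.equals(oldFqn) || fqn.startsWith(oldFqn + ".");
  }

  /** Rewrites the oldFqn prefix to newFqn when the guard passes; otherwise returns fqn unchanged. */
  static String rename(String fqn, String oldFqn, String newFqn) {
    if (!isSelfOrDescendant(fqn, oldFqn)) {
      return fqn;
    }
    return newFqn + fqn.substring(oldFqn.length());
  }

  public static void main(String[] args) {
    List<String> fqns = List.of("Sales", "Sales.EMEA", "SalesOps", "Marketing");
    for (String fqn : fqns) {
      System.out.println(fqn
          + " | selected by raw prefix: " + rawPrefixMatch(fqn, "Sales")
          + " | after rename: " + rename(fqn, "Sales", "Revenue"));
    }
  }
}
```

Here "SalesOps" is selected by the raw prefix test but left unchanged by the rename, which matches the finding: the extra selection costs a guard evaluation but does not produce an incorrect update.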

