
fix(search): scrub stale file extension aggregation on 1.12.10 upgrade (#1A file search 500)#28565

Merged
sonika-shah merged 4 commits into 1.12.10 from fix/scrub-stale-file-extension-agg-v11210 on Jun 1, 2026

@sonika-shah sonika-shah commented Jun 1, 2026

Summary

Backport of main PR #28555 to the 1.12.10 release line.

  • Root cause: PR #27080 ("Fix extension field mapping in ES index files to prevent reindex failures", backported to 1.12.5) changed the extension field mapping from keyword to flattened and renamed the seed aggregation to fileExtension. The SettingsCache startup merge is additive for existing asset types, so any cluster whose DB already held the file config (upgraded from pre-1.12.5) keeps the stale extension aggregation. On OpenSearch, a terms agg against a flat_object field throws illegal_argument_exception, which surfaces as a 500 on every file search.
  • Fix: New v11210 migration with removeStaleFileExtensionAggregation() targeting only the file assetType aggregation list. Called from both MySQL and Postgres entrypoints. Idempotent.
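The scrub described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the class name and the `Map<String, List<String>>` shape (asset type to aggregation names) are assumptions made for the example, whereas the real migration operates on the stored SearchSettings.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: the config shape is an assumption, not OpenMetadata's model.
final class StaleAggregationScrubSketch {

  /**
   * Removes the aggregation named "extension" from the "file" asset type only.
   * Returns true if something was removed, false if the config was already clean.
   */
  static boolean removeStaleFileExtensionAggregation(
      Map<String, List<String>> aggregationsByAssetType) {
    List<String> fileAggregations = aggregationsByAssetType.get("file");
    if (fileAggregations == null) {
      return false; // no file config stored, nothing to scrub
    }
    // Other asset types are deliberately left untouched.
    return fileAggregations.removeIf("extension"::equals);
  }

  public static void main(String[] args) {
    Map<String, List<String>> config = new LinkedHashMap<>();
    config.put("file", new ArrayList<>(List.of("extension", "fileExtension")));
    config.put("table", new ArrayList<>(List.of("extension")));

    System.out.println("changed on first run: " + removeStaleFileExtensionAggregation(config));
    System.out.println("changed on second run: " + removeStaleFileExtensionAggregation(config));
    System.out.println(config);
  }
}
```

Returning a boolean lets the caller skip the DB write when nothing changed, which is one simple way to keep a migration like this idempotent.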

What's new in this version

  • bootstrap/sql/migrations/native/1.12.10/ — placeholder SQL files (no DDL changes)
  • migration/mysql/v11210/Migration.java + migration/postgres/v11210/Migration.java — migration entrypoints
  • migration/utils/v11210/MigrationUtil.java — scrub logic
  • Unit test: FileExtensionAggregationScrubTest (4 tests)

Test plan

  • 4 unit tests pass: FileExtensionAggregationScrubTest
  • Simulate upgrade: seed DB with pre-1.12.5 file config (containing extension agg), run v11210 migration, verify extension is gone and fileExtension remains
  • File search on OpenSearch returns 200 after migration
  • Idempotent: running migration twice does not modify the DB on second run
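The idempotency check in the test plan could be exercised along these lines. This is a sketch under stated assumptions: `FakeSettingsStore` is a hypothetical in-memory stand-in for the settings table that counts writes, and `runMigration` only illustrates the "write only when something changed" pattern rather than the actual v11210 entrypoint.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a write-only-if-changed migration step.
final class IdempotentMigrationSketch {

  /** In-memory stand-in for the settings table; counts how often it is written. */
  static final class FakeSettingsStore {
    List<String> fileAggregations;
    int writes = 0;

    FakeSettingsStore(List<String> initial) {
      this.fileAggregations = new ArrayList<>(initial);
    }

    void save(List<String> updated) {
      this.fileAggregations = new ArrayList<>(updated);
      this.writes++;
    }
  }

  /** Scrubs the stale "extension" aggregation and persists only if it was present. */
  static void runMigration(FakeSettingsStore store) {
    List<String> working = new ArrayList<>(store.fileAggregations);
    boolean changed = working.removeIf("extension"::equals);
    if (changed) {
      store.save(working);
    }
  }

  public static void main(String[] args) {
    FakeSettingsStore store = new FakeSettingsStore(List.of("extension", "fileExtension"));
    runMigration(store);
    runMigration(store); // second run finds nothing to remove and does not write
    System.out.println("writes: " + store.writes);
    System.out.println("aggregations: " + store.fileAggregations);
  }
}
```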


Summary by Gitar

  • Refactored search settings:
    • Added removeFlattenedChildrenSearchSettings() to scrub stale indexed fields (columns.children.name, dataModel.columns.children.name, etc.) from highlightFields, searchFields, and allowedFields in SearchSettings.
    • Integrated this cleanup into the v11210 migration suite alongside the file extension scrub.
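As a rough illustration of the flattened-children cleanup summarized above, the sketch below removes stale field names from three lists. The class and method names are hypothetical, the lists are plain strings rather than the real SearchSettings types, and only the two field names quoted in the summary are included; the actual set of stale fields is longer and not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: field lists are modeled as plain strings for illustration.
final class FlattenedChildrenScrubSketch {

  // Only the two names quoted in the summary; the real list contains more entries.
  static final Set<String> STALE_FIELDS =
      Set.of("columns.children.name", "dataModel.columns.children.name");

  /** Removes stale field references from all three lists; returns true if any list changed. */
  static boolean scrubFlattenedChildren(
      List<String> highlightFields, List<String> searchFields, List<String> allowedFields) {
    boolean changed = false;
    for (List<String> fields : List.of(highlightFields, searchFields, allowedFields)) {
      changed |= fields.removeIf(STALE_FIELDS::contains);
    }
    return changed;
  }

  public static void main(String[] args) {
    List<String> highlight = new ArrayList<>(List.of("name", "columns.children.name"));
    List<String> search = new ArrayList<>(List.of("displayName", "dataModel.columns.children.name"));
    List<String> allowed = new ArrayList<>(List.of("description"));

    System.out.println("changed: " + scrubFlattenedChildren(highlight, search, allowed));
    System.out.println(highlight + " " + search + " " + allowed);
  }
}
```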

This will update automatically on new commits.

…e (#1A file search 500)

Backport of the 1.13 fix (main PR #28555) to the 1.12.10 release line.

PR #27080 (backported to 1.12.5) changed the file index mapping from
extension:keyword to extension:flattened and renamed the seed aggregation
to fileExtension. The SettingsCache startup merge is additive for existing
asset types, so any cluster whose DB already held the file config (upgraded
from pre-1.12.5) keeps the stale extension aggregation. On OpenSearch, a
terms agg against a flat_object field throws illegal_argument_exception,
causing a 500 on every file search query.

Adds v11210 migration with removeStaleFileExtensionAggregation() targeting
only the file assetType aggregation list. Idempotent.
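To see why an additive merge leaves the stale entry in place, consider the simplified model below. It is an assumption-laden sketch, not the SettingsCache code: aggregations are reduced to their names, and `mergeAdditively` is a hypothetical helper that only ever adds missing defaults.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of an additive startup merge; not the real SettingsCache logic.
final class AdditiveMergeSketch {

  /** Keeps everything already stored and appends any seed defaults that are missing. */
  static List<String> mergeAdditively(List<String> stored, List<String> seedDefaults) {
    Set<String> merged = new LinkedHashSet<>(stored); // stored entries are never dropped
    merged.addAll(seedDefaults); // new defaults are only added
    return List.copyOf(merged);
  }

  public static void main(String[] args) {
    List<String> storedBeforeUpgrade = List.of("extension"); // config from a pre-1.12.5 cluster
    List<String> seedAfterUpgrade = List.of("fileExtension"); // renamed seed aggregation

    // The stale name survives next to the new one.
    System.out.println(mergeAdditively(storedBeforeUpgrade, seedAfterUpgrade));
    // prints [extension, fileExtension]
  }
}
```

Because nothing in such a merge removes entries, a one-off migration is needed to delete the stale name.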
github-actions bot added the backend and safe to test labels on Jun 1, 2026

github-actions Bot commented Jun 1, 2026

The Python checkstyle failed.

Please run make py_format and py_format_check in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Python code formatting.

You can install the pre-commit hooks with make install_test precommit_install.


github-actions Bot commented Jun 1, 2026

🟡 Playwright Results — all passed (27 flaky)

✅ 3463 passed · ❌ 0 failed · 🟡 27 flaky · ⏭️ 97 skipped

Shard      | Passed | Failed | Flaky | Skipped
🟡 Shard 2 | 705    | 0      | 5     | 9
🟡 Shard 3 | 714    | 0      | 9     | 6
🟡 Shard 4 | 725    | 0      | 5     | 19
✅ Shard 5 | 673    | 0      | 0     | 35
🟡 Shard 6 | 646    | 0      | 8     | 28
🟡 27 flaky test(s) (passed on retry)
  • Features/DataQuality/IncidentManagerDateFilter.spec.ts › Date filter persists on page reload (shard 2, 1 retry)
  • Features/ExploreQuickFilters.spec.ts › should show correct count for tier filter options from aggregation (shard 2, 1 retry)
  • Features/Glossary/GlossaryAssets.spec.ts › should remove glossary term tag from entity page (shard 2, 1 retry)
  • Features/Glossary/GlossaryHierarchy.spec.ts › should move nested term to root level of same glossary (shard 2, 1 retry)
  • Features/Glossary/GlossaryHierarchy.spec.ts › should move term with children to different glossary (shard 2, 1 retry)
  • Features/Permissions/GlossaryPermissions.spec.ts › Team-based permissions work correctly (shard 3, 1 retry)
  • Features/Permissions/ServiceEntityPermissions.spec.ts › SearchIndex Service allow common operations permissions (shard 3, 1 retry)
  • Features/RestoreEntityInheritedFields.spec.ts › Validate restore with Inherited domain and data products assigned (shard 3, 1 retry)
  • Features/RestoreEntityInheritedFields.spec.ts › Validate restore with Inherited domain and data products assigned (shard 3, 1 retry)
  • Flow/ExploreDiscovery.spec.ts › Should display deleted assets when showDeleted is checked and deleted is not present in queryFilter (shard 3, 1 retry)
  • Flow/Metric.spec.ts › Metric creation flow should work (shard 3, 1 retry)
  • Flow/Metric.spec.ts › verify metric expression update (shard 3, 1 retry)
  • Flow/PersonaFlow.spec.ts › Set default persona for team should work properly (shard 3, 1 retry)
  • Flow/ServiceForm.spec.ts › Verify form selects are working properly (shard 3, 1 retry)
  • Pages/CustomProperties.spec.ts › Entity Reference List (shard 4, 1 retry)
  • Pages/CustomProperties.spec.ts › Integer (shard 4, 1 retry)
  • Pages/CustomThemeConfig.spec.ts › Update Hover and selected Color (shard 4, 1 retry)
  • Pages/DataContracts.spec.ts › Contract Status badge should be visible on condition if Contract Tab is present/hidden by Persona (shard 4, 1 retry)
  • Pages/Entity.spec.ts › Delete Dashboard (shard 4, 1 retry)
  • Pages/Lineage/DataAssetLineage.spec.ts › Column lineage for container -> table (shard 6, 1 retry)
  • Pages/Lineage/DataAssetLineage.spec.ts › Column lineage for searchIndex -> mlModel (shard 6, 1 retry)
  • Pages/Lineage/LineageFilters.spec.ts › Verify LineageSearchSelect in lineage mode (shard 6, 1 retry)
  • Pages/ServiceEntity.spec.ts › Delete Messaging Service (shard 6, 1 retry)
  • Pages/ServiceEntity.spec.ts › Delete Database (shard 6, 1 retry)
  • Pages/ServiceEntity.spec.ts › Delete Database Schema (shard 6, 1 retry)
  • Pages/ServiceEntity.spec.ts › Delete Drive Service (shard 6, 1 retry)
  • Pages/Tag.spec.ts › Add and Remove Assets and Check Restricted Entity (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

PR #27080 renamed both the aggregation and searchField from extension to
fileExtension. Extended stripStaleFileExtensionSettings to also remove
extension from searchFields, preventing multi_match failures on non-empty
file queries on OpenSearch.
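A combined scrub of both the aggregation and the search field might look like the sketch below. The method name follows the commit message, but the `FileSearchConfig` holder and its fields (aggregation names and a field-to-boost map) are hypothetical simplifications, not the project's actual types.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: FileSearchConfig is an illustrative stand-in for the file asset config.
final class FileSettingsScrubSketch {

  static final class FileSearchConfig {
    final List<String> aggregationNames = new ArrayList<>();
    final Map<String, Double> searchFieldBoosts = new LinkedHashMap<>();
  }

  /** Removes "extension" from both aggregations and searchFields; true if either changed. */
  static boolean stripStaleFileExtensionSettings(FileSearchConfig config) {
    // Evaluate both removals before combining so neither is skipped.
    boolean aggregationRemoved = config.aggregationNames.removeIf("extension"::equals);
    boolean searchFieldRemoved = config.searchFieldBoosts.remove("extension") != null;
    return aggregationRemoved || searchFieldRemoved;
  }

  public static void main(String[] args) {
    FileSearchConfig config = new FileSearchConfig();
    config.aggregationNames.addAll(List.of("extension", "fileExtension"));
    config.searchFieldBoosts.put("extension", 1.0);
    config.searchFieldBoosts.put("fileExtension", 1.0);

    System.out.println("changed: " + stripStaleFileExtensionSettings(config));
    System.out.println(config.aggregationNames + " " + config.searchFieldBoosts.keySet());
  }
}
```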

…scrub

Backport branch was missing removeFlattenedChildrenSearchSettings() which
was already in main's v11210 via conflict resolution with the immense-term
children fix. Both scrubs now run in v11210: flattened-children references
and stale file extension aggregation/searchField.

@sonika-shah sonika-shah merged commit 6274c33 into 1.12.10 Jun 1, 2026
24 of 52 checks passed
@sonika-shah sonika-shah deleted the fix/scrub-stale-file-extension-agg-v11210 branch June 1, 2026 15:12
gitar-bot Bot commented Jun 1, 2026

Code Review 👍 Approved with suggestions 0 resolved / 1 findings

Backports the file extension aggregation scrub to address 500 errors and expands search settings cleanup to remove stale flattened children references. Add unit tests for the new stripFlattenedChildrenReferences logic to ensure future stability.

💡 Quality: No unit tests for stripFlattenedChildrenReferences

📄 openmetadata-service/src/main/java/org/openmetadata/service/migration/utils/v11210/MigrationUtil.java:46-60 📄 openmetadata-service/src/test/java/org/openmetadata/service/migration/utils/v11210/FileExtensionAggregationScrubTest.java

The new stripFlattenedChildrenReferences method and its helpers (removeStaleHighlightFields, removeStaleSearchFields, removeStaleAllowedFields) have no unit test coverage. The existing FileExtensionAggregationScrubTest only covers the file-extension scrub. Given this is a data migration that modifies production DB state, similar test coverage (e.g., verifying removal from highlightFields, searchFields, allowedFields, and idempotency) would be valuable to prevent regressions.

