
fix(search): backport immense-term children mapping fix to 1.12.10 #28572

Merged
sonika-shah merged 4 commits into 1.12.10 from fix/immense-term-children-1.12.10 on Jun 1, 2026

Conversation

@sonika-shah
Collaborator

@sonika-shah sonika-shah commented Jun 1, 2026

Summary

Backport of main PR #28509 to the 1.12.10 release branch.

  • Mapping change: Remap children from flattened to object/enabled:false across all 7 affected index mappings (4 locales × 7 indexes). The flattened type indexed every nested leaf as a keyword term, so a single leaf longer than Lucene's 32,766-byte term limit (e.g. a large Looker/DAX expression) failed the reindex of the whole document. object/enabled:false stores the subtree in _source for display but never indexes it.
  • Index class fix: Replace dead columns.children.name / dataModel.columns.children.name search field with columnNamesFuzzy in TableIndex, ContainerIndex, DashboardDataModelIndex, WorksheetIndex.
  • SearchSettings seed: Remove 11 stale *.children field entries from searchSettings.json.
  • v11210 migration: Scrub the same stale entries from DB-stored SearchSettings on upgrade. Idempotent — removeIf on an already-clean list is a no-op, so clusters that later upgrade to 1.13.0 run the scrub again in v1130 safely.
  • Playwright E2E test (SearchIndexNestedColumns.spec.ts): Creates a 25-level-deep column with a >32 KB leaf expression, verifies the table indexes without an immense_term failure, and confirms the deep column name is searchable via columnNamesFuzzy.
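
To make the failure mode concrete, here is a minimal Python sketch (not part of this PR) that walks a nested children subtree and reports string leaves whose UTF-8 size exceeds the 32,766-byte limit quoted above. The helper names `oversized_leaves` and `build_nested_column`, and the `expression` key, are assumptions for illustration only.

```python
from typing import Any, Iterator, Tuple

LUCENE_MAX_TERM_BYTES = 32_766  # per-term limit quoted in the summary above


def oversized_leaves(node: Any, path: str = "children") -> Iterator[Tuple[str, int]]:
    """Yield (path, utf8_byte_length) for every string leaf above the limit."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from oversized_leaves(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from oversized_leaves(value, f"{path}[{index}]")
    elif isinstance(node, str):
        # The limit applies to encoded bytes, not to characters.
        size = len(node.encode("utf-8"))
        if size > LUCENE_MAX_TERM_BYTES:
            yield path, size


def build_nested_column(depth: int, expression: str) -> dict:
    """Build a hypothetical column nested `depth` levels deep with one large leaf."""
    column = {"name": f"level_{depth}", "expression": expression}
    for level in range(depth - 1, 0, -1):
        column = {"name": f"level_{level}", "children": [column]}
    return column


# Hypothetical input: 25 levels deep with a 40,000-byte leaf expression.
column = build_nested_column(25, "x" * 40_000)
hits = list(oversized_leaves([column]))
```

Under a flattened mapping each such leaf would become a single keyword term, which is why one oversized expression was enough to reject the document.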

What's NOT included

  • v1130 migration (1.13.0 only)
  • Integration tests (SearchIndexImmenseTermIT, FlattenedChildrenHighlightSearchIT)
  • Sample data changes

Test plan

  • Verify reindex completes without immense_term failure on a table/container with deeply nested columns
  • Verify columns.children.name / dataModel.columns.children.name are absent from search settings after upgrade migration runs
  • Confirm container search returns 200 on OpenSearch after upgrade (no "no associated analyzer" on highlight)
  • SearchIndexNestedColumns.spec.ts — 25-level oversized nested column indexes and is searchable by deep column name

Backport of the flattened-children reindex fix from main (PR #28509).

The recursive column/schema `children` field was mapped `flattened`, which
caused Lucene's 32,766-byte per-term limit to fail document indexing for
entities with large nested column expressions. Remap `children` from
`flattened` to `object/enabled:false` across all affected indexes (table,
container, dashboardDataModel, worksheet, topic, apiEndpoint, searchEntity)
so the subtree is stored but never indexed.
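
For illustration only, a minimal Python sketch of the remap described above, assuming a mapping shaped like a standard `properties` tree. The helper name `remap_children` and the sample mapping are hypothetical; the actual change in this PR is an edit to the JSON mapping files.

```python
import copy
from typing import Any, Dict


def remap_children(mapping: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy in which every `children` property typed `flattened`
    is replaced by an object with indexing disabled."""
    result = copy.deepcopy(mapping)

    def visit(properties: Dict[str, Any]) -> None:
        for name, spec in properties.items():
            if name == "children" and spec.get("type") == "flattened":
                # Kept in _source, but no terms are indexed for the subtree.
                properties[name] = {"type": "object", "enabled": False}
            elif isinstance(spec.get("properties"), dict):
                visit(spec["properties"])

    visit(result.get("properties", {}))
    return result


# Hypothetical fragment of a table index mapping.
sample = {
    "properties": {
        "columns": {
            "properties": {
                "name": {"type": "text"},
                "children": {"type": "flattened"},
            }
        }
    }
}
remapped = remap_children(sample)
```

The sketch returns a copy so the before and after mappings can be compared side by side.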

Also:
- Replace dead `columns.children.name` / `dataModel.columns.children.name`
  search field references with `columnNamesFuzzy` in the 4 Java index classes
- Remove stale children-field entries from the searchSettings.json seed
- Add v11210 migration to scrub the same stale entries from DB-stored
  SearchSettings on upgrade (idempotent)
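
A minimal Python sketch of why such a scrub is idempotent, assuming each stored search field is a dict with a `field` key and that stale entries can be recognized by `.children.` in the field name. Both are assumptions for illustration; the real migration is Java and uses `removeIf`.

```python
from typing import Dict, List

STALE_MARKER = ".children."  # assumption: how stale entries are recognized


def scrub_stale_fields(search_fields: List[Dict[str, object]]) -> bool:
    """Remove stale entries in place; return True if anything was removed."""
    kept = [f for f in search_fields if STALE_MARKER not in str(f.get("field", ""))]
    changed = len(kept) != len(search_fields)
    search_fields[:] = kept
    return changed


# Hypothetical stored settings for one entity type.
fields = [
    {"field": "columns.name"},
    {"field": "columns.children.name"},
    {"field": "dataModel.columns.children.name"},
    {"field": "columnNamesFuzzy"},
]
first_run = scrub_stale_fields(fields)   # removes the two stale entries
second_run = scrub_stale_fields(fields)  # nothing left to remove
```

Running the scrub a second time leaves the list unchanged, which is the property that lets a later migration repeat it safely.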
@github-actions github-actions Bot added the backend and safe to test labels Jun 1, 2026
@github-actions
Contributor

github-actions Bot commented Jun 1, 2026

The Python checkstyle failed.

Please run make py_format and py_format_check in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Python code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

@github-actions
Contributor

github-actions Bot commented Jun 1, 2026

🟡 Playwright Results — all passed (31 flaky)

✅ 3459 passed · ❌ 0 failed · 🟡 31 flaky · ⏭️ 97 skipped

| Shard | Passed | Failed | Flaky | Skipped |
|---|---|---|---|---|
| 🟡 Shard 2 | 703 | 0 | 7 | 9 |
| 🟡 Shard 3 | 715 | 0 | 8 | 6 |
| 🟡 Shard 4 | 723 | 0 | 7 | 19 |
| 🟡 Shard 5 | 672 | 0 | 1 | 35 |
| 🟡 Shard 6 | 646 | 0 | 8 | 28 |

🟡 31 flaky test(s) (passed on retry)
  • Features/CustomizeDetailPage.spec.ts › API Collection - customization should work (shard 2, 1 retry)
  • Features/DataQuality/ColumnLevelTests.spec.ts › Column Values Sum To Be Between (shard 2, 1 retry)
  • Features/DataQuality/DataQualityPermissions.spec.ts › Admin can see Data Quality UI controls (add test case, add test suite) (shard 2, 1 retry)
  • Features/DataQuality/IncidentManagerDateFilter.spec.ts › Date filter persists on page reload (shard 2, 1 retry)
  • Features/DataQuality/TableLevelTests.spec.ts › Custom SQL Query (shard 2, 1 retry)
  • Features/Glossary/GlossaryAssets.spec.ts › should remove glossary term tag from entity page (shard 2, 1 retry)
  • Features/Glossary/GlossaryHierarchy.spec.ts › should move nested term to root level of same glossary (shard 2, 1 retry)
  • Features/Permissions/GlossaryPermissions.spec.ts › Team-based permissions work correctly (shard 3, 1 retry)
  • Features/Permissions/ServiceEntityPermissions.spec.ts › SearchIndex Service allow common operations permissions (shard 3, 1 retry)
  • Features/RestoreEntityInheritedFields.spec.ts › Validate restore with Inherited domain and data products assigned (shard 3, 1 retry)
  • Flow/ExploreDiscovery.spec.ts › Should display deleted assets when showDeleted is checked and deleted is not present in queryFilter (shard 3, 1 retry)
  • Flow/Metric.spec.ts › verify metric expression update (shard 3, 1 retry)
  • Flow/Navbar.spec.ts › Search Term - Database (shard 3, 1 retry)
  • Flow/PersonaFlow.spec.ts › Set default persona for team should work properly (shard 3, 1 retry)
  • Flow/ServiceForm.spec.ts › Verify SSL cert upload with long filename and UI overflow handling (shard 3, 1 retry)
  • Pages/CustomProperties.spec.ts › Date Time (shard 4, 1 retry)
  • Pages/CustomProperties.spec.ts › Entity Reference List (shard 4, 1 retry)
  • Pages/CustomProperties.spec.ts › Entity Reference List (shard 4, 1 retry)
  • Pages/CustomProperties.spec.ts › Sql Query (shard 4, 1 retry)
  • Pages/DataContractInheritance.spec.ts › Delete Button Disabled - Fully inherited contracts cannot be deleted (shard 4, 1 retry)
  • Pages/Entity.spec.ts › Delete Pipeline (shard 4, 1 retry)
  • Pages/Entity.spec.ts › Tier Add, Update and Remove (shard 4, 1 retry)
  • Pages/Entity.spec.ts › Glossary Term Add, Update and Remove for child entities (shard 5, 1 retry)
  • Pages/Lineage/DataAssetLineage.spec.ts › Column lineage for table -> searchIndex (shard 6, 1 retry)
  • Pages/Lineage/LineageFilters.spec.ts › Verify LineageSearchSelect in lineage mode (shard 6, 1 retry)
  • Pages/Lineage/LineageRightPanel.spec.ts › Verify custom properties tab IS visible for supported type: container (shard 6, 1 retry)
  • Pages/ODCSImportExport.spec.ts › Multi-object ODCS contract - object selector shows all schema objects (shard 6, 1 retry)
  • Pages/ServiceEntity.spec.ts › Delete Database Service (shard 6, 1 retry)
  • Pages/ServiceEntity.spec.ts › Announcement create, edit & delete (shard 6, 1 retry)
  • Pages/Tag.spec.ts › Add and Remove Assets and Check Restricted Entity (shard 6, 1 retry)
  • ... and 1 more

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

…x files

The backport was missing the om_analyzer declaration for columnNamesFuzzy
in container, table, and dashboard_data_model across all 4 locales (en/jp/ru/zh).
Without it the field falls back to dynamic text mapping with the default analyzer
instead of om_analyzer, giving weaker tokenization than main. Adds the same
4-line block that the main PR (#28509) included.

…snapshot

SearchSettings.spec.ts 'Restore default search settings' compares against
the hardcoded snapshot in searchSettingUtils.ts. The field was removed from
searchSettings.json seed but the snapshot wasn't updated, causing the test
to fail (expected columns.children.name, received nothing).
@sonika-shah sonika-shah requested a review from a team as a code owner June 1, 2026 11:33
@github-actions
Contributor

github-actions Bot commented Jun 1, 2026

❌ UI Checkstyle Failed

❌ Playwright — ESLint + Prettier + Organise Imports

One or more Playwright test files have linting or formatting issues.

❌ Core Components — ESLint + Prettier

One or more core-component files have linting or formatting issues.


Fix locally (fast — only checks files changed in this branch):

make ui-checkstyle-changed

…backport

Backport of SearchIndexNestedColumns.spec.ts from main PR #28509.

Verifies that a 25-level-deep column with a >32 KB leaf expression
indexes without an immense_term failure and is searchable by its deep
column name via columnNamesFuzzy.

@sonika-shah sonika-shah merged commit 39cb088 into 1.12.10 Jun 1, 2026
32 of 59 checks passed
@sonika-shah sonika-shah deleted the fix/immense-term-children-1.12.10 branch June 1, 2026 12:09
@gitar-bot

gitar-bot Bot commented Jun 1, 2026

Code Review ✅ Approved

Backports the immense-term children mapping fix to 1.12.10 by transitioning field types to object/enabled:false and updating search mappings. No issues found.
