Skip to content

Scope overwriteTableConfigForTier Table config Jackson to sub-configs that can hold tier overrides#18563

Merged
Jackie-Jiang merged 3 commits into
apache:masterfrom
deepthi912:deepthi/scope-tier-overwrite-roundtrip
May 25, 2026
Merged

Scope overwriteTableConfigForTier Table config Jackson to sub-configs that can hold tier overrides#18563
Jackie-Jiang merged 3 commits into
apache:masterfrom
deepthi912:deepthi/scope-tier-overwrite-roundtrip

Conversation

@deepthi912
Copy link
Copy Markdown
Collaborator

@deepthi912 deepthi912 commented May 22, 2026

Problem
overwriteTableConfigForTier runs on every segment during checkSegmentsReload polls. The current code does a full Jackson roundtrip on the entire TableConfig — even though tier overrides can only live on IndexingConfig._tierOverwrites and FieldConfig._tierOverwrites.
Tables with no tierOverwrites defined anywhere would still pay the full serialize cost before, because the existing code serializes upfront and only then checks if there's anything to override. overwriteTableConfigForTier allocated ~4.5 GB (319 calls × ~14 MB each). Largest single allocator on the reload-check path. This multiplies for huge number of segments and allocates unnecessary memory when not needed

Fix:
Only roundtrip the pieces that can have overrides:

  • If IndexingConfig.getTierOverwrites() has an entry for this tier → roundtrip just IndexingConfig. Otherwise return the original reference.
  • For each FieldConfig with an override → roundtrip just that FieldConfig. Otherwise keep the original reference.
  • If nothing changed → return the original TableConfig reference (no allocation).
  • If something changed → shallow-copy TableConfig and swap the two affected fields.

@deepthi912 deepthi912 added performance Related to performance optimization memory Related to memory usage or optimization labels May 22, 2026
@deepthi912 deepthi912 changed the title Scope overwriteTableConfigForTier Jackson roundtrip to sub-configs that can hold tier overrides Scope overwriteTableConfigForTier Table config Jackson to sub-configs that can hold tier overrides May 22, 2026
@deepthi912 deepthi912 force-pushed the deepthi/scope-tier-overwrite-roundtrip branch from 335643c to 4826430 Compare May 22, 2026 02:23
`TableConfigUtils.overwriteTableConfigForTier` previously did a full Jackson
roundtrip over the entire TableConfig — `toJsonNode()` followed by
`jsonNodeToObject(_, TableConfig.class)` — even though tier overrides can only
affect `IndexingConfig` (via its `tierOverwrites` JsonNode) and individual
`FieldConfig` entries (via theirs). Every other field (validationConfig,
tenantConfig, ingestionConfig, taskConfig, etc.) was serialized and
deserialized for no reason.

In hot paths (`SegmentPreProcessor.needProcess` → `IndexLoadingConfig
.refreshIndexConfigs`) this function is invoked once per segment per
IndexType during `checkSegmentsReload`, and the full-TableConfig roundtrip
shows up as a large CPU + allocation hotspot in profiles, driving long
G1 evacuation failures and Old GC pauses on servers with many tiered
segments.

This change refactors the function to:
- Apply the override scoped to `IndexingConfig` (when its `tierOverwrites`
  has an entry for the tier).
- Apply the override scoped to each `FieldConfig` whose `tierOverwrites`
  has an entry for the tier — `FieldConfig`s without overrides skip
  Jackson entirely.
- Return the original `TableConfig` by reference when nothing changed
  (common case for tiers with no overrides).
- Return a shallow-copy `TableConfig` (via the existing copy constructor)
  with only `IndexingConfig` and `FieldConfigList` replaced; all other
  fields are shared by reference with the original.

The per-call cost drops from "serialize entire TableConfig" to "serialize
IndexingConfig + serialize each FieldConfig that has an override" — typically
a 20-100× reduction in serialized bytes and proportional CPU/allocation
savings.

Existing semantics are preserved: `testOverwriteTableConfigForTier` and
`testOverwriteTableConfigForTierWithError` pass unchanged. Two new tests
assert the optimization-relevant invariants: (1) the original TableConfig
instance is returned by reference when no override applies, and (2)
unaffected fields are shared by reference with the original after a
partial overwrite.
@deepthi912 deepthi912 force-pushed the deepthi/scope-tier-overwrite-roundtrip branch from 4826430 to 7920d75 Compare May 22, 2026 02:29
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented May 22, 2026

Codecov Report

❌ Patch coverage is 69.23077% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.27%. Comparing base (7358d5a) to head (b27a6ad).
⚠️ Report is 10 commits behind head on master.

Files with missing lines Patch % Lines
...he/pinot/segment/local/utils/TableConfigUtils.java 69.23% 2 Missing and 2 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18563      +/-   ##
============================================
+ Coverage     64.26%   64.27%   +0.01%     
- Complexity     1126     1137      +11     
============================================
  Files          3311     3314       +3     
  Lines        203829   204075     +246     
  Branches      31722    31765      +43     
============================================
+ Hits         130989   131177     +188     
- Misses        62326    62362      +36     
- Partials      10514    10536      +22     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-21 64.27% <69.23%> (+0.01%) ⬆️
temurin 64.27% <69.23%> (+0.01%) ⬆️
unittests 64.27% <69.23%> (+0.01%) ⬆️
unittests1 56.71% <0.00%> (-0.01%) ⬇️
unittests2 35.54% <69.23%> (+0.04%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

…e CollectionUtils.isEmpty

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Jackie-Jiang Jackie-Jiang merged commit 10f8735 into apache:master May 25, 2026
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

memory Related to memory usage or optimization performance Related to performance optimization

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants