
Add TableConfigValidator and InstanceConfigValidator SPI for batch restart enforcement #18167

Merged
xiangfu0 merged 3 commits into apache:master from suvodeep-pyne:spyne/data-707-batch-restart-validation-spi
Apr 14, 2026


@suvodeep-pyne
Contributor

Summary

  • Adds TableConfigValidator and InstanceConfigValidator SPI interfaces in pinot-spi with list-based registries, enabling pre-mutation validation of table and instance configs
  • Adds InstanceUtils.toInstance(InstanceConfig) reverse converter for the updateInstanceTags validation path
  • Wires validator call sites into PinotTableRestletResource, PinotHelixResourceManager, PinotInstanceRestletResource, and TableConfigsRestletResource
  • Introduces ConfigValidationException (extends RuntimeException) mapped to HTTP 400 at the REST boundary
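
The list-based registry behaviour described above (validators run in order, the first rejection short-circuits, an empty registry is a no-op) can be sketched roughly as follows. This is a simplified stand-in, not the code from this PR: `ConfigValidator<T>`, `ValidatorRegistry<T>`, and the generic `validate(T)` signature are hypothetical names chosen for illustration, and the real SPI operates on Pinot's `TableConfig` and `Instance` types rather than a type parameter.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical stand-in for the exception described in the summary.
class ConfigValidationException extends RuntimeException {
  ConfigValidationException(String message) {
    super(message);
  }
}

// Simplified validator contract (assumption: the real SPI is not generic).
interface ConfigValidator<T> {
  void validate(T config);
}

// List-based registry: validators run in registration order.
final class ValidatorRegistry<T> {
  private final List<ConfigValidator<T>> _validators = new CopyOnWriteArrayList<>();

  void register(ConfigValidator<T> validator) {
    _validators.add(validator);
  }

  void reset() {
    _validators.clear();
  }

  void validate(T config) {
    for (ConfigValidator<T> validator : _validators) {
      // A thrown ConfigValidationException propagates and skips later validators.
      validator.validate(config);
    }
  }
}

class RegistrySketchDemo {
  public static void main(String[] args) {
    ValidatorRegistry<String> registry = new ValidatorRegistry<>();
    registry.validate("anything"); // no validators registered, so nothing happens

    registry.register(tag -> {
      if (tag.isEmpty()) {
        throw new ConfigValidationException("server tag must not be empty");
      }
    });
    try {
      registry.validate("");
    } catch (ConfigValidationException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

`CopyOnWriteArrayList` lets registration and validation happen from different threads without extra locking, at the cost of copying the list on each registration, which is reasonable when validators are registered rarely and invoked often.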

Test plan

  • InstanceUtilsTest.testToInstance() — roundtrip fidelity for all 4 instance types + unknown type error
  • InstanceConfigValidatorRegistryTest — 5 tests: no-op when empty, rejection propagates, short-circuit, ordering, reset
  • TableConfigValidatorRegistryTest — 5 tests: same coverage as above
  • PinotHelixResourceManagerConfigValidationTest — 4 mocked tests: addInstance/updateInstance/updateInstanceTags rejection before persistence, passing validator allows persistence
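
The "rejection before persistence" property exercised by the resource manager tests can be illustrated with a hand-rolled fake instead of mocks. Everything in this sketch is hypothetical: `InstanceManagerSketch` stands in for the mutation path, a `Consumer<String>` stands in for the validator, and an in-memory list stands in for Helix persistence.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for the exception described in the summary.
class ConfigValidationException extends RuntimeException {
  ConfigValidationException(String message) {
    super(message);
  }
}

// Stand-in for a mutation path: validate first, persist second.
final class InstanceManagerSketch {
  private final Consumer<String> _validator;
  private final List<String> _persisted = new ArrayList<>();

  InstanceManagerSketch(Consumer<String> validator) {
    _validator = validator;
  }

  void addInstance(String instanceId) {
    _validator.accept(instanceId); // throws ConfigValidationException to reject
    _persisted.add(instanceId);    // only reached when validation passes
  }

  List<String> persisted() {
    return _persisted;
  }
}

class PersistenceOrderDemo {
  public static void main(String[] args) {
    InstanceManagerSketch rejecting = new InstanceManagerSketch(id -> {
      throw new ConfigValidationException("rejected " + id);
    });
    try {
      rejecting.addInstance("Server_host1_8098");
    } catch (ConfigValidationException e) {
      System.out.println("persisted after rejection: " + rejecting.persisted());
    }

    InstanceManagerSketch passing = new InstanceManagerSketch(id -> { });
    passing.addInstance("Server_host1_8098");
    System.out.println("persisted after pass: " + passing.persisted());
  }
}
```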

@xiangfu0 xiangfu0 added the enhancement (Improvement to existing functionality) label Apr 11, 2026
Contributor

Copilot AI left a comment


Pull request overview

Adds SPI hooks to validate table and instance configuration mutations before they’re persisted, enabling enforcement of batch restart / mutation rules and returning validation failures as client errors.

Changes:

  • Introduces TableConfigValidator / InstanceConfigValidator SPIs and in-memory registries in pinot-spi.
  • Adds ConfigValidationException for validator rejections and wires validator invocations into controller mutation paths.
  • Adds InstanceUtils.toInstance(InstanceConfig) to reconstruct Instance objects for validation (notably for updateInstanceTags) and adds unit tests for the new behavior.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| pinot-spi/src/main/java/org/apache/pinot/spi/config/table/TableConfigValidator.java | New SPI interface for table config mutation validation. |
| pinot-spi/src/main/java/org/apache/pinot/spi/config/table/TableConfigValidatorRegistry.java | Registry for table validators; invoked before persistence. |
| pinot-spi/src/main/java/org/apache/pinot/spi/config/instance/InstanceConfigValidator.java | New SPI interface for instance mutation validation. |
| pinot-spi/src/main/java/org/apache/pinot/spi/config/instance/InstanceConfigValidatorRegistry.java | Registry for instance validators; invoked before persistence. |
| pinot-spi/src/main/java/org/apache/pinot/spi/exception/ConfigValidationException.java | Exception type used by validators to reject mutations. |
| pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/PinotHelixResourceManager.java | Calls instance validators for add/update/updateTags prior to Helix persistence. |
| pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/TableConfigValidationUtils.java | Calls table validators as part of existing table config validation flow. |
| pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/TableConfigsRestletResource.java | Calls table validators when validating composite TableConfigs. |
| pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotTableRestletResource.java | Adds table config validation on the copy-table flow. |
| pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotInstanceRestletResource.java | Maps ConfigValidationException to HTTP 400 for instance mutation endpoints. |
| pinot-common/src/main/java/org/apache/pinot/common/utils/config/InstanceUtils.java | Adds toInstance(InstanceConfig) reverse conversion for validation usage. |
| pinot-common/src/test/java/org/apache/pinot/common/utils/config/InstanceUtilsTest.java | Tests round-trip fidelity and unknown-prefix handling for toInstance. |
| pinot-spi/src/test/java/org/apache/pinot/spi/config/table/TableConfigValidatorRegistryTest.java | Tests validator registry semantics (ordering, short-circuit, reset). |
| pinot-spi/src/test/java/org/apache/pinot/spi/config/instance/InstanceConfigValidatorRegistryTest.java | Tests validator registry semantics (ordering, short-circuit, reset). |
| pinot-controller/src/test/java/org/apache/pinot/controller/helix/core/PinotHelixResourceManagerConfigValidationTest.java | Verifies validators reject before persistence and allow persistence when passing. |

@codecov-commenter

codecov-commenter commented Apr 11, 2026

Codecov Report

❌ Patch coverage is 84.05797% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.31%. Comparing base (31eac83) to head (ab3b9d9).
⚠️ Report is 4 commits behind head on master.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...er/api/resources/PinotInstanceRestletResource.java | 0.00% | 6 Missing ⚠️ |
| ...oller/api/resources/PinotTableRestletResource.java | 0.00% | 3 Missing ⚠️ |
| ...pinot/spi/exception/ConfigValidationException.java | 50.00% | 2 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18167      +/-   ##
============================================
+ Coverage     63.29%   63.31%   +0.01%     
  Complexity     1627     1627              
============================================
  Files          3226     3229       +3     
  Lines        196636   196705      +69     
  Branches      30401    30408       +7     
============================================
+ Hits         124466   124538      +72     
+ Misses        62192    62191       -1     
+ Partials       9978     9976       -2     
| Flag | Coverage Δ |
| --- | --- |
| custom-integration1 | 100.00% <ø> (ø) |
| integration | 100.00% <ø> (ø) |
| integration1 | 100.00% <ø> (ø) |
| integration2 | 0.00% <ø> (ø) |
| java-11 | 63.29% <84.05%> (+0.02%) ⬆️ |
| java-21 | 63.27% <84.05%> (+0.01%) ⬆️ |
| temurin | 63.31% <84.05%> (+0.01%) ⬆️ |
| unittests | 63.30% <84.05%> (+0.01%) ⬆️ |
| unittests1 | 55.28% <96.29%> (+0.02%) ⬆️ |
| unittests2 | 34.97% <71.01%> (+0.01%) ⬆️ |


Contributor

@xiangfu0 xiangfu0 left a comment


High-signal issues inline.

}
}
TaskConfigUtils.validateTaskConfigs(tableConfigs.getOffline(), schema, _pinotTaskManager, typesToSkip);
TableConfigValidatorRegistry.validate(offlineTableConfig, schema);
Contributor


This validator still runs inside validateConfig(), but addConfig()/updateConfig() call tuneConfig() only afterwards. That means the SPI is evaluating a different TableConfig than the one that is actually persisted, unlike the /tables endpoints where tuning happens before validation. A validator can therefore reject or allow the same mutation depending on which API the caller uses. If this SPI is meant to protect the stored config, it needs to run after tuning or in PinotHelixResourceManager.

Contributor Author


Good catch on the ordering. I investigated what tuneConfig() actually modifies:

  • applyTunerConfigs() — only indexing config (inverted indices, no-dictionary columns)
  • ensureMinReplicas() — segment replication count
  • ensureStorageQuotaConstraints() — quota config for dimension tables

None of these touch instance assignment, replica groups, pools, or server tags — the fields that config validators (like batch restart enforcement) inspect. So the pre-tuned and post-tuned configs are identical for validation purposes.

I considered moving the SPI call out of validateConfig() and into each mutation path after tuning, but that creates a maintenance risk: any new mutation endpoint that calls validateConfig() would silently skip SPI validation unless the caller remembers to invoke it separately. Keeping it in validateConfig() makes it a single checkpoint that can't be accidentally bypassed.

If a future tuner modifies fields that validators care about, the right fix at that point would be to move tuning before validateConfig() in the /tableConfigs paths (matching what /tables already does), rather than pulling the SPI call out.
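
To make the ordering question in this thread concrete, here is a toy sketch of both positions. It is not Pinot code: `ToyTableConfig`, the `tune` method (a stand-in for something like ensureMinReplicas() with an assumed minimum of 3), and the two validators are hypothetical. A validator that reads only fields the tuner leaves alone gives the same answer before and after tuning, while one that reads a tuned field does not.

```java
// Toy config with one field the tuner rewrites (replication) and one it leaves alone (serverTag).
final class ToyTableConfig {
  final String serverTag;
  final int replication;

  ToyTableConfig(String serverTag, int replication) {
    this.serverTag = serverTag;
    this.replication = replication;
  }
}

class TuningOrderSketch {
  // Hypothetical tuner: raises replication to an assumed minimum of 3, keeps the tag unchanged.
  static ToyTableConfig tune(ToyTableConfig config) {
    return new ToyTableConfig(config.serverTag, Math.max(config.replication, 3));
  }

  // Validator that inspects only the server tag, which tuning never touches.
  static boolean tagValidatorAccepts(ToyTableConfig config) {
    return config.serverTag.endsWith("_OFFLINE");
  }

  // Validator that inspects replication, a field the tuner rewrites.
  static boolean replicationValidatorAccepts(ToyTableConfig config) {
    return config.replication >= 3;
  }

  public static void main(String[] args) {
    ToyTableConfig submitted = new ToyTableConfig("DefaultTenant_OFFLINE", 1);
    ToyTableConfig tuned = tune(submitted);

    System.out.println("tag check, pre-tune:  " + tagValidatorAccepts(submitted));
    System.out.println("tag check, post-tune: " + tagValidatorAccepts(tuned));
    System.out.println("replication check, pre-tune:  " + replicationValidatorAccepts(submitted));
    System.out.println("replication check, post-tune: " + replicationValidatorAccepts(tuned));
  }
}
```

In this toy setup the tag check is order-independent, which matches the author's argument for today's validators, and the replication check flips between the two orderings, which is the scenario the reviewer is warning about.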

…estart enforcement

Introduces validation SPI interfaces in pinot-spi for intercepting table and
instance config mutations before persistence. This enables StarTree to enforce
batch restart invariants (pool tags, replica groups) at mutation time rather
than only at restart time.

Changes:
- TableConfigValidator/InstanceConfigValidator interfaces in pinot-spi
- List-based registries (CopyOnWriteArrayList, first-rejection short-circuits)
- ConfigValidationException for rejection signaling (maps to HTTP 400)
- InstanceUtils.toInstance() reverse converter (InstanceConfig -> Instance)
- Validator call sites in PinotTableRestletResource, PinotHelixResourceManager,
  PinotInstanceRestletResource, TableConfigsRestletResource
- Unit tests for registries, toInstance() roundtrip, and resource manager
  validation wiring with mocked Helix dependencies
…onUtils

Consolidate SPI validation into TableConfigValidationUtils.validateTableConfig()
so every caller gets it automatically. This also adds the missing validation
to the copyTable endpoint, which previously had a TODO placeholder.
Fix copyTable() to return HTTP 400 for ConfigValidationException instead
of 500, matching the pattern used in PinotInstanceRestletResource.
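
The 400-versus-500 distinction in this commit can be sketched without any REST framework. The `handle` helper and the plain integer status codes below are illustrative assumptions; the actual resources presumably translate the exception through the controller's own REST exception type, which is not shown in this thread.

```java
// Hypothetical stand-in for the exception introduced by this PR.
class ConfigValidationException extends RuntimeException {
  ConfigValidationException(String message) {
    super(message);
  }
}

class StatusMappingSketch {
  // Runs a mutation and maps the outcome to an HTTP status code.
  static int handle(Runnable mutation) {
    try {
      mutation.run();
      return 200;
    } catch (ConfigValidationException e) {
      return 400; // validator rejection: the request itself was invalid
    } catch (RuntimeException e) {
      return 500; // any other failure remains a server error
    }
  }

  public static void main(String[] args) {
    System.out.println(handle(() -> { }));
    System.out.println(handle(() -> {
      throw new ConfigValidationException("pool tag mismatch");
    }));
    System.out.println(handle(() -> {
      throw new IllegalStateException("store unavailable");
    }));
  }
}
```

The more specific catch has to come first: `ConfigValidationException` extends `RuntimeException`, so a generic handler placed ahead of it would swallow validator rejections and report them as 500.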

Normalize legacy Server_<hostname> format in InstanceUtils.toInstance()
to prevent double-prefixed Helix instance IDs on older clusters.
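
A rough sketch of the double-prefix problem this commit addresses, under two assumptions that are not confirmed in this thread: that legacy clusters store the hostname field as `Server_<hostname>`, and that a server's Helix instance ID is built as `Server_` + host + `_` + port. The helper names are hypothetical.

```java
class InstanceIdSketch {
  static final String SERVER_PREFIX = "Server_";

  // Strips a legacy "Server_" prefix from the stored hostname, if present.
  static String normalizeHost(String hostname) {
    return hostname.startsWith(SERVER_PREFIX)
        ? hostname.substring(SERVER_PREFIX.length())
        : hostname;
  }

  // Assumed ID format for servers: prefix + host + "_" + port.
  static String helixInstanceId(String host, int port) {
    return SERVER_PREFIX + host + "_" + port;
  }

  public static void main(String[] args) {
    String legacyHost = "Server_host1";
    // Without normalization the prefix is applied twice.
    System.out.println(helixInstanceId(legacyHost, 8098));
    // With normalization the round trip yields a single prefix.
    System.out.println(helixInstanceId(normalizeHost(legacyHost), 8098));
  }
}
```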
@suvodeep-pyne suvodeep-pyne force-pushed the spyne/data-707-batch-restart-validation-spi branch from cdff185 to ab3b9d9 Compare April 13, 2026 22:07
@xiangfu0 xiangfu0 merged commit aa483d3 into apache:master Apr 14, 2026
15 of 16 checks passed
xiangfu0 pushed a commit to pinot-contrib/pinot-docs that referenced this pull request Apr 14, 2026
@xiangfu0
Contributor

Documentation PR created: pinot-contrib/pinot-docs#737

This documentation covers the new TableConfigValidator and InstanceConfigValidator SPI interfaces, including:

  • Overview of validator interfaces
  • Registration and invocation patterns
  • ConfigValidationException and HTTP 400 mapping
  • Thread-safety requirements
  • Example implementations for both validators

xiangfu0 added a commit to pinot-contrib/pinot-docs that referenced this pull request Apr 14, 2026
…pache/pinot#18167)

Documentation for apache/pinot#18167

This PR adds documentation for the new TableConfigValidator and
InstanceConfigValidator SPI interfaces that were added in
apache/pinot#18167.

The documentation covers:
- Overview of the two validator interfaces
- How to register validators
- The ConfigValidationException and HTTP 400 mapping
- Thread-safety requirements
- Example implementations for table and instance validators
- Integration points with REST APIs

Co-authored-by: Pinot Docs <docs@pinot.apache.org>
tarun11Mavani added a commit to tarun11Mavani/pinot that referenced this pull request Apr 20, 2026
…ter delegation

Builds on the COLUMNAR_MAP storage layer (PR apache#18167) to enable query
execution against the index.

New classes:
- ColumnarMapDataSource: query-side data source backed by ColumnarMapIndexReader.
  Returns a NullDataSource for unknown keys (matches BaseMapDataSource contract;
  callers never see null). Sparse-key forward index throws on JSON parse / numeric
  parse failures with column/key/docId context (was previously silent).
- ColumnarMapForwardIndexReader: per-key forward index reader (immutable)
- ColumnarMapRealtimeInvertedIndex: per-key inverted index (mutable segment).
  Wraps each per-dictId bitmap in ThreadSafeMutableRoaringBitmap and returns a
  synchronized clone from getDocIds() — readers can iterate concurrently with
  the writer's add() calls (mirrors RealtimeInvertedIndex pattern).
- MutableColumnarMapIndexImpl: mutable index for consuming segments

Wire-up:
- ColumnarMapIndexType.createMutableIndex returns MutableColumnarMapIndexImpl
  instead of throwing UnsupportedOperationException.

Storage format:
- OnHeapColumnarMapIndexCreator now writes a per-sparse-key presence bitmap
  (run-optimized) into the SPMX layout for sparse-tier keys, reusing the
  nullBitmap slot. ImmutableColumnarMapIndexReader loads it directly. Without
  this, IS NULL / IS NOT NULL on sparse keys returned wrong results.
- Layout javadoc updated to clarify the slot semantic per tier.

Query operators:
- MapFilterOperator: adds per-key inverted index path and presence bitmap path
  alongside the existing JSON-index and full-scan paths. explainAttributes now
  emits the per_key_inverted_index branch.
- ItemTransformFunction: captures per-key null bitmap for null-aware item()
  evaluation; exposes getKeyPath() for direct MAP key resolution. getNullBitmap
  is gated on the query's nullHandlingEnabled flag and skips allocation when
  the segment-level null bitmap doesn't intersect the block's docId range.

SPI hardening:
- ImmutableColumnarMapIndexReader/MutableColumnarMapIndexImpl getKeyDataSource
  throw UnsupportedOperationException — callers must go through
  ColumnarMapDataSource so the unknown-key fall-back is consistent.

Tests:
- ColumnarMapIndexTest: testSparseKeyPresenceBitmapMatchesPresentDocs verifies
  IS NULL / IS NOT NULL correctness on sparse-tier keys;
  testRealtimeInvertedIndexConcurrentReaderCorrectness exercises 1-writer +
  4-reader stress with strict membership assertions on observed docIds;
  testSparseKeyDataSourceReturnsNullDataSourceForUnknownKey verifies the
  NullDataSource fall-back for unknown keys and the SPI-level UOE.
- ColumnarMapIndexEndToEndTest, ColumnarMapBenchmarkTest

Sourced from columnar-map-split-wip via 3-way merge against the storage
branch's current state. Storage-side conflicts resolved in favor of the
storage branch's deep-review fixes (RLE bitmaps, two-tier flag, JsonUtils,
try-with-resources, type-aware sort comparator).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Labels

enhancement Improvement to existing functionality


4 participants