[fast-client] Version-switch listener on StoreMetadata#2818
Open
ymuppala wants to merge 3 commits into
Open
Conversation
11 tasks
There was a problem hiding this comment.
Pull request overview
This PR introduces a version-switch notification mechanism for the Fast Client metadata layer so consumers can react when the observed “current serving version” changes (without racy polling). In addition, the PR also includes venice-common API additions for external-storage-related fields (enums + Store/Version accessors) plus associated tests.
Changes:
- Add
StoreVersionSwitchListenerandregister/unregisterVersionSwitchListenerto Fast ClientStoreMetadata, implemented inAbstractStoreMetadata, and invoked fromRequestBasedMetadataduring metadata refresh. - Add external-storage enums (
StorageMode,ExternalStorageReadMode,ExternalTableStatus) and wire new accessors intoStore/Versionimplementations (ZKStore,VersionImpl,ReadOnlyStore,SystemStore,StoreInfo). - Add/extend unit tests for both the new listener plumbing and the external-storage accessors/enums.
Reviewed changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| internal/venice-common/src/test/java/com/linkedin/venice/meta/TestZKStore.java | Adds tests for externalStorageReadMode default/round-trip/clone behavior. |
| internal/venice-common/src/test/java/com/linkedin/venice/meta/TestVersion.java | Adds tests for new Version external-storage fields (defaults/round-trip/clone). |
| internal/venice-common/src/test/java/com/linkedin/venice/meta/TestStoreInfo.java | Adds tests for StoreInfo.externalStorageReadMode default/null-coercion/round-trip. |
| internal/venice-common/src/test/java/com/linkedin/venice/meta/StorageModeTest.java | Adds enum mapping test for StorageMode. |
| internal/venice-common/src/test/java/com/linkedin/venice/meta/ExternalTableStatusTest.java | Adds enum mapping test for ExternalTableStatus. |
| internal/venice-common/src/test/java/com/linkedin/venice/meta/ExternalStorageReadModeTest.java | Adds enum mapping test for ExternalStorageReadMode. |
| internal/venice-common/src/main/java/com/linkedin/venice/meta/ZKStore.java | Adds transient externalStorageReadMode field/accessors and copy-constructor propagation. |
| internal/venice-common/src/main/java/com/linkedin/venice/meta/VersionImpl.java | Adds transient external-storage fields/accessors and clone propagation. |
| internal/venice-common/src/main/java/com/linkedin/venice/meta/Version.java | Extends Version API with external-storage accessors. |
| internal/venice-common/src/main/java/com/linkedin/venice/meta/SystemStore.java | Delegates getExternalStorageReadMode; rejects setter. |
| internal/venice-common/src/main/java/com/linkedin/venice/meta/StoreInfo.java | Adds externalStorageReadMode field + JSON/POJO accessors and fromStore plumbing. |
| internal/venice-common/src/main/java/com/linkedin/venice/meta/Store.java | Extends Store API with externalStorageReadMode accessors. |
| internal/venice-common/src/main/java/com/linkedin/venice/meta/StorageMode.java | Introduces StorageMode enum with int mapping. |
| internal/venice-common/src/main/java/com/linkedin/venice/meta/ReadOnlyStore.java | Delegates new Version getters and new Store getter; setters throw. |
| internal/venice-common/src/main/java/com/linkedin/venice/meta/ExternalTableStatus.java | Introduces ExternalTableStatus enum with int mapping. |
| internal/venice-common/src/main/java/com/linkedin/venice/meta/ExternalStorageReadMode.java | Introduces ExternalStorageReadMode enum with int mapping. |
| clients/venice-client/src/test/java/com/linkedin/venice/fastclient/meta/RequestBasedMetadataTest.java | Adds tests validating listener firing on initial start and not firing when version unchanged. |
| clients/venice-client/src/test/java/com/linkedin/venice/fastclient/meta/AbstractStoreMetadataTest.java | Adds unit tests for listener registration/dedup/unregister/exception isolation. |
| clients/venice-client/src/main/java/com/linkedin/venice/fastclient/meta/StoreVersionSwitchListener.java | Adds new listener functional interface + contract docs. |
| clients/venice-client/src/main/java/com/linkedin/venice/fastclient/meta/StoreMetadata.java | Adds default no-op register/unregister methods to the metadata interface. |
| clients/venice-client/src/main/java/com/linkedin/venice/fastclient/meta/RequestBasedMetadata.java | Fires the version-switch callback when the refresh updates currentVersion. |
| clients/venice-client/src/main/java/com/linkedin/venice/fastclient/meta/AbstractStoreMetadata.java | Implements listener registration and protected fireVersionSwitch(...). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| currentVersion.set(fetchedCurrentVersion); | ||
| clusterStats.updateCurrentVersion(fetchedCurrentVersion); | ||
| fetchedCurrentVersionPartitionResourceInCompletionRetries.set(0); | ||
| fireVersionSwitch(previousVersion, fetchedCurrentVersion); |
| /** | ||
| * @param previousVersion the version that was current before this refresh, or {@code -1} for the first refresh | ||
| * after client start | ||
| * @param newVersion the new current version observed by the metadata refresh; always {@code > 0} |
Comment on lines
+201
to
+210
| /** | ||
| * Store-level read routing for clients backed by both Venice local storage and a configured external storage | ||
| * system. Defaults to {@link ExternalStorageReadMode#VENICE_ONLY} (no external-storage involvement). Mirrors the | ||
| * {@code externalStorageReadMode} field staged on {@code StoreProperties} (StoreMetaValue v44) by PR #2814. | ||
| * | ||
| * <p>Field plumbing only at the OSS Venice layer; routing semantics live in proprietary client code. | ||
| */ | ||
| ExternalStorageReadMode getExternalStorageReadMode(); | ||
|
|
||
| void setExternalStorageReadMode(ExternalStorageReadMode externalStorageReadMode); |
Comment on lines
+119
to
+121
| * Concrete subclass of {@link AbstractStoreMetadata} used purely to drive the listener-plumbing tests. All | ||
| * methods unrelated to version-switch notification are stubbed to throw, which guarantees that any future | ||
| * accidental dependency on them in a listener test would fail loudly rather than silently. |
6895d34 to
6898af6
Compare
6898af6 to
f7f8153
Compare
Comment on lines
+201
to
+210
| /** | ||
| * Store-level read routing for clients backed by both Venice local storage and a configured external storage | ||
| * system. Defaults to {@link ExternalStorageReadMode#VENICE_ONLY} (no external-storage involvement). Mirrors the | ||
| * {@code externalStorageReadMode} field staged on {@code StoreProperties} (StoreMetaValue v44) by PR #2814. | ||
| * | ||
| * <p>Field plumbing only at the OSS Venice layer; routing semantics live in proprietary client code. | ||
| */ | ||
| ExternalStorageReadMode getExternalStorageReadMode(); | ||
|
|
||
| void setExternalStorageReadMode(ExternalStorageReadMode externalStorageReadMode); |
Comment on lines
+758
to
+776
| /* | ||
| * Store-level externalStorageReadMode is staged on StoreMetaValue v44 (PR #2814) but the build.gradle pin is still | ||
| * v43, so the generated StoreProperties Avro record does not yet carry the field. Held as a transient Java member | ||
| * here until the pinning bump lands; at that point this accessor should read/write from | ||
| * {@code this.storeProperties.externalStorageReadMode} instead. | ||
| * Field plumbing only at the OSS Venice layer; routing semantics live in proprietary client code. | ||
| */ | ||
| private ExternalStorageReadMode externalStorageReadMode = ExternalStorageReadMode.VENICE_ONLY; | ||
|
|
||
| @Override | ||
| public ExternalStorageReadMode getExternalStorageReadMode() { | ||
| return externalStorageReadMode; | ||
| } | ||
|
|
||
| @Override | ||
| public void setExternalStorageReadMode(ExternalStorageReadMode externalStorageReadMode) { | ||
| this.externalStorageReadMode = | ||
| externalStorageReadMode == null ? ExternalStorageReadMode.VENICE_ONLY : externalStorageReadMode; | ||
| } |
Comment on lines
+503
to
+519
| /* | ||
| * Per-version storageMode is staged on StoreMetaValue v44 (PR #2814) but build.gradle still pins to v43, so the | ||
| * Avro-generated StoreVersion record does not yet carry the field. Held as a transient Java member here until the | ||
| * pin advances; at that point this accessor should read/write from {@code this.storeVersion.storageMode} instead. | ||
| * Field plumbing only at the OSS Venice layer; routing semantics live in proprietary client code. | ||
| */ | ||
| private StorageMode storageMode = StorageMode.INTERNAL; | ||
|
|
||
| @Override | ||
| public StorageMode getStorageMode() { | ||
| return storageMode; | ||
| } | ||
|
|
||
| @Override | ||
| public void setStorageMode(StorageMode storageMode) { | ||
| this.storageMode = storageMode == null ? StorageMode.INTERNAL : storageMode; | ||
| } |
832ab2c to
0e68859
Compare
0e68859 to
cc0ec57
Compare
| {"name": "persistenceType", "type": "int", "default": 2, "doc": "Type of persistence storage engine, and default is 'ROCKS_DB'"}, | ||
| {"name": "routingStrategy", "type": "int", "default": 0, "doc": "How to route the key to partition, and default is 'CONSISTENT_HASH'"}, | ||
| {"name": "readStrategy", "type": "int", "default": 0, "doc": "How to read data from multiple replications, and default is 'ANY_OF_ONLINE'"}, | ||
| {"name": "offlinePushStrategy", "type": "int", "default": 1, "doc": "When doing off-line push, how to decide the data is ready to serve, and default is 'WAIT_N_MINUS_ONE_REPLCIA_PER_PARTITION'"}, |
| }, | ||
| {"name": "accessControlled", "type": "boolean", "default": true, "doc": "Store-level ACL switch. When disabled, Venice Router should accept every request."}, | ||
| {"name": "compressionStrategy", "type": "int", "default": 0, "doc": "Strategy used to compress/decompress Record's value, and default is 'NO_OP'"}, | ||
| {"name": "clientDecompressionEnabled", "type": "boolean", "default": true, "doc": "le/Disable client-side record decompression (default: true)"}, |
cc0ec57 to
bfa41fd
Compare
bfa41fd to
f5a3f5c
Compare
| /** | ||
| * @param previousVersion the version that was current before this refresh, or {@code -1} for the first refresh | ||
| * after client start | ||
| * @param newVersion the new current version observed by the metadata refresh; always {@code > 0} |
f5a3f5c to
5e5739a
Compare
5e5739a to
6825171
Compare
| currentVersion.set(fetchedCurrentVersion); | ||
| clusterStats.updateCurrentVersion(fetchedCurrentVersion); | ||
| fetchedCurrentVersionPartitionResourceInCompletionRetries.set(0); | ||
| fireVersionSwitch(previousVersion, fetchedCurrentVersion); |
Comment on lines
+4
to
+6
| * Callback notified whenever the Fast Client observes a change in a store's current serving version. Fires from | ||
| * within the metadata refresh loop after the new version has been committed to the local cache, so listeners may | ||
| * safely call {@link StoreMetadata#getCurrentStoreVersion()} and observe the new value. |
6825171 to
8d85170
Compare
8d85170 to
f793aa1
Compare
f793aa1 to
920b311
Compare
920b311 to
e20a1a9
Compare
Comment on lines
504
to
+507
| currentVersion.set(fetchedCurrentVersion); | ||
| clusterStats.updateCurrentVersion(fetchedCurrentVersion); | ||
| fetchedCurrentVersionPartitionResourceInCompletionRetries.set(0); | ||
| fireVersionSwitch(previousVersion, fetchedCurrentVersion); |
e20a1a9 to
3a7bfba
Compare
3a7bfba to
684f621
Compare
Comment on lines
+4
to
+8
| * Callback notified whenever the Fast Client observes a change in a store's current serving version. Fires from | ||
| * within the metadata refresh loop after the new version has been committed to the local cache, so listeners may | ||
| * safely call {@link StoreMetadata#getCurrentStoreVersion()} and observe the new value. | ||
| * | ||
| * <p>Implementations must be thread-safe and short-running; the metadata refresh thread is shared across all reads |
684f621 to
318fff9
Compare
318fff9 to
493947b
Compare
| /** | ||
| * Store-level read routing for clients backed by both Venice local storage and a configured external storage | ||
| * system. Defaults to {@link ExternalStorageReadMode#VENICE_ONLY} (no external-storage involvement). Mirrors the | ||
| * {@code externalStorageReadMode} field staged on {@code StoreProperties} (StoreMetaValue v44) by PR #2814. |
Comment on lines
+334
to
+335
| * Defaults to {@link StorageMode#INTERNAL}. Mirrors the {@code storageMode} field staged on {@code StoreVersion} | ||
| * (StoreMetaValue v44) by PR #2814. |
| /** | ||
| * Store-level read routing for clients backed by both Venice local storage and a configured external storage system. | ||
| * Applies to the store as a whole, not per-version. Mirrors the {@code externalStorageReadMode} field staged on | ||
| * {@code StoreProperties} (StoreMetaValue v44) by PR #2814. |
Comment on lines
+10
to
+11
| * external storage system. Mirrors the {@code storageMode} field staged on {@code StoreVersion} (StoreMetaValue v44) | ||
| * by PR #2814. |
493947b to
e8d4541
Compare
…dminOperation v99 + Java accessors
Predecessor PR (the schema PR in this stack) stages StoreMetaValue v44 and
AdminOperation v99 with these external-storage-related fields:
* StoreVersion: storageMode, externalDbName, externalTableName (per-version,
all newly-added in v44).
* StoreProperties: externalStorageReadMode (store-level, per-store).
* UpdateStore (admin op): storageMode, externalStorageReadMode,
externalDbName, externalTableName (mirrors for operator-override).
build.gradle's `versionOverrides` list still forces the Avro compiler to use
v43 / v98 instead of letting it pick the highest-numeric directory, so v44 /
v99 are not yet wire-real. Downstream consumers (Fast Client metadata
refresh, controller, server, VPJ producers) need typed Java accessors to
start integrating against.
Drop the `versionOverrides` entries so the Avro compiler picks the latest
schema directories (v44 / v99) naturally — that's exactly the
remove-the-override step the comment above `versionOverrides` describes
("remove the override in a follow-up PR when actually using the new
protocol"). Bump `AvroProtocolDefinition` constants in lockstep so the
in-process protocol version matches the compiled schema. Add Java accessors
that read/write directly from the now-Avro-backed `StoreProperties` /
`StoreVersion` records (no transient POJO mirroring), so updates round-trip
through ZK / metadata-system-store serialization.
- `build.gradle`: remove the `versionOverrides` entries for StoreMetaValue
and AdminOperation. The Avro generator now selects v44 / v99 by the
default max-numeric rule.
- `AvroProtocolDefinition`: bump constants
(`METADATA_SYSTEM_SCHEMA_STORE` 43 -> 44; `ADMIN_OPERATION` 98 -> 99)
so the in-process protocol version matches the compiled schema.
- New enums (`VeniceEnumValue`, `EnumUtils`-backed `valueOf`):
* `StorageMode { INTERNAL, DUAL_WRITE, EXTERNAL }` — mirrors
`StoreVersion.storageMode`.
* `ExternalStorageReadMode { VENICE_ONLY, DUAL_MODE_CONSISTENCY_CHECK,
DUAL_MODE_EARLY_RETURN, EXTERNAL_ONLY }` — mirrors
`StoreProperties.externalStorageReadMode`.
- Store-level Java accessor backed by `StoreProperties` Avro record:
* `Store.get/setExternalStorageReadMode` → `storeProperties.externalStorageReadMode`.
Implemented on `ZKStore` (int <-> enum). `ReadOnlyStore` delegates the
getter and throws UOE on the setter; `SystemStore` does the same via its
existing `throwUnsupportedOperationException` helper. `StoreInfo.fromStore`
mirrors the field for JSON.
- Version-level Java accessors backed by `StoreVersion` Avro record:
* `Version.get/setStorageMode` → `storeVersion.storageMode` (int <-> enum).
* `Version.get/setExternalDbName` → `storeVersion.externalDbName`
(null-coerces to "NOT_SPECIFIED", the schema default).
* `Version.get/setExternalTableName` → `storeVersion.externalTableName`
(null-coerces to "NOT_SPECIFIED", the schema default).
Implemented on `VersionImpl`; `cloneVersion` propagates all three through
the Avro-record copy. `ReadOnlyVersion` delegates getters and throws UOE
on setters.
- `AdminMessageType.UPDATE_STORE.getNewInstance()` initializes the two new
`UpdateStore` string fields (externalDbName, externalTableName) to "" at
construction so the Avro generic writer does not NPE on null values. The
Avro schema default for these fields on UpdateStore is "" (the "no-op for
this admin op" sentinel), which is only applied at deserialization time;
explicit initialization is required so existing controller code paths
that call `getNewInstance()` and then conditionally set fields can
serialize the result without populating every new field.
- [ ] Added new code behind a config. (no.)
- [ ] Introduced new log lines. (no.)
- [x] No race conditions: new fields are plain Avro-record fields on the
existing storeProperties / storeVersion containers, accessed under
the same synchronization invariants as the fields they sit alongside.
- [x] No new synchronization primitives required.
- [x] No blocking calls introduced.
- [x] No new collections; only enum singletons and a primitive-or-enum-or-String
field per type.
- [x] No new threading.
- [x] New unit tests added: `ExternalStorageReadModeTest`, `StorageModeTest`
(`VeniceEnumValueTest`-based int-mapping coverage); new test methods
in `TestStoreInfo`, `TestZKStore`, `TestVersion` for defaults /
round-trip / clone preservation / null coercion for storageMode,
externalStorageReadMode, externalDbName, externalTableName; plus
"persists through Avro data model" tests on `TestZKStore` and
`TestVersion` that read the value back through `dataModel()` to
guard against a regression to transient POJO mirroring.
- [x] Modified or extended existing tests: yes (`TestStoreInfo` /
`TestZKStore` / `TestVersion`).
- [x] Verified the controller-side createStore admin-op flow no longer
NPEs on the new UpdateStore fields (TestVeniceParentHelixAdmin
passes locally; previously failed with NPE on
`externalDbName.toString()` during AdminOperationSerializer write).
- [x] Verified backward compatibility: the v44 schema is forward-compatible
with v43 because the new fields have defaults; downgrading a process
that writes v44 records to one that reads v43 records works as long
as both sides treat unknown fields as ignorable, which Avro does for
record fields with defaults.
Activating the v44 / v99 schemas changes what the controller writes to ZK
and what controllers exchange via the admin topic. All Venice services
must be redeployed at this version before any caller starts setting
`storageMode` / `externalStorageReadMode` / `externalDbName` /
`externalTableName` to non-default values, otherwise a v43-only service
performing a read-modify-write could silently drop the new fields.
- [x] No (defaults preserve existing behavior; new fields are inert until
callers opt in).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Problem Statement
Fast Client consumers that wrap the Venice client (e.g. shadow-clients
maintaining a parallel cache, version-aware routing layers) currently have
no way to be notified when the metadata refresh observes a new current
serving version. They have to poll `getCurrentStoreVersion()` from outside
the refresh loop, which is racy with respect to the rest of the cache
(replicas, partition count, schemas) being updated atomically inside
`RequestBasedMetadata.updateCache`.
## Solution
Add a registration-style callback that fires from inside the refresh loop
after the new current version has been committed to the local cache, so
listeners may safely read the rest of the cache and observe the new value.
- New `StoreVersionSwitchListener` functional interface:
`onVersionSwitch(int previousVersion, int newVersion)`. First refresh
after client start delivers `previousVersion = -1` as a sentinel.
- `StoreMetadata` gains `register/unregisterVersionSwitchListener` with
`default {}` no-op bodies so existing test fakes / mocks aren't forced
to implement them.
- `AbstractStoreMetadata` owns the listener list (`CopyOnWriteArrayList`
for rare-writer / frequent-reader access) and exposes a protected
`fireVersionSwitch(prev, next)` that catches per-listener `Throwable`
and logs at ERROR, so one buggy listener cannot break the refresh
thread or starve sibling listeners.
- `RequestBasedMetadata.updateCache` captures `currentVersion.get()`
immediately before `.set(fetchedCurrentVersion)`, then calls
`fireVersionSwitch(previousVersion, fetchedCurrentVersion)` after the
stat update. The fire happens only on the branch that actually accepts
a new current version (i.e. when `whetherToSwitchToFetchedCurrentVersion`
is true), so transitions that get deferred for partition-resource
readiness do not generate spurious callbacks.
### Code changes
- [ ] Added new code behind a config. (no)
- [ ] Introduced new log lines. (one ERROR-level log on listener exception in
`AbstractStoreMetadata.fireVersionSwitch`. Bounded by listener count, fires
only on bad listener — no rate limiting needed.)
### Concurrency-Specific Checks
- [x] No race conditions: listeners are stored in a `CopyOnWriteArrayList`; the
iterator in `fireVersionSwitch` is snapshot-stable across concurrent
register/unregister calls.
- [x] Synchronization: registration is intrinsically thread-safe via
`CopyOnWriteArrayList.addIfAbsent` / `remove`; fire walks the snapshot
iterator without holding any lock.
- [x] No blocking calls inside critical sections — `fireVersionSwitch` is
called from `RequestBasedMetadata.updateCache` which is `synchronized`,
but the listener iteration itself acquires no further locks. Listener
contracts (Javadoc) require short, non-blocking work; exceptions are
isolated per listener.
- [x] Thread-safe collection used (`CopyOnWriteArrayList`).
- [x] Validated proper exception handling — per-listener `try/catch (Throwable)`
ensures one bad listener cannot kill the refresh thread.
## How was this PR tested?
- [x] New unit tests added: `AbstractStoreMetadataTest` (8 cases: register+fire,
no-listeners-safe, multi-listener fan-out, exception isolation, unregister,
dedup on double-register, null-reject on register, null/unknown tolerance
on unregister).
- [x] Modified or extended existing tests: yes — `RequestBasedMetadataTest`
gains two wiring tests (listener fires on initial `start()` with sentinel
`-1`; no fire when a subsequent `updateCache` returns the same version).
- [x] Verified backward compatibility: yes — the two new `StoreMetadata` methods
are `default` no-ops, so existing implementations of `StoreMetadata`
compile and run unchanged.
## Does this PR introduce any user-facing or breaking changes?
- [x] No.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
e8d4541 to
17f1a2e
Compare
Comment on lines
+64
to
+72
| * <p>Default implementations of {@link StoreMetadata} (e.g. test fakes) treat this as a no-op. | ||
| */ | ||
| default void registerVersionSwitchListener(StoreVersionSwitchListener listener) { | ||
| } | ||
|
|
||
| /** | ||
| * Remove a previously registered version-switch listener. No-op if the listener was never registered. | ||
| */ | ||
| default void unregisterVersionSwitchListener(StoreVersionSwitchListener listener) { |
Comment on lines
+119
to
+121
| * Concrete subclass of {@link AbstractStoreMetadata} used purely to drive the listener-plumbing tests. All | ||
| * methods unrelated to version-switch notification are stubbed to throw, which guarantees that any future | ||
| * accidental dependency on them in a listener test would fail loudly rather than silently. |
| /** | ||
| * Store-level read routing for clients backed by both Venice local storage and a configured external storage | ||
| * system. Defaults to {@link ExternalStorageReadMode#VENICE_ONLY} (no external-storage involvement). Mirrors the | ||
| * {@code externalStorageReadMode} field staged on {@code StoreProperties} (StoreMetaValue v44) by PR #2814. |
Comment on lines
+334
to
+335
| * Defaults to {@link StorageMode#INTERNAL}. Mirrors the {@code storageMode} field staged on {@code StoreVersion} | ||
| * (StoreMetaValue v44) by PR #2814. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem Statement
Fast Client consumers that wrap the Venice client (e.g. shadow-clients maintaining a parallel cache, version-aware routing layers) currently have no way to be notified when the metadata refresh observes a new current serving version. They have to poll
getCurrentStoreVersion()from outside the refresh loop, which is racy with respect to the rest of the cache (replicas, partition count, schemas) being updated atomically insideRequestBasedMetadata.updateCache.Solution
Add a registration-style callback that fires from inside the refresh loop after the new current version has been committed to the local cache, so listeners may safely read the rest of the cache and observe the new value.
New
StoreVersionSwitchListenerfunctional interface:onVersionSwitch(int previousVersion, int newVersion). First refresh after client start deliverspreviousVersion = -1as a sentinel.StoreMetadatagainsregister/unregisterVersionSwitchListenerwithdefault {}no-op bodies so existing test fakes / mocks aren't forced to implement them.AbstractStoreMetadataowns the listener list (CopyOnWriteArrayListfor rare-writer / frequent-reader access) and exposes a protectedfireVersionSwitch(prev, next)that catches per-listenerThrowableand logs at ERROR, so one buggy listener cannot break the refresh thread or starve sibling listeners.RequestBasedMetadata.updateCachecapturescurrentVersion.get()immediately before.set(fetchedCurrentVersion), then callsfireVersionSwitch(previousVersion, fetchedCurrentVersion)after the stat update. The fire happens only on the branch that actually accepts a new current version (i.e. whenwhetherToSwitchToFetchedCurrentVersionis true), so transitions that get deferred for partition-resource readiness do not generate spurious callbacks.Code changes
AbstractStoreMetadata.fireVersionSwitch. Bounded by listener count, fires only on bad listener — no rate limiting needed.Concurrency-Specific Checks
CopyOnWriteArrayList; the iterator infireVersionSwitchis snapshot-stable across concurrent register/unregister calls.CopyOnWriteArrayList.addIfAbsent/remove; fire walks the snapshot iterator without holding any lock.fireVersionSwitchis called fromRequestBasedMetadata.updateCachewhich issynchronized, but the listener iteration itself acquires no further locks. Listener contracts (Javadoc) require short, non-blocking work; exceptions are isolated per listener.CopyOnWriteArrayList).try/catch (Throwable)ensures one bad listener cannot kill the refresh thread.How was this PR tested?
AbstractStoreMetadataTest(8 cases: register+fire, no-listeners-safe, multi-listener fan-out, exception isolation, unregister, dedup on double-register, null-reject on register, null/unknown tolerance on unregister).RequestBasedMetadataTestgains two wiring tests (listener fires on initialstart()with sentinel-1; no fire when a subsequentupdateCachereturns the same version).StoreMetadatamethods aredefaultno-ops, so existing implementations ofStoreMetadatacompile and run unchanged.Does this PR introduce any user-facing or breaking changes?
Middle of a 3-PR stack:
[venice-common][protocol] Java accessors for storageMode + externalStorageReadMode + external-table identity([venice-common][controller][compact] Activate StoreMetaValue v44 / AdminOperation v99 + Java accessors #2817) — independent of this PR[fast-client] Store-config-change listener on StoreMetadata— depends on [venice-common][controller][compact] Activate StoreMetaValue v44 / AdminOperation v99 + Java accessors #2817 forExternalStorageReadMode