fix(server): align cache event actions in legacy EventHub path#3017
fix(server): align cache event actions in legacy EventHub path#3017dpol1 wants to merge 2 commits intoapache:masterfrom
Conversation
- Align legacy cache invalidation producers and listeners on ACTION_INVALID and ACTION_CLEAR, removing the obsolete ACTION_INVALIDED/ACTION_CLEARED constants. - Add EventHub.notifyExcept(...) so cache transactions and the cache notifier bridge can avoid re-processing their own local listener while still delivering events to other listeners. - Track registered graph/schema cache listeners per graph so notifyExcept(...) uses the listener instance actually registered on the EventHub, including multi-transaction cases where later transactions reuse the first listener. - Update cache notifier forwarding to prevent local RPC bridge loops after action names are unified. - Add regression coverage for notifyExcept semantics, graph/schema action names, listener teardown/re-registration, and notifier no-loop behavior.
|
@imbajin I took the liberty to quickly fix a flaky test which failed a workflow twice - if you think is out of the scope with this PR I' ll revert it - Actually, I just noticed the first commit is all greens |
There was a problem hiding this comment.
Pull request overview
This PR fixes a legacy cache invalidation bug where cache event producers emitted past-tense action names (invalided/cleared) that didn’t match the actions local listeners (and some other paths) were expecting (invalid/clear), causing silent drops and stale caches in certain deployments. It also introduces EventHub.notifyExcept(...) to prevent cache-notification bridges from re-consuming their own re-emitted events.
Changes:
- Remove legacy
Cache.ACTION_INVALIDED/Cache.ACTION_CLEAREDand align all cache event producers/listeners toCache.ACTION_INVALID/Cache.ACTION_CLEAR. - Add
EventHub.notifyExcept(event, ignoredListener, args...)and routenotify(...)through the same internal implementation. - Track per-graph cache listeners in static registries and use
notifyExcept(...)to avoid self-delivery/bridge loops; add/adjust unit tests for the new semantics.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| hugegraph-server/hugegraph-core/src/main/java/org/apache/hugegraph/backend/cache/Cache.java | Removes past-tense cache action constants to enforce a single action vocabulary. |
| hugegraph-commons/hugegraph-common/src/main/java/org/apache/hugegraph/event/EventHub.java | Adds notifyExcept(...) and implements ignored-listener filtering in the notifier loop. |
| hugegraph-commons/hugegraph-common/src/test/java/org/apache/hugegraph/unit/event/EventHubTest.java | Adds coverage for notifyExcept(...) (including ANY_EVENT listeners). |
| hugegraph-server/hugegraph-core/src/main/java/org/apache/hugegraph/StandardHugeGraph.java | Updates legacy notifier bridge to present-tense actions and uses notifyExcept(...) to avoid loops. |
| hugegraph-server/hugegraph-core/src/main/java/org/apache/hugegraph/backend/tx/GraphTransaction.java | Replaces the boolean guard with a per-graph cache listener registry map. |
| hugegraph-server/hugegraph-core/src/main/java/org/apache/hugegraph/backend/cache/CachedSchemaTransaction.java | Emits present-tense actions and uses per-graph listener registry + notifyExcept(...) on emits. |
| hugegraph-server/hugegraph-core/src/main/java/org/apache/hugegraph/backend/cache/CachedGraphTransaction.java | Emits present-tense actions and uses per-graph listener registry + notifyExcept(...) on emits. |
| hugegraph-server/hugegraph-test/src/main/java/org/apache/hugegraph/unit/cache/CachedSchemaTransactionTest.java | Adds tests validating action name alignment, listener registry lifecycle, and bridge no-loop behavior. |
| hugegraph-server/hugegraph-test/src/main/java/org/apache/hugegraph/unit/cache/CachedGraphTransactionTest.java | Adds tests validating action name alignment and “non-owner close” listener behavior. |
| hugegraph-server/hugegraph-test/src/main/java/org/apache/hugegraph/unit/cache/CacheManagerTest.java | Speeds up expiration tick test via reflective rescheduling and cancels the ad-hoc task. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| private void unlistenChanges() { | ||
| String graphName = this.params().spaceGraphName(); | ||
| if (graphCacheListenStatus.remove(graphName) != null) { | ||
| if (this.registeredCacheEventListener) { | ||
| EventHub graphEventHub = this.params().graphEventHub(); | ||
| graphEventHub.unlisten(Events.CACHE, this.cacheEventListener); | ||
| if (graphCacheEventListeners.remove(graphName, | ||
| this.cacheEventListener)) { | ||
| graphEventHub.unlisten(Events.CACHE, this.cacheEventListener); | ||
| } | ||
| this.registeredCacheEventListener = false; |
| private void unlistenChanges() { | ||
| // Unlisten store event | ||
| this.store().provider().unlisten(this.storeEventListener); | ||
|
|
||
| // Unlisten cache event | ||
| EventHub schemaEventHub = this.params().schemaEventHub(); | ||
| schemaEventHub.unlisten(Events.CACHE, this.cacheEventListener); | ||
| if (this.registeredCacheEventListener) { | ||
| schemaEventHub.unlisten(Events.CACHE, this.cacheEventListener); | ||
| SCHEMA_CACHE_EVENT_LISTENERS.remove(this.params().spaceGraphName(), | ||
| this.cacheEventListener); | ||
| this.registeredCacheEventListener = false; | ||
| } |
imbajin
left a comment
There was a problem hiding this comment.
Fix looks correct and the new tests are solid. Two concerns worth tracking: (1) storeEventListenStatus has the same ownership bug this PR fixes for the cache listener, acknowledged in the test's finally block — worth a follow-up; (2) the owner-first-close scenario for the cache listener deserves an explicit test and/or a ref-counted ownership model.
| // Closing the secondary transaction exercises the pre-existing | ||
| // store listener guard too; restore it so teardown unregisters the | ||
| // primary transaction's store listener. | ||
| storeListeners.putIfAbsent(graphName, true); |
There was a problem hiding this comment.
putIfAbsent is a workaround for a pre-existing bug in storeEventListenStatus that mirrors exactly the one this PR fixes for graphCacheEventListeners. In unlistenChanges():
if (storeEventListenStatus.remove(graphName) != null) {
this.store().provider().unlisten(this.storeEventListener);
}when a non-owner transaction closes, it (a) pops the map entry even though it doesn't own the registration, (b) unlistens its own per-instance storeEventListener which was never registered on the provider (no-op), and (c) leaves the primary's real store listener registered but un-tracked — so the owner's later close silently skips the unlisten and leaks it.
Suggest a follow-up PR that applies the same ref-counted / owner-aware pattern introduced here for the cache listener (i.e. a Map<String, EventListener> + registeredStoreEventListener flag). Keeping as TODO for now since it's out of scope.
| this.cacheEventListener)) { | ||
| graphEventHub.unlisten(Events.CACHE, this.cacheEventListener); | ||
| } | ||
| this.registeredCacheEventListener = false; |
There was a problem hiding this comment.
Trace:
- T1 opens (owner), registers
listener, puts it ingraphCacheEventListeners. - T2 opens (non-owner), caches
this.cacheEventListener = <T1's listener>. - T1 closes →
unlistenChanges()removes the map entry and callshub.unlisten(...). - T2 still alive. Its
notifyChanges(...)callsnotifyExcept(hub, <T1's listener>, ...). That listener is no longer on the hub, so the exclusion is a no-op and the emit fans out to whatever else is registered (e.g.AbstractCacheNotifier's bridge). Meanwhile T2 has no working local cache listener at all — the per-graph listener slot is empty until a new transaction repopulates it.
Two questions:
- Is T1-closes-before-T2 a realistic lifecycle in practice (e.g. thread-local transactions on a long-lived graph)? If yes, a ref-counted ownership holder (unregister only when the last referent closes) would be safer than "first-wins" ownership.
- Either way, worth adding a companion test to
testClosingNonOwnerKeepsGraphCacheListenerRegisteredthat exercises the owner-first-close path and asserts whatever the intended semantics are, so the invariant is captured.
There was a problem hiding this comment.
Both fair, thanks.
-
storeEventListenStatus- same ownership bug, theputIfAbsentinfinallyis a workaround. Follow-up PR with whatever ownership model we land on. -
Owner-first-close - realistic, yeah. Two options: ref-counted holder, or stop unregistering on tx close (cache is already a
CacheManagersingleton perspaceGraphName, so listener lifetime = graph, not tx; cleanup moves to graph dispose). I'd lean toward the second - kills the race by killing the owner concept.
Wouldn't add a test pinning current behavior - codifies a limitation. Real test needs the fix first.
#3012 is the action-name + notifyExcept fix, so my preference is to merge this as-is and do ownership in a follow-up. Can fold either fix in here instead - your call.
Purpose of the PR
Cache producers in the legacy
EventHubpath were emitting past-tense actionnames (
invalided/cleared) while local cache listeners only matched thepresent-tense actions (
invalid/clear). On single-node deployments, orwhenever the cache RPC bridge in
StandardHugeGraph.AbstractCacheNotifierwasnot active, those local cache invalidation events were silently dropped: no
error, no warning, just stale cache.
The fix aligns producers, listeners, and the bridge on the present-tense
constants, removes the obsolete past-tense ones, and wires
notifyExcept(...)so the bridge does not loop back into itself once both ends speak the same
action name.
Main Changes
Cache.ACTION_INVALIDED/Cache.ACTION_CLEARED. All producers(
CachedGraphTransaction,CachedSchemaTransaction) now emitCache.ACTION_INVALID/Cache.ACTION_CLEAR, matching what local listenersand
RaftContextalready expect.StandardHugeGraph.AbstractCacheNotifierto listen on thepresent-tense actions and to forward through
EventHub.notifyExcept(...)sothe bridge does not re-receive its own re-emit.
EventHub.notifyExcept(event, ignoredListener, args...)so a caller cannotify everyone on the hub except the listener that originated the event.
The signature is documented as a public API on
EventHub,notify(...)is unchanged and now delegates internally.
notifyExcept(...)from the cache transactions and fromAbstractCacheNotifier, with the registered cache listener as the ignoredone, so producers do not feed their own listener and the RPC re-emit does
not loop the bridge.
GraphTransaction.graphCacheEventListenersand in a matching map insideCachedSchemaTransaction, with owner-aware teardown on close. This is theminimum needed for
notifyExcept(...)to exclude the listener that isactually registered on the hub: with the old per-instance lambda +
containsListener(...)/ boolean guard pattern, only the first transactionper graph registered a listener, while later transactions could call
notifyExcept(...)with a lambda that was never registered. In that case thereal registered listener was not excluded and could still process the
transaction's own local emit.
CachedSchemaTransactionV2is intentionally not touched here. That class wasjust adjusted in #3011 for HStore/V2 schema cache propagation through
MetaManager. The remainingcontainsListener(...)guard there is a relatedlistener-lifecycle concern, not the action-name mismatch fixed in this PR; it
should be handled in a follow-up that audits the remaining V2 path and, if
needed, moves cache listener ownership to a ref-counted holder keyed by graph
identity and
EventHub.Verifying these changes
EventHubTest#testNotifyExceptcovers the newnotifyExcept(...)semantics, including
ANY_EVENTlisteners and the ignored-listenerexclusion.
CachedSchemaTransactionTest#testLegacySchemaChangeEmitsActionInvalidasserts the schema producer now emits
ACTION_INVALIDend-to-end.CachedSchemaTransactionTest#testListenerRegistryClearedOnCloseAndRebuiltOnReopenasserts that closing a graph drops its entry from the per-graph
listener registry and that a subsequent transaction registers a fresh
listener instead of reusing the closed reference.
CachedSchemaTransactionTest#testCacheNotifierLocalEventCallsProxyOnceand
testCacheNotifierRpcInvalidDoesNotLoopToProxycover the bridgeno-loop behavior.
CachedGraphTransactionTest#testClearCacheEmitsActionClearassertsgraph cache clears now emit
ACTION_CLEAR.CachedGraphTransactionTest#testVertexMutationEmitsActionInvalidasserts graph vertex mutations now emit
ACTION_INVALID.CachedGraphTransactionTest#testClosingNonOwnerKeepsGraphCacheListenerRegisteredasserts that closing a graph transaction which reused an existing
per-graph listener does not unregister the shared graph cache listener.
Does this PR potentially affect the following parts?
EventHub.notifyExcept(...)is a new public method; existingnotify(...)behavior is unchanged.
Documentation Status
Doc - TODODoc - DoneDoc - No Need