HHH-12457 Local Infinispan read-write caches become stale on rollback #6010
Needs rebase
@rvansa Needs rebasing
@galderz Done
Some comments
@@ -21,7 +21,7 @@
 */
 public abstract class InvalidationCacheAccessDelegate implements AccessDelegate {
 protected static final InfinispanMessageLogger log = InfinispanMessageLogger.Provider.getLog( InvalidationCacheAccessDelegate.class );
-protected static final boolean TRACE_ENABLED = log.isTraceEnabled();
+protected static final boolean trace = log.isTraceEnabled();
Unnecessary ;)
Why? trace is used later on...
Unnecessary refactoring of the name ;)
@@ -32,7 +30,6 @@ public void beforeCompletion() {}

 @Override
 public void afterCompletion(int status) {
-log.tracef("After completion callback with status %d", status);
Why remove log message?
I didn't find it too useful, and removing it keeps this symmetric with LocalInvalidationSynchronization.
I found it useful when debugging ISPN-8026. PFLValidator's endInvalidatingKey can be called from multiple places. This log message clarified for me what the origin of the end invalidation message was.
Ok, I'll put that back then to both places.
// Exceptions in #afterCompletion() are silently ignored, since the transaction
// is already committed in DB. However we must not return until we update the cache.
FutureUpdate futureUpdate = new FutureUpdate(uuid, region.nextTimestamp(), success ? this.value : null);
for (;;) {
Could this get noisy? Couldn't we just remove the entry from the cache if we have any issues (and log those quietly) rather than retrying until it works?
A removal is as likely to fail as any other update.
Right.
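To illustrate the point settled above, here is a minimal sketch of a retry-until-applied loop. It is not the code from this PR: the class name, `applyUntilSuccessful`, and the simulated failing update are all hypothetical, and the sketch only assumes that an update can fail transiently by throwing.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

public class RetryUntilAppliedDemo {

    // Retries the update until it completes without throwing and returns
    // the number of attempts made. A removal used as a fallback would go
    // through the same failure-prone path, hence the loop.
    static int applyUntilSuccessful(Callable<Void> update) {
        int attempts = 0;
        for (;;) {
            attempts++;
            try {
                update.call();
                return attempts;
            } catch (Exception e) {
                // Illustration only: real code would log at a quiet level here.
            }
        }
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Hypothetical update that fails twice before it succeeds.
        int attempts = applyUntilSuccessful(() -> {
            if (calls.incrementAndGet() < 3) {
                throw new IllegalStateException("simulated transient failure");
            }
            return null;
        });
        System.out.println("applied after " + attempts + " attempts");
    }
}
```

In this sketch the loop ends only when the update goes through, which mirrors the comment in the hunk that the callback must not return until the cache is updated.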
sync.registerBeforeCommit(future);
} else {
log.trace("Removal was not applied immediatelly, waiting.");
future.join();
How long does this wait for? Endlessly?
As long as any sync command would - a combination of locking timeouts and rpc timeouts. But not endlessly.
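A rough sketch of why `join()` is bounded in this situation: `CompletableFuture.join()` itself has no timeout, so the bound has to come from whatever completes the future. The example below is not Infinispan code. The class, `commandThatTimesOut`, and the timer are hypothetical stand-ins in which a scheduled task plays the role of a lock or RPC timeout failing the command.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedJoinDemo {

    // Hypothetical stand-in for a cache command that never gets applied:
    // the future is failed by a timer, playing the role of a lock/RPC timeout.
    static CompletableFuture<Void> commandThatTimesOut(ScheduledExecutorService timer, long timeoutMillis) {
        CompletableFuture<Void> future = new CompletableFuture<>();
        timer.schedule(() -> {
            future.completeExceptionally(new TimeoutException("simulated lock timeout"));
        }, timeoutMillis, TimeUnit.MILLISECONDS);
        return future;
    }

    // Blocks in join() and reports how the wait ended.
    static String waitFor(CompletableFuture<Void> future) {
        try {
            future.join();
            return "applied";
        } catch (CompletionException e) {
            // join() wraps the original failure in a CompletionException.
            return "failed: " + e.getCause().getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        try {
            // prints "failed: TimeoutException" after roughly 100 ms
            System.out.println(waitFor(commandThatTimesOut(timer, 100)));
        } finally {
            timer.shutdown();
        }
    }
}
```

The caller never passes a timeout to `join()`. The wait ends because the producer side gives up, which is the behaviour described in the reply above.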
Object task = tasks[i];
if (task instanceof CompletableFuture) {
try {
((CompletableFuture) task).join();
Same thing about this join too
CompletableFuture<?> cf = (CompletableFuture<?>) tasks[i];
if (cf != null) {
try {
cf.join();
Another
log.trace("Tombstone was not applied immediatelly, waiting.");
// Slow path: there's probably a locking conflict and we need to wait
// until the command gets applied (and replicated, too!).
future.join();
Same
Rebased, still on top of #5964 (that should go in first).
Needs rebasing
feb8c27 to 79c7c87
Rebased. There's one test failure in 2LC, though it's not in local mode...
If there's a failure this is not ready.
To be clear: I've spent a lot of time in the past year removing all random failures in the 2LC testsuite. Regardless of whether the failure is related to the change or not, the moment one appears it needs to be resolved there and then. Otherwise the changes don't go in.
@galderz Forum reference: https://developer.jboss.org/message/984624 I am assigned to other tasks right now, but if you could get a trace log from the failing test I could take a peek for a couple of hours.
@rvansa I'll handle this, thx
@Override
public CompletableFuture<Void> invoke(boolean success) {
if (trace) {
log.tracef("After completion callback, success=%d", success);
It should be success=%b
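A small sketch of why the specifier matters, using plain `String.format` rather than the logger. It assumes that `tracef` hands its arguments to `java.util.Formatter`-style formatting, as printf-style logging methods usually do. The class and helper names are made up for the illustration.

```java
import java.util.IllegalFormatConversionException;

public class FormatSpecifierDemo {

    // Formats the message with the given pattern; returns the exception
    // class name instead if the pattern does not accept a boolean argument.
    static String format(String pattern, boolean success) {
        try {
            return String.format(pattern, success);
        } catch (IllegalFormatConversionException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // %d expects an integral type, so a boolean argument is rejected.
        // prints "IllegalFormatConversionException"
        System.out.println(format("After completion callback, success=%d", true));
        // %b prints a boolean as true/false.
        // prints "After completion callback, success=true"
        System.out.println(format("After completion callback, success=%b", true));
    }
}
```

Because the call sits behind the `if (trace)` guard, a mismatch like this would presumably only surface when trace logging is enabled.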
@@ -31,7 +33,9 @@ public InvalidationInvocation(NonTxPutFromLoadInterceptor nonTxPutFromLoadInterc

 @Override
 public CompletableFuture<Void> invoke(boolean success) {
-log.tracef("After completion callback, success? %s", success);
+if (trace) {
+log.tracef("After completion callback, success=%d", success);
It should be success=%b
* Also making sure that we use simple cache and not the async chain directly in local mode
Signed-off-by: Radim Vansa <rvansa@redhat.com>
@pruivo Galder has reservations since we've seen a random failure in the CI on this PR. There's a good chance that it's not related, but Galder gave it red and I was not able to reproduce it.
https://hibernate.atlassian.net/browse/HHH-12457
Backport #5970 is already integrated, but this depends on #5954 and #5964.
https://issues.jboss.org/browse/ISPN-9205, which was introduced in the backport, is fixed here.