Use updaters to update read messages to ZK#1002
Conversation
helix-core/src/main/java/org/apache/helix/messaging/handling/HelixTaskExecutor.java
jiajunwang
left a comment
I think the design is a little bit awkward. Please check out my comments and see if that would be an easier design.
helix-core/src/main/java/org/apache/helix/messaging/DefaultMessagingService.java
jiajunwang
left a comment
One minor thing, if the unexpected override happens, shall we log a warning?
After all, this PR fixes the symptom, not the root cause. Let's ensure that we still have the capability to debug the root cause.
I moved all your comments on periodic message refresh to #1000. I decided to split the PR since it was trying to resolve two different issues. Sorry for the confusion; please refer to the link above for the replies to your comments.
    @Override
    public ZNRecord update(ZNRecord currentData) {
      if (currentData == null) {
        LOG.warn("Message {} targets at {} has already been removed before it is set as READ on instance {}", msg.getId(), msg.getTgtName(), instanceName);
@jiajunwang I have added the logging. Can you please check if the message is good?
This PR is ready to be merged, approved by @dasahcc and @jiajunwang.
     * We use the updater to avoid race condition between writing message to zk as READ state and removing message after ST is done
     * If there is no message at this path, meaning the message is removed so we do not write the message
     */
    updaters.add(new DataUpdater<ZNRecord>() {
One optimization here might be to create just one updater instead of one per message. You can define the update logic in the updater, i.e., read the message and mark the read field as "read".
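To illustrate the suggestion above, here is a minimal sketch of a single stateless updater shared across all message paths. It does not use Helix classes: `SharedUpdaterSketch`, `MARK_READ`, `updateChildren`, `markAllRead`, and the `"state"` field are hypothetical stand-ins, and the store is a plain in-memory map rather than ZK. The assumption is that the updater can derive everything it needs from the record it is handed, so it does not need to capture a per-message object.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Illustrative sketch only; none of these names are Helix APIs.
final class SharedUpdaterSketch {
  // One stateless updater: it reads the current record and marks it READ.
  // Returning null means "the node is gone, skip the write".
  static final UnaryOperator<Map<String, String>> MARK_READ = current -> {
    if (current == null) {
      return null;
    }
    Map<String, String> updated = new HashMap<>(current);
    updated.put("state", "READ"); // "state" is an assumed field name
    return updated;
  };

  // Hypothetical batch update: applies updaters.get(i) to paths.get(i)
  // and returns the paths that were actually written.
  static List<String> updateChildren(Map<String, Map<String, String>> store,
      List<String> paths, List<UnaryOperator<Map<String, String>>> updaters) {
    List<String> written = new ArrayList<>();
    for (int i = 0; i < paths.size(); i++) {
      String path = paths.get(i);
      Map<String, String> updated = updaters.get(i).apply(store.get(path));
      if (updated != null) {
        store.put(path, updated);
        written.add(path);
      }
    }
    return written;
  }

  // Reuses the same updater instance for every path.
  static List<String> markAllRead(Map<String, Map<String, String>> store,
      List<String> paths) {
    return updateChildren(store, paths,
        Collections.nCopies(paths.size(), MARK_READ));
  }
}
```

One trade-off of this shape: a shared updater has no per-message context, so a warning logged for a removed message could only include what the caller passes in separately, such as the path.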
Use updaters to update messages with READ state to ZK.
Issues
#1001
Description
This PR replaces the direct set with an updater. The updater first checks whether a message still exists at the path and writes the "READ" state message to ZK only if it does, which avoids writing back a message that has already been removed.
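The difference between a blind set and an updater can be sketched with an in-memory stand-in for the ZK store. This is an illustration under assumptions, not the code in this PR: `ReadStateSketch`, its methods, and the `"state"` field are hypothetical names, and the updater is modeled as a function that receives the current record (or `null` if the node is gone) and returns `null` to skip the write.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Illustrative sketch only: an in-memory stand-in for the ZK-backed store.
// None of these names are Helix APIs.
final class ReadStateSketch {
  private final Map<String, Map<String, String>> nodes = new HashMap<>();

  synchronized void create(String path, Map<String, String> record) {
    nodes.put(path, new HashMap<>(record));
  }

  synchronized void remove(String path) {
    nodes.remove(path);
  }

  synchronized Map<String, String> get(String path) {
    return nodes.get(path);
  }

  // Blind write: stores the record even if the node was removed meanwhile.
  synchronized void set(String path, Map<String, String> record) {
    nodes.put(path, new HashMap<>(record));
  }

  // Updater-style write: the updater sees the current record (null if the
  // node does not exist); a null result means nothing is written.
  synchronized boolean update(String path,
      UnaryOperator<Map<String, String>> updater) {
    Map<String, String> updated = updater.apply(nodes.get(path));
    if (updated == null) {
      return false;
    }
    nodes.put(path, updated);
    return true;
  }

  // Marks an existing message as READ; skips messages already removed.
  static Map<String, String> markRead(Map<String, String> current) {
    if (current == null) {
      return null; // message already removed, do not write it back
    }
    Map<String, String> updated = new HashMap<>(current);
    updated.put("state", "READ"); // "state" is an assumed field name
    return updated;
  }

  public static void main(String[] args) {
    ReadStateSketch zk = new ReadStateSketch();
    String path = "/messages/m1";
    Map<String, String> message = new HashMap<>();
    message.put("state", "NEW");

    // The message is removed before the READ write arrives.
    zk.create(path, message);
    zk.remove(path);
    zk.set(path, markRead(message)); // blind set resurrects the message
    System.out.println("after set, node exists: " + (zk.get(path) != null));

    zk.remove(path);
    boolean written = zk.update(path, ReadStateSketch::markRead);
    System.out.println("after update, written: " + written);
  }
}
```

In this sketch the existence check and the write happen under one lock, which is what closes the window between "message removed" and "READ state written"; a real ZK-backed store would need its own mechanism for that atomicity.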
Tests
TestHelixTaskExecutor -> testNoWriteReadStateForRemovedMessage
[INFO]
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] TestWorkflowTermination.testWorkflowRunningTimeout:131->verifyWorkflowCleanup:257 expected: but was:
[INFO]
[ERROR] Tests run: 1147, Failures: 1, Errors: 0, Skipped: 0
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:15 h
[INFO] Finished at: 2020-05-12T18:59:45-07:00
Commits
Documentation (Optional)
(Link the GitHub wiki you added)
Code Quality
(helix-style-intellij.xml if IntelliJ IDE is used)