HDFS-17529. RBF: Improve router state store cache entry deletion #6833
💔 -1 overall
This message was automatically generated.
The failed unit tests really look like they're related to the changes, but they aren't. Both tests fail without the patch and seem to have failed for some past MRs already. Edit: the UT failure is due to the Derby version update; 10.17.1.0 is for Java 21 and above.
@ZanderXu Can you take a look when you're free? Thanks!
Thanks @kokonguyen191 for your report. It makes sense.
- It is best to split it into two PRs: one to improve deletion performance and the other to asynchronously delete deletable records.
- Refer to `int remove(Class<T> clazz, Query<T> query)`; `remove(Class<T> clazz, List<Query<T>> queries)` should return the mapping from query to the number of deleted records.
- About the asynchronous deletion, I think it's OK for newRecords to still contain these records and to just delete them with one asynchronous thread. These deleted records will be removed in the next `loadCache`.
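The second point above can be illustrated with a minimal sketch. It is not the Hadoop implementation: `BatchRemoveSketch` is a hypothetical in-memory store, and `java.util.function.Predicate` stands in for the real `Query<T>` type. It only shows the shape of a batch `remove` that reports, per query, how many records were deleted.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical in-memory store; Predicate is an assumed stand-in for Query. */
public class BatchRemoveSketch {
  private final List<String> records = new ArrayList<>();

  public BatchRemoveSketch(List<String> initial) {
    records.addAll(initial);
  }

  /** Removes the records matching one query and returns how many were removed. */
  public int remove(Predicate<String> query) {
    int before = records.size();
    records.removeIf(query);
    return before - records.size();
  }

  /** Batch variant: maps each query to the number of records it removed. */
  public Map<Predicate<String>, Integer> remove(List<Predicate<String>> queries) {
    Map<Predicate<String>, Integer> removed = new LinkedHashMap<>();
    for (Predicate<String> query : queries) {
      removed.put(query, remove(query));
    }
    return removed;
  }

  /** Number of records still in the store. */
  public int size() {
    return records.size();
  }
}
```

Returning a count per query keeps the batch call consistent with the single-query `remove`, and lets the caller see which queries matched nothing.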
@ZanderXu Thanks for the review. I have updated the code and changed the ticket/PR title to cover the deletion part only; I will open another PR for the async part later. I'm a bit confused about point 3. Can you elaborate on that part?
💔 -1 overall
This message was automatically generated.
.../test/java/org/apache/hadoop/hdfs/server/federation/store/TestStateStoreMembershipState.java
...main/java/org/apache/hadoop/hdfs/server/federation/store/driver/impl/StateStoreBaseImpl.java
...java/org/apache/hadoop/hdfs/server/federation/store/driver/impl/StateStoreZooKeeperImpl.java
...n/java/org/apache/hadoop/hdfs/server/federation/store/driver/StateStoreRecordOperations.java
💔 -1 overall
This message was automatically generated.
 *
 * @param <T> Record class of the records.
 * @param records Records to be removed.
 * @return Map of record -> boolean indicating if the record has being removed successfully.
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-javadoc-plugin:3.0.1:javadoc-no-fork (default-cli) on project hadoop-hdfs-rbf: An error has occurred in Javadoc report generation:
[ERROR] Exit code: 1 - javadoc: warning - You have specified the HTML version as HTML 4.01 by using the -html4 option.
[ERROR] The default is currently HTML5 and the support for HTML 4.01 will be removed
[ERROR] in a future release. To suppress this warning, please ensure that any HTML constructs
[ERROR] in your comments are valid in HTML5, and remove the -html4 option.
[ERROR] /home/jenkins/jenkins-agent/workspace/hadoop-multibranch_PR-6833/ubuntu-focal/src/hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/store/driver/StateStoreRecordOperations.java:136: error: bad use of '>'
[ERROR] * @return Map of record -> boolean indicating any entries being deleted by this record.
[ERROR] ^
[ERROR] javadoc: warning - invalid usage of tag >
@kokonguyen191 It seems that `->` is not allowed in the javadoc.
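For reference, one common way to keep an arrow in a Javadoc comment without tripping the `bad use of '>'` check is to wrap it in `{@literal ...}` (writing `-&gt;` is another option). The sketch below is illustrative only: `JavadocArrowSketch` and its method body are hypothetical, and the thread does not say which fix the PR actually applied.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical class used only to show a Javadoc comment that avoids a raw '>'. */
public class JavadocArrowSketch {
  /**
   * Pretends to remove the given records.
   *
   * @param <T> Record class of the records.
   * @param records Records to be removed.
   * @return Map of record {@literal ->} boolean indicating whether the record was removed.
   */
  public static <T> Map<T, Boolean> removeMultiple(List<T> records) {
    Map<T, Boolean> result = new LinkedHashMap<>();
    for (T record : records) {
      // The sketch treats every removal as successful.
      result.put(record, Boolean.TRUE);
    }
    return result;
  }
}
```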
Fixed
LGTM + 1
💔 -1 overall
This message was automatically generated.
if (!toDeleteRecords.isEmpty()) {
  for (Map.Entry<R, Boolean> entry : getDriver().removeMultiple(toDeleteRecords).entrySet()) {
    if (entry.getValue()) {
      deletedRecords.add(entry.getKey());
If this is changed to `newRecords.remove(entry.getKey())`, we can remove `deletedRecords`.
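A rough sketch of what this suggestion could look like, under stated assumptions: `CachePruneSketch` is hypothetical, records are plain strings, and the static `removeMultiple` here is a stand-in for the driver call that arbitrarily fails for records whose name starts with `locked`. The point is only that successfully deleted entries are dropped from the new-records list directly, with no intermediate `deletedRecords` collection.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of pruning deleted records without a deletedRecords list. */
public class CachePruneSketch {
  /** Assumed stand-in for the driver: deletion fails for records starting with "locked". */
  public static Map<String, Boolean> removeMultiple(List<String> records) {
    Map<String, Boolean> result = new LinkedHashMap<>();
    for (String record : records) {
      result.put(record, !record.startsWith("locked"));
    }
    return result;
  }

  /** Returns a copy of newRecords without the records that were deleted successfully. */
  public static List<String> prune(List<String> newRecords, List<String> toDeleteRecords) {
    List<String> remaining = new ArrayList<>(newRecords);
    if (!toDeleteRecords.isEmpty()) {
      for (Map.Entry<String, Boolean> entry : removeMultiple(toDeleteRecords).entrySet()) {
        if (entry.getValue()) {
          // Drop the record right away instead of collecting it for a later pass.
          remaining.remove(entry.getKey());
        }
      }
    }
    return remaining;
  }
}
```

Records whose deletion failed stay in the list, so a later refresh can retry them.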
🎊 +1 overall
This message was automatically generated.
Merged. Thanks @kokonguyen191 for your contribution.
Description of PR
The current implementation of router state store updates is quite inefficient, so much so that when routers were removed and a lot of NameNodeMembership records were deleted in a short burst, the deletions triggered router safemode in our cluster and caused a lot of trouble.
This ticket aims to improve the deletion process for the ZK state store implementation. The other half of the router state store improvement is at HDFS-17532.
How was this patch tested?
UT