-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-22380 break circle replication when doing bulkload #494
Conversation
Also need to work on some javadocs for modified interfaces. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
Fixed failed UTs. Added IT to validate the bulk replication loop condition. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
OK, we have added the UT... Will take a look |
💔 -1 overall
This message was automatically generated. |
HFileReplicator hFileReplicator = | ||
new HFileReplicator(this.provider.getConf(this.conf, replicationClusterId), | ||
if(bulkLoadsPerClusters != null) { | ||
for (String clusterId : bulkLoadsPerClusters.keySet()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the reported findbugs issue: We do need to iterate over the keySet, as we do need the clusterId further to forward it to HFileReplicator call.
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
…you can replace a few request.getClusterId() calls.'
…er2. This is enough to fail the test, since we do bulk load on cluster1, cluster1 replicates to cluster2, who in turn replicates to cluster3. As we only track the originator id (cluster1), we won't have the infinite loop between cluster1 and cluster2, but cluster2 and cluster3 will keep replicating between each other indefinitely.
💔 -1 overall
This message was automatically generated. |
…we can track all clusters that had already processed a given bulk load request. Needed to change also WAL proto for bulk load event, as well as additional classes logic to now handle a list of ids, instead of a single one.
Also replaced these sleeps by a CountDownLatch being updated by a _postBulkLoadHFile_ CP method in this test, to help prevent potential flakyness, as we can set a huge timeout in the latch.await, without worryig too much with penalysing fast runs.
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
Please either make the findbugs correction or annotate things to ack the complaint. |
On the reported findbugs issue: We do need to iterate over the keySet, as we do need the clusterId further to forward it to HFileReplicator call. What do you mean by annotate it @busbey ? Commenting on that line? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only had one question, not sure if String vs UUID was intentional or not. (See my comment)
Otherwise looks good!
HFileReplicator hFileReplicator = | ||
new HFileReplicator(this.provider.getConf(this.conf, replicationClusterId), | ||
if(bulkLoadsPerClusters != null) { | ||
for (List<String> clusterIds : bulkLoadsPerClusters.keySet()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason String is used for clusterId?
I see UUID used elsewhere in this class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That came up as well on the separate PR I created for master branch. Pasting my explanation made there:
HRegionServer already keeps a String clusterId (not UUID), which I'm using to check against the incoming bulk load request. Converting the request List to List when clusterId is already represented as String didn't seem to give much gain here. Maybe we should raise a separate jira to convert all existing String representations of cluster id into UUID, and including this one as part of this work?
We agreed with that suggestion to open a separate jira to address replacing existing String clusterId to UUID.
You can annotate the implementation so that findbugs doesn't report an error. I think the lowest level you can do it for is the method. In this case though, I believe the complaint from findbugs is that we should be using |
Thanks for the tip, @busbey! Had gone with that on the last commit. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thanks @wchevreuil , LGTM!
Signed-off-by: stack <stack@apache.org> Signed-off-by: Andrew Purtell <apurtell@apache.org> Signed-off-by: Norbert Kalmar <nkalmar@cloudera.com>
Creating an initial PR on branch-2. Had done some manual tests within two single node cluster and it seems to work. Am gonna work on UTs tomorrow and update this PR.