
ReplicatedMapProxy - refactored check and retry initialization logic ... #14331

Merged
merged 3 commits into hazelcast:master May 12, 2020

Conversation

netudima
Contributor

…to do it in parallel for different partitions. It is much faster compared to the original way - try to sync a partition and sleep.
Fixes #14330

@mmedenjak mmedenjak added this to the 3.12 milestone Feb 8, 2019
@mmedenjak mmedenjak modified the milestones: 3.12, 3.13 Mar 25, 2019
@mmedenjak mmedenjak modified the milestones: 3.13, 4.0 Apr 17, 2019
@mmedenjak mmedenjak modified the milestones: 4.0, 4.1 Dec 24, 2019
@hazelcast hazelcast deleted 11 comments from devOpsHazelcast Feb 28, 2020
@hazelcast hazelcast deleted 3 comments from devOpsHazelcast Mar 27, 2020
@devOpsHazelcast
Collaborator

devOpsHazelcast commented Apr 5, 2020

CLA assistant check
All committers have signed the CLA.

@hazelcast hazelcast deleted a comment from devOpsHazelcast Apr 6, 2020
@mmedenjak mmedenjak self-requested a review April 6, 2020 10:28
@mmedenjak
Contributor

mmedenjak commented Apr 14, 2020

Hi @netudima! After 1.5 years, we finally managed to take a look at the PR. To be honest, we haven't been good stewards of community PRs, but things are changing around here.

I believe your PR makes perfect sense but there are a couple of small improvements I'd add here. First off, I think you can use BitSet instead of LinkedHashSet for the partition ID storage.

Secondly, since we're now adding support for many partitions (20k, 50k), I think we need to "throttle" loading a bit, so maybe add a local variable that keeps track of how many concurrent loading requests are in flight. For now, let's limit it to something like 100 - we can always change that number later.

And lastly, nitpicking - can you replace the while(true) with something like while (!nonLoadedStores.isEmpty()) to simplify the code?

If you're unable to continue with the PR since a lot of time has passed, we understand. I'll make sure we adopt the PR and continue with it and also mention you in the commit as the original author.
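
A rough sketch of the loop shape these suggestions point towards is below. It is only an illustration of the idea, not the actual proxy code: nonLoadedStores comes from the comment above, while requestDataFromOwner and maxConcurrentRequests are made-up names standing in for the real sync call and the throttle.

import java.util.BitSet;

class FillupSketch {
    // Hypothetical stand-in for "ask the partition owner to sync this partition";
    // returns true once the partition's store is loaded.
    boolean requestDataFromOwner(int partitionId) {
        return true;
    }

    void awaitInitialFillup(int partitionCount) throws InterruptedException {
        final int maxConcurrentRequests = 100;      // throttle suggested above
        BitSet nonLoadedStores = new BitSet(partitionCount);
        nonLoadedStores.set(0, partitionCount);     // every partition starts as "not loaded"

        while (!nonLoadedStores.isEmpty()) {
            int requested = 0;
            for (int id = nonLoadedStores.nextSetBit(0);
                 id >= 0 && requested < maxConcurrentRequests;
                 id = nonLoadedStores.nextSetBit(id + 1)) {
                if (requestDataFromOwner(id)) {
                    nonLoadedStores.clear(id);      // loaded, no further retries needed
                }
                requested++;
            }
            if (!nonLoadedStores.isEmpty()) {
                Thread.sleep(100);                  // back off before the next retry round
            }
        }
    }
}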

@mmedenjak mmedenjak removed their request for review April 14, 2020 08:29
@netudima
Contributor Author

Hi @mmedenjak, thank you for the feedback :-). I will take a look and try to address your comments within 1-2 weeks.

@mmedenjak
Contributor

mmedenjak commented Apr 17, 2020

Looks like the issue is checkstyle:
10:51:01 [ERROR] /home/jenkins/jenkins_slave/workspace/Hazelcast-pr-builder_2/hazelcast/src/main/java/com/hazelcast/replicatedmap/impl/ReplicatedMapProxy.java:79:1: Class Fan-Out Complexity is 42 (max allowed is 40). [ClassFanOutComplexity]

If the issue persists after rebasing onto the latest master, I think you can ignore it, like here.

…to do it in parallel for different partitions. It is much faster compared to the original way - try to sync a partition and sleep.
@netudima
Contributor Author

@mmedenjak, I have updated the PR based on the comments. Regarding while(true) - I have not found a way to simplify the logic without adding an unnecessary sleep in the case where the 1st iteration succeeds for all partitions...

@mmedenjak mmedenjak self-requested a review April 27, 2020 13:35
@hazelcast hazelcast deleted a comment May 4, 2020
Contributor

@mmedenjak mmedenjak left a comment


Looks good, added one minor observation and one suggested change. Thank you for rebasing the fix!

int partitionCount = nodeEngine.getPartitionService().getPartitionCount();
BitSet nonLoadedStores = new BitSet(partitionCount);
int[] retryCount = new int[partitionCount];
for (int i = 0; i < partitionCount; i++) {
Contributor


Minor: you can use nonLoadedStores.set(0, partitionCount); instead of looping.

Contributor Author

@netudima netudima May 4, 2020


I think an even better option is possible :-) - I can rename the set to loadedStores and set a bit when a partition's loading is completed. In that case the init logic is not required.

Contributor


I tried doing something similar, but beware that nextClearBit will never return -1 and might instead return a number that is outside the partition ID range. So if all partitions are loaded, loadedPartitions.nextClearBit(0) will return partitionCount rather than -1.
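
This nextClearBit behaviour is easy to verify in isolation. A small plain-JDK demo follows (separate from the PR's code; 271 is used only because it is Hazelcast's default partition count):

import java.util.BitSet;

public class NextClearBitDemo {
    public static void main(String[] args) {
        int partitionCount = 271;
        BitSet loadedPartitions = new BitSet(partitionCount);
        loadedPartitions.set(0, partitionCount);    // mark every partition as loaded

        // A BitSet is logically unbounded, so nextClearBit never returns -1;
        // with all bits below partitionCount set, the first clear index it finds
        // lies outside the partition ID range.
        System.out.println(loadedPartitions.nextClearBit(0));   // prints 271
    }
}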

@mmedenjak mmedenjak requested a review from tkountis May 4, 2020 12:33
@tkountis
Contributor

tkountis commented May 5, 2020

@netudima thanks for your PR, much appreciated.
Do you mind including a test case that demonstrates the fix, similar to the one you have in the linked issue?

@netudima
Contributor Author

netudima commented May 6, 2020

It is a bit tricky, because the code from the issue shows a difference in loading time, not in functional behaviour. Maybe it can be tested using a test with a timeout (but that can be unreliable when the test runs on different hardware). Do you have some examples of unit tests with timeouts in the Hazelcast codebase?
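
(For reference, a timing-based check of the kind being discussed would look roughly like the following in plain JUnit 4. This is a generic, hypothetical example rather than anything from the Hazelcast test suite, and it illustrates why such tests are fragile: the threshold depends on the hardware running the test.)

import org.junit.Test;

import static org.junit.Assert.assertTrue;

public class FillupTimingTest {

    // Fails the test if the body does not finish within 30 seconds.
    @Test(timeout = 30000)
    public void initializationFinishesQuickly() {
        long start = System.currentTimeMillis();
        // ... start members and trigger the replicated map fill-up here ...
        long elapsedMillis = System.currentTimeMillis() - start;
        assertTrue("fill-up took too long: " + elapsedMillis + " ms", elapsedMillis < 30000);
    }
}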

@tkountis
Contributor

@netudima I agree with you; I couldn't think of a solid test that wouldn't cause more pain to maintain.

@mmedenjak mmedenjak merged commit 0045caf into hazelcast:master May 12, 2020
@mmedenjak
Contributor

And after several years, we managed to get you to make a contribution. Thank you and here's a medal 🥇 cc @Holmistr

If you want, you can pick another issue. I can say right now everyone's a bit busy but we try to get community PRs merged faster these days.

@netudima
Contributor Author

🥳, thank you very much for your help

webashutosh pushed a commit to webashutosh/hazelcast that referenced this pull request Jun 29, 2020
…... (hazelcast#14331)

ReplicatedMapProxy - refactored check and retry initialization logic to do it in parallel for different partitions. It is much faster compared to the original way - try to sync a partition and sleep.
Development

Successfully merging this pull request may close these issues.

[Bug] ReplicatedMap initialization may stuck for a long time in case of async-fillup=false