Fix wrong scheduled task count caused by backup replicas #9704

tkountis · 2017-01-19T17:44:24Z

When scheduling tasks on random partitions, the getAllScheduled call
returns twice as many futures as expected. The cause appears to be
the backup replicas, which are also returned even though they are
only stashed placeholders with no future scheduled. To address this, we introduce
a flag that marks the main copy of the task as master upon initial scheduling or promotion,
and use that flag when building the list of tasks returned by the API call.

Fixes #9694
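
For readers skimming the thread, the idea in the description can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `TaskDescriptorSketch`, `PartitionContainerSketch` and all of their methods are hypothetical names, and the real descriptor and container classes carry far more state.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for a task descriptor held by one replica.
final class TaskDescriptorSketch {
    final String name;
    // True only for the copy that actually has a future scheduled.
    private boolean primary;

    TaskDescriptorSketch(String name, boolean primary) {
        this.name = name;
        this.primary = primary;
    }

    boolean isPrimary() {
        return primary;
    }

    void setPrimary(boolean primary) {
        this.primary = primary;
    }
}

// Hypothetical per-member container holding both scheduled tasks and stashed backups.
final class PartitionContainerSketch {
    private final Map<String, TaskDescriptorSketch> tasks = new LinkedHashMap<>();

    // Initial scheduling: this copy is the primary one.
    void schedule(String name) {
        tasks.put(name, new TaskDescriptorSketch(name, true));
    }

    // Backup replication: a placeholder with no future behind it.
    void stashBackup(String name) {
        tasks.put(name, new TaskDescriptorSketch(name, false));
    }

    // Promotion: a stashed copy takes over and becomes the primary one.
    void promote(String name) {
        TaskDescriptorSketch descriptor = tasks.get(name);
        if (descriptor != null) {
            descriptor.setPrimary(true);
        }
    }

    // Unfiltered view: also returns backup placeholders (the reported bug).
    List<String> allTaskNames() {
        return new ArrayList<>(tasks.keySet());
    }

    // Filtered view: only copies flagged as primary are reported.
    List<String> scheduledTaskNames() {
        List<String> result = new ArrayList<>();
        for (TaskDescriptorSketch descriptor : tasks.values()) {
            if (descriptor.isPrimary()) {
                result.add(descriptor.name);
            }
        }
        return result;
    }
}

final class ScheduledTaskCountSketch {
    public static void main(String[] args) {
        PartitionContainerSketch memberA = new PartitionContainerSketch();
        PartitionContainerSketch memberB = new PartitionContainerSketch();
        memberA.schedule("t1");     // primary copy
        memberB.stashBackup("t1");  // backup placeholder

        int unfiltered = memberA.allTaskNames().size() + memberB.allTaskNames().size();
        int filtered = memberA.scheduledTaskNames().size() + memberB.scheduledTaskNames().size();
        System.out.println("unfiltered=" + unfiltered + ", filtered=" + filtered);
    }
}
```

With one task and one backup, the unfiltered view counts the task twice across the two members, while the filtered view counts it once.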

devOpsHazelcast · 2017-01-19T18:42:48Z

Test PASSed.

jerrinot · 2017-01-19T19:16:30Z

When there is a migration rollback, the source side will always set itself as the master replica. Is this the right thing to do?

tkountis · 2017-01-19T20:47:47Z

@jerrinot thanks for the comment, but to be honest I can't say I understand the concern :/
In a migration rollback, the source side will take ownership of the task and re-schedule it, meaning it is going to be the master replica. Am I misunderstanding your comment? What would you expect the correct behaviour to be?

mmedenjak

Looks OK. Just one minor change, and please check the rollbackMigration behaviour in the case where the source is not the partition owner.
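
The rollback concern can be illustrated with a small sketch. It assumes, based only on the later commit title "Promote stash only when replica index is 0", that the source side should take over the task on rollback only if it held replica index 0 (the partition owner) before the migration. `RollbackSketch` and its members are hypothetical names, not the PR's actual code.

```java
// Hypothetical sketch of a rollback guard; not the implementation in this PR.
final class RollbackSketch {
    // Assumption: replica index 0 denotes the partition owner.
    static final int OWNER_REPLICA_INDEX = 0;

    // Whether this member's copy is currently flagged as the owning copy.
    private boolean partitionOwner;

    // Called on the migration source when a migration is rolled back.
    // currentReplicaIndex is the index this member held before the migration.
    void onRollback(int currentReplicaIndex) {
        if (currentReplicaIndex == OWNER_REPLICA_INDEX) {
            // The source was the owner, so it takes the task back and re-schedules it.
            partitionOwner = true;
        }
        // Otherwise the source only held a backup, so its copy stays a stashed placeholder.
    }

    boolean isPartitionOwner() {
        return partitionOwner;
    }
}
```

Under this assumption, a source that only held a backup replica does not flag itself as owner after a rollback, so it does not add a duplicate entry to the result of getAllScheduled.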

mmedenjak · 2017-01-20T14:12:10Z

hazelcast/src/main/java/com/hazelcast/scheduledexecutor/impl/ScheduledTaskDescriptor.java

+     * This flag is set to true only on initial scheduling of a task and after a promotion (stashed or migration);
+     * in the latter case the other replicas get disposed.
+     */
+    private transient boolean masterReplica;


Perhaps for clarity rename to isPartitionOwner since "master" is a different thing in the cluster.

tkountis · 2017-01-20T14:32:30Z

@mmedenjak addressed both comments ;)

jerrinot

Looks good. It would be good to squash the commits before merging.

devOpsHazelcast · 2017-01-20T15:14:06Z

Test PASSed.

devOpsHazelcast · 2017-01-20T15:36:23Z

Test PASSed.

tkountis added 2 commits January 19, 2017 17:21

Improve testing

3efdc29

tkountis added Team: Core Type: Defect labels Jan 19, 2017

tkountis added this to the 3.8 milestone Jan 19, 2017

tkountis self-assigned this Jan 19, 2017

mmedenjak self-requested a review January 20, 2017 09:34

mmedenjak reviewed Jan 20, 2017

tkountis added 2 commits January 20, 2017 14:16

Promote stash only when replica index is 0

99a713b

Renaming flag for better clarity

0a6b1c8

mmedenjak approved these changes Jan 20, 2017

jerrinot approved these changes Jan 20, 2017

tkountis merged commit d7bef66 into hazelcast:master Jan 20, 2017

tkountis deleted the fix/3.8/sched_exec_wrong_task_count branch January 20, 2017 15:36

mmedenjak added the Source: Internal PR or issue was opened by an employee label Apr 13, 2020
