Properly wait for recovery to finish #800

danielmitterdorfer · 2019-10-17T15:29:35Z

With this commit we ensure that a proper schedule is chosen when waiting
for shard recovery to finish. We also ensure that a short time-span where no
recoveries are active (but more might continue soon) is not mistakenly
treated as a condition where all recoveries have already finished.

Closes #796

dliappis · 2019-10-18T13:22:50Z

docs/track.rst

-With the operation ``wait-for-recovery`` you can wait until an ongoing index recovery finishes. The ``wait-for-recovery`` operation does not support any parameters.
+With the operation ``wait-for-recovery`` you can wait until an ongoing shard recovery finishes. The ``wait-for-recovery`` operation supports the following parameters:
+
+* ``completion-recheck-attempts`` (optional, defaults to 3): It might be possible that the `index recovery API <https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-recovery.html>`_ reports that there are no active shard recoveries when a new one might be scheduled shortly afterwards. Therefore, this operation will check several times whether there are still no active recoveries. In between those attempts, it will wait for a time period specified by ``completion-recheck-wait-period``.


s/It might/It's
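For illustration, the recheck behavior described in the doc change above can be sketched as a small polling loop. This is a minimal sketch under stated assumptions, not Rally's actual runner: `count_active_recoveries` is a hypothetical callable standing in for a query against the index recovery API, the 2.0 second default for the wait period is an arbitrary placeholder, and `sleep` is injectable only so the loop can be exercised without real waiting.

```python
import time
from typing import Callable


def wait_for_recovery(
    count_active_recoveries: Callable[[], int],  # hypothetical stand-in for an index recovery API call
    completion_recheck_attempts: int = 3,        # default taken from the doc change above
    completion_recheck_wait_period: float = 2.0,  # placeholder value, an assumption
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Return once no active recoveries were seen on several consecutive checks.

    The return value is the total number of checks performed.
    """
    checks = 0
    idle_checks = 0
    while idle_checks < completion_recheck_attempts:
        checks += 1
        if count_active_recoveries() > 0:
            # A recovery is (again) active, so earlier idle observations no longer count.
            idle_checks = 0
        else:
            idle_checks += 1
        if idle_checks < completion_recheck_attempts:
            # Wait before checking again; skipped after the final idle check.
            sleep(completion_recheck_wait_period)
    return checks
```

The point of resetting `idle_checks` is that a brief window without active recoveries does not count as completion if another recovery starts before the recheck attempts are exhausted.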

dliappis

Thanks for fixing the operation!

I think some tests will fail if this gets merged onto the current master, due to the fixes in #801.

e.g. https://github.com/elastic/rally/pull/800/files#diff-ec2b07b543fca9aae6cbecf501cfe53aL227

fails with

E       TypeError: 'ScheduleHandle' object is not iterable

dliappis · 2019-10-18T13:37:56Z

tests/driver/driver_test.py

+            (2.0, metrics.SampleType.Normal, None, {"body": ["a"]}),
+            (3.0, metrics.SampleType.Normal, None, {"body": ["a"]}),
+            (4.0, metrics.SampleType.Normal, None, {"body": ["a"]}),
+        ], invocations, infinite_schedule=True)


I think that when we merge this to current master it's going to fail due to https://github.com/elastic/rally/pull/801/files

Good catch! I've merged master and changed the test now in 78235be.

dliappis

LGTM thanks!

danielmitterdorfer added the bug and :Load Driver labels Oct 17, 2019

danielmitterdorfer added this to the 1.4.0 milestone Oct 17, 2019

danielmitterdorfer requested a review from dliappis October 17, 2019 15:29

danielmitterdorfer self-assigned this Oct 17, 2019

dliappis reviewed Oct 18, 2019

dliappis requested changes Oct 18, 2019

danielmitterdorfer added 2 commits October 18, 2019 15:46

Merge remote-tracking branch 'origin/master' into schedule-for-progress-runners

a95178a

Fix new test

78235be

danielmitterdorfer requested a review from dliappis October 21, 2019 06:34

dliappis approved these changes Oct 21, 2019

danielmitterdorfer merged commit 2e2d947 into elastic:master Oct 21, 2019

danielmitterdorfer deleted the schedule-for-progress-runners branch October 21, 2019 06:43

dliappis added a commit to dliappis/rally that referenced this pull request Nov 15, 2019

Fix mistake in wait-for-recovery docs

05764fc

To run an operation every 10s we specify target-interval rather than target-throughput. Relates elastic#800

dliappis mentioned this pull request Nov 15, 2019

Fix mistake in wait-for-recovery docs #817

Merged

dliappis added a commit that referenced this pull request Nov 15, 2019

Fix mistake in wait-for-recovery docs

ff1cbc8

To run an operation every 10s we specify target-interval rather than target-throughput. Relates #800 Relates #817
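
As a rough illustration of the commit message above, a schedule entry that runs the operation once every 10 seconds could be built as follows. This is a hedged sketch: only `target-interval`, `target-throughput`, and `wait-for-recovery` come from this thread; the helper name and the surrounding `operation` / `operation-type` structure are assumptions made for the example.

```python
import json


def wait_for_recovery_task(interval_seconds: float) -> dict:
    """Build a hypothetical schedule entry that runs once every `interval_seconds`."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return {
        # Structure assumed for illustration only.
        "operation": {"operation-type": "wait-for-recovery"},
        # Seconds between invocations, instead of a rate in operations per second.
        "target-interval": interval_seconds,
    }


task = wait_for_recovery_task(10)
print(json.dumps(task, indent=2))
```

Stating the interval directly avoids having to express "every 10s" as a fractional rate of 0.1 operations per second.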
