[1pt] test: flaky replication/gh-3704-misc-replica-checks-cluster-id.test.lua #5293

avtikhon · 2020-09-11T08:26:56Z

Tarantool version:
2.6.0-61-g5a9b79fa0

OS version:
All

Bug description:
Issue:

[037] --- replication/gh-3704-misc-replica-checks-cluster-id.result	Thu Sep 10 18:05:22 2020
[037] +++ replication/gh-3704-misc-replica-checks-cluster-id.reject	Fri Sep 11 11:09:38 2020
[037] @@ -25,7 +25,7 @@
[037]  ...
[037]  box.info.replication[2].downstream.status
[037]  ---
[037] -- follow
[037] +- stopped
[037]  ...
[037]  -- change master's cluster uuid and check that replica doesn't connect.
[037]  test_run:cmd("stop server replica")

Steps to reproduce:
Reproduced on the slow Mac host tntmac02 with the command:

l=0 ; while ./test-run.py -j50 `for r in {1..100} ; do echo replication/gh-3704-misc-replica-checks-cluster-id ; done 2>/dev/null` ; do l=$(($l+1)) ; echo ======== $l ============= ; done

Optional (but very desirable):

coredump
backtrace
netstat

On heavily loaded hosts the following issue was found:

[037] --- replication/gh-3704-misc-replica-checks-cluster-id.result	Thu Sep 10 18:05:22 2020
[037] +++ replication/gh-3704-misc-replica-checks-cluster-id.reject	Fri Sep 11 11:09:38 2020
[037] @@ -25,7 +25,7 @@
[037]  ...
[037]  box.info.replication[2].downstream.status
[037]  ---
[037] -- follow
[037] +- stopped
[037]  ...
[037]  -- change master's cluster uuid and check that replica doesn't connect.
[037]  test_run:cmd("stop server replica")

It happened because the replication downstream status was checked too early, while the downstream was still in the 'stopped' state. To give the downstream time to reach the expected 'follow' state, the test needs to wait for it using the test_run:wait_downstream() routine.

Closes #5293

Same commit message as above. Closes #5293 (cherry picked from commit db3dd8d)
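
The race described in the commit message can be illustrated with a small standalone sketch. It is not the code from the commit: the issue names test_run:wait_downstream() but does not show its arguments, so the sketch below uses a hypothetical `wait_status` helper and a fake status source (`make_fake_downstream`) in place of `box.info.replication[2].downstream.status`. It only shows why a single immediate read can see 'stopped' while a polling wait sees 'follow'.

```lua
-- Hypothetical stand-in for box.info.replication[id].downstream.status:
-- it reports 'stopped' for the first few reads and 'follow' afterwards,
-- imitating a replica that needs some time to connect on a loaded host.
local function make_fake_downstream(polls_until_follow)
    local reads = 0
    return function()
        reads = reads + 1
        if reads > polls_until_follow then
            return 'follow'
        end
        return 'stopped'
    end
end

-- Illustrative polling helper (an assumption, not the test-run code):
-- calls get_status() up to max_polls times and returns true together with
-- the attempt number as soon as the expected value is seen. A real test
-- would sleep between attempts and use a time-based timeout.
local function wait_status(get_status, expected, max_polls)
    for attempt = 1, max_polls do
        if get_status() == expected then
            return true, attempt
        end
    end
    return false, max_polls
end

-- Racy variant: one immediate read, as in the failing test.
local racy = make_fake_downstream(3)()

-- Fixed variant: poll until the status becomes 'follow'.
local ok, attempts = wait_status(make_fake_downstream(3), 'follow', 10)

print(racy, ok, attempts)
```

With the fake source above, the immediate read still sees 'stopped', while the polling variant succeeds on the fourth attempt. In the real test the wait is done by test_run:wait_downstream(), as stated in the commit message; see commit db3dd8d for the exact call.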

avtikhon added the qa (Issues related to tests or testing subsystem) and flaky test labels Sep 11, 2020

avtikhon self-assigned this Sep 11, 2020

avtikhon added this to ON REVIEW in Quality Assurance Sep 11, 2020

avtikhon changed the title ~~test: flaky replication/gh-3704-misc-replica-checks-cluster-id.test.lua~~ [1pt] test: flaky replication/gh-3704-misc-replica-checks-cluster-id.test.lua Sep 14, 2020

kyukhin closed this as completed in db3dd8d Sep 15, 2020

avtikhon moved this from ON REVIEW to DONE in Quality Assurance Sep 15, 2020

avtikhon removed this from DONE in Quality Assurance Sep 25, 2020
