Fix cluster consistency-check test #7754

oranagra · 2020-09-07T08:25:55Z

This test was failing from time to time; see the discussion at the bottom of #7635.
This was probably due to timing: the DEBUG SLEEP executed by redis-cli
didn't sleep for long enough.

This commit changes:

1) use SET-ACTIVE-EXPIRE instead of DEBUG SLEEP
2) replace many `after` sleeps with retry loops to speed up the test.
3) add many comments explaining the different steps of the test and
   its purpose.
4) config appendonly before populating the volatile keys, so that they'll
   be part of the AOF command stream rather than the preamble RDB portion.
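
The retry-loop change listed above can be sketched roughly as follows. This is a minimal illustration in Python, not the Tcl used by the cluster test suite; `retry_until` and its parameters are hypothetical names chosen for this example. The idea is to poll a condition and return as soon as it holds, instead of always paying for a fixed sleep.

```python
import time


def retry_until(predicate, max_tries=50, delay=0.1):
    """Poll `predicate` until it returns True or `max_tries` attempts are used.

    Returns True if the condition was met, False if we gave up.
    (Hypothetical helper for illustration, not the test suite's own code.)
    """
    for _ in range(max_tries):
        if predicate():
            return True
        time.sleep(delay)
    return False


# Example condition that only becomes true on the third check.
calls = {"n": 0}


def ready():
    calls["n"] += 1
    return calls["n"] >= 3


ok = retry_until(ready, max_tries=10, delay=0.01)
```

With a fixed sleep the test waits the full duration even when the condition is already true; with the loop it waits only as long as needed, and a slow machine gets up to `max_tries * delay` before the check is declared failed.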

Other complications: kill_instance recently switched from SIGKILL to
SIGTERM, and this would sometimes fail because an AOFRW was running
in the background. Now we wait for it to end before attempting the kill.
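
A rough sketch of the "wait for the AOFRW to end before killing" step, again in Python rather than the suite's Tcl. It assumes the rewrite state is read from the `aof_rewrite_in_progress` field of `INFO persistence`; `get_info` is a hypothetical callable standing in for whatever client the test uses, and is faked here with canned replies so the example runs on its own.

```python
import time


def aof_rewrite_in_progress(info_text):
    """Return True if the INFO text reports a running AOF rewrite."""
    for line in info_text.splitlines():
        if line.startswith("aof_rewrite_in_progress:"):
            return line.split(":", 1)[1].strip() == "1"
    return False  # field absent: treat as no rewrite running


def wait_for_aofrw(get_info, max_tries=100, delay=0.1):
    """Poll until no AOF rewrite is running; True on success, False on timeout."""
    for _ in range(max_tries):
        if not aof_rewrite_in_progress(get_info()):
            return True
        time.sleep(delay)
    return False


# Fake INFO replies (assumption: a real test would query the instance).
responses = iter([
    "aof_enabled:1\r\naof_rewrite_in_progress:1\r\n",
    "aof_enabled:1\r\naof_rewrite_in_progress:1\r\n",
    "aof_enabled:1\r\naof_rewrite_in_progress:0\r\n",
])
done = wait_for_aofrw(lambda: next(responses), delay=0.01)
# Only once `done` is True would the test go on to send SIGTERM.
```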

trevor211 · 2020-09-07T09:12:12Z

While the fix LGTM, I got this error when I ran the test on your branch:

Testing unit: 14-consistency-check.tcl
17:08:22> (init) Restart killed instances: OK
17:08:22> Cluster nodes are reachable: OK
17:08:22> Cluster nodes hard reset: OK
17:08:22> Cluster Join and auto-discovery test: OK
17:08:26> Before slots allocation, all nodes report cluster failure: OK
17:08:26> Create a 5 nodes cluster: OK
17:08:30> Cluster should start ok: OK
17:08:30> Cluster is writable: OK
17:08:30> Slave expired keys is loaded when restarted: appendonly=no: OK
17:08:33> Slave expired keys is loaded when restarted: appendonly=yes: OK
Cleaning up...
killing stale instance 19845
killing stale instance 19850
killing stale instance 19855
killing stale instance 19860
killing stale instance 19865
killing stale instance 19870
killing stale instance 19875
killing stale instance 19880
killing stale instance 19885
killing stale instance 19895
killing stale instance 19900
killing stale instance 19905
killing stale instance 19910
killing stale instance 19915
killing stale instance 19920
killing stale instance 19925
killing stale instance 19930
killing stale instance 19935
killing stale instance 19940
killing stale instance 19973
no files matched glob pattern "*/err.txt"
    while executing
"glob */err.txt"
    (procedure "log_crashes" line 19)
    invoked from within
"log_crashes"
    (procedure "cleanup" line 7)
    invoked from within
"cleanup"
    (procedure "main" line 8)
    invoked from within
"main"
Cleaning up...
killing stale instance 19845
killing stale instance 19850
killing stale instance 19855
killing stale instance 19860
killing stale instance 19865
killing stale instance 19870
killing stale instance 19875
killing stale instance 19880
killing stale instance 19885
killing stale instance 19895
killing stale instance 19900
killing stale instance 19905
killing stale instance 19910
killing stale instance 19915
killing stale instance 19920
killing stale instance 19925
killing stale instance 19930
killing stale instance 19935
killing stale instance 19940
killing stale instance 19973
no files matched glob pattern "*/err.txt"
    while executing
"glob */err.txt"
    (procedure "log_crashes" line 19)
    invoked from within
"log_crashes"
    (procedure "cleanup" line 7)
    invoked from within
"cleanup"
    invoked from within
"if {[catch main e]} {
    puts $::errorInfo
    if {$::pause_on_error} pause_on_error
    cleanup
    exit 1
}"
    (file "tests/cluster/run.tcl" line 24)

oranagra · 2020-09-07T09:22:39Z

@trevor211 sorry, that's a small fuckup i merged yesterday.. fix is in #7752.

oranagra requested a review from trevor211 September 7, 2020 08:25

squash-me. improve the AOF test to really test AOF

7fb03fa

oranagra closed this Sep 7, 2020

oranagra reopened this Sep 7, 2020

oranagra requested a review from yossigo September 7, 2020 09:26

yossigo approved these changes Sep 7, 2020

oranagra merged commit b491d47 into redis:unstable Sep 7, 2020

oranagra deleted the fix_cluster_consistency_check_test branch September 7, 2020 15:06
