Fix flakiness in gpmovemirror demo cluster without-assert build
The gpmovemirror demo cluster was flaky, failing at the test below:
"gpmovemirrors gives warning if pg_basebackup is already running for all
mirrors to be moved".

In the above test, gprecoverseg is triggered for in-place mirror
recovery in asynchronous mode, and gpmovemirrors is then triggered in
parallel for the same set of mirrors. Since gprecoverseg is running in
the background, the mirror data directory is not consistent, and
gpmovemirrors calculates the size of the mirror data directory before
invoking gprecoverseg, which sometimes failed in the gpdb
without-assert build.
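
As a rough illustration of why sizing a directory that another process is
still rewriting can fail intermittently, here is a minimal Python sketch.
It is not the gpmovemirrors implementation, which this commit does not
show; the function name dir_size_bytes is hypothetical, and the assumption
is that the failure comes from files disappearing between listing and stat.

```python
import os


def dir_size_bytes(root):
    """Sum file sizes under root, tolerating files that vanish mid-walk.

    Hypothetical helper for illustration only; not taken from gpMgmt.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat avoids following symlinks out of the tree.
                total += os.lstat(path).st_size
            except FileNotFoundError:
                # Another process removed the file after it was listed.
                # Without this guard the whole calculation would abort.
                continue
    return total
```

A walk without the try/except would pass on a quiet directory and fail
only when a concurrent writer happens to remove a file at the wrong
moment, which matches the intermittent behaviour described above.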

Hence the test is changed so that its original intention is still met
while fixing this flakiness.
SunilS26 committed Feb 9, 2024
1 parent a332c3c commit e9b720c
Showing 1 changed file with 7 additions and 3 deletions.
10 changes: 7 additions & 3 deletions gpMgmt/test/behave/mgmt_utils/gpmovemirrors.feature
@@ -328,23 +328,27 @@ Feature: Tests for gpmovemirrors
And the segments are synchronized
And all files in gpAdminLogs directory are deleted on all hosts in the cluster
And the information of contents 0,1,2 is saved
And a gprecoverseg directory under '/tmp' with mode '0700' is created
And a gprecoverseg input file is created
And edit the input file to recover mirror with content 0 to a new directory on remote host with mode 0700
And edit the input file to recover mirror with content 1 to a new directory on remote host with mode 0700
And edit the input file to recover mirror with content 2 to a new directory on remote host with mode 0700
And user immediately stops all mirror processes for content 0,1,2
And user can start transactions
And the user suspend the walsender on the primary on content 0
And the user suspend the walsender on the primary on content 1
And the user suspend the walsender on the primary on content 2
And the user asynchronously runs "gprecoverseg -aF" and the process is saved
When the user asynchronously runs gprecoverseg with input file and additional args "-a" and the process is saved
And the user just waits until recovery_progress.file is created in gpAdminLogs
And verify that mirror on content 0,1,2 is down
And the gprecoverseg lock directory is removed
Given a gpmovemirrors directory under '/tmp' with mode '0700' is created
And a gpmovemirrors input file is created
And edit the input file to recover mirror with content 0,1,2 to a new directory with mode 0700
When the user runs gpmovemirrors with input file and additional args "-v"
Then gprecoverseg should print "Found pg_basebackup running for segments with contentIds [0, 1, 2], skipping recovery of these segments" to logfile
And gprecoverseg should return a return code of 0
And gpmovemirrors should return a return code of 0
And check if mirrors on content 0,1,2 are in their original configuration
Then gprecoverseg should print "Found pg_basebackup running for segments with contentIds [0, 1, 2], skipping recovery of these segments" to logfile
And the user reset the walsender on the primary on content 0
And the user reset the walsender on the primary on content 1
And the user reset the walsender on the primary on content 2