Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: Resume wait correctly following test node restart #1515

Merged

Conversation

AndrewSisley
Copy link
Contributor

Relevant issue(s)

Resolves #1506

Description

Previously the full wait-cycle would be calculated, meaning that if a Wait action was before the Restart action the Second wait action would progress once the expected number of events for the first Wait was received. This changes it so that upon restart the wait calculation is started from just after the last wait action before the restart.

How has this been tested?

Had a few hundred successful runs of the only affect test i(TestP2PPeerReplicatorWithUpdateAndRestart) n the CI.

This isn't really required for the fix, but I made the change whilst investigating and think it is a good one (avoids a sleep on non-p2p tests, and matches/scales the P2P sleeps as they do on normal/non-restart config)
Previously the full wait-cycle would be calculated, meaning that if a Wait action was before the Restart action the Second wait action would progress once the expected number of events for the first Wait was recieved.  This changes it so that upon restart the wait calculation is started from just after the last wait action before the restart.
@AndrewSisley AndrewSisley added area/testing Related to any test or testing suite action/no-benchmark Skips the action that runs the benchmark. labels May 17, 2023
@AndrewSisley AndrewSisley added this to the DefraDB v0.6 milestone May 17, 2023
@AndrewSisley AndrewSisley requested a review from a team May 17, 2023 23:01
@AndrewSisley AndrewSisley self-assigned this May 17, 2023
@codecov
Copy link

codecov bot commented May 17, 2023

Codecov Report

Merging #1515 (44de751) into develop (e361562) will decrease coverage by 0.17%.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #1515      +/-   ##
===========================================
- Coverage    72.20%   72.03%   -0.17%     
===========================================
  Files          185      185              
  Lines        18295    18295              
===========================================
- Hits         13210    13179      -31     
- Misses        4046     4069      +23     
- Partials      1039     1047       +8     

see 9 files with indirect coverage changes

actionLoop:
for i := actionIndex; i >= 0; i-- {
switch testCase.Actions[i].(type) {
case WaitForSync:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: for all other cases (default case for example) doesn't matter?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For all other cases, the value of waitGroupStartIndex would remain 0 - the start of the slice

syncChans := []chan struct{}{}
for _, tc := range testCase.Actions {
switch action := tc.(type) {
case ConnectPeers:
// Give the nodes a chance to connect to each other and learn about each other's subscribed topics.
time.Sleep(100 * time.Millisecond)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: Perhaps store this as a constant somewhere at the top of this file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really not a fan of that - it just means anyone reading has to break flow and scroll to the top of a file in order to view the value of this arbitrary value.

))
case ConfigureReplicator:
// Give the nodes a chance to connect to each other and learn about each other's subscribed topics.
time.Sleep(100 * time.Millisecond)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: Perhaps store this as a constant somewhere at the top of this file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really not a fan of that - it just means anyone reading has to break flow and scroll to the top of a file in order to view the value of this arbitrary value.

Copy link
Collaborator

@fredcarle fredcarle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏 yes! Nice fix Andy!

@AndrewSisley
Copy link
Contributor Author

clap yes! Nice fix Andy!

Thanks, and thanks for your time and help on this too :)

@AndrewSisley AndrewSisley merged commit 8c3984d into sourcenetwork:develop May 18, 2023
9 checks passed
@AndrewSisley AndrewSisley deleted the 1506-peer-rep-restart-3 branch May 18, 2023 14:31
shahzadlone pushed a commit to shahzadlone/defradb that referenced this pull request Feb 23, 2024
…k#1515)

## Relevant issue(s)

Resolves sourcenetwork#1506

## Description

Previously the full wait-cycle would be calculated, meaning that if a
Wait action was before the Restart action the Second wait action would
progress once the expected number of events for the first Wait was
received. This changes it so that upon restart the wait calculation is
started from just after the last wait action before the restart.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
action/no-benchmark Skips the action that runs the benchmark. area/testing Related to any test or testing suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TestP2PPeerReplicatorWithUpdateAndRestart is flaky
3 participants