Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Network] refactor lifecycle manager #1031

Merged
merged 14 commits into from
Jul 30, 2021
Merged

Conversation

synzhu
Copy link
Contributor

@synzhu synzhu commented Jul 26, 2021

Context: #1026 (review)

This PR refactors the lifecycle management code in Network into a separate struct.

This struct can be used in the future to refactor many existing ReadyDoneAware modules: https://github.com/dapperlabs/flow-go/issues/5702

closes https://github.com/dapperlabs/flow-go/issues/5704

@codecov-commenter
Copy link

codecov-commenter commented Jul 26, 2021

Codecov Report

Merging #1031 (93b1f8e) into master (2d496c1) will increase coverage by 0.14%.
The diff coverage is 68.29%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1031      +/-   ##
==========================================
+ Coverage   53.27%   53.42%   +0.14%     
==========================================
  Files         318      318              
  Lines       21515    21521       +6     
==========================================
+ Hits        11462    11497      +35     
+ Misses       8487     8455      -32     
- Partials     1566     1569       +3     
Flag Coverage Δ
unittests 53.42% <68.29%> (+0.14%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
network/p2p/network.go 0.00% <0.00%> (ø)
module/lifecycle/lifecycle.go 100.00% <100.00%> (ø)
cmd/util/ledger/migrations/storage_v4.go 41.56% <0.00%> (-0.61%) ⬇️
...sus/approvals/assignment_collector_statemachine.go 50.00% <0.00%> (+7.69%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2d496c1...93b1f8e. Read the comment docs.

Comment on lines +99 to +105
func RequireClosed(t *testing.T, ch <-chan struct{}, message string) {
select {
case <-ch:
default:
require.Fail(t, "channel is not closed: "+message)
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: this is guaranteed to select from the channel if it is closed, and will only run the default case if nothing is available to be read from the channel: https://stackoverflow.com/questions/45580151/priority-of-case-versus-default-in-golang-select-statements

Comment on lines +159 to +165
func RequireNotClosed(t *testing.T, ch <-chan struct{}, message string) {
select {
case <-ch:
require.Fail(t, "channel is closed: "+message)
default:
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

Copy link
Contributor

@yhassanzadeh13 yhassanzadeh13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great 👍

Comment on lines +76 to +82
lm.stateTransition.Lock()
if lm.shutdownCommenced || lm.startupCommenced {
lm.stateTransition.Unlock()
return
}
lm.startupCommenced = true
lm.stateTransition.Unlock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
lm.stateTransition.Lock()
if lm.shutdownCommenced || lm.startupCommenced {
lm.stateTransition.Unlock()
return
}
lm.startupCommenced = true
lm.stateTransition.Unlock()
func(){
lm.stateTransition.Lock()
defer lm.stateTransition.Unlock()
if lm.shutdownCommenced || lm.startupCommenced {
return
}
lm.startupCommenced = true
}()

Using defer pattern is safer in this scenario.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! In this particular case, I think it may introduce a bit more messiness than your suggested change, because when we return on line 79 we actually want to return from the enclosing function.

If we wrap the logic in a nested function, the return now only returns from the nested function, so we would probably have to add some if / else logic and return a boolean from the nested function instead.

I think that may not be necessary here, since the locking code is very short and contained

Comment on lines +96 to +102
lm.stateTransition.Lock()
if lm.shutdownCommenced {
lm.stateTransition.Unlock()
return
}
lm.shutdownCommenced = true
lm.stateTransition.Unlock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same defer pattern is suggested here.


// TestConsecutiveStart tests that calling OnStart multiple times concurrently only
// results in startup being performed once
func (suite *LifecycleManagerSuite) TestConsecutiveStart() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test scenario looks like concurrent than consecutive.


// TestConsecutiveStop tests that calling OnStop multiple times concurrently only
// results in shutdown being performed once
func (suite *LifecycleManagerSuite) TestConsecutiveStop() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test scenario looks like concurrent than consecutive.


suite.lm.OnStart(func() {
// simulate startup processing
time.Sleep(3 * time.Second)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

**General comment: ** sleep durations in this PR are in order of seconds. This slows down building over CI. Would be great if you could please reduce them to milliseconds if possible. For most of the cases, 100 milliseconds is a preferable sleep duration

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or you could use a channel here as a gate. After the call to suite.lm.OnStop(), you can close the channel (a.k.a. open the gate). Then we won't have to rely on a sleep

unittest.RequireCloseBefore(suite.T(), suite.lm.Started(), time.Second, "timed out waiting for startup")
time.Sleep(100 * time.Millisecond) // wait for potential race conditions to occur

suite.Assert().EqualValues(1, numStarts)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can use assert.Eventuallyf here to factor in the sleep

}()

return n.done
n.cancel()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

Copy link
Contributor

@vishalchangrani vishalchangrani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@synzhu synzhu merged commit c9f6e0e into master Jul 30, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants