Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Demonstrate no monitoring after restart #268

Closed
wants to merge 1 commit into from

Conversation

hannahhoward
Copy link
Collaborator

What

When a new instance of go-data-transfer is constructed, and RestartDataTransferChannel is called, no channel monitoring is setup, meaning if the request stalls or errors, it will simply stay stalled.

Demonstration

This is a bit of a contrive example (contains some functions you probably don't want to expose the monitoring status to the public)

Essentially this PR:

  • Enables Channel Monitoring in the restart integration tests
  • Uses some additional functions to log the channel monitoring status on each new DataSent call

Result:

go test --run=^TestRestartPush/Restart_peer_create_push -v ./impl
=== RUN   TestRestartPush
=== RUN   TestRestartPush/Restart_peer_create_push
    restart_integration_test.go:500: peer1 is 13fBgT2iW
    restart_integration_test.go:501: peer2 is 13kJcDpop
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:191: peers unlinked and disconnected, total increments received till now: 20
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:136: Send Channel is monitored
    restart_integration_test.go:223: request was not completed after disconnect
    restart_integration_test.go:232: peers have been connected and datatransfer has restarted
    restart_integration_test.go:181: got restart event for peer 13kJcDpop
    restart_integration_test.go:181: got restart event for peer 13fBgT2iW
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:204: peer 13kJcDpop completed
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:138: Send Channel is NOT monitored
    restart_integration_test.go:204: peer 13fBgT2iW completed
--- PASS: TestRestartPush (2.17s)
    --- PASS: TestRestartPush/Restart_peer_create_push (2.17s)
PASS
ok  	github.com/filecoin-project/go-data-transfer/impl	2.640s

RestartDebounce: time.Second * 10,
RestartBackoff: time.Second * 20,
MaxConsecutiveRestarts: 15,
//RestartAckTimeout: time.Second * 30,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this shouldn't be here, it's not in Config type. Is it a TODO for future work?

The OnRestartComplete below could probably be removed too since it's not being used, or is that another TODO?

@@ -118,6 +132,11 @@ func TestRestartPush(t *testing.T) {
if event.Code == datatransfer.DataSent {
if channelState.Sent() > 0 {
sent <- channelState.Sent()
if rh.dt1.IsChannelMonitored(chid) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there something we could do here to collect state and assert on it later?

Track monitoredCount unmonitoredCount and assert they're above some base number?

@dirkmc
Copy link
Contributor

dirkmc commented Oct 14, 2021

The reason that the monitor appears to stop monitoring the channel after a restart in this test is because after restart, the test swaps out the existing data transfer instance for a new one:
https://github.com/filecoin-project/go-data-transfer/pull/268/files#diff-c0fb275d508bfd9a2ce0815642747464a42224ae05c5622bdd06a922f322a765R67

The new data transfer instance has a new channel monitor instance, so it has no knowledge of the old channel monitor's channels.

In any case the overall concept is correct - we should start the channel monitor when RestartDataTransferChannel is called.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants