Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zebrad syncer hangs when there are no peers, despite network timeouts #1633

Closed
teor2345 opened this issue Jan 25, 2021 · 1 comment · Fixed by #1714
Closed

zebrad syncer hangs when there are no peers, despite network timeouts #1633

teor2345 opened this issue Jan 25, 2021 · 1 comment · Fixed by #1714
Assignees
Labels
C-bug Category: This is a bug
Milestone

Comments

@teor2345
Copy link
Contributor

teor2345 commented Jan 25, 2021

Version

zebrad a merge of PRs #1620 and #1622, around 2021-01-22.

Platform

Linux ... 5.4.89 #1-NixOS SMP Tue Jan 12 19:16:25 UTC 2021 x86_64 GNU/Linux

Description

The zebrad syncer hangs when there are no peers, despite the timeouts on tip_network, sync::downloads.network, and sync::downloads.verifier.

I tried this:

Running zebrad on mainnet and testnet.

I expected to see this happen:

zebrad continues trying to sync, even if it has no peers.

Instead, this happened:

The ChainSync component stopped logging output, until a peer connected to the zebrad Zcash listener port, but then it went back to hanging after a few minutes.

Analysis

This bug is probably caused by using ready_and or poll_ready on the peer set, which doesn't become ready until peers are ready. I'm not sure if this is actually a bug - it might be correct behaviour.

This bug seems to be a more specific case of the network hangs in #1435.
It could be caused by buffer misuse in #1593

Commands

zebrad start

Logs

Jan 25 12:53:22.299  WARN crawl_and_dial: zebra_network::peer_set::set: network request with no peer connections. Hint: check your network connection
Jan 25 12:54:37.299  WARN crawl_and_dial: zebra_network::peer_set::set: network request with no peer connections. Hint: check your network connection
Jan 25 12:55:52.299  WARN crawl_and_dial: zebra_network::peer_set::set: network request with no peer connections. Hint: check your network connection
Jan 25 12:56:40.267  INFO {zebrad="a49f4943" net="Main"}:peer{addr=127.0.0.1:40296}:msg_as_req{msg=getheaders}:state: zebra_state::service: responding to peer GetBlocks or GetHeaders final_height=Height(1119240) response_len=158 chain_tip_height=Height(1122481)
stop_height=None intersection_height=Some(Height(1119082))
Jan 25 12:56:46.288  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: exhausted prospective tip set
Jan 25 12:56:46.288  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: waiting to restart sync timeout=61s
Jan 25 12:57:47.289  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: starting sync, obtaining new tips
Jan 25 12:57:47.289  INFO {zebrad="a49f4943" net="Main"}:sync:obtain_tips:state: zebra_state::util: created block locator tip_height=Height(1122481) min_locator_height=1122382 locators=[Height(1122481), Height(1122480), Height(1122479), Height(1122477), Height(1122473), Height(1122465), Height(1122449), Height(1122417), Height(1122382)]
Jan 25 12:57:47.290  INFO {zebrad="a49f4943" net="Main"}:sync:obtain_tips: zebra_network::peer_set::set: network request with no ready peers: finding more peers, waiting for 1 peers to answer requests
Jan 25 12:57:53.291  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: exhausted prospective tip set
Jan 25 12:57:53.291  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: waiting to restart sync timeout=61s
Jan 25 12:58:54.292  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: starting sync, obtaining new tips
Jan 25 12:58:54.292  INFO {zebrad="a49f4943" net="Main"}:sync:obtain_tips:state: zebra_state::util: created block locator tip_height=Height(1122481) min_locator_height=1122382 locators=[Height(1122481), Height(1122480), Height(1122479), Height(1122477), Height(1122473), Height(1122465), Height(1122449), Height(1122417), Height(1122382)]
Jan 25 12:58:54.292  INFO {zebrad="a49f4943" net="Main"}:sync:obtain_tips: zebra_network::peer_set::set: network request with no ready peers: finding more peers, waiting for 1 peers to answer
 requests
Jan 25 12:59:00.294  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: exhausted prospective tip set
Jan 25 12:59:00.294  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: waiting to restart sync timeout=61s
Jan 25 13:00:01.295  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: starting sync, obtaining new tips
Jan 25 13:00:01.295  INFO {zebrad="a49f4943" net="Main"}:sync:obtain_tips:state: zebra_state::util: created block locator tip_height=Height(1122481) min_locator_height=1122382 locators=[Heigh
t(1122481), Height(1122480), Height(1122479), Height(1122477), Height(1122473), Height(1122465), Height(1122449), Height(1122417), Height(1122382)]
Jan 25 13:00:01.295  INFO {zebrad="a49f4943" net="Main"}:sync:obtain_tips: zebra_network::peer_set::set: network request with no ready peers: finding more peers, waiting for 1 peers to answer
 requests
Jan 25 13:00:07.297  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: exhausted prospective tip set
Jan 25 13:00:07.297  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: waiting to restart sync timeout=61s
Jan 25 13:01:08.297  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: starting sync, obtaining new tips
Jan 25 13:00:07.297  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: waiting to restart sync timeout=61s
Jan 25 13:01:08.297  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: starting sync, obtaining new tips
Jan 25 13:01:08.298  INFO {zebrad="a49f4943" net="Main"}:sync:obtain_tips:state: zebra_state::util: created block locator tip_height=Height(1122481) min_locator_height=1122382 locators=[Heigh
t(1122481), Height(1122480), Height(1122479), Height(1122477), Height(1122473), Height(1122465), Height(1122449), Height(1122417), Height(1122382)]
Jan 25 13:02:07.299  WARN crawl_and_dial: zebra_network::peer_set::set: network request with no peer connections. Hint: check your network connection
Jan 25 13:03:22.299  WARN crawl_and_dial: zebra_network::peer_set::set: network request with no peer connections. Hint: check your network connection
Jan 25 13:04:37.299  WARN crawl_and_dial: zebra_network::peer_set::set: network request with no peer connections. Hint: check your network connection
Jan 25 13:05:52.299  WARN crawl_and_dial: zebra_network::peer_set::set: network request with no peer connections. Hint: check your network connection
Jan 25 13:07:07.299  WARN crawl_and_dial: zebra_network::peer_set::set: network request with no peer connections. Hint: check your network connection
Jan 25 13:08:22.299  WARN crawl_and_dial: zebra_network::peer_set::set: network request with no peer connections. Hint: check your network connection
Jan 25 13:09:37.299  WARN crawl_and_dial: zebra_network::peer_set::set: network request with no peer connections. Hint: check your network connection
@teor2345 teor2345 added C-bug Category: This is a bug S-needs-triage Status: A bug report needs triage labels Jan 25, 2021
@zfnd-bot zfnd-bot bot added this to To Do in 🦓 Jan 25, 2021
@teor2345 teor2345 added this to No Estimate in Effort Affinity Grouping via automation Jan 25, 2021
@teor2345 teor2345 added this to the 2021 Sprint 2 milestone Jan 25, 2021
@teor2345 teor2345 moved this from No Estimate to M - 5 in Effort Affinity Grouping Jan 25, 2021
@teor2345 teor2345 added P-High and removed S-needs-triage Status: A bug report needs triage labels Jan 28, 2021
@mpguerra mpguerra modified the milestones: 2021 Sprint 2, 2021 Sprint 3 Feb 8, 2021
@teor2345 teor2345 modified the milestones: 2021 Sprint 2, 2021 Sprint 3 Feb 8, 2021
@teor2345 teor2345 moved this from M - 5 to XS - 2 in Effort Affinity Grouping Feb 9, 2021
@teor2345 teor2345 moved this from XS - 2 to XXS - 1 in Effort Affinity Grouping Feb 9, 2021
@teor2345
Copy link
Contributor Author

teor2345 commented Feb 9, 2021

This bug is caused by #1435, it will probably be fixed by #1709.

But we should add some logs so future hangs are easier to diagnose - see #1714.

🦓 automation moved this from To Do to Done Feb 9, 2021
Effort Affinity Grouping automation moved this from XXS - 1 to Done Feb 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug
Projects
No open projects
Development

Successfully merging a pull request may close this issue.

3 participants