Ctrl-C graceful shutdown during burnchain initialization #4474

Merged
merged 1 commit into from Mar 4, 2024

Conversation

8marz8
Contributor

@8marz8 8marz8 commented Mar 4, 2024

Description

Signal handlers and flags for graceful shutdown are already in place, but they have no effect before burnchain sync completes. This PR propagates an error instead of panicking when a Ctrl-C signal is received during burnchain sync.
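The idea of returning an error rather than panicking when shutdown is requested mid-sync can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `SyncError`, `sync_headers`, and the `keep_running` flag are hypothetical names, and the batch loop only stands in for the real header download.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical error type for this sketch; not the stacks-core error enum.
#[derive(Debug, PartialEq)]
enum SyncError {
    /// Shutdown was requested after `synced` headers had been processed.
    ShutdownRequested { synced: u64 },
}

/// Pretend to sync `total` headers in batches of `batch`.
/// The flag is checked before each batch; if it has been cleared
/// (for example by a Ctrl-C handler), an error is returned to the
/// caller instead of panicking.
fn sync_headers(keep_running: &AtomicBool, total: u64, batch: u64) -> Result<u64, SyncError> {
    // Guard against a zero batch size, which would never make progress.
    let batch = batch.max(1);
    let mut synced = 0;
    while synced < total {
        if !keep_running.load(Ordering::SeqCst) {
            return Err(SyncError::ShutdownRequested { synced });
        }
        // Stand-in for downloading and validating one batch of headers.
        synced = (synced + batch).min(total);
    }
    Ok(synced)
}
```

A caller would match on the result, log something like "Shutdown initiated during burnchain initialization" on `Err`, and then exit its run loop normally.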

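If network connectivity is lost while adding bootstrap nodes, address resolution is retried a fixed number of times before panicking (five attempts in the log below, with the delay doubling from 2s to 32s).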
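A bounded retry with a doubling delay, similar in spirit to what the log below shows, might look like this. It is a sketch under assumptions: `retry_with_backoff` is a hypothetical helper, not the function in `config.rs`, and unlike the logged behavior it returns the last error without sleeping after the final attempt, leaving the decision to panic to the caller.

```rust
use std::fmt::Display;
use std::thread::sleep;
use std::time::Duration;

/// Call `op` until it succeeds or `max_attempts` calls have failed
/// (it is always called at least once). The delay doubles after
/// each failed attempt. Returns the last error if every attempt fails.
fn retry_with_backoff<T, E: Display>(
    max_attempts: u32,
    initial_delay: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut delay = initial_delay;
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: hand the error back to the caller.
            Err(e) if attempt >= max_attempts => return Err(e),
            Err(e) => {
                eprintln!("Attempt {attempt} - {e}. Retrying in {delay:?}...");
                sleep(delay);
                delay *= 2;
                attempt += 1;
            }
        }
    }
}
```

For host resolution, the closure passed as `op` could wrap the standard library's `ToSocketAddrs::to_socket_addrs` on the host:port string, with `max_attempts` set to 5 and `initial_delay` to two seconds to mirror the log.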

Applicable issues

Additional info (benefits, drawbacks, caveats)

This PR does not aim to eliminate all panics. It only handles the specific cases mentioned in the issues above.

Logs

Ctrl-C:

Running /Users/marzi/Projects/graceful-shutdown/stacks-core/target/debug/stacks-node start --config ./conf/mainnet-follower-conf.toml
INFO [1709428155.649663] [testnet/stacks-node/src/main.rs:295] [main] stacks-node 0.1.0 (:, debug build, macos [aarch64])
INFO [1709428155.649798] [testnet/stacks-node/src/main.rs:357] [main] Loading config at path ./conf/mainnet-follower-conf.toml
WARN [1709428156.429226] [stackslib/src/chainstate/coordinator/mod.rs:3439] [main] Sortition DB /tmp/stacks-node-1709428155650/mainnet/burnchain/sortition does not exist; assuming it will be instantiated with the correct version
WARN [1709428156.429261] [stackslib/src/chainstate/coordinator/mod.rs:3455] [main] Chainstate DB /tmp/stacks-node-1709428155650/mainnet/chainstate does not exist; assuming it will be instantiated with the correct version
INFO [1709428156.429278] [testnet/stacks-node/src/run_loop/neon.rs:430] [main] Start syncing Bitcoin headers, feel free to grab a cup of coffee, this can take a while
WARN [1709428156.429294] [stackslib/src/burnchains/burnchain.rs:664] [main] Failed to stat burnchain DB path '/tmp/stacks-node-1709428155650/mainnet/burnchain/burnchain.sqlite': Os { code: 2, kind: NotFound, message: "No such file or directory" }
INFO [1709428156.644475] [stackslib/src/burnchains/bitcoin/spv.rs:1286] [main] Syncing Bitcoin headers: 0.2% (2000 out of 832891)
INFO [1709428156.844205] [stackslib/src/burnchains/bitcoin/spv.rs:1286] [main] Syncing Bitcoin headers: 0.5% (4000 out of 832891)
WARN [1709428156.865688] [stackslib/src/burnchains/bitcoin/indexer.rs:426] [main] Unhandled error while receiving a message: Io(Custom { kind: ConnectionReset, error: "I/O error when processing message" })
INFO [1709428157.186218] [stackslib/src/burnchains/bitcoin/spv.rs:1286] [main] Syncing Bitcoin headers: 0.7% (6000 out of 832891)
INFO [1709428157.565523] [stackslib/src/burnchains/bitcoin/spv.rs:1286] [main] Syncing Bitcoin headers: 1.0% (8000 out of 832891)
INFO [1709428157.838768] [stackslib/src/burnchains/bitcoin/spv.rs:1286] [main] Syncing Bitcoin headers: 1.2% (10000 out of 832891)
INFO [1709428158.142795] [stackslib/src/burnchains/bitcoin/spv.rs:1286] [main] Syncing Bitcoin headers: 1.4% (12000 out of 832891)
INFO [1709428158.405623] [stackslib/src/burnchains/bitcoin/spv.rs:1286] [main] Syncing Bitcoin headers: 1.7% (14000 out of 832891)
INFO [1709428158.678508] [stackslib/src/burnchains/bitcoin/spv.rs:1286] [main] Syncing Bitcoin headers: 1.9% (16000 out of 832891)
INFO [1709428158.940691] [stackslib/src/burnchains/bitcoin/spv.rs:1286] [main] Syncing Bitcoin headers: 2.2% (18000 out of 832891)
INFO [1709428159.259745] [stackslib/src/burnchains/bitcoin/spv.rs:1286] [main] Syncing Bitcoin headers: 2.4% (20000 out of 832891)
^CGraceful termination request received (signal CtrlC), will complete the ongoing runloop cycles and terminate
INFO [1709428159.530655] [stackslib/src/burnchains/bitcoin/spv.rs:1286] [main] Syncing Bitcoin headers: 2.6% (22000 out of 832891)
ERRO [1709428159.535641] [stackslib/src/burnchains/burnchain.rs:607] [main] Failed to sync initial headers
ERRO [1709428159.635821] [testnet/stacks-node/src/burnchains/bitcoin_regtest_controller.rs:543] [main] Unable to sync with burnchain: Try synchronizing again
INFO [1709428159.635895] [testnet/stacks-node/src/run_loop/neon.rs:457] [main] Shutdown initiated during burnchain initialization: ChainsCoordinator closed
INFO [1709428159.636016] [testnet/stacks-node/src/run_loop/neon.rs:1030] [main] Exiting stacks-node

Network connectivity loss during add_bootstrap_node:

INFO [1709427268.387854] [testnet/stacks-node/src/main.rs:295] [main] stacks-node 0.1.0 (:, debug build, macos [aarch64])
INFO [1709427268.387963] [testnet/stacks-node/src/main.rs:357] [main] Loading config at path ./conf/mainnet-follower-conf.toml
INFO [1709427268.435543] [testnet/stacks-node/src/config.rs:942] [main] About to call set_bootstrap_nodes
INFO [1709427268.435614] [testnet/stacks-node/src/config.rs:1931] [main] about to sleep before hostport.to_socket_addrs
ERRO [1709427278.443863] [testnet/stacks-node/src/config.rs:1952] [main] Attempt 1 - Failed to resolve 'seed.mainnet.hiro.so:20444': failed to lookup address information: nodename nor servname provided, or not known. Retrying in 2s...
ERRO [1709427280.446148] [testnet/stacks-node/src/config.rs:1952] [main] Attempt 2 - Failed to resolve 'seed.mainnet.hiro.so:20444': failed to lookup address information: nodename nor servname provided, or not known. Retrying in 4s...
ERRO [1709427284.451537] [testnet/stacks-node/src/config.rs:1952] [main] Attempt 3 - Failed to resolve 'seed.mainnet.hiro.so:20444': failed to lookup address information: nodename nor servname provided, or not known. Retrying in 8s...
ERRO [1709427292.454097] [testnet/stacks-node/src/config.rs:1952] [main] Attempt 4 - Failed to resolve 'seed.mainnet.hiro.so:20444': failed to lookup address information: nodename nor servname provided, or not known. Retrying in 16s...
ERRO [1709427308.456645] [testnet/stacks-node/src/config.rs:1952] [main] Attempt 5 - Failed to resolve 'seed.mainnet.hiro.so:20444': failed to lookup address information: nodename nor servname provided, or not known. Retrying in 32s...
ERRO [1709427340.462413] [testnet/stacks-node/src/main.rs:272] [main] Process abort due to thread panic: panicked at testnet/stacks-node/src/config.rs:1950:25:
Failed to resolve 'seed.mainnet.hiro.so:20444' after 5 attempts: failed to lookup address information: nodename nor servname provided, or not known
ERRO [1709427340.523360] [testnet/stacks-node/src/main.rs:274] [main] Panic backtrace: 0: backtrace::backtrace::libunwind::trace
at /Users/marzi/.cargo/registry/src/index.crates.io-6f17d22bba15001f/backtrace-0.3.69/src/backtrace/libunwind.rs:93:5
backtrace::backtrace::trace_unsynchronized
at /Users/marzi/.cargo/registry/src/index.crates.io-6f17d22bba15001f/backtrace-0.3.69/src/backtrace/mod.rs:66:5
backtrace::backtrace::trace
at /Users/marzi/.cargo/registry/src/index.crates.io-6f17d22bba15001f/backtrace-0.3.69/src/backtrace/mod.rs:53:14
backtrace::capture::Backtrace::create
at /Users/marzi/.cargo/registry/src/index.crates.io-6f17d22bba15001f/backtrace-0.3.69/src/capture.rs:176:9
1: backtrace::capture::Backtrace::new
at /Users/marzi/.cargo/registry/src/index.crates.io-6f17d22bba15001f/backtrace-0.3.69/src/capture.rs:140:22
2: stacks_node::main::{{closure}}
at src/main.rs:273:18
3: <alloc::boxed::Box<F,A> as core::ops::function::Fn>::call
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/alloc/src/boxed.rs:2029:9
std::panicking::rust_panic_with_hook
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:783:13
4: std::panicking::begin_panic_handler::{{closure}}
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:657:13
5: std::sys_common::backtrace::__rust_end_short_backtrace
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:171:18
6: rust_begin_unwind
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:645:5
7: core::panicking::panic_fmt
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:72:14
8: stacks_node::config::NodeConfig::add_bootstrap_node
at src/config.rs:1950:25
stacks_node::config::NodeConfig::set_bootstrap_nodes
at src/config.rs:1983:17
9: stacks_node::config::Config::from_config_default
at src/config.rs:943:13
10: stacks_node::config::Config::from_config_file
at src/config.rs:886:13
11: stacks_node::main
at src/main.rs:425:22
12: core::ops::function::FnOnce::call_once
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/ops/function.rs:250:5
std::sys_common::backtrace::__rust_begin_short_backtrace
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:155:18
13: std::rt::lang_start::{{closure}}
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/rt.rs:166:18
14: core::ops::function::impls::<impl core::ops::function::FnOnce for &F>::call_once
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/ops/function.rs:284:13
std::panicking::try::do_call
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:552:40
std::panicking::try
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:516:19
std::panic::catch_unwind
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panic.rs:142:14
std::rt::lang_start_internal::{{closure}}
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/rt.rs:148:48
std::panicking::try::do_call
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:552:40
std::panicking::try
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:516:19
std::panic::catch_unwind
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panic.rs:142:14
std::rt::lang_start_internal
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/rt.rs:148:20
15: _main

@8marz8 8marz8 requested a review from cylewitruk March 4, 2024 14:58
@8marz8
Contributor Author

8marz8 commented Mar 4, 2024

@cylewitruk Could you let me know if this satisfies the fix you wanted for those issues?


codecov bot commented Mar 4, 2024

Codecov Report

Attention: Patch coverage is 42.10526%, with 33 lines in your changes missing coverage. Please review.

Project coverage is 60.08%. Comparing base (9be0b65) to head (5ea730c).

Additional details and impacted files
@@             Coverage Diff             @@
##             next    #4474       +/-   ##
===========================================
- Coverage   77.93%   60.08%   -17.85%     
===========================================
  Files         453      453               
  Lines      325564   325610       +46     
  Branches      318      318               
===========================================
- Hits       253732   195649    -58083     
- Misses      71824   129953    +58129     
  Partials        8        8               
Files Coverage Δ
stackslib/src/burnchains/mod.rs 82.35% <0.00%> (-0.38%) ⬇️
testnet/stacks-node/src/run_loop/nakamoto.rs 84.93% <40.00%> (-1.39%) ⬇️
testnet/stacks-node/src/run_loop/neon.rs 83.97% <50.00%> (-1.10%) ⬇️
testnet/stacks-node/src/config.rs 60.47% <38.09%> (-7.51%) ⬇️

... and 323 files with indirect coverage changes


Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@netrome netrome self-requested a review March 4, 2024 16:17
@8marz8 8marz8 marked this pull request as ready for review March 4, 2024 22:02
@8marz8 8marz8 enabled auto-merge March 4, 2024 22:16
@8marz8 8marz8 added this pull request to the merge queue Mar 4, 2024
Merged via the queue into next with commit afe7e68 Mar 4, 2024
1 of 2 checks passed
@8marz8 8marz8 deleted the chore/graceful-shutdown branch April 8, 2024 17:26
4 participants