fix: possible panic on server shutdown #4298

Merged Feb 13, 2024 (1 commit)

@dpc commented Feb 11, 2024

https://github.com/fedimint/fedimint/actions/runs/7860754434/job/21454090481?pr=4292

```
fedimint-test-all-ci> 00:08:10 2024-02-11T17:47:35.994448Z ERROR AlephBFT-backup-saver: receiver of alert data to save closed early
fedimint-test-all-ci> 00:08:10 thread 'tokio-runtime-worker' panicked at /build/source/fedimint-server/src/atomic_broadcast/spawner.rs:30:29:
fedimint-test-all-ci> 00:08:10 We own the rx.: ()
fedimint-test-all-ci> 00:08:10 stack backtrace:
fedimint-test-all-ci> 00:08:10    0: rust_begin_unwind
fedimint-test-all-ci> 00:08:10              at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:595:5
fedimint-test-all-ci> 00:08:10    1: core::panicking::panic_fmt
fedimint-test-all-ci> 00:08:10              at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/panicking.rs:67:14
fedimint-test-all-ci> 00:08:10    2: core::result::unwrap_failed
fedimint-test-all-ci> 00:08:10              at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/result.rs:1652:5
fedimint-test-all-ci> 00:08:10    3: core::result::Result<T,E>::expect
fedimint-test-all-ci> 00:08:10              at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/result.rs:1034:23
fedimint-test-all-ci> 00:08:10    4: {async_block#0}<fedimint_aleph_bft::runway::run::{async_fn#0}::{async_block_env#1}<fedimint_server::atomic_broadcast::network::Hasher, fedimint_server::atomic_broadcast::data_provider::UnitData, fedimint_server::atomic_broadcast::backup::UnitSaver, std::io::cursor::Cursor<alloc::vec::Vec<u8, alloc::alloc::Global>>, fedimint_server::atomic_broadcast::keychain::Keychain, fedimint_server::atomic_broadcast::data_provider::DataProvider, fedimint_server::atomic_broadcast::finalization_handler::FinalizationHandler, fedimint_server::atomic_broadcast::spawner::Spawner>>
fedimint-test-all-ci> 00:08:10              at /build/source/fedimint-server/src/atomic_broadcast/spawner.rs:30:13
fedimint-test-all-ci> 00:08:10    5: poll<fedimint_server::atomic_broadcast::spawner::{impl#2}::spawn_essential::{async_block_env#0}<fedimint_aleph_bft::runway::run::{async_fn#0}::{async_block_env#1}<fedimint_server::atomic_broadcast::network::Hasher, fedimint_server::atomic_broadcast::data_provider::UnitData, fedimint_server::atomic_broadcast::backup::UnitSaver, std::io::cursor::Cursor<alloc::vec::Vec<u8, alloc::alloc::Global>>, fedimint_server::atomic_broadcast::keychain::Keychain, fedimint_server::atomic_broadcast::data_provider::DataProvider, fedimint_server::atomic_broadcast::finalization_handler::FinalizationHandler, fedimint_server::atomic_broadcast::spawner::Spawner>>>
```

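Nothing guarantees that the handle is still alive when the task finishes: it may already have been dropped, e.g. because the whole server is shutting down. In that case the `expect` at spawner.rs:30 fails and the worker thread panics.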
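
A minimal sketch of the failure mode, under assumptions: the `We own the rx.` message suggests that the spawned task reports completion over a channel whose receiving end lives in the returned handle, but the actual spawner.rs code and diff are not shown here. `std::sync::mpsc` stands in for whatever channel type the real code uses, and `notify_done` is a hypothetical helper, not a fedimint function.

```rust
use std::sync::mpsc;

/// Hypothetical helper: signal task completion to whoever holds the receiver.
/// Returns `false` when nobody is listening anymore.
fn notify_done(tx: mpsc::Sender<()>) -> bool {
    // `send` fails only if the receiver was dropped. During shutdown that is
    // an expected situation, so report it instead of calling `expect`.
    tx.send(()).is_ok()
}

fn main() {
    // Normal case: the handle (receiver) is still alive.
    let (tx, rx) = mpsc::channel();
    assert!(notify_done(tx));
    assert_eq!(rx.recv(), Ok(()));

    // Shutdown case: the handle was dropped before the task finished.
    let (tx, rx) = mpsc::channel::<()>();
    drop(rx);
    // An `expect` on this result would panic, as in the backtrace above.
    assert!(tx.send(()).is_err());
    // Tolerating the error lets the task finish quietly.
    assert!(!notify_done(tx));
}
```

Whether to ignore the failed send silently or log it at a low level is a judgment call; the point is only that a dropped receiver should not be treated as an invariant violation.
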
@dpc requested a review from a team as a code owner February 11, 2024 18:30
@justinmoon added this pull request to the merge queue Feb 12, 2024
@github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 12, 2024
@dpc added this pull request to the merge queue Feb 13, 2024
Merged via the queue into fedimint:master with commit b425e52 Feb 13, 2024
22 checks passed
@dpc deleted the 24-02-11-fix-panic-alephbft branch February 13, 2024 03:22