Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Broken fetcher invariant debug panic #6765

Closed
Rjected opened this issue Feb 23, 2024 · 8 comments · Fixed by #7417
Closed

Broken fetcher invariant debug panic #6765

Rjected opened this issue Feb 23, 2024 · 8 comments · Fixed by #7417
Labels
A-tx-pool Related to the transaction mempool C-bug An unexpected or incorrect behavior

Comments

@Rjected
Copy link
Member

Rjected commented Feb 23, 2024

Ran into this panic while running in debug mode:

2024-02-23T19:12:08.285804Z  INFO reth::cli: Status connected_peers=5 freelist=92241 latest_block=1005517
thread 'tokio-runtime-worker' panicked at crates/net/network/src/transactions/fetcher.rs:259:13:
`%hash` in `@buffered_hashes` that's not in `@unknown_hashes`, `@buffered_hashes` should be a subset of keys in `@unknown_hashes`, broken invariant `@buffered_hashes` and `@unknown_hashes`,
`%hash`: 0xf4dae5e5ca9b1aeb204e7998a5544dabf632b2ed760e0f404017a8e83232d0f4,
... OMITTED
stack backtrace:
   0: _rust_begin_unwind
   1: core::panicking::panic_fmt
   2: reth_network::transactions::fetcher::TransactionFetcher::buffer_hashes
   3: reth_network::transactions::TransactionsManager<Pool>::request_buffered_hashes
   4: <reth_network::transactions::TransactionsManager<Pool> as core::future::future::Future>::poll
   5: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::future::future::Future>::poll
   6: <futures_util::future::future::catch_unwind::CatchUnwind<Fut> as core::future::future::Future>::poll::{{closure}}
   7: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
   8: std::panicking::try::do_call
   9: ___rust_try
  10: std::panicking::try
  11: std::panic::catch_unwind
  12: <futures_util::future::future::catch_unwind::CatchUnwind<Fut> as core::future::future::Future>::poll
  13: <F as futures_core::future::TryFuture>::try_poll
  14: <futures_util::future::try_future::into_future::IntoFuture<Fut> as core::future::future::Future>::poll
  15: <futures_util::future::future::map::Map<Fut,F> as core::future::future::Future>::poll
  16: <futures_util::future::future::Map<Fut,F> as core::future::future::Future>::poll
  17: <futures_util::future::try_future::MapErr<Fut,F> as core::future::future::Future>::poll
  18: <tracing_futures::Instrumented<T> as core::future::future::Future>::poll
  19: <core::pin::Pin<P> as core::future::future::Future>::poll
  20: futures_util::future::future::FutureExt::poll_unpin
  21: <futures_util::future::select::Select<A,B> as core::future::future::Future>::poll
  22: reth_tasks::TaskExecutor::spawn_critical_as::{{closure}}
  23: tokio::runtime::task::core::Core<T,S>::poll::{{closure}}
  24: tokio::runtime::task::core::Core<T,S>::poll
  25: tokio::runtime::task::harness::poll_future::{{closure}}
  26: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
  27: std::panicking::try::do_call
  28: ___rust_try
  29: std::panicking::try
  30: std::panic::catch_unwind
  31: tokio::runtime::task::harness::poll_future
  32: tokio::runtime::task::harness::Harness<T,S>::poll_inner
  33: tokio::runtime::task::harness::Harness<T,S>::poll
  34: tokio::runtime::task::raw::poll
  35: tokio::runtime::task::raw::RawTask::poll
  36: tokio::runtime::task::LocalNotified<S>::run
  37: tokio::runtime::scheduler::multi_thread::worker::Context::run_task::{{closure}}
  38: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
  39: tokio::runtime::scheduler::multi_thread::worker::Context::run
  40: tokio::runtime::scheduler::multi_thread::worker::run::{{closure}}::{{closure}}
  41: tokio::runtime::context::scoped::Scoped<T>::set
  42: tokio::runtime::context::set_scheduler::{{closure}}
  43: std::thread::local::LocalKey<T>::try_with
  44: std::thread::local::LocalKey<T>::with
  45: tokio::runtime::context::set_scheduler
  46: tokio::runtime::scheduler::multi_thread::worker::run::{{closure}}
  47: tokio::runtime::context::runtime::enter_runtime
  48: tokio::runtime::scheduler::multi_thread::worker::run
  49: tokio::runtime::scheduler::multi_thread::worker::Launch::launch::{{closure}}
  50: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
  51: tokio::runtime::task::core::Core<T,S>::poll::{{closure}}
  52: tokio::runtime::task::core::Core<T,S>::poll
  53: tokio::runtime::task::harness::poll_future::{{closure}}
  54: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
  55: std::panicking::try::do_call
  56: ___rust_try
  57: std::panicking::try
  58: std::panic::catch_unwind
  59: tokio::runtime::task::harness::poll_future
  60: tokio::runtime::task::harness::Harness<T,S>::poll_inner
  61: tokio::runtime::task::harness::Harness<T,S>::poll
  62: tokio::runtime::task::raw::poll
  63: tokio::runtime::task::raw::RawTask::poll
  64: tokio::runtime::task::UnownedTask<S>::run
  65: tokio::runtime::blocking::pool::Task::run
  66: tokio::runtime::blocking::pool::Inner::run
  67: tokio::runtime::blocking::pool::Spawner::spawn_thread::{{closure}}
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
2024-02-23T19:12:28.608538Z ERROR reth_tasks: Critical task `p2p txpool` panicked: ``%hash` in `@buffered_hashes` that's not in `@unknown_hashes`, `@buffered_hashes` should be a subset of keys in `@unknown_hashes`, broken invariant `@buffered_hashes` and `@unknown_hashes`,
`%hash`: 0xf4dae5e5ca9b1aeb204e7998a5544dabf632b2ed760e0f404017a8e83232d0f4,

(the rest of the log is omitted because it's a gigantic LRU)

@Rjected Rjected added C-bug An unexpected or incorrect behavior A-tx-pool Related to the transaction mempool labels Feb 23, 2024
@Rjected Rjected changed the title Broken fetcher invariant debug paninc Broken fetcher invariant debug panic Feb 23, 2024
@emhane
Copy link
Member

emhane commented Feb 27, 2024

possibly related contain-rs/linked-hash-map#110

Copy link
Contributor

This issue is stale because it has been open for 21 days with no activity.

@github-actions github-actions bot added the S-stale This issue/PR is stale and will close with no further activity label Mar 20, 2024
@mattsse
Copy link
Collaborator

mattsse commented Mar 20, 2024

@emhane is still an issue?

@github-actions github-actions bot removed the S-stale This issue/PR is stale and will close with no further activity label Mar 21, 2024
@emhane
Copy link
Member

emhane commented Mar 23, 2024

@emhane is still an issue?

no, I don't think so, I think this is defo due to a memory leak in LinkedHashSet dep as linked in previous comment, since I proofed where we call remove on @buffered_hashes and @unknown_hashes. removing an entry from @unknown_hashes is done atomically with removing it from @buffered_hashes.

@emhane emhane closed this as completed Mar 23, 2024
@ryanschneider
Copy link
Contributor

FWIW I seem to be hitting this same panic or at least a very similar one when debugging as well, this is on main commit 0c363ea with reth sitting idle in my debugger (also my CL is currently stopped).

2024-04-01T19:54:19.061407Z  INFO Status connected_peers=10 freelist=5176003 latest_block=19542357
thread 'tokio-runtime-worker' panicked at crates/net/network/src/transactions/fetcher.rs:378:13:
`%hash` in `@buffered_hashes` that's not in `@hashes_fetch_inflight_and_pending_fetch`, `@buffered_hashes` should be a subset of keys in `@hashes_fetch_inflight_and_pending_fetch`, broken invariant `@buffered_hashes` and `@hashes_fetch_inflight_and_pending_fetch`,
`%hash`: 0xe6198b6989ba9cfde9cd7d7d807325890f6d6f918ed7b4b72aa3bd9f67e02113
stack backtrace:
   0: rust_begin_unwind
             at /rustc/aedd173a2c086e558c2b66d3743b344f977621a7/library/std/src/panicking.rs:647:5
   1: core::panicking::panic_fmt
             at /rustc/aedd173a2c086e558c2b66d3743b344f977621a7/library/core/src/panicking.rs:72:14
   2: reth_network::transactions::fetcher::TransactionFetcher::buffer_hashes
             at ./crates/net/network/src/transactions/fetcher.rs:378:13
   3: reth_network::transactions::TransactionsManager<Pool>::on_new_pooled_transaction_hashes
             at ./crates/net/network/src/transactions/mod.rs:755:13
   4: reth_network::transactions::TransactionsManager<Pool>::on_network_tx_event
             at ./crates/net/network/src/transactions/mod.rs:837:17
   5: <reth_network::transactions::TransactionsManager<Pool> as core::future::future::Future>::poll::{{closure}}
             at ./crates/net/network/src/transactions/mod.rs:1321:21
   6: <reth_network::transactions::TransactionsManager<Pool> as core::future::future::Future>::poll
             at ./crates/net/network/src/transactions/mod.rs:1315:36
   7: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::future::future::Future>::poll
             at /rustc/aedd173a2c086e558c2b66d3743b344f977621a7/library/core/src/panic/unwind_safe.rs:297:9
   8: <futures_util::future::future::catch_unwind::CatchUnwind<Fut> as core::future::future::Future>::poll::{{closure}}
             at /home/ryanschneider/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/future/future/catch_unwind.rs:36:42
   9: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at /rustc/aedd173a2c086e558c2b66d3743b344f977621a7/library/core/src/panic/unwind_safe.rs:272:9
  10: std::panicking::try::do_call
             at /rustc/aedd173a2c086e558c2b66d3743b344f977621a7/library/std/src/panicking.rs:554:40

@Rjected
Copy link
Member Author

Rjected commented Apr 1, 2024

FWIW I seem to be hitting this same panic or at least a very similar one when debugging as well, this is on main commit 0c363ea with reth sitting idle in my debugger (also my CL is currently stopped).

2024-04-01T19:54:19.061407Z  INFO Status connected_peers=10 freelist=5176003 latest_block=19542357
thread 'tokio-runtime-worker' panicked at crates/net/network/src/transactions/fetcher.rs:378:13:
`%hash` in `@buffered_hashes` that's not in `@hashes_fetch_inflight_and_pending_fetch`, `@buffered_hashes` should be a subset of keys in `@hashes_fetch_inflight_and_pending_fetch`, broken invariant `@buffered_hashes` and `@hashes_fetch_inflight_and_pending_fetch`,
`%hash`: 0xe6198b6989ba9cfde9cd7d7d807325890f6d6f918ed7b4b72aa3bd9f67e02113
stack backtrace:
   0: rust_begin_unwind
             at /rustc/aedd173a2c086e558c2b66d3743b344f977621a7/library/std/src/panicking.rs:647:5
   1: core::panicking::panic_fmt
             at /rustc/aedd173a2c086e558c2b66d3743b344f977621a7/library/core/src/panicking.rs:72:14
   2: reth_network::transactions::fetcher::TransactionFetcher::buffer_hashes
             at ./crates/net/network/src/transactions/fetcher.rs:378:13
   3: reth_network::transactions::TransactionsManager<Pool>::on_new_pooled_transaction_hashes
             at ./crates/net/network/src/transactions/mod.rs:755:13
   4: reth_network::transactions::TransactionsManager<Pool>::on_network_tx_event
             at ./crates/net/network/src/transactions/mod.rs:837:17
   5: <reth_network::transactions::TransactionsManager<Pool> as core::future::future::Future>::poll::{{closure}}
             at ./crates/net/network/src/transactions/mod.rs:1321:21
   6: <reth_network::transactions::TransactionsManager<Pool> as core::future::future::Future>::poll
             at ./crates/net/network/src/transactions/mod.rs:1315:36
   7: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::future::future::Future>::poll
             at /rustc/aedd173a2c086e558c2b66d3743b344f977621a7/library/core/src/panic/unwind_safe.rs:297:9
   8: <futures_util::future::future::catch_unwind::CatchUnwind<Fut> as core::future::future::Future>::poll::{{closure}}
             at /home/ryanschneider/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/future/future/catch_unwind.rs:36:42
   9: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at /rustc/aedd173a2c086e558c2b66d3743b344f977621a7/library/core/src/panic/unwind_safe.rs:272:9
  10: std::panicking::try::do_call
             at /rustc/aedd173a2c086e558c2b66d3743b344f977621a7/library/std/src/panicking.rs:554:40

yeah, I recently ran into this as well and don't think it's due to LinkedHashSet or memory leaks, so I'm reopening this issue

@Rjected Rjected reopened this Apr 1, 2024
@emhane
Copy link
Member

emhane commented Apr 1, 2024

ah I think this has to do with the change of bounding the LruMap that stores @hashes_fetch_inflight_and_pending_fetch. Originally it was unlimited. will remove that assertion, sine being a bounded LruMap now, we cannot strictly assumes that @buffered_hashes is a subset of @hashes_fetch_inflight_and_pending_fetch.

@ryanschneider
Copy link
Contributor

ryanschneider commented Apr 2, 2024

Hmm, even with #7417 I'm still hitting the new panic! (the full output scrolls way beyond my scrollback buffer I was scrolling too far back, I had the panic happen twice, full output is now below):

thread 'tokio-runtime-worker' panicked at crates/net/network/src/transactions/fetcher.rs:671:21:
`%new_announced_hashes` should been taken out of buffer before packing in a request, breaks invariant `@hashes_pending_fetch` and
`@inflight_requests`,
`@hashes_fetch_inflight_and_pending_fetch` for `%new_announced_hashes`: [(0x6df1268b09cef28cbcef825c5abed33abfabc1927f84b661ad70d3faed907c19, Some(TxFetchMetadata { retries: 1, fallback_peers: LruCache { limit: 3, res_fn_iter: Iter: { } }, tx_encoded_length: Some(108) })), (0x4dce8f2a7c25ebddd233eeb58a2b275e30be3fbdc4d13a977f39594998d479a4, Some(TxFetchMetadata { retries: 2, fallback_peers: LruCache { limit: 3, res_fn_iter: Iter: { } }, tx_encoded_length: Some(109) })), (0x18ae947d79aa95fa56ec906774a81dc7576e84f481c3905c2a8eaed54d1042de, Some(TxFetchMetadata { retries: 0, fallback_peers: LruCache { limit: 3, res_fn_iter: Iter: { } }, tx_encoded_length: None })), (0xabb8b020e1bcb5604b7ad4c6d51ed3445908c9798c7fbdf0a96c5a033fd1812a, Some(TxFetchMetadata { retries: 0, fallback_peers: LruCache { limit: 3, res_fn_iter: Iter: { } }, tx_encoded_length: None })), (0xb8209630696c1efe18b93057867e453a0dd11ffb6886617f0c15cb97b27071b9, Some(TxFetchMetadata { retries: 2, fallback_peers: LruCache { limit: 3, res_fn_iter: Iter: { } }, tx_encoded_length: Some(180) })), (0xe6e4aaa83633a5924825e0aaea35adc2b56acb033b99879829988a5cf1983cdd, Some(TxFetchMetadata { retries: 0, fallback_peers: LruCache { limit: 3, res_fn_iter: Iter: { } }, tx_encoded_length: None })), (0x1dccbb5714862f3a713981e655f91c23ab8d6cbd7fa824b5a1dbe77d687e37ea, Some(TxFetchMetadata { retries: 0, fallback_peers: LruCache { limit: 3, res_fn_iter: Iter: { } }, tx_encoded_length: None }))]
stack backtrace:
   0: rust_begin_unwind
             at /rustc/aedd173a2c086e558c2b66d3743b344f977621a7/library/std/src/panicking.rs:647:5
   1: core::panicking::panic_fmt
             at /rustc/aedd173a2c086e558c2b66d3743b344f977621a7/library/core/src/panicking.rs:72:14
   2: reth_network::transactions::fetcher::TransactionFetcher::request_transactions_from_peer
             at ./crates/net/network/src/transactions/fetcher.rs:671:21
   3: reth_network::transactions::TransactionsManager<Pool>::on_new_pooled_transaction_hashes
             at ./crates/net/network/src/transactions/mod.rs:798:13
   4: reth_network::transactions::TransactionsManager<Pool>::on_network_tx_event
             at ./crates/net/network/src/transactions/mod.rs:837:17
   5: <reth_network::transactions::TransactionsManager<Pool> as core::future::future::Future>::poll::{{closure}}
             at ./crates/net/network/src/transactions/mod.rs:1321:21
   6: <reth_network::transactions::TransactionsManager<Pool> as core::future::future::Future>::poll
             at ./crates/net/network/src/transactions/mod.rs:1315:36
   7: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::future::future::Future>::poll
             at /rustc/aedd173a2c086e558c2b66d3743b344f977621a7/library/core/src/panic/unwind_safe.rs:297:9
   8: <futures_util::future::future::catch_unwind::CatchUnwind<Fut> as core::future::future::Future>::poll::{{closure}}
             at /home/ryanschneider/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/future/future/catch_unwind.rs:36:42
   9: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at /rustc/aedd173a2c086e558c2b66d3743b344f977621a7/library/core/src/panic/unwind_safe.rs:272:9
  10: std::panicking::try::do_call
             at /rustc/aedd173a2c086e558c2b66d3743b344f977621a7/library/std/src/panicking.rs:554:40

In this particular case the hash in question is:

p hash
(&alloy_primitives::bits::fixed::FixedBytes<32>) 0x00007bff7000ff60
p *hash
(alloy_primitives::bits::fixed::FixedBytes<32>) {
  0 = {
    [0] = 24
    [1] = 174
    [2] = 148
    [3] = 125
    [4] = 121
    [5] = 170
    [6] = 149
    [7] = 250
    [8] = 86
    [9] = 236
    [10] = 144
    [11] = 103
    [12] = 116
    [13] = 168
    [14] = 29
    [15] = 199
    [16] = 87
    [17] = 110
    [18] = 132
    [19] = 244
    [20] = 129
    [21] = 195
    [22] = 144
    [23] = 92
    [24] = 42
    [25] = 142
    [26] = 174
    [27] = 213
    [28] = 77
    [29] = 16
    [30] = 66
    [31] = 222
  }
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-tx-pool Related to the transaction mempool C-bug An unexpected or incorrect behavior
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

4 participants