[6.0.0-beta.1] panic with partial-tracing for debug lvl on reconnect #73

Closed
DDtKey opened this issue Dec 7, 2022 · 5 comments · Fixed by #75
Comments

@DDtKey
Contributor

DDtKey commented Dec 7, 2022

When I tested re-connection with the current version on the main branch, I hit a panic once I enabled debug-level tracing.

2022-12-07T09:55:31.463755Z DEBUG fred::multiplexer::clustered: fred-mpjrO24y5w: Finish synchronizing cluster connections.
    at /usr/local/cargo/git/checkouts/fred.rs-3fb6d817444a1870/4f0c3f4/src/multiplexer/clustered.rs:622 on reactor
thread 'reactor' panicked at 'tried to clone Id(2251799813685263), but no span exists with that ID
This may be caused by consuming a parent span (`parent: span`) rather than borrowing it (`parent: &span`).', /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/tracing-subscriber-0.3.16/src/registry/sharded.rs:311:32
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

It doesn't appear when I use the info level, and reconnection works totally fine (with this fix).
Thanks for all your efforts @aembke!
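
For context, the panic comes from tracing-subscriber's sharded registry: `clone_span` runs when a new span names an explicit parent, and it panics if that parent `Id` has already been closed and evicted. A minimal sketch of that failure mode (illustrative only, not fred code; it just assumes a registry-backed subscriber like the one partial-tracing installs):

use tracing::info_span;
use tracing_subscriber::prelude::*;

fn main() {
    // Registry-backed subscriber, similar to a typical tracing setup.
    tracing_subscriber::registry()
        .with(tracing_subscriber::fmt::layer())
        .init();

    // Keep only the span Id and drop the Span handle, so the registry
    // evicts the span data once its last reference is gone.
    let stale_parent = {
        let parent = info_span!("connection");
        parent.id()
    };

    // Creating a child against the stale Id forces the registry to
    // clone_span() an entry that no longer exists -> the panic above.
    let _child = info_span!(parent: stale_parent, "command");
}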

@aembke
Owner

aembke commented Dec 7, 2022

Ah yeah, fixing the tracing logic is the last thing on my TODO list before I push the beta to crates.io. I'll hopefully have an update here later today or tomorrow.

@aembke
Owner

aembke commented Dec 8, 2022

@DDtKey can you provide a bit more info on how you configured the tracing logic?

I just pushed some changes to fix/tracing that update the docs and OTEL dependency versions, but everything seems to be working fine for me with both partial-tracing and full-tracing now on that branch.

Would you mind trying to repro this with the bin/pipeline_test module against a local jaeger instance on fix/tracing? I have some install steps in the README there that show how I set everything up. I tried the following and they all worked for me, but perhaps we're configuring things differently.

RUST_LOG=debug cargo run --release --features partial-tracing -- -C 1 -c 50 -P 1 -t
RUST_LOG=debug cargo run --release --features full-tracing -- -C 1 -c 50 -P 1 -t
RUST_LOG=debug cargo run --release --features stdout-tracing -- -C 1 -c 50 -P 1 -t

@DDtKey
Contributor Author

DDtKey commented Dec 9, 2022

@aembke Hey, sure.
Take a look at #74 (draft PR with description)

aembke mentioned this issue Dec 10, 2022
@DDtKey
Contributor Author

DDtKey commented Dec 20, 2022

@aembke hey!
Unfortunately the issue is still relevant. I've encountered the same behavior with 6.0.0-beta.2 under high load, even with log level INFO. Moreover, it wasn't related to a restart of redis or anything similar; I simply overloaded redis with requests.
But other nodes (that weren't overloaded) connected to the same redis (also with fred) worked fine at the same time.

I'll try to repro that. It looks like some race condition: isn't it possible that a command times out after the corresponding check, but before its span is created?

thread 'reactor' panicked at 'tried to clone Id(2252074691592806), but no span exists with that ID
This may be caused by consuming a parent span (`parent: span`) rather than borrowing it (`parent: &span`).', /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/tracing-subscriber-0.3.16/src/registry/sharded.rs:311:32
   0: rust_begin_unwind
             at ./rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/panicking.rs:584:5
   1: core::panicking::panic_fmt
             at ./rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/panicking.rs:142:14
   2: <tracing_subscriber::registry::sharded::Registry as tracing_core::subscriber::Subscriber>::clone_span
   3: <tracing_subscriber::layer::layered::Layered<L,S> as tracing_core::subscriber::Subscriber>::new_span
   4: tracing::span::Span::make_with
   5: fred::multiplexer::utils::prepare_command
   6: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
   7: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
   8: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
   9: fred::multiplexer::commands::process_command::{{closure}}
  10: tokio::runtime::task::core::CoreStage<T>::poll
  11: tokio::runtime::task::raw::poll
  12: tokio::runtime::scheduler::multi_thread::worker::Context::run_task

Before that I see a lot of:

[fred::trace::enabled::wait_for_response - start]
[fred::trace::enabled::wait_for_response - end]
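
If that race is the cause (the parent span closing between the timeout check and `Span::make_with`), the general mitigation on the tracing side is to store an owned `Span` handle with the queued command instead of only its `Id`, and to create children by borrowing it (`parent: &span`), as the panic message hints. A hypothetical sketch of that pattern, with made-up names, not the actual fred internals:

use tracing::{info_span, Span};

// Hypothetical command wrapper. Storing the Span itself (which is
// reference-counted by the registry) keeps the parent alive until the
// command is fully processed, even if the caller side has already timed out.
struct QueuedCommand {
    name: &'static str,
    cmd_span: Span,
}

impl QueuedCommand {
    fn new(name: &'static str) -> Self {
        Self {
            name,
            cmd_span: info_span!("redis_command", cmd = name),
        }
    }

    // Borrow the stored parent rather than consuming it, so the registry
    // entry is still valid when the child span is created later.
    fn prepare_span(&self) -> Span {
        info_span!(parent: &self.cmd_span, "prepare_command", cmd = self.name)
    }
}

With this shape, even if the caller's future is dropped on timeout, the QueuedCommand held by the writer task keeps the parent span registered until prepare_span runs.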

@DDtKey
Contributor Author

DDtKey commented Dec 20, 2022

Oh, I was able to repro that. It looks related to back-pressure, because I start to hit it at high concurrency values.
Please check PR #88.
