[6.0.0-beta.1] panic with partial-tracing for debug lvl on reconnect #73

Closed
DDtKey opened this issue Dec 7, 2022 · 5 comments · Fixed by #75
Comments

@DDtKey
Contributor

DDtKey commented Dec 7, 2022

When I tested re-connection with the current version on the main branch, I hit a panic once I enabled debug-level tracing.

2022-12-07T09:55:31.463755Z DEBUG fred::multiplexer::clustered: fred-mpjrO24y5w: Finish synchronizing cluster connections.
    at /usr/local/cargo/git/checkouts/fred.rs-3fb6d817444a1870/4f0c3f4/src/multiplexer/clustered.rs:622 on reactor
thread 'reactor' panicked at 'tried to clone Id(2251799813685263), but no span exists with that ID
This may be caused by consuming a parent span (`parent: span`) rather than borrowing it (`parent: &span`).', /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/tracing-subscriber-0.3.16/src/registry/sharded.rs:311:32
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

It doesn't appear when I use the info level, and reconnection works totally fine (with this fix).
Thanks for all your efforts @aembke!
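
For context, the panic comes from tracing-subscriber's sharded registry: `clone_span` runs when a new span names an explicit parent, and it panics if that parent `Id` has already been closed and evicted. A minimal sketch of that failure mode (illustrative only, not fred code; it just assumes a registry-backed subscriber like the one partial-tracing installs):

use tracing::info_span;
use tracing_subscriber::prelude::*;

fn main() {
    // Registry-backed subscriber, similar to a typical tracing setup.
    tracing_subscriber::registry()
        .with(tracing_subscriber::fmt::layer())
        .init();

    // Keep only the span Id and drop the Span handle, so the registry
    // evicts the span data once its last reference is gone.
    let stale_parent = {
        let parent = info_span!("connection");
        parent.id()
    };

    // Creating a child against the stale Id forces the registry to
    // clone_span() an entry that no longer exists -> the panic above.
    let _child = info_span!(parent: stale_parent, "command");
}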

@aembke
Owner

aembke commented Dec 7, 2022

Ah yeah, fixing the tracing logic is the last thing on my TODO list before I push the beta to crates.io. I'll hopefully have an update here later today or tomorrow.

@aembke
Owner

aembke commented Dec 8, 2022

@DDtKey can you provide a bit more info on how you configured the tracing logic?

I just pushed some changes to fix/tracing that update the docs and OTEL dependency versions, but everything seems to be working fine for me with both partial-tracing and full-tracing now on that branch.

Would you mind trying to repro this with the bin/pipeline_test module against a local jaeger instance on fix/tracing? I have some install steps in the README there that show how I set everything up. I tried the following and they all worked for me, but perhaps we're configuring things differently.

RUST_LOG=debug cargo run --release --features partial-tracing -- -C 1 -c 50 -P 1 -t
RUST_LOG=debug cargo run --release --features full-tracing -- -C 1 -c 50 -P 1 -t
RUST_LOG=debug cargo run --release --features stdout-tracing -- -C 1 -c 50 -P 1 -t

@DDtKey
Contributor Author

DDtKey commented Dec 9, 2022

@aembke Hey, sure.
Take a look at #74 (draft PR with description)

aembke mentioned this issue Dec 10, 2022
@DDtKey
Contributor Author

DDtKey commented Dec 20, 2022

@aembke hey!
Unfortunately the issue is still relevant. I've encountered the same behavior with 6.0.0-beta.2 under high load, even with log level INFO. Moreover, it wasn't related to a restart of redis or anything similar; I simply overloaded redis with requests.
But other nodes (that weren't overloaded) connected to the same redis (also with fred) worked fine at the same time.

I'll try to repro that. It looks like some race condition: isn't it possible that a command times out after the corresponding check, but before its span is created?

thread 'reactor' panicked at 'tried to clone Id(2252074691592806), but no span exists with that ID
This may be caused by consuming a parent span (`parent: span`) rather than borrowing it (`parent: &span`).', /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/tracing-subscriber-0.3.16/src/registry/sharded.rs:311:32
   0: rust_begin_unwind
             at ./rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/panicking.rs:584:5
   1: core::panicking::panic_fmt
             at ./rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/panicking.rs:142:14
   2: <tracing_subscriber::registry::sharded::Registry as tracing_core::subscriber::Subscriber>::clone_span
   3: <tracing_subscriber::layer::layered::Layered<L,S> as tracing_core::subscriber::Subscriber>::new_span
   4: tracing::span::Span::make_with
   5: fred::multiplexer::utils::prepare_command
   6: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
   7: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
   8: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
   9: fred::multiplexer::commands::process_command::{{closure}}
  10: tokio::runtime::task::core::CoreStage<T>::poll
  11: tokio::runtime::task::raw::poll
  12: tokio::runtime::scheduler::multi_thread::worker::Context::run_task

Before that I see a lot of:

[fred::trace::enabled::wait_for_response - start]
[fred::trace::enabled::wait_for_response - end]
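
If that race is the cause (the parent span closing between the timeout check and `Span::make_with`), the general mitigation on the tracing side is to store an owned `Span` handle with the queued command instead of only its `Id`, and to create children by borrowing it (`parent: &span`), as the panic message hints. A hypothetical sketch of that pattern, with made-up names, not the actual fred internals:

use tracing::{info_span, Span};

// Hypothetical command wrapper. Storing the Span itself (which is
// reference-counted by the registry) keeps the parent alive until the
// command is fully processed, even if the caller side has already timed out.
struct QueuedCommand {
    name: &'static str,
    cmd_span: Span,
}

impl QueuedCommand {
    fn new(name: &'static str) -> Self {
        Self {
            name,
            cmd_span: info_span!("redis_command", cmd = name),
        }
    }

    // Borrow the stored parent rather than consuming it, so the registry
    // entry is still valid when the child span is created later.
    fn prepare_span(&self) -> Span {
        info_span!(parent: &self.cmd_span, "prepare_command", cmd = self.name)
    }
}

With this shape, even if the caller's future is dropped on timeout, the QueuedCommand held by the writer task keeps the parent span registered until prepare_span runs.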

@DDtKey
Contributor Author

DDtKey commented Dec 20, 2022

Oh, I was able to repro that. It looks related to back-pressure, because I start to hit it at high concurrency values.
Please check PR #88.
