Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix interleaved non-deferred spans #78

Merged
merged 4 commits into from
Mar 14, 2024

Conversation

ten3roberts
Copy link
Contributor

@ten3roberts ten3roberts commented Dec 12, 2023

Fixes: #74

Previously, spans were not retraced if they were printed by a nested span open event, as it was wrongfully assumed a span only printed once an event was generated by it, and not by its entering alone.

This further fixes verbose_entry combined with retrace, as the latter broke the indent line connectivity and only printed the immediate successor of the span, not the parent of the entered tree (as multiple spans can be entered at once during retracing)

1:main┐concurrent_eager::spawn_fut key="a"
1:main┐concurrent_eager::spawn_fut key="b"
1:main├───┐concurrent_eager::a 
1:main├───┘
1:main├───┐concurrent_eager::b 
1:main├───┘
1:main┘
1:main┘

This is now fixed, which means that retrace=true, deferred=false should now work properly for interleaved or multithreaded futures.

1:main┐concurrent_eager::spawn_fut key="a"
1:main┐concurrent_eager::spawn_fut key="b"
1:main┐concurrent_eager::spawn_fut key="a"
1:main├───┐concurrent_eager::a 
1:main├───┘
1:main┐concurrent_eager::spawn_fut key="b"
1:main├───┐concurrent_eager::b 
1:main├───┘
1:main┘
1:main┘

@ten3roberts
Copy link
Contributor Author

@safinaskar, let me know if this fixes your issue 😊

@ten3roberts
Copy link
Contributor Author

This is how it now looks with eager and retraced spans (fixed behavior)

image

All events and span events now correctly lie under their parents

And how it would look using eager without retrace (default behavior, unchanged)

image

With deferred spans, it will wait to print any span until some event is generated, thus

image

@ten3roberts
Copy link
Contributor Author

Would we consider making the retrace behavior default, as it does not change anything for non-concurrent code, but rather improves the concurrent aspect?

The current default behavior makes the tree misleading as events seem to come from spans which they do not.

@ten3roberts ten3roberts marked this pull request as ready for review December 13, 2023 12:32
@ten3roberts
Copy link
Contributor Author

Anyhow, this is now ready for review.

@oli-obk, would you mind taking a look?

@safinaskar
Copy link

Yes, my original issue is fixed in 09adf54 .

Span events still don't have timestamps. I asked for this in #74 (comment) . (Of course, this is feature request, not a bug.)

with_verbose_entry remains to be buggy. Example from #74 (comment) prints this:

 t1 
 t2 
 t1 
 t1 
     a1 
 t2 
 t2 
     a2 
     a1 
 t1 
     a1 
         b1 
     a2 
 t2 
     a2 
         b2

This output suggests that a1 is child of t2, and a2 is child of t1, which is wrong.

Also, I don't understand why with_verbose_entry will be needed at all if this pull request is to be accepted. Maybe it is good solution just to mark it as obsolete. If you think it is still needed, then, please, properly document, what it is needed for

@ten3roberts
Copy link
Contributor Author

Awesome.

Could you enable with_targets and with_indent_lines, it makes it easier to see which span is which and if it was opened or closed in these kinds of issues.

For the "if I enable with_deferred_spans nothing is printed", that is intended behavior, since it will defer all span opening and closing printing until there is an event, and if there isn't it is as if the span never happened. This is intended for cases where the spans are used to tell where an event came from, and the span itself is not necessarily important, such as if you open a span in a function which only logs on an error. You want the span to be printed with the error, but not necessarily every time when there is not error. I agree with you that "just make them an artificial event" is not a solution, because it loses the extra context of a "span" that wraps its content if it is flattened like that. I would like it to remain a branch like thing instead of a message what makes it hard to tell if the next message is related or not (which is the biggest strenght of this crate compared to tracing-subscriber::fmt).

As for the timestamps, I think it would be beneficial to open another issue with the timestamp. Though I am not a maintainer of this crate, I think it would be beneficial for some kind of "Formatter" for span prefixes, which would also allow you to customize the current duration within a span, E.g;13ms INFO ... and instead roll your own or have a set of included ones.

These features would be low maintenance since they do not interact with other functionality but rather just prints something before, in contrast to the span retracing which requires more testing due to its interaction and wider behavior change.

I'll be happy to add both if the crate authors deem them useful.

@safinaskar
Copy link

This is output of program from #74 (comment) with 09adf54 with with_targets and with_indent_lines, as you requested:

┐ooo::t1 
┐ooo::t2 
├─┐ooo::t1 
┐ooo::t1 
├───┐ooo::a1 
├─┐ooo::t2 
┐ooo::t2 
├───┐ooo::a2 
│   ├─┐ooo::a1 
┐ooo::t1 
├───┐ooo::a1 
│   ├───┐ooo::b1 
│   ├───┘
├───┘
┘
│   ├─┐ooo::a2 
┐ooo::t2 
├───┐ooo::a2 
│   ├───┐ooo::b2 
│   ├───┘
├───┘
┘

As you can see, the output is absolutely wrong

@ten3roberts
Copy link
Contributor Author

This is output of program from #74 (comment) with 09adf54 with with_targets and with_indent_lines, as you requested:

┐ooo::t1 
┐ooo::t2 
├─┐ooo::t1 
┐ooo::t1 
├───┐ooo::a1 
├─┐ooo::t2 
┐ooo::t2 
├───┐ooo::a2 
│   ├─┐ooo::a1 
┐ooo::t1 
├───┐ooo::a1 
│   ├───┐ooo::b1 
│   ├───┘
├───┘
┘
│   ├─┐ooo::a2 
┐ooo::t2 
├───┐ooo::a2 
│   ├───┐ooo::b2 
│   ├───┘
├───┘
┘

As you can see, the output is absolutely wrong

Thank you.

I've managed to look into this a little bit now, and the ooo::a2 is a pre-open span event, which prints the parent of any new span to show "where it opened from".

This does currently not play too well with multi-level retracing, so I'll fix that up.

pre-open

"open span b, parent of a":

├─ Event
├─ Event
├┐b (span b) // this one will look detached and disconned with the retrace line, and will not match the indent as a more than 1 level of spans may be printed depending on the previous span
 └┐ a (new span a)

@ten3roberts ten3roberts force-pushed the fix-interleaved-eager branch 2 times, most recently from e490512 to aa43680 Compare January 22, 2024 12:55
@ten3roberts
Copy link
Contributor Author

@safinaskar, I have now revised the logic for handling span pre-open logic.

Let me if it fixes your issue

@davidbarsky
Copy link
Owner

@ten3roberts sorry for the delay in looking at this; reviewing now!

Copy link
Owner

@davidbarsky davidbarsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good, thanks!

@ten3roberts
Copy link
Contributor Author

Thank you, glad to help

@ten3roberts
Copy link
Contributor Author

@davidbarsky

It seems Safin is not coming back soon.

Do you think we can merge this as is, as it is in a working state (or at least working much better than prior)?

@davidbarsky
Copy link
Owner

@davidbarsky

It seems Safin is not coming back soon.

Do you think we can merge this as is, as it is in a working state (or at least working much better than prior)?

@ten3roberts I think so? I'm not sure if I fully understand the implications of this change, but I'll run it tomorrow and merge it then.

@ten3roberts
Copy link
Contributor Author

Thanks.

This PR contains two bugfixes.

The first one is fixing an invalid assumption for non-deferred retraced spans. The retracing only happened for events, and not other span open events. This means that for deferred=false, retrace=true, events in previously opened spans would not be retraced to their ancestor, as the span was only printed when in on_new_span. This is now fixed.

The other bug that safin pointed out was in combining pre_open and retrace.

The previous behavior printed the new spans parent before the entire retraced chain.

Consider the following example, where the current span before the event in D is a a sibling of B, causing the tree (B - C - D) to be printed from the common ancestor of A, hereby referred to as "retrace chain". This can happen when switching between futures or threads with deeper nesting from their spawn points.

     *-- C (preopen)
*-- B
      *-- C
            *-- D (new span)

It worked when there was not mutliple levels that were printed at once (so that C and D line up), but didn't account for the less indented grandparent A.

The fix is either

  1. (parent of new span)
*-- B
      *-- C
            C (prepoen)
            *-- D (new span)

or

2 (parent of retrace root)

A (preopen)
*-- B
      *-- C
            *-- D (new span)

I orignally went with #1, but after considering a bit I think #2 looks better now (printing the parent of the root of the retraced chain, and not the parent of the new span. The parent of the new span will either be printed in the retrace chain, or be the parent of the retrace root for single chains.

Let me know which method you prefer, and I'll be happy to change things around :)

The previous behavior was to print the parent of the new span, leading
to a parent being printed twice if the parent was also included in the
retrace chain.
@ten3roberts
Copy link
Contributor Author

For a real example

use tracing::Instrument;

#[tracing::instrument]
async fn b1() {}

#[tracing::instrument]
async fn b2() {}

#[tracing::instrument]
async fn a1() {
    tokio::time::sleep(std::time::Duration::from_secs(2)).await;
    tracing::info!("sleep done");
    b1().await;
    tracing::info!("a1 done");
}

#[tracing::instrument]
async fn a2() {
    tokio::time::sleep(std::time::Duration::from_secs(2)).await;
    tracing::info!("sleep done");
    b2().await;
}

#[tracing::instrument]
async fn t1() {
    tokio::time::sleep(std::time::Duration::from_secs(2)).await;
    a1().await;
}

#[tracing::instrument]
async fn t2() {
    tokio::time::sleep(std::time::Duration::from_secs(2)).await;
    a2().await;
}

#[tokio::main(flavor = "current_thread")]
async fn main() {
    use tracing_subscriber::layer::SubscriberExt;
    use tracing_subscriber::util::SubscriberInitExt;
    tracing_subscriber::Registry::default()
        .with(
            tracing_tree::HierarchicalLayer::new(4)
                .with_span_retrace(true)
                .with_span_modes(true)
                .with_indent_lines(true)
                .with_verbose_entry(true),
        )
        .init();

    let _span = tracing::info_span!("main").entered();
    tracing::info!("foo");
    tokio::spawn(t1().instrument(tracing::info_span!("t1-task")));
    tokio::time::sleep(std::time::Duration::from_secs(1)).await;
    tokio::spawn(t2().instrument(tracing::info_span!("t2-task")));
    tokio::time::sleep(std::time::Duration::from_secs(10)).await;
}

Behavior #1 (as it always prints the parent of the new/leaf span, it prints t2 twice in the chain (retrace and pre_open)
image

Behavior #2 (no duplicates as it always prints the parent of the chain root (if present))
image

@ten3roberts
Copy link
Contributor Author

I've now pushed behavior #2 in 26b48a7

From the ui test output, you can see how it now would print the "server" span, rather than "countdowns" twice

image

@ten3roberts
Copy link
Contributor Author

@oli-obk could you have a look at this?

I'd appreciate any input on how you think this looks, as I consider it complete 😊

@oli-obk
Copy link
Collaborator

oli-obk commented Mar 14, 2024

Thanks! looks great.

@oli-obk oli-obk merged commit cdb2c5a into davidbarsky:main Mar 14, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Interleaved futures don't work properly if I specify with_span_retrace and don't specify with_deferred_spans
4 participants