Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tracing 0.1.38 included accidentally-breaking change from added Drop impl #2578

Open
jonhoo opened this issue Apr 28, 2023 · 19 comments
Open

Comments

@jonhoo
Copy link
Contributor

jonhoo commented Apr 28, 2023

Bug Report

Version

Builds fine:

tracing-regression v0.1.0 (/local/home/jongje/dev/tmp/tracing-regression)
└── tracing v0.1.37
    ├── tracing-attributes v0.1.24 (proc-macro)
    └── tracing-core v0.1.30

Is broken:

tracing-regression v0.1.0 (/local/home/jongje/dev/tmp/tracing-regression)
└── tracing v0.1.38
    ├── tracing-attributes v0.1.24 (proc-macro)
    └── tracing-core v0.1.30

Platform

Linux host 5.4.235-151.344.amzn2int.aarch64 #1 SMP Sat Mar 11 23:52:03 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux

Crates

tracing and tracing-attributes (I believe a Drop impl was added in the latter as well)

Description

The change in #2562 that was intended to fix #2541 is a breaking change (and indeed broke my code). In particular, we now have
https://github.com/ilslv/tracing/blob/73e73a9d7677a848c614c976ac176d56c889bb9b/tracing/src/instrument.rs#L307

whereas we didn't before. If you have impl Drop for type with a generic type parameter T, Rust assumes that the type T may be used in the Drop implementation, and therefore requires that T outlives the wrapper type. This wasn't previously the case — without the Drop, the T only had to live as long as the last use of the wrapper, but crucially not all the way until it is dropped. You can see this with code like:

use std::task::{Context, Poll};
use tower_service::Service;
use tracing::instrument::Instrument;

pub struct Svc;

impl tower_service::Service<()> for Svc {
    type Response = ();
    type Error = ();
    type Future = std::future::Ready<Result<Self::Response, Self::Error>>;

    fn poll_ready(&mut self, _: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
        Poll::Ready(Ok(()))
    }

    fn call(&mut self, _: ()) -> Self::Future {
        std::future::ready(Ok(()))
    }
}

pub async fn foo(s: tracing::Span, f: &mut Svc) {
    let mut _ready = f.ready().instrument(s.clone());
    // _ready is no longer used from this point on
    //
    // The type of _ready is Instrumented<Ready<&mut Svc>>
    // Rust only permits the &mut Svc to be used if
    // neither Ready nor Instrumented have a Drop impl.
    // This is because otherwise the implicit drop(_ready)
    // below might access &mut Svc, in which case any use
    // in the interrim would be unsound.
    //
    // (
    //   Alternatively, `impl Drop` is permitted if
    //   each <T> is marked #[may_dangle] as per
    //   https://github.com/rust-lang/rust/issues/34761
    // )
    let _ = s.in_scope(|| f.call(()));
    // implicit drop(_ready)
}

trait ServiceExt {
    fn ready(&mut self) -> std::future::Ready<&'_ mut Self>
    where
        Self: Sized,
    {
        std::future::ready(self)
    }
}

impl<T> ServiceExt for T where T: Service<()> {}

which will build fine with tracing 0.1.37, but no longer builds with tracing 0.1.38 (which has this change):

error[E0500]: closure requires unique access to `*f` but it is already borrowed
  --> src/lib.rs:37:24
   |
22 |     let mut _ready = f.ready().instrument(s.clone());
   |                      --------- borrow occurs here
...
37 |     let _ = s.in_scope(|| f.call(()));
   |                        ^^ - second borrow occurs due to use of `*f` in closure
   |                        |
   |                        closure construction occurs here
38 |     // implicit drop(_ready)
39 | }
   | - first borrow might be used here, when `_ready` is dropped and runs the `Drop` code for type `Instrumented`

This likely warrants a yank if enough people run into it. It could just be me, though this was in fairly innocuous code. Tracing it back to this change also wasn't trivial, so some folks may run into this but not realize how the issue arose for them in the first place.

The change will either need to be rolled back, or we'll need to include a #[may_dangle] annotation on the added impl Drop.

@jonhoo jonhoo changed the title tracing 0.1.38 included accidentally-breaking change tracing 0.1.38 included accidentally-breaking change from added Drop impl Apr 28, 2023
@hawkw
Copy link
Member

hawkw commented Apr 29, 2023

Oh, that's really unfortunate. Thanks for the report.

I'm willing to consider yanking tracing 0.1.38 for now while we discuss a longer-term solution. I'm not personally very familiar with the behavior of the #[may_dangle] annotation, so I'll have to read up on it.

I don't believe tracing-attributes will have to be yanked --- it was tracing-futures where a similar Drop impl was added.

@hawkw
Copy link
Member

hawkw commented Apr 29, 2023

I went ahead and yanked v0.1.38 until we figure out a long-term fix.

@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 29, 2023

Yeah, I think it's just tracing-futures (because of this impl) and tracing (because of this impl) that need to be yanked. I think I just saw tracing-attributes in the file diff list and assumed it was affected too.

As for #[may_dangle], the big problem is that it's still unstable (so nightly only), which means you'd still be breaking anyone not on nightly. The feature and its use is discussed in some detail in the Nomicon (and in one of my videos :p), but it's still in flux (the new RFC text is actually pretty helpful), so relying on it even on nightly is probably not a great call right now. Ultimately though, what you'd want is something like:

unsafe impl<#[may_dangle(droppable)] T> PinnedDrop for Instrumented<T> {

This may also end up requiring support from pin_project when that day comes.

The unsafe here is needed to assert that the drop impl indeed does not access T in any way except to drop it.

Which is all to say, I think our options are:

  1. Land this as-is and hope that the breakage is minimal.
  2. Cut a breaking release of tracing and tracing-futures.
  3. Revert the change and re-introduce it when #[may_dangle] is stabilized.

What's kind of annoying about option 2 is that you really do want the #[may_dangle] annotation here — without it, users of the library will find that entirely reasonable code (like what I gave above) does not compile. So if we cut a new major version of these crates, users would end up with worse ergonomics (at least in this one way) until such time as #[may_dangle] stabilizes, at which point we could start using it, and at which point the breaking change would no longer be necessary in the first place 😅

You have a better idea than I do about how valuable the change in #2562 is, and whether it's okay to delay or not.

@davidbarsky
Copy link
Member

Can we have another patch release since the fix has already merged?

@NobodyXu I'm not sure what fix you're referring to, but nothing was committed to this repo since this issue was reported, nor can I see any linked issue that might be a fix for this.

@NobodyXu
Copy link
Contributor

Sorry I mistook some other linked PRs as fix.

Really sorry about that, I was probably too sleepy.

@jonhoo
Copy link
Contributor Author

jonhoo commented Apr 30, 2023

One potentially useful data point: a (small) test internally at $work only revealed a single code instance (my code) that got bitten by this backwards-incompatibility. I could run a broader sweep internally to see if I find anything else that breaks if that'd be useful.

JamesHinshelwood added a commit to Zilliqa/zq2 that referenced this issue May 1, 2023
This reverts commit b914250.

tracing 0.1.38 was yanked due to an unintended
backwards-incompatible change in a patch version bump:
tokio-rs/tracing#2578 (comment)
JamesHinshelwood added a commit to Zilliqa/zq2 that referenced this issue May 1, 2023
This reverts commit b914250.

tracing 0.1.38 was yanked due to an unintended
backwards-incompatible change in a patch version bump:
tokio-rs/tracing#2578 (comment)
JamesHinshelwood added a commit to Zilliqa/zq2 that referenced this issue May 1, 2023
This reverts commit b914250.

tracing 0.1.38 was yanked due to an unintended
backwards-incompatible change in a patch version:
tokio-rs/tracing#2578 (comment)
rillian added a commit to brave/star-randsrv that referenced this issue May 2, 2023
The current release of the tracing crate was yanked for an unintended
API change. tokio-rs/tracing#2578

Revert to the next-latest release to address the audit warning.
JamesHinshelwood added a commit to Zilliqa/zq2 that referenced this issue May 3, 2023
This reverts commit b914250.

tracing 0.1.38 was yanked due to an unintended
backwards-incompatible change in a patch version:
tokio-rs/tracing#2578 (comment)
wizard-28 added a commit to pacstall/pacbot that referenced this issue May 6, 2023
@obi1kenobi
Copy link

👋 cargo-semver-checks maintainer here — is there a way this could have been caught by looking at rustdoc?

I understand that the "T must outlive the wrapper type" requirement can be breaking. Is it possible that in some circumstance it could already have been the case (is T: Self a valid bound?) so that adding the impl Drop would be guaranteed to not be breaking in that case?

I've added this example to the cargo-semver-checks lint wishlist: obi1kenobi/cargo-semver-checks#5

@jonhoo
Copy link
Contributor Author

jonhoo commented May 14, 2023

I think the way to detect it is if a Drop impl is ever added to a generic type. That's all it takes if I'm not mistaken (which it could totally be that I am). In practice it often won't break people because non static things don't end up in the generics, but if they ever do then you hit this. Unless the unstable #[may_dangle] is also added that is.

@obi1kenobi
Copy link

Could the generic type already have a bound that is equivalent or stronger than the bound that Drop would add?

For example, would adding a Drop impl to Foo<T> where T: 'static be breaking in this way? Are there any other edge cases like this?

Trying to figure out the exact delineation to avoid false-positives in cargo-semver-checks.

@NobodyXu
Copy link
Contributor

For example, would adding a Drop impl to Foo<T> where T: 'static be breaking in this way? Are there any other edge cases like this?

IMHO it is a breaking change, with T: 'static, T cannot contain any non-'static reference anymore.

@obi1kenobi
Copy link

Sorry, I didn't state that clearly enough. If Foo<T> originally required T: 'static before Drop was added, would adding the Drop impl still have been breaking?

And are any other bounds like this that would also have made adding Drop non-breaking?

@NobodyXu
Copy link
Contributor

I don't think this it is even valid.
AFAIK rustc would require the Drop impl to have the same bound as Foo definition.

@obi1kenobi
Copy link

Here's exactly what I mean. The Drop has the same bounds as the type originally had. Is the following change breaking?

Before:

pub struct Foo<T> where T: 'static {
    ...
}

After:

pub struct Foo<T> where T: 'static {
    ...
}

impl<T: 'static> Drop for Foo<T> {
    ...
}

Can the T: 'static bound be replaced with any other bound while still making this change non-breaking?

@jonhoo
Copy link
Contributor Author

jonhoo commented May 14, 2023

It's a good question. I think T: 'static already existing as a bound would be enough to make this not breaking. The added Drop impl only affects the drop check, and the drop check only shows up in the form of lifetime errors, so if all captured lifetimes are already static, I think no new errors could arise. @RalfJung would probably be a better person to ask.

@obi1kenobi
Copy link

obi1kenobi commented May 14, 2023

Thanks for the help!

Is a bound like T: Self or T: 'self possible to write here? The Drop implementation seems to add an implicit "generics outlive Self" bound but I'm not sure if that's something that's normally expressible in the type system.

I'd love to add an automated check for this in cargo-semver-checks but I'm worried about getting the edge cases wrong and giving people bad advice, either false-negative or false-positive.

@NobodyXu
Copy link
Contributor

@jonhoo I wonder if we can workaround this by wrapping the future itself:

let future = async move {
    let mut span_entered = None;
    let future = user_provided_future;
    let _guard_exit = scopeguard::guard(span, |span| { span_entered = span.entered(); });
    future.await
};

It could avoid the drop implementation altogether, since rust drops variable in the reverse of declaration order.

@RalfJung
Copy link
Contributor

RalfJung commented May 17, 2023

Sorry, I don't have the time to dig into this currently, but I hope the documentation added by rust-lang/rust#103413 helps. AFAIK currently making a type Drop is technically a breaking change even if the type is 'static, i.e. there are examples where adding a String to a tuple introduced dropck failures. There are also people who are unhappy about this and want to fix it, but I don't know if that fix is sufficient to make this a non-breaking change. Some links for further reading:

There is no point in adding a bound to the Drop impl. Such bounds are generally rejected, and if they are accepted here then that can only be because the bound is a NOP.

Of course the other question is whether adding a Drop here is likely to break anything in practice. But I guess this issue shows that the answer is "yes". But then why was there a question above whether there is a breaking change if we have a real-world example that got broken? I am missing some context here from not having the time to read the entire thread, it seems... sorry if what I said makes little sense.^^

@obi1kenobi
Copy link

But then why was there a question above whether there is a breaking change

Sorry for the confusion! As the maintainer of cargo-semver-checks I try to look at real-world breaking changes and distill them into cargo-semver-checks lints. Part of that is "wiggling the preconditions" to find the exact dividing line between breaking and non-breaking change, so as to make sure cargo-semver-checks doesn't report false-positive errors. This is why I was asking about 'static and such, even though none of that was the case in this specific instance.

Thanks for all the help, everyone!

@poliorcetics
Copy link
Contributor

After 0.1.38 was yanked, no 0.1.39 was published with the revert/fix, meaning people who updated quickly currently use 0.1.38 and will never receive a dependabot notif for a new version nor for the yanking

rye added a commit to frog-pond/ccc-server-next that referenced this issue Jul 24, 2023
Tracing was yanked, which left things in an interesting state that
renovate didn't seem to know how to handle.

I also regenerated the lockfile, which pulled in a bunch of updates.
This also fixed the CI issues from proc-macro2 / ts-rs and the beta
clippy.

Signed-off-by: Kristofer Rye <kristofer.rye@gmail.com>

[1]: tokio-rs/tracing#2578
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants