Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

async/await: awaiting inside a match block captures borrow too eagerly #57017

Open
seanmonstar opened this issue Dec 20, 2018 · 16 comments
Open

async/await: awaiting inside a match block captures borrow too eagerly #57017

seanmonstar opened this issue Dec 20, 2018 · 16 comments

Comments

@seanmonstar
Copy link
Contributor

@seanmonstar seanmonstar commented Dec 20, 2018

If you use await!(some_fut) inside an arm of a match X, the generated future eagerly borrows the value of X, if it not needed.

This may not usually be noticeable, but the issue compounds when the type X contains a trait object, and the future you wish to return is impl Future + Send. This causes a misleading error message that "dyn Trait + Send cannot be shared between threads", which is required to for &X: Send.

Example

Here's a simple struct with a trait object:

struct Client(Box<Any + Send>);

Consider a function like this:

impl Client {
    fn status(&self) -> u16 {
        200
    }
}

You could consider using a match to determine what kind of future to await (or what arguments to pass):

async fn get() {
}

pub fn wat() -> impl Future + Send {
    let client = Client(Box::new(true));
    async move {
        match client.status() {
            200 => {
                let _x = await!(get());
            },
            _ => (),
        }
    }
}

If the await is moved out of the match block, all is well:

pub fn ok() -> impl Future + Send {
    let client = Client(Box::new(true));
    async move {
        if client.status() == 200 {
            let _x = await!(get());
        }
    }
}

The wat function causes this compilation error:

error[E0277]: `(dyn std::any::Any + std::marker::Send + 'static)` cannot be shared between threads safely
  --> src/main.rs:21:17
   |
21 | pub fn wat() -> impl Future + Send {
   |                 ^^^^^^^^^^^^^^^^^^ `(dyn std::any::Any + std::marker::Send + 'static)` cannot be shared between threads safely
   |
   = help: the trait `std::marker::Sync` is not implemented for `(dyn std::any::Any + std::marker::Send + 'static)`
   = note: required because of the requirements on the impl of `std::marker::Sync` for `std::ptr::Unique<(dyn std::any::Any + std::marker::Send + 'static)>`
   = note: required because it appears within the type `std::boxed::Box<(dyn std::any::Any + std::marker::Send + 'static)>`
   = note: required because it appears within the type `Client`
   = note: required because of the requirements on the impl of `for<'r> std::marker::Send` for `&Client`
   = note: required because it appears within the type `for<'r> {Client, &'r Client, u16, impl std::future::Future, ()}`
   = note: required because it appears within the type `[static generator@src/main.rs:23:16: 30:6 client:Client for<'r> {Client, &'r Client, u16, impl std::future::Future, ()}]`
   = note: required because it appears within the type `std::future::GenFuture<[static generator@src/main.rs:23:16: 30:6 client:Client for<'r> {Client, &'r Client, u16, impl std::future::Future, ()}]>`
   = note: required because it appears within the type `impl std::future::Future`
   = note: the return type of a function must have a statically known size

Playground: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=2a9dbea32d31457d50d40b99c52ee214 (updated to latest syntax -Niko)

@cramertj
Copy link
Member

@cramertj cramertj commented Dec 21, 2018

This is just an artifact of the compiler holding the borrow of client in your match expression for the duration of the match condition-- there's nothing async/await!-specific going on here. There's a potential fix to be made to eliminate the borrow of client sooner, but that's a general MIR representation issue.

@cramertj
Copy link
Member

@cramertj cramertj commented Dec 21, 2018

("liveness" is really the tag I want, but we don't have one for that ;) )

@jonas-schievink
Copy link
Member

@jonas-schievink jonas-schievink commented Dec 21, 2018

This might be a duplicate of #46525?

@seanmonstar
Copy link
Contributor Author

@seanmonstar seanmonstar commented Dec 26, 2018

I'm not certain it's only that; it does seem related to await!. The compiler seems to be fine with give up the borrow if I call a self or &mut self method inside the match.

@nikomatsakis
Copy link
Contributor

@nikomatsakis nikomatsakis commented Mar 5, 2019

Marking this as deferred -- the problem here (discussed here on Zulip) has to do with how we use an "overapproximation" to decide what data may be inside a generator, at least initially. This is based on the HIR. Only later do we get more precise results when generating MIR.

I think we could resolve this by making that over-appoximation more precise, but @Zoxc expressed a desire to do away with it altogether. This would obviously be better, if we can make it work, though I'm a bit skeptical. (See the Zulip topic for more details.)

@nikomatsakis
Copy link
Contributor

@nikomatsakis nikomatsakis commented Apr 16, 2019

I'm adding the AsyncAwait-Unclear label, because I'd like to revisit this question and try to make some progress here before we stabilize, but I'm not willing to call it blocking yet =)

@sdroege
Copy link
Contributor

@sdroege sdroege commented Jun 5, 2019

Another instance of this from #61211. Doesn't only affect what is borrowed there but also the trait bounds of the generated future.

    let foo = Mutex::new("123");

    let foo: Box<dyn futures::future::Future<Output = ()> + Send> = Box::new(async move {
        // In a separate block works
        // {
        let bar = foo.lock().unwrap();
        drop(bar);
        // }
        futures::future::lazy(|_| ()).await;
    });

Fails with

error[E0277]: `std::sync::MutexGuard<'_, &str>` cannot be sent between threads safely
@Mark-Simulacrum
Copy link
Member

@Mark-Simulacrum Mark-Simulacrum commented Jul 11, 2019

FWIW it took me ~30 minutes or so to decide that this was likely a compiler "bug" with async/await, after spending 10-15 minutes reducing the amount of code I had added to just one match statement. The errors are long -- and largely unintelligible. I'm not sure if I'd call this a blocker, but I would say that if it's not fixed, we should definitely include it in a FAQ section of "common errors" or something along those lines.

Here's a gist of the error. There is no clear mention of the actual source of the error, which is in a different crate (this snippet points at main.rs, which is a thin shim over the larger crate rooted at main.rs) that compiled fine(!) -- that means that this might be introduced accidentally in a library and not discovered until a client attempts to use said function in a manner requiring Sync/Send, for example.

@Mark-Simulacrum
Copy link
Member

@Mark-Simulacrum Mark-Simulacrum commented Jul 12, 2019

My understanding is yes; but I can't be certain.

@nikomatsakis
Copy link
Contributor

@nikomatsakis nikomatsakis commented Oct 1, 2019

@rustbot claim

I'm assigning this to myself not to necessarily fix but to organize a call where we can dig a bit into what it would take to fix this, especially if we can move the MIR construction earlier.

@nikomatsakis
Copy link
Contributor

@nikomatsakis nikomatsakis commented Oct 9, 2019

I tried an experiment today: in the trait select code, I inserted a call to the mir_built query at the point where we expand, for an auto-trait, generator types to get at their contents. Basically right before the let witness = ... line here:

ty::Generator(def_id, ref substs, _) => {
let witness = substs.as_generator().witness(def_id, self.tcx());
substs
.as_generator()
.upvar_tys(def_id, self.tcx())
.chain(iter::once(witness))
.collect()
}

This "mostly works" but did cause some errors. One example is the regression test for #64477:

use std::future::Future;
use std::pin::Pin;
fn f<T>(_: &T) -> Pin<Box<dyn Future<Output = ()> + Send>> {
unimplemented!()
}
pub fn g<T: Sync>(x: &'static T) -> impl Future<Output = ()> + Send {
async move { f(x).await }
}
fn main() { }

which now errors out with a cycle error:

query stack during panic:
#0 [mir_built] processing `bar::{{closure}}#0`
#1 [typeck_tables_of] processing `bar`
#2 [typeck_tables_of] processing `bar::{{closure}}#0`
#3 [type_of] processing `bar::{{closure}}#0`
#4 [collect_mod_item_types] collecting item types in top-level module
#5 [analysis] running analysis passes on this crate

I am now sort of investigating this cycle in a bit more detail. For example, the type_of query for generators requests the full "typeck tables" for the generator:

Node::Expr(&hir::Expr {
kind: hir::ExprKind::Closure(.., gen),
..
}) => {
if gen.is_some() {
return Some(tcx.typeck_tables_of(def_id).node_type(hir_id));
}
let substs = InternalSubsts::identity_for_item(tcx, def_id);
tcx.mk_closure(def_id, substs)
}

However, in contrast, the handling for closures is much simpler (and I think more appropriate). It seems likely that we could refactor this to be more similar to closures -- basically creating a generator type like Generator<DefId, W> where W is a formal type parameter representing the "witness". This would certainly allow the type_of query to terminate.

However, we are still going to have to construct the typeck_tables_of at some point. So the real problem here is that typeck_tables_of(bar::generator) for the generator invokes typeck_tables_of(bar) for the base type which then, in turn, attempts to solve some trait and triggers the MIR construction query. What's going on there?

Presumably the reason that typeck_tables_of(bar) winds up requesting the generators types is that it is trying to prove that its hidden return type (which involves the generator) implements Send (i.e., that it meets its impl Send bound).

So why is typeck_tables_of(bar::generator) invoking typeck_tables_of? Well, this is because the typeck-tables for a closure and its container are created together:

fn typeck_tables_of(tcx: TyCtxt<'_>, def_id: DefId) -> &ty::TypeckTables<'_> {
// Closures' tables come from their outermost function,
// as they are part of the same "inference environment".
let outer_def_id = tcx.closure_base_def_id(def_id);
if outer_def_id != def_id {
return tcx.typeck_tables_of(outer_def_id);
}

This happens because of the limitations around inference; the two are inferred in a coherent unit. So the full type of something in the generator can be inferred as a result of the surrounding function. You can see that in this example (playground):

use std::fmt::Debug;

fn is_debug<T: Debug>(_: T) { }

fn main() {
    let mut v: Option<_> = None;
    
    let generator = async move {
        is_debug(v);
    };
    
    if false {
        // Commenting out this line will yield an error, because the type
        // of `v` is not specified.
        v = Some(String::new());
    }
    
    drop(generator);
}

So my conclusion from this is that we can't "just" construct the MIR from the generator and use it to inform its constituent types. It's going to take a bit more cleverness.

Somewhat annoyingly, we probably also can't remove the checking of bounds from type-checking the function, although we might be able to do something. Right now, when we start type-checking bar, we register some obligations for the hidden type $H (a type variable) like $H: Send. After all, we want to check that the hidden-type is Send. It is these obligations that are causing the cycle problem. You'd almost think we could defer them until later (i.e., finish generating the types and then check them), except that in some cases the search for impls could influence the type checker.

Consider this example (which compiles today):

struct Foo<T> { t: Option<T> }

unsafe impl Send for Foo<u32> { }

fn foo() -> impl Send {
    Foo { t: None }
}

fn main() { }

Here, the hidden type $H is Foo<$T> where $T is basically unconstrained. Forcing $H: Send results in $T = u32.

Anyway, not saying we can't construct the MIR and therefore get precise results, but it's going to take a bit of careful thought to untangle these dependencies.

@Aaron1011
Copy link
Member

@Aaron1011 Aaron1011 commented Oct 17, 2019

@nikomatsakis: That last example is extremely surprising to me - the compiler is essentially pulling a type out of thin air. I think this can only occur when there is exactly one possible impl - if I add unsafe impl Send for Foo<bool>, I get the expected inference error.

I think it would be possible to divide up the generated bounds into two categories:

  1. 'Normal' bounds (those not involving a generator type).
  2. 'Generator' bounds (e.g. <generator_type>: Send)

'Generator' bounds will have deferred checking as you described, while 'normal' bounds will still be checked during type checking (to preserve this odd inference behavior).

I believe the fact that generator types are unnameable prevents this from having an effect on type inference. If I understand correctly, this kind of inference occurs when we unify a type variable (i.e $T) with a type from an applicable impl block when checking SomeType<$T>: Send. However, generators don't have (normal) generics, so it's impossible for checking such a predicate to ever cause a type variable to be unified.

In general, I think we could safely defer checking any predicate of the form SomeType: SomeTrait, where SomeType and SomeTrait have no inference variables. However, I don't think there's any advantage to doing this for types other than generators, since the entire point of doing this is to generate generator MIR early on.

@Aaron1011
Copy link
Member

@Aaron1011 Aaron1011 commented Oct 19, 2019

I'm working on this, and will have a PR open soon.

@Aaron1011
Copy link
Member

@Aaron1011 Aaron1011 commented Oct 19, 2019

@rustbot claim

@rustbot rustbot assigned Aaron1011 and unassigned nikomatsakis Oct 19, 2019
@Aaron1011
Copy link
Member

@Aaron1011 Aaron1011 commented Oct 19, 2019

@nikomatsakis: Oops - didn't mean to unassign you.

Aaron1011 added a commit to Aaron1011/rust that referenced this issue Oct 24, 2019
When working on rust-lang#57017, I discovered that generators are currently
considered `Freeze` based on the witness types they contain (just like
any other auto trait). That is, a generator which contains a `!Freeze`
type (e.g. `Cell<T>`) will also be `!Freeze`.

However, I believe that it is sound to always treat generator types as
`Freeze`, regarless of what types they may store internally. The only
possible of interacting with a generator is through the `resume` method
of the generator trait. `resume` takes a `Pin<&mut Self>`, which
requires a `&mut Self` to be created. In other words, given only a
shared reference to a generator (`&{generator}`), it is impossible to
mutate the generator. Note that is true regardless of whether or not
types *inside* the generator have interior mutability - they can only
change when `resume` is called, which is impossible to call using a
shared reference.

The motivation for this PR is to support further work on rust-lang#57017. My
approach is to delay resolution of auto-trait predicates (e.g.
`{generator}: Send) until after we've constructed the generator MIR
(specifically, after we've run the `StateTransform` MIR transformation
pass). However, the const qualification pass (which runs before
`StateTransform`) explictly checks whether types are `Freeze`. There are
several ways of resolving this:

1. Refactor const-qualification to avoid the need to check whether
generators are `Freeze`. This might involve running (part of ) const
qualification after the `StateTransform` pass runs.
2. Use the current, more conservative approach for generator when
checking `Freeze` - that is, use the witness types computed from the
HIR.
3. Make generators always `Freeze`.

Option 1 and 2 introduce a large amount of additional complexity for the
sole purpose of supporting generators. Option 3 (this PR), requires only
a minor change to `SelectionContext`. Theoretically, it could also
improve performance, since we now perform less work when checking for
`Freeze` types.

This should probably have a test, but I'm not really sure how to write
one, considering that `Freeze` is private.
Aaron1011 added a commit to Aaron1011/rust that referenced this issue Oct 24, 2019
When working on rust-lang#57017, I discovered that generators are currently
considered `Freeze` based on the witness types they contain (just like
any other auto trait). That is, a generator which contains a `!Freeze`
type (e.g. `Cell<T>`) will also be `!Freeze`.

However, I believe that it is sound to always treat generator types as
`Freeze`, regardless of what types they may store internally. The only
possible of interacting with a generator is through the `resume` method
of the generator trait. `resume` takes a `Pin<&mut Self>`, which
requires a `&mut Self` to be created. In other words, given only a
shared reference to a generator (`&{generator}`), it is impossible to
mutate the generator. Note that is true regardless of whether or not
types *inside* the generator have interior mutability - they can only
change when `resume` is called, which is impossible to call using a
shared reference.

The motivation for this PR is to support further work on rust-lang#57017. My
approach is to delay resolution of auto-trait predicates (e.g.
`{generator}: Send) until after we've constructed the generator MIR
(specifically, after we've run the `StateTransform` MIR transformation
pass). However, the const qualification pass (which runs before
`StateTransform`) explictly checks whether types are `Freeze`. There are
several ways of resolving this:

1. Refactor const-qualification to avoid the need to check whether
generators are `Freeze`. This might involve running (part of ) const
qualification after the `StateTransform` pass runs.
2. Use the current, more conservative approach for generator when
checking `Freeze` - that is, use the witness types computed from the
HIR.
3. Make generators always `Freeze`.

Option 1 and 2 introduce a large amount of additional complexity for the
sole purpose of supporting generators. Option 3 (this PR), requires only
a minor change to `SelectionContext`. Theoretically, it could also
improve performance, since we now perform less work when checking for
`Freeze` types.

This should probably have a test, but I'm not really sure how to write
one, considering that `Freeze` is private.
Aaron1011 added a commit to Aaron1011/rust that referenced this issue Oct 24, 2019
When working on rust-lang#57017, I discovered that generators are currently
considered `Freeze` based on the witness types they contain (just like
any other auto trait). That is, a generator which contains a `!Freeze`
type (e.g. `Cell<T>`) will also be `!Freeze`.

However, I believe that it is sound to always treat generator types as
`Freeze`, regardless of what types they may store internally. The only
possible of interacting with a generator is through the `resume` method
of the generator trait. `resume` takes a `Pin<&mut Self>`, which
requires a `&mut Self` to be created. In other words, given only a
shared reference to a generator (`&{generator}`), it is impossible to
mutate the generator. Note that is true regardless of whether or not
types *inside* the generator have interior mutability - they can only
change when `resume` is called, which is impossible to call using a
shared reference.

The motivation for this PR is to support further work on rust-lang#57017. My
approach is to delay resolution of auto-trait predicates (e.g.
`{generator}: Send)` until after we've constructed the generator MIR
(specifically, after we've run the `StateTransform` MIR transformation
pass). However, the const qualification pass (which runs before
`StateTransform`) explictly checks whether types are `Freeze`. There are
several ways of resolving this:

1. Refactor const-qualification to avoid the need to check whether
generators are `Freeze`. This might involve running (part of ) const
qualification after the `StateTransform` pass runs.
2. Use the current, more conservative approach for generator when
checking `Freeze` - that is, use the witness types computed from the
HIR.
3. Make generators always `Freeze`.

Option 1 and 2 introduce a large amount of additional complexity for the
sole purpose of supporting generators. Option 3 (this PR), requires only
a minor change to `SelectionContext`. Theoretically, it could also
improve performance, since we now perform less work when checking for
`Freeze` types.

This should probably have a test, but I'm not really sure how to write
one, considering that `Freeze` is private.
@tmandry tmandry added this to High priority in wg-async-foundations triage Feb 11, 2020
@tmandry tmandry added this to In progress in wg-async-foundations work Feb 11, 2020
@tmandry tmandry moved this from In progress to Blocked in wg-async-foundations work Feb 18, 2020
@Aaron1011 Aaron1011 removed their assignment Dec 21, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
9 participants