Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#[derive] sometimes uses incorrect bounds #26925

Open
comex opened this Issue Jul 9, 2015 · 39 comments

Comments

Projects
None yet
@comex
Copy link
Contributor

comex commented Jul 9, 2015

In the following code:

#[derive(Copy, Clone)]
struct Y<T>(&'static fn(T));

both derives expand to impls that require the corresponding trait to be implemented on the type parameter, e.g.:

#[automatically_derived]
impl <T: ::std::marker::Copy> ::std::marker::Copy for Y<T> where
 T: ::std::marker::Copy {
}

However, this isn't actually necessary, as Y<T> will still be eligible for Copy regardless of whether T is.

This may be hard to fix, given the compilation phase at which #[derive] works...

@huonw

This comment has been minimized.

Copy link
Member

huonw commented Jul 9, 2015

This is a dupe of #7671 which I suspect may've been accidentally closed. (i.e. the patch that closed it was modified to no longer tackle it.)

@apasel422

This comment has been minimized.

Copy link
Member

apasel422 commented Jul 10, 2015

#19839 is relevant.

@eefriedman

This comment has been minimized.

Copy link
Contributor

eefriedman commented Aug 11, 2015

The bounds that #[derive] uses aren't conservative, they're just wrong. Whether a type parameter implements the derived trait is in theory unrelated to whether a given field implements that trait. The defaults just appear to work most of the time because in practice people use traits and field types where the approximation is reasonably accurate.

Unfortunately, the obvious fix (just bounding the field types) causes breakage:

#[derive(Clone)]
struct Item<A> { inner: A }
#[derive(Clone)]
pub struct IntoIter<A> { inner: Item<A> }
// The obvious bound is `Item<A> : Clone`, but we can't use that bound
// because `Item` is private.

And actually, in the general case there is no bound we can expose because it could involve a requirement that a parameter type implements a private trait.

We could in theory special-case certain types syntactically, but that only improves a few special cases like Clone of a reference.

It's possible that we could punt computing the bounds off to type-checking (add syntax like where each field : Clone), and come up with a type-sensitive heuristic; that's probably a lot of work, though. We could also extend the #[derive] syntax to allow the user to specify the correct bounds; easy to implement, but not very intuitive to use.

@Yoric

This comment has been minimized.

Copy link
Contributor

Yoric commented Oct 13, 2015

In terms of syntax, it might perhaps be possible to generalize ? and let devs specify where T: ?Clone. Not sure how the compiler would interpret this post-derive, though.

@jorendorff

This comment has been minimized.

Copy link
Contributor

jorendorff commented Oct 28, 2015

GHC handles the case described in @eefriedman's comment:

module DeriveShowPrivate(IntoIter(IntoIter), f) where
data Item a = Item { itemInner :: a } deriving Show
data IntoIter a = IntoIter { inner :: Item a } deriving Show
f v = IntoIter {inner = Item {itemInner = v}}

It works:

import DeriveShowPrivate(f)
main = putStrLn (show (f 37))

The output is:

IntoIter {inner = Item {itemInner = 37}}

Maybe GHC "inlines" the private bounds until it obtains all-public bounds.

@brson brson added T-lang P-low labels Apr 4, 2017

@Mark-Simulacrum

This comment has been minimized.

Copy link
Member

Mark-Simulacrum commented May 7, 2017

There are also cases where derive is more conservative than it should be, see #32872 and #31518.

@Mark-Simulacrum Mark-Simulacrum changed the title #[derive] is too conservative with field trait bounds #[derive] sometimes uses incorrect bounds May 7, 2017

@bluss

This comment has been minimized.

Copy link
Contributor

bluss commented May 7, 2017

(Some contrivance ahead.)

If the bounds would be dropped when possible for existing code, the following would lose its good and intended meaning. The effect would be that RefMut<'a, T> were suddenly Copy, which is something that we imagine is not desirable, even a memory error.

#[derive(Copy, Clone)]
pub struct Ptr<Mark>(*mut u8, PhantomData<Mark>);

pub type Ref<'a, T: 'a> = Ptr<&'a T>;
pub type RefMut<'a, T: 'a> = Ptr<&'a mut T>;
@bluss

This comment has been minimized.

Copy link
Contributor

bluss commented May 7, 2017

With that I think that even if the bounds are plain wrong, they are part of the stock recipe that one can conjure with derive, it's “part of what you order” and we can't change that backwards incompatibly.

Derive can however grow bells and whistles, like the bound customization that rust-derivative and serde offer.

@comex

This comment has been minimized.

Copy link
Contributor Author

comex commented May 8, 2017

If there's a good way to make derive do the right thing, the backwards incompatibility issue sounds like the kind of thing epochs could handle.

@durka

This comment has been minimized.

Copy link
Contributor

durka commented Jun 26, 2018

@ramosbugs

This comment has been minimized.

Copy link

ramosbugs commented Jul 16, 2018

It looks like this issue affects the failure_derive crate as well.

In the following code fragment, Fail is only implemented when T is bound by Fail.

#[derive(Fail)]
pub enum SomeFailure<T: Foo> {
    SomeCondition(T),
    SomeOtherCondition,
}

The macro expansion ends up being something like:

impl <T: Foo> ::failure::Fail for SomeFailure<T> where
 T: ::failure::Fail {
    #[allow(unreachable_code)]
    fn cause(&self) -> ::std::option::Option<&::failure::Fail> {
        match *self {
            SomeFailure::SomeCondition(ref __binding_0) => {
                return None
            }
            SomeFailure::SomeOtherCondition => {
                return None
            }
        }
        None
    }
    #[allow(unreachable_code)]
    fn backtrace(&self) -> ::std::option::Option<&::failure::Backtrace> {
        match *self {
            SomeFailure::SomeCondition(ref __binding_0) => {
                return None
            }
            SomeFailure::SomeOtherCondition => { return None }
        }
        None
    }
}

In this case, the parameterized type T has no reason to implement Fail, and in many cases will not represent a failure. I'm not sure of the best solution to these issues (other than manually implementing the trait), but hopefully this additional example helps illustrate another (potentially common) use case.

@shepmaster

This comment has been minimized.

Copy link
Member

shepmaster commented Jul 16, 2018

I'm not sure of the best solution to these issues

As stated above, crates can create their own mechanism to fine-tune their own derive bounds. Serde is an example of such a crate. You should file an issue or submit a PR to failure.

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented Aug 20, 2018

I'm gonna leave a reminder here for @nikomatsakis to read https://arxiv.org/abs/1608.05233 (if he hasn't already) to see if it'd help rustc/Chalk in any way support the "coinductive" case.

@dhardy

This comment has been minimized.

Copy link
Contributor

dhardy commented Sep 20, 2018

@nikomatsakis why the obsession with solving recursive type bounds in order to make derive work correctly? From my point of view these are two completely separate problems, and it appears the type bounds can already be solved (even with derive(Clone) in the first case).

As I understand it, derive is a macro designed to spit out templated code. The macro logic should add appropriate bound annotations, but does not need to solve them. From @eefriedman's post near the top:

It's possible that we could punt computing the bounds off to type-checking (add syntax like where each field : Clone), and come up with a type-sensitive heuristic; that's probably a lot of work, though.

Macros can cover the each bit, so this only needs something like typeof(expr) for a correct solution:

#[derive(Clone)]
struct Foo<T: Iterator + ?Sized> {
    data: T::Item
}

// generates:
impl<T> Clone for Foo<T>
where
    T: Iterator + ?Sized,   // copied from struct requirements
    typeof(Self::data): Clone
{
    fn clone(&self) -> Self {
        Foo { data: self.data.clone() }
    }
}

typeof is even a reserved keyword.


Note that this isn't just about making derive(Clone) and friends work correctly — it's about enabling users to write correct macros too. Those correct macros should not need to solve complex type problems.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Sep 20, 2018

The problem @dhardy is this. Assume that T::Item expands out to Foo<T> again. Then you have:

impl<T> Clone for Foo<T> where Foo<T>: Clone { ... }

This cyclic impl will never terminate: in order to "use" the impl, you need to provide the impl itself. In rustc today, this results in an overflow error, as you can see from this example:

struct Tree<T> {
    data: T,
    children: Vec<Tree<T>>,
}

impl<T> Clone for Tree<T>
where
    T: Clone,
    Vec<Tree<T>>: Clone,
{
    fn clone(&self) -> Self {
        panic!()
    }
}

fn is_clone<T: Clone>() {}

fn main() {
    is_clone::<Tree<u32>>();
}

Once we transition to Chalk, it will not overflow, but it simply error: there is no Tree<T> impl that can be used, so the trait is not implemented.

As an interim step, I think we should just add some annotations that let you "opt-out" of the additional bounds when deriving for specific type parameters. This seems like a fine thing to do no matter what.

Also, i'm not 100% confident that we want to adopt the "perfect" design here. It marks a subtle change in the semver policy that is not obviously good. Today, if I have #[derive(Clone)] struct Foo<T> { .. }, then this always means that T: Clone must hold. This is true even if Foo just contains Rc<T>: Clone. Now, often this is annoying, but it does mean that you could later change from Rc<T> to Box<T> if you wanted without affecting your clients.

Assuming we did want to adopt the perfect path, we need to define how it gets lowered to the underlying logical rules. Keeping in mind that derive is a "dumb macro", it seems like we really ought to move "the smarts" there into trait solving, so that e.g. the simple impls I gave above would be accepted. One way to do this is to adopt a "co-inductive semantics" for trait matching, but that carries other complications if we are not careful (and may well invalidate unsafe code, for example, that relies on today's inductive behavior to rule out cycles). There may be other solutions, though.

That said, I do have the feeling that one ought to be able to have a coinductive-style semantics for cases like Clone here. It feels pretty similar to the work we did on figuring out how to think about implied bounds. I've not really sat down and thought about this for a while, mostly because I'd like to make progress on the general chalk transition first...

UPDATE: Updated the example to be more realistic.

@dhardy

This comment has been minimized.

Copy link
Contributor

dhardy commented Sep 21, 2018

Thanks @nikomatsakis for the explanation.

Once we transition to Chalk, it will not overflow, but it simply error: there is no Tree<T> impl that can be used, so the trait is not implemented.

And yet if the bound is removed, clone can be implemented just fine, since the system is in fact solvable. The problem reminds me of searching for fixed points of a function or recursive proofs — yet as noted above this is unrelated to the problem of termination (recursive functions which terminate are common). So can Chalk solve this correctly?

In co-inductive proofs, there is a key limitation that the various rules you setup must be "productive" or "guarded". This prevents simple loops like T: Foo => T: Foo. It seems like maybe we could make all trait matching co-inductive, or at least potentially co-inductive, if we could find a satisfying way to define such conditions. I'm not sure what that is. But I'd be happy to hear about it if you think you know the answer!

So this is the crux of the issue. One can have fun with toy examples:

trait Trait {}
impl<T: Trait> Trait for T {}

This currently causes an overflow error when trying to check the bound, but fundamentally it's harmless: Trait happens to be implemented for every type (if the solver can verify this), but since it doesn't do anything this is perfectly fine. Also since blanket trait implementations are restricted to the defining crate, there's no risk of a user accidentally implementing Sync everywhere or some such. However ...

trait Trait {
    fn foo(&self);
}

impl<T: Trait> Trait for T {
    fn foo(&self) {
        self.foo()
    }
}

... now we have "bad" code (infinite recursion). As @comex points out, Rust freely allows diverging code, so technically we don't need to solve this in the type system. Rust also already has warnings about infinite function recursion; quite possibly the same logic could be used to warn about this case. So I don't see any real reason for the bound checker to reject this code.

Today, if I have #[derive(Clone)] struct Foo<T> { .. }, then this always means that T: Clone must hold. This is true even if Foo just contains Rc<T>: Clone. Now, often this is annoying, but it does mean that you could later change from Rc<T> to Box<T> if you wanted without affecting your clients.

I saw this argument above, but it seemed like a strange one. It is a minor issue that derive(Clone) does not explicitly state the bounds, but I don't think this is a major one, and not being explicit about what Foo's copy semantics are feels like an API bug — so causing obvious breakage in such a case could even be useful. @elidupree had roughly the same response.


Note for readers: of course this implies that the solution I proposed above, bounds on typeof(T::field), would be a major breaking change until and if Chalk is integrated and can solve reflexive bounds. So it's not a short-term fix.

antifuchs added a commit to antifuchs/ratelimit_meter that referenced this issue Mar 9, 2019

Make the Point trait work
Now, we manually impl Default for State in all the impls that count,
instead of deriving it (since as per
rust-lang/rust#26925, the derive puts the
wrong trait bounds on the type parameter).

@elidupree elidupree referenced this issue Mar 25, 2019

Open

Improved `Partial{Eq,Ord}` derive #3

4 of 10 tasks complete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.