Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Orphan rules are stricter than we would like #1856

Open
nikomatsakis opened this Issue Jan 19, 2017 · 33 comments

Comments

Projects
None yet
@nikomatsakis
Copy link
Contributor

nikomatsakis commented Jan 19, 2017

The current orphan rules (described as "covered first" in my blog post) are too restrictive in some cases. The purpose of this RFC issue is to catalog cases where the rules are tighter than we would like and to also point at possible solutions.


Scenario. Type parameters restrict our flexibility

In the discussion around the try trait in #1718, we realized that the orphan rules would prohibit one from adding an impl like this:

impl<T,E> Try<Poll<T, E>> for Result<T, E>

This is sort of annoying because if there were no type parameters involved, the impl would be permitted:

impl Try<Poll<(),()>> for Result<(),()> // OK

This roughly corresponds to the impl<T> BorrowFrom<Rc<T>> for T example from the Little Orphan Impls blog post. As that blog post describes, this is an artifact of introducing ordering into the algorithm; the post describes various alternative rules, notably the "covered rule", that would permit both of these impls, but at the cost of other kinds of rules.


Scenario. Wanting to implement a trait where the output type is not in current crate.

@Manishearth describes wanting to add an impl into libstd like so:

impl Add<str> for str {
    type Output = String;
    ...
}

By rights, this impl should live in libcore, since all the input types are in libcore, but the output type is defined in libstd, so of course it cannot.


Scenario. Optionally derive traits if desired by others, but without adding a required dependency.

A common example is that my crate foo defines a type Foo which can be serialized (e.g., using the serde crate). However, I do not want to add serde as a required dependency of foo. Currently, best practice is to define a Cargo.toml feature (e.g., with_serde) that adds a dependency on serde and includes the relevant impls. But this is a lot of manual work. Another example is how the rayon crate exports the ParallelIterator trait; other crates, like ndarray, might want to implement those, but without forcing a dependency on rayon on others

Proposals targeting this scenario:

This scenario was also described in #1553.

Other scenarios?

Please leave comments and we can migrate those descriptions into this header.

@withoutboats

This comment has been minimized.

Copy link
Contributor

withoutboats commented Jan 19, 2017

The solution I like to the third scenario is to enable a greater variety of blanket impls through specialization & mutual exclusion.

For example, if there is a Sequence type in std serde had an impl<T> Serialize for T where T: Sequence, and Foo is a sequential type which implements Sequence, it now also implements Serialize.

Traits like Sequence can form a "pivot" between two cousin libraries.

This has pretty bad limitations when it comes to things like ndarray and rayon because the high performance expectation probably necessitates some sort of API access that wouldn't be contained in that ancestor trait (possibly even some unsafe code).

@oli-obk

This comment has been minimized.

Copy link
Contributor

oli-obk commented Jan 19, 2017

impl<T> Serialize for T where T: Sequence

this doesn't work, because we'd also want to have

impl<T> Serialize for T where T: Map

We already ran into this when we tried to

impl<I, T> Serialize for I where I: Iterator<Item = T>, T: Serialize
@withoutboats

This comment has been minimized.

Copy link
Contributor

withoutboats commented Jan 19, 2017

@oli-obk It does work with mutual exclusion, by making it incoherent for a single type to implement both Map and Sequence. That's exactly the feature I'm proposing. :-)

@Ericson2314

This comment has been minimized.

Copy link
Contributor

Ericson2314 commented Jan 19, 2017

I've mused a kind of lattice system where if one uses crate foo with some type, and another crate bar with this trait, you must also use another (the "least upper bound" of foo and bar) containing orphan impls involving foo's type and bar's trait.

The motivation here is sort of a collaborative look at the expression problem: It's sort of arbitrary to say whether the trait or type should own the impls when its a joint endeavor. The traditional approach to the expression problem was about allowing everyone to work in isolation, which is what made it hard/impossible to resolve, here the idea is since the author of the type and the trait both have a stake, they both should be able to write the impl. Ideally this would be coupled with a jointly-owned crate on crates.io they both could publish.

I don't have a detailed design or informal argument that this is sound, but as this has long been my rough thinking on the problem from a software engineering perspective, and I figured I should share it.


For the String case (outputs not inputs more broadly), core should probably say "I defer to a crate called collections" giving it a non-orphan exception, along with maybe an input signature the "deferred impl" must obey (for purposes of core being able to define other blankets).

@Manishearth

This comment has been minimized.

Copy link
Member

Manishearth commented Jan 19, 2017

integrating with cargo workspaces, by @nikomatsakis

I'm in favor of a boundary that isn't a crate boundary and contains multiple crates but is logically the same "project", where a different set of rules apply etc. I have never really pushed for this but I have often talked about separating the notion of a package from the notion of a crate.

However, this may only further muddle our already-confusing crate/mod system.

@bluss

This comment has been minimized.

Copy link

bluss commented Jan 19, 2017

I'd like such a wider boundary, a way for a parent crate to delegate to other crates, so that they are permitted to implement traits for types in the parent crate “in its stead”.

Something like this:

[package]
name = "ndarray"
trait_delegate = ["ndarray-serde"]

Now ndarray-serde will depend on ndarray and is allowed to define foreign traits (serde::Serialize in this case) for the types in ndarray.

@ubsan

This comment has been minimized.

Copy link
Contributor

ubsan commented Jan 20, 2017

@bluss

I'd like such a wider boundary, a way for a parent crate to delegate to other crates, so that they are permitted to implement traits for types in the parent crate “in its stead”.

I would rather have a way for a child crate, of two parent crates, to define interop code. This means that ndarray doesn't have to know all of the crates that will be interopped with it; there's just an interop crate. This crate would have metadata, saying that it's for interoperability between two crates, and may be decided upon by the final consumer, or any link up the chain. For example, say I want to interop ndarray and serde with the crate ndarray-serde. ndarray-serde would look something like:

[package]
name = "ndarray-serde"

[interop]
ndarray = "0.N"
serde = "0.M" # note, these must be the versions in the final binary, and imply dependencies
[dependencies]
ndarray = "0.N"
serde = "0.M"
ndarray-serde = "0.L"

that was a lot of "interop"s.

@Manishearth

This comment has been minimized.

Copy link
Member

Manishearth commented Jan 20, 2017

FWIW the conditional features rfc eases the interop use case (but doesn't solve it completely)

@withoutboats

This comment has been minimized.

Copy link
Contributor

withoutboats commented Jan 20, 2017

@Manishearth and I talked about orphan issues tonight, and he gave a good example from servo, which I admit I may have badly misunderstood. But based on my understanding, I discovered a pattern for 'smuggling' code for foreign types. There are some restrictions on this pattern, but it seems quite powerful.

An example is FFI conversion. Maybe you have a single crate which defines all of your FFI types (possibly in an automatically generated fashion), and its shared by many other crates in your project. So you want all your FFI types to implement from/into their real types, and for your real types to implement from/into your FFI types. This runs into orphan rules because your FFI library doesn't know about the real types, because its a shared dependency of all the crates containing the real types.

Here's a way of defining those traits (and a 'smuggler' impl) which totally compiles and works (and it blows my mind):

trait Real: Sized {
    type Ffi: Ffi<Self>;
    fn from_ffi(source: Self::Ffi) -> Self;
    fn into_ffi(self) -> Self::Ffi;
}

trait Ffi<T>: Sized {
    fn into_real(self) -> T;
    fn from_real(real: T) -> Self;
}

impl<T> Ffi<T> for T::Ffi where T: Real {
    fn into_real(self) -> T {
        T::from_ffi(self)
    }
    
    fn from_real(real: T) -> T::Ffi {
        real.into_ffi()
    }
}

You can implement the Real trait for all the real types locally where they live, and declare their FFI type to be the foreign type from the FFI library, and that FFI type magically gets the Ffi trait implemented for it.

I was doubtful that this could be coherent, but because Ffi is parameterized over its Real, implementing it for an associated type just works!

You can actually smuggle arbitrary code (not only conversions) into the FFI crate by just having it delegate to a function on the Real trait.

@glaebhoerl

This comment has been minimized.

Copy link
Contributor

glaebhoerl commented Jan 20, 2017

@Ericson2314

I've mused a kind of lattice system where if you use a crate with this type and another with this crate, you must also use another (their "least upper bound") with the impl.

Could you please rephrase this, or maybe with example code? I confess I don't understand at all what each "this" and "another" is referring to.

@bluss

This comment has been minimized.

Copy link

bluss commented Jan 21, 2017

The automatic features RFC (did you mean this one @Manishearth?) seems like it doesn't solve any of the hard issues around either version coupling or orphan rules. It is like a convenience around what we can already achieve, while we want something that really bends the current rules a bit.

@Manishearth

This comment has been minimized.

Copy link
Member

Manishearth commented Jan 21, 2017

did you mean this one @Manishearth?

yep. used to be called "conditional dependencies", now is "automatic features", mixed the names.

seems like it doesn't solve any of the hard issues around either version coupling or orphan rules.

Yeah, it doesn't, but it smoothens some of the more common annoying use cases.

@Ericson2314

This comment has been minimized.

Copy link
Contributor

Ericson2314 commented Jan 22, 2017

@glaebhoerl sorry, did a sloppy job writing on my phone. I've gone back and cleaned up that post so it should be understandable now.

@ztrash

This comment has been minimized.

Copy link

ztrash commented Jan 24, 2017

The current orphan rules (described as "covered first" in my blog post) are too restrictive in some cases.

I wonder if you would consider a way to make binary operator traits like this work:

impl<T> Mul<Matrix<T>> for T

It would be a much cleaner way of implementing these operators for numerical programming.

@burdges

This comment has been minimized.

Copy link

burdges commented Feb 11, 2017

Isn't there a case that binary operators should've been defined on 2-tuples? Ala impl Mul for (A,B) { .. } not impl Mul<B> for A?

I donno if this could changed in a backward compatible way with a blanket impl like impl<A,B> MulPair for (A,B) where A: Mul<B> { } or impl<A,B> Mul<B> for A where (A,B): MulPair { }.

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented Feb 11, 2017

You've literally just decomposed the problem into figuring out the exact same predicate, but specifically for tuples. It's good to have the two use the same rules, but you don't have a simpler problem to solve.

@snuk182

This comment has been minimized.

Copy link

snuk182 commented Jul 17, 2018

Would like to have something like this allowed:

impl ForeignTrait2 for ForeignWrapper<MyStruct> { ... }

Currently results in E0117.

@shepmaster

This comment has been minimized.

Copy link
Member

shepmaster commented Jul 17, 2018

@axos88

This comment has been minimized.

Copy link

axos88 commented Dec 10, 2018

Are there any caveats I'm not considering for completely ignoring the orphan rule for BINARY crates? For example: I'd love to be able to impl Debug, or Serialize for types coming from used libraries (or even overwrite their implementation - could be useful for mock testing and debugging, but probably with a keyword such as reimpl Debug for Foo), and since I'm talking about leaf crates, the only thing that could break is the application itself compiling IF a dependency is upgraded where that trait is already implemented (unless you can overwrite the implementation), but then again that only happens if someone upgrades that dependency, so they are already touching the application code, so I would say it's not unexpected to break the code if you start upgrading your libraries.

@Ixrec

This comment has been minimized.

Copy link
Contributor

Ixrec commented Dec 10, 2018

afaik permitting leaf crates to completely ignore the orphan rules would be perfectly sound and not create any of the subtle composability issues we usually talk about on issues like this. The concerns there would be about it preventing you from easily refactoring the leaf/binary code into a non-leaf library, or creating a surprisingly huge and fundamental difference between what is allowed in libraries versus binaries that's especially likely to catch newcomers off guard. In particular, I'm told it's common to design Rust executables as mostly a library crate with only a very thin wrapper in the actual binary crate, in which case binary-only powers wouldn't help much.

@axos88

This comment has been minimized.

Copy link

axos88 commented Dec 10, 2018

I think the part about catching newcomers off guard can be helped with good error messages, and warnings (such as a link to what the orphan rule is, and why it's needed for library crates, also explain that this limitation does not hold for binary crates, and why that is so).

As far as I see - although I still consider myself a newcomer to rust - is that the habit of having a small wrapper as the binary and having most of the code in a library is getting less common, especially since cargo now defaults to binary crates. Never really understood what was the point of that, I really think it's a fallacy to think that a full-blown application would/(or even should!) be easily converted to a library, other than some very small tools.

And about introducing a fundamental difference between them.... Yeah, I totally see your point there, however the use cases for these crates are fundamentally different (one is for reuse in other components, the other is for distributing and running), so I'm not convinced this is a bad thing. And allowing it gives us a lot of expressiveness in return, which in some cases cannot be worked around (such as impl Operator<MyType> for ForeignType), and in other cases very circumstantial (wrapping with NewTypes, etc).

So all in IMHO the orphan rule was introduced as a "necessary limitation" to circumvent the composability issues, however it's a limitation that I would think is not needed for binary crates.

@Ekleog

This comment has been minimized.

Copy link

Ekleog commented Dec 10, 2018

The orphan rules are required for binary crates in order to keep semver working correctly, as an application is not necessarily distributed with its Cargo.lock, even though it is the best practice around the Rust community.

(also, I have noticed that until now every time I tried designing an application as a library wrapped by a binary I actually somehow managed it, and it makes for much cleaner code IMO as libraries make coupling and dependencies much more explicit)

@axos88

This comment has been minimized.

Copy link

axos88 commented Dec 10, 2018

Well to me that seems like the cause of the incompatibility with semver is not really caused by not enforcing the orphan rule, but by not distributing the Cargo.lock, which is a very questionable thing to do - basically that means you trust every dependency (direct and transitive) to adhere to semver, which I think everyone already realized that unfortunately there are and always will be people, who don't. Also there's this: https://gist.github.com/jashkenas/cbd2b088e20279ae2c8e, which I don't agree with in full extent, but there are some very interesting points to consider.

Furthermore, I don't think we should restrict the experessiveness of a language in order to "be compatible with" an arbitrarily chosen versioning scheme, unless it is established, that it is a convention that all the rust things are to follow that versioning scheme.

We can issue warnings that hey, here be lions, be careful, we can put this behind a special keyword / compiler flag / rustc cmdline argument / whatever, but the amount of help we can get while debugging or just extending stuff without having to fork every single dependency I just want to make Serializeable or Debuggable is hard to ignore.
And this opens the door to the possibility of overwriting implementations of traits in leaf crates (something along the lines of reimpl Trait for Struct {}), that could be a very powerful tool during mock testing, or debugging nasty issues, or just simply working around a bug found in one of the libraries / extending their functionality a bit without having to necessarily fork it (this latter part is questionable though).

@withoutboats

This comment has been minimized.

Copy link
Contributor

withoutboats commented Dec 10, 2018

@axos88 The fundamental problem isn't cargo and semver of third party library crates, but the versioning of Rust itself. We've made backward compatibility guarantees regarding the standard library that would mean we wouldn't be able to add impls to it if orphan impls were allowed, since someone could have written than impl as an orphan impl.

@DanielJoyce

This comment has been minimized.

Copy link

DanielJoyce commented Jan 11, 2019

This really needs to be fixed instead of pestering crate owners to remember to implement/derive Debug all the time...

@axos88

This comment has been minimized.

Copy link

axos88 commented Jan 16, 2019

@withoutboats That could be circumvented, if we'd say that orphan impls always override default impls.

@cloudhan

This comment has been minimized.

Copy link

cloudhan commented Feb 20, 2019

@axos88 I think just refuse to compile is good enough. In many package managers, they have multiple packages provide the same functionality, when installing both, a conflict will occur and the package manager will ask the user to choose either one. Here, if the user want to use new crate with additional impls and keep the homemade impls, the compiler panics!

@axos88

This comment has been minimized.

Copy link

axos88 commented Feb 20, 2019

@cloudhan that's not a good way because of the reasons @withoutboats described above. An application may stop compiling only because of a seemingly non-breaking change in rust, or another dependency, which is not acceptable for obvious reasons.

However if we specify a clear overriding order (and possibly still issue compiler warnings, so they don't go unnoticed, perhaps unless somebody clearly describes that they WANT to override), things won't break.

@withoutboats

This comment has been minimized.

Copy link
Contributor

withoutboats commented Feb 21, 2019

overriding violates the fundamental property of coherence we are trying to maintain - upstream code may already have been compiled using the impl which you have overriden, meaning that the same type impl is dispatched differently in different parts of your program.

@burdges

This comment has been minimized.

Copy link

burdges commented Feb 21, 2019

impl Operator<MyType> for ForeignType works but not impl<T: Trait> Operator<MyType> for T.

As a rule, you could never add Debug etc to another crate because you cannot access the type's contents usually. It's simpler, more readable, and better for the community if you send the other a PR adding the functionality. And it'll catch bugs because they maybe omitted that impl for good reason. As an example, Rust ecosystem for zero-knowledge proofs is by far the best, in part due to session types, which adding Clone breaks.

@cloudhan

This comment has been minimized.

Copy link

cloudhan commented Feb 21, 2019

@burdges

in part due to session types, which adding Clone breaks.
then C++'s delete would make more sense

impl Trait for MyType = delete

maybe better than simply left the trait unimplemented and implicitly imply it.

@Ixrec

This comment has been minimized.

Copy link
Contributor

Ixrec commented Feb 21, 2019

I'd certainly support adding comments about deliberately unimplemented traits, but I don't see any benefits to adding a language-level syntax for it (C++ has similar syntax because, unlike Rust, it has implicitly provided methods and method overriding).

@burdges

This comment has been minimized.

Copy link

burdges commented Feb 21, 2019

You cannot simply ban specific traits like Clone because another trait like ToOwned could provide similar functionality. You'd never catch them all throughout the ecosystem. We'd always prevent these with data hiding in my example, but then data hiding prevents many cases about which you care too.

It's only auto-traits like Send and Sync that developers must unimplement now, which although annoying only comes up in niche cases, and people can reasonable learn about.

We communicate performance characteristics with missing impls too: You do algebra on elliptic curve points in projective coordinates, but you cannot hash or order them without converting to affine coordinates, which takes time. If you permit adding comparison operators, then you'll wind up with folks writing amazingly slow hash and btree maps.

Instead, we should add a mechanism for trait warnings because we do want HashMap: Clone but it's mildly insecure, due to cloning the HashMap's BuildHasher. Ideally, any code that invokes HashMap: Clone should generate a warning, whether through ToOwned or whatever, so that each specific indirect call site gets annotated with an #[allow()]. We need trait warnings to disambiguate unclear cases like HashMap: Clone so that developers can be more fearless about providing iffy impls.

Just fyi, these trait warnings would not suffice for my previous use cases because say cloning a session type for a multi-signature will compromise the user's private key, not just enable some DoS attack.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.