Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crates should allow private impl of external traits for external structs #493

Open
rust-highfive opened this Issue Nov 30, 2014 · 84 comments

Comments

Projects
None yet
@rust-highfive
Copy link

rust-highfive commented Nov 30, 2014

Issue by alanfalloon
Thursday Apr 24, 2014 at 02:31 GMT

For earlier discussion, see rust-lang/rust#13721

This issue was labelled with: in the Rust repository


As you know, you can't provide an impl for a trait if neither the type nor the trait are defined in the current crate, you get:

error: cannot provide an extension implementation where both trait and type are not defined in this crate

For public implementations, this makes perfect sense, there are issues with collisions between implementations and sheer surprise. However, for a private implementation of the trait, this is a real hindrance.

Consider the case of std::path::Path not implementing std::fmt::Show. The rationale for closing #13009 is perfectly valid: we don't want people treating paths as strings. However, by refusing to add an implementation of Show, you have made that decision for all programs everywhere. In my case, I had a large struct with numerous Path elements that I wanted to print for debugging, but #[deriving(Show)] won't work because Path has no Show impl, and now I'm stuck either implementing it tediously from scratch, or switching to str for my path names.

The perfect compromise would have been to allow my crate to define a private Show impl for Path. There is precedent for this in other languages, go allows interfaces to implemented anywhere, for example.

@dead10ck

This comment has been minimized.

Copy link

dead10ck commented Nov 30, 2014

I'm very much in favor of this. It should be possible to extend existing types without having to wrap them in a unit struct, or define a new trait where it doesn't make sense to (e.g. adding a single helper method).

@freebroccolo

This comment has been minimized.

Copy link
Contributor

freebroccolo commented Nov 30, 2014

I agree that the current restrictions enforcing coherence (essentially global uniqueness for impl) are a hindrance. This is because coherence is fundamentally anti-modular.

Having globally unique instances is something of a devil's bargain: convenient at a small, local scale, but leads to a number of problems in the long run, seriously limiting how the language can evolve. This is the situation Haskell is in now, having embraced coherence: see here, and here. I'm a little concerned Rust is making a similarly unfortunate (and unnecessary) choice…

Distinguishing the definition of an impl from canonicalization of an impl would be a big step forward here. Explicit canonicalization would allow the programmer to specify in a more principled fashion which instances to use (from an external crate or local), without resorting to wrapper structs.

Aside from helping to restore some modularity now and retain flexibility going forward, explicit canonicalization opens to the door to more general and more powerful mechanisms for implementation resolution. For example, see the work on explicit unification hints here subsuming type classes, canonical structures, etc. Another well-known but somewhat less general approach using ML-style modules is modular type classes.

@glaebhoerl

This comment has been minimized.

Copy link
Contributor

glaebhoerl commented Nov 30, 2014

On the other side of this debate, of course, is Edward Kmett. He was going to give a talk about this, but it got cancelled - I hope it'll be rescheduled at some point!

@freebroccolo

This comment has been minimized.

Copy link
Contributor

freebroccolo commented Dec 1, 2014

@glaebhoerl I want to point out that I'm not advocating "implicits" or "ML-modules" per-se as an alternative to type-classes, so in that sense I don't think what I'm proposing is on the "other side" at all, but something else entirely. I would advocate an approach that does not require global uniqueness but would still allow the current behavior to more or less be recovered in a controlled fashion and as a special case without sacrificing modularity. I don't think Kmett addresses this possibility.

Having said that, I think Kmett has some great ideas and has done some really amazing things with Haskell but where coherence is concerned, I don't find his arguments convincing. To me, it basically boils down to "coherence is really convenient, and besides look at all we've managed to accomplish with Haskell type classes".

What I find lacking is a principled response to the alternatives, e.g., canonical structures, unification hints or other more general mechanisms, or the differences in the way type-classes are implemented in various theorem provers. Part of the reason for this is that it's difficult to compare the alternatives in a practical sense given differences in languages, libraries, and other factors. But that doesn't make the argument in favor of Haskell's approach any more convincing.

I also feel such arguments ignore most of the collateral complexity of embracing global uniqueness and trying to work around the consequences elsewhere: several random extensions relating to type-classes, issues with the proposal for the module system (backpack), generalized newtype deriving (with all its quirks), roles, etc. Global uniqueness only simplifies some things from a particular perspective. As a general language design principle, it's hard to justify (especially in retrospect!) and I believe definitely a case of "worse is better".

There are very real and well known pain points associated with embracing coherence, as the op points out. I guarantee these will only become more of a thorn in the side of Rust users as the language grows. It would be a shame for Rust to simply adopt the status quo on this rather than reconsider some of the decisions Haskell has made while it's still within the realm of possibility.

Furthermore, it seems to me (and I could be off base), that one of the principles behind Rust is to avoid making things implicit when having them explicit offers greater flexibility and control. The current approach ties two different concepts together, definition and canonicalization (globally), where the latter is implicit and doesn't need to be.

@dobkeratops

This comment has been minimized.

Copy link

dobkeratops commented Dec 1, 2014

+10000
the promise of a more open-world environment is what attracted me to Rust, and when I discovered these restrictions I began to lose interest.. a future C++ with UFCS and modules will satisfy me more.

Changes like this would draw me back to the language.

@Ericson2314

This comment has been minimized.

Copy link
Contributor

Ericson2314 commented Dec 4, 2014

@darinmorrison couldn't agree with you more. Need to look at Edward's comments more, but I am not sure where he could have gotten that opinion. By separating cannonicalization and definition, one can fine tune the coherence rules to exactly what they think they should be (which seems to very from person to person) without sacrificing the expressiveness of the language. Does anything more need to be said?

(As an aside, I started writing an RFC https://github.com/Ericson2314/rfcs/blob/master/active/0000-combine-mod-trait.md, which would combine traits and modules, around the time of the associated items RFC. I was told that this was too much to do for 1.0 -- even more obvious at this point in time than then. To reiterate, I do see plenty of advantages to separating cannonicalization and definition, even if traits and modules are kept separate.)

@freebroccolo

This comment has been minimized.

Copy link
Contributor

freebroccolo commented Dec 5, 2014

@Ericson2314 interesting… I remember seeing comments about ML style modules somewhere but didn't realize anyone had started writing up an RFC. Even aside from this issue, it would be pretty awesome to have them in Rust if it were feasible. (Higher-rank polymorphism and existentials would also be great with HKTs but I agree it seems tricky with the monomorphization.) I would be willing to help with an implementation effort somehow. /cc @jonsterling

@zwarich

This comment has been minimized.

Copy link
Contributor

zwarich commented Dec 5, 2014

You can get a correct local version of the reasoning that Edward Kmett takes as the main advantage of type classes with the Modular Type Classes approach, albeit with some changes to the version described in the paper.

The original paper models type classes using ML modules with generative functors (meaning that a functor applied to the same arguments gives fresh abstract type components in the resulting module) rather than applicative functors (meaning that each distinct application gives identical abstract types in the resulting module). This implies that instance functors are required to be fully transparent, with no abstract type components. If they were modeled with applicative functors instead, you could use ordinary type abstraction to enforce Edward Kmett's desired guarantees from type classes locally.

An email to the TYPES mailing list and some slides explain this in a bit more detail. It's unfortunate that this has never appeared in a real programming language, since it really seems like the correct way to combine modules and type classes.

There is another issue that comes up with modeling type classes using ML modules that wouldn't really be a problem for Rust. Haskell type classes allow you to write programs that have an unbounded number of instances at runtime:

f :: Eq a => Int -> a -> Bool
f 0 x = x == x
f n x = f (n-1) [x]

main = do n <- getLine; print (f (read n) ())

Encoding this example with ML modules would require first-class modules, but with Rust you already encounter ad-hoc failure during monomorphization for examples like this.

[Edit: @jonsterling pointed out that I swapped generative and applicative in describing the original MTC paper, even though it was clear what I meant from later context.]

@Ericson2314

This comment has been minimized.

Copy link
Contributor

Ericson2314 commented Dec 5, 2014

@zwarich Interesting! Furthermore, Since Rust doesn't allow impure consts / statics at all, I don't think we'd gain anything from generative functors.

My draft RFC proposes much of the functionality discussed in the slide show (arguments on functions being lifted onto the projection, etc), but I only hoped it would work out. Nice to learn from the slide show that it actually does (with pure applicative functors)!

@zwarich

This comment has been minimized.

Copy link
Contributor

zwarich commented Dec 5, 2014

@Ericson2314 If you want to see the semantics for an ML module system with both generative and applicative functors worked out in detail, the F-ing Modules paper by Rossberg, Russo, and Dreyer is interesting. They express all module system features by translation into System F_omega. If you want recursive modules with type abstraction, then you have to extend System F_omega to something like Dreyer's RTG.

@jonsterling

This comment has been minimized.

Copy link

jonsterling commented Dec 8, 2014

Might be good to see if we can get @RobertHarper to weigh in on this, particularly on the question of whether or not generative functors make sense in a Rust-like language.

(Bob, if you're too busy and don't want to receive notifications on this thread, there's an unsubscribe button on the right side of the screen that you can hit.)

@RobertHarper

This comment has been minimized.

Copy link

RobertHarper commented Dec 8, 2014

I haven't a clue about the context, but if someone wishes to summarize I could comment or make suggestions. I see that Modular Type Classes are being discussed, which I think are a good idea, of course, and cleaner and more general than the over-evolved mess in Haskell. In particular instance declarations in Haskell are wrong-headed; one should separate the instance itself, which is a functor, from its activation for use during type inference. And then one should be able to instantiate explicitly rather than always rely on the inference mechanism to trip over the thing you want, another problem area. In my view modeling type classes as run-time records is wrong-headed; instances are static, because they are inferred during elaboration. If you wan to pass actual run-time generated records or existentials, please do so, but that's not what type classes are for.

It may be helpful to look at our old paper on separate compilation for Standard ML. We thought long and hard about this, but no one paid any attention afaict, for largely "meta" reasons of the timing.

But maybe none of this is relevant to your discussion, so forgive me if not, and please fill me in if I can be of help.

@jonsterling

This comment has been minimized.

Copy link

jonsterling commented Dec 8, 2014

@zwarich Can you elaborate on how type abstraction can be used to enforce Haskell-style “coherence” in certain cases? This is interesting.

@RobertHarper Hi Bob,

Thanks for popping in. I was wondering about the following:

In spite of the fact that Rust has more fine-grained structure than ML wrt mutability & references, I think that generative functors are probably crucial and shouldn't be ignored. It feels like applicative functors aren't actually that useful except when you consider them as enabling certain patterns of use with type classes as Derek's slides point out. Moreover, a module system with type classes can start off with only generative functors and be perfectly useful; then applicative functors can be added later to sort of patch over the areas that were slightly more difficult to use. Bob, do you have any comments or corrections to what I have said above?

@RobertHarper

This comment has been minimized.

Copy link

RobertHarper commented Dec 9, 2014

hi john, not sure i understand "fine-grained srtucture" in rust, so it's hard for me to answer. i agree that applicative functors are a theoretical curiosity, and a bit of a hack in ocaml (they use an approximation of code equality, violating repind, to distinguish Set(X).t from Set(Y).t. There's a decent theory of it now, done by Russo, Dreyer, and Rossberg (in some order), but they are of marginal use in any language with state, certainly, and even without. The only argument for applicative functors was to have useful higher-order functors, but that never panned out. The only reasonable solution for higher-order functors is essentially what Tofte and MacQueen proposed a long time ago in the yucky Definition style; this may or may not be what was used by Dreyer, et al in their paper (I don't recall and don't have it to hand.)

@RobertHarper

This comment has been minimized.

Copy link

RobertHarper commented Dec 9, 2014

BTW, I don't understand why generative functors interfere with type classes, can you explain? As long as you propagate that "type t = int", etc then the instance is going to be type-transparent, regardless of the functor being "generative".

BTW, I'm still not sure that I like the terminology "applicative vs generative functors", because it's not really about the functors, but about tracking the effects using type abstraction.

@jonsterling

This comment has been minimized.

Copy link

jonsterling commented Dec 9, 2014

Bob

Re type classes and gen. functors, I think you are right, and I seem to recall that we had a similar conversation at one point where I raised a concern about generative functors and type classes, but I had forgotten it. Honestly, I'm not really sure of the point of having applicative functors in Rust...

@RobertHarper

This comment has been minimized.

Copy link

RobertHarper commented Dec 9, 2014

I wouldn't know the designer's motives, of course, but I highly doubt there's any technical justification. Putting it the other way 'round, I'd like to hear the justification for it in Rust.

@zwarich

This comment has been minimized.

Copy link
Contributor

zwarich commented Dec 9, 2014

@RobertHarper Excusing the use of the terms 'generative' and 'applicative', doesn't it make more sense for instance functors in the Modular Type Classes approach to be applicative rather than generative? The paper uses generative functors, and this forces instance functors to be transparent, so that distinct applications of the functor that arise from elaboration produce compatible types. Using applicative functors instead would allow for transparent instance functors. That's what Dreyer argues in those slides I linked, and it makes sense to me. Am I missing something?

@RobertHarper

This comment has been minimized.

Copy link

RobertHarper commented Dec 9, 2014

Type classes are by nature transparent; they cannot be opaque, because if they were, they'd be useless. (See PFPL for a discussion.) I don't see any advantage to applicative functors for anything, and not for type classes, unless I'm missing something. All that applicative functors do for you is to make abstract types more often equal than they would be without them. I don't see the relevance to type classes at all.

@Ericson2314

This comment has been minimized.

Copy link
Contributor

Ericson2314 commented Dec 9, 2014

Bob, thank you for coming here.

First things first. Rust does not allow any effectful top-level (or impl-level, the closest thing to module-level) definitions. (Functions defined here can of course have effects, but the definition of function themselves is effect free.) Put differently, all static initialization is effect-free.

This leads me to believe that were functors added to the Rust of today, they would need to be effect-free, "generative" vs "applicative" semantics aside. Does anybody disagree?

@RobertHarper

This comment has been minimized.

Copy link

RobertHarper commented Dec 9, 2014

No, that is irrelevant. Even in a totally pure functional language, you want only generative functors, because you do not want to confuse, say, posets ordered two different ways. The one-liner is that type abstraction is an effect. That's why existentials combine open with bind.

To reiterate, you never want applicative functors, and never ever these only.

Robert Harper
(from handheld)

On Dec 9, 2014, at 12:40, John Ericson notifications@github.com wrote:

Bob, thank you for coming here.

First things first. Rust does not allow any effectful top-level (or impl-level, the closest thing to module-level) definitions. (Functions defined here can of course have effects, but the definition of function themselves is effect free.) Put differently, all static initialization is effect free.

This leads me to believe that were functors added to the Rust of today, they would need to be effect-free, "generative" vs "applicative" semantics aside. Does anybody disagree?


Reply to this email directly or view it on GitHub.

@Ericson2314

This comment has been minimized.

Copy link
Contributor

Ericson2314 commented Dec 9, 2014

Mmm sorry, by "they would need to be effect-free" I meant that generative functors in Rust would be able to do only what generative functors do in a "totally pure functional language", i.e. no incrementing of global counters or anything like that. My intention was not to squash all discussion of generative functors. I see that "effect-free" is really the wrong term for what I meant.

@RobertHarper

This comment has been minimized.

Copy link

RobertHarper commented Dec 9, 2014

well, yes, there would be no storage effects, but there are nevertheless abstraction effects.

Robert Harper
(from handheld)

On Dec 9, 2014, at 13:06, John Ericson notifications@github.com wrote:

Mmm sorry, by "they would need to be effect-free" I meant that generative functors in Rust would be able to do only what generative functors do in a "totally pure functional language", i.e. no incrementing of global counters or anything like that. My intention was not to squash all discussion of generative functors. I see that "effect-free" is really the wrong term for what I meant.


Reply to this email directly or view it on GitHub.

@RobertHarper

This comment has been minimized.

Copy link

RobertHarper commented Dec 9, 2014

Now that I have a keyboard .... Even if Rust is effect-free at initialization time (but then what does initialization consist of if not effects?), so-called generative functors nevertheless introduce a top-level effect, namely type abstraction. Actually, even structure declarations with a "sealing" effect introduce such effects, so it's not really about functors, it's about abstraction. The only reason the discussion becomes about functors is that it is of paramount importance to be able to distinguish statically between different instances of a functor, even if the type parts are the same, because the instances proxy for the difference of interpretation provided by the code attached to the type. So Posets of natural numbers ordered by LT should be distinguishable from those ordered by GT. This is not a matter of effects, notice, but in the presence of effects it becomes even more important, eg one wishes to distinguish statically between two different instances of a hash table, even if all the parameters about keys and values are the same. This example is about state, of course.

Historically, the invention of applicative functors was a dead-end attempt to have a useful notion of higher-order functor, which does not fall out easily from standard type theories of modules. (Nor is there a case for higher-order functors even being very useful.) Even with so-called applicative functors, you still don't get a very useful concept of higher-order modules.

Standard ML modules are a local optimum that cannot easily be improved upon without making very large changes and introducing considerable complexity. The main improvements that I can envision are modular type classes (mostly to keep up with the Peyton-Jones's, but they can be handy in limited circumstances), and my re-working of data types to better integrate with modules. The latter is definitely worth doing, for a host of good reasons, and is the most significant improvement that I know about to the SML module system.

Does Rust have data types in the sense of SML? If you're doing them like in SML or Haskell or O'Caml, then I claim you are doing them wrong, and I know how to do them right.

@jonsterling

This comment has been minimized.

Copy link

jonsterling commented Dec 9, 2014

Bob, do you have a paper or any literature that summarizes your re-working of data types? If I recall correctly, the idea was to support something like this:

signature OPTION = data
  type 'a t
  con some : 'a -> 'a t
  con none : 'a t
end

As a result of the special declarations the signature also exposes pattern matching somehow. My questions: does this signature give rise to a default standard implementation structure Option : OPTION or would you write it yourself? Is the induction principle for 'a Option.t given in terms of the sum of all con declarations within Option's signature OPTION? How do data signatures/structures compose wrt. constructors and pattern matching?

@RobertHarper

This comment has been minimized.

Copy link

RobertHarper commented Dec 9, 2014

the design is implicit in the harper-stone semantics given in the milner volume, and is sketched explicitly in my ml workshop talk, which is available on my web page. i haven’t taken it further, but between the two i feel confident that it can be made to work out. the main ideas are as follows:

  1. introduce a “data signature” of the kind you describe below.
  2. introduce “data structure” declarations of two forms
    2.a. data structure S : SIG, where SIG is a data signature, which gives the default implementation
    2.b. data structure S : SIG = struct … end, where SIG is a data signature, which allows you to implement it yourself in whatever way you like.

the tag “data” on “structure” makes ‘a S.t available for pattern matching. in 2a it also indicates what it means to have a default implementation (as a recursive type). when you implement it yourself, there is trouble if you use effects, because the pattern compiler re-orders things to optimize pattern matching. a modal distinction would be helpful here, so that pattern-matching implementations could be required to be pure.

there are details to work out that will no doubt complicate matters, such as mutually recursive data types, or gadt’s, which are a hack in my view anyway, so maybe i don’t care about that so much.

there are opportunities to combine this with modular type classes to get a general form of “deriving” in haskell (one that works more broadly), and for providing special syntax in a non-ad hoc manner.

bob

On Dec 9, 2014, at 16:55, Jonathan Sterling notifications@github.com wrote:

Bob, do you have a paper or any literature that summarizes your re-working of data types? If I recall correctly, the idea was to support something like this:

signature OPTION = data
type 'a t
con some : 'a -> 'a t
con none : 'a t
end
As a result of the special declarations the signature also exposes pattern matching somehow. My questions: does this signature give rise to a default standard implementation structure Option : OPTION or would you write it yourself? Is the induction principle for 'a Option.t given in terms of the sum of all con declarations within Option's signature OPTION? How do data signatures/structures compose wrt. constructors and pattern matching?


Reply to this email directly or view it on GitHub #493 (comment).

@Blaisorblade

This comment has been minimized.

Copy link

Blaisorblade commented Dec 14, 2014

(Sorry to butt in, via Twitter).
@RobertHarper wrote:

I don't see any advantage to applicative functors for anything, and not for type classes, unless I'm missing something.

To summarize @zwarich's point, Derek Dreyer agrees that the classical motivation for applicative functors are not interesting, and makes a different point in the links. That point is not part of the paper on F-ing modules (by Rossberg, Russo, Dreyer) which you cited, and it doesn't seem to have been published.
Those links contain answers to that question, and to your other questions:

BTW, I don't understand why generative functors interfere with type classes, can you explain?
[...]
I wouldn't know the designer's motives, of course, but I highly doubt there's any technical justification. Putting it the other way 'round, I'd like to hear the justification for it in Rust.

A summary of Dreyer's point follows (as a separate, longer comment), in case it helps.
(EDIT: formatted quotes more accurately).

@Blaisorblade

This comment has been minimized.

Copy link

Blaisorblade commented Dec 14, 2014

Dreyer's canonical example is the MkSet functor: it's a natural candidate for being an instance functor, but it's not transparent, so you need to apply it explicitly even to instances you make canonical — to which Dreyer comments:

This is quite cumbersome. Aren’t modular type classes supposed to apply your functors for you?

Here's MkSet from his slides:

signature ORD = sig type t; val cmp : t → t → bool end

signature SET = sig type t; type elem;
  val empty : t;
  val insert : elem → t → t …
end

functor MkSet (X : ORD)
  :> SET where type elem = X.t
  = struct ... end

To answer @RobertHarper's point:

Even in a totally pure functional language, you want only generative functors, because you do not want to confuse, say, posets ordered two different ways.

Dreyer agrees, but provides a different solution (which however does appear in the F-ing modules paper, at least in the journal version). If it weren't for decidability, you'd just want to compare the ordering functions with contextual equivalence. To approximate that, instead of adding stamps to the result of (generative) functor application (which IIRC is one way of understanding generative functors), you can attach those to value declarations (at least, "moving the stamp generation" is how I understand this, he doesn't say it), so that applicative functors will distinguish posets ordered in different ways. From the email:

In Chapter 2 of my thesis, I argued that the right answer to the
question [of comparing the code in functor arguments] is "contextual equivalence". [MkSet(IntLt).t] and [MkSet(IntGt).t]
are compatible types so long as IntLt and IntGt are contextually
equivalent. Of course, in general, contextual equivalence is
undecidable, so we must look for a conservative approximation. But
it's not hard to get a reasonable conservative approximation of it by
attaching "stamps" (i.e. hidden abstract types) to each value
declaration. If the language had real dependent types, that would be
even better, as Set(IntLt).t is really a dependent type. But let's
sidestep dependent types for now.

As he points out (in the slides), this only works for pure applicative functors.
And compared to Ocaml, these stamps are robust to eta-expansion. That is, if you define MyIntLt = IntLt, you still have MkSet(MyIntLt) = MkSet(IntLt) != MkSet(IntGt), unlike in Ocaml where MkSet(MyIntLt) != MkSet(IntLt) != MkSet(IntGt).

I think "semantically we'd want contextual equivalence, but can't have it" should be a compelling argument. I'm not sure I like this conservative approximation, but it's certainly less conservative than pure generative functors, and I don't see downsides — Dreyer claims it solves at least all the problems in Ocaml.

There's a decent theory of it now, done by Russo, Dreyer, and Rossberg (in some order), but they are of marginal use in any language with state, certainly, and even without.

By looking at Dreyer's page, that paper must be the already cited F-ing modules (journal version).

On terminology, they also write:

Hence, from here on, when talking about functors, we will use “applicative” interchangeably with “pure”, and “generative” interchangeably with “impure”. (In fact, the correspondence is so natural and intuitive that we are tempted to retire the “applicative” versus “generative” terminology altogether. For historic reasons, however, we will continue to use the traditional terms in the remainder of this article.)

@Blaisorblade

This comment has been minimized.

Copy link

Blaisorblade commented Dec 14, 2014

To be extra-clear: I don't mean to be polemic, I'm sincerely interested in @RobertHarper and @jonsterling comments/arguments on Dreyer's design (and everybody else of course, but we seem to already buy Dreyer's points). I'm especially interested because I'd like to use similar designs in other contexts.

@joshtriplett

This comment has been minimized.

Copy link
Member

joshtriplett commented Dec 11, 2016

This proposal, to allow private impls but not public impls of traits/types you didn't create, seems completely sensible to me. In addition to allowing local convenience impls, it also simplifies local experimentation with trait impls before upstreaming those impls.

Does this proposal have any downsides? Does anyone have interest in moving it forward, with a concrete RFC? I'd be happy to help anyone interested in writing one.

@burdges

This comment has been minimized.

Copy link

burdges commented Dec 11, 2016

I've only skimmed some of the above, so maybe I've missed something, but..

These local impls make adding a generic impl a violently breaking change, as local impls can later fail irreparably. You can avoid that breakage if local impls override imported impls, but doing so creates another breakage where bug fixes in crates do not always propagate to dependents. You could compromise by allowing use to exclude some impls, but that's complicated, and still allows both breakage scenarios.

A priori, I'd think wrapper structs sounds like the right approach, especially coupled with delegation of implementation.

@Blaisorblade

This comment has been minimized.

Copy link

Blaisorblade commented Dec 12, 2016

These local impls make adding a generic impl a violently breaking change, as local impls can later fail irreparably. [...]

Thanks for pointing that out; semantic versioning guidelines would need updating. Could you elaborate on why that's actually a showstopper rather than a tradeoff to think about? Dropping a local instance when upgrading a dependency doesn't sound hard and it happens already in Haskell (if you resort to orphans). If instances are different then you need wrapper structs.
Or am I missing some Rust-specific reason?

On the other hand, disadvantages of wrapper structs and delegation of implementation have been discussed above in quite a few posts. In particular, if you go for wrapper structs, and ignore the needed boilerplate, you risk needing something like https://www.microsoft.com/en-us/research/wp-content/uploads/2016/07/coercible.pdf.

Moreover, delegation of implementation sounds related to "generalized newtype deriving", which in Haskell needed further fixes for a bunch of soundness issues in combination with other advanced features (IIRC, mostly GADTs and type families, which Rust might or might not have). I'm not sure any of those apply (yet) to your proposal though, and wrapper structs/delegation are useful beyond a hack for coherence.

@burdges

This comment has been minimized.

Copy link

burdges commented Dec 13, 2016

I'd imagine you'll get a better answer from someone more experienced, but..

Anything that makes reading or reasoning about the code harder sounds dangerous. In particular, a security audit retains its value much longer if the impl used gets determined explicitly by the call site and cannot be changed by code elsewhere in the program.

There was considerable opposition to adding an impl becoming a breaking change in past discussions of negative trait bounds #1658 although more careful forms made progress #1672 (comment). Also, impl specialization only changes default methods #1210 while the final impl must visibly exclude any methods it wants pulled from defaults #1268.

As an aside, there is a .0 member to unwrap a Rust wrapper like struct Mine<T>(pub T), while afaik you need a uniquely named accessor in Haskell, ala data Mine T = Mine { getMine :: T }, so maybe you're happier just unwrapping explicitly when you want impls that apply only to T.

@withoutboats

This comment has been minimized.

Copy link
Contributor

withoutboats commented Dec 13, 2016

Thanks for pointing that out; semantic versioning guidelines would need updating. Could you elaborate on why that's actually a showstopper rather than a tradeoff to think about? Dropping a local instance when upgrading a dependency doesn't sound hard and it happens already in Haskell (if you resort to orphans). If instances are different then you need wrapper structs.
Or am I missing some Rust-specific reason?

You're assuming that the impl defined in the dependency you're upgrading isn't materially different from the impl you've defined locally. If it does produce a materially different output that breaks the behavior of your code. For example, maybe you've locally defined Add on Vec<T> as a concat operator, because you think its so silly that Rust doesn't have that sugar, and then in std after years of debate we decide on some mathy operation to be performed instead. If you delete your impl and upgrade, your code will be totally broken.

But of course we could just accept that adding impls of existing traits for existing types is a breaking change. This works, except that right now there are a bunch of impls we don't have yet in std for various reasons (eg. any impl on an array of length greater than 32). So maybe someday, when we do think we have all the impls we want, you'd think we could do this.

The "philosophical" reason against this, even at that point in the future, is that we're trying (with some frustrating exceptions involving Send and Sync) to provide this guarantee: adding new items to your system is never a breaking change. Whether that's through adding a dependency or upgrading one, as you keep introducing new well-formed items the program keeps compiling. "Information monotonically increases." It keeps the upgrade story very smooth and upholds Rust's promise of "stability without stagnation."

People often lump this property into coherence, but coherence is about what happens when you change the compiler underneath the code, not changing the code itself. I've want to name this property "concordance" but I can't bring myself to finish a blog post.

@Blaisorblade

This comment has been minimized.

Copy link

Blaisorblade commented Dec 13, 2016

@burdges

This comment has been minimized.

Copy link

burdges commented Dec 13, 2016

At least a wrapper structs makes the propagation of impls transparent. If you delegate then any relevant changes propagate for those traits. Any trait commonly needing local wrappers with tweaked impls can provide smartish default methods via #1210.

A person reading code carefully needs to identify the impl referenced by every call site. Anything that makes this less local makes (a) reading the code significantly harder, and (b) the value of that reading far more transient. If one can make a private impl for say From<T>, then it can override even the reflexive impl, so every call to into() becomes suspect, even in library code. As does every imported macro.

We worry not just about bugs here but hostile developers too. An attacker can find a change in an impl in a library, a call site in an application they wish to backdoor, and send a "refactoring" pull request that "accidentally" replaces the new impl with the old impl so as to create a vulnerability, but their pull can reference the old code from the library. And they can embed the hostile impl into a macro in yet another create.

@withoutboats

This comment has been minimized.

Copy link
Contributor

withoutboats commented Dec 15, 2016

Thanks for your comment. When I said "if instances are different you need wrapper structs" I should have said "if impls are different", because of the kind of scenario you describe. So agreed there.

But one doesn't know in advance if the impl you write will be different from the impl that doesn't exist yet. We're not going to require people to rewrite their code to use newtype structs in order to upgrade to a new version of Rust, that would definitely meet our definition of a breaking change.

That would mean that local impls have to override global ones, among the last options, though this raises other questions on where the overriding applies. I haven't looked

There are a few cases where name resolution is an issue (for example, adding a default method to a trait is this sort of breaking change). Since we have a canonical, fully qualified, unambiguous way to name any item, we could consider a breaking change of this sort (all you have to do is change the syntax you use to perform a function call).

@glaebhoerl

This comment has been minimized.

Copy link
Contributor

glaebhoerl commented Dec 16, 2016

In the abstract, would it make logical sense to say that:

  • if you depend on a specific version of an upstream library, then you are free to define impls whose legality depends on the fact that the upstream doesn't currently define any impl that would conflict with them;

  • on the other hand, if you depend on a minimum version of that library, then any impls you define must be compatible with any impls the upstream could potentially add? (i.e., @withoutboats's argument, and current Rust policy)

It feels like these things are connected, even if Rust doesn't currently let you make the choice. (Of course, it would probably violate separation of concerns between Cargo and Rust, which is why I'm asking "in the abstract".)

@withoutboats

This comment has been minimized.

Copy link
Contributor

withoutboats commented Dec 16, 2016

@glaebhoerl You mean orphan impls here?

@glaebhoerl

This comment has been minimized.

Copy link
Contributor

glaebhoerl commented Dec 16, 2016

Yes I think? Isn't that what the discussion's about?

@withoutboats

This comment has been minimized.

Copy link
Contributor

withoutboats commented Dec 16, 2016

Yes, I just woke up and I'm quite groggy.

@burdges

This comment has been minimized.

Copy link

burdges commented Dec 17, 2016

If you want to depend on a specific version, then you can just "vender the dependency" to use Go speak. In that vein, you could maybe avoid violating separation between cargo and rustc by making the dependency live in your own source tree, and enforce the dependency on a specific version using git submodules.

As an aside, I do think Go's policy of expecting people to vender dependencies is kinda antithetical to free software, but this usage sounds reasonable. And git submodules stand out like a sore thumb too.

@RReverser

This comment has been minimized.

Copy link

RReverser commented Mar 1, 2017

I probably missed something in the discussion, but if concern is that upstream might add own impl in a minor version, then:

  1. This concern applies to any breaking changes in minor versions - e.g. you had local trait MyTrait in scope and implemented my_method for upstream type, but then upstream type defined own my_method in a minor version and your program doesn't compile anymore because of ambiguity. There is this fine line where we can only expect crate developers to be careful about semver, and not block new features on such concerns.

  2. Even if we want to avoid that issue with impl specifically, why not just always prefer impl that is in local scope to any imported? Then calling trait methods would follow normal scoping rules and wouldn't come much as a surprise for end developer, and wouldn't break on upstream addition of impl.

@withoutboats

This comment has been minimized.

Copy link
Contributor

withoutboats commented Mar 1, 2017

@RReverser

On your first point, the key difference is that there is a fully explicit syntax you just need to transform to to fix the error - <_ as MyTrait>::my_method. However, even though its an 'allowed' change, I know this fact is considered when evaluating new APIs and is a reason people aren't quick to move methods from itertools to Iterator for example.

On your second point, that would violate the coherence property we're trying to maintain. It would be very bad to violate this property because of something called 'the HashTable problem'. For example:

mod foo {
    impl Hash for i32 { ... }

    fn f(mut table: HashMap<i32, &'static str>) {
        table.insert(0, "hello");
        ::bar::f(&table);
    }
}

mod bar {
    impl Hash for i32 { ... }

    fn f(table: &HashMap<i32, &'static str>) {
        assert_eq!(table.get(0), Some("hello"));
    }
}

You might get a different value out for the same key if you move the HashMap between modules/crates, because it will Hash with a different impl. This would be quite bad. And of course it applies to all kinds of traits, not only Hash. In general, when people write code they assume that if you call a method on the same value, you keep dispatching to the same place. Basically, methods ought to be type scoped, not module scoped.

@burdges

This comment has been minimized.

Copy link

burdges commented Mar 1, 2017

In fact, there is a safer version of this that you can do in Rust @RReverser so long as all you want is convenience in your own crate. Just impl the trait for a wrapper struct like in this example. You can even derive traits for structs whose members do not impl the traits by wrapping those members too.

@RReverser

This comment has been minimized.

Copy link

RReverser commented Mar 1, 2017

@burdges That's exactly the entire point we're trying to avoid with local derives (which is this issue about) - not having Wrap(x) and x.0.original_prop everywhere around the code just for the sake of passing it to compatible trait consumers.

For example, if upstream defines own implementation for u8, I don't really want to wrap my usages of u8 just to substitute that specific behavior - I just want to use my locally defined trait as I would with local functions vs imports.

@RReverser

This comment has been minimized.

Copy link

RReverser commented Mar 1, 2017

You might get a different value out for the same key if you move the HashMap between modules/crates, because it will Hash with a different impl. This would be quite bad. And of course it applies to all kinds of traits, not only Hash. In general, when people write code they assume that if you call a method on the same value, you keep dispatching to the same place. Basically, methods ought to be type scoped, not module scoped.

@withoutboats I'm not sure this is a strong argument, this applies to any languages and to anything that can be named, not just traits - e.g. if I have same-named structure called IntNewType in two different types or same-named function write or using commonly named variable like i, it's obvious that moving it across files or even scopes might break the snippet.

Not really a trait issue in any way, just normal scope semantics.

@Blaisorblade

This comment has been minimized.

Copy link

Blaisorblade commented Mar 1, 2017

You might get a different value out for the same key if you move the HashMap between modules/crates, because it will Hash with a different impl. This would be quite bad. And of course it applies to all kinds of traits, not only Hash. In general, when people write code they assume that if you call a method on the same value, you keep dispatching to the same place. Basically, methods ought to be type scoped, not module scoped.

@withoutboats I'm not sure this is a strong argument, this applies to any languages and to anything that can be named, not just traits - e.g. if I have same-named structure called IntNewType in two different types or same-named function write or using commonly named variable like i, it's obvious that moving it across files or even scopes might break the snippet.

Not really a trait issue in any way, just normal scope semantics.

@RReverser There's a misunderstanding about "move".
In the example above, in fact, a hashtable misbehaves just because it is created in one module and used in another one. Not because the code manipulating it is moved (though that would also cause problems).
Also, even if you're impl-using code across modules, the code you're moving never refers explicitly to the impl it uses, so it's harder to check if you're breaking things. Worse, if you write a polymorphic method baz that uses an impl for a type parameter, the impls can't be resolved right away—the client site that picks the type argument also specifies the impl to use, which might not be the one that was in scope in baz's definition.

In general, impls are just not lexically scoped.

@withoutboats

This comment has been minimized.

Copy link
Contributor

withoutboats commented Mar 1, 2017

In addition, you don't get the hashtable problem if names refer to different items in different scopes, you get a type error. e.g.

mod foo {
    struct Foo;
    fn foo() -> u32 {
        ::bar::bar(Foo)
    }
}

mod bar {
    pub struct Foo(u32);
    pub fn bar(foo: Foo) -> u32 {
        foo.0
    }
}

You don't get some undefined behavior hear, you just get a compiler error. And you can always call ::bar::bar in foo by using the fully qualified name to construct a ::bar::Foo.

But impls don't have names and aren't scoped. In order to scope them, you'd have to give them names, and then pass them around as well ass the types. In order for this to behave well it wouldn't be HashMap<K, V> but HashMap<K, V, H, E> where H is a Hash for K impl and E is an Eq for K impl, and whenever you take a hashmap, you have to specify which impls for those traits you're taking. It would be unmanageable.

@RReverser

This comment has been minimized.

Copy link

RReverser commented Mar 1, 2017

In the example above, in fact, a hashtable misbehaves just because it is created in one module and used in another one. Not because the code manipulating it is moved (though that would also cause problems).

Oh right, thanks for explanation. Although not sure if that's a big problem? Exactly same applies to default fn implementations proposal after all where you can have one "default" implementation in module A, and own implementation in module B.

In general, impls are just not lexically scoped.

Yes, that's exactly what I'm suggesting to change.

@RReverser

This comment has been minimized.

Copy link

RReverser commented Mar 1, 2017

In addition, you don't get the hashtable problem if names refer to different items in different scopes, you get a type error. e.g.

Not if it's a function or variable name or ...

E.g. when you have fn calc_hash in two modules with same types but different behavior, same concerns apply - that's why I'm saying this is not specific to traits and rather a normal scoping behavior, which we could apply to impls too.

@withoutboats

This comment has been minimized.

Copy link
Contributor

withoutboats commented Mar 1, 2017

@RReverser Specialization (default fn) is not lexically scoped, the impls form a module-independent ordering based on their types, as in Vec<u32> is more specific than Vec<T> is more specific than T.

@withoutboats

This comment has been minimized.

Copy link
Contributor

withoutboats commented Mar 1, 2017

Namespacing is not the same as what we're talking about. The important thing is that you can't pass something with one shape from one module to another just because they have the same name.

@RReverser

This comment has been minimized.

Copy link

RReverser commented Mar 2, 2017

Specialization (default fn) is not lexically scoped, the impls form a module-independent ordering based on their types, as in Vec is more specific than Vec is more specific than T.

Ah fair enough.

The important thing is that you can't pass something with one shape from one module to another just because they have the same name.

@withoutboats Here you couldn't either, in both places object needs to implement same trait, even if specific implementations differ (just like with functions where you still must have same arguments&return types, but implementations don't matter).

@withoutboats

This comment has been minimized.

Copy link
Contributor

withoutboats commented Mar 2, 2017

That's not correct, the implementation is a part of the shape. HashMap only works if the Hash implementation for the key is the same every time its called.

What you're describing is not lexical scope but a very implicit kind of dynamic scope, because hashing calls inside of HashMap::insert or HashMap::get, which are not defined in these modules but inside the standard library, will call a different implementation of Hash depending on which module they themselves are called from.

HashMap can't call free functions from the module its called in unless you pass them in as arguments the same way it can with methods. Its totally different! Methods must be dispatched based on type and not scope, because they are passed around with the type.

@RReverser

This comment has been minimized.

Copy link

RReverser commented Mar 2, 2017

Well I'm open to hear other suggestions.

Example: I want all [u8] in my specific file to implement Debug not as everywhere else, but to be displayed as strings. Obviously, I don't want to change all my definitions of [u8] to &str or some Newtype<u8>, as that will have avalanche effect on all the usages inside my module and by other dependencies.

Neither I want to implement custom Debug for each separate structure that uses u8 inside of my file, as that will require manually writing formatters for all the other fields too.

Note that this isn't specific to Debug - for example, serde has to implement custom attributes like serialize_with / deserialize_with on fields to workaround this problem too, and there are many other traits where it would be useful to override behavior for limited scope.

If speaking not about lexical scope, would it be possible, for example, to have custom trait implementations bound to owner struct?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.