Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ISet Should Be Partially-Applied on its Order #671

Closed
djspiewak opened this issue Mar 13, 2014 · 48 comments
Closed

ISet Should Be Partially-Applied on its Order #671

djspiewak opened this issue Mar 13, 2014 · 48 comments
Milestone

Comments

@djspiewak
Copy link
Contributor

@djspiewak djspiewak commented Mar 13, 2014

So to set the stage, this is not an argument for or against local typeclasses or anything similar. I'm well aware of what typeclass coherence is. This is simply about what we can do within Scala.

The problem with the current ISet implementation is that it takes an Order[A] with every operation. This is entirely a non-issue if we can make the assumption that no one is ever, ever using more than one Order instance for the same type. However, the instant someone does this (and we cannot prevent it, because of Scala being…Scala), data is lost. This is directly analogous to the clearly-insane situation where someone puts a mutable value with an unstable equality into a Java hashtable.

The alternative to this implementation is to capture a single Order[A] at construction time. This implementation has no differences relative to the current situation if people are obeying the unspoken single-instance contract for typeclasses. The only difference arises when someone actually violates this rule and starts using multiple orderings. Specifically: no data will be lost. The ISet will continue to function exactly as per normal under all operations, despite the new (possibly contradictory) ordering that was introduced by the user.

The only point where things become different is in the case of union. The union of two sets involving different orderings is a major problem. The key realization though is it is no more or less of a problem under either implementation! If you union two sets with different orders under the current scheme, data will be lost from at least one side, and possibly both depending on how the orders interact. If you union two sets under different order with the captured, you still lose data. From an outside perspective, nothing changes.

What is interesting is that it is actually algorithmically possible to improve on the incoherent merge situation if we capture the order at construction time, whereas we have absolutely no hope if the order is always passed locally. I can provide more details on this point if necessary, but for now it suffices to say that there is an algorithm which can achieve a sub-log-linear merge (i.e. better than a full re-insertion) on a pair of lazy binary search trees preserving all data even when the trees are using distinct orderings (and this algorithm gracefully degrades to the standard merge algorithm when the orderings are identical). Said algorithm has no reliance on evil tricks like ord1 eq ord2 or similar.

Benefits to partially-applying on the Order at construction time:

  • Data is never lost, even when users do something unsound
  • Clever implementations to skirt around incoherence are possible
  • All existing use sites continue to work (except for those that are unsound but coincidentally working)

Downsides:

  • Introduces a local typeclass within ISet as a defensive measure against users introducing local typeclasses of their own
  • Not like Haskell

Honestly, I don't even see a question here. If you're not using local typeclasses, then this change carries absolutely no effect. If you are using local typeclasses, then this change removes silent data loss cases and enables potentially much more efficient implementation details.

@larsrh larsrh added this to the 7.1 milestone Mar 13, 2014
@pchiusano
Copy link
Member

@pchiusano pchiusano commented Mar 14, 2014

Sounds good to me, but I'd like to hear more about this fancy merge
algorithm that works even if trees are ordered differently.
On Mar 13, 2014 5:33 PM, "Daniel Spiewak" notifications@github.com wrote:

So to set the stage, this is not an argument for or against local
typeclasses or anything similar. I'm well aware of what typeclass coherence
is. This is simply about what we can do within Scala.

The problem with the current ISet implementation is that it takes an
Order[A] with every operation. This is entirely a non-issue if we can
make the assumption that no one is ever, ever using more than one Orderinstance for the same type. However, the instant someone does this (and we
cannot prevent it, because of Scala being...Scala), data is lost. This is
directly analogous to the clearly-insane situation where someone puts a
mutable value with an unstable equality into a Java hashtable.

The alternative to this implementation is to capture a single Order[A] at
construction time. This implementation has no differences relative to
the current situation if people are obeying the unspoken single-instance
contract for typeclasses. The only difference arises when someone actually
violates this rule and starts using multiple orderings. Specifically: no
data will be lost. The ISet will continue to function exactly as per normal
under all operations, despite the new (possibly contradictory) ordering
that was introduced by the user.

The only point where things become different is in the case of union. The
union of two sets involving different orderings is a major problem. The key
realization though is it is no more or less of a problem under _either_implementation! If you union two sets with different orders under the
current scheme, data will be lost from at least one side, and possibly
both depending on how the orders interact. If you union two sets under
different order with the captured, you still lose data. From an outside
perspective, nothing changes.

What is interesting is that it is actually algorithmically possible to
improve on the incoherent merge situation if we capture the order at
construction time, whereas we have absolutely no hope if the order is
always passed locally. I can provide more details on this point if
necessary, but for now it suffices to say that there is an algorithm which
can achieve a sub-linear merge on a pair of lazy binary search trees
preserving all data even when the trees are using distinct orderings (and
this algorithm gracefully degrades to the standard merge algorithm when the
orderings are identical). Said algorithm has no reliance on evil tricks
like ord1 eq ord2 or similar.

Benefits to partially-applying on the Order at construction time:

  • Data is never lost, even when users do something unsound
  • Clever implementations to skirt around incoherence are possible
  • All existing use sites continue to work (except for those that are
    unsound but coincidentally working)

Downsides:

  • Introduces a local typeclass within ISet as a defensive measure
    against users introducing local typeclasses of their own
  • Not like Haskell

Honestly, I don't even see a question here. If you're not using local
typeclasses, then this change carries absolutely no effect. If you _are_using local typeclasses, then this change removes silent data loss cases
and enables potentially much more efficient implementation details.

Reply to this email directly or view it on GitHubhttps://github.com//issues/671
.

@djspiewak
Copy link
Contributor Author

@djspiewak djspiewak commented Mar 14, 2014

I'll see about writing it up more fully. The essence of it is lazily amortizing the reordering required to make the conventional algorithms work. The amortization can be performed in such a way that the complexity addition to subsequent operations decays logarithmically, thus avoiding compounding complexity with subsequent unions. I suspect it is possible to play this trick with intersect as well, but I haven't given it any thought.

My only concern at present is the fact that the algorithm I have in mind requires an enormous amount of thunking, which is likely to make baseline constant-factor performance dramatically slower on the JVM. An evil fast-path would be to use eq as a way to detect the common case.

@tonymorris
Copy link
Member

@tonymorris tonymorris commented Mar 14, 2014

This would be an extremely bad idea. I am begging, seriously, let us not do these terrible, costly things that provide zero benefit.

I have work to do now.

@djspiewak
Copy link
Contributor Author

@djspiewak djspiewak commented Mar 14, 2014

Zero cost with several significant benefits (ie not being completely broken
in certain cases).

On Thursday, March 13, 2014, tonymorris notifications@github.com wrote:

This would be an extremely bad idea. I am begging, seriously, let us not
do these terrible, costly things that provide zero benefit.

I have work to do now.


Reply to this email directly or view it on GitHubhttps://github.com//issues/671#issuecomment-37611145
.

@tonymorris
Copy link
Member

@tonymorris tonymorris commented Mar 14, 2014

Yeah sure. This kind of nonsense does not belong in a useful project. Keep it out please.

@puffnfresh
Copy link
Member

@puffnfresh puffnfresh commented Mar 14, 2014

Benefits:

  1. None (I do not accept "making type-class incoherency less obviously broken" as a benefit)

Downsides:

  1. Puts a constraint on creation and all uses of ISet

And I don't know about the "not like Haskell" downside: Haskell could even do this, but it'd be more obviously bad.

@djspiewak
Copy link
Contributor Author

@djspiewak djspiewak commented Mar 14, 2014

It's not obviously broken now. You lose data and you know now why. With my
proposal, you don't lose data and there is the potential for efficient
solutions to some of the problems that are unavoidable with call site
constraints. At the very least, there are solutions. With call site
constraints, it's just silently wrong.

If you don't use local typeclasses, this doesn't affect you. The only cases
which would fail following this change are already horribly broken.

On Thursday, March 13, 2014, Brian McKenna notifications@github.com wrote:

Benefits:

  1. None (I do not accept "making type-class incoherency less obviously
    broken" as a benefit)

Downsides:

  1. Puts a constraint on creation and all uses of ISet

And I don't know about the "not like Haskell" downside: Haskell could even
do this, but it'd be more obviously bad.


Reply to this email directly or view it on GitHubhttps://github.com//issues/671#issuecomment-37613686
.

@puffnfresh
Copy link
Member

@puffnfresh puffnfresh commented Mar 14, 2014

@djspiewak making broken things (e.g. local type-classes) only seem less broken (without actually changing the brokeness) while sacrificing sensible things (e.g. the universally quantified constraints on functions) is not a goal.

@puffnfresh
Copy link
Member

@puffnfresh puffnfresh commented Mar 14, 2014

@djspiewak I just want to leave this by pointing out that it does affect me:

scala> val example1 = ISet.empty[Int => String].isEmpty
example1: Boolean = true

scala> val example2 = ISet.singleton[Int => String](_.toString).isEmpty
example2: Boolean = false

scala> val example3 = ISet.singleton[Int => String](_.toString).deleteMin
example3: scalaz.ISet[Int => String] = Tip()
@tonymorris
Copy link
Member

@tonymorris tonymorris commented Mar 14, 2014

It is possible to debunk this nonsense with a whisper of breath. It is annoying to be compelled to expend any further effort on it. I refuse to do so.

@tonymorris tonymorris closed this Mar 14, 2014
@runarorama
Copy link
Member

@runarorama runarorama commented Mar 14, 2014

I'm in agreement that this is a bad idea.

The basic point for me is that it would be impossible to construct an empty set of type Forall[ISet], which makes no sense. The existential type class thing works alright until you need to do something that is not completely first-order.

Basically, if you are using more than one instance of a type class per type, then you're doing something very wrong.

@nuttycom
Copy link
Contributor

@nuttycom nuttycom commented Mar 14, 2014

I also agree that this is a bad idea. It's regrettable that we don't have genuine newtypes in scala (I still can't reason about all the occasions under which value classes cause boxing), but I don't believe that we should ever make a change that supports incoherent type classes. If someone wants to do something that awful, they always have the option to maintain their own fork for their evil purposes; I think that the chances of such a fork becoming dominant are infinitesimal.

@djspiewak
Copy link
Contributor Author

@djspiewak djspiewak commented Mar 14, 2014

Apparently a simple matter of correctness and robust API design in the face
of a language misfeature has morphed into a minor crusade against said
misfeature. I believe I specifically attempted to preempt that in sentences
one through three.

In any case, it's done, and clearly ISet will remain as it is.

On Thursday, March 13, 2014, Kris Nuttycombe notifications@github.com
wrote:

I also agree that this is a bad idea. It's regrettable that we don't have
genuine newtypes in scala (I still can't reason about all the occasions
under which value classes cause boxing), but I don't believe that we should
ever make a change that supports incoherent type classes. If someone wants
to do something that awful, they always have the option to maintain their
own fork for their evil purposes; I think that the chances of such a fork
becoming dominant are infinitesimal.


Reply to this email directly or view it on GitHubhttps://github.com//issues/671#issuecomment-37617494
.

@tonymorris
Copy link
Member

@tonymorris tonymorris commented Mar 14, 2014

Yes clearly. Let's move on now.

@pchiusano
Copy link
Member

@pchiusano pchiusano commented Mar 14, 2014

I didn't realize this was such a contentious issue. Could someone explain to me "like I'm five", why this is a bad idea? I am genuinely asking. Here are the points I see that have been made:

  • Incoherent/local typeclasses are a bad idea, so any attempt to support this use case better is not something we should bother with. I have a question related to this - will the current implementation of union for ISet just silently fail or lose data if one or both of the sets was constructed with a different Order than the one used for union? If so, that seems like something that would be nice to improve on, all else being equal.
  • I find @runarorama's point about the empty set no longer having the type forall a . ISet a kind of interesting.
  • @puffnfresh can you explain your example? What are you illustrating there?

Also, for the record, I think @djspiewak was posting this in good faith, and you guys were pretty rude to him.

@pchiusano pchiusano reopened this Mar 14, 2014
@tonymorris
Copy link
Member

@tonymorris tonymorris commented Mar 14, 2014

This is really unfair and unreasonable.

@runarorama
Copy link
Member

@runarorama runarorama commented Mar 14, 2014

Yes, absolutely it will silently lose data. If you e.g. union with an Order[A] that considers all elements of A to be equal, you will lose all but one element from your sets. But that's no different than mapping to Unit. Basically if you provide an ordering that behaves like Unit, then you will get a result that behaves like Unit.

Again, if you're using more than one ordering per type, you are already off the rails.

@pchiusano
Copy link
Member

@pchiusano pchiusano commented Mar 14, 2014

@runarorama that example actually sounds okay to me, because the resulting ISet is well-formed with respect to the Order that was passed to union. You may have chosen a silly Order that considers all elements equal, but that seems okay to me. I was more asking about a use case where say I merge two ISet[A] where one was built with x: Order[A] and the other was built with x.reverseOrder, and I then union them with x as the Order. Will the resulting ISet be silently ill-formed or missing data, with respect to x, the ordering passed to union?

I don't think I agree that it is necessarily a problem having more than one Order per type. I may be dynamically generating Order[A] values, storing them in a Map[String, Order[A]], and letting users pick an Order[A] by name at runtime. (Kind of like what we were doing with assembling Monoid/Reducer based computations at runtime.) Clearly this sort of thing is sometimes useful.

It seems the issue is not with having multiple instances per se, it's that we sometimes would like to enforce or assume that there is a single instance, for operations like union, or for other reasons (like if we don't want to have to specify an Order for ISet.empty).

Also, not that I endorse doing this, but you can enforce the one-instance rule by tagging your instances, and using the same universal quantification trick as ST, so something like:

newtype Tag r a = Tag a

newtype Order a = Order (a -> a -> Ordering)

use :: Order a -> (forall t . Tag t (Order a) -> Tag t r) -> r
empty :: Tag t (Set a)
fromList :: Tag t (Order a) -> [a] -> Tag t (Set a)
union :: Tag t (Set a) -> Tag t (Set a) -> Tag t (Set a)
-- other operations propagate tag as expected
The type system now prevents you from union-ing two Set a built with different instance. You are forced to introduce Order instances in a stack-like discipline. We could actually do this sort of thing for Scala, but I suspect it gets kind of ugly.

@runarorama
Copy link
Member

@runarorama runarorama commented Mar 14, 2014

"I was more asking about a use case where say I merge two ISet[A] where one was built with x: Order[A] and the other was built with x.reverseOrder"

In that case, you should be fine. But, you should have two different types for those orderings. In fact, perhaps reverseOrder should return a new type Reverse[A].

"Clearly this sort of thing is sometimes useful."

It's not all that clear to me. But the larger question here isn't whether this or that thing is useful. It is whether we should try to correct for incoherence introduced by the user. No, we should not. We should absolutely assume that the user is moral and good, and if they try to pass null, or construct values that contain exceptions and side effects, or use incoherent type class instances, then we shouldn't try to compensate for that. When the user does something unsound, all bets are off, and they should be off.

@djspiewak
Copy link
Contributor Author

@djspiewak djspiewak commented Mar 14, 2014

It's very important to note that, because Scala doesn't implement typeclasses but rather implicit dictionaries, it is possible to get into a bad situation with ISet as it stands without any form of incoherence. Just pass the order explicitly. No rules have been violated aside from the undocumented internal rules of ISet.

@pchiusano
Copy link
Member

@pchiusano pchiusano commented Mar 14, 2014

I know why null, exceptions, side effects, and so on are bad. The arguments against multiple instances seem to just be that in some contexts, it is preferable to assume one instance exists. Yes, I agree. But is that really it? Or is there more to it?

Furthermore, what difference does it make if I supply the Order instance at construction time vs on each operation? I understand that it's nice to have the empty set have the type forall a . Set a, so there's that, but let's ignore the empty set case for a minute and pretend we're discussing a NonEmptySet type. Like, what is the actual argument for why it matters where I supply this evidence? If your program doesn't use multiple instances, why do you even care?

I thought about this and came up with the following:

  • union now has to do something arbitrary when given sets built with different orderings. By your own argument, who cares about this case?
  • union is no longer commutative when multiple instance are used. Even if we have some clever union algorithm, we have to pick one order or the other for the result. I think this is actually a somewhat compelling argument. OTOH, do we really care about this case of multiple instances being used?
  • Anything else?
@djspiewak
Copy link
Contributor Author

@djspiewak djspiewak commented Mar 14, 2014

union would still commute so long as the following two conditions are held:

  • Both the left and right Order instances are total
  • The Order for a given ISet is not part of its identity

The first criterion is trivial, while the second criterion is obvious when you consider that equality cannot be generally defined on Order.

@pchiusano
Copy link
Member

@pchiusano pchiusano commented Mar 14, 2014

@djspiewak I don't see how that's possible. What if I do a union b, where a is ordered using x: Order[A], and b is ordered using x.reverseOrder? How can that possibly return the same thing as b union a? We have to pick either x or x.reverseOrder as the order for the resulting Set.

Or did you mean to say that the ordering of the set's elements is not part of its identity? Because I don't buy that assumption - if Set has a toList function, which returns the elements in some order, then that order becomes observable to other code. You could make functions like toList take an Order and re-sort according to that order, but that would kind of defeat the purpose of an ordered set type, for many use cases.

@puffnfresh
Copy link
Member

@puffnfresh puffnfresh commented Mar 14, 2014

@pchiusano only one instance per type-class per type gives type-class coherency, also known as confluence or instance consistency. It's an important property for reasoning and sometimes performance. There's some papers which quickly describe why we want this property - but nothing very in-depth.

As to why not to do this:

{-# LANGUAGE ExistentialTypes #-}
data Set a = Ord a => Tip | Bin (Set a) a (Set a)

Well, it's just negatively useful. We now put an artificial constraint on our data type. Now all of our functions have more constraints - I want the minimum amount of constraints, since that gives me the most powerful parametricity.

I can go on, but I think this demonstrates that this is a bad idea!

@runarorama
Copy link
Member

@runarorama runarorama commented Mar 14, 2014

No, the burden of proof is the other way. What is the actual argument for
why you should supply the Order instance at construction time? Is it really
that multiple instances are convenient in some contexts? Is that all there
is to it? If so, then why should we attempt to get in your way by having
you supply your order instances to each set rather than to each operation?

Let's try to keep this discussion at the level of principles. Rather than
think about what is expedient in this particular case, what is a good
general policy that applies here, and why?

On Fri, Mar 14, 2014 at 9:42 AM, Paul Chiusano notifications@github.comwrote:

I know why null, exceptions, side effects, and so on are bad. The
arguments against multiple instances seem to just be that in some contexts,
it is preferable to assume one instance exists. Yes, I agree. But is that
really it? Or is there more to it?

Furthermore, what difference does it make if I supply the Order instance
at construction time vs on each operation? I understand that it's nice to
have the empty set have the type forall a . Set a, so there's that, but
let's ignore the empty set case for a minute and pretend we're discussing a
NonEmptySet type. Like, what is the actual argument for why it matters
where I supply this evidence? If your program doesn't use multiple
instances, why do you even care?

I thought about this and came up with the following:

  • union now has to do something arbitrary when given sets built with
    different orderings. By your own argument, who cares about this case?
  • union is no longer commutative when multiple instance are used. Even
    if we have some clever union algorithm, we have to pick one order or the
    other for the result. I think this is actually a somewhat compelling
    argument. OTOH, do we really care about this case of multiple instances
    being used?
  • Anything else?

Reply to this email directly or view it on GitHubhttps://github.com//issues/671#issuecomment-37668840
.

@djspiewak
Copy link
Contributor Author

@djspiewak djspiewak commented Mar 14, 2014

toList is a good counter-example, though I'm generally of the mindset that toList shouldn't exist on sets at all. Nevertheless, it does exist, and would most certainly show a difference. We could probably do something evil to make Order selection consistent (if (ord1.hashCode < ord2.hashCode) ord1 else ord2 anyone?), but that seems like a terrible concept.

@djspiewak
Copy link
Contributor Author

@djspiewak djspiewak commented Mar 14, 2014

@runarorama I gave a long argument in the OP why the current situation is problematic, and I showed later on how users can get into that state without even violating typeclass coherence. The only strong counter-argument I've heard to date is the Forall[ISet] case, which I think would be a fairly serious loss. What else is there, though? Losing Forall[ISet] is a loss of convenience. Leaving ISet the way it is perpetuates a lack of correctness in the API.

@runarorama
Copy link
Member

@runarorama runarorama commented Mar 14, 2014

"The problem with the current ISet implementation is that it takes an Order[A] with every operation."

This is not a problem in and of itself.

"This is entirely a non-issue if we can make the assumption that no one is ever, ever using more than one Order instance for the same type."

Exactly. Which is why it's not a problem in and of itself. It only becomes an issue if there are incoherent type class instances in play. In which case, any attempt to correct for that is as misguided as an attempt to correct for null. Seems pretty open and shut to me.

@djspiewak
Copy link
Contributor Author

@djspiewak djspiewak commented Mar 14, 2014

It only becomes an issue if there are incoherent type class instances in play.

Explicitly pass two different orders. No typeclasses involved.

@puffnfresh
Copy link
Member

@puffnfresh puffnfresh commented Mar 14, 2014

@djspiewak explicitly passing something violates coherency.

@runarorama
Copy link
Member

@runarorama runarorama commented Mar 14, 2014

"Explicitly pass two different orders. No typeclasses involved."

Right, so if the user chooses to do that, that is their choice. The law should not try to stop you from doing something stupid with your own life.

@runarorama
Copy link
Member

@runarorama runarorama commented Mar 14, 2014

At the end of the day, everything in Scalaz comes implicitly with the following warning:

Do not expect this to work in a lawful and sane way in the face of null, side effects, exceptions, and incoherence. Scalaz is a mirror. If a monkey looks in, no apostle looks out.

@djspiewak
Copy link
Contributor Author

@djspiewak djspiewak commented Mar 14, 2014

@djspiewak explicitly passing something violates coherency.

For the love of god, will someone please link me a definition? This isn't twitter; we can actually cite things.

@pchiusano
Copy link
Member

@pchiusano pchiusano commented Mar 14, 2014

@puffnfresh Thank you for your response. But it's not 'more constraints across the board'. We accept more constraints on empty in exchange for fewer constraints on functions like add :: a -> Set a -> Set a, remove, and so on. Suppose I'm in a higher order context where I've received a Set a and possibly another way of getting a values, but know nothing else about a. Being able to call add or remove in these contexts might be useful. The alternative would be propagating the Ord a dependency. Of course, you can do that, but one approach is not always better than the other.

@runarorama I don't see how this decision (of where to bind constraints) can be resolved by general principle. It feels like a question of what factoring makes the most sense for your program. For instance, one sometimes uses a Coyoneda-like construction, which in a sense defers having to provide a constraint on the type constructor until you actually want to 'run' the data type. This is not always a good choice, nor is it always a bad choice. Seems to me it's the same sort of thing here. So, I would still like to hear an argument one way or the other. If your argument is just that you don't like empty to require any constraints, then okay. If you have some other argument, then let's hear it! I don't see the arguments about 'not supporting typeclass incoherency' as being particularly relevant to the question of where the constraint should be bound, especially in Scala. I think we all agree that mixing multiple instances of Order when working with sets is bad, so we can stop talking about that. :)

For the record, I do not have strong feelings one way or another, nor do I have a strong argument for one way or the other for this particular case.

@runarorama
Copy link
Member

@runarorama runarorama commented Mar 14, 2014

"I think we all agree t hat mixing multiple instances of Order when working with sets is bad, so we can stop talking about that."

That is the whole issue here:

"This is entirely a non-issue if we can make the assumption that no one is ever, ever using more than one Order instance for the same type."

So I agree, let's stop talking about this.

@puffnfresh
Copy link
Member

@puffnfresh puffnfresh commented Mar 14, 2014

@djspiewak Order is an emulation of a type-class - it is not meant to be used as a record. Sometimes we might need to, because Scala makes it so, but we accept that this is going outside of intended use and therefore we must be careful.

Functional Pearl: Implicit Configurations quickly defines coherence:

Coherence means that two typing derivations for the same term at the same type in the same environment must be observationally equivalent.

Standard Haskell ensures coherence by prohibiting overlapping instances. By requiring that every local instance mention an opaque type, we ensure that two local instances from different scopes cannot overlap—at least, not if their parameters are fully instantiated.

In Scala, we use implicit parameters for type-classes, but we have to leave that as an implementation detail in order to have hopes of guaranteeing the above.

@pchiusano
Copy link
Member

@pchiusano pchiusano commented Mar 14, 2014

@runarorama you still did not respond to my point about where constraints 'should' be bound. Can you please do that, otherwise we are just talking past each other.

@djspiewak
Copy link
Contributor Author

@djspiewak djspiewak commented Mar 14, 2014

@puffnfresh Thanks for the definition. I think we are now going somewhat beyond the scope of this ticket, but I'll point out the critical element here:

Coherence means that two typing derivations for the same term at the same type in the same environment must be observationally equivalent.

The definition of coherence is a function of your definition of environment, which is essentially the argument I was making in New York.

@puffnfresh
Copy link
Member

@puffnfresh puffnfresh commented Mar 14, 2014

@pchiusano here is isEmpty:

final def isEmpty =
  this match {
    case Tip() => true
    case Bin(_, _, _) => false
}

We know nothing about what the Set holds, so it has to be a function only of the Set's structure. Woo, parametricity! 🎆

Let's add Order:

final def isEmpty(implicit o: Order[A]) =
  /* let's do something awful by comparing elements */

By giving our function another constraint, we've limited what we know about the function by limiting what we know about what the function does not do.

@larsrh
Copy link
Contributor

@larsrh larsrh commented Mar 14, 2014

I'm closing this issue. There's recent precedent with #619 to not capture instances in data types; see also our ports of Data.Set and Data.Map. Older occurrences of the same thing are in e.g. BKTree. Feel free to discuss this matter further on the mailing list, though.

(To contributors: Please do not reopen this issue.)

@larsrh larsrh closed this Mar 14, 2014
@runarorama
Copy link
Member

@runarorama runarorama commented Mar 14, 2014

@pchiusano Consider a simpler case, a type class class C a where foo :: a -> Bool. There is a function forall a. bar :: C a => a -> a -> a with the following law:

If foo x and foo y then bar x y = x.

Now we change the type of bar to:

bar :: forall a. exists C. (exists C. (C a, a)) -> (exists C. (C a, a)) -> (C a, a)

Our law is now incoherent. How do we reason about bar (bar x y) z using our law now? Which instance of C is the winner? The central issue here is incoherence and overlapping instances. This is not about anything else.

You can come up with a one-off, ad-hoc, and completely unnecessary solution to this question of how to recover the coherence of the bar law. But it is just that, one-off, ad-hoc, first-order, and completely unnecessary.

@pchiusano
Copy link
Member

@pchiusano pchiusano commented Mar 14, 2014

@puffnfresh Yes, that makes sense, though depending on where constraints are, it means you get possibly fewer free theorems in other contexts that use Set, which must now accept Order constraints and may use this for evil purposes, even if they are just calling add or remove. The constraints (and hence loss of free theorems) have to be paid for one way or the other, somewhere in the code.

@runarorama let me try to paraphrase your argument. You are saying that when you have some interface, governed by certain laws, it can be incoherent / undefined when implementing operations that accept multiple existential instances. It's not clear which existential instance should win, and in general, this has no correct answer, so let's not even try to answer it. And in general, existential instances are rather questionable in many cases anyway.

I think I understand all the arguments here so I'm just going to drop it. Thanks for the discussion, and @larsrh sorry for hijacking this thread. :)

@runarorama
Copy link
Member

@runarorama runarorama commented Mar 14, 2014

@pchiusano Yes, exactly. I would make an even stronger statement and say that a general policy against existential instances is warranted because, more generally, we should not think of type class instances as values, even though Scala allows us to do that. Because thinking of them that way is harmful to reasoning about programs.

@nuttycom
Copy link
Contributor

@nuttycom nuttycom commented Mar 15, 2014

@runarorama this is a little off-topic, but I wonder about the policy you're suggesting, because I think that existential instances have a place: specifically, they're important for information hiding. Existential instances at least seem like they allow you to potentially limit your maintenance burden by exposing only a minimal set of operations for a type. For a trivial example, something like this:

data Foo a = forall f. Foldable f => Foo (f a)

seems appropriate when I don't want to reveal anything more about a type other than the fact that it can be folded over, but still leave multiple providers of values of this type free to choose whatever representation is appropriate. If you disallow existential typeclasses, what's the alternative here?

In my mind, there's a symmetry here: when defining functions, you may universally quantify the input to give your clients flexibility, and you may existentially quantify the output (with a data type as defined above) to give the library designer (yourself) similar flexibility.

@runarorama
Copy link
Member

@runarorama runarorama commented Mar 15, 2014

@nuttycom It's not existential type classes that I want to disallow, but treating instances as values. You would resolve this in Scala in the same way that it's resolved in Haskell: using universally quantified types.

trait Fold[A,R] {
  def apply[F[_]:Foldable](f: F[A]): R
}

sealed trait Foo[A] {
  def apply[R](f: Fold[A,R]): R
}

object Foo {
  def apply[F[_]:Foldable, A](a: F[A]): Foo[A] = new Foo[A] {
    def apply[R](f: Fold[A,R]) = f(a)
  }
}

Note that while the Foldable instance is closed over, it is not kept in a record. So the Foldable constraint appears both in the construction and deconstruction of Foo:

// Requires `Foldable[List]`
val foo: Foo[Int] = Foo(List(1,2,3))

val x = foo(new Fold[Int, Int]) {
  def apply[F[_]:Foldable](xs: F[Int]) =
    // Can reference the foldable instance, but it's not usable outside of here
    Foldable[F].foldMap(xs)(x => x)
})
@nuttycom
Copy link
Contributor

@nuttycom nuttycom commented Mar 15, 2014

@runarorama Ah, very clever. Thanks!

@etorreborre
Copy link
Contributor

@etorreborre etorreborre commented Mar 17, 2014

@runarorama there is a typo in your code example

// Requires `Foldable[List]`
val foo: Foo[Int] = Foo(List(1,2,3))

//                      here v
val x = foo(new Fold[Int, Int] {
  def apply[F[_]:Foldable](xs: F[Int]) =
    // Can reference the foldable instance, but it's not usable outside of here
    Foldable[F].foldMap(xs)(x => x)
})
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
9 participants