New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Heap bugs with Order #1236

Closed
chrisokasaki opened this Issue Aug 1, 2016 · 4 comments

Comments

Projects
None yet
3 participants
@chrisokasaki

chrisokasaki commented Aug 1, 2016

(I don't think this is likely to be a new issue, but I couldn't find any previous issues that discussed it.)

Because Heap's insert method takes the Order as an implicit parameter, it is very easy to get nonsense results by supplying incompatible orders, as shown below:

import scalaz._
import Scalaz._

def revOrd: Order[Int] = implicitly[Order[Int]].reverseOrder

val h = Heap.singleton(5).insert(7)
val h2 = h.insert(4)(revOrd)

println(h2.minimum)  // prints 5!

Of course, a programmer that does this deliberately may deserve what they get, but, because of implicits, this could easily happen accidentally when somebody creates a heap in one scope and inserts into it in a different scope that has a different implicit order defined..

Similar issues occur with the other insert methods, as well as with union.

This bug is almost certainly because the code was ported from Haskell, where a given type can only have a single Ord a instance, but Scala makes no such guarantee.

I've written a description of one approach to fixing the API of heaps to avoid this issue:
https://github.com/chrisokasaki/scads/blob/master/design/heaps.md
Fixing the issue for insert is fairly easy. Fixing the issue for union takes more work.

@S11001001

This comment has been minimized.

Member

S11001001 commented Aug 1, 2016

That's true, but we deliberately make the simplifying assumption that you will define typeclass instances in a globally coherent way, and use Tag or other newtype mechanisms if you wish to introduce incompatible instances, not the introduction of different implicits into scope.

@djspiewak

This comment has been minimized.

Contributor

djspiewak commented Aug 2, 2016

@chrisokasaki I raised a similar issue some time ago (#671), just for some historical context. Because of the broadly coherence-dependent assumptions that surround the scalaz API, I doubt this is a line of research that would be considered useful for the library.

Regarding your proposal document:

  • @bvenners has done some work using path dependent types to constrain coherence. He was specifically looking at Equal, but Order also applies. Off the top of my head, I think this talk details his latest as of last September: https://www.youtube.com/watch?v=UBjzbkhvYTU
  • I like the idea of getting around the union problem by just not defining merge. The problem is that most uses of heap structures are actually sets, and thus need union (at least in my experience).
  • I'm pretty sure that it's possible to resolve the union coherence problem by lazily amortizing the operation. My brain is foggy, and the last time I did any formal work on this was almost 3 years ago, but I had a proof sketch (NOT complete!) of O(n) for two sets which are secretly ordered using the same semantic, and O(n log n) in the worst case (when they have a completely incompatible ordering). No Order instance comparisons were required.
  • Are you aware of @odersky's currently ongoing efforts to redesign the Scala collections? Just wanted to give you a heads up in case you weren't already aware of the effort. lampepfl/dotty#818
@chrisokasaki

This comment has been minimized.

chrisokasaki commented Aug 2, 2016

@djspiewak Thanks. I was unaware of the Bill Venners video until somebody else pointed me to it earlier today. He is certainly doing similar things and is farther along in the process as well!

I happened to be at a meeting with Martin two weeks ago, and he encouraged me to flesh out a proposal for heaps.

@S11001001

This comment has been minimized.

Member

S11001001 commented Mar 27, 2017

Some folks have found my previous comment unwelcoming or rude to @chrisokasaki, or think that I've been overly dismissive of an absolutely accurate concern. I understand that; I relied on lots of implicit assumptions and context about Scalaz that haven't really been written down, and you couldn't be expected to know without spending a lot of time in the community, more time than many have available. So I thought I would expand on @djspiewak's point, which is an accurate summary:

Because of the broadly coherence-dependent assumptions that surround the scalaz API, I doubt this is a line of research that would be considered useful for the library.

In Scalaz, we think that global typeclass coherence is very important for usability and refactorability when employing "the typeclass pattern", whether by defining your own classes, instances, or just using the classes+instances defined by Scalaz. I want to highlight the "refactorability" concern specifically, because we aren't always the best at choosing the right place to place constraints, or applying the correct constraints to the instances we provide, so they sometimes move around from release to release.

With coherence, our users can be assured that our changes here will not silently break working code; if anything, they will manifest as a compiler error where additional constraints are required.

Of course, this comes at the cost of both carefully following the coherence rules ourselves, and demanding that our users follow these rules, too. For example, we ask users who wish to define incompatible instances (i.e. alternative "coherence domains") to combine global coherence with @@ newtype tagging¹; the type-level hackery involved with that isn't the easiest requirement to internalize, because we don't have a truly elegant or well-constrained way to subst. Compounding the problem, we don't do the best job of telling our users what they're in for, or what they'll get out of following our rules.

In my experience, and I suspect that of many other Scalaz users, the usability and refactorability benefits well exceed these costs.

Among the conveniences we can provide to users because of this shared covenant of coherence is that the interface for Heap et al can be simplified. @chrisokasaki's design is of great interest for supporting guaranteed correctness in the face of potentially incompatible instances (which, again, admittedly is a prerequisite for supporting Scala in general). If you are concerned about this, or have similar need for enforced invariants in the relationship between values (e.g. the Order and the various Heaps derived from it), definitely check it out!

However, in Scalaz, we have already charged users with preserving coherence, though; in this, they have already paid for consistently-structured mergeable heaps et al, and it would be a needless usability barrier to charge them more.

I should emphasize again, I agree 100% with the point that nonsense will happen if you do not have global coherence, e.g. you have "a different scope that has a different implicit order defined". We don't have a means in Scala to enforce global coherence², so these things can potentially happen. Nevertheless, the practical API usability and implementation advantages of assuming coherence seem to outweigh by quite a lot the danger of failing to account for incoherent instances, in my experience.


  1. As @djspiewak said elsewhere (lampepfl/dotty#2029), "the tag in the nominal type comes out of the uniqueness of the path-dependent instantiation" 😄 #1306
  2. And this will remain true under lampepfl/dotty#2047 supposing it's accepted, though that will be a great aid to usability.

@S11001001 S11001001 closed this Apr 5, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment