Overloading based implementation of Set, TreeSet, Map and TreeMap #24

julienrf · 2017-02-06T13:43:11Z

I sketched the integration of Set, TreeSet, Map and TreeMap collections. They were challenging because Map takes two type parameters where previous collections were only taking one type parameter, and also because TreeSet and TreeMap take an additional implicit Ordering parameter.

This is an alternative to #23 that makes Set extend Iterable and SortedSet have two overloads of map (one has an additional implicit Ordering parameter).

julienrf · 2017-02-06T13:44:31Z

src/main/scala/strawman/collection/Sorted.scala

+}
+
+/** Polymorphic transformation methods on sorted collections */
+trait SortedPolyTransforms[A, +C[X] <: Sorted[X]] {


This trait is analogous to IterablePolyTransforms but polymorphic transformation methods takes an additional implicit Ordering parameter.

It would be possible to abstract over different evidence types (e.g. Ordering[_] vs _ <:< Int) but at least the Int case could be done with a simpler API. Do we have any other similar cases?

Maybe ClassTag[_], depending on how we deal with arrays. I don’t understand what you mean with “the Int case”.

I mean that you wouldn't want to overload BitSet.map as def map[B](f: Int => B)(implicit ev: B <:< Int) but as def map(f: Int => Int)

But ClassTag could indeed be interesting. It would allow us to reuse a generic PolyTransformsWithEvidence for specialized collections.

I created #28 to keep track of this issue.

julienrf · 2017-02-06T13:44:54Z

src/main/scala/strawman/collection/Sorted.scala

+/**
+  * Factories for collections whose elements require an ordering
+  */
+trait OrderingGuidedFactories[C[_]] {


Analogous to IterableFactories, but with an implicit Ordering.

julienrf · 2017-02-06T13:48:06Z

src/main/scala/strawman/collection/immutable/SortedSet.scala

+  // so that they take precedence over the one inherited from IterablePolyTransforms
+  def map[B](f: A => B)(implicit ordering: scala.Ordering[B]): C[B]
+
+}


One problem is that Sorted and Set are independent so there is no disambiguation mechanism between their overloaded methods. That’s why I had to re-define the methods of Sorted (to make them take precedence over those of Set).

I also tried an approach based on intermediate LowPriority linear traits, without success:

trait SortedSetLikeLowPriority1[A, +C[X] <: SortedSet[X]] extends SetLike[A, Set] with SetOps[A, C[A]] with SetMonoTransforms[A, C[A]] trait SortedSetLikeLowPriority2[A, +C[X] <: SortedSet[X]] extends SortedSetLikeLowPriority1[A, C] with SortedLike[A, C] trait SortedSetLike[A, +C[X] <: SortedSet[X]] extends SortedSetLikeLowPriority2[A, C]

Couldn't this be resolved by making SortedPolyTransforms extend IterablePolyTransforms?

Probably, but this is not what we want: we really want SortedSet[A] to extend SortedPolyTransforms[A, SortedSet] and IterablePolyTransforms[A, Set] (note that we have Set in the latter case).

We could define SortedPolyTransforms as follows, as a workaround:

trait SortedPolyTransforms[A, +C[X] <: Sorted[X], +D[X] <: Iterable[X]] extends IterablePolyTransforms[A, D]

I wouldn't call it a workaround, I think that's how it ought to be defined. You also have SeqPolyTransforms <: IterablePolytransforms. Enforcing the correct linearization makes overrides work in the desired order.

I just pushed a commit that fixes the linearization order with a similar trick.

julienrf · 2017-02-06T13:49:48Z

src/test/scala/strawman/collection/test/Test.scala

+  def sortedSets(xs: strawman.collection.immutable.SortedSet[Int]): Unit = {
+    val xs1 = xs.map((x: Int) => x.toString)
+    val xs2: SortedSet[String] = xs1
+    val xs3: strawman.collection.immutable.Set[String] = xs.map((x: Int) => x.toString)


Note that the type annotation x: Int is required.

This looks wrong. Scala 2.12 should be able to perform overload resolution correctly without a type annotation.

Where does the problem come from?

@szeiger How could we investigate where does this issue come from?

Simplify and run through -Ytyper-debug?

Looks like a type inferencer bug. This is as far as I got: https://issues.scala-lang.org/browse/SI-10194

scala/scala#5708

julienrf · 2017-02-06T13:56:03Z

src/main/scala/strawman/collection/immutable/SortedSet.scala

+  extends SortedLike[A, C]
+    with SetLike[A, Set] // Inherited Set operations return a `Set`
+    with SetOps[A, C[A]] // Override the return type of Set ops to return C[A]
+    with SetMonoTransforms[A, C[A]] {


There is subtlety here: we inherit from SetLike[A, Set] so that methods like map return a Set rather than a SortedSet. However we don’t want + or ++ to return a Set but a SortedSet, hence the SetOpt[A, C[A]] and SetMonoTransforms[A, C[A]] mixins.

szeiger · 2017-02-08T17:04:05Z

src/main/scala/strawman/collection/Set.scala

+}
+
+/** Monomorphic transformation operations */
+trait SetMonoTransforms[A, +Repr] {


extends IterableMonoTransforms?

I added a commit with this change.

szeiger · 2017-02-08T17:21:41Z

src/main/scala/strawman/collection/Sorted.scala

+}
+
+/** Polymorphic transformation methods on sorted collections */
+trait SortedPolyTransforms[A, +C[X] <: Sorted[X]] {


It would be possible to abstract over different evidence types (e.g. Ordering[_] vs _ <:< Int) but at least the Int case could be done with a simpler API. Do we have any other similar cases?

szeiger · 2017-02-08T17:22:10Z

src/main/scala/strawman/collection/Sorted.scala

+/**
+  * Factories for collections whose elements require an ordering
+  */
+trait OrderingGuidedFactories[C[_]] {


szeiger · 2017-02-08T17:26:46Z

src/main/scala/strawman/collection/immutable/SortedSet.scala

+  // so that they take precedence over the one inherited from IterablePolyTransforms
+  def map[B](f: A => B)(implicit ordering: scala.Ordering[B]): C[B]
+
+}


Couldn't this be resolved by making SortedPolyTransforms extend IterablePolyTransforms?

szeiger · 2017-02-08T17:29:43Z

src/test/scala/strawman/collection/test/Test.scala

+  def sortedSets(xs: strawman.collection.immutable.SortedSet[Int]): Unit = {
+    val xs1 = xs.map((x: Int) => x.toString)
+    val xs2: SortedSet[String] = xs1
+    val xs3: strawman.collection.immutable.Set[String] = xs.map((x: Int) => x.toString)


This looks wrong. Scala 2.12 should be able to perform overload resolution correctly without a type annotation.

szeiger · 2017-02-08T17:31:57Z

src/main/scala/strawman/collection/Set.scala

+
+/** Base trait for set operations */
+trait SetLike[A, +C[X] <: Set[X]]
+  extends IterableLike[A, C]


Coming back to my argument for StableIterable from the other PR, we also inherit methods like take and drop from Iterable which only make sense with a stable ordering. Should we move them down to StableIterable? Or is this not important enough in practice and we should simply allow it for non-stable orderings, too (like in 2.12 and the current strawman design)?

I like this idea.

I created issue #30 to keep track of this idea.

julienrf · 2017-02-22T13:31:55Z

Rebased to master and added TreeMap.

julienrf · 2017-02-22T13:36:31Z

src/test/scala/strawman/collection/test/Test.scala

+
+  def mapOps(xs: Map[Int, String]): Unit = {
+    val xs1 = xs.map((k, v) => (v, k))
+    val xs2: strawman.collection.Map[String, Int] = xs1


This example demonstrates that calling map with a function that returns a tuple returns a Map.

julienrf · 2017-02-22T13:44:05Z

src/test/scala/strawman/collection/test/Test.scala

+    val xs1 = xs.map((k, v) => (v, k))
+    val xs2: strawman.collection.Map[String, Int] = xs1
+    val xs3 = xs.map(kv => (kv._2, kv._1))
+    val xs4: strawman.collection.Iterable[(String, Int)] = xs3


Note that here the distinction between the map overload that returns a Map and the one that returns an Iterable is based on the arity of the supplied function.

This is because map method of Map is defined as follows:

def map[K2, V2](f: (K, V) => (K2, V2)): C[K2, V2]

We could alternatively define it as follows:

def map[K2, V2](f: ((K, V)) => (K2, V2)): C[K2, V2]

Note that here, f takes one parameter, which is a tuple, where in the first snippet, it takes two parameters.

I’m not sure which solution is better… I think both can work and make sense.

With the binary function I like the ability to just write kvs.map((k, v) => …) without having to write { case (k, v) => … }. But note that dotty already makes it possible to write kvs.map((k, v) => …) even though f would have type ((K, V)) => (K2, V2). Maybe we could put this feature into scalac too…

Not sure I’m clear… Do you have any strong preference and why?

The binary function is potentially more efficient because it avoids tupling. Does the untupling in Dotty also work for overloaded functions? I thought it was done during type adaptation after overload resolution. If we can get scala/scala#5698 into Scala to make the case-syntax work here we will need a Function1 version but I still think we should have the Function2 version in addition.

Dotty can compile this PR as-is. It would also have no probblem with the Function1 version. In fact, it would let you transparently use binary closures such as "(k, v) => ..." as unary function arguments, expanding them to { case (k, v) => ... ". So as far as dotty is concerned the Function1 version is better since it can make the Function2 variant redundant.

julienrf · 2017-02-22T13:46:33Z

src/test/scala/strawman/collection/test/Test.scala

+    val xs3 = xs.map(kv => kv._1)
+    val xs4: immutable.Iterable[String] = xs3
+    val xs5 = xs.map((k: String, v: Int) => (v, k))
+    val xs6: immutable.SortedMap[Int, String] = xs5


Note that here we have to explicit the types of the parameters k and v. I think this type inference issue is already fixed by scala/scala#5708.

julienrf · 2017-02-22T13:50:25Z

src/test/scala/strawman/collection/test/Test.scala

+    class Foo
+//    val xs7 = xs.map((k: String, v: Int) => (new Foo, v)) Error: No implicit Ordering defined for Foo
+    val xs7 = (xs: immutable.Map[String, Int]).map((k, v) => (new Foo, v))
+    val xs8: immutable.Map[Foo, Int] = xs7


The commented line fails as expected because there is no implicit ordering for Foo. (Note that, alternatively, the overloading resolution could have chosen to use the map method of Map, which does not require an additional Ordering, but that’s not the case and I think that’s better like that)

However, if we upcast the OrderedMap to a simple Map, then the overload that is picked is the one that requires no Ordering and returns a Map.

I've been thinking about Rex's CanBuildFrom idea and there are a few things I'd like to try here. I think it may be possible to find a solution that is close to CanBuild without the implementation complexity of CanBuildFrom, that can provide a fallback to the unconstrained collection type when the evidence is not available.

In that specific case, though, I would expect a failure if the evidence is not available. When do you think you could push your experiment? For reference, I tried something here but it’s probably different from what you are thinking about: https://github.com/scala/collection-strawman/compare/poly-transforms#diff-e325090939d0c05b24284230016c77e7R189

This pattern works when classes don't have type parameters of their own:

trait Foo; trait Bar; trait TypeClass[C] {} class Super { def foo[A](f: () => A): Foo = ??? } class Sub extends Super { def foo[A: TypeClass](f: () => A): Bar = ??? }

If that pattern would work for collections it would be adequate to handle things like "this has to be a Tuple2 to stay a Map". Right now you get "missing parameter type" on (new Sub[String]).foo(_ + ".") if you give the classes type parameters. There's no real reason that has to be the case, though, AFAICT.

Anyway, that would be my proposal for the "less complex than CanBuildFrom", @szeiger. As long as the pattern works we can put in whatever, but I think a simple witness that the arguments are the correct type will be sufficient for most use cases (i.e. not replacing breakOut, but everything else).

The strawman LGTM so far. I propose let's go ahead and flesh it out. I did not quite get the simplified CBF idea. The example by @Ichoran looks good, but it seems to be exactly what is done now. After all, TreeSet's map

def map[B](f: (A) => B)(implicit ordering: Ordering[B]): TreeSet[B] = ???

is exactly the same as

def map[B: Ordering](f: (A) => B): TreeSet[B] = ???

which looks like the pattern that @Ichoran was proposing. Or am I missing something here?

Let's go ahead with this version. I'll start working in my idea now, but it's only a generalization of what we're already doing, so it doesn't have to hold up any work on this PR.

@odersky - The type inference doesn't currently work the way it probably ought to. It picks the most specific version and runs with it, then gives you an error if it doesn't work. So, in my toy example,

trait Foo; trait Bar; trait TypeClass[C] {} class Super[+A] { def foo[B](f: A => B): Foo = ??? } class Sub[+A] extends Super[A] { def foo[B: TypeClass](f: A => B): Bar = ??? }

you can't actually get a Foo when your type is Sub unless you ask for it:

scala> val s = new Sub[String] s: Sub[String] = Sub@22495dd1 scala> s.foo(_ => "") <console>:14: error: could not find implicit value for evidence parameter of type TypeClass[String] s.foo(_ => "") ^ scala> s.foo(_ => ""): Foo scala.NotImplementedError: an implementation is missing

Maybe this is what we want to avoid accidentally selecting the wrong types, but it doesn't feel like TreeMap is a Map if you can't throw whichever function you want at it.

@Ichoran Another solution consists in pre-upcasting to Super: (s: Super[String]).foo(_ => "").

In my opinion it is fine to get a failure if I want to map a TreeSet into something that has no Ordering instance, or if I try to map an Array into something that has no ClassTag available. I prefer to have to explicitly upcast my TreeSet into a Set.

When type signatures are complex, explicit upcasting is a large burden.

Another solution is to add an asSuper method that does nothing but the upcast. (E.g. for TreeMap it would be asMap.)

julienrf · 2017-02-22T15:55:06Z

This PR is ready for review :)

If you approve the design, I’ll start replacing the ??? by the actual implementations.

odersky · 2017-02-23T12:40:36Z

src/test/scala/strawman/collection/test/Test.scala

+    val xs1 = xs.map((k, v) => (v, k))
+    val xs2: strawman.collection.Map[String, Int] = xs1
+    val xs3 = xs.map(kv => (kv._2, kv._1))
+    val xs4: strawman.collection.Iterable[(String, Int)] = xs3


Dotty can compile this PR as-is. It would also have no probblem with the Function1 version. In fact, it would let you transparently use binary closures such as "(k, v) => ..." as unary function arguments, expanding them to { case (k, v) => ... ". So as far as dotty is concerned the Function1 version is better since it can make the Function2 variant redundant.

odersky · 2017-02-23T12:45:46Z

src/test/scala/strawman/collection/test/Test.scala

+    class Foo
+//    val xs7 = xs.map((k: String, v: Int) => (new Foo, v)) Error: No implicit Ordering defined for Foo
+    val xs7 = (xs: immutable.Map[String, Int]).map((k, v) => (new Foo, v))
+    val xs8: immutable.Map[Foo, Int] = xs7


The strawman LGTM so far. I propose let's go ahead and flesh it out. I did not quite get the simplified CBF idea. The example by @Ichoran looks good, but it seems to be exactly what is done now. After all, TreeSet's map

def map[B](f: (A) => B)(implicit ordering: Ordering[B]): TreeSet[B] = ???

is exactly the same as

def map[B: Ordering](f: (A) => B): TreeSet[B] = ???

which looks like the pattern that @Ichoran was proposing. Or am I missing something here?

Ichoran · 2017-02-23T16:26:08Z

I haven't formally reviewed it but it looks promising enough to me so that with Stefan's blessing I'm happy moving ahead. The type inference tweaks I'm suggesting wouldn't change the overall strategy much, just deliver a different user experience, so that's no reason to wait or change designs.

julienrf commented Feb 6, 2017

View reviewed changes

julienrf mentioned this pull request Feb 6, 2017

Sketch Set and SortedSet #23

Closed

julienrf commented Feb 6, 2017

View reviewed changes

szeiger reviewed Feb 8, 2017

View reviewed changes

julienrf mentioned this pull request Feb 13, 2017

Generalize SortedPolyTransforms into EvidenceGuidedPolyTransforms #28

Closed

julienrf force-pushed the sorted-sets-overloaded branch from f33c2a7 to dffb444 Compare February 22, 2017 13:22

julienrf changed the title ~~Overloading based implementation of Set and SortedSet~~ Overloading based implementation of Set, TreeSet, Map and TreeMap Feb 22, 2017

julienrf commented Feb 22, 2017

View reviewed changes

julienrf requested review from odersky and Ichoran February 22, 2017 13:50

julienrf mentioned this pull request Feb 22, 2017

Maps #26

Closed

julienrf added 8 commits February 22, 2017 16:35

Overloading based implementation of Set and SortedSet

915dda3

Fix linearization of IterablePolyTransforms

e1d565a

Make SetMonoTransforms extend IterableMonoTransforms

15f3b5c

Maps

7fd6d93

Add immutable Map base class

2ed74b8

Add TreeMap

bc57ed3

Reorganize imports

11db480

Naming consistency

1c137f5

julienrf force-pushed the sorted-sets-overloaded branch from 33779fc to 1c137f5 Compare February 22, 2017 15:42

odersky approved these changes Feb 23, 2017

View reviewed changes

julienrf merged commit ba8807e into master Feb 23, 2017

julienrf deleted the sorted-sets-overloaded branch February 23, 2017 16:27

julienrf mentioned this pull request Feb 23, 2017

Easy upcasting to leave out undesired overloads #35

Closed

julienrf mentioned this pull request May 1, 2017

Arity of map method of Map collection type #68

Closed

Overloading based implementation of Set, TreeSet, Map and TreeMap #24

Overloading based implementation of Set, TreeSet, Map and TreeMap #24

Conversation

julienrf commented Feb 6, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

julienrf commented Feb 22, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

szeiger Feb 22, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

julienrf commented Feb 22, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Ichoran commented Feb 23, 2017

julienrf commented Feb 6, 2017 •

edited

Loading

szeiger Feb 22, 2017 •

edited

Loading