Add NonEmpty collection type #260

julienrf · 2017-10-03T10:33:48Z

Fixes #98.

The approach taken here is similar to @tarao’s approach here: https://github.com/tarao/nonempty-scala

I added a NonEmpty wrapper around a collection and made sure that the only way to build an instance of such a wrapper is to provide at least one element.

I provide an implicit conversion from NonEmpty to the wrapped instance, making it possible to use all the operations of the specific collection directly on the NonEmpty instance, as if it was effectively an instance of that specific collection.

Operations that preserve non-emptiness (e.g. map, ++) are defined directly on NonEmpty to return another NonEmpty instance.

Current limitations:

doesn’t work with String or Array,
doesn’t work with View (but that would be easily doable at the cost of slightly more cryptic method signatures).
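
The design described above can be sketched roughly as follows. This is a simplified illustration, not the PR's code: it fixes the wrapped collection to List (the PR's NonEmpty[A, C] is generic in the collection type C), it is a regular class rather than a value class, and the enclosing object name NonEmptySketch is made up for the example.

```scala
import scala.language.implicitConversions

object NonEmptySketch {
  // Simplified assumption: wraps a List only. The constructor is private,
  // so the factory below is the only way to obtain an instance.
  final class NonEmpty[A] private (val coll: List[A]) {
    // map produces one output per input, so the result is still non-empty.
    def map[B](f: A => B): NonEmpty[B] = new NonEmpty[B](coll.map(f))

    // Appending to a non-empty list cannot make it empty.
    def ++(that: Iterable[A]): NonEmpty[A] = new NonEmpty[A](coll ++ that)
  }

  object NonEmpty {
    // The caller must supply at least one element.
    def apply[A](head: A, tail: A*): NonEmpty[A] =
      new NonEmpty[A](head :: tail.toList)

    // Implicit unwrapping: every List operation becomes available on NonEmpty.
    implicit def toCollection[A](nonEmpty: NonEmpty[A]): List[A] = nonEmpty.coll
  }
}
```

With this sketch, NonEmpty(1, 2, 3).map(_ * 2) stays a NonEmpty, while NonEmpty(1, 2, 3).filter(_ % 2 == 0) goes through the implicit conversion and returns a plain List.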

tpolecat · 2017-10-03T17:12:28Z

Now that there's a way to represent a non-empty Iterable why not remove all the empty-unsafe methods like reduce, head, tail, min, etc., and define them on NonEmpty?

Ichoran · 2017-10-03T18:35:45Z

@tpolecat - Because that's a gigantic migration hassle for everyone who uses existing collections.

tpolecat · 2017-10-03T18:59:25Z

@Ichoran all such code is a bug anyway so I would think people would be grateful!

julienrf · 2017-10-03T19:02:44Z

It's not that simple: IMHO reduce and max should have returned an Option from the beginning, while head and tail are useful for performance reasons. But I do agree that for some methods it would be nice to use NonEmpty (e.g. groupBy).

Also, the NonEmpty thing provided by this PR is a wrapper around another collection. Its usage is not as easy or straightforward as actual collections. That's why I put it in the collections-contrib artifact only. A consequence of that is that we cannot currently use it in the core collections. We'll see if time tells us that we should move it to the core! (well, this PR must be merged first...)

Ichoran · 2017-10-03T19:57:46Z

@tpolecat - All such code might be a bug, but since Scala doesn't have the kind of context-dependent type refinement that e.g. Kotlin has, and even if it did it might not be hooked up to a logic engine capable of working through all cases, it's not necessarily a bug. For instance,

def firstNumber(xs: Seq[Int]) =
  if (xs.isEmpty) 0 else xs.head

has no business not compiling, but we have no mechanism to make it compile, and no alternatives that avoid syntactic and/or runtime overhead (the latter of which can be mitigated by extensive use of macros, but that brings its own problems).

A library that made safety an overriding priority would not do it this way, but the existing collections library is not such a library, and all changes need to be compatible with the existing library modulo small and/or automatic rewrites.
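
For comparison, here is a sketch of the guarded version next to two total alternatives that exist in the standard library today. The object and the two extra method names are made up for the example. The headOption variant will typically allocate an Option, and the pattern match is more verbose, which is the kind of runtime or syntactic overhead mentioned above.

```scala
object FirstNumberSketch {
  // The guarded version from the comment above: safe only because of the isEmpty check.
  def firstNumber(xs: Seq[Int]): Int =
    if (xs.isEmpty) 0 else xs.head

  // Total alternative using headOption; wraps the head in an Option first.
  def firstNumberOpt(xs: Seq[Int]): Int =
    xs.headOption.getOrElse(0)

  // Total alternative using the +: extractor on Seq.
  def firstNumberMatch(xs: Seq[Int]): Int = xs match {
    case x +: _ => x
    case _      => 0
  }
}
```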

tpolecat · 2017-10-03T20:09:13Z

I would prefer that it not compile. This is exactly the kind of code that becomes problematic as it evolves: if you factor the else consequent into a method you now have an unchecked entry condition. It's very easy for blocks to get unmoored from their guards this way.

I understand your arguments, I just wish we could aim higher.
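
The refactoring hazard described in this comment can be illustrated with a small sketch. The names UnmooredGuard, describeFirst and summary are hypothetical. Once the else branch is extracted into its own method, nothing in its signature records that the argument must be non-empty.

```scala
object UnmooredGuard {
  // Extracted from the else branch below. The non-emptiness requirement
  // is now an unchecked entry condition.
  def describeFirst(xs: Seq[Int]): String = s"first = ${xs.head}"

  // The original call site still guards the call.
  def summary(xs: Seq[Int]): String =
    if (xs.isEmpty) "empty" else describeFirst(xs)
}
```

Calling UnmooredGuard.describeFirst(Nil) from any other call site compiles without complaint and fails at runtime with a NoSuchElementException.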

sjrd · 2017-10-03T20:17:44Z

> and all changes need to be compatible with the existing library modulo small and/or automatic rewrites.

IMO, even more important than this criterion is the following: there must be a way to write code that will cross-compile against 2.12 and 2.13. If that is not the case, the whole library ecosystem breaks down, because no one can afford to drop their existing 2.12 artifacts, so no one can afford to support 2.13.

Therefore, removing head from potentially empty collections is a no go, because if you do that I cannot write code that will cross-compile for 2.12 (where NonEmpty does not exist) and 2.13 (where head does not exist except on NonEmpty). And that's whether or not I am willing to write safe code.

dwijnand · 2017-10-03T20:24:37Z

> there must be a way to write code that will cross-compile against 2.12 and 2.13

I was hoping that the tooling community would have developed some tools and examples of using Scalafix to fix code "in flight" while cross-compiling. That way we wouldn't need to have the exact same source code. This was being explored in https://github.com/typelevel/catz-cradle, using scalaz-to-cats or cats-to-scalaz rewrites, but was abandoned.

Ichoran · 2017-10-03T22:33:42Z

@sjrd - That is already not the case; plenty of things have changed in minor source-incompatible ways.

For example, .to[] with a type argument has changed to .to() with a companion argument.
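
A minimal sketch of the difference, assuming a compiler using the 2.13-style collections (the 2.12 form is shown only as a comment because it does not compile there). The object name is made up for the example.

```scala
object ToConversionSketch {
  val xs: Vector[Int] = Vector(1, 2, 3)

  // Scala 2.12 form, with a type argument:
  //   val ys: List[Int] = xs.to[List]

  // 2.13-style form, with the companion object passed as a value argument:
  val ys: List[Int] = xs.to(List)

  // Dedicated conversions such as toList are written the same way in both versions.
  val zs: List[Int] = xs.toList
}
```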

Barring some work with Scalafix along the lines of what @dwijnand was saying, cross-compilation will typically not be possible even as currently structured.

ShaneDelmore · 2017-10-04T04:16:02Z

The rewriting situation is easier these days than it was when those experiments were performed. At the time SBT/scalameta held us back, but these days there are proofs of concept using sbt and cbt (for example, here is an old one: https://github.com/cvogt/cbt/pull/466/files). It seems everyone who has worked on it, including me, has fallen victim to getting a new job and running out of time to carry it over the finish line, but don’t let that discourage anyone.

sjrd · 2017-10-04T08:11:51Z

Even with a good story for on-the-fly rewriting, this is not viable for projects that sit near the roots of the dependency graph in the ecosystem. For example, scalafix itself depends on Scala.js and ScalaTest (to name only two big dependencies). This means that neither ScalaTest nor Scala.js can rely on on-the-fly fixing by scalafix, lest we introduce cyclic dependencies (the ultimate nightmare). If those two projects don't cross-compile for 2.12 and 2.13, guess what? Virtually none of the ecosystem cross-compiles.

I would probably not trust on-the-fly rewriting anyway. I wouldn't be able to see the diffs it applies on my codebase in code reviews, for example.

.to[] versus .to() is bad enough, but it might be viable, since it's quite possible to avoid using those methods altogether (provided the common ones like .toList are still there). git grep '\.to\[' says I only use that in tests. It is however not possible to live without head.

dwijnand · 2017-10-04T11:20:36Z

I don't think we'd need Scalafix on 2.13 to cross build Scala.js on 2.12 and 2.13. Much like we didn't need sbt on 2.12 (outside of the compiler-bridge) to build the 2.12 ecosystem. Scalafix would be just a build-level dependency.

And for reviewing purposes I can imagine it would be possible to get CI to show a diff between the code before and after rewriting...

szeiger

I don't like the complexity of this design. The signatures of everyday methods are quite ugly and the runtime overhead is probably not negligible, either (but I haven't benchmarked it).

The automatic unwrapping of the underlying collection from a NonEmpty in combination with non-total operators on the underlying collections (which cannot realistically be removed or changed due to backwards compatibility concerns) blurs the lines between safe and unsafe operations (e.g. nel.head vs nel.tail.head).
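
The blurring can be sketched with a stripped-down wrapper. This is an illustration only: it is restricted to List, and the object name BlurredSafety is hypothetical. Both calls below look alike at the call site and both go through the implicit conversion, but only the first is guaranteed to succeed.

```scala
import scala.language.implicitConversions

object BlurredSafety {
  // Minimal List-only stand-in for the PR's wrapper.
  final class NonEmpty[A] private (val coll: List[A])

  object NonEmpty {
    def apply[A](head: A, tail: A*): NonEmpty[A] =
      new NonEmpty[A](head :: tail.toList)

    implicit def toCollection[A](nonEmpty: NonEmpty[A]): List[A] = nonEmpty.coll
  }

  val nel: NonEmpty[Int] = NonEmpty(42)

  // Safe: a NonEmpty always has a first element.
  val first: Int = nel.head

  // Compiles just as easily, but tail is a plain (here empty) List,
  // so evaluating this throws NoSuchElementException.
  def second: Int = nel.tail.head
}
```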

julienrf · 2017-10-11T14:33:45Z

@szeiger Thanks for the review!

> I don't like the complexity of this design.

Do you have any suggestion for a simpler design?

> The automatic unwrapping of the underlying collection from a NonEmpty […] blurs the lines between safe and unsafe operations

Yeah, that’s a good point. Should the conversion be explicit?

Ichoran · 2017-10-12T03:17:45Z

collections-contrib/src/main/scala/strawman/collection/NonEmpty.scala

+  extends AnyVal {
+
+  def map[B, C2 <: Iterable[B]](f: A => B)(implicit bf: BuildFrom[C, B, C2]): NonEmpty[B, C2] =
+    new NonEmpty[B, C2](bf.fromSpecificIterable(coll)(coll.toIterable.map(f)))


Why does this not just defer directly to the map method on C?

(Same comment about "why not directly" for everything else.)

Indeed we could omit the .toIterable call. That’s a leftover from a previous iteration…

The bf.fromSpecificIterable call is needed because it allows us to compute the resulting collection’s type C2 (otherwise we would get an Iterable[B]).

Ichoran · 2017-10-12T03:18:09Z

collections-contrib/src/main/scala/strawman/collection/NonEmpty.scala

+    new NonEmpty[B, C2](bf.fromSpecificIterable(coll)(coll.toIterable.map(f)))
+
+  def flatMap[B, C2 <: Iterable[B]](f: A => IterableOnce[B])(implicit bf: BuildFrom[C, B, C2]): NonEmpty[B, C2] =
+    new NonEmpty[B, C2](bf.fromSpecificIterable(coll)(coll.toIterable.flatMap(f)))


This is unsafe. flatMap might return no elements.

Oops, indeed!
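
A short illustration of why flatMap cannot preserve non-emptiness, using a plain List (the object name is made up for the example):

```scala
object FlatMapCanEmpty {
  // A non-empty input.
  val input: List[Int] = List(1, 2, 3)

  // Every element maps to an empty list, so the flattened result is empty.
  val flattened: List[Int] = input.flatMap(_ => List.empty[Int])

  // map, by contrast, yields exactly one output element per input element.
  val mapped: List[Int] = input.map(_ + 1)
}
```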

Ichoran · 2017-10-12T03:19:34Z

collections-contrib/src/main/scala/strawman/collection/NonEmpty.scala

+    new NonEmpty[(A0, Int), C2](bf.fromSpecificIterable(coll)(coll.zipWithIndex))
+
+  def prepended[B >: A, C2 <: Seq[B]](elem: B)(implicit bf: BuildFrom[C, B, C2], ev: C <:< Seq[B]): NonEmpty[B, C2] =
+    new NonEmpty[B, C2](bf.fromSpecificIterable(coll)(elem +: coll))


This is too safe. Prepending to an existing collection is a way to get a NonEmpty.

What do you mean?

Ichoran · 2017-10-12T03:20:08Z

collections-contrib/src/main/scala/strawman/collection/NonEmpty.scala

+  @`inline` def +: [B >: A, C2 <: Seq[B]](elem: B)(implicit bf: BuildFrom[C, B, C2], ev: C <:< Seq[B]): NonEmpty[B, C2] =
+    prepended(elem)
+
+  @`inline` def :: [B >: A, C2 <: List[B]](elem: B)(implicit bf: BuildFrom[C, B, C2], ev: C <:< List[B]): NonEmpty[B, C2] =


This method shouldn't be here. It's List only.

Hence the implicit ev: C <:< List[B] parameter. The goal is to be able to have one NonEmpty class that can wrap any other collection type.

Ichoran · 2017-10-12T03:20:24Z

collections-contrib/src/main/scala/strawman/collection/NonEmpty.scala

+  extends AnyVal {
+
+  def map[B, C2 <: Iterable[B]](f: A => B)(implicit bf: BuildFrom[C, B, C2]): NonEmpty[B, C2] =
+    new NonEmpty[B, C2](bf.fromSpecificIterable(coll)(coll.toIterable.map(f)))


(Same comment about "why not directly" for everything else.)

Ichoran · 2017-10-12T03:22:36Z

collections-contrib/src/main/scala/strawman/collection/NonEmpty.scala

+  implicit def toCollection[A, C <: Iterable[A]](nonEmpty: NonEmpty[A, C]): C =
+    nonEmpty.coll
+
+  def fromIterable[A, C <: Iterable[A]](it: C): Option[NonEmpty[A, C]] =


fromIterable seems like more typing than is warranted.

Ichoran · 2017-10-12T03:26:06Z

collections-contrib/src/main/scala/strawman/collection/NonEmpty.scala

+    * @tparam A Type of the elements
+    * @tparam C Type of the wrapped collection
+    */
+  def apply[A, C <: Iterable[A]](elem: A, coll: C)(implicit bf: BuildFrom[C, A, C]): NonEmpty[A, C] =


That this would be the apply method is unexpected. I would think NonEmpty("herring", "cod", "salmon") would be the most natural usage. NonEmpty.prepended(x, xs) or somesuch would be better for this method.

That makes sense.
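
The two factory shapes discussed here could look roughly like this. It is a List-only sketch with a hypothetical enclosing object. The name cons is the one the follow-up commit reports using for the element-plus-collection factory, but the signatures below are simplified assumptions, not the PR's.

```scala
object FactorySketch {
  final class NonEmpty[A] private (val coll: List[A])

  object NonEmpty {
    // Varargs factory, giving the usage NonEmpty("herring", "cod", "salmon").
    def apply[A](head: A, tail: A*): NonEmpty[A] =
      new NonEmpty[A](head :: tail.toList)

    // Builds a NonEmpty from one element and an existing, possibly empty, collection.
    def cons[A](elem: A, coll: List[A]): NonEmpty[A] =
      new NonEmpty[A](elem :: coll)
  }
}
```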

Ichoran · 2017-10-12T03:35:13Z

I have serious misgivings about this approach. The forwarding to the underlying collection seems very heavyweight, and having to maintain every reasonable method manually seems like a pain. Furthermore, it can't detect when you take an existing collection and do something to it that would make it nonempty.

I'm not sure I have a substantially better suggestion (without understanding why you chose the form you did for the forwarding), but I'm not terribly enthusiastic about us spending more time on this without a really clear use case from somewhere explaining how much this implementation solves some problem.

odersky

I think it was good we did this experiment, but in conclusion I see it as a complete validation of my earlier recommendation not to do this. It's not worth the complexity it introduces. The hallmark of good design is to know where your limits are. This one is beyond it.

julienrf · 2017-10-16T13:09:15Z

Thanks for the reviews!

I’ve added a commit fixing the issues you reported (I removed flatMap and renamed the factory method to cons).

I’ve added a better factory method that makes it easier to build a NonEmpty collection of any type, as you can see in the tests.

There is still an important issue with equality: currently NonEmpty is defined as a value class, and a consequence of that is that it is only comparable with other NonEmpty instances. I would love to be able to transparently compare a non-empty list and a list, but that seems a bit hard to achieve. If we make NonEmpty a regular class, then we could override equals and hashCode to just forward to the implementation of the wrapped collection, but that would only work when the non-empty collection is on the left-hand side of a comparison…
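
The asymmetry can be shown with a sketch of the regular-class variant. This is an assumption-laden illustration: it is List-only, has a public constructor for brevity, and uses a made-up object name. equals is called directly so the sketch does not depend on how a given compiler version type-checks == between unrelated types.

```scala
object EqualitySketch {
  // Regular class (not a value class), so equals and hashCode can be overridden.
  final class NonEmpty[A](val coll: List[A]) {
    override def equals(other: Any): Boolean = other match {
      case that: NonEmpty[_] => coll.equals(that.coll)
      case _                 => coll.equals(other) // forward to the wrapped list
    }
    override def hashCode: Int = coll.hashCode
  }

  val list: List[Int] = List(1, 2, 3)
  val nonEmpty: NonEmpty[Int] = new NonEmpty(list)

  // Wrapper on the left: our equals forwards to List's equals.
  val wrapperOnLeft: Boolean = nonEmpty.equals(list)

  // List on the left: List's own equals does not know about NonEmpty.
  val listOnLeft: Boolean = list.equals(nonEmpty)
}
```

Here wrapperOnLeft is true while listOnLeft is false, so equality is no longer symmetric.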

julienrf · 2018-02-12T16:11:20Z

I’m closing this one because of the problem with equality.

julienrf requested review from szeiger, odersky, Ichoran and jvican October 3, 2017 10:34

julienrf force-pushed the non-empty branch from a2a9e03 to 7b393c9 Compare October 3, 2017 12:51

Add NonEmpty collection type

1d1ba37

julienrf force-pushed the non-empty branch from 7b393c9 to 1d1ba37 Compare October 5, 2017 09:15

szeiger reviewed Oct 11, 2017

Ichoran reviewed Oct 12, 2017

odersky reviewed Oct 13, 2017

Addressed review comments.

5f46386

Removed `flatMap` operations. Renamed NonEmpty.apply to NonEmpty.cons. Introduced a new NonEmpty.apply factory method similar to other IterableFactory’s apply method.

jvican removed their request for review November 15, 2017 09:09

julienrf closed this Feb 12, 2018
