Skip to content
This repository has been archived by the owner on Dec 22, 2021. It is now read-only.

Add NonEmpty collection type #260

Closed
wants to merge 2 commits into from
Closed

Conversation

julienrf
Copy link
Contributor

@julienrf julienrf commented Oct 3, 2017

Fixes #98.

The approach taken here is similar to @tarao’s approach here: https://github.com/tarao/nonempty-scala

I added a NonEmpty wrapper around a collection and make sure that the only way to build an instance of such a wrapper is to provide at least one element.

I provide an implicit conversion from NonEmpty to the wrapped instance, making it possible to use all the operations of the specific collection directly on the NonEmpty instance, as if it was effectively an instance of that specific collection.

Operations that preserve non-emptiness (e.g. map, ++) are defined directly on NonEmpty to return another NonEmpty instance.

Current limitations:

  • doesn’t work with String or Array,
  • doesn’t work with View (but that would be easily doable at the cost of slightly more cryptic method signatures).

@tpolecat
Copy link
Contributor

tpolecat commented Oct 3, 2017

Now that there's a way to represent a non-empty Iterable why not remove all the empty-unsafe methods like reduce, head, tail, min, etc., and define them on NonEmpty?

@Ichoran
Copy link
Contributor

Ichoran commented Oct 3, 2017

@tpolecat - Because that's a gigantic migration hassle for everyone who uses existing collections.

@tpolecat
Copy link
Contributor

tpolecat commented Oct 3, 2017

@Ichoran all such code is a bug anyway so I would think people would be grateful!

@julienrf
Copy link
Contributor Author

julienrf commented Oct 3, 2017

It's not that simple : IMHO reduce and max should have returned an Option from the beginning, head and tail are useful for performance reasons. But I do agree that for some methods it would be nice to use NonEmpty (eg groupBy).

Also, the NonEmpty thing provided by this PR is a wrapper around another collection. It's usage is not as easy or straightforward as actual collections. That's why I put it in the collections-contrib artifact only. A consequence of that is that we cannot currently use it in the core collections. We'll see if time tells us that we should move it to the core! (well, this PR must be merged first...)

@Ichoran
Copy link
Contributor

Ichoran commented Oct 3, 2017

@tpolecat - All such code might be a bug, but since Scala doesn't have the kind of context-dependent type refinement that e.g. Kotlin has, and even if it did it might not be hooked up to a logic engine capable of working through all cases, it's not necessarily a bug. For instance,

def firstNumber(xs: Seq[Int]) =
  if (xs.isEmpty) 0 else xs.head

has no business not compiling, but we have no mechanism to make it compile, and no alternatives that avoid syntactic and/or runtime overhead (the latter of which can be mitigated by extensive use of macros, but that brings its own problems).

A library that made safety an overriding priority would not do it this way, but the existing collections library is not such a library, and all changes need to be compatible with the existing library modulo small and/or automatic rewrites.

@tpolecat
Copy link
Contributor

tpolecat commented Oct 3, 2017

I would prefer that it not compile. This is exactly the kind of code that becomes problematic as it evolves: if you factor the else consequent into a method you now have an unchecked entry condition. It's very easy for blocks to get unmoored from their guards this way.

I understand your arguments, I just wish we could aim higher.

@sjrd
Copy link
Member

sjrd commented Oct 3, 2017

and all changes need to be compatible with the existing library modulo small and/or automatic rewrites.

IMO, even more important than this criteria is the following: there must be a way to write code that will cross-compile against 2.12 and 2.13. If that is not the case, the whole library ecosystem breaks down, because no one can afford to drop their existing 2.12 artifacts, so no one can afford to support 2.13.

Therefore, removing head from potentially empty collections is a no go, because if you do that I cannot write code that will cross-compile for 2.12 (where NonEmpty does not exist) and 2.13 (where head does not exist except on NonEmpty). And that's whether or not I am willing to write safe code.

@dwijnand
Copy link
Member

dwijnand commented Oct 3, 2017

there must be a way to write code that will cross-compile against 2.12 and 2.13

I was hoping that the tooling community would have developed some tools and examples of using Scalafix to fix code "in flight" while cross-compiling. That way we wouldn't need to have the exact same source code. This was was being explored in https://github.com/typelevel/catz-cradle, using a scalaz to cats or a cats to scalaz rewrites, but was abandoned.

@Ichoran
Copy link
Contributor

Ichoran commented Oct 3, 2017

@sjrd - That is already not the case; plenty of things have changed in minor source-incompatible ways.

For example, .to[] with a type argument has changed to .to() with a companion argument.

Barring some work with Scalafix along the lines of what @dwijnand was saying, cross-compilation will typically not be possible even as currently structured.

@ShaneDelmore
Copy link

The rewriting situation is easier these days that it was when those experiments were performed. At the time SBT/scalameta held us back but these days there are proof of concepts using sbt and cbt (for example here is an old one https://github.com/cvogt/cbt/pull/466/files). Seems everyone who has worked on it, including me, has fallen victim to getting a new job and running out of time to carry it over the finish line but don’t let that discourage anyone.

@sjrd
Copy link
Member

sjrd commented Oct 4, 2017

Even with a good story for on-the-fly rewriting, this is not viable for projects that sit near the roots of the dependency graph in the eco-system. For example, scalafix itself depends on Scala.js and ScalaTest (to name only two big dependencies). This means that neither ScalaTest nor Scala.js can rely on on-the-fly fixing by scalafix, lest we introduce cyclic dependencies (the ultimate nightmare). If those two projects don't cross-compile for 2.12 and 2.13, guess what? virtually none of the ecosystem cross-compiles.

I would probably not trust on-the-fly rewriting anyway. I wouldn't be able to see the diffs it applies on my codebase in code reviews, for example.

.to[] versus .to() is bad enough, but it might be viable, since it's quite possible to avoid using those methods altogether (provided the common ones like .toList are still there). git grep '\.to\[' says I only use that in tests. It is however not possible to live without head.

@dwijnand
Copy link
Member

dwijnand commented Oct 4, 2017

I don't think we'd need Scalafix on 2.13 to cross build Scala.js on 2.12 and 2.13. Much like we didn't need sbt on 2.12 (outside of the compiler-bridge) to build the 2.12 ecosystem. Scalafix would be just a build-level dependency.

And for reviewing purposes I can image it would be possible to get CI to show diff of the results of rewriting the before and after code...

Copy link
Member

@szeiger szeiger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the complexity of this design. The signatures of everyday methods are quite ugly and the runtime overhead is probably not negligible, either (but I haven't benchmarked it).

The automatic unwrapping of the underlying collection from a NonEmpty in combination with non-total operators on the underlying collections (which cannot realistically be removed or changed due to backwards compatibility concerns) blurs the lines between safe and unsafe operations (e.g. nel.head vs nel.tail.head).

@julienrf
Copy link
Contributor Author

@szeiger Thanks for the review!

I don't like the complexity of this design.

Do you have any suggestion for a simpler design?

The automatic unwrapping of the underlying collection from a NonEmpty […] blurs the lines between safe and unsafe operations

Yeah, that’s a good point. Should the conversion be explicit?

extends AnyVal {

def map[B, C2 <: Iterable[B]](f: A => B)(implicit bf: BuildFrom[C, B, C2]): NonEmpty[B, C2] =
new NonEmpty[B, C2](bf.fromSpecificIterable(coll)(coll.toIterable.map(f)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this not just defer directly to the map method on C?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Same comment about "why not directly" for everything else.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed we could omit the .toIterable call. That’s a left over of a previous iteration…

Copy link
Contributor Author

@julienrf julienrf Oct 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bf.fromSpecificIterable call is needed because it allows us to compute the resulting collection’s type C2 (otherwise we would get an Iterable[B]).

new NonEmpty[B, C2](bf.fromSpecificIterable(coll)(coll.toIterable.map(f)))

def flatMap[B, C2 <: Iterable[B]](f: A => IterableOnce[B])(implicit bf: BuildFrom[C, B, C2]): NonEmpty[B, C2] =
new NonEmpty[B, C2](bf.fromSpecificIterable(coll)(coll.toIterable.flatMap(f)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unsafe. flatMap might return no elements.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, indeed!

new NonEmpty[(A0, Int), C2](bf.fromSpecificIterable(coll)(coll.zipWithIndex))

def prepended[B >: A, C2 <: Seq[B]](elem: B)(implicit bf: BuildFrom[C, B, C2], ev: C <:< Seq[B]): NonEmpty[B, C2] =
new NonEmpty[B, C2](bf.fromSpecificIterable(coll)(elem +: coll))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is too safe. Prepending to an existing collection is a way to get a NonEmpty.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean?

@`inline` def +: [B >: A, C2 <: Seq[B]](elem: B)(implicit bf: BuildFrom[C, B, C2], ev: C <:< Seq[B]): NonEmpty[B, C2] =
prepended(elem)

@`inline` def :: [B >: A, C2 <: List[B]](elem: B)(implicit bf: BuildFrom[C, B, C2], ev: C <:< List[B]): NonEmpty[B, C2] =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method shouldn't be here. It's List only.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hence the implicit ev: C <:< List[B] parameter. The goal is to be able to have one NonEmpty class that can wrap any other collection type.

extends AnyVal {

def map[B, C2 <: Iterable[B]](f: A => B)(implicit bf: BuildFrom[C, B, C2]): NonEmpty[B, C2] =
new NonEmpty[B, C2](bf.fromSpecificIterable(coll)(coll.toIterable.map(f)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Same comment about "why not directly" for everything else.)

implicit def toCollection[A, C <: Iterable[A]](nonEmpty: NonEmpty[A, C]): C =
nonEmpty.coll

def fromIterable[A, C <: Iterable[A]](it: C): Option[NonEmpty[A, C]] =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fromIterable seems like more typing than is warranted.

* @tparam A Type of the elements
* @tparam C Type of the wrapped collection
*/
def apply[A, C <: Iterable[A]](elem: A, coll: C)(implicit bf: BuildFrom[C, A, C]): NonEmpty[A, C] =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That this would be the apply method is unexpected. I would think NonEmpty("herring", "cod", "salmon") would be the most natural usage. NonEmpty.prepended(x, xs) or somesuch would be better for this method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense.

@Ichoran
Copy link
Contributor

Ichoran commented Oct 12, 2017

I have serious misgivings about this approach. The forwarding to the underlying collection seems very heavyweight, and having to maintain every reasonable method manually seems like a pain. Furthermore, it can't detect when you take an existing collection and do something to it that would make it nonempty.

I'm not sure I have a substantially better suggestion (without understanding why you chose the form you did for the forwarding), but I'm not terribly enthusiastic about us spending more time on this without a really clear use case from somewhere explaining how much this implementation solves some problem.

Copy link
Contributor

@odersky odersky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it was good we did this experiment, but in conclusion I see it as a complete validation of my earlier recommendation not to do this. It's not worth the complexity it introduces. The hallmark of good design is to know where your limits are. This one is beyond it.

Copy link
Contributor

@odersky odersky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it was good we did this experiment, but in conclusion I see it as a complete validation of my earlier recommendation not to do this. It's not worth the complexity it introduces. The hallmark of good design is to know where your limits are. This one is beyond it.

Removed `flatMap` operations.
Renamed NonEmpty.apply to NonEmpty.cons.
Introduced a new NonEmpty.apply factory method similar to other
IterableFactory’s apply method.
@julienrf
Copy link
Contributor Author

julienrf commented Oct 16, 2017

Thanks for the reviews!

I’ve added a commit fixing the issues you reported (I removed flatMap and renamed the factory method to cons).

I’ve added a better factory method that makes it easier to build a NonEmpty collection of any type, as you can see in the tests.

There is still an important issue with equality: currently NonEmpty is defined as a value class and a consequence of that is that it is only comparable with other NonEmpty instances. I would love to be able to transparently compare a non empty list and a list but that seems a bit hard to achieve. If we make NonEmpty a regular class, then we could override the equals and hashCode to just forward to the implementation of the wrapped collection, but that would only work when the non empty collection is on the left hand side of a comparison…

@jvican jvican removed their request for review November 15, 2017 09:09
@julienrf
Copy link
Contributor Author

I’m closing this one because of the problem with equality.

@julienrf julienrf closed this Feb 12, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants