-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding NonEmptyVector #1137
Adding NonEmptyVector #1137
Conversation
@zainab-ali thanks for working on this! It looks like there are already merge conflicts, so you'll want to pull in the latest changes from master. Also please note the second paragraph of the description for #1088 -- I think we'll want to change this implementation to not use the "cons" approach for |
@@ -11,11 +10,6 @@ package object data { | |||
def NonEmptyList[A](head: A, tail: A*): NonEmptyList[A] = | |||
OneAnd[List, A](head, tail.toList) | |||
|
|||
def NonEmptyVector[A](head: A, tail: Vector[A] = Vector.empty): NonEmptyVector[A] = | |||
OneAnd(head, tail) | |||
def NonEmptyVector[A](head: A, tail: A*): NonEmptyVector[A] = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, this API (head : A, tail: A*)
would be handy.
*/ | ||
def show(implicit A: Show[A]): String = | ||
s"NonEmptyVector(${A.show(head)}, ${Show[Vector[A]].show(tail)})" | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this show correct?
Or should I have s"NonEmptyVector(${Show[Vector[A]].show(vector)})"
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's better than the cons
like show.
Current coverage is 89.09%@@ master #1137 diff @@
==========================================
Files 232 234 +2
Lines 3073 3137 +64
Methods 3021 3082 +61
Messages 0 0
Branches 49 52 +3
==========================================
+ Hits 2729 2795 +66
+ Misses 344 342 -2
Partials 0 0
|
|
||
object NonEmptyVector extends NonEmptyVectorInstances { | ||
|
||
def apply[A](head: A, tail: Vector[A]): NonEmptyVector[A] = NonEmptyVector(head +: tail) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to support this as:
def apply[A](body: Vector[A], last: A): NonEmptyVector[A] = NonEmptyVector(body :+ last)
In general I don't like treating non-empty vector as a head/tail pair because it's so inefficient -- prepending to a vector is O(n) and reallocates the entire vector. Appending is a lot more efficient, so in this case I think it makes more sense.
Honestly I'd probably prefer an interface like:
def apply[A](xs: Vector[A]): Option[NonEmptyVector[A]] = ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I probably missed something. I have been holding the belief that according to
http://docs.scala-lang.org/overviews/collections/performance-characteristics.html
Prepending to a vector is as efficient as appending? I thought it makes sense based on my (not very thorough) understanding of the implementation of vector. Is there some caveat in this claim?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might propose the following instantiation methods:
// to be consistent with var-arg `apply` methods for other collections
def apply[A](head: A, tail: A*): NonEmptyVector[A]
def fromVector[A](vector: Vector[A]): Option[NonEmptyVector[A]]
// this will throw an exception if the vector is empty
def fromVectorUnsafe[A](vector: Vector[A]): NonEmptyVector[A]
def prepend[A](head: A, tail: Vector[A]): NonEmptyVector[A]
def append[A](vector: Vector[A], last: A): NonEmptyVector[A]
I think that would cover all of the use-cases. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right @kailuowang. I forgot that vectors maintain a start index to allow this kind of thing to be effectively constant time. I was wrong.
I think maybe I got confused because for awhile Vector(x) ++ xs
was very slow. But this was just a bug, not related to x +: xs
.
@zainab-ali thanks, this is getting really close I think! |
*/ | ||
def flatMap[B](f: A => NonEmptyVector[B]): NonEmptyVector[B] = | ||
NonEmptyVector(vector.flatMap(a => f(a).vector)) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about reduce
? Since, we know that this is non-empty the reduce(fn: (A, A) => A): A
is a safe signature. That or we could have: def combineAll(implicit: sg: Semigroup[T]): T
(or we could have both).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reduce
and several other methods are available via the Reducible
instance, but I agree that it would probably be nice to have some of these helpers on the class itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think they should be called reduceLeft(f: (A, A) => A): A
and reduce(implicit S: Semigroup[A]): A
to be consistent with cats.Reducible
. AFAK, the standard library has similar names.
I've been mulling over the naming conventions for safe / unsafe methods. I agree with the following conventions:
So when a function is safe, but is analogous to an unsafe standard library function, what should we do? Since cats is safe by default, we could keep the names as is, but as @johnynek points out, is causes confusion for users of the standard library, when functions they expect to be unsafe start returning options. In the above comments, we've decided rename the functions instead: TBH, I'm uneasy about this, because I find variation in method names more confusing than variation in signatures. However, since I can't think of a better solution, I'll go with it. We should use the same convention for the |
* Left-associative reduce using the Semigroup of A | ||
*/ | ||
def reduce(implicit S: Semigroup[A]): A = | ||
reduceLeft(S.combine) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
semigroup has combineOption
which can possibly be faster than using combine
.
|
||
/** Gets the element at the index, if it exists */ | ||
def get(i: Int): Option[A] = | ||
Try(getUnsafe(i)).toOption |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that toVector.lift(i)
would probably be a more efficient implementation (though this is performance speculation without a benchmark).
|
||
/** Updates the element at the index, if it exists */ | ||
def updated(i: Int, a: A): Option[NonEmptyVector[A]] = | ||
Try(updatedUnsafe(i, a)).toOption |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More performance speculation: if (toVector.isDefinedAt(i)) Some(toVector.updated(i, a)) else None
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops I guess that would need to wrap it in a NonEmptyVector
too.
👍 thanks @zainab-ali! I've left some minor comments about possibly more performant implementations of a few things, but I'm happy to let those happen in a followup PR. |
* Left-associative reduce using the Semigroup of A | ||
*/ | ||
def reduce(implicit S: Semigroup[A]): A = | ||
S.combineAllOption(tail).foldLeft(head)(S.combine) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
personally I would do: S.combineAllOption(toVector).get
since we know the get can't throw since toVector.nonEmpty
.
👍 |
* universal .toString method. | ||
*/ | ||
def show(implicit A: Show[A]): String = | ||
s"NonEmptyVector(${Show[Vector[A]].show(toVector)})" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this will give us something like NonEmptyVector(Vector(1, 2, 3))
for NonEmptyVector(1, 2, 3)
.
Maybe we could do "NonEmpty" + Show[Vector[A]].show(toVector)
which should give us NonEmptyVector(1,2,3)
?
This looks great @zainab-ali, I left some minor comments. Another thing which might be useful is an |
def concat(other: NonEmptyVector[A]): NonEmptyVector[A] = NonEmptyVector(toVector ++ other.toVector) | ||
|
||
/** | ||
* Alias for concat |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice to make this Alias for [[concat]]
so that the scaladoc has a link.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see that we are using overloading, so I don't know how that works with scaladoc. I'm generally skeptical of overloading, though it's probably fine in this instance. I think it would prevent us from being able to turn NonEmptyVector
into a value class though, wouldn't it?
It looks we have have multiple thumbs-up. I'm going to give mine too 👍 and merge this. There is probably some follow-up work to do, but let's open a new PR for that. |
This follows up on several small suggestions from comments in typelevel#1137. The only meaningful modification is changing the format of `show` from something like `NonEmptyVector(Vector(1, 2, 3))` to something like `NonEmptyVector(1, 2, 3)`. I've also added a `length` method to `NonEmptyVector`.
Adding
NonEmptyVector
as discussed in #1088