-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Iterator.unfold and IterableFactory.unfold #6851
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Algorithm assumes immutability, but it shouldn't.
@@ -1,6 +1,7 @@ | |||
package scala.collection | |||
|
|||
import java.io.{ObjectInputStream, ObjectOutputStream} | |||
import java.util.Objects |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method you're using is so trivial that I don't think it's worth adding this dependency.
|
||
override def hasNext: Boolean = { | ||
if (nextResult eq null) { | ||
nextResult = Objects.requireNonNull(f(state), "null during unfold") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic isn't safe if f
is side-effecting--for instance, if it generates a random sequence that terminates. You need to have a separate var that remembers when you've hit the last item or somesuch. Personally I'd encode it as a uninitialized
bool, and just put (A, S)
in its own var with it being null
as the indication that the stream has stopped (and then pack init
into (null.asInstanceOf[A], init)
to get going).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or better yet use two fields, one for S
and one for A
, and either bools or integer state to keep track of what's going on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand where it's unsafe. None
indicates that the stream has stopped
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh. I saw that. And I thought I saw a scenario where it wouldn't work, but I'm unable to recreate it. I suspect I got confused somehow.
|
||
override def hasNext: Boolean = { | ||
if (nextResult eq null) { | ||
nextResult = Objects.requireNonNull(f(state), "null during unfold") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh. I saw that. And I thought I saw a scenario where it wouldn't work, but I'm unable to recreate it. I suspect I got confused somehow.
* @tparam S Type of the internal state | ||
*/ | ||
def unfold[A, S](init: S)(f: S => Option[(A, S)]): Iterable[A] = { | ||
val initialState = init // unfortunately, `Iterable` has an `init` method |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This aliasing bugs me, but I'm honestly not sure what the best course of action is.
init
is a good, short parameter name, and I'm not a huge fan of a longer name like initialState
. Ideally, all of the unfold
methods ought to have the same parameter names, so changing this one means changing LazyList.unfold
and Iterator.unfold
as well.
Perhaps lengthen it slightly to initial
?
Let the bikeshedding commence!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if you @inline def initialState = init
?
I tested it, and the The local val is fine as a rename, but having a non-anonymous class generates slightly cleaner bytecode to begin with. (In particular, it avoids an extra
). This is probably JITted out immediately, though. (I didn't check the assembly.) |
*/ | ||
private final class UnfoldIterable[A, S](initial: S)(f: S => Option[(A, S)]) extends AbstractIterable[A] { | ||
override def iterator: Iterator[A] = Iterator.unfold(initial)(f) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think of adding it to IterableFactory
(and SortedIterableFactory
, MapFactory
, SortedMapFactory
) instead, so that this would be available on all collections types?
Also, instead of using an UnfoldIterable
class, you can just call from(Iterator.unfold(initial)(f))
(though this would be less efficient than your version).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UnfoldIterable doesn't memoize items since it creates new UnfoldIterator every time iteration is done. Iterable.from just creates an immutable List IIRC. It depends what we want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should probably document Iterable.unfold
better to indicate that it doesn't memoize elements
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've checked scala.collection.View and it seems a better fit than scala.collection.Iterable to put collections without memoization. Views are not memoized by default, while Iterables are backed by immutable.List by default. Instead of changing the semantics for one method (unfold) in Iterable companion object, we can move that method (unfold) to View companion object. WDYT? @LPTK
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we add a View.Unfold
class, then we can have Factory.from(new View.Unfold(initial)(f))
for all types basically for free - similarly to what's done with View.Iterate
, View.Tabulate
, View.Fill
, etc.
Also, maybe we should replace |
Yes, I think having View.unfold is the way to go.
…On June 28, 2018 5:42:36 AM GMT+02:00, Nth ***@***.***> wrote:
NthPortal commented on this pull request.
> + * @return an Iterable by using a function `f` producing elements
of
+ * type `A` and updating an internal state `S`.
+ * @param init State initial value
+ * @param f Computes the next element (or returns `None` to
signal
+ * the end of the collection)
+ * @tparam A Type of the elements
+ * @tparam S Type of the internal state
+ */
+ def unfold[A, S](init: S)(f: S => Option[(A, S)]): Iterable[A] = new
UnfoldIterable(init)(f)
+
+ /** An `Iterable` that calls [[scala.collection.Iterator.unfold
`Iterator.unfold`]]
+ * to create `Iterator`s.
+ */
+ private final class UnfoldIterable[A, S](initial: S)(f: S =>
Option[(A, S)]) extends AbstractIterable[A] {
+ override def iterator: Iterator[A] = Iterator.unfold(initial)(f)
+ }
If we add a `View.Unfold` class, then we can have `Factory.from(new
View.Unfold(initial)(f))` for all types basically for free
|
I think this is ready for merge, if anyone wants to re-review |
@@ -81,7 +81,7 @@ object SerializationStability { | |||
} | |||
} | |||
|
|||
// Generated on 20180605-18:45:47 with Scala version 2.13.0-20180604-234247-8dd6ca5) | |||
// Generated on 20180701-21:01:46 with Scala version 2.13.0-20180701-205044-6e3f96b) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we override this file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand what you're asking
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this file overridden? That’s the first time I see it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes broke the test, so I had to update the test. Doing so automatically also updates the timestamp and hash
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not actually sure why the changes broke the test, since theoretically they only affect factories and companion objects, which shouldn't store any serialization data
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is safe to regenerate at the moment. All collection classes have new SerialVersionUIDs and are serialization incompatible with M3 and earlier versions anyway.
|
||
@Test def unfold(): Unit = { | ||
val it = Iterator.unfold(1)(i => if (i > 10) None else Some((i, i + 1))) | ||
assertSameElements(1 to 10, it) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mind adding a test that produces an empty Iterator
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can do :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about a test checking that creation of Iterator.UnfoldIterator and invoking View.Unfold.iterator doesn't invoke generator function (i.e. one that takes state and produces optionally next element and state)? LazyList.unfold doesn't have this property - it always invokes generator function once, even if the LazyList is never used after LazyList.unfold.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tarsa that's a good idea - can also do that :)
* @tparam S Type of the internal state | ||
* @return a $coll that produces elements using `f` until `f` returns `None` | ||
*/ | ||
def unfold[A : Ev, S](init: S)(f: S => Option[(A, S)]): CC[A] = from(new View.Unfold(init)(f)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we add it to IterableFactory
and EvidenceIterableFactory
we should also add it to SortedIterableFactory
, MapFactory
and SortedMapFactory
.
Or we can remove it from XxxFactory
types and let users write Iterator.unfold(…)(…).to(Xxx)
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I attempted to be consistent with how iterate
, tabulate
and fill
were defined.
Also, if we remove it from IterableFactory
, then it's not defined for any factories at all, which seems undesirable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I’m fine with this argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@julienrf IMHO it would be best not to expect users to rely on iterators for common useful operations, as iterators should be regarded as an error-prone (stateful) lowish-level abstraction, mainly used as implementation details.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@LPTK Then you can just write: View.unfold(…)(…).to(Xxx)
:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@julienrf you can't, because object View
inherits unfold
from IterableFactory
. You could do new View.Unfold(...)(...).to(Xxx)
, but that seems not great.
* @tparam A Type of the elements | ||
* @tparam S Type of the internal state | ||
*/ | ||
def unfold[A, S](init: S)(f: S => Option[(A, S)]): CC[A] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the new generic implementation as efficient as this one and sufficiently lazy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it certainly should be sufficiently lazy, since it just creates a LazyList
from a View
. I can't speak to its performance (I haven't done any benchmarks), but I wouldn't imagine it to be substantially different.
@@ -81,7 +81,7 @@ object SerializationStability { | |||
} | |||
} | |||
|
|||
// Generated on 20180605-18:45:47 with Scala version 2.13.0-20180604-234247-8dd6ca5) | |||
// Generated on 20180701-21:01:46 with Scala version 2.13.0-20180701-205044-6e3f96b) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is safe to regenerate at the moment. All collection classes have new SerialVersionUIDs and are serialization incompatible with M3 and earlier versions anyway.
Move ClassTagIterableFactory.iterate up to EvidenceIterableFactory.
I find it difficult to understand how to use Iterator.unfold, and the Scaladoc is somewhat oblique. I found an SO posting that helped, but the accompanying explanation is terse. Here are the code examples from the SO posting:
I wonder if it might be helpful to show an |
I don't disagree that it's oblique. Unfortunately, I'm not sure how to word and explain it better. To channel my inner Seth, a PR improving the clarity of the docs would probably be accepted. I would do it myself if I had any ideas. |
It might be helpful to know a bit about how this method came to be. The use case which inspired the idea, how the original user(s) used this method, the types of tasks that it is particularly well suited, etc. Where did this gem come from? |
it was added to |
Looks like @julienrf did the work. @julienrf, would you be willing to type out information in response to the questions I asked?
|
In teaching, I would start by teaching Then,
An example I like is:
Another example is good old Fibonacci:
At each stage of the loop, we need to have kept track of two |
@SethTisue Thanks for the quick response. You addressed "how", which is helpful. I'd also like to know "why" and "when". How does this method compare with alternatives? When is it better, and why? What types of problems might this method be well suited for? Also I am unclear about how results accumulate. |
@mslinn they accumulate results effectively by calling |
It is more powerful than We can prove that by implementing def apply[A](as: A*): List[A] =
List.unfold(as) { case h +: t => Some((h, t)) case _ => None }
def tabulate[A](length: Int)(f: Int => A): List[A] =
List.unfold(length)(n => if (n > 0) Some((f(n), n - 1)) else None)
def iterate[A](start: A, length: Int)(f: A => A): List[A] =
List.unfold((length, start)) {
case (0, _) => None
case (n, a) => Some((a, (n - 1, f(a))))
} Following the principle of least powerful abstraction, you should use |
This is great reasoning and I like the explanation, but note that def unfold[A, S](start: S)(op: S => Option[(A, S)]): List[A] =
Iterator.
iterate(op(start))(_.flatMap{ case (_, s) => op(s) }).
map(_.map(_._1)).
takeWhile(_.isDefined).
flatten.
toList Thus, one should |
It is helpful to see Why does |
|
@mslinn - No particular reason. |
Resolves scala/bug#10955