add Flatten iterator #14805
In Iterators.jl, this is called
I know, but this one is a bit different since it takes a single iterator and flattens its output, which allows nesting.
In my view,
Fair enough.
Sure, ok
Presumably, this will be exported somewhere? Needs a NEWS item in that case.
As a point of reference, there is also some discussion here, which I think implements the same thing JuliaCollections/Iterators.jl#50
I see --- it looks like my implementation avoids state, at the cost of some repeated
That is a good orthogonal design: this is a flatten iterator that does not count the elements of its child iterators, has no length, and works with iterators that can be read only once.
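To make the nesting point concrete, here is a minimal sketch in current Julia (1.x). It assumes the iterator discussed here corresponds to what is available today as `Iterators.flatten`; the variable names are illustrative and not taken from this PR.

```julia
# A collection of collections of ranges (two levels of nesting).
nested = [[1:2, 3:4], [5:6]]

# One application removes one level: the elements are now the ranges.
once = Iterators.flatten(nested)

# Because the result is itself a single iterator, flatten can be applied again.
twice = Iterators.flatten(once)

println(collect(once))   # the three ranges 1:2, 3:4, 5:6
println(collect(twice))  # the integers 1 through 6
```

Each application peels off exactly one level, which is what distinguishes this from a variadic chaining function that takes the iterators as separate arguments.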
s = start(f.it)
d = done(f.it, s)
# this is a simple way to make this function type stable
d && error("argument to Flatten must contain at least one iterator")
more specific exception type? ArgumentError maybe?
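A rough sketch of what that suggestion could look like, written for current Julia (1.x), where `iterate` has replaced `start`/`next`/`done`. The helper `first_inner` is hypothetical and only illustrates throwing `ArgumentError` instead of calling `error`; it is not the code in this PR.

```julia
# Hypothetical helper: return the first inner iterator of `itr`,
# or throw an ArgumentError when `itr` is empty.
function first_inner(itr)
    y = iterate(itr)
    if y === nothing
        throw(ArgumentError("argument to Flatten must contain at least one iterator"))
    end
    return y[1]
end

# Callers can now catch the specific exception type.
result = try
    first_inner([])
catch err
    err
end
println(typeof(result))
```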
I'd like to use this in #15409. I have a function (
"""
    batch(c; n_workers=nworkers(), max_batch_size=100)
Split a collection into batches for processing by `n_workers`.
Equivalent to `split(c, max_batch_size)` when `length(c) >> max_batch_size`.
"""
function batch(c; n_workers=nworkers(), max_batch_size=100)
    # Split collection into batches, then peek at the first few batches...
    batches = split(c, max_batch_size)
    n = Int(n_workers * 4)
    head, tail = take_head(batches, n)
    # If there are not enough batches, use a smaller batch size...
    if length(head) < n
        head = vcat(head...)
        batch_size = max(1, round(Int, length(head) / n))
        return split(head, batch_size)
    end
    return flatten((head, tail))
end
See also
type ResumeIterator{T}
    itr::T
    state
end
"""
    resume(itr, state) -> iterator
Returns an iterator that iterates over `itr` starting from `state`.
"""
resume(itr, state) = ResumeIterator(itr, state)
eltype{T}(::Type{ResumeIterator{T}}) = eltype(T)
start(itr::ResumeIterator) = itr.state
next(itr::ResumeIterator, state) = next(itr.itr, state)
done(itr::ResumeIterator, state) = done(itr.itr, state)
"""
    take_head(c, n) -> head, tail
Returns `head`: the first `n` elements of `c`;
and `tail`: an iterator over the remaining elements.
"""
function take_head(c, n)
    head = Vector{eltype(c)}(n)
    s = start(c)
    i = 0
    while i < n && !done(c, s)
        i += 1
        head[i], s = next(c, s)
    end
    return resize!(head, i), resume(c, s)
end
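For readers on current Julia (1.x), the head/tail idea above can be sketched with the `iterate` protocol. This is an approximation under stated assumptions, not the code proposed in this thread: `Iterators.rest` stands in for `resume`, `Iterators.flatten` stands in for `flatten`, and the early-return handling of `n <= 0` and of an empty `c` is my own addition.

```julia
# Sketch: return the first `n` elements of `c` as a Vector, plus an
# iterator over whatever remains.
function take_head(c, n)
    head = eltype(c)[]
    n <= 0 && return head, c          # nothing requested: tail is all of c
    y = iterate(c)
    y === nothing && return head, c   # c is empty
    x, s = y
    push!(head, x)
    while length(head) < n
        y = iterate(c, s)
        y === nothing && break        # c ran out before n elements
        x, s = y
        push!(head, x)
    end
    # `s` is the state after the last element taken, so the tail resumes there.
    return head, Iterators.rest(c, s)
end

head, tail = take_head(1:10, 3)
# Chaining head and tail back together yields the original sequence.
rejoined = collect(Iterators.flatten((head, tail)))
println(head)
println(rejoined)
```

Passing the two-element tuple `(head, tail)` to a single-argument flatten is how `batch` stitches the peeked batches back onto the remaining ones without materializing the tail.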
Ok, in that case I'll plan to merge this soon. We already have
Thanks Jeff.
Any thoughts on what The existing
fb3dd6e to 1164012
1164012 to dd391f9
@samoconnor
@mschauer what I've ended up with in #15409 is What about just
Was it intentional that
Yes, because we should really wrap the iterators in a module to separate them from eager functions.
I still don't like the
Got anything better to call it?
I still like either
Iterator Madness continues. This is essentially a chaining iterator that can be applied recursively to eventually implement nested for generator expressions as in #4867.
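As an illustration of that connection in current Julia (1.x), a generator with two `for` clauses behaves like a flattened generator of generators. The sketch below assumes `Iterators.flatten` is the present-day form of this iterator; the expressions are made-up examples.

```julia
# A generator with two nested `for` clauses.
gen = (10x + y for x in 1:2 for y in 1:3)

# The same sequence built by hand: an outer generator that yields one
# inner generator per `x`, flattened into a single stream.
manual = Iterators.flatten(((10x + y for y in 1:3) for x in 1:2))

println(collect(gen))     # [11, 12, 13, 21, 22, 23]
println(collect(manual))  # [11, 12, 13, 21, 22, 23]
```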