-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Supporting Chunking in ZStream.fromIterator #4834
Conversation
/** | ||
* Creates a stream from an iterator that may throw exceptions. | ||
*/ | ||
def fromIterator[A](iterator: => Iterator[A], maxChunkSize: Int = 1): ZStream[Any, Throwable, A] = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a heads up that adding a default parameter breaks backwards compatibility, like here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I think since we are not guaranteeing binary compatibility at this point that is okay. I think this is the right API and it should be a very easy change, but open to other perspectives.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could also add another method with 2 parameters, and the 1 parameter one can delegate to this one.
@iravid Can I get a review on this? |
.fold( | ||
Pull.fail, | ||
iterator => | ||
ZIO.effect { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you know the chunk size, it's probably more efficient to pre-allocate an array, fill it, convert to Chunk
, and take
the number of elements actually read (which is <=
the chunk size).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hinting the chunk builder with the desired size will have the same effect and avoid dealing with the array directly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the one watch out here is if the user uses an extremely large maxChunkSize
. For example, if I say "I'm not sure how many elements I have, but just read them all as one Chunk" and do fromIterator(iterator, Int.MaxValue)
I am going to blow up with an OutOfMemoryError
because it is going to try to actually allocate an Array
with Int.MaxValue
size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My POV on this is that if it says "chunk" on the label, it's eager and must fit in memory. So I kinda think it's a fair tradeoff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that is right but just to push on this that means that ZStream.fromIterator(Iterator(1, 2, 3, 4, 5), Int.MaxValue)
will blow up. I think definitely if the Iterator
is actually too large to fit in memory it has to fail, but failing when the Iterator
is small is a little counterintuitive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True. Maybe we could call it a chunkHint
and that way we would be free to limit it to some reasonable value (our default chunk size, e.g.).
if (maxChunkSize <= 1) { | ||
if (iterator.isEmpty) Pull.end else Pull.emit(iterator.next()) | ||
} else { | ||
val builder = ChunkBuilder.make[A]() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
val builder = ChunkBuilder.make[A]() | |
val builder = ChunkBuilder.make[A](maxChunkSize) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good with the sizeHint changes!
Wow, you guys really care about getting the naming right. I’m impressed! |
Sometimes we want to pull values from an
Iterator
in chunks for greater efficiency.