Pull-based streaming based on Task / Coeval #280
Merged
Resuming work on this feature 😃 I've merged with |
I've merged this PR in master, even though the |
We are introducing a new `Iterant[F[_], A]` type in a new sub-project called `monix-tail`, that's going to be a pull-based alternative to `Observable`, with a customizable `F[_]` evaluation context and thus powered by `Task`, `Coeval` or other monadic types.

For abstracting over `F[_]` we rely on type classes defined in Typelevel's cats-core along with cats-effect.

## Rationale
Rx.NET has `IAsyncEnumerable<T>` and `IAsyncEnumerator<T>`. The `Enumerable` (`Iterable`) type is considered the dual of the `Observable` implementation and `IAsyncEnumerator` is basically an `Enumerator` that returns a `Future` (or `Task`) when calling `next()`.

We need a similar pull-based dual because:

- `foldRight`, which isn't possible to implement for `Observable` (without converting it into a pull-based type first, hence this new type)
- `Observable` will still be much better for performance and for time-based / reactive operations; actually this new type will not implement operations that are non-deterministic in nature, e.g. it will have `zip`, but not `combineLatest` ;-)
- as a replacement for the `scala.collection.immutable.Stream` type, the `Observable` type might seem to be a little too much, since it is oriented towards reactive stuff; but conversions back and forth will be seamless

However the implementation will be different compared to an `Iterable` / `Enumerator`: our implementation will be an example of functional programming in Scala, thus following principles of functional programming, building on the knowledge we acquired and on the versatility of `Task`, so our communication protocol will not be based on Java's `Iterator`.

## Design
Basically our `Iterant[F[_], A]` is modeled as a bunch of states, similar to `Task` and `Free`, inspired by the `List` implementation (in Scala, Haskell and others) and by the `Streaming` type in Dogs.

The minimal/elegant encoding, with which we can achieve every operation that we want, is this:
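A minimal sketch of what such an encoding could look like, pieced together from the state names used in this description (`Next` with `item`, `rest` and `stop`, plus `Halt`). The `Suspend` state, the `Lazy` stand-in for `Task` / `Coeval`, and the `range` / `takeList` helpers are assumptions made for illustration, not the actual `monix-tail` code:

```scala
object IterantSketch {
  // Stand-in for Task / Coeval: a lazily evaluated value (assumption, not a Monix type).
  final case class Lazy[A](run: () => A)

  sealed trait Iterant[F[_], A]
  // One element, a deferred tail, and a finalizer to call on early termination.
  final case class Next[F[_], A](item: A, rest: F[Iterant[F, A]], stop: F[Unit])
    extends Iterant[F, A]
  // No element yet, only a deferred tail (assumed state, not named in the text).
  final case class Suspend[F[_], A](rest: F[Iterant[F, A]], stop: F[Unit])
    extends Iterant[F, A]
  // End of the stream: None for normal completion, Some(exception) for failure.
  final case class Halt[F[_], A](error: Option[Throwable]) extends Iterant[F, A]

  // Hypothetical helper: lazily builds from, from + 1, ..., until - 1.
  def range(from: Int, until: Int, stop: Lazy[Unit]): Iterant[Lazy, Int] =
    if (from >= until) Halt[Lazy, Int](None)
    else Next[Lazy, Int](
      from,
      Lazy[Iterant[Lazy, Int]](() => range(from + 1, until, stop)),
      stop)

  // Hypothetical helper: collects at most `n` elements, then runs `stop`
  // instead of evaluating the remaining `rest`.
  def takeList[A](source: Iterant[Lazy, A], n: Int): List[A] = {
    require(n > 0, "n must be positive")
    @scala.annotation.tailrec
    def loop(cur: Iterant[Lazy, A], left: Int, acc: List[A]): List[A] =
      cur match {
        case Next(item, rest, stop) =>
          if (left <= 1) { stop.run(); (item :: acc).reverse }
          else loop(rest.run(), left - 1, item :: acc)
        case Suspend(rest, _) => loop(rest.run(), left, acc)
        case Halt(None) => acc.reverse
        case Halt(Some(e)) => throw e
      }
    loop(source, n, Nil)
  }
}
```

With this sketch, `takeList` on a long `range` evaluates only the nodes it needs and then triggers `stop`, while a stream that ends on its own reaches `Halt(None)` without `stop` being called.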
So this is very similar to Scala's `List`, except that:

- `rest` in `Next` (the equivalent of the `tail` in Scala's `List`) is an `F[Iterant[F, A]]`, which can be a `Task` or a `Coeval`, meaning that its evaluation can also be lazy and/or asynchronous
- `Halt(Some(exception))` signals that the stream ended with an error
- there is a `stop: F[Unit]` routine that is supposed to be called whenever we want to prematurely stop the stream processing

So basically with a Scala `List` at any point the only choice is to keep processing the `rest` tail reference. But our stream gives us a choice: either process the `tail` or `stop`.

## Optimizations
The above encoding is minimal, but not efficient. We also need:
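A rough sketch of what a cursor-carrying state could look like. The names `NextCursor` and `ArrayCursor`, and the exact `Cursor` methods, are assumptions for illustration only. The one property taken from this description is that transformations such as `map` are strict for array-backed cursors:

```scala
import scala.reflect.ClassTag

object CursorSketch {
  // Iterator-like protocol for synchronous iteration (method set is an assumption).
  trait Cursor[A] {
    def hasNext: Boolean
    def next(): A
    def map[B: ClassTag](f: A => B): Cursor[B]
  }

  // Array-backed cursor: `map` is strict, transforming all remaining
  // elements eagerly into a new array.
  final class ArrayCursor[A](array: Array[A], start: Int) extends Cursor[A] {
    private[this] var index = start
    def hasNext: Boolean = index < array.length
    def next(): A = { val a = array(index); index += 1; a }
    def map[B: ClassTag](f: A => B): Cursor[B] = {
      val out = new Array[B](array.length - index)
      var i = 0
      while (i < out.length) { out(i) = f(array(index + i)); i += 1 }
      new ArrayCursor(out, 0)
    }
  }

  sealed trait Iterant[F[_], A]
  // Hypothetical state: a whole batch of elements behind a cursor,
  // followed by a deferred tail and the usual finalizer.
  final case class NextCursor[F[_], A](
    cursor: Cursor[A], rest: F[Iterant[F, A]], stop: F[Unit])
    extends Iterant[F, A]
  final case class Halt[F[_], A](error: Option[Throwable]) extends Iterant[F, A]
}
```

The point of such a state is that a consumer can drain `cursor` in a tight synchronous loop and only pay for an `F[_]` evaluation once per batch instead of once per element.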
The `Cursor` type is a light alternative to Scala's and Java's `Iterator`, which can wrap either a normal `Iterator`, an `Array` or whatever can be iterated synchronously and efficiently. This is because:

- the `Iterator` pattern is very, very efficient and best of all it supports our needs for efficient head / tail decomposition, even if it is destructive when compared with Scala's `LinearSeq` (e.g. `List`, `Stack`), because linked-lists suck for performance

The `Cursor` type is basically an `Iterator`, however I decided to introduce a custom type because I want to control its implementation, because we need to specialize it for primitives and because `Iterator` does not have the right semantics for the provided transformations.

For instance when working with arrays, I want `Cursor.map` and `Cursor.filter` and other such operations to have strict behavior, not lazy. I also want `Cursor` to have specialized implementations for arrays of primitives.

So YES, I'm optimising this FP abstraction by shoving an `Iterator` in one of these states. It's dirty, but it is well encapsulated and it works.

But that's not all, we also need:
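A small sketch of a state for the final element. The name `Last` is an assumption, as are the `Lazy` stand-in for `Task` / `Coeval` and the `toListCounting` helper, which counts how many deferred tails get evaluated:

```scala
object LastSketch {
  // Stand-in for Task / Coeval (assumption, not a Monix type).
  final case class Lazy[A](run: () => A)

  sealed trait Iterant[F[_], A]
  final case class Next[F[_], A](item: A, rest: F[Iterant[F, A]], stop: F[Unit])
    extends Iterant[F, A]
  // Hypothetical state: the final element, with no tail and no finalizer.
  final case class Last[F[_], A](item: A) extends Iterant[F, A]
  final case class Halt[F[_], A](error: Option[Throwable]) extends Iterant[F, A]

  // Collects all elements and counts how many `rest` evaluations were needed.
  def toListCounting[A](source: Iterant[Lazy, A]): (List[A], Int) = {
    @scala.annotation.tailrec
    def loop(cur: Iterant[Lazy, A], acc: List[A], evals: Int): (List[A], Int) =
      cur match {
        case Next(item, rest, _) => loop(rest.run(), item :: acc, evals + 1)
        case Last(item) => ((item :: acc).reverse, evals)
        case Halt(None) => (acc.reverse, evals)
        case Halt(Some(e)) => throw e
      }
    loop(source, Nil, 0)
  }
}
```

In this sketch a single-element stream built with `Last` yields the same elements as the `Next` plus `Halt(None)` form, but without evaluating a deferred tail, which is the kind of saving that matters when `flatMap` concatenates many small child streams.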
This is an optimisation over signalling `Next(item, F.pure(Halt(None)), F.unit)`, which benchmarks show greatly improves the performance of `flatMap`.

Basically without this state, the `flatMap` operation is slower than the one implemented on `Observable` and that would be quite bad, given that `Observable.flatMap` is forced to do concurrency handling due to the back-pressure protocol handling of `onComplete` (currently by means of 2 `getAndSet` operations per concatenated child).

## Work in progress
The target is version 3.0.0.
Not all operations I want are implemented, plus I'd rather ship a minimal usable version than something that breaks compatibility later.