Pull-based streaming based on Task / Coeval by alexandru · Pull Request #280 · monix/monix

alexandru · 2016-12-21T07:43:48Z

This PR introduces a new Iterant[F[_], A] type in a new sub-project called monix-tail. It is a pull-based alternative to Observable with a customizable F[_] evaluation context, so it can be powered by Task, Coeval or other monadic types.

For abstracting over F[_] we rely on type classes defined in Typelevel's cats-core along with cats-effect.

Rationale

Rx.NET has IAsyncEnumerable<T> and IAsyncEnumerator<T>. The Enumerable (Iterable) type is considered the dual of the Observable implementation and IAsyncEnumerator is basically an Enumerator that returns a Future (or Task) when calling next().

We need a similar pull-based dual because:

a pull-based communication model is often easier to reason about for certain use-cases
it's easier to provide support for certain FP-ish operations, like foldRight, which can't be implemented for Observable (without converting it into a pull-based type first, hence this new type)
Observable will still be much better for performance and for time-based / reactive operations; actually this new type will not implement operations that are non-deterministic in nature, e.g. it will have zip, but not combineLatest ;-)
for people wanting a replacement for Scala's own scala.collection.immutable.Stream type, the Observable type might seem to be a little too much, since it is oriented towards reactive stuff; but conversions back and forth will be seamless
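
To make the foldRight point concrete, here is a minimal sketch of a lazy right fold over a pull-based stream. It does not use the Iterant API: Pull, Cons, Done, from and exists are hypothetical names introduced only for this illustration, and the tail is a plain thunk rather than an F[_]. Because the consumer decides whether to pull the next element, the fold can stop early, even on an infinite source.

```scala
object FoldRightSketch {
  // Hypothetical pull-based stream: the tail is a thunk, evaluated only on demand.
  sealed trait Pull[+A]
  final case class Cons[A](head: A, tail: () => Pull[A]) extends Pull[A]
  case object Done extends Pull[Nothing]

  // Right fold whose accumulator is passed by name, so `f` may ignore it
  // and thereby never pull the rest of the stream.
  // Note: this simple version is not stack-safe for long traversals.
  def foldRight[A, B](fa: Pull[A], z: => B)(f: (A, => B) => B): B =
    fa match {
      case Cons(h, t) => f(h, foldRight(t(), z)(f))
      case Done       => z
    }

  // Infinite stream of integers starting at `n`.
  def from(n: Int): Pull[Int] = Cons(n, () => from(n + 1))

  // Short-circuits as soon as the predicate holds, because `||`
  // evaluates its right-hand side only when the left side is false.
  def exists[A](fa: Pull[A])(p: A => Boolean): Boolean =
    foldRight(fa, false)((a, acc) => p(a) || acc)
}
```

With a push-based source the producer drives the emission, so the consumer cannot simply decline to evaluate the rest, which is the asymmetry this sketch is meant to show.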

However, the implementation will differ from an Iterable / Enumerator: it follows the principles of functional programming in Scala and builds on the knowledge we acquired and on the versatility of Task, so our communication protocol will not be based on Java's Iterator.

Design

Basically our Iterant[F[_], A] is modeled as a set of states, similar to Task and Free, inspired by the List implementation (in Scala, Haskell and others) and by the Streaming type in Dogs.

The minimal, elegant encoding, with which we can implement every operation we want, is this:

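sealed abstract class Iterant[F[_], A]

// Emits one item, followed by the lazily evaluated rest of the stream
case class Next[F[_], A](
    item: A,
    rest: F[Iterant[F, A]],
    stop: F[Unit])
    extends Iterant[F, A]

// Emits nothing; defers to the lazily evaluated rest of the stream
case class Suspend[F[_], A](
    rest: F[Iterant[F, A]],
    stop: F[Unit])
    extends Iterant[F, A]

// End of the stream, either normal (None) or with an error (Some(ex))
case class Halt[F[_], A](ex: Option[Throwable])
    extends Iterant[F, A]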

So this is very similar to Scala's List, except that:

the rest in Next (the equivalent of the tail in Scala's List) is an F[Iterant[F, A]], where F can be Task or Coeval, meaning that its evaluation can also be lazy and/or asynchronous
the stream can end in an error (Halt(Some(exception)))
we need to handle file handles and network sockets, hence we need a stop: F[Unit] routine that is supposed to be called whenever we want to prematurely stop the stream processing

So with a Scala List, the only choice at any point is to keep processing the tail. Our stream gives us a choice: either process the rest or call stop.
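
The "process the rest or stop" choice can be sketched as follows. This is a simplified illustration, not the code from this PR: F[_] is fixed to a plain thunk (the Lazy alias), and Handle, range and takeToList are hypothetical helpers. The consumer takes at most n items and, if it quits before the stream ends, runs stop so the producer can release its resource.

```scala
object IterantSketch {
  // Assumption: F[_] is fixed to a thunk for illustration; the PR abstracts
  // over F by means of type classes instead.
  type Lazy[A] = () => A

  sealed abstract class Iterant[A]
  final case class Next[A](item: A, rest: Lazy[Iterant[A]], stop: Lazy[Unit]) extends Iterant[A]
  final case class Suspend[A](rest: Lazy[Iterant[A]], stop: Lazy[Unit]) extends Iterant[A]
  final case class Halt[A](ex: Option[Throwable]) extends Iterant[A]

  // Hypothetical stand-in for a file handle or socket.
  final class Handle { var closed = false }

  // Emits from, from + 1, ... until - 1; closes the handle on normal
  // completion, or when the consumer runs `stop`.
  def range(from: Int, until: Int, h: Handle): Iterant[Int] =
    if (from < until) Next(from, () => range(from + 1, until, h), () => { h.closed = true })
    else { h.closed = true; Halt[Int](None) }

  // Collects at most `n` items. When the limit is reached it runs `stop`
  // instead of evaluating `rest`, so nothing more is pulled from the source.
  @annotation.tailrec
  def takeToList[A](fa: Iterant[A], n: Int, acc: List[A] = Nil): List[A] =
    fa match {
      case Next(item, rest, stop) =>
        if (n <= 0) { stop(); acc.reverse }
        else if (n == 1) { stop(); (item :: acc).reverse }
        else takeToList(rest(), n - 1, item :: acc)
      case Suspend(rest, stop) =>
        if (n <= 0) { stop(); acc.reverse }
        else takeToList(rest(), n, acc)
      case Halt(None)     => acc.reverse
      case Halt(Some(ex)) => throw ex
    }
}
```

For example, takeToList(range(0, 10, handle), 3) collects the first three items and leaves handle.closed set to true, without ever evaluating the rest of the fourth node.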

Optimizations

The above encoding is minimal, but not efficient. We also need:

case class NextCursor[F[_], A](
    cursor: BatchCursor[A],
    rest: F[Iterant[F, A]],
    stop: F[Unit])
    extends Iterant[F, A]

case class NextBatch[F[_], A](
    batch: Batch[A],
    rest: F[Iterant[F, A]],
    stop: F[Unit])
    extends Iterant[F, A]

The Cursor type (BatchCursor in the definition above) is a lightweight alternative to Scala's and Java's Iterator, which can wrap a normal Iterator, an Array or anything else that can be iterated synchronously and efficiently. This is because:

performance of heap-allocated linked-lists, especially linked lists evaluated by means of a trampolined call-stack, is absolutely terrible
IMO any streaming abstraction sucks badly if it cannot iterate over arrays efficiently (so without trashing cache locality or the garbage collector)
the Iterator pattern is very, very efficient and best of all it supports our needs for efficient head / tail decomposition, even if it is destructive when compared with Scala's LinearSeq (e.g. List, Stack), because linked-lists suck for performance

The Cursor type is basically an Iterator. However, I decided to introduce a custom type because I want to control its implementation, because we need to specialize it for primitives and because Iterator does not have the right semantics for the provided transformations.

For instance, when working with arrays, I want Cursor.map, Cursor.filter and other such operations to have strict behavior, not lazy. I also want Cursor to have specialized implementations for arrays of primitives.
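
As a rough illustration of what "strict" means here, the sketch below defines a hypothetical ArrayCursor (not the BatchCursor from this PR) whose map applies the function to all remaining elements immediately and returns a cursor over a new array. Scala's Iterator.map, by contrast, defers the function until next() is called.

```scala
object CursorSketch {
  import scala.reflect.ClassTag

  // Hypothetical array-backed cursor, for illustration only.
  final class ArrayCursor[A](array: Array[A]) {
    private var index = 0

    def hasNext: Boolean = index < array.length
    def next(): A = { val a = array(index); index += 1; a }

    // Strict map: `f` runs for every remaining element right away and the
    // results are stored in a fresh array, which keeps later iteration
    // a plain array traversal.
    def map[B: ClassTag](f: A => B): ArrayCursor[B] = {
      val out = new Array[B](array.length - index)
      var i = 0
      while (i < out.length) { out(i) = f(array(index + i)); i += 1 }
      new ArrayCursor(out)
    }
  }
}
```

Mapping a three-element ArrayCursor invokes the function three times before map returns, whereas Iterator(1, 2, 3).map(f) invokes it zero times until the result is consumed.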

So YES, I'm optimising this FP abstraction by shoving an Iterator in one of these states. It's dirty, but it is well encapsulated and it works.

But that's not all, we also need:

case class Last[F[_], A](item: A)
  extends Iterant[F, A]

This is an optimisation over signalling Next(item, F.pure(Halt(None)), F.unit), which benchmarks show greatly improves the performance of flatMap. Without this state, flatMap is slower than the one implemented on Observable, and that would be quite bad, given that Observable.flatMap is forced to do concurrency handling due to the back-pressure protocol's handling of onComplete (currently by means of 2 getAndSet operations per concatenated child).
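
The equivalence between the two encodings, and the step that Last saves, can be sketched like this. Again F[_] is fixed to a thunk and the helper names (toListCounting, viaNext, viaLast) are hypothetical. The sketch only counts forced rest thunks; it does not reproduce the flatMap benchmark.

```scala
object LastSketch {
  // Assumption: F[_] is fixed to a thunk for illustration.
  type Lazy[A] = () => A
  val unit: Lazy[Unit] = () => ()

  sealed abstract class Iterant[A]
  final case class Next[A](item: A, rest: Lazy[Iterant[A]], stop: Lazy[Unit]) extends Iterant[A]
  final case class Last[A](item: A) extends Iterant[A]
  final case class Halt[A](ex: Option[Throwable]) extends Iterant[A]

  // Returns the collected items plus the number of `rest` thunks forced.
  def toListCounting[A](fa: Iterant[A]): (List[A], Int) = {
    @annotation.tailrec
    def loop(cur: Iterant[A], acc: List[A], forced: Int): (List[A], Int) =
      cur match {
        case Next(a, rest, _) => loop(rest(), a :: acc, forced + 1)
        case Last(a)          => ((a :: acc).reverse, forced)
        case Halt(None)       => (acc.reverse, forced)
        case Halt(Some(e))    => throw e
      }
    loop(fa, Nil, 0)
  }

  // A single-element stream in the minimal encoding: two thunks plus a Halt node.
  def viaNext[A](a: A): Iterant[A] = Next(a, () => Halt[A](None), unit)

  // The same stream with the optimised state: one node, nothing to force.
  def viaLast[A](a: A): Iterant[A] = Last(a)
}
```

Both encodings yield the same single item, but the consumer has to force one extra rest thunk to discover the Halt in the first case. In an operation like flatMap this cost would be paid once per child stream, which is presumably where the measured difference comes from.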

Work in progress

The target is version 3.0.0.

Not all the operations I want are implemented yet, but I'd rather ship a minimal usable version than something that breaks compatibility later.

alexandru · 2017-08-02T15:33:23Z

Resuming work on this feature 😃 I've merged with master, which contains the upgrade to Cats 1.0.0-MF.

alexandru · 2017-08-04T13:12:44Z

I've merged this PR into master. The Iterant type isn't finished, but it provides a good base to build on and this PR was getting too large already.

alexandru added 30 commits September 2, 2016 19:11

Stream experiment, reinitiated

87f2dcf

TaskStream, CoevalStream

7200117

Add type-class laws, w00t!

0e1d038

Merge branch 'master' into wip-streams

eb98b8d

Add more tests

52640e2

Add Stream monad instances, refactor tests

7deee19

Add more tests

b67fb5e

Run-loop improvements, take 1

a3947b6

Run-loop optimization

867682a

Refactoring, add new operators (asyncBoundary)

f092b89

Use final val in BooleanCancelable

963dafd

Add comments

33a78d0

Optimize Task.gatherUnordered and change its signature

7a32893

Optimizations

ef08853

Fix

57772d3

Fix travis build, add Task.executeAsync

157ede2

Update minitest to 0.25

765904b

Update travis

a044052

Fixes, tests

36e1dec

Fix 2.12 travis build

947655c

Add Observable.executeWithFork, Observable.executeWithModel, tests

da0386c

Add Observable.fork(fa,s)

d488fb8

Comment on unsafeCreate

690f016

Optimizing Task.gather and Task.mapBoth

b2fcf80

Small fix

4081e17

Issue #243 - add the TrampolineScheduler for the JVM

386fae5

Add tests for TrampolineScheduler, fix ReferenceScheduler

8bce030

Merge branch 'master' into trampoline-improv

2d7ebb1

AsyncStateActionObservable should no longer modify the execution model

5df3057

Observable.fromAsyncStateAction should use the default execution model

4d7fd5d

alexandru added 2 commits March 24, 2017 11:32

Add Iterant.takeWhile

d22c1b9

Add completeL/foreachL

aa09105

alexandru closed this May 4, 2017

alexandru deleted the wip-streams branch May 4, 2017 05:31

alexandru restored the wip-streams branch June 27, 2017 20:14

alexandru reopened this Jun 27, 2017

alexandru added 4 commits June 30, 2017 09:02

Half-arsed merge, still broken

f288d37

Fixed most problems, except the Monad laws suites

01b7e25

Merge remote-tracking branch 'upstream/master' into wip-streams

0733416

Fix tests after Cats upgrade.

4985d31

alexandru modified the milestones: 3.0.0, 2.3.0 Aug 2, 2017

alexandru added 10 commits August 3, 2017 10:35

Start to expect referential transparency in user provided functions

7bc27a6

Reviewing available ops, add comments, review suspensions

247b5ab

Switch tailRecM

3e0a177

Update copyright headers

cbbb1ba

Add tests

f0e82ee

Add comments, change suspend behavior

03ae2db

Add tests for Iterant[IO, ?], add comments

43ada74

Add builders tests

10384f5

Implement MonadError

ab756de

Fix comments

b0dc5a1

alexandru changed the title ~~WIP: pull-based streaming based on Task / Coeval~~ Pull-based streaming based on Task / Coeval Aug 4, 2017

alexandru added 2 commits August 4, 2017 15:25

Fix JS tests

2f06b5a

Get rid of SharedDocs

a025d09

alexandru merged commit 4ceea29 into master Aug 4, 2017

alexandru mentioned this pull request Aug 15, 2017

We need a toIO operation and type class typelevel/cats-effect#73

Closed

alexandru deleted the wip-streams branch January 21, 2018 07:29

alexandru merged 100 commits into master from wip-streams

5 participants
