Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Research "eager" notification #128

Closed
zenparsing opened this issue Dec 22, 2016 · 32 comments
Closed

Research "eager" notification #128

zenparsing opened this issue Dec 22, 2016 · 32 comments

Comments

@zenparsing
Copy link
Member

zenparsing commented Dec 22, 2016

Definitions:

  • Eager Notification: sending notifications to the observer before the call to subscribe has completed.
  • Eager Observable: an observable that sends notifications to observers during the call to subscribe.

Motivations:

  • Eager notification results in confusing "Zalgo"-esque behavior in which notifications can be received by the observer before the subscribe call has returned. The code which performs the subscribe cannot generally know whether notifications will arrive "eagerly" or not, and users will commonly make the invalid assumption that notifications will not. In the past, JS has attempted to eliminate these kinds of confusing conditions.
  • Eager notification makes the behavior of the type incredibly difficult to reason about. See SubscriptionObserver should never throw, just like Promise #119
  • Eager notification results in subtle bugs when developing combinators and requires careful programming. Examples: 1, 2
  • Eager notification necessitates an additional synchronous unsubscription mechanism, like the current observer.start method.

Goals:

  • Explore the possibility of eliminating eager notification in order to make Observable easier to understand and less error-prone.
  • Avoid discussion of whether Zalgo is a problem for "real" programmers.
  • Avoid direct comparison to Promise notification and propagation.

Open Questions:

  • What use cases require eager notification?

Related Previous Discussion:

Contributing:

Let's try to keep the discussion focused the stated goals, and avoid open-ended back and forth. If you can, try to provide some insight on one of the open questions listed above.

@RangerMauve
Copy link

Would this require Observables to send notifications asynchronously, on the next tick for instance? Would that mean that every call to subscribe will add another tick of delay?

@zenparsing
Copy link
Member Author

Currently we catch errors that are thrown from the subscriber function and attempt to send them to the observer's error method. That's a case of eager notification, but it results in some confusing behaviors.

If an error is thrown from next, we do not send the error to the observer's error method:

let observable = new Observable(sink => {
  setTimeout(() => { sink.next(1); }, 100);
});

observable.subscribe({
  next(v) { throw new Error('from observer') },
  error(e) {
    // This will not receive the error above ^^^
  },
});

Except it will if the observable emits eagerly!

let observable = new Observable(sink => {
  sink.next(1);
});

observable.subscribe({
  next(v) { throw new Error('from observer') },
  error(e) {
    // This will receive the error above ^^^
    // WTF Observable!
  },
});

If we eliminated eager notification, then I suppose that we would need to change the subscribe behavior such that errors thrown from the subscriber function are not caught and sent to observer.error. That would also eliminate the confusing situation illustrated above.

@zenparsing
Copy link
Member Author

@RangerMauve

Would this require Observables to send notifications asynchronously, on the next tick for instance?

Each notification would still be sent synchronously, you just (somehow) wouldn't be allowed to send notifications before subscribe has completed.

Would that mean that every call to subscribe will add another tick of delay?

No. We would need to introduce latency for Observable.of, though, since it currently does eager notification.

Rather than attempting to design a solution though, I'd like to understand what use cases absolutely require eager notification.

@benjamingr
Copy link

benjamingr commented Dec 22, 2016

@zenparsing how would you implement an Observable.range(1, 100000) without it and without allocating a 100000 sized array?

Currently we catch errors that are thrown from the subscriber function and attempt to send them to the observer's error method. That's a case of eager notification, but it results in some confusing behaviors.

Promises do this too:

new Promise(() => { throw new Error(":("); }); // rejects the promise
new Promise(() => setTimeout(() => { throw new Error(":("); })); // throws globally

This is just how wrapped async code (like promises/observables) works with raw code, if you used Observable.delay in your observable example it would have worked fine. There is nothing we can do (short of zones) that would catch general errors, and it has been known to be a Messy Business.

@zenparsing
Copy link
Member Author

@benjamingr The only difference for Observable.range would be that it would have to schedule a job to start sending the range, instead of sending immediately.

I'm not proposing that we make "next" calls async. I'm only trying to understand if "eager" notification (defined above) is a requirement.

Also, let's not bring promises into this discussion, as I find the comparisons to be incredibly confusing.

@benjamingr
Copy link

benjamingr commented Dec 22, 2016

That makes sense (for .range).

I'm not proposing that we make "next" calls async

You've made that plenty obvious.

I think @staltz had some good motivating examples for eager notification.

Also, let's not bring promises into this discussion, as I find the comparisons to be incredibly confusing.

Promises were just to show it's not something we should aim to solve for the general case (your setTimeout example).

@zenparsing
Copy link
Member Author

Promises were just to show it's not something we should aim to solve for the general case (your setTimeout example).

Gotcha. The example I posted above is just meant to illustrate that the current behavior, where in some cases error is called with an error thrown from next and sometimes it isn't, is really confusing and hard to explain.

@zenparsing
Copy link
Member Author

@benjamingr

Let's not use the term "sync subscription" because I'm not proposing that we make "subscribe" async either. I just want to know whether there are use cases for eager notification.

@benjamingr
Copy link

Roger.

Going through our Rx code - this change would not cause any issues for us. I tend to be +1 on this change.

@zenparsing
Copy link
Member Author

One use case that I've seen for eager notification is "Property", described in #83.

Basically, it's an observable for a sequence of changes to a "property" value. The user wants to see the current value immediately when they subscribe. I illustrated a non-eager alternative here in which the current value is sent in a future turn, if the property is not updated beforehand.

@benjamingr
Copy link

@zenparsing pinging @mweststrate who wrote MobX:

Michel - MobX does this (run the function synchronously in autorun). We're discussing changing the semantics of native ES observables to emit next events only after the subscribe function has completed in order to deal with multiple consumers where one throws and other things.

In MobX terms this means autorun always runs the function afterwards and autorun is not synchronous. How problematic would that be for MobX?

@zenparsing
Copy link
Member Author

Similar in spirit to Property is ReplaySubject. On subscribe, it immediately sends out a buffered list of previous notifications (leaving aside custom schedulers).

If we couldn't send notifications eagerly, ReplaySubject would have to delay subscription for a tick.

@trxcllnt
Copy link

Wow, this is frustrating. This was argued and settled literally a year ago. Observables are a primitive. If you want scheduling, compose it in.

@zenparsing
Copy link
Member Author

@trxcllnt That is not a helpful comment.

@benlesh
Copy link

benlesh commented Dec 22, 2016

Given that we'd like to be able to model EventTarget with Observable, I don't think we can give up eager notification.

The following all happens in the single job on the event loop:

console.log('start');

const handler = e => console.log('clicked');

document.addEventListener('click', handler);
document.dispatchEvent(new Event('click'));
document.removeEventListener('click', handler);

console.log('end');

// "start"
// "click"
// "end"

You can't model that with Observable without eager notification. And in scenarios where you might be synchronously dispatching N-events and you only want to take a few of them, you'd be forced to buffer all of those values in an unbounded buffer (or via closures in memory) while you're scheduling.

Also, given the nature of Observables as being a template to create a bridge of observation between a data producer and an observer, adding unnecessary scheduling through the middle of that observation chain isn't really going to help anyone, and will have some performance costs. I've brought this up before and my concerns were minimized, but the costs are there, I've had to measure them in the past. And at scale, in Node, for example, it's going to matter.

Observables are a primitive. You can compose in scheduling, you can't compose in removal of scheduling.

@zenparsing
Copy link
Member Author

You can't model that with Observable without eager notification

This is false. Please re-read the definition section above.

@trxcllnt
Copy link

I'm sorry, but it is frustrating. The more the spec deviates from proven, battle-tested prior art, the more buggy edge cases are introduced or possible use-cases precluded.

It's frustrating because I do this all day, every day, for money. A huge part of my job is solving really hard problems (async microservices on distributed GPUs, yay!), pushing Observables to the max. Changing core stuff like this breaks things like our carefully designed high performance recursive async back-pressure-sensitive GPU-sync'd rendering loop.

The es-observable spec is trying to improve on an idea that has years of academic and industry research behind it. I literally work with people whose PhD work inspired Rx. Yet we keep proposing fundamental changes to the core behavior for aesthetic reasons, without any research, analysis, or testing as to the implications. Observable is a like finely tuned equation, you can't just change the order of operations and expect it will still work.

I'll try to go back and dig up all the arguments and examples, some made in this repo, some made in others, when I'm back to my laptop.

@zenparsing
Copy link
Member Author

@trxcllnt I sympathize, and for what it's worth, I'm in complete agreement that we should stick to the battle-tested design as closely as possible. I'm just concerned that this one aspect (eager notification) is really confusing and tricky. I think we both know that the eager notification leads to all kinds of little edge cases that must be dealt with carefully.

Any input here would be great, but I want it to stay focused on eager notification use cases (like ReplaySubject). I'd like to put together a list.

@benlesh
Copy link

benlesh commented Dec 22, 2016

This is false. Please re-read the definition section above.

How would this work?

Observable.range(0, 1000000000000).take(5)

or this?

Observable.from((function* () {
  let i = 0;
  while (true) yield i++;
}()).take(5)

In order to accomplish what you're saying, you'd need to queue-schedule all calls to observer.next, and wait for subscribe to finish before you could send them to the next observable down the stream (in take). In the case of these synchronous observables (which we know we need for EventTarget), operators like take need to be able to synchronously flag that they're done receiving values by unsubscribing. This means that in order to write an Observable that did a range as above, any average user would need access to the scheduling mechanism so that they could schedule each next recursively, enabling them to synchronously check whether the consumer was done (observer.closed), otherwise they risk building a huge queue in memory.

Unless this is an angle to try to get coroutine-type behavior out of Observable. Which still wouldn't prevent convoluted solutions to the above problems.

@zenparsing
Copy link
Member Author

zenparsing commented Dec 22, 2016

Observable.range(0, 1000000000000).take(5)

Observable.range would have to be implemented something like this:

Observable.range = function(from, to) {
  return new this(sink => {
    enqueueJob(() => {
      for (let i = from; i < to; ++i) {
        if (sink.closed) { return; }
        sink.next(i);
      }
      sink.complete();
    });
  });
};

Same goes for Observable.of, Observable.from(iterable), etc.

@trxcllnt
Copy link

trxcllnt commented Dec 22, 2016

@zenparsing the example here was meant to illustrate a case where asynchronous subscription precludes solving some task, in that case a recursive DFS, as async subscription implicitly requires trampolining, which is a breadth-first strategy. Unfortunately it seems the message was lost in translation, perhaps because many people don't do potentially-async DFS in their day jobs.

The Falcor-Router project is an example of Observables in the wild that relies on synchronous subscription. Routes return Observables (or things we can make into Observables) that produce values. We have to get those values synchronously, as they're inserted into a graph of aggregated results, and that graph decides which routes to execute next.

@benlesh
Copy link

benlesh commented Dec 23, 2016

Observable.range would have to be implemented something like this:

I think that implementation is missing a check of sink.closed in the for loop conditional. Otherwise take will not work, which was the other part of my query.

So everyone that wants to create a synchronous observable needs to have access to this enqueueJob functionality and understand its implications? Presumably observer will error if you try to synchronously next during subscription? What does that do? Send the error, asynchronously, but on the same job down the error path?

If someone wants to notify the moment a WebSocket starts trying to connect, they can't do it, unless they have this enqueueJob function. If someone wants to notify on subscription started in any way, they cannot do it, period.

// this isn't possible with the proposed change.
const measuredData = new Observable(observer => {
  observer.next({ type: 'START', ts: performance.now() });
  doSomeAsyncThing((data) => {
    observer.next({ type: 'END', ts: performance.now(), data });
    observer.complete();
  });
});

Given that this is the only real goal of this issue:

Explore the possibility of eliminating eager notification in order to make Observable easier to understand and less error-prone.

If a major point of this design change is to improve understandability, I think it's plainly failed. The ergonomics are just as bad if not worse. Users are still required to know that sometimes observables are synchronous and sometimes they're not, but they must also understand that calls to observer.next within the Observable constructor are either asynchronous or not allowed, and then have some knowledge of how to write or use some existing trampoline/queue scheduler and what that even means.

@zenparsing
Copy link
Member Author

I think that implementation is missing a check of sink.closed in the for loop conditional.

Right, fixed.

If someone wants to notify the moment a WebSocket starts trying to connect, they can't do it, unless they have this enqueueJob function.

I don't understand this use case.

If someone wants to notify on subscription started in any way, they cannot do it, period.

That seems like a reasonable use case, thanks.

Please keep in mind that I'm not advancing any proposal. I'm just trying to better understand why we allow this behavior that causes a bunch of trouble (like #119, apparently). Also, I'm really exhausted from arguing about this stuff.

@zenparsing
Copy link
Member Author

@trxcllnt Thanks for the links (although it will probably take me some time to unpack them 😄 )

@benlesh
Copy link

benlesh commented Dec 23, 2016

I don't understand this use case.
That seems like a reasonable use case, thanks.

Sorry, they're basically the same use case. Measuring a time to connection or the like.

@benlesh
Copy link

benlesh commented Dec 23, 2016

Also, I'm really exhausted from arguing about this stuff.

Same.

I hope none of this seems personal. Although I can understand @trxcllnt's frustration because I feel like we've talked about several of these issues before, I think it's probably worth rehashing them because it's been a while. After all, these are questions that I'm sure the committee will bring up multiple times, as I'm sure a lot of them don't have their heads as fully into this proposal as some of us do. They're busy people and there are a lot of proposals.

@trxcllnt
Copy link

Oh right, I forgot groupBy has to have sync subscription.

@benjamingr
Copy link

@trxcllnt - mind elaborating on groupBy?

I also don't understand why trampolining forces DFS to BFS. Here's regular DFS:

var graph = {
  a: {b: {c:true}, d: {}},
  e: {}
};
function dfs(g, predicate) {
  for(const edge in g) {
    //console.log(edge, predicate(g[edge]));
    if(predicate(g[edge])) return g[edge];
    var rec = dfs(g[edge], predicate);
    if(rec) return rec;
  }
}
dfs(graph, x => x === true); // returns true, and if you uncomment the log logs a,b,c

Now, let's make the DFS async with promises which trampoline (let's make graph expansion async and the predicate async:

const getEdges = async o => Object.keys(o); 
async function dfs(g, predicate) {
  for(const edge of await getEdges(g)) {
    if(await predicate(g[edge])) return g[edge];
    var rec = await dfs(g[edge], predicate);
    if(rec) return rec;
  }
}
dfs(graph, async x => x === true)

Now that both getEdges and predicate have to trampoline I'm still getting the same order of execution.

@benlesh
Copy link

benlesh commented Dec 26, 2016

@benjamingr. Because as soon as you get a new value to group you have to create a subject or observable and next into it synchronously. In order for any consumer to use that value they have to be able to subscribe also synchronously.

@benjamingr
Copy link

@Blesh yeah that's a much more motivating use case than the DFS :)

@trxcllnt
Copy link

@benjamingr sorry, should have clarified. the DFS behavior isn't deterministic -- it's contingent on whether falcor-router route handlers return synchronous Observables (or Arrays, etc.). This is by design. async-await doesn't allow sync notification, so it can't model the problem. And that's the point -- forcing async subscription necessarily limits the number of possible systems that can be expressed.

@benjamingr
Copy link

@trxcllnt cool, I understand. It's generally anything that's groupBy "like".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants