Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize for single observer #1121

Closed
benjamingr opened this issue Dec 31, 2015 · 10 comments
Closed

Optimize for single observer #1121

benjamingr opened this issue Dec 31, 2015 · 10 comments
Labels
feature PRs and issues for features
Milestone

Comments

@benjamingr
Copy link
Contributor

Continuing my discussion with @Blesh from angular/angular#5876

We're talking about cases like clicks(myElement).map(fn1).filter(fn2).subscribe(...).

I wrote there:

I'd expect Rx to optimize this under the hood and not really create a new subscription/observable/observer until it has to (that is, I'd expect it to "forward" the original subscription and then copy on write when a second subscriber is attached). The most common case at least for things like http calls (but definitely not the only one) is to have a single subscriber. I guess I'd expect RxJS to optimize for that (like bluebird does).

@Blesh wrote:

... There could be similar optimizations for operators that are completely synchronous like filter, map and scan when chained back to back.

I wrote:

The unicast lazy operation model gives you the ability to aggressively optimize those calls - I think it could be a big win for cases where performance is an issue. I think "expect one subscriber, and copy on write when that assumption fails" would mainly give people who use a new RxJS observable for singular events a big performance boost - and even make it viable

Even asynchronous ones can be composed to a single action in production builds.


I think optimizing for single subscriber could be a big performance win, I think we can safely optimize flatMap(...).map(...) into flatMap(..., ...) and "bail out" if a second subscriber is added to the original result and similarly for most/all combinators. The single subscriber case is the most common in my code and I assume other code bases are no different - I think it should be optimized aggressively at the expense of multiple subscribers being initially a little slower.

@benlesh
Copy link
Member

benlesh commented Dec 31, 2015

cc/ @trxcllnt

This optimization is actually made a little trickier by the lift architecture we went with. If every operator had it's own Observable (the "pair" architecture we looked at originally), it would trivial implement map on those Observable types and have that cover the optimization by cloning the observable with the proper result selector.

With lift, this is more difficult because there is one, shared, Observable type that is being used through the operator chain.

Another difficulty is that right now, there's no back reference to the Subscriber that made the source Observable. So there's no way to clone it and add the appropriate transformation (filter, map, etc).

I think what we'd have to something like

  1. analyze this.operator in lift to get an idea what the source operator was and if it supports the optimization (flattening with the current operator).
  2. clone the operator in this.operator from within lift, and set the appropriate properties on it (resultSelector for example).

All-in-all, though, I'd worry about adding all of that conditional logic.

... I'm sure there's a better way.

@jhoguet
Copy link

jhoguet commented Jan 1, 2016

if this is too off-topic, please let me know and I will create another issue - that said, I was about to use the same title :)

I don't understand why the default is cold observables.

For instance...

const s = new Rx.Subject();

o = s.map(i => {
  console.log('side effect ' + i)
  return i * 2;
});

o.subscribe(v => console.log(v));
o.subscribe(v => console.log(v));

s.next(2)

outputs

side effect 2
4
side effect 2
4

But this is not what I would expect. I can fix this by using .share()

const s = new Rx.Subject();

o = s.map(i => {
  console.log('side effect ' + i)
  return i * 2;
}).share();

o.subscribe(v => console.log(v));
o.subscribe(v => console.log(v));

s.next(2)

outputs

side effect 2
4
4

But why would I need to do that? Given that the Subject is pushing the same input, I should get the same output to functions like map. Isn't this the whole point of shifting to an RFP approach? From my point of view this only benefits those that are causing side effects in map because otherwise they would never know or care. Given that you have an operator like do we could preserve the current default by making do be the point at which the observable stops sharing.

I suspect there is either a fundamental RFP reason for this which I have not learned yet, or this is incidental complexity caused by some implementation detail. If it is the latter I would love to find a way to fix that before 5 comes out of beta.

In summary, not only do I agree with @benjamingr that we should optimize for one observer, I think that multiple observers should be able to leverage memoized computation for the portion of the observable chain they share.

A compromise might be explicit side effects like do which might be the point where the implicit share stops. IMO I wouldn't even want this. I am constantly needing to make my cold observables hot with share, never the other way around.

@trxcllnt
Copy link
Member

trxcllnt commented Jan 1, 2016

@Blesh
I think what we'd have to something like

  1. analyze this.operator in lift to get an idea what the source operator was and if it supports the optimization (flattening with the current operator).
  2. clone the operator in this.operator from within lift, and set the appropriate properties on it (resultSelector for example).

Yes, though I'm not sure I agree this is more difficult with lift. lift just creates the Observable (or async function) linked list so that upon subscription (aka invocation), the Operators can create the Subscriber linked-list (aka computation state).

The difficulty I see with lift is the same problem in any single-inheritance Class implementation (the "diamond problem"). Subclasses of Observable that implement their own lift won't receive the benefit of optimizing certain operators.

But we can add to the Operator type so that the Operator implementations optimize the Subscriber linked list upon subscription. Even though we'd have the same number of intermediate Observables and Operator instances, we'd have fewer Subscriber instances (and thus, hops when notifying), and that's usually where I've seen potential performance problems.

That said, what we have today is already significantly improved over current Rx, so much so that I feel comfortable building my idea for a Falcor-integrated UI library on top of it.

@trxcllnt
Copy link
Member

trxcllnt commented Jan 7, 2016

@jhoguet as for your question, "cold" and "hot" don't refer to unicast vs. multicast, they refer to the state of the computation. An Observable represents an asynchronous computation (aka, it's a function that can return multiple values between now and infinity).

"Cold" Observables are just like functions which haven't been called (subscribed to) yet. Each time you call it (aka, subscribe to it), you're re-running whatever calculation the Observable performs.

"Hot" Observables are just regular cold Observables that you've shoved a Subject between you and the cold Observable source. When you subscribe to it, you're really subscribing to the Subject over and over. Subjects can have any number of Observers (just like EventEmitter, etc.), but the original cold source has only one Observer (the Subject that's sitting between you and the source).

When you "fix" your example by using share, you're really just inserting a Subject between the map Observable and your Observers.

@jhoguet
Copy link

jhoguet commented Jan 7, 2016

@trxcllnt thanks for the explanation. I logged a new issue so as not to diverge this issue any further than I already have. #1148

@monfera
Copy link

monfera commented Jan 8, 2016

@benjamingr you wrote:

We're talking about cases like clicks(myElement).map(fn1).filter(fn2).subscribe(...).
...
I'd expect Rx to optimize this under the hood and not really create a new subscription/observable/observer until it has to

Have transducers been considered, for example, with transducers, a chain of map, filter etc. applications would be fused into one application. Analogous with loop fusion.

@benlesh benlesh added feature PRs and issues for features type: discussion labels Jan 15, 2016
@benlesh benlesh added this to the 5.0.0 release milestone Jan 15, 2016
@jadbox
Copy link
Contributor

jadbox commented Feb 14, 2016

@monfera Interesting idea: single observer optimize [at least] the transducer route for hot regions of code where performance matters

@benjamingr
Copy link
Contributor Author

@jadbox @Blesh recently worked on a PR that transduces (in the terminology used here) chains of map #1323

@benlesh
Copy link
Member

benlesh commented Jun 17, 2016

I'm going to close this issue, as it's gone stale. But collapsing operators is still on my radar.

@benlesh benlesh closed this as completed Jun 17, 2016
@lock
Copy link

lock bot commented Jun 7, 2018

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked as resolved and limited conversation to collaborators Jun 7, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
feature PRs and issues for features
Projects
None yet
Development

No branches or pull requests

6 participants