fix(subject): Faster subscriptions management in Subject#7231
fix(subject): Faster subscriptions management in Subject#7231dubzzz wants to merge 7 commits intoReactiveX:masterfrom
Conversation
The current implementation of the Subject was backing itself on an array of observers. When one observer unsubscribed itself, it was performing a linear scan on the array of observers and dropping the item from the array of observers. The algorithm complexity of this operation was O(n) with n being the number of observers connected to the Subject.
It caused code like:
```js
const allSubscriptions = [];
const source = new Subject();
for (let index = 0 ; index !== 1_000_000 ; ++index) {
allSubscriptions.push(source.subscribe()); // rather quick
}
for (const subscription of allSubscriptions) {
subscription.unsubscribe(); // taking very long
}
```
The proposed approach consists into changing our backing collection (an array) into a Map. But we lose somehow the set capability on subject.observers (might possibly be patched in some way).
Following that change the command `cross-env TS_NODE_PROJECT=tsconfig.mocha.json mocha --config spec/support/.mocharc.js "spec/**/Subje*-spec.ts"` passed to 452ms from 10s.
| let numResultsReceived = 0; | ||
| const allSubscriptions: Subscription[] = []; | ||
| const source = new Subject<number>(); | ||
| const numSubscribers = 100000; |
There was a problem hiding this comment.
With 100k items, the version of the code currently on master exceeds the timeout of mocha on a GitHub Codespaces (it reached 10s contrary to new code taking around 400ms). But I don't really know which bar I should put their as the more powerful runners will be the more unlikely the failure even if we reintroduce a slow version. Maybe you have suggestion of it 🤔
There was a problem hiding this comment.
May I should move it to another file by following recommendations in https://github.com/ReactiveX/rxjs/blob/master/CONTRIBUTING.md#performance-tests
|
I believe we had similar sort of PR and did not made changes, but search fails me tonfind exact one. Also in my opinion this seems for extreme cases: if you really have more than even 10,000 subscriptions I'd expect app's architecture need to be revisited. What are the real world problem we'd like to fix with breaking changes? |
Good point! On my side, I patched the issue locally by packing all my subscriptions behind a single one, itself pipping data back to my original subscribers. My case was a grid containing thousands of cells in which each cell subscribes to a shared store to see updates. My local fix has been to drop the subscribe done by the cell to an home-made version of a subscribe dealing with adding and removing subscribed callbacks into a Set that will be read by the single real subscription done on the source subject. I though about submitting this potential fix as actually even if the problem is not that horrible in small cases, we still have extra time taken by a linear look-up while we could just have a immediate delete by switching to a Map. Idea was to improve a lot very huge case but also smaller ones. |
|
I can probably start to work onto measuring the performance impact on various cases including: 1/2/10/100 subscribers and time to subscribe/emit/unsubscribe. So that we can check whether or not it could bring negative (clearly a no-go) or positive performance impact on small and nominal cases 🤔 |
|
Before doing any additional work may need some archaeology to confirm if there was a public / internal discussions for this and why we didn't. It would be bad if you spend amount of time to prepare PR then we find out some historical reason we can't land this. |
|
I added a test that passes in |
+ Re-adds the currentObservers optimization + Uses a `for(;;)` array index loop to avoid creating an iterator for the currentObservers array on each action (`next`, `error`, `complete`) + Moves the `Map` usage to the `observers`, where we should actually be tracking what observers we really have + Moves `observers` property to be internal + Adds some documentation to the `currentObservers` property to prevent future confusion. BREAKING CHANGE: `Subject#observers` is now internal. If you wanted to use this property to check to see if the subject is observed, use `Subject#observed` instead.
This is what I get for using the in-browser IDE.
|
lol...using the in-browser IDE is great, except I can't run the tests. I should really start using Stackblitz for this. 🤔 |
|
Actually... wait, I'm moving too fast because I was excited about this optimization. This particular code won't work, you can see it was breaking a lot of tests even on that initial commit, even beyond the test I added. ...I'll rollback my other changes. ...but I'll think about this a bit. This can probably be optimized. |
|
Since I destroyed this one with the Web IDE, I'm going to close it and reopen it in it's original state at #7234 . |
Description:
The current implementation of the Subject was backing itself on an array of observers. When one observer unsubscribed itself, it was performing a linear scan on the array of observers and dropping the item from the array of observers. The algorithm complexity of this operation was O(n) with n being the number of observers connected to the Subject.
It caused code like:
The proposed approach consists into changing our backing collection (an array) into a Map. But we lose somehow the set capability on subject.observers (might possibly be patched in some way).
The command
cross-env TS_NODE_PROJECT=tsconfig.mocha.json mocha --config spec/support/.mocharc.js "spec/**/Subje*-spec.ts"with the new test added:BREAKING CHANGE:
Subject.observers has been impacted. If needed we can probably make it backward compatible in some ways. Here is what changed:
Related issue (if exists):