
fix(subject): Faster subscriptions management in Subject #7231

Closed

dubzzz wants to merge 7 commits into ReactiveX:master from dubzzz:faster-subjects-unsub

Conversation

@dubzzz
Contributor

dubzzz commented Mar 23, 2023

Description:

The current implementation of Subject is backed by an array of observers. When an observer unsubscribes, the Subject performs a linear scan over that array and removes the matching entry, so the complexity of a single unsubscribe is O(n), with n being the number of observers connected to the Subject.

As a result, code like the following becomes very slow:

```js
const allSubscriptions = [];
const source = new Subject();
for (let index = 0; index !== 1_000_000; ++index) {
  allSubscriptions.push(source.subscribe()); // rather quick
}
for (const subscription of allSubscriptions) {
  subscription.unsubscribe(); // taking very long
}
```

The proposed approach consists of changing the backing collection from an array to a Map. The downside is that we somewhat lose the ability to set subject.observers directly (this might possibly be patched in some way).
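
To make the idea concrete, here is a minimal sketch of the Map-based bookkeeping (illustration only, not the actual RxJS implementation; `TinySubject` and its shape are made up): each subscriber is stored under a unique key, so removing it is a single `Map#delete` instead of a linear scan.

```ts
interface Observer<T> {
  next: (value: T) => void;
}

// Illustration only: not the real RxJS Subject, just the bookkeeping idea.
class TinySubject<T> {
  private observersMap = new Map<number, Observer<T>>();
  private nextKey = 0;

  subscribe(observer: Observer<T>): { unsubscribe: () => void } {
    const key = this.nextKey++;
    this.observersMap.set(key, observer);
    return {
      // Removal is a delete by key (O(1) on average), no scan over an array of observers.
      unsubscribe: () => {
        this.observersMap.delete(key);
      },
    };
  }

  next(value: T): void {
    for (const observer of this.observersMap.values()) {
      observer.next(value);
    }
  }
}
```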

Running `cross-env TS_NODE_PROJECT=tsconfig.mocha.json mocha --config spec/support/.mocharc.js "spec/**/Subje*-spec.ts"` with the new test added gives:

  • Now: 452ms
  • Before: 10s

BREAKING CHANGE:

Subject.observers has been impacted. If needed, we can probably make it backward compatible in some way. Here is what changed (a small sketch follows the list):

  • Now a getter property, was previously a basic property
  • The value changes from one call to another (we recompute it every time it is accessed)
  • There is no setter anymore
  • The value is never null
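
As a small illustration of those bullet points (a sketch only, not the actual RxJS code; the class name is made up), a read-only getter over the Map could look like this:

```ts
// Sketch only: a recomputed, read-only `observers` view over a Map of observers.
class ObserversView<T> {
  private observersMap = new Map<number, T>();

  // Getter with no setter: recomputed on every access, and never null
  // (an empty Map simply yields an empty array).
  get observers(): T[] {
    return Array.from(this.observersMap.values());
  }
}
```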

Related issue (if exists):

```ts
let numResultsReceived = 0;
const allSubscriptions: Subscription[] = [];
const source = new Subject<number>();
const numSubscribers = 100000;
```

Contributor Author

With 100k items, the version of the code currently on master exceeds the mocha timeout on a GitHub Codespace (it reached 10s, whereas the new code takes around 400ms). But I don't really know where to set the bar: the more powerful the runner, the less likely the test is to fail even if we reintroduce a slow version. Maybe you have a suggestion for it 🤔
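
For context, a rough sketch of the kind of timing-based check being discussed (this is not the test shipped in the PR; the 100000 count and the 5s timeout are arbitrary assumptions, which is exactly the "where to set the bar" concern above):

```ts
import { Subject, Subscription } from 'rxjs';

// Sketch: subscribe many observers, then unsubscribe them all, and rely on mocha's
// timeout to flag a pathologically slow implementation. The threshold is runner-dependent.
it('should unsubscribe a very large number of observers quickly', function () {
  this.timeout(5000); // arbitrary bar: fail if the whole test takes longer than 5s

  const allSubscriptions: Subscription[] = [];
  const source = new Subject<number>();
  const numSubscribers = 100000;

  for (let index = 0; index !== numSubscribers; ++index) {
    allSubscriptions.push(source.subscribe());
  }
  for (const subscription of allSubscriptions) {
    subscription.unsubscribe();
  }
});
```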

Contributor Author

Maybe I should move it to another file, following the recommendations in https://github.com/ReactiveX/rxjs/blob/master/CONTRIBUTING.md#performance-tests

@kwonoj
Member

kwonoj commented Mar 23, 2023

I believe we had a similar sort of PR and did not make the changes, but my search fails to find the exact one.

Also, in my opinion this seems aimed at extreme cases: if you really have even more than 10,000 subscriptions, I'd expect the app's architecture to need revisiting. What is the real-world problem we'd like to fix with breaking changes?

@dubzzz
Contributor Author

dubzzz commented Mar 23, 2023

Also, in my opinion this seems aimed at extreme cases: if you really have even more than 10,000 subscriptions, I'd expect the app's architecture to need revisiting. What is the real-world problem we'd like to fix with breaking changes?

Good point! On my side, I patched the issue locally by packing all my subscriptions behind a single one, which pipes the data back to my original subscribers. My case was a grid containing thousands of cells, where each cell subscribes to a shared store to see updates. My local fix was to replace the subscribe done by each cell with a home-made version that adds and removes the subscribed callbacks in a Set, which is read by the single real subscription made on the source subject.
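
To make the described workaround concrete, here is an illustrative sketch (not the author's actual code; `createFanOut` and `addListener` are made-up names): one real subscription on the source Subject fans values out to a Set of callbacks, so attaching or detaching a cell is just a Set#add / Set#delete.

```ts
import { Subject } from 'rxjs';

// Illustrative sketch of the fan-out workaround described above.
function createFanOut<T>(source: Subject<T>) {
  const callbacks = new Set<(value: T) => void>();
  // The only real subscription on the source subject.
  const realSubscription = source.subscribe((value) => {
    for (const callback of callbacks) {
      callback(value);
    }
  });

  return {
    // A cell "subscribes" by registering a callback; removal is a Set#delete,
    // not a Subject unsubscribe.
    addListener(callback: (value: T) => void): () => void {
      callbacks.add(callback);
      return () => callbacks.delete(callback);
    },
    dispose(): void {
      realSubscription.unsubscribe();
      callbacks.clear();
    },
  };
}
```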

I thought about submitting this potential fix because, even if the problem is not that bad in small cases, we still pay for a linear look-up when we could have an immediate delete by switching to a Map. The idea was to improve very large cases a lot, but also smaller ones.

@dubzzz
Contributor Author

dubzzz commented Mar 23, 2023

I can probably start working on measuring the performance impact in various cases, including 1/2/10/100 subscribers and the time to subscribe/emit/unsubscribe, so that we can check whether it brings a negative (clearly a no-go) or a positive performance impact in small and nominal cases 🤔
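
For illustration, a rough benchmark sketch for those cases (the subscriber counts and emission count are arbitrary assumptions; this is not an official RxJS benchmark):

```ts
import { performance } from 'node:perf_hooks';
import { Subject } from 'rxjs';

// Rough timing sketch: subscribe N observers, emit 1000 values, then unsubscribe them all.
for (const numSubscribers of [1, 2, 10, 100]) {
  const source = new Subject<number>();

  const startSubscribe = performance.now();
  const subscriptions = Array.from({ length: numSubscribers }, () => source.subscribe());
  const subscribeMs = performance.now() - startSubscribe;

  const startEmit = performance.now();
  for (let i = 0; i !== 1000; ++i) {
    source.next(i);
  }
  const emitMs = performance.now() - startEmit;

  const startUnsubscribe = performance.now();
  for (const subscription of subscriptions) {
    subscription.unsubscribe();
  }
  const unsubscribeMs = performance.now() - startUnsubscribe;

  console.log(
    `${numSubscribers} subscribers: subscribe=${subscribeMs.toFixed(3)}ms, ` +
      `emit x1000=${emitMs.toFixed(3)}ms, unsubscribe=${unsubscribeMs.toFixed(3)}ms`
  );
}
```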

@kwonoj
Member

kwonoj commented Mar 23, 2023

Before doing any additional work, we may need some archaeology to confirm whether there were public / internal discussions about this and why we didn't do it. It would be bad if you spent a lot of time preparing the PR and then we found out there is some historical reason we can't land this.

@kwonoj kwonoj added the blocked and AGENDA ITEM (flagged for discussion at core team meetings) labels Mar 23, 2023
@benlesh
Member

benlesh commented Mar 25, 2023

I added a test that passes in master locally just to make sure this didn't break anything. However, the tests have been running for > 13 minutes now, so I suspect something is wrong. This is a decent idea, but it makes the changes in the wrong spots, and removes other optimizations.

benlesh added 4 commits March 24, 2023 22:00
+ Re-adds the currentObservers optimization
+ Uses a `for(;;)` array index loop to avoid creating an iterator for the currentObservers array on each action (`next`, `error`, `complete`)
+ Moves the `Map` usage to the `observers`, where we should actually be tracking what observers we really have
+ Moves `observers` property to be internal
+ Adds some documentation to the `currentObservers` property to prevent future confusion.

BREAKING CHANGE: `Subject#observers` is now internal. If you wanted to use this property to check to see if the subject is observed, use `Subject#observed` instead.
This is what I get for using the in-browser IDE.
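
For illustration, here is a rough sketch of the shape those commits describe (not the code from the PR; `SketchSubject` is a made-up name): observers tracked in a Map, a cached currentObservers snapshot invalidated on changes, and a plain index-based loop during emission.

```ts
interface SketchObserver<T> {
  next: (value: T) => void;
}

// Illustration only: Map-tracked observers plus a cached snapshot array for emission.
class SketchSubject<T> {
  // Internal bookkeeping of which observers we really have.
  private observersMap = new Map<number, SketchObserver<T>>();
  private nextKey = 0;
  // Cached snapshot used while emitting; rebuilt lazily after subscribe/unsubscribe.
  private currentObservers: SketchObserver<T>[] | null = null;

  subscribe(observer: SketchObserver<T>): () => void {
    const key = this.nextKey++;
    this.observersMap.set(key, observer);
    this.currentObservers = null; // invalidate the snapshot
    return () => {
      this.observersMap.delete(key);
      this.currentObservers = null;
    };
  }

  next(value: T): void {
    if (!this.currentObservers) {
      this.currentObservers = Array.from(this.observersMap.values());
    }
    const observers = this.currentObservers;
    // Index-based loop: avoids creating an iterator on every emission.
    for (let i = 0; i < observers.length; i++) {
      observers[i].next(value);
    }
  }

  // Analogue of `Subject#observed`: check whether anyone is listening.
  get observed(): boolean {
    return this.observersMap.size > 0;
  }
}
```
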
@benlesh
Member

benlesh commented Mar 25, 2023

lol...using the in-browser IDE is great, except I can't run the tests. I should really start using Stackblitz for this. 🤔

@benlesh benlesh self-assigned this Mar 25, 2023
@benlesh
Member

benlesh commented Mar 25, 2023

Actually... wait, I'm moving too fast because I was excited about this optimization. This particular code won't work, you can see it was breaking a lot of tests even on that initial commit, even beyond the test I added. ...I'll rollback my other changes. ...but I'll think about this a bit. This can probably be optimized.

@benlesh
Member

benlesh commented Mar 25, 2023

Since I destroyed this one with the Web IDE, I'm going to close it and reopen it in its original state at #7234.

@benlesh benlesh closed this Mar 25, 2023
