Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Core semantics #6434

Merged
merged 5 commits into from Jul 5, 2021
Merged

Core semantics #6434

merged 5 commits into from Jul 5, 2021

Conversation

benlesh
Copy link
Member

@benlesh benlesh commented May 28, 2021

This is a starting place to put together a document that outlines the semantics of functionality provided by the core library. This will be used as a reference for the "correct" behaviors of the library starting in version 8. We probably can't do that in version 7 without breaking someone, but it's definitely something we want to get defined, as it keeps coming.

@benlesh benlesh added 8.x Issues and PRs for version 8.x AGENDA ITEM Flagged for discussion at core team meetings labels May 28, 2021
@@ -47,8 +48,6 @@ Before you submit your Pull Request (PR) consider the following guidelines:

- Create your patch, following [code style guidelines](#coding-style-guidelines), and **including appropriate test cases**.
- Run the full test suite and ensure that all tests pass.
- Run the micro and macro performance tests against your feature branch and compare against master
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was unrelated, but I saw it and it needs to be removed.

- [Subject](#subject)
- [Body](#body)
- [Footer](#footer)
- [Contributing to RxJS](#contributing-to-rxjs)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was automatic.

- If the operation performed by the operator can tell it not change anything about the output of the source, it MUST return the reference to the source. For example `take(Infinity)` or `skip(0)`.
- Operators that accept a "notifier", that is another observable source that is used to trigger some behavior, must accept any type that can be converted to an `Observable` with `from`. For example `takeUntil`.
- "Notifiers" that are provided to operators are expected to emit a value in order to trigger notification. Completion does not count as a notification.
- "Notifiers" provided directly to the operator MUST be subscribed to *before* the source is subscribed to. "Notifiers" created via factory function provided to the operator SHOULD be subscribed to at the earliest possible moment.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

questions that might be worth also answering here:

  • which types of value emission should be allowed? maybe also add the TS type that is required (I guess ObservableInput<any>)
  • what happens if a notifier errors before/after next?
  • how to deal with subscription references of notifiers when they error/complete without next notification?
  • should notifiers always be one-time emissions? basically, they always behave like from(notifier).pipe(take(1)) from the operator's point of view?
  • "Notifiers must be unsubscribed from at the earliest possible moment once they notified or when the operator is not interested in being notified anymore"?


The purpose of these semantics is provide predictable behavior for the users of our library, and to ensure consistent behavior between our many different operators.

## Operators
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where applicable it would be very helpful to have "dos and don'ts" examples

- Events that happen after the completion of a source should happen after the source finalizes. This is to ensure that finalization always happens in a predictable time frame relative to the event.
- `Error` objects should never be retained longer than necessary. This is a possible source of memory pressure.
- `Promise` references should never be retained longer than necessary. This is a possible source of memory pressure.
- Operators that split a source `Observable<T>` into many child observables `Observable<Observable<T>>` should emit child observables that do not stop because the original consumer subscription is unsubscribed. This is because those child observables may be consumed outside of that subscription lifecycle. For example, a user could capture a grouped (child) observable emitted from `groupBy` and subscribe to it elsewhere. The purpose of the `groupBy` was to create observables, not dictate their lifespan.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be helpful to explain when to ultimately unsubscribe from the source and how to "detect" this case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just working on a window style custom operator and while looking at the window source code I noticed that this operator family unsubscribes the nested observables immediately when the primary subscription is unsubscribed from:
https://stackblitz.com/edit/qsuarg?devtoolsheight=33&file=index.ts

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The more i dig into the window/groupBy use cases the more I think that it is much more intuitive if the group lifecycles are tied to the top-level subscription and consuming them outside of its context is an anti-pattern, if we talk about window/groupBy as operators. is there some widely used pattern that I missed that requires this?

i also tried to compare it to the refactorings done to the multicasting operators, where connect/share operators are also entirely handled within the primary subscription(s), while the connectable observable factory was a part of the multicast operator that was taken out of pipeline/operators into an observable factory so the connect method can be handled outside of the scope of a primary subscription.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how to "detect" this case.

If the operator takes a source of Observable<A> returns Observable<Observable<B>>, (windowX, groupBy) that's how you'd know. There's nothing to really "detect" in this case.

The more i dig into the window/groupBy use cases the more I think that it is much more intuitive...

Yeah, I don't disagree. I'm putting this in here as a starting/discussion point. A lot of people don't even realize that this is a thing. So, I've added it to the core semantics document so the core team can agree on a behavior and move that way over the course of 7 -> 8 -> 9 versions. It very well might be the correct decision to stop groupBy from behaving this way. Or maybe we need to add some sort of "eject" functionality... either way, it's a point of discussion, and I'm glad you're picking up on it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's nothing to really "detect" in this case.

by "detect" I meant, how does the operator find out when to unsubscribe from the source. I know how groupBy does it and window* doesn't do it, but I would expect a general guideline here on what the correct "detection mechanism" is.


After giving this some more thought on my side, I have to justify my original opinion on this topic.
Lets first talk about window*:

const sub1 = of(1, 2, 3)
  .pipe(
    windowCount(1),
    take(1),
    mergeAll()
  ).subscribe()
  // I would expect this to emit `1` (it currently does not).

let w: Observable<number>;
const sub2 = of(1, 2, 3)
  .pipe(
    windowCount(1),
    take(1),
  )
  .subscribe((v) => void (w = v));
w.subscribe();
// I would expect this to be UB,
// it will never emit in a sync case and may or may not work in the async case

let w: Observable<number>;
const sub3 = of(1, 2, 3)
  .pipe(
    windowCount(1),
    take(1),
  )
  .subscribe((v) => void v.subscribe());
  // this could work and emit `1` (it currently does not),
  // but I would consider this bad practice and should be discouraged.
  // if it eases implementation then UB would be fine too IMHO.

why should the first example work? the reason is the semantics of the inner observable. it has a clear definition of its lifecycle: the first window starts before the first source emission, emits the following 1 next notification, and then completes. since windowCount only operates on the sequence of notifications (and not the contents of notifications) and/or external notifiers (on other variants of the window operator) it is "predictable" what the inner observables will emit and I can imagine uses cases for chaining it with take et al.

this is also still in line with what I originally said: the observable is consumed within its "primary" subscription (referring to sub1 here), as the nested/inner subscriptions are "invisible" to the user. this is not the case for the second example.

in the groupBy case it's basically the same, but I just don't see any use case for pipe(groupBy(selector), take(n)) without having the possibility to control group openings. this was proposed in #6425. currently this can be worked around with something like pipe(startWith(x, y, z), groupBy(selector), map(skip(1)), take(n)), but it does not feel very nice.

Copy link
Member

@trxcllnt trxcllnt Jul 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Operators that split a source Observable<T> into many child observables Observable<Observable<T>> should emit child observables that do not stop because the original consumer subscription is unsubscribed.

Why not just multicast the groupBy() Observable and manually connect it?

The inner Observables are tied to the lifetime of the groupBy() subscriber because Observables are pure functions. The inner GroupedObservables are not meant to leak outside the monad:

// good
src.groupBy().flatMap((group) => {
   return group.map((x) => `${group.key}:${x}`);
}).subscribe(console.log.bind(console));

// bad
src.groupBy().subscribe((group) => {
   group.subscribe((x) => console.log(`${group.key}:${x}`));
});

Multiple subscriptions to the inner GroupedObservables (or leaking them via subscribe()) necessarily represents mutable state, which is what multicast() and connect() are for:

const groups = src.groupBy()
   .map((group) => group.map((x) => `${group.key}:${x}`))
   .multicast();

const groupState = groups.connect();

groups.subscribe((group) => {
   group.subscribe((x) => console.log(`${x}-1`));
   group.subscribe((x) => console.log(`${x}-2`));
});

// kill all subscribers to all inner groups
setTimeout(() => groupState.unsubscribe(), 1000);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@trxcllnt LMAO. I swear I didn't write groupBy to behave like this, I thought you did. I agree it's weird. I only recorded this semantic because I wanted to draw attention to it. I'd be much happier if it did not have that behavior.

- "Notifiers" that are provided to operators are expected to emit a value in order to trigger notification. Completion does not count as a notification.
- "Notifiers" provided directly to the operator MUST be subscribed to *before* the source is subscribed to. "Notifiers" created via factory function provided to the operator SHOULD be subscribed to at the earliest possible moment.
- The observable returned by the operator function is considered to be the "consumer" of the source. As such, the consumer MUST unsubscribe from the source as soon as it knows it no longer needs values before proceeding to do _any_ action.
- Events that happen after the completion of a source should happen after the source finalizes. This is to ensure that finalization always happens in a predictable time frame relative to the event.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"event" is a confusing term here. is general behavior meant, like cleanup or downstream notifications?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably link terms to the glossary.


- MUST be a function that returns an [operator function](https://rxjs.dev/api/index/interface/OperatorFunction). That is `(source: Observable<In>) => Observable<Out>`.
- The observable returned by the operator function MUST subscribe to the source.
- If the operation performed by the operator can tell it not change anything about the output of the source, it MUST return the reference to the source. For example `take(Infinity)` or `skip(0)`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: ...can tell it does not change... I had trouble parsing this sentence without that

Comment on lines 19 to 20
- `Error` objects should never be retained longer than necessary. This is a possible source of memory pressure.
- `Promise` references should never be retained longer than necessary. This is a possible source of memory pressure.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give examples or some guidance on what "longer than necessary" means? If I ever plan on sending the error to the user then, by definition, it will be necessary.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what else to put here. Basically like creating an Error object well in advance of some async throw. They should be created on-demand if at all possible.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And if you create a rejected Promise and don't call .catch() before the next turn of the event loop, node will throw an unhandledRejection event.

@benlesh benlesh removed the AGENDA ITEM Flagged for discussion at core team meetings label Jun 2, 2021
@benlesh benlesh changed the title WIP: Core semantics Core semantics Jun 21, 2021
@benlesh benlesh added AGENDA ITEM Flagged for discussion at core team meetings 7.x Issues and PRs for version 6.x labels Jun 21, 2021
@benlesh
Copy link
Member Author

benlesh commented Jun 21, 2021

@ReactiveX/rxjs-core I think we should try to get this landed and iterate on it. Otherwise it's going to flounder as a PR like most of our documentation PRs do. haha. I think the main thing I want to settle though is around the "splitting operators" mentioned above.

Copy link
Member

@kolodny kolodny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I'm very happy with having a doc like this to get everything on the same mindset. Agree that we can iterate on the harder splitting operators.

Copy link
Collaborator

@cartant cartant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving this, but IMO the referentially-transparent bit isn't quite right.

## Operators

- MUST be a function that returns an [operator function](https://rxjs.dev/api/index/interface/OperatorFunction). That is `(source: Observable<In>) => Observable<Out>`.
- The returned operator function MUST be [referentially transparent](https://en.wikipedia.org/wiki/Referential_transparency). That is to say, that if you capture the return value of the operator (e.g. `const double => map(x => x + x)`), you can use that value to operate on any many observables as you like without changing any underlying state in the operator reference. (e.g. `a$.pipe(double)` and `b$.pipe(double)`).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the operator that must be referentially transparent; not the operator function. There should be no requirement for the returned function to be referentially transparent.

Calls to share must be referentially transparent, but calls to the operator function - that share returns - are not.

My understanding of referential transparency is that a function is referentially transparent if you can replace its being called with the value that it returns, so it's the operator that needs to be referentially transparent for this to make sense:

That is to say, that if you capture the return value of the operator...


- MUST be a function that returns an [operator function](https://rxjs.dev/api/index/interface/OperatorFunction). That is `(source: Observable<In>) => Observable<Out>`.
- The returned operator function MUST be [referentially transparent](https://en.wikipedia.org/wiki/Referential_transparency). That is to say, that if you capture the return value of the operator (e.g. `const double => map(x => x + x)`), you can use that value to operate on any many observables as you like without changing any underlying state in the operator reference. (e.g. `a$.pipe(double)` and `b$.pipe(double)`).
- The observable returned by the operator function MUST subscribe to the source.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this really be a 'MUST'. There will be no subscription to the source for a take(0), for example. Is that something that is going to have to change?

@benlesh
Copy link
Member Author

benlesh commented Jul 5, 2021

Okay... I'm going to merge this as-is, and we can make subsequent PRs to clean up the minutiae. I will be VERY happy to fix the semantics of groupBy, which IMO are a nightmare. I'm glad @trxcllnt pointed out that he didn't agree with it, either. haha.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
7.x Issues and PRs for version 6.x 8.x Issues and PRs for version 8.x AGENDA ITEM Flagged for discussion at core team meetings
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants