Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Symbol.observer name #25

Closed
zenparsing opened this issue Jun 4, 2015 · 24 comments
Closed

Symbol.observer name #25

zenparsing opened this issue Jun 4, 2015 · 24 comments

Comments

@zenparsing
Copy link
Member

I'm not a fan of the "Symbol.observer" name, because nouns should only be used for function names if it names the thing being returned. In this case the "observer" is the argument and the "subscription" is the return value.

Are there any other options here? Throwing some things out:

  • Symbol.observe : Seems nice, but doesn't leave us with the option of using observe to name functions that return observables. Also, there was overlap with Object.observe, but it's unclear to me that there is still a conflict.
  • Symbol.subscribe : This would be confusing since we also have the "subscribe" method.
  • Symbol.subscribeSync : That's what it does, but bleh!
  • Symbol.startSubscription : Getting longer and longer...

Any thoughts?

@benjamingr
Copy link

Well, the good thing about it is that it's similar to Symbol.iterator

@domenic
Copy link
Member

domenic commented Jun 4, 2015

What is the motivation for separating subscribe and this method?

@zenparsing
Copy link
Member Author

@domenic Good question. The main semantic difference between the two is that subscribe schedules the execution of the subscriber function on a job queue, whereas @@observer executes the subscriber immediately.

The queued behavior is important for zalgo-avoidance: for general usage we don't want any notifications to occur before the "subscribe" method call has returned.

The synchronous subscription behavior of @@observer is motivated by performance concerns. The intention is to allow libraries/combinators the ability to start an upstream subscription without having to schedule anything on the job queue. In those situations (like the map/filter examples) zalgo-avoidance is not really a problem.

It is less than ideal to have two functions whose behavior is so similar. The downside is that users implementing domain-specific combinators might be confused about which method is appropriate for their situation.

@domenic
Copy link
Member

domenic commented Jun 4, 2015

Yes, I do not think it's defensible to have both. Either zalgo-avoidance is important (and in fact critical, to avoid plan interference attacks) or it is not. The function-class perspective tends to indicate to me that it is not, but I haven't studied enough the usage of observables to see whether that would open up plan interference attacks. On the other hand, the async-programming-primitive perspective tends to indicate zalgo-avoidance is critical, and that observables should have sync subscribe if and only if promises get one.

@zenparsing
Copy link
Member Author

Various individuals at the TC39 meeting agreed that zalgo-avoidance was critical, and that synchronous subscribe had resulted in bugs.

@domenic
Copy link
Member

domenic commented Jun 4, 2015

Cool, then I think we cannot defend the presence of synchronous subscribe at all. This method should be asynchronous, should be named [Symbol.subscribe](), and either subscribe() should not exist or it should be an alias (like [Symbol.iterator]() and entries()).

@benlesh
Copy link

benlesh commented Jun 4, 2015

Cool, then I think we cannot defend the presence of synchronous subscribe at all.

I used to be in this camp. But now I disagree.

The idea to put the synchronous subscribe method on a Symbol came from @briancavalier, during a phone meeting between he and @jhusain and I. That call started with Brian and I opposing the need for a synchronous subscribe method, while we were trying to come up with a spec for streams-library interop.

It's important for developers interested in real reactive streams programming (in which you actually manage your side effects appropriately) to have synchronous subscription for performance reasons. For more naive use-cases and what will be short-term general use, the "zalgo-avoidance" method being the recommended, easy-to-discover method of subscription is totally acceptable (in fact, I think it's awesome!).

For more advanced use cases where developers are creating operators, particularly operators that do things like flatten other Observables, there absolutely must be a synchronous subscription method. Removal of the synchronous subscription method would cripple Observable's ability to compose in a performant manner by hindering the development of performant combinators. Scheduling a task for each subscription in nested flatMaps would perform horribly.

I agree that having a zalgo-avoidance method is critical, but I think it's just as critical to have a synchronous subscribe for advanced use cases.

@domenic
Copy link
Member

domenic commented Jun 4, 2015

I don't think performance arguments without data hold any value.

And I think that an analogous argument can be made for promises, so you must convince the committee of both at the same time for this to be acceptable.

@benjamingr
Copy link

An important thing to realize is that the overhead for an async function call is the same if you have 10000 calls or 1 call since it groups all the calls and executes them synchronously after a single timer.

In addition, implementations can detect that the subscriber fired the event asynchronously (since they know before and after the observer function ran) (that is - post subscribe) and avoid the async overhead altogether and execute it synchronously (while still keeping async semantics).

Also +1 for "I don't think performance arguments without data hold any value"

@zenparsing
Copy link
Member Author

I tried to gather some data on this, using v8's internal job queueing mechanism:

#3 (comment)

Clearly, scheduling jobs is less performant than directly calling a function (possibly with inlining). However, I don't think those numbers are very useful. It would be really great to have performance data over real-world observable use cases with combinators and flatMaps and everything.

@benjamingr
Copy link

@zenparsing and it appears @domenic already mentioned what I said here at #3 (comment)

That's a common misconception, but not true. Once you're in the microtask queue, you've already paid the cost, and it won't be paid again.

@zenparsing
Copy link
Member Author

@jhusain You might be interested in this thread : )

I tried doing the mouse drag demo, using "subscribe" for the combinators instead of @@observer and couldn't perceive any difference. That makes perfect sense for mouse drags, because you only perform a couple of subscriptions ("mouseup" and "mousemove") for each "mousedown".

If you think of the events in a table, the table for the mouse drag demo is very "wide": the inner notifications dominate the outer notifications.

For the synchronous @@observer behavior to make any noticeable performance impact, we'd need a table that was very "tall". That is, each inner observable would have to have relatively few elements, and the outer observable would have to have many elements or fire very frequently.

Can we think of any common real-world scenario which in which we would have such a "tall" event table?

@benlesh
Copy link

benlesh commented Jun 4, 2015

I don't think performance arguments without data hold any value.

I don't think we really need data to know that calling a function immediately performs better than scheduling it.

And I think that an analogous argument can be made for promises

Perhaps, but Promises are not sets, they're single values. The possible combinators over single values are limited. Promises aren't lazy, so subscription means basically nothing when discussing promises. Where subscription in terms of Observables has a lot of meaning.

That's a common misconception, but not true. Once you're in the microtask queue, you've already paid the cost, and it won't be paid again.

... except for in every async handler composed down a reactive stream.

var images = getAnArrayOfImageURLs();
var canvas = document.querySelector('#myCanvas');
var context = canvas.getContext('2d');

Observable.fromArray(images)
  .flatMap(src => {
      var img = new Image();
      img.src = src;
      return Rx.Observable.fromEvent(img, 'load').map(img);
   })
  .subscribe(img => {
     context.drawImage(img, 0, 0);
  });

So here, if we get rid of synchronous subscription methods entirely, under the hoods of flatMap and map, we're scheduling subscription on the same queue... but the "shared cost" ends there... On the next image load event fired we're scheduling a new queue, and then for every single load event after that. It just doesn't scale well. This is done for absolutely no real benefit to me as an end-user developer.

If flatMap and map were using @@observer (omg-zalgo) under the hood, the overhead of creating N queues for N image load events is gone... But since I used the subscribe (zalgo-safe) method in my code, I see no sign of Zalgo as a developer.

This is one, extremely contrived example.... but it's not terribly far fetched.

@domenic
Copy link
Member

domenic commented Jun 4, 2015

You are seriously arguing that the slow part of that code is the microtask, not, say, loading the image?

@benjamingr
Copy link

I don't think we really need data to know that calling a function immediately performs better than scheduling it.

Of course, the thing is like @domenic said is that the penalty for scheduling a function asynchronously happens once and can be avoided in practice in most cases. For example, in a chain of mapd observers you only need to perform the async deferral once - and even that's only true if you're only just setting the handlers up once you know you're in a future turn than the one you installed the handlers in you can invoke them synchronously.

This is a very simple optimization that some promise libraries do - basically you get async subscription semantics visibly but in practice your'e doing sync function dispatch most of the time.

Perhaps, but Promises are not sets, they're single values. The possible combinators over single values are limited. Promises aren't lazy, so subscription means basically nothing when discussing promises. Where subscription in terms of Observables has a lot of meaning.

Agreed, promises are immutable and not lazy - so you don't really have the issue of async dispatching needed in transformations but only in subscription - again, the spec can say that it requires asynchronous subscription on those but engines can actually perform synchronous subscription under the hood in a lot of cases.

On the next image load event fired we're scheduling a new queue, and then for every single load event after that. It just doesn't scale well.

Ignoring what Dom said about the image load totally dominating that use case - you'd only need a single async queue here.

If flatMap and map were using @@observer (omg-zalgo) under the hood, the overhead of creating N queues for N image load events is gone... But since I used the subscribe (zalgo-safe) method in my code, I see no sign of Zalgo as a developer.

Exactly - but there is nothing stopping an implementation from doing that anyway in that case :)

@benlesh
Copy link

benlesh commented Jun 4, 2015

You are seriously arguing that the slow part of that code is the microtask, not, say, loading the image?

@domenic Performance isn't about identifying the single, slowest part of a system. You look at the cost of all of the parts, and remove what isn't necessary. Images being slow to load, has nothing to do with the additional unnecessary cost of the microtask queue scheduling, and it's one example.

Ignoring what Dom said about the image load totally dominating that use case - you'd only need a single async queue here.

@benjamingr depends on the implementation, but brand new scheduling each time something comes back in a different frame (which all of the load events in the example will do) there's added overhead and there's no benefit.

@benlesh
Copy link

benlesh commented Jun 4, 2015

Either way, this isn't a discussion about removing [Symbol.observer] and it's off-topic.

@benlesh
Copy link

benlesh commented Jun 4, 2015

FWIW, I've never liked the name observer, I thought it should be observe. But I never liked Iterable.prototype.iterator either.

@briancavalier
Copy link

The idea to put the synchronous subscribe method on a Symbol came from @briancavalier, during a phone meeting between he and @jhusain and I

As @Blesh hinted, I'm strongly in favor of a primary API that never delivers an event to an observer in the call stack in which it begins observing.

I believe @jhusain's primary concerns were around 2 things: 1) higher-order streams, specifically heterogenous ones (involving streams from multiple impls), potentially compounding async hits, and, 2) potential impact on resource constrained devices. But I don't want to speak for him, and I think it would be best if he could present them.

For first-order streams, I don't see any problem with only an async API. Heterogenous higher-order streams (e.g. rxstream.flatMap(most.of)) would pay some cost on each event in the "outer" stream. Just to reiterate what has already been said, homogenous higher-order streams can use internal mechanisms to avoid the extra async in that case if they want.

So, I think we ended up believing that some well-hidden, unergonomic API, like a symbol, could be a reasonable compromise for heterogenous higher-order streams to interoperate. That said ...

I don't think performance arguments without data hold any value

I would also like to see data on whether this is a real problem or not.

(p.s. I think the terminology "async subscribe" is problematic, because it sounds like the impl has to wait a tick before putting the observer into a list. In that case, it's possible the observer could miss events generated after it begins observing but still in the same call stack. That would be unintuitive, imho.)

@tgriesser
Copy link

Just came across this version of the observable spec, and noticed that the subscribe is now currently defined as async, then found this ticket.

One use case for sync subscription was brought up over on React in @sebmarkbage's "sideways data loading" proposal - facebook/react#3398 - the general idea being:

For each key/value in the record. Subscribe to the Observable in the value.

subscription = observable.subscribe({ onNext: handleNext });

We allow onNext to be synchronously invoked from subscribe. If it is, we set:

this.data[key] = nextValue;

Otherwise we leave it as undefined for the initial render. (Maybe we set it to null?)

Forcing an async subscription here would prevent data from being available during React's synchronous initial render.

@zenparsing mentioned above:

Various individuals at the TC39 meeting agreed that zalgo-avoidance was critical, and that synchronous subscribe had resulted in bugs.

Curious if you could expand on some real-world cases where bugs / zalgo are a critical issue here.

Also a more general question, what is the relation between this proposal and the spec here: https://github.com/jhusain/observable-spec

@zenparsing
Copy link
Member Author

Hi @tgriesser, thanks for dropping in!

As currently spec'ed, subscribe is async: the subscriber function runs in a job queue and no notifications can be sent synchronously. However, there is also a symbol-named method which does the subscription synchronously (the Symbol.observer method). The intention is that libraries and frameworks could use the symbol-named version for synchronous subscription.

As you can see from reading this thread, there is some contention over providing two methods which only differ in sync/async behavior; this is a design facet that we're currently working out.

I read over the React proposal you mentioned, but since I'm not particularly knowledgable about React I couldn't quite follow. I'd like to understand this use case for synchronous subscription more, though. Can you explain it in general terms as a series of synchronous steps?

When the zalgo issue was discussed at the TC39 meeting, there were members of the Angular team present and they brought up the point that developers are able to cope with callback APIs when they can rely on the callbacks being executed either always synchronously, or always asynchronously, but not mixed. @jhusain mentioned he was aware of similar experiences. I have personally found myself making assumptions about callback asynchrony when writing simple Observable examples.

The observable-spec repo is meant to capture the interoperability requirements for different observable implementations. This ES implementation will conform to those requirements. There might be some areas where that spec and this work is out-of-sync, but we'll get them sync'd up as we go forward.

@tgriesser
Copy link

Thanks for the explanation!

I read over the React proposal you mentioned, but since I'm not particularly knowledgable about React I couldn't quite follow. I'd like to understand this use case for synchronous subscription more, though. Can you explain it in general terms as a series of synchronous steps?

  1. Imagine a single Observable store of data, where the store updates over time as data is fetched/updated
  2. A "component" (UI Element container) is provided a map of Observable objects referencing a value in this store; a subscription is made on each of these objects when the component is initially rendered
  3. The initial render of the React component tree is synchronous, so if a value exists for the observable, next would be invoked synchronously, where the next handler will set this.data[key] = value
  4. If no value was provided synchronously, set this.data[key] = null
  5. As the value in the observed store changes over time, next is called, updating the property stored in the component
  6. When the component is unmounted, unsubscribe from any existing subscriptions
class Avatar extends Component {

  observing() {
     // If this value exists, it will be provided to `next` at 
     // subscription time, setting this.data.image - otherwise it'll get there eventually
     return {image: store.userImage()} 
  }

  render() {
    if (!this.data.image) return null
    return <img src={image} />
  }

}

In the browser it's not a big deal, the component will just re-render on nextTick and no one will know any better. On the server it's trickier, since the data is already present in the observable, the fact that it's not available synchronously means dependent portions of the UI will not exist on initial (and only) render. The alternative would be to make React rendering async, which is not something I can speak for other than to say that from related ticket's I've found, it's not something that looks will happen soon.

@sebmarkbage also comments in the thread

They might become more complicated (and slower) if the committee insists that calling next schedules a micro-task like Promises.

As I understand, the Zalgo issue with Promises stem from the fact that are always meant to be consumed as a future, whereas with observables the main goal seems to be putting control in the hands of the producer. I think it'd be reasonable to say that it's up to the producer to determine when next is called (synchronously or asynchronously), as the value is not an immutable single value, but a stream which will be changing over time anyway.

Making Observable a thenable (#15) seems like it could be a better solution, perhaps the fulfillment of return rather than next could be always guaranteed to happen next-tick?

@zenparsing
Copy link
Member Author

@tgriesser Just to clarify, all are agreed that notifications (next, throw, return) flow from the producer to the consumer synchronously. The only question is whether subscription is always async, or if we additionally provide a back-door way to synchronously subscribe.

I'm still digesting your overview (which is very helpful, btw!).

It seems like in the server-side situation, every observable would need to have its data immediately available for this to work.

@zenparsing
Copy link
Member Author

Closing this for cleanup - feel free to reopen if necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants