Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use cases for synchronous subscription. #38

Closed
benjamingr opened this issue Jul 6, 2015 · 72 comments
Closed

Use cases for synchronous subscription. #38

benjamingr opened this issue Jul 6, 2015 · 72 comments

Comments

@benjamingr
Copy link

It would be great if we had a list of use cases for synchronous subscription in order to better understand if/when [Symbol.observer] is needed.

Please include:

  • Description - Why is this use case important and who are the users.
  • Code - a minimal example demonstrating the use case and showcasing it. preferably codepen/jsbin/jsfiddle but a gist is also fine, demonstrating the issue.
  • Why synchronous subscription? - why would asynchronous subscription not work here, what benefit does synchronous subscription bring to the table?

Pinging people who want synchronous inspection or have expressed interest in it before: @jhusain @Blesh @tgriesser

Bonus points for weakness of async subscription compared to existing DOM or Node/io APIs. If you use "performance" as a reason make sure you understand what an async trampoline means (batching asynchronous function calls), how async tagging works (only defer actions that were not already async and only once per chain) and so on.

@benlesh
Copy link

benlesh commented Jul 6, 2015

Basically all use cases for Observable when you want Observable to be performant.

If you're programming functionally, you should be using [Symbol.observer] every single time, otherwise you're wasting time scheduling things on a micro task queue that don't need to be scheduled.

@benjamingr
Copy link
Author

otherwise you're wasting time scheduling things on a micro task queue that don't need to be scheduled.

Repeating myself:

you use "performance" as a reason make sure you understand what an async trampoline means (batching asynchronous function calls), how async tagging works (only defer actions that were not already async and only once per chain) and so on.

Basically, you don't really need to put almost anything on the microtask queue. I see my previous points about the trampoline and async tagging were unclear. Allow me to elaborate.

Trampoline:

You have:

for(var i = 0; i < 1000; i++){
    observables[i].subscribe(funcs[i]);
}

Let's say all 1000 observables have results for all 1000 functions (the "worst case").

Naively, one would schedule 1000 microtasks and perform a thousand calls to the system. In practice one schedules a single function on the microtask queue that calls all 1000 callbacks in sequence at once. So you pay for the latency of the scheduler exactly once and not 1000 times. This is utilized for example in Bluebird's scheduling. A queue is created and function calls are batched.

Async tagging

Although the trampoline makes things generally pretty fast, you still have to pay latency once per batch. This is wasteful. Instead, remember that we created the observable, in our end it was something like:

function listen(element, eventName) {
    return new Observable(sink => {
        // call sink.next
    });
}

But on the Observable side we get to implement next so we:

  • Create the sink and a boolean flag, start the flag as false.
  • Call the constructor executor argument.
  • Set the flag to true.
  • When next is called, if the flag is false set it to true and schedule a microtask via the trampoline. Otherwise, if the flag isn't false (that is, the constructor ran to completion before next was called) - call the function synchronously.

This means that in 99.99% of cases fn() is called synchronously, and in the 0.01% of cases it's not it's in cases where it is actively saving you from Zalgo.

In purely functional coding - you should not notice any performance impact. You can see this here in bluebird here.

As for async stack traces and stack construction that @jhusain mentioned it is taken care of by the platform anyway (async stack traces) so it should not be an issue.

@benlesh
Copy link

benlesh commented Jul 6, 2015

So you pay for the latency of the scheduler exactly once and not 1000 times.

In that one non-real-world case, yes, you've done only 1 unnecessary thing. But performance is more about macro performance than micro performance.

In an application that is written completely functionally with observables, you're going to incur that cost for every flatMapped/merged/cattered observable in your application, throughout the life span of your app.

As I'm sure @jhusain and @trxcllnt will be able to tell you, Netflix has need of Observables that will run in non-JITTed environments on weak hardware. In that scenario, you'll likely feel scheduling. Now arguably, "native" Observables won't make it to those platforms any time soon; But any library that wants to target this spec will be less ergonomic to use in those environments because of this feature.

I don't deny the need for an async subscribe for the masses. I just wish there was a more ergonomic way for people that know what they're doing to use the sync subscribe.

@benlesh
Copy link

benlesh commented Jul 6, 2015

I don't deny the need for an async subscribe for the masses.

... but I will spend my career explaining to people why they don't need to use it.

@tgriesser
Copy link

@benjamingr I'll try to gather some thoughts on this, as I've been using [Symbol.observe] with RxJS Next and too have found myself feeling the same re: ergonomics.

Just so I can better understand the desire for the async subscription here, do you think you could also provide a similar answer the inverse of your question:

It would be great if we had a list of use cases for asynchronous subscription in order to better understand if/when Zalgo occurs with Observables.

@benlesh
Copy link

benlesh commented Jul 6, 2015

@tgriesser it's mostly around problems with mutating state and imperative programming.

var foo = true;
myObservable.subscribe(x => {
  foo = false;
});
assert(foo);

People write (bad) code assuming that the subscribe method will be asynchronous all the time, and they get bitten. This gets exacerbated when you're dealing with a third party library that returns observables:

var myObservable = thirdParty.getDataSource();

myObservable.subscribe(x => doThings(x));

notifySubscriptionToDataSource();

Now imagine the above thirdParty.getDataSource was created in one of two ways:

var thirdParty = {
  getDataSource() {
    return Rx.Observable.range(0, 100);
  }
}

or

var thirdParty = {
  getDataSource() {
    return Rx.Observable.create(observer => {
       var i = 0;
       var id = setInterval(() => observer.next(i++), 1000);

       return () => {
         clearInterval(id);
       }
    });
  }
};

In the former case, notifySubscriptionToDataSource is not called until after the subscription has started and the entire observable has completed. In the latter case, notifySubscriptionToDataSource is called before the subscription has started.

@trxcllnt
Copy link

trxcllnt commented Jul 6, 2015

@benjamingr

This means that in 99.99% of cases fn() is called synchronously, and in the 0.01% of cases it's not it's in cases where it is actively saving you from Zalgo.

The Observable monad is the excalibur that slays Zalgo.

In purely functional coding - you should not notice any performance impact.

If your code is purely functional, it doesn't matter whether subscribe is synchronous or asynchronous, in which case any asynchronous scheduling is wasted effort.

@zenparsing
Copy link
Member

Thanks @benjamingr for the examples.

@jhusain 's comments on the DOM in the other thread got me thinking about something about our current async-subscription design.

Keep in mind that in the current design, subscribe is async but next is sync.

// This is a multicast observable:  you can put data into it and it will send that data to
// all subscribers
let multicast = createMulticaster();

// We add a subscription to the multicast source
let subscription = multicast.subscribe(value => console.log(`Got value: ${ value }`));

// We tell the multicast source to send a data value to all subscribers
multicast.send("some-data");

The question is: should the onNext callback above be executed? Should "Got value: some-data" be logged? What is the intuitive answer?

@trxcllnt
Copy link

trxcllnt commented Jul 6, 2015

@zenparsing yes, "Got value" should be logged. The root of this problem is a fundamental disagreement between imperative and functional programming. By introducing a Subject (an Observable + Observer), you've gone off the reservation of immutable determinism and out into the sand dunes of race conditions.

Observables aren't fancy EventEmitters, and we shouldn't proscribe behaviors that don't make sense in purely functional code.

@zenparsing
Copy link
Member

@trxcllnt Cool. (Although we can't make the ES standard library into an imperative vs. functional battleground. Instead, we should focus on making the API consistent, simple, and usable.)

Anyone else have an intuition?

@benlesh
Copy link

benlesh commented Jul 6, 2015

we should focus on making the API consistent, simple, and usable

I agree. It's my opinion that it's "consistent" if it's synchronous if it can be. It might even force people to learn better programming practices.

Keep in mind, if this is implemented, we're only introducing a type; Nothing in the JS core would return this type, so using Observable would be a strictly opt-in programming experience. Furthermore, those that want that experience are probably looking to program functionally rather than imperatively.

On a related note, if "simple" is a goal, then I personally think we shouldn't try to get the disposal behavior to work with generators, because the behavior is decidedly less simple than prior art (RxJS 2).

@domenic
Copy link
Member

domenic commented Jul 7, 2015

As a TC39 member: I think if people are going to opt-in to functional programming, they should use a library, and we should not standardize this in the JS language, which is pretty clearly imperative.

@domenic
Copy link
Member

domenic commented Jul 7, 2015

@zenparsing I have a hard time building intuition about your example because a "multicast observable" and "send" are not things that appear in the proposal.

@trxcllnt
Copy link

trxcllnt commented Jul 7, 2015

@domenic That's the point -- you can opt-in to mutability if Observables start immutable, but you can't opt-in to immutability if Observables start mutable.

If someone wants asynchronous subscriptions, it's very simple to schedule the subscription on something like nextTick: sourceObservable.subscribeOn(Scheduler.nextTick).

@jhusain
Copy link
Collaborator

jhusain commented Jul 7, 2015

I think we should probably keep this thread focused on use cases for asynchronous subscription.

In the other thread I outlined two:

  1. modeling EventTarget which allows for synchronous notification.
  2. epoll-style push/pull IO in which a notification is sent saying a value is available, but the value is retrieved via a pull. This allows the potentially expensive process of retrieving a value to be avoided if the consumer is not interested in the data.

@jhusain
Copy link
Collaborator

jhusain commented Jul 7, 2015

FYI I will try and follow up with code examples for each of there use cases when I get a chance.

@trxcllnt
Copy link

trxcllnt commented Jul 7, 2015

3: Depth-first vs. breadth-first expand is only possible with synchronous subscribe.

@domenic
Copy link
Member

domenic commented Jul 7, 2015

The epoll case doesn't make any sense, as that's definitely not how epoll works. (epoll includes a buffer, for one, and when not used directly in a polling fashion, it's commonly wrapped in an event loop library, which will make it async.) https://blog.domenic.me/reading-from-sockets/ may be helpful.

@domenic
Copy link
Member

domenic commented Jul 7, 2015

@trxcllnt please follow the requirements outlined in the OP, your one-sentence description is pretty impenetrable.

@jhusain
Copy link
Collaborator

jhusain commented Jul 7, 2015

I would be very interested if anyone on this thread Who is advocating for synchronous subscription could come up with a use case in which asynchronous subscription meaningfully hurts performance. Thus far I have been a unable to prove this in a real-world use case.

It's important to understand that the committee is strongly swayed by real world use cases. Theoretical arguments, while they may be valid, are going to be less influential.

@trxcllnt any ideas? You have a lot of experience using observable to build very chatty event streams (ex. Gestures, scrolling virtualization). Can you demonstrate that asynchronous subscription will hurt performance in any of these use cases.

JH

On Jul 6, 2015, at 7:00 PM, Domenic Denicola notifications@github.com wrote:

@trxcllnt please follow the requirements outlined in the OP, your one-sentence description is pretty impenetrable.


Reply to this email directly or view it on GitHub.

@trxcllnt
Copy link

trxcllnt commented Jul 7, 2015

@domenic @jhusain Example of a recursive tree-walk as depth-first, breadth-first, and asynchronous breadth-first parameterized by scheduler.

var Observable = Rx.Observable;

Observable.prototype.expand = function (selector) {
  var source = this;
  return Observable.create(function (observer) {

    var disposables = new Rx.CompositeDisposable();

    subscribeInner(source);

    return disposables;

    function subscribeInner(innerObs) {
      var innerSad = new Rx.SingleAssignmentDisposable();
      disposables.add(innerSad);
      innerSad.setDisposable(innerObs.subscribe(
        function onNext(x) {
          subscribeInner(selector(x));
          observer.onNext(x);
        },
        function onError(e) {
          observer.onError(e);
        },
        function onCompleted() {
          disposables.remove(innerSad);
          if(disposables.length === 0) {
            observer.onCompleted();
          }
        }
      ));
    }
  });
};

var fileSystem = {
  "C:": {
    "ProgramFiles": {
      "Chrome": {
        "app": "chrome.exe",
        "images": {
          "icon": "chrome-icon.ico"
        }
      },
      "Firefox": {
        "app": "firefox.exe",
        "images": {
          "icon": "firefox-icon.ico",
          "user": {
            "ptaylor": {
              "icon": "ptaylor-icon.png"
            }
          }
        }
      },
      "Opera": {
        "app": "Opera.exe",
        "images": {
          "icon": "Opera-icon.ico"
        }
      }
    }
  }
};

function ls(path, tree, scheduler) {
  scheduler = scheduler || (Rx.Scheduler.currentThread);
  if(path) {
    path = path + "/";
  }
  return Observable.create(function (observer) {
    return Observable
      .fromArray(Object.keys(tree), scheduler)
      .map(function(key) {
        return { path: path + key, value: tree[key] };
      })
      .subscribe(observer);
  });
}

function listResources(scheduler) {
  ls("", fileSystem, scheduler)
    .expand(function (resource) {
      var path = resource.path;
      var fileOrDir = resource.value;
      if(typeof fileOrDir !== "object") {
        return Observable.empty();
      }
      return ls(path, fileOrDir, scheduler);
    })
    .filter(function(resource) {
      return typeof resource.value !== "object";
    })
    .subscribe(function(resource) {
      console.log(resource.path + " -- " + resource.value);
    });
}

console.log("Depth-first:");
listResources(Rx.Scheduler.immediate);

console.log("Breadth-first:");
listResources(Rx.Scheduler.currentThread);

console.log("Asynchronous breadth-first:");
listResources(Rx.Scheduler.timeout);

/* output:
Depth-first:
C:/ProgramFiles/Chrome/app -- chrome.exe
C:/ProgramFiles/Chrome/images/icon -- chrome-icon.ico
C:/ProgramFiles/Firefox/app -- firefox.exe
C:/ProgramFiles/Firefox/images/icon -- firefox-icon.ico
C:/ProgramFiles/Firefox/images/user/ptaylor/icon -- ptaylor-icon.png
C:/ProgramFiles/Opera/app -- Opera.exe
C:/ProgramFiles/Opera/images/icon -- Opera-icon.ico
Breadth-first:
C:/ProgramFiles/Chrome/app -- chrome.exe
C:/ProgramFiles/Firefox/app -- firefox.exe
C:/ProgramFiles/Chrome/images/icon -- chrome-icon.ico
C:/ProgramFiles/Opera/app -- Opera.exe
C:/ProgramFiles/Firefox/images/icon -- firefox-icon.ico
C:/ProgramFiles/Opera/images/icon -- Opera-icon.ico
C:/ProgramFiles/Firefox/images/user/ptaylor/icon -- ptaylor-icon.png
Asynchronous breadth-first:
C:/ProgramFiles/Chrome/app -- chrome.exe
C:/ProgramFiles/Firefox/app -- firefox.exe
C:/ProgramFiles/Chrome/images/icon -- chrome-icon.ico
C:/ProgramFiles/Opera/app -- Opera.exe
C:/ProgramFiles/Firefox/images/icon -- firefox-icon.ico
C:/ProgramFiles/Opera/images/icon -- Opera-icon.ico
C:/ProgramFiles/Firefox/images/user/ptaylor/icon -- ptaylor-icon.png
*/

@benjamingr
Copy link
Author

@tgriesser

Just so I can better understand the desire for the async subscription here, do you think you could also provide a similar answer the inverse of your question:

Sure:

someExternalSource().subscribe(val => showUi(val));
hideUi();

If someExternalSource suddenly starts fetching the result from cache at the second page load - we have a hidden UI instead of a visible one with the value because the subscription ran synchronously "this one time". These bugs are very hard to find and I think we should do our best to help prevent them.


@Blesh

In that one non-real-world case, yes, you've done only 1 unnecessary thing. But performance is more about macro performance than micro performance.

Actually, you'd do zero async scheduling in the vast majority of examples. In the vast majority of examples your source is actually asynchronous already so you'll not need any scheduling because of async tagging anyway.

Keep in mind, if this is implemented, we're only introducing a type; Nothing in the JS core would return this type, so using Observable would be a strictly opt-in programming experience.

The idea is that the JS host environments would return observables for multiple valued objects. Otherwise - why would we even be specifying observables? People can just use third party libraries.

@trxcllnt

If your code is purely functional, it doesn't matter whether subscribe is synchronous or asynchronous, in which case any asynchronous scheduling is wasted effort.

Right, but as I said before in my message about async scheduling, you're only doing it if you're running it through a sync data source.

code example for expand

I'm not sure what this demonstrates at all except that changing schedulers changes behaviour.

If someone wants asynchronous subscriptions, it's very simple to schedule the subscription to happen asynchronously: Observable.subscribeOn(Scheduler.nextTick).

For what it's worth, I'm very amendable to allowing a setScheduler for letting people choose how to schedule their observables. Although that greatly complicates the proposal it's very useful for a variety of other use cases too.

@zenparsing

The question is: should the onNext callback above be executed? Should "Got value: some-data" be logged? What is the intuitive answer?

Of course, 100% yes.

@jhusain

FYI I will try and follow up with code examples for each of there use cases when I get a chance.

Please do, that's why I opened this thread in the first place.

I would be very interested if anyone on this thread Who is advocating for synchronous subscription could come up with a use case in which asynchronous subscription meaningfully hurts performance. Thus far I have been a unable to prove this in a real-world use case.

Yes, that's what I've been seeing too. A lot of big words and appealing to 'pure functional programming' without any such data. Data would go a long way.

@benjamingr
Copy link
Author

Synchronous subscription in terms of an API is a huge footgun for anyone not doing a very particular type of "pure functional" programming where not only data is only moved in Observables, the UI layer doesn't interact with subscriptions directly. For those people, which I believe are the majority of users, we're introducing a class of concurrency bugs which are extremely hard to debug.

Functional programming doesn't mean we don't have state, it's just controlled. Doing monadic pure state book-keeping has proven to be very unpopular although widely available in JavaScript and catering to people doing that exact type of observable usage over the vast majority of users sounds like something the TC will never approve. Especially if async subscription allows for the same expressiveness and as @trxcllnt said, you're not even aware of async subscription when doing async scheduling.

What the TC might approve is a synchronous subscription API if it is proven useful and you can do things with it you can't with async subscription. Which is why what I've asked for from the start is _code examples with detailed explanation that necessitate sync subscription. So far we didn't get any.

@jhusain
Copy link
Collaborator

jhusain commented Jul 7, 2015

Thanks for the code example @trxcllnt. Can you provide a little more context around it? I can see that using an immediate way scheduler vs a trampoline scheduler (the poorly-named currentThread scheduler) easily switches from a depth-first search to a breadth-first search. Would I not get the same result from a using a version of the trampoline scheduler with a stack vs a queue? It's early in the morning so I'm sorry if I'm misunderstanding the example.

.

JH

On Jul 6, 2015, at 8:06 PM, Paul Taylor notifications@github.com wrote:

@domenic @jhusain Example of a recursive tree-walk as depth-first, breadth-first, and asynchronous breadth-first parameterized by scheduler.

var Observable = Rx.Observable;

Observable.prototype.expand = function (selector) {
var source = this;
return Observable.create(function (observer) {

var disposables = new Rx.CompositeDisposable();

subscribeInner(source);

return disposables;

function subscribeInner(innerObs) {
  var innerSad = new Rx.SingleAssignmentDisposable();
  disposables.add(innerSad);
  innerSad.setDisposable(innerObs.subscribe(
    function onNext(x) {
      subscribeInner(selector(x));
      observer.onNext(x);
    },
    function onError(e) {
      observer.onError(e);
    },
    function onCompleted() {
      disposables.remove(innerSad);
      if(disposables.length === 0) {
        observer.onCompleted();
      }
    }
  ));
}

});
};

var fileSystem = {
"C:": {
"ProgramFiles": {
"Chrome": {
"app": "chrome.exe",
"images": {
"icon": "chrome-icon.ico"
}
},
"Firefox": {
"app": "firefox.exe",
"images": {
"icon": "firefox-icon.ico",
"user": {
"ptaylor": {
"icon": "ptaylor-icon.png"
}
}
}
},
"Opera": {
"app": "Opera.exe",
"images": {
"icon": "Opera-icon.ico"
}
}
}
}
};

function ls(path, tree, scheduler) {
scheduler = scheduler || (Rx.Scheduler.currentThread);
if(path) {
path = path + "/";
}
return Observable.create(function (observer) {
return Observable
.fromArray(Object.keys(tree), scheduler)
.map(function(key) {
return { path: path + key, value: tree[key] };
})
.subscribe(observer);
});
}

function listResources(scheduler) {
ls("", fileSystem, scheduler)
.expand(function (resource) {
var path = resource.path;
var fileOrDir = resource.value;
if(typeof fileOrDir !== "object") {
return Observable.empty();
}
return ls(path, fileOrDir, scheduler);
})
.filter(function(resource) {
return typeof resource.value !== "object";
})
.subscribe(function(resource) {
console.log(resource.path + " -- " + resource.value);
});
}

console.log("Depth-first:");
listResources(Rx.Scheduler.immediate);

console.log("Breadth-first:");
listResources(Rx.Scheduler.currentThread);

console.log("Asynchronous breadth-first:");
listResources(Rx.Scheduler.timeout);

/* output:
Depth-first:
C:/ProgramFiles/Chrome/app -- chrome.exe
C:/ProgramFiles/Chrome/images/icon -- chrome-icon.ico
C:/ProgramFiles/Firefox/app -- firefox.exe
C:/ProgramFiles/Firefox/images/icon -- firefox-icon.ico
C:/ProgramFiles/Firefox/images/user/ptaylor/icon -- ptaylor-icon.png
C:/ProgramFiles/Opera/app -- Opera.exe
C:/ProgramFiles/Opera/images/icon -- Opera-icon.ico
Breadth-first:
C:/ProgramFiles/Chrome/app -- chrome.exe
C:/ProgramFiles/Firefox/app -- firefox.exe
C:/ProgramFiles/Chrome/images/icon -- chrome-icon.ico
C:/ProgramFiles/Opera/app -- Opera.exe
C:/ProgramFiles/Firefox/images/icon -- firefox-icon.ico
C:/ProgramFiles/Opera/images/icon -- Opera-icon.ico
C:/ProgramFiles/Firefox/images/user/ptaylor/icon -- ptaylor-icon.png
Asynchronous breadth-first:
C:/ProgramFiles/Chrome/app -- chrome.exe
C:/ProgramFiles/Firefox/app -- firefox.exe
C:/ProgramFiles/Chrome/images/icon -- chrome-icon.ico
C:/ProgramFiles/Opera/app -- Opera.exe
C:/ProgramFiles/Firefox/images/icon -- firefox-icon.ico
C:/ProgramFiles/Opera/images/icon -- Opera-icon.ico
C:/ProgramFiles/Firefox/images/user/ptaylor/icon -- ptaylor-icon.png
*/

Reply to this email directly or view it on GitHub.

@jhusain
Copy link
Collaborator

jhusain commented Jul 7, 2015

The most obvious problem with forced asynchronous subscription is the inability to faithfully model EventTarget. This is a serious issue, because it is one of the stated goals of the proposal. This is a worthwhile goal because EventTarget is both ubiquitous and awful. Observable would be a big improvement over EventTarget because addEventListener/removeEventListener methods hanging off an object cannot be ergonomically composed. Ergonomic composition of push streams is a big win, sync subscription or not.

Using Observable to model EventTarget is straightforward with synchronous subscription.

Let's say we had the following contract:

EventTargetObservable.listen(eventName, useCapture): Observable

Anything that currently implements EventTarget could easily opt into this contract without breaking changes. Now take a look at this code example which demonstrates the hazard of asynchronous subscription.

// receives notification
domElement.addEventListener("click", e => e.preventDefault(), false);

// receives notification
domElement.listen("click", false)[Symbol.observer](e => e.preventDefault%28%29);

// misses data!
domElement.listen("click", false).subscribe(e => e.preventDefault());

domElement.dispatchEvent("click", new ClickEvent());

Any proposal that attempts to model Event target will run into the same issue if it attempts to schedule subscription on a separate job. Note that scheduling notification is also not an option, because preventDefault() must be invoked within the same job.

JH

On Jul 7, 2015, at 2:40 AM, Benjamin Gruenbaum notifications@github.com wrote:

Synchronous subscription in terms of an API is a huge footgun for anyone not doing a very particular type of "pure functional" programming where not only data is only moved in Observables, the UI layer doesn't interact with subscriptions directly. For those people, which I believe are the majority of users, we're introducing a class of concurrency bugs which are extremely hard to debug.

Functional programming doesn't mean we don't have state, it's just controlled. Doing monadic pure state book-keeping has proven to be very unpopular although widely available in JavaScript and catering to people doing that exact type of observable usage over the vast majority of users sounds like something the TC will never approve. Especially if async subscription allows for the same expressiveness and as @trxcllnt said, you're not even aware of async subscription when doing async scheduling.

What the TC might approve is a synchronous subscription API if it is proven useful and you can do things with it you can't with async subscription. Which is why what I've asked for from the start is _code examples with detailed explanation that necessitate sync subscription. So far we didn't get any.


Reply to this email directly or view it on GitHub.

@benjamingr
Copy link
Author

@jhusain

So just to be clear, we are arguing that EventTarget requires synchronous subscription because it would be impossible to interop between addEventListener and Observables otherwise, right?

This is the sort of use case I was talking about, nice work!

I've written a concrete example of this sort of issue, if this is what you've had in mind let me know and I'll formalize it in a markdown document:

HTML:

<div id="father">
    <div id="child">Click</div>
</div>

JavaScript:

let father = document.querySelector("#father"),
    child  = document.querySelector("#child");

father.addEventListener("click", e => alert("Hi"), false);
child.listen("click", false).subscribe(e => e.stopPropagation());

child.click();

Result: the alert fires since addEventListener subscribes synchronously but subscribe subscribes asynchronously.

Implication: It is impossible to stop propagation or prevent default behaviour of synchronous events that are listened for via addEventListener handlers.

Note that in asynchronous events or events dispatched by the platform would not be an issue at all, we are strictly talking about events dispatched by JavaScript here.

Also note that preventDefault doesn't work except for click events, as it would not be a trusted event, propagation would still work.

Let me know if this is what you've had in mind.

@zenparsing
Copy link
Member

Implication: It is impossible to stop propagation or prevent default behaviour of synchronous events that are listened for via addEventListener handlers.

Clarification: it's not really a propagation or prevent-default issue (those are symptoms, rather than cause). The root issue (which I alluded to upstream) is that we expect any events which occur after a call to subscribe to be picked up by that subscription.

In other words, we intuitively want subscription to happen synchronously. I think this is true for all observables, and my current opinion is that async subscription just won't work for that reason.

This is what I'm thinking:

  • Subscription should happen synchronously, as pointed out above.
  • In order to protect against zalgo, "well-behaved" observables should never execute any observer callbacks synchronously.

I'm not quite sure what this means for the API, at the moment.

@benjamingr
Copy link
Author

Clarification: it's not really a propagation or prevent-default issue (those are symptoms, rather than cause). The root issue (which I alluded to upstream) is that we expect any events which occur after a call to subscribe to be picked up by that subscription.

We can just queue those next calls and dispatch them on the next event loop iteration. This is what the browser does anyway in asynchronous events. The issue I describe above is specifically an issue of DOM events being dispatched by JavaScript synchronously and is an artifact of the current API for event dispatching - not the model.

If the DOM events were dispatched from JS asynchronously, this would not be an issue. What I believe @jhusain is speaking of is specifically compatibility with the way events currently work in the DOM.


That is:

 var o = new Observable(sink => sink.next(2));
 o.subscribe(v => console.log(v);
 console.log(1);

Should always log:

1
2

And in particular, it should always log 2. Mainly because in the general case it could really be a part of a library that does:

var cachedTwo = null;
function getTwo(){
     if(!cachedTwoFromBefore) {
         // maybe also cache here?
         return new Observable(sink => fetch("/twoUrl").then(::sink.next).then(::sink.return);
     } else {
         return new Observable(sink => sink.next(2), sink.return());
     }
}

Which would be called as:

 var o = getTwo();
 o.subscribe(v => console.log(v);
 console.log(1);

Here, if there is no zalgo protection, we have a race condition based on whether or not getTwo cached the result before. This is a major class of bugs people complain about and the reason people dislike sync subscription is because of that particular use.

@zenparsing
Copy link
Member

@benjamingr There's a general problem here that I think is important. Any pub-sub mechanism which allows synchronous dispatch won't work in the expected way if implemented with async subscription. In any such system, you have to register the subscriber synchronously.

@zenparsing
Copy link
Member

We could specify that Observable always notifies asynchronously, but subscribes synchronously, but then we're back in the same issue with EventTarget.

@Blesh Yes, and also that means observation would potentially require O(n) space for buffering. We really would like observation to be O(1) space, given that it should be appropriate for delivering huge datasets.

@RangerMauve
Copy link

@Blesh I kinda like the idea of subscribing being sync and notification being async. Any particular reason why that's a bad idea?

@benlesh
Copy link

benlesh commented Jul 7, 2015

@zenparsing very true.

If you start with synchronous subscribe, you can opt-in to async subscribe. If you start with async subscribe, you can never opt back in to synchronous subscribe.

@trxcllnt I agree with this wholeheartedly.

Observables are immutable Functions that can return zero or more values whenever it wants (synchronously or asynchronously).

This is correct. I think the TC39 needs to realize that Observables are a proposal more on-par with something like adding other functional constructs such as first-class functions, lambdas, Function.prototype.apply, or es7 function bind. Functional programming in JavaScript is on the rise, and this would add a powerful feature to the language.

@benlesh
Copy link

benlesh commented Jul 7, 2015

@RangerMauve just as @zenparsing pointed out, you could easily end up spamming your micro-task queue with tons of observed values. Think of Observable.range(0, Number.MAX_VALUE).

@RangerMauve
Copy link

@Blesh Would the "trampolining" method mentioned earlier help alleviate a lot of the overhead from that?

@benjamingr
Copy link
Author

@zenparsing

Actually, my previous suggestion doesn't work all, because notifications could come in out-of-order. A notification sent after the subscribe call but before "nextTick" would be received before the buffered notifications. Bleh...

Why? The semantics of micro tasks vs macrotasks ensure that exactly this will not happen. The handlers will fire after every microtask but before any macrotask (the "late queue").

@Blesh

This is correct. I think the TC39 needs to realize that Observables are a proposal more on-par with something like adding other functional constructs such as first-class functions, lambdas, Function.prototype.apply, or es7 function bind.

That's very hard to argue given observables can, and have been implemented in user land successfully before. We need to show current APIs implemented in terms of observables like Object.observe, DOM APIs, Node APIs and so on and explain how observables solve the issue better. On my side it would be easier in terms of "no need to download RxJS and wrap APIs" but that's about it although this is a bit off topic. (To be clear though, I'm active here because I'm very interested in seeing observables in the language and I think a good default is incredibly important for users).

I think the two people who have been consistently talking in terms of functional programming features in the language are @domenic and @jhusain and I think it would be great if #34 could get a little love. If we're talking FP I wondered if laziness is a property of the comonadability of observables that function-classes can encapsulate (like this for promises for example) and me , @domenic and @headinthebox are (in the initial stages of) discussing this, I'm still figuring this out but I think function classes have a huge potential for the language and if we include them lots of nice things happen (o.subscribe(sub) just becomes o(sub) and a lot of other interesting things). That's a separate proposal though :)

@benjamingr
Copy link
Author

@RangerMauve

@Blesh Would the "trampolining" method mentioned earlier help alleviate a lot of the overhead from that?

No, if you do:

Observable.range(0, 10000).subscribe(function(v){
    console.log(v);
});
console.log("Here");

You would need to queue 10000 and pump them one by one. Observables are a very strict (non lazy) type in that regard. This is what @zenparsing means by O(n) instead of O(1)


To not leave an incorrect impression, we're still figuring out if cases like this where 10000 items are created synchronously are common (and if a range function existed in native observables, it could easily avoid this pitfall even with async subscription by being aware of it).

@RangerMauve
Copy link

@benjamingr Ah, that makes sense. Thanks for the explanation.

@benlesh
Copy link

benlesh commented Jul 7, 2015

we're still figuring out if cases like this where 10000 items are created synchronously are common

I think I was talking with @trxcllnt and there are implementations of Animation written in Rx that actually do something like Rx.Observable.range(0, Number.MAX_VALUE) and then map/flatMap those into animation ticks and takeUntil the animation completes.

I'm not sure what the exact implementation is, or how common it is, so perhaps @trxcllnt could elaborate, but that's one I can think of off the top of my head that could get hairy with default async subscription or observation.

@benjamingr
Copy link
Author

@Blesh that would be fantastic, although keep in mind subscription would also need to be synchronous here.

@trxcllnt
Copy link

trxcllnt commented Jul 7, 2015

@benjamingr @zenparsing

After further thought, I've identified two more operations that are impossible without synchronous subscription -- the grouping (groupBy, groupByUntil, groupJoin) and window operators.

Description

These operators split an Observable of items into an Observable of inner Observables of items, based on custom identifying logic. When the operators are next'd a value, they may identify the need to create a new inner Observable. They next the new inner Observable to the subscriber, who has a chance to subscribe to it immediately. After nexting the new inner Observable, the operators next the original value to the new inner Observable.

This affects anybody who wants to group events by type, count, time, or relation.

Why synchronous?

If subscription is asynchronous, the subscription to the inner Observable won't have been scheduled by the time the splitting operator nexts the value, and the new inner Observer won't hear the event.

Code

Example JSBin here.

var Observable = Rx.Observable;
var currentThread = Rx.Scheduler.currentThread;
var RAFScheduler = Rx.Scheduler.requestAnimationFrame;
var canvas = document.getElementById('c');
var context = canvas.getContext('2d');

canvas.width = window.innerWidth;
canvas.height = window.innerHeight;
context.font = "12px Helvetica";
context.fillStyle = "white";

function randomNumbers(scheduler) {
  return Observable.create(function subscribe(observer) {
    scheduler.scheduleRecursive(function(reschedule) {
      observer.onNext(Math.random());
      reschedule();
    });
  });
}

randomNumbers(RAFScheduler)
  .map(function(value, allColorsCount) {
    if(value < 0.33) {
      return {
        color: "green",
        count: allColorsCount
      };
    } else if(value < 0.66) {
      return {
        color: "blue",
        count: allColorsCount
      };
    } else {
      return {
        color: "red",
        count: allColorsCount
      };
    }
  })
  .groupBy(function(x) { return x.color; })
  .flatMap(function(innerObservable, colorIndex) {
    return innerObservable.map(function(x, colorCount) {
      var percent = colorCount / x.count;
      var boxWidth = Math.round(canvas.width * 0.33);
      var boxHeight = Math.round(canvas.height * percent);
      var boxX = colorIndex * boxWidth;
      var boxY = canvas.height - boxHeight;
      var percentText = x.color + ": " + Math.round(percent * 10000)/100 + "%";
      var textMetrics = context.measureText(percentText);
      return {
        percent: percent,
        color: x.color,
        x: boxX,
        y: boxY,
        textX: boxX + Math.abs(boxWidth - textMetrics.width) * 0.5,
        textY: boxY - 12,
        percentText: percentText,
        width: boxWidth,
        height: boxHeight
      };
    });
  })
  .subscribe(function(box) {
    context.clearRect(box.x, 0, box.width, canvas.height);
    context.fillStyle = box.color;
    context.fillRect(box.x, box.y, box.width, box.height);
    context.fillStyle = "black";
    context.fillText(box.percentText,
                     Math.round(box.textX),
                     Math.round(box.textY));
  });

@trxcllnt
Copy link

trxcllnt commented Jul 8, 2015

@benjamingr it would be great if #34 could get a little love

Yes! That proposal perfectly captures the essence of Observables, and beyond sweet syntactic sugar and potential closure optimizations, would eliminate any ambiguity around what an Observable actually is.

@benlesh
Copy link

benlesh commented Jul 8, 2015

@trxcllnt I thought about mentioning this one, too. It seems like one could get around that by buffering in the outer GroupByObserver, but then you're going to run into the same "O(n)" buffering problem @zenparsing mentions above. You're completely right that async subscription would be way less than ideal.

@trxcllnt
Copy link

trxcllnt commented Jul 8, 2015

@Blesh correct. If you schedule subscriptions, you have to schedule next's through the groupBy Observable or risk dropping values. But then you don't have a "groupBy" operator, you have a "groupByAndObserverOn" operator.

@zenparsing
Copy link
Member

@jhusain I'm not so sure that async subscription can survive this. The general idea is:

Suppose that subscribe is called at time t0, and the subscription is scheduled to run at tS. Then any event tN where N > 0 will be missed if N < S.

I think that covers the issues for both EventTarget and groupBy. @trxcllnt do you agree?

@trxcllnt
Copy link

trxcllnt commented Jul 8, 2015

@zenparsing yes, I think that covers everything.

@benjamingr
Copy link
Author

@trxcllnt the synchronous example in groupBy is excellent, but as @zenparsing and @Blesh pointed out, it can be overcome by buffering. I do see the pain point in implementing a userland .flatMap-type operation though. I'd love to see benchmarks showing this is slow (we'll iterate them) as those would go a long way to justify sync subscription to the TC.

An Observable.range(0, Infinity) as Ben pointed out would be even more convincing - so if you have that - definitely share.

@domenic
Copy link
Member

domenic commented Jul 8, 2015

Just getting back to this 50-message-deep thread after a plane flight and night's sleep... Here are some things that I'm picking up on.

@trxcllnt

Setting the scheduler is not part of this proposal; if that is a prerequisite for safely using observables then that needs to be added to this proposal (which will presumably need to go back to stage 0 since that's a pretty big change).

@jhussein

If serving as a basis for EventTarget is an important part of this proposal, I think there's a lot more work to do there. The details on capture, bubbling, cancelation (which is not compatible with observable cancelation, note) need to be reproduced, as do the exact timing issues and ordering issues surrounding e.g. removing an event listener mid-capturing phase. I am not at all convinced that sync vs. async subscription is the blocker here, and I think it'd be premature to use EventTarget as a motivating use case for sync subscription before it's made clear that observable can handle the rest of the EventTarget semantics.

Also,

Note that scheduling notification is also not an option, because preventDefault() must be invoked within the same job.

is not really true; preventDefault() can be called at any time. The question is when the UA checks whether the event is canceled. Click is a special case here; all other events dispatched from script do not have "default actions" that do such checks. See e.g. https://codereview.chromium.org/894913002/ and linked bugs/spec issues.

I personally think it would be a lot more feasible to have some kind of EventTarget -> observable adapter. However, others on the Chrome team are opposed to observables unless they can actually be used as part of the layered platform strategy, i.e., we can literally reimplement EventTarget on top of them and throw away existing EventTarget code. So that is a prerequisite for the proposal advancing, it is true.

@zenparsing

This is what I'm thinking:

  • Subscription should happen synchronously, as pointed out above.
  • In order to protect against zalgo, "well-behaved" observables should never execute any observer callbacks synchronously.

This seems pretty reasonable to me on the face of it, and I found it very surprising that observables as-proposed went in a different direction than promises in this regard (which also do sync-subscribe, async-call-of-listener). I still have another 35 messages to work through but I am curious if this ends up going where I think it's going.

@benjamingr @jhussein

I/O actually requires unbounded buffering, and it is for this reason observables are not suitable for I/O.

@jhussein

It would be great to know just how many Event Targets synchronously notify. My guess is that this is not a very common hazard in practice.

All of them do when used with dispatchEvent from author code. When UAs dispatch events they always enqueue a task to do so however.

@Blesh

I think the TC39 needs to realize that Observables are a proposal more on-par with something like adding other functional constructs such as first-class functions, lambdas, Function.prototype.apply, or es7 function bind

This attitude will make advancing observables through TC39 very difficult, I can assure you. You called this FUD earlier so maybe you are not interested in talking about this rationally but I thought I'd try one more time.

@benjamingr
Copy link
Author

Hope you had a good flight @domenic :)

Setting the scheduler is not part of this proposal; if that is a prerequisite for safely using observables then that needs to be added to this proposal (which will presumably need to go back to stage 0 since that's a pretty big change).

What @trxcllnt demonstrated there is that scheduling asynchronously vs. synchronously produces different execution order. As we later noted that was not a motivating example for sync subscribe.

@jhusain If serving as a basis for EventTarget is an important part of this proposal, I think there's a lot more work to do there.

I definitely agree, I'd like to see a detailed survey and analysis of points of match and mismatch and I'd like to see more work with DOM people about compatibility. If EventTarget is indeed a goal of this proposal things need to be very clear. Specifically #12 needs love.

As for preventDefault, other than click all other events should not be affected by synchronous dispatch as events will not be treated as trusted and the default action will not fire anyway. This is why I used stopPropagation in my example of the issue instead.

@benjamingr @jhussein I/O actually requires unbounded buffering, and it is for this reason observables are not suitable for I/O.

Any form of I/O where the platform will have to wait requires unbounded IO. It's really easy to show with:

var streamOrEmitterOrAsyncIterator = url.readAsChunked("/petabyteFile.xml"); // because xmls are big :P
while(true); 
// consume and move forward

Pretty much, any non-preemptive model of async IO (and to an extent preemptive ones too) needs to deal with backpressure and buffering issues. The question is how much additional buffering will be needed by making subscriptions async?

If Observable.range(0, Infinity).subscribe(v => fn(v)) is code that is actually running then buffering is going to be problematic. In general if unbounded buffering is required for async subscription is a problem in realistic use cases and in addition to the regular buffering overhead, then we need to consider sync subscription for that reason (and not sooner).

So far we've yet to see code with this requirement but I think both @Blesh and @trxcllnt are working on finding good examples. When and if they do, that would make a convincing argument for sync subscription in my humble opinion.

@domenic
Copy link
Member

domenic commented Jul 8, 2015

As for preventDefault, other than click all other events should not be affected by synchronous dispatch as events will not be treated as trusted and the default action will not fire anyway. This is why I used stopPropagation in my example of the issue instead.

Right. The supposition there is that you'd want to interoperate between .addEventListener and .subscribe uses, which is not 100% clear to me. But I guess it would fall out naturally if you tried to do the layering.

> Any form of I/O where the platform will have to wait requires bounded IO. It's really easy to show with: This is probably off-topic but your example does not show this at all, as far as I can tell. I understand the rest of your post though so maybe it is immaterial. We could take it to another thread or IRC if you think it's important.

In IRC we discovered that @benjamingr had meant to write "requires unbounded buffering" not "requires bounded IO", which I do indeed agree with.

@benlesh
Copy link

benlesh commented Jul 8, 2015

You called this FUD earlier so maybe you are not interested in talking about this rationally but I thought I'd try one more time.

@domenic I apologize, and I deleted that comment. I have a strong impression you don't want this proposal to succeed, so I'm likely to misread every one of your issues and comments as having that underlying goal.

That is a personal failing on my part that makes it hard to stay objective when reading your comments.

Again, apologies. I'll try to stay objective.

@jhusain
Copy link
Collaborator

jhusain commented Jul 8, 2015

@domenic @slightlylate made it clear to me that Chrome wants to be able to build EventTarget on top of Observable. I'm trying to meet that bar, but the the basic type signature must stabilize first. @domenic do you acknowledge that sync subscription is required to build EventTarget? If Chrome plans to block the proposal if they can't build EventTarget on top of Observable, then I think we have to move forward with a sync subscription option and close this issue.

JH

On Jul 8, 2015, at 8:03 AM, Domenic Denicola notifications@github.com wrote:

As for preventDefault, other than click all other events should not be affected by synchronous dispatch as events will not be treated as trusted and the default action will not fire anyway. This is why I used stopPropagation in my example of the issue instead.

Right. The supposition there is that you'd want to interoperate between .addEventListener and .subscribe uses, which is not 100% clear to me. But I guess it would fall out naturally if you tried to the layering.

Any form of I/O where the platform will have to wait requires bounded IO. It's really easy to show with:

This is probably off-topic but your example does not show this at all, as far as I can tell. I understand the rest of your post though so maybe it is immaterial. We could take it to another thread or IRC if you think it's important.


Reply to this email directly or view it on GitHub.

@benjamingr
Copy link
Author

This discussion seems to have died out - if anyone has anything to add let me know and I'll reopen.

@jhusain
Copy link
Collaborator

jhusain commented Jul 28, 2015

Reopening because as I'm putting together a summary of changes I'm somewhat confused by conclusion. It's clear we need sync subscription for EventTarget and for certain operators like GroupBy. It seems the original proposal of both an async and sync subscription option still works.

@zenparsing
Copy link
Member

@jhusain we should open a new issue and reference this one, because this issue has gotten really long.

#49

@arman-g
Copy link

arman-g commented Nov 14, 2019

In my case I do see one use case using it in angular interceptor to set a request header with bearer token if it is expired or does not exist in a local storage. The token request in the interceptor must be synchronous so that it can set the bearer token in the header before the async call starts.

@benjamingr
Copy link
Author

@arman-g in your case you could subscribe synchronously since Angular interceptors are allowed to be async and there would be no benefit from synchronous subscription.

Not that it really matters that much since this proposal is mostly dead and I am sure most of us (at least me) are pretty ashamed of our conduct and the way we chose to converse here :]

This conversation is a great example of seeking to be understood rather than to understand - it resulted in nothing, no one got smarter about whether or not firehosing or unbounded buffering (synchronous subscription) is important etc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants