Should unsubscribe invoke return on the observer? #14

Closed
zenparsing opened this issue May 29, 2015 · 55 comments

Comments

@zenparsing
Member

In Rx, observables created with Observable.create have the following unsubscribe (dispose) semantics:

When dispose is called:

  • Close the outer observer so that no more notifications will be sent to the inner observer.
  • Call the cleanup function and drop the reference to it so that it won't be called again.
  • The default cleanup function is a no-op.
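The dispose steps listed above can be sketched roughly as follows. This is not Rx's actual implementation: `createObservable` and everything inside it are hypothetical names used only for illustration, and the sketch models explicit dispose only (not cleanup after completion or error).

```javascript
// Hypothetical stand-in for Observable.create; not the real Rx implementation.
function createObservable(subscriber) {
  return {
    subscribe(inner) {
      let closed = false;
      // Outer observer: forwards notifications only while still open.
      const outer = {
        onNext(v) { if (!closed) inner.onNext(v); },
        onError(e) { if (!closed) { closed = true; inner.onError(e); } },
        onCompleted() { if (!closed) { closed = true; inner.onCompleted(); } },
      };
      // The default cleanup function is a no-op.
      let cleanup = subscriber(outer) || (() => {});
      return {
        dispose() {
          closed = true;       // close the outer observer
          const fn = cleanup;
          cleanup = null;      // drop the reference so cleanup cannot run twice
          if (fn) fn();
        },
      };
    },
  };
}

const log = [];
let emit;
const source = createObservable(observer => {
  emit = v => observer.onNext(v);
  return () => log.push("cleanup");
});

const subscription = source.subscribe({
  onNext: v => log.push("next " + v),
  onError: e => log.push("error"),
  onCompleted: () => log.push("completed"),
});

emit(1);
subscription.dispose();
emit(2);                 // ignored: the outer observer is closed
subscription.dispose();  // ignored: the cleanup reference was dropped
console.log(log);        // [ 'next 1', 'cleanup' ]
```

Note that the inner observer hears nothing at all on dispose, which is the behavior the rest of this issue questions.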

The current spec allows the user to call return from the cleanup function in order to shut down the observer's generator. If the user does not provide a cleanup function, then the default behavior will be to invoke "return" on the generator.

The rationale is that, when using a generator function's generator as the observer, we want to give the generator function a chance to clean up resources before we drop the reference to it.

For example:

function* generatorFunction() {
    let resource = getResource();
    try {
        while (true) resource.use(yield);
    } finally { resource.release() }
}

let generator = generatorFunction();
generator.next(); // Prime the generator

let subscription = sequenceOfLines.subscribe(generator);

// Unsubscribe after subscription has started
Promise.resolve().then(_=> subscription.unsubscribe());

// We would expect resources held by the generator to be released
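The expectation can be checked without any Observable at all. In this sketch `getResource` is a hypothetical stand-in that only records what happens to it, and the direct calls to `next` and `return` stand in for what a producer and `unsubscribe` would do under the current spec.

```javascript
const events = [];

// Hypothetical resource used only for illustration.
function getResource() {
  return {
    use(line) { events.push("use " + line); },
    release() { events.push("release"); },
  };
}

function* generatorFunction() {
  const resource = getResource();
  try {
    while (true) resource.use(yield);
  } finally {
    resource.release();
  }
}

const generator = generatorFunction();
generator.next();                   // prime the generator: runs to the first yield
generator.next("line 1");           // what a producer would do per notification
const result = generator.return();  // what unsubscribe would do under the current spec

console.log(events);  // [ 'use line 1', 'release' ]
console.log(result);  // { value: undefined, done: true }
```

If nothing ever calls `return` (or `throw`) on the generator, the `finally` block never runs and the resource is never released.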

@erights seemed to agree with this intuition.

The other point of view (the "Rx" view) is that calling "unsubscribe" is a way of notifying the producer that no more notifications should be sent (even "return"). Under this interpretation, it wouldn't make much sense to invoke "return", or anything else, after the consumer has explicitly given instruction not to.

What arguments can we come up with to resolve this apparent dilemma?

@zenparsing
Member Author

@jhusain I'm in the process of going through some of Rx's combinators to see if I can find some use cases which might help resolve this one way or another.

@benlesh

benlesh commented Jun 4, 2015

In RxJS there are three ways to shut down Observable:

  1. Observable completes successfully -> return/onCompleted is called, then cleanup
  2. Observable errors -> throw/onError is called, then cleanup
  3. explicit unsubscribe (aka dispose) -> cleanup is called

RxJS doesn't currently allow you to call onCompleted or onError from within the cleanup action. I mean, you can, but they don't do anything.

If prior art matters at all, that's the behavior as I understand it.

@zenparsing
Member Author

@Blesh That's my understanding as well - thanks for clarifying.

In RxJS, is there ever a situation where the observer might hold resources which need to be released when the subscription is finished (whether by calling dispose on the subscription, or otherwise)? If so, how is that handled?

@benlesh

benlesh commented Jun 4, 2015

In RxJS, is there ever a situation where the observer might hold resources which need to be released when the subscription is finished?

Not for Observers, no. Observers are basically objects with event handler functions on them.

@benlesh

benlesh commented Jun 4, 2015

Don't let this muddy the waters, but one thing that might need clarification if you see an Rx.Subject as an Observer... The Observer aspect of Rx.Subject is not what requires its dispose() method. It's the multicast Observable aspect of the Subject that requires it, and it simply disconnects its observers.

TL;DR: Subjects can be used as Observers, but aren't anything like the Observers discussed in this spec. They just implement the same interface.

@benjamingr

I'm not really sure what cleanup of the observer actually means. The observer is just the listener; what you need to clean up is typically the source.

The subscription function itself typically doesn't clean up things. In your example it'd make more sense to create a new observable from the old one that uses the data and then use the observable dispose semantics on that.

@zenparsing
Member Author

If the observer is a generator implemented with a generator function, then it's pretty easy to imagine a situation where the observer (i.e. the generator) holds resources which need to be cleaned up before the reference to the generator is dropped.

In ES6 we're careful to call return on iterators/generators pretty much anywhere we use the iteration protocol for this very reason. It would seem incongruous to not do the same thing here.

@erights @domenic have any thoughts to lend here? We're at a bit of a standstill at the moment.

@domenic
Member

domenic commented Jun 8, 2015

Nothing too helpful, just the observation that this mapping of generators to observables is looking ever more tenuous :-/

@zenparsing
Member Author

My understanding of a conversation with @jhusain yesterday:

In ES6, we conflate two types of continuation: "return" and "terminate". When receiving continuations via yield in a generator function, there would be no observable difference between the two: they both would trigger finally blocks.

However, when receiving continuations via callback (as we are attempting to do with Observable), there is a semantic difference between the two: "return" indicates successful completion of the producing code, whereas "terminate" would indicate unconditional termination of the producing code.

Rx's "onComplete" expresses the "return" semantic, but not the "termination" semantic.

On the other hand, for compositionality we still need to somehow feed a "terminate-ish" continuation into generators.

Does that sound about right @jhusain?

@jhusain
Collaborator

jhusain commented Jun 10, 2015

This is correct. Working on fleshing this explanation out in more detail. Will comment soon.

@jhusain
Collaborator

jhusain commented Jun 10, 2015

Mark and I worked on this today and I think we understand the space. I'm going to try and capture our consensus here.

First let's try to define unsubscription. Unsubscription is the graceful termination of a function by either consumer or producer, producing no return value. The word "subscription" is suboptimal, because while it has event connotations it is applicable to both iteration and observation. As a result I will henceforth refer to unsubscription as termination.

On the surface it would definitely appear as though termination was akin to return(undefined). After all this is what happens when you terminate iteration today:

for(let x of xs) {
  if (x == 3) {
    break;
  }
}

...translates to...

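var iter = xs[Symbol.iterator]();
var iterResult;
while(!(iterResult = iter.next()).done) {
  var x = iterResult.value;
  if (x == 3) {
    // return is optional on iterators, so only call it when present
    if (iter.return) iter.return();
    break;
  }
}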

Invoking return(undefined) would seem to very closely approximate termination. Returning undefined ends the function gracefully and produces no return value. However note that mapping the notion of termination to return(undefined) means that it is impossible to distinguish after the fact whether the generator function has been terminated, or has completed successfully with an undefined result.

// called after breaking out of a for...of loop
// iteration completed or terminated?
iter.next() // {done: true, value: undefined }
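The ambiguity is easy to reproduce. A small sketch, using a throwaway generator named `numbers` (an illustrative name, not from the proposal): one instance is terminated early by the consumer, the other runs to completion, and afterwards the two are indistinguishable through `next()`.

```javascript
function* numbers() {
  yield 1;
  yield 2;
  yield 3;
}

// Case 1: the consumer terminates early (what `break` in for...of does).
const terminated = numbers();
terminated.next();
terminated.return();

// Case 2: the producer runs to completion.
const completed = numbers();
for (const x of completed) { /* consume everything */ }

const afterTermination = terminated.next();
const afterCompletion = completed.next();

console.log(afterTermination);  // { value: undefined, done: true }
console.log(afterCompletion);   // { value: undefined, done: true }
```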

The inability to distinguish termination is not a problem as long as neither party cares. Thus far in iteration no such distinction has been necessary. However in user-land libraries the distinction between unsubscription and completion is regularly made. In user land, Observable libraries do not invoke any callbacks after unsubscription. The closest semantic to the generator's return() is the Rx Observer's onCompleted() which is not invoked when an Observable is unsubscribed. I am trying to determine whether this is necessary, or just how the implementation shook out.

If it is indeed necessary to distinguish between termination and returning an undefined value, the solution may be to add a new terminate semantic to generators. I believe this can be done in a backward-compatible way. However I don't want to discuss it too much until I can prove that the distinction between termination and returning undefined is truly important. I am collecting motivating cases at the moment.

@zenparsing
Member Author

It seems to me that the semantics of forEach wants the "terminate" signal: the returned promise should resolve even in the event of an unsubscription.

This is a somewhat contrived example which I think demonstrates the point:

let dataSource = new Observable(sink => {

    setTimeout(_=> {

        // After 5s, send some data and stop
        sink.next(1);
        sink.return();

    }, 5000);
});

// A combinator that unsubscribes after a given number of seconds
function expireAfter(seconds) {

    return new Observable(sink => {

        let sub = this.subscribe(sink);

        // Unsubscribe after a number of seconds
        setTimeout(_=> { sub.unsubscribe() }, seconds * 1000);

        return sub;
    });
}

async function af() {

    // Expire the subscription after 1 second, and wait for the stream to finish.
    // Even though the inner subscription has been cancelled, the promise should
    // still resolve.
    await dataSource::expireAfter(1).forEach(x => console.log(`Got data: ${ x }`));
    console.log("done");
}

af();

Of course, in expireAfter one could simply call sink.return() before unsubscribing, but then we're back to the same issue with the combinator "lying" about the end-of-stream signal.

@zenparsing
Member Author

Attempting to record some of the movement on this...

The suggestion was made by @jhusain to make the observable a "disposable", in the .NET sense of the word. If we think of the observer as a state machine which holds resources which must be cleaned up, then this approach appears to make sense. On unsubscription, instead of calling observer.return() we would instead attempt to call observer[Symbol.dispose](). In order to make this work with generator-function (GF) generators, we add the following logic:

  1. If the observer has a Symbol.dispose method, then call observer[Symbol.dispose]().
  2. Else, if the observer has a return method, then call observer.return().
  3. Else, do nothing.
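A rough sketch of that dispatch logic. `Symbol.dispose` was only a suggestion at this point, so the sketch uses a local symbol named `dispose` as a hypothetical stand-in; `terminateObserver` is likewise an illustrative name.

```javascript
// Hypothetical stand-in for the suggested Symbol.dispose well-known symbol.
const dispose = Symbol("dispose");

function terminateObserver(observer) {
  if (typeof observer[dispose] === "function") {
    return observer[dispose]();   // 1. prefer the dispose method
  }
  if (typeof observer.return === "function") {
    return observer.return();     // 2. fall back to return (e.g. GF generators)
  }
  // 3. otherwise do nothing
}

const calls = [];

// Observer with both methods: only dispose is called.
terminateObserver({
  [dispose]() { calls.push("dispose"); },
  return() { calls.push("return (not reached)"); },
});

// Generator observer: has return but no dispose method under this local symbol.
function* gen() {
  try { while (true) yield; } finally { calls.push("generator finally"); }
}
const g = gen();
g.next();                // prime
terminateObserver(g);

// Observer with neither method: no-op.
terminateObserver({ next() {} });

console.log(calls);  // [ 'dispose', 'generator finally' ]
```

The second case is where `return` ends up with two meanings: for the generator it means "terminated", while for the first observer `return` is reserved for successful completion.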

This concept seemed promising to me, but after working with it over the weekend I found the following issues:

  • Adding a "disposable" type to ES is actually a broader TC39 proposal than adding Observable. It seems like Observable would be blocked on Symbol.dispose.
  • Adding a fourth method imposes a burden on combinator authors; they must implement all four semantics in order to offer the correct forwarding and handling behavior.
  • If we add the dispatch logic above, then we give return two different meanings depending on the context of the containing observer object. This seems bad; for simplicity we want each method to have one clear meaning.
  • If we are serious about implementing the "disposable" pattern, then "dispose" should be invariably called before dropping the reference to the observer. In other words, we need to call "dispose" after calling "throw" or "return", or after the observer returns a done result. In that case, the solution overshoots the problem. We don't really need a generic disposal protocol. We only need a way to represent cancellation to the consumer.

I had some more thoughts on this last night that might be interesting. I'll drop an additional comment in a bit.

@benjamingr

What's the difference between a return and a Symbol.dispose?

(I thought return was dispose)

@zenparsing
Member Author

@benjamingr That was my original understanding as well : )

Hypothetically, an observer might want to respond differently depending on whether the sequence was successfully completed or simply cancelled (via unsubscribe). If we use return for both cancellation and completion, then users who want to tell the difference will need to "reassemble" that information somehow.

In RxJS, onComplete is not called on unsubscription. The observer is left hanging, with the assumption that whoever called unsubscribe would be responsible for performing any related cleanup. If we take that approach though, our mission of using generators as observers kinda falls apart a bit.

I feel like there's a simple solution that we're overlooking.

@benjamingr

I think the lack of distinction is important - but I'll think about it more.


If only we knew the person with just about the best understanding of observables and the duality between iterator-iterable and observer-observable, and if only that someone had also recently agreed to help :D

Summoning @headinthebox, help us: our duality is breaking with iterators and observable cancellation. This is exactly where a few words from you could save the bunch of us a lot of time :)

@headinthebox

The role of disposables in the actual dualization has always been kind of handwavy (if you watch my talks, you will see that I grey out IDisposable and then pull a rabbit out of my hat to re-introduce it).

While re-implementing Rx from scratch recently for Mobile, I do think that I finally have enough of a grip on it to explain it formally, but I am not sure that explanation fits the road you have taken in JavaScript, with multiple ways to achieve successful termination, for exactly the reasons explained in this thread. Those multiple ways make a clean dualization harder to achieve.

What I do not really understand in the discussion above is how the actual flow of things is supposed to work in JavaScript. In Rx, the producer calls onNext, onCompleted, and onError and the consumer receives those callbacks.

The consumer is the one who calls unsubscribe, either automatically by convention (http://download.microsoft.com/download/4/E/4/4E4999BA-BC07-4D85-8BB1-4516EC083A42/Rx%20Design%20Guidelines.pdf) after receiving onError or onCompleted, or out of band (via the subscription returned from subscribe), which signals to the producer that the consumer does not want any more notifications.

Notifications flow from producer to consumer, cancellation flows from consumer to producer.

It does not make sense to me to call onCompleted after unsubscribing, because that would (a) signal the stream terminated successfully, which is not the case since the stream might still be producing values, and (b) it would trigger an auto-unsubscribe, which would then call onCompleted again and cause an infinite loop.

(In some sense using .NET cancellation tokens and cancellation token sources https://msdn.microsoft.com/en-us/library/dd997289(v=vs.110).aspx in Rx would be cleaner than disposables. But that is another discussion)

@benlesh

benlesh commented Jun 25, 2015

It does not make sense to me to call onCompleted after unsubscribing, because that would (a) signal the stream terminated successfully

I agree with this statement completely. The consumer really shouldn't be telling the producer that it completed successfully.

I think this problem with duality in JavaScript stems from the idea that generators can't be "unsubscribed" from by a consumer in a way that would trigger a tear down of scarce resources. I mean, you could simply stop calling next on the iterator, but then the generator's finally block will never be hit.

As such, a consumer is then forced to signal its desire to unsubscribe by calling return on the iterator, or by nexting in a value for the generator to make a decision on (which would likely result in the generator returning).

Perhaps the problem is that there is a fundamental flaw in the design of generators that makes duality difficult in this case?

@benjamingr

I think this problem with duality in JavaScript stems from the idea that generators can't be "unsubscribed" from by a consumer in a way that would trigger a tear down of scarce resources. I mean, you could simply stop calling next on the iterator, but then the generator's finally block will never be hit.

As such, a consumer is then forced to signal its desire to unsubscribe by calling return on the iterator, or by nexting in a value for the generator to make a decision on (which would likely result in the generator returning).

ELI5, how is .return on an iterator not exactly the same as unsubscription sand the name. I was under the impression this is what the duality is.

@benjamingr

*sans, also the first dot should be a question mark, also GH's mobile interface is terrible :D

@benlesh

benlesh commented Jun 25, 2015

In current Rx, you can terminate the producer/consumer relationship in three ways:

  1. success: producer completes successfully (return or onCompleted).
    • success handler called
    • clean up performed
  2. failure: producer emits an error (throw or onError).
    • error handler called
    • clean up performed
  3. unsubscription: the consumer notifies the producer that it's just done.
    • clean up performed

The thing we're toying with here is whether or not that last one should exist. I'm on the fence, honestly. I can't think of many use-cases for unsubscription that couldn't be covered by success or failure paths.

That is coupled with the fact that there's really no way to unsubscribe from a current generator. You can't just stop calling next on the generator and expect it to magically know to run its finally block. So you must call return or throw.

@benlesh

benlesh commented Jun 25, 2015

... So while I agree with @headinthebox that it's weird, maybe it's not so weird? haha. Now I'm flip-flopping.

I can't come up with a concrete example where a non-success, non-failure termination is helpful. Especially when you can pass a value to return and analyze it.

One thing is clear though, the clean up is coupled to the subscription. It's my opinion, after having tried to implement the latest ideas behind this spec, that the subscription should actually be an observer/generator instance, since at the end of the day, if unsubscription is calling return under the hood, then unsubscribe is really just an alias for return, and it might pose some value to have next and throw available as well. Also during the implementation, you realize pretty quickly that Observer and Subscription are tightly coupled and should probably be one class. Again, just my opinion. (which changes daily, heh)

@headinthebox

OK, now I am confused. Number 3 is extremely common. For instance, say a consumer only wants to receive 3 elements from the producer. Then after 3 onNext's it unsubscribes from the source (and possibly calls onCompleted() on itself, but not always; for example, in many event processing scenarios you just unsubscribe from the source, just like you can simply remove an event handler from an event source).
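A minimal sketch of that "take three and leave" scenario. `createSource` is a hypothetical event source written only for this illustration (it is not an Rx API); the point is that the consumer walks away after the third value and no completion callback fires.

```javascript
// Hypothetical minimal event source; names are illustrative only.
function createSource() {
  const observers = new Set();
  return {
    emit(v) { for (const o of [...observers]) o.onNext(v); },
    subscribe(observer) {
      observers.add(observer);
      return { unsubscribe() { observers.delete(observer); } };
    },
    get observerCount() { return observers.size; },
  };
}

const source = createSource();
const received = [];
let completedCalled = false;

const subscription = source.subscribe({
  onNext(v) {
    received.push(v);
    // The consumer has had enough: unsubscribe, like removing an event handler.
    if (received.length === 3) subscription.unsubscribe();
  },
  onCompleted() { completedCalled = true; },
});

[1, 2, 3, 4, 5].forEach(v => source.emit(v));

console.log(received);              // [ 1, 2, 3 ]
console.log(completedCalled);       // false
console.log(source.observerCount);  // 0
```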

@benjamingr

@Blesh

unsubscribe is really just an alias for return,

Yes, exactly.

In my opinion we resolve a lot of ambiguity if we call "disposing" or "cancellation" what it really is: "disinterest in the sequence from the consumer, allowing it to perform clean up if it's interested". This is essentially what return does, this is what unsubscribe does, and this is what dispose does in a weak sense.

@headinthebox

Number 3 is extremely common.

Yes, but is unsubscribe inherently different from a generator's return?

In JavaScript, yield is an expression (that is, you can write var a = yield b), and the iteration protocol specifies a .return method and a .throw method to signal that we're done with the sequence or that we're signalling an error back to the sequence. That is, the consumer can (with iterators) signal that it is disinterested in the sequence, or that the sequence should go to the error state.

When the iterator is also a generator, .return runs all finally blocks (performing cleanup), and throw throws an exception that can be caught (and also thus runs finally blocks). The running of finally blocks in return seems to parallel the cleanup nature and the disinterest-more-than-explicit-request-to-dispose semantics of unsubscription.

Promises (as tasks), for example, utilize this: when a yielded promise rejects, the coroutine runner .throws into the generator, and when a promise originating in a coroutine (the return value of an async function) is cancelled, .return is called in order to prevent additional actions and to force cleanup.

Does this make sense to you?

@headinthebox

Excuse my ignorance of the subtleties of JavaScript generators, but with iterators does the consumer call return if it wants the producer to stop? Is that like in python where you throw an exception from the consumer to the producer?

@benjamingr

@headinthebox

with iterators does the consumer call return if it wants the producer to stop? Is that like in python where you throw an exception from the consumer to the producer?

I've written a short summary of the protocol for you - hope it's not too long.

The current (and final) design in JavaScript (the one that browsers follow) is the following protocol:

  • Iterable<T> - has a Symbol.iterator method that when called returns an iterator instance.
  • Iterator<T> - an iterator for type T, typically obtained via calling Symbol.iterator, has the following methods:
  • .next() (must implement), calling this returns an object with the following properties:
    • done - true or false, indicating whether the iterator is done iterating the sequence.
    • value - the value of the next item in the iteration. See example in Note 1, note that even if done is true the iterator may yield a value.
  • .throw(exception) (optional, generators implement this) - throws a value into the sequence, signalling an error. See note 2 for example. It also progresses the iterator and returns an object with {done, value}
  • .return(value) (optional, generators implement this) - returns from the iterator, signals that we're not interested in any more values from the source. See example in Note 3. Also returns an iteration result with done:true and the value, runs finally clauses.

Note 1: The iteration protocol on an array

var arr = [1,2,3,4,5];
var iterator = arr[Symbol.iterator]();
iterator.next(); //Object {value: 1, done: false} 
iterator.next(); //Object {value: 2, done: false} 
iterator.next(); //Object {value: 3, done: false} 
iterator.next(); //Object {value: 4, done: false} 
iterator.next(); //Object {value: 5, done: false} 
iterator.next(); //Object {value: undefined, done: true}
iterator.next(); //Object {value: undefined, done: true}
iterator.next(); //Object {value: undefined, done: true}
// and so on

Note 2: The iteration protocol with throw on a generator.

function* gen(){
    var i = 0;
    try{
           while(true){ 
                yield i++;
           }
    } catch (e) {
        return "Done";
    }
}

var iterator = gen();
iterator.next(); // Object {value: 0, done: false}
iterator.next(); // Object {value: 1, done: false}
iterator.next(); // Object {value: 2, done: false}
iterator.next(); // Object {value: 3, done: false}
iterator.next(); // Object {value: 4, done: false}
iterator.throw(new Error()); // Object {value: "Done", done: true}
iterator.throw(new Error()); // VM339:2 Uncaught Error (exception propagated to upper scope)

Note 3: Iterator with return:

function* gen(){
    var i = 0;
    try{
           while(true){ 
                yield i++;
           }
    } finally {
        console.log("Hi");
    }
}
var iterator = gen();
iterator.next(); // Object {value: 0, done: false}
iterator.next(); // Object {value: 1, done: false}
iterator.next(); // Object {value: 2, done: false}
iterator.next(); // Object {value: 3, done: false}
iterator.return(); // not implemented in Chrome yet; logs "Hi" and returns {value: undefined, done: true}

@zenparsing
Member Author

@headinthebox Thanks for jumping in here.

Yes, in JS iteration, the consumer calls "return" when early-exiting (for example when breaking out of a for-of).

In JS, iteration is really two cooperating sequences: a request sequence and a response sequence. The request sequence is a series of next(undefined) calls on the iterator, with a terminating return(undefined) call to indicate early-exit if the iterator's response has not already indicated done-ness (with a { done: true } IterationResult).

Generator functions will then have their finally blocks executed when return is called on their generators.

Presumably, the dual of this setup works like so: the producer sequence is a series of next calls with a terminating return or throw. The consumer can return an IterationResult from these calls to indicate done-ness.

Maybe this little sketch will help:

Sequence: 1, 2, 3, where consumer indicates break after receiving 3

Iteration

Consumer            Producer
=============================================
next(undefined)     { value: 1, done: false }
next(undefined)     { value: 2, done: false }
next(undefined)     { value: 3, done: false }
return()            [ignored]

Observation

Producer    Consumer
=============================================
next(1)     { value: undefined, done: false }
next(2)     { value: undefined, done: false }
next(3)     { value: undefined, done: true }

I agree that it doesn't really make sense from a duality perspective to call return on the consumer when early-exiting with unsubscribe. On the other hand, it seems like finally blocks (cleanup) should run on the observer in the event of cancelation. Or at least that cancelation signal should be allowed to propagate somehow.
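The "Observation" table above can be sketched as a producer that inspects what the consumer returns from `next`. `observeArray` is a hypothetical helper written for this illustration, not part of the proposal; it assumes the producer honors a `{ done: true }` result by stopping without calling `return`.

```javascript
// Hypothetical synchronous producer: pushes values until the observer reports done.
function observeArray(values, observer) {
  for (const v of values) {
    const result = observer.next(v);
    if (result && result.done) return "stopped by consumer";
  }
  if (typeof observer.return === "function") observer.return();
  return "completed by producer";
}

const seen = [];
let returnCalled = false;

const outcome = observeArray([1, 2, 3, 4, 5], {
  next(v) {
    seen.push(v);
    return { value: undefined, done: v === 3 };  // break after receiving 3
  },
  return() { returnCalled = true; },
});

console.log(seen);          // [ 1, 2, 3 ]
console.log(outcome);       // stopped by consumer
console.log(returnCalled);  // false
```

This covers synchronous early exit only. Asynchronous cancellation through the subscription has no counterpart in this picture, which is where the question of what (if anything) to call on the observer comes from.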

@benjamingr

@zenparsing

I agree that it doesn't really make sense from a duality perspective to call return on the consumer when early-exiting with unsubscribe.

I'm sorry for being the dumb guy in the room of smart people - but I still don't understand why. It seems like .return is an excellent dual for unsubscribe, the only thing different is the name.

@zenparsing
Member Author

@benjamingr If you're in the room, you're smart : )

return(x) is the dual of the IterationResult { value: x, done: true }. And "dual" in this case just means we've swapped a method call for a return value without changing the meaning.

Now that you've got me thinking about it, I fudged a little bit on that sketch, where I put in "[ignored]". It's true that for-of ignores whatever comes back from the return call, but in the typical case it would be this:

Consumer            Producer
=====================================================
next(undefined)     { value: 1, done: false }
next(undefined)     { value: 2, done: false }
next(undefined)     { value: 3, done: false }
return()            { value: undefined, done: true }*

*Ignored by for-of

So I suppose in the common case, early-exit does result in the producer sending a termination signal. Wow - thanks!

@jhusain
Collaborator

jhusain commented Jun 25, 2015

This proposed change to the proposal agreed upon in the last meeting has raised a few important concerns:

  1. Why is the proposed tweak on the Observable type not the strict dual of the ES2015 Iterable?
  2. Why do we need a dispose method, when we already have return(undefined)?
  3. If we use a @dispose symbol, will that use up a symbol that would've been used in the future by some general purpose finalization scheme?

Let me start by stating my opinion on a few matters:

I believe that Iterable and Observable are dual. However in ES2015 the Iterable contract is (sensibly) simplified, removing semantics that are not useful in the context of iteration.

I also believe that there is meaningful difference between the successful completion of observation with no value (return(undefined)) and the termination of observation triggered by a consumer (@dispose()). Furthermore I believe there are compelling use cases for consumers to be able to ergonomically differentiate between Return and Termination. These use cases don't appear to be present in Iteration.

First let's address the question of duality. Note that the Observable type approved in the last meeting already broke from strict duality with the ES2015 Iterable type.

type ES2015Generator = ( Next v | Error e | Return v ) -> ( Next v | Error e | Return v )
type ES2015Iterable = () -> ES2015Generator
type ES2016Observable = ES2015Generator -> Subscription

Note the introduction of the Subscription in the proposed Observable type, whereas the strict dual of the ES2015Iterable would be ES2015Generator -> (). The inclusion of the Subscription object allows the consumer to asynchronously unsubscribe, and allows the Observable to model most of the web's observation APIs.

The subscription introduces a new semantic to observation: termination. Termination gracefully terminates observation, producing no return value. Termination is distinct from Return. The former is an indication that the consumer would no longer like to participate in observation. Return indicates the successful completion of the stream of values.

Should we be concerned about the fact that Iteration apparently doesn't have a termination semantic? Does this mean we aren't the dual of Iterable any more?

No. We have simply left the terminate semantic out of iteration, because there are no use cases for a producer to Terminate instead of Return. That is why we can cancel out the Subscription object in ES2015Iterable.

type ES2015Iterable = Subscription () -> ES2015Generator

However when we take the dual of Iterable (Observable), the Subscription is sent to the consumer rather than the producer. There appear to be valid real-world use cases for consumers to differentiate between Terminate and Return. I'll get to that in a moment. First let me restate the original design, and the proposal for changing it.

In the last TC-39 meeting Kevin and I proposed that an unsubscription would result in a Return message being sent to the generator. This gives generators the opportunity to clean up scarce resources in the event of unsubscription.

var sub = observable.subscribe(function*(){
  var outputFile = open("output.txt");
  var token;
  try {
    while (token = yield) {
      outputFile.write(token);
    }
  }
  finally {
    outputFile.close();
  }
})

sub.unsubscribe() // calls generator.return() and falls through finally blocks. Note that this process mirrors the premature exit of a for...of loop, in which the return method of the generator is also invoked.

All of this lines up quite nicely with the way Iteration works in ES2015. So why am I proposing a change?

In the design proposed at the meeting, consumers cannot differentiate between a return and a terminate. When you unsubscribe we send a return(undefined) to the consumer. However the consumer would've received the exact same message if the producer successfully completed and returned a value. Is this loss of semantic visibility important?

Let's imagine a sequential set of requests, one that attempts to retrieve data from a cache, and another attempts to retrieve data from a data store if no data was found in the cache.

var sub = getEVCache("customer", 163).
  subscribe({
    return(v) {
      // undefined means "not found in the cache": fall back to the data store
      if (v == null) {
        getDB("customer", 163).
          subscribe({
            return(v) { animateIn(v); }
          });
      }
      else { animateIn(v); }
    }
  });

backButton.click = function() { sub.unsubscribe() ; };

Note in the example above that undefined is an actionable value: it means that the data is not found in the cache, and we should go ahead and try and retrieve it from the data store. However if the back button is pressed during the first request we don't want to trigger a second. This is the difference between terminate and return.

Of course it is possible for consumers to use state to differentiate between Terminate and Return. Unfortunately this is not ergonomic.

var unsub = false;
var sub = getEVCache("customer", 163).
  subscribe({
    return(v) {
      if (unsub) return; // terminated by the back button, not a successful return
      if (v == null) {
        getDB("customer", 163).
          subscribe({
            return(v) { animateIn(v); }
          });
      }
      else { animateIn(v); }
    }
  });

backButton.click = function() { unsub = true; sub.unsubscribe() ; };

Note that every time the return function is invoked asynchronously we need to check state to differentiate between termination and a successful return. Using state to coordinate asynchronous programs is a foot gun, and one that can be avoided if we communicate the difference between terminate and return to the consumer by invoking @dispose on the generator. If unsubscription invoked @dispose on the generator, a generator holding scarce resources could free them, and consumers that wanted to proceed only on successful completion could do so ergonomically without introducing any state. That's the idea behind the proposed change.

Here is the proposed change to the current design agreed-upon in the last meeting.

type ES2016Generator = (Next v | Error e | Return | Terminate) -> (Next v | Error e | Return v)
type ES2016Observable' = ((Next v | Error e | Return | Terminate) -> (Next v | Error e | Return v)) -> Subscription

Note the inclusion of the Terminate input in the generator. This maps to a call to a @dispose method on the generator object. This method would be invoked on unsubscription.

var sub = observable.subscribe({
  @dispose() {
    console.log("observation has been terminated")
  }
});

sub.unsubscribe();

// prints "observation has been terminated"

The proposed introduction of the terminate semantic on the Generator (dispose method) on the observer corresponds to the introduction of the Subscription. When an unsubscription occurs, the Generator is sent a Terminate message (dispose).

If I had things to do over again, I would likely have introduced a @dispose method to generators in ES2015. It would have been more correct to invoke dispose() rather than return(undefined) when a consumer chose to terminate (i.e. break out of a for...of). This would allow a producer to distinguish between Terminate and Return. Regardless, this ship sailed in ES6 when we decided to invoke return(undefined) when breaking out of a for...of loop. Changing this behavior would be a breaking change at this point. The good news is that there don't appear to be any real-world use cases for the producer distinguishing between Terminate and Return during Iteration.

Kevin also mentioned that he is concerned that using the @dispose symbol might conflict with a future general-purpose finalization proposal for JavaScript, and I agree. I also believe that the behavior of trying to detect the existence of @dispose, and if not present, invoking return is odd. This is really a shortcut, and we should be doing the right thing: adding a terminate method to generators produced by generator functions in ES2016.

My revised proposal is this: add a @terminate method to generators produced by generator functions in ES2016. This method will go through finally blocks, but produce no return value. Using @terminate rather than @dispose will allow future general-purpose finalization schemes to use @dispose. A break within a for...of will still call return(undefined) to respect the fact that we did not introduce termination semantics to iteration in ES2015. However when invoking unsubscribe on a subscription, we will invoke the @terminate method on the generator if present. If the @terminate method is not present, we will not invoke any generator methods.
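The unsubscription rule in the revised proposal can be sketched roughly as follows. This is only an illustration under stated assumptions: `subscribeSketch` is a hypothetical helper rather than the spec's API, and it uses a plain `terminate` string key where the proposal describes a symbol-keyed method.

```javascript
// Hypothetical helper illustrating the proposed semantics; not the spec API.
function subscribeSketch(producer, observer) {
  let closed = false;
  const cleanup = producer({
    next(v) {
      if (!closed && observer.next) observer.next(v);
    },
    return(v) {
      if (closed) return;
      closed = true;
      if (observer.return) observer.return(v); // successful completion
    },
  });
  return {
    unsubscribe() {
      if (closed) return;
      closed = true;
      if (typeof cleanup === "function") cleanup();
      // Unsubscription never calls return(); it calls terminate() if present,
      // and otherwise invokes no observer method at all.
      if (typeof observer.terminate === "function") observer.terminate();
    },
  };
}

const events = [];
const subscription = subscribeSketch(
  (sink) => {
    sink.next(1);
    return () => events.push("cleanup");
  },
  {
    next(v) { events.push("next " + v); },
    return(v) { events.push("return " + v); },
    terminate() { events.push("terminate"); },
  }
);

subscription.unsubscribe();
console.log(events); // ["next 1", "cleanup", "terminate"]
```

With this shape the consumer's return handler only ever fires on real completion, so no extra flag is needed to tell the two cases apart.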

@benjamingr

@jhusain thanks for the read, from the top down:

I also believe that there is meaningful difference between the successful completion of observation with no value (return(undefined)) and the termination of observation triggered by a consumer (@dispose()).

Talking about iterators, you can distinguish explicit termination via .return vs successful completion (next() called until it returned {value:..., done:true}).

Furthermore I believe there are compelling use cases for consumers to be able to ergonomically differentiate between Return and Termination. These use cases don't appear to be present in Iteration.

Can you please explain this? Talking about iteration - I thought return was unnatural (unsubscribe/dispose/etc) termination where natural termination is just calling the iterator until it's done and returns a {done:true, value:...} result.

Note the introduction of the Subscription in the proposed Observable type, whereas the strict dual of the ES2015Iterable would be ES2015Generator -> ().

Backtracking to my question, why is that different from a .return on a generator? The fact we're using a generator to create the observer doesn't mean the observable's interface has to be affected.

The ES2015Generator here is the source (observable) and the subscription is the observer. The duality should be between the subscription we speak of and the iterator; what should be dual to the iteration protocol is the subscription and not the source.

The fact the source of the observable is created with a generator is nice, but unrelated to the duality.

I think that if we read the rest of your comment with that in place (the iterator duals the observer of the observable and not the observable itself) - everything makes sense without additional semantics. What am I missing?

@zenparsing
Member Author

@jhusain This proposal feels good to me. Do you think we need to make terminate a symbol though? I don't think there will be backward compatibility issues if we make it a normal string name, and it would provide better usability and better consistency:

observable.subscribe({
    return() { console.log("Completion is so nice!") },
    terminate() { console.log("You unsubscribed me you scalawag!") },
});

But (as always) I might be missing something.

@benjamingr

@zenparsing do you understand the duality proposed in that model? I don't understand why it's not (cc @jhusain):

type FullIterable (e.g. generator) = () -> ES2015Iterator
type ES2015Iterator = ( Next::v -> IterResult | Error:: e -> IterResult | Return::Maybe[v] -> IterResult)
type ES2016Observable (e.g. created with ES2015Generator) = () -> ESObserver
type ESObserver = (onNext::v -> ?? | onError:: e -> ?? | return:: v -> ??)

This makes more sense and is directly dual with observable/observer and iterable/iterator. You have parallels for regular completion, termination (for cancellation/dispose for example), next (vs onNext).

The only thing I'm not sure about is what onNext returns. With real duality, it'd be

onNext:: IterResult -> v

Where you get an iterResult (no onCompleted callback on Observers) and the onNext parameter is a {value, done} pair telling you both if the observer is done and the next value. The return value gets sent back to the observer and it can react based on it (in a push way)

onError:: IterResult -> v

Tells you what value the observer is after the error, and lets you recover from it (since control is inverted), the return value lets you recover from it.

 return:: Maybe[v] -> IterResult

Calling return terminates the subscription, its job is to act like a generator's return signaling disinterest and unsubscription (like a dispose and a terminate). You can return a value to it, note that unlike onError and onNext return keeps its signature in the duality like Dispose - it does not become any more push than the iterator version is actually pull, it gives the observer a chance to synchronously provide the consumer any last result it might not have been given the chance to signal through onNext.

As for sending explicit undefined vs sending a value, we have the same issue with generators and it has never been a real problem, you can easily detect arguments.length and check if undefined was sent vs. no value or you can send a Symbol.void if you don't want a value.
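As a small sketch of the arguments.length check mentioned above (`describeReturn` is a hypothetical name used only for illustration), a non-arrow function can tell an omitted argument apart from an explicit undefined:

```javascript
// Hypothetical observer-side handler; only the arguments.length check matters.
function describeReturn(value) {
  // arguments.length counts the arguments actually passed, so it is 0 for
  // describeReturn() and 1 for describeReturn(undefined).
  return arguments.length === 0
    ? "no value sent"
    : "value sent: " + String(value);
}

console.log(describeReturn());          // "no value sent"
console.log(describeReturn(undefined)); // "value sent: undefined"
```

Note that this relies on `arguments`, which is not available inside arrow functions.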

Now, All I did was reverse the arrows, from what I understand from @headinthebox this is how you get duality. I didn't change any contract and what I got back seems to be able to fulfil unsubscription semantics pretty well.


I also want to point out that how the observable is created (with a generator or not, for example through a native function call) is completely irrelevant to the duality.

@domenic
Member

domenic commented Jun 26, 2015

I think @benjamingr has an important insight that if the focus is on duality, we should be indeed embracing that in a more formal way, instead of the somewhat-handwavey way that we have now. Having a signature of next({ value, done }) instead of next(value) seems obvious, from this perspective. And indeed, from this perspective, the attempt to make return map to C#'s OnCompleted seems just wrong, since in JS we don't use MoveNext() -> boolean and Current -> T, but instead next() -> { value: T, done: boolean }.

@benjamingr

I think @benjamingr has an important insight that if the focus is on duality, we should be indeed embracing that in a more formal way, instead of the somewhat-handwavey way that we have now.

Thanks. I just want to point out again that the duality itself has to do with the observable/observer and not how they're created. The same way that an iterator is determined by how it's implemented and you can implement your own iterators without anything to do with generators. The same way generators are not required (or even important) to the duality neither are the specifics of observable sources.

The fact that the standard way to create an observable in the spec is by passing a subscriber argument to observable isn't part of the duality just like the promise constructor being used to create promises isn't part of what a promise is and that many APIs that return promises don't use it. Similarly if APIs in the future use observables the users would not even be aware of a subscriber calling next, regardless of the structure of the subscriber parameter using next(value) or next({done, value}) the observer should send {done, value} pairs if it is to dual iterators.

@jhusain
Collaborator

jhusain commented Jun 27, 2015

Seems like there are two claims being made here:

The first is that we do not have duality.

The second is that we need to take the dual of each method on the JavaScript Generator type to ensure duality.

I believe both of these claims stem from a simple misunderstanding. Evaluating duality is a useful exercise because it validates that the Observable type supports all of the same combinators as the Iterable. Once we have validated that the types are dual, we should craft an API that exposes the semantics in an ergonomic way. The former is an objective process. The latter is a subjective process that takes into account the idioms of the language, and the expected usage of the API for target use cases.

Duality is easier to perform on one function than many, and so it is simpler to evaluate duality by expressing the Generator as a function type. The simplest definition of a Generator as a function type is this:

(Next v | Throw e | Return v) -> (Next v | Throw e | Return v)

Every interaction between producer and consumer can be modeled using this function. A sender can send a value, an error, or a return value, which can be answered with a value, an error, or a return value.

Hopefully it is clear that there are a large number of different ways in which we can map these semantics to methods on an object without changing the fundamental semantics. The design of Generator in ES2015 is a series of pragmatic and stylistic decisions that take into account the pre-existing idioms of the language, and the ergonomics of the API for intended use cases. In JavaScript the decision to model a generator as an object with multiple methods instead of a function was a pragmatic one given the pervasiveness of object-oriented programming in the language. These subjective design decisions have no impact on the underlying semantics and therefore the duality.

JavaScript doesn't have discriminated unions like ML, but it captures the (Next v | Return v) part of (Next v | Error e | Return v) within the IterationResult returned by the call to next():

interface IterationResult {
  done: bool, 
  value: Object
}

The semantics could have been conveyed with a raw value and a "done" property on the Iterator as in other languages like C#. Once again this was a subjective design decision, and simply a different way of expressing the same basic semantics. The Error semantic is captured by the fact that every JavaScript function may throw. Note once again that the decision to use the exception throwing capabilities of the function rather than add an "error" member to the IterationResult is a sensible but subjective design consideration that takes into account the language's pre-existing idioms.

The decision to invoke next(), throw(), or return() instead of passing an IterationResult to one function was also completely arbitrary. It would've been just as expressive to pass a single IterationResult with an optional error property as an input to a single function.

In other words calling any of these functions...

generator.next({value:3, done:false}) or generator.next({value:3, done:true}) or generator.next({error: new Error(), done: true})

...is no more or less semantically expressive than calling one of these following functions:

generator.next(3) or generator.return(3) or generator.throw(new Error());

Once again these design decisions are based on subjective considerations about how to expose the underlying semantics. It is neither easier, nor more instructive to attempt to validate duality by taking the dual of the arbitrarily-designed API of an ES2015 Generator.
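The equivalence between the two calling styles can be illustrated with a small adapter. This is a sketch only: `toSingleFunction` and `echo` are hypothetical names, and the record shape ({value, done} plus an optional error) follows the description above rather than any real API.

```javascript
// Hypothetical adapter: wraps a generator object in a single function that
// takes one record instead of exposing next/return/throw separately.
function toSingleFunction(generator) {
  return function send(message) {
    if ("error" in message) return generator.throw(message.error);
    if (message.done) return generator.return(message.value);
    return generator.next(message.value);
  };
}

// A generator that yields back whatever it was last sent.
function* echo() {
  let received;
  while (true) received = yield received;
}

const send = toSingleFunction(echo());
send({ value: undefined, done: false });       // prime the generator
const r1 = send({ value: 3, done: false });    // same as generator.next(3)
const r2 = send({ value: 4, done: true });     // same as generator.return(4)

console.log(r1); // { value: 3, done: false }
console.log(r2); // { value: 4, done: true }
```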

There seems to be a little bit of confusion about how to get the dual of a function. Once you simplify the Observable and Iterable to function types and remove all of the purely subjective design decisions, it is much easier to assess whether they are dual. This process is objective. There is nothing "handwavy" about it.

A simplified explanation of how to get the dual of a function is to swap the arguments and return type of the function, and then do the same for each term.

Here is a function type definition of an ES2015 Iterable:

type Iterable = () -> ((Next v | Throw e | Return v) -> (Next v | Throw e | Return v))

First we swap the argument and return type of the function by reversing the arrow direction:

() <- ((Next v | Throw e | Return v) -> (Next v | Throw e | Return v))

Note that we're not done. We have to apply the same operation to each term. In other words we have to reverse the arrow in the generator function:

() <- ((Next v | Throw e | Return v) <- (Next v | Throw e | Return v))

Note that the generator is self dual because the arguments and return type are the same. Now we can rewrite our definition in a more readable way:

type Observable = ((Next v | Throw e | Return v) -> (Next v | Throw e | Return v)) -> ()

The iterable's Iterator() is as much part of the Iterable type as the generator itself. It is not exempt from the process as has been suggested. That's why the dual of Iterable is not...

type Observable = () -> Generator

This is an easy mistake to make, and a question I asked myself when I first learned of the duality between Observable and Iterable.

@headinthebox is a busy guy so rather than waiting for him to weigh in you can just read his paper: http://csl.stanford.edu/~christos/pldi2010.fit/meijer.duality.pdf. TL;DR: he goes through the same process for C# Iterators. Rather than take the dual of each individual method on C# Iterators, he confirms the semantics are dual, and then crafts an Observable API that exposes the type's semantics in an ergonomic way.

Taking the dual of each individual method is not only unnecessary, it doesn't necessarily produce an ergonomic API. For example it turns out to generally be more convenient to have a different callback for Return and Next in Observation because you can easily ignore either message if you don't care about it. As the observer is not in control it doesn't necessarily need to explicitly examine whether a value is the Return value every time. In other words this...

observable.subscribe({
  return(value) {
      // trigger some follow up async operation
  },
  next(value) {
    // do something with value
  }
});

...is more ergonomic than this:

observable.subscribe({
  next(iterResult) {
    if (iterResult.done) {
      // trigger some follow up async operation
    }
    else {
     // do something with value
    }
  }
});

The latter API is a footgun, because it forces the consumer to check a Boolean to handle the return value differently. If the developer forgets to do this, you may end up handling the return value the same way as the next value.

Note that in iteration, the consumer is in control and therefore must check the "done" property in order to determine whether or not to continue. Of course the consumer has no choice, because the consumer is driving. This is why it is ergonomic to deliver both values at the same time.

var iterator = [1,2][Symbol.iterator](), result;

// done and value delivered together to indicate that both are necessary for correct iteration
while(!(result = iterator.next()).done) {
  console.log(result.value);
}

As the examples above demonstrate, subjective API design considerations matter. However they have nothing whatsoever to do with evaluating duality.

Here's the upshot: the design of ES2016 generators is not only ergonomic for iteration, it is also ergonomic for observation. This is both very nice and improbable considering the observation use case was not carefully considered during the design process.

The current proposal adds a termination semantic to Observable, and in my last post I explained how we still have duality. We just cancel the Subscription semantic out of Iterable because we do not have pragmatic use cases for it.

@jhusain
Collaborator

jhusain commented Jun 27, 2015

@zenparsing I agree we do not need a symbol for terminate anymore.

@benjamingr

@jhusain

The first is that we do not have duality.

Yes, not with the current design with Symbol.dispose and Symbol.terminate and all those.

The second is that we need to take the dual of each method on the JavaScript Generator type to ensure duality.

No, but we do need to reverse the return values, a JS iterator does not expose (Next v | Throw e | Return v) - all of these return a {done, value} pair so when proving the duality (not necessarily when choosing the observable API) we must use the API iteration actually uses rather than change it.

Evaluating duality is a useful exercise because it validates that the Observable type supports all of the same combinators as the Iterable. Once we have validated that the types are dual, we should craft an API that exposes the semantics in an ergonomic way. The former is an objective process. The latter is a subjective process that takes into account the idioms of the language, and the expected usage of the API for target use cases.

Yes, but we have not done it here yet. The other question is "should the API we expose be based on duality"; your assumption here is that we should not. I'm not saying I disagree and that's not the part I criticised.

Every interaction between producer and consumer can be modeled using this function. A sender can send a value, an error, or a return value, which can be answered with a value, an error, or a return value.

I'm not sure why you're using this notation, it's not what's used in any of Erik's work. It also doesn't expose the correct types since next and return both return a record that indicates completion in addition to the value which is crucial to the model. I'm good with any notation that's formally defined in some place I can read but personally prefer Haskell, Scala or C# type notations if we want common ground for notation.

Hopefully it is clear that there are a large number of different ways in which we can map these semantics to methods on an object without changing the fundamental semantics.

Well, any other iteration protocol we can prove isomorphism for is perfectly fine.

the design of Generator in ES2015 is a series of pragmatic and stylistic decisions that take into account the pre-existing idioms of the language, and the ergonomics of the API for intended use cases.

Yes, but it also changes the API in a fundamental way in terms of the guarantees it makes for instance. Returning {done, value} pairs is something I didn't see in other languages before.

In JavaScript the decision to model a generator as an object with multiple methods instead of a function was a pragmatic one given the pervasiveness of object-oriented programming in the language. These subjective design decisions have no impact on the underlying semantics and therefore the duality.

Yes, if you model it as .send(Error|Return|Value, v) that's also fine - but any other changes in the API we claim preserve duality need to be shown and proved.

The semantics could have been conveyed with a raw value and a "done" property on the Iterator as in other languages like C#.

Right, but they're not - unlike in C# we don't have a Current and MoveNext contract, we have a single .next which progresses the iterator, reports completion and returns the value.

The decision to invoke next(), throw(), or return() instead of passing an IterationResult to one function was also completely arbitrary. It would've been just as expressive to pass a single IterationResult with an optional error property as an input to a single function.

Sure. That's not the API but sure.

...is no more or less semantically expressive than calling one of these following functions:

I agree.

Once again these design decisions are based on subjective considerations about how to expose the underlying semantics. It is neither easier, nor more instructive to attempt to validate duality by taking the dual of the arbitrarily-designed API of an ES2015 Generator.

If you want to prove the duality of observables to ES2015 generators - then you need to prove just that. If you want to create the API to be dual - you have to actually dualize the API rather than create an API you like based off C#'s and then claim observables are dual to that. We either claim duality and have actual duality or we stop claiming duality to iterators - but I'm not sure how we can have a non-dual API and still claim duality.

There seems to be a little bit of confusion about how to get the dual of a function. Once you simplify the Observable and Iterable to function types and remove all of the purely subjective design decisions, it is much easier to assess whether they are dual. This process is objective. There is nothing "handwavy" about it.

If you look at my previous comment at #14 (comment) that's exactly what I did. Exact 1:1 reversal of the actual Iterable/Iterator contract.

Note that the generator is self dual because the arguments and return type are the same.

I apologize, I still don't understand why the return type of a generator is a (Next v | Error e | Return v). If it is and you reverse it it's indeed nice but I just don't see it as the case.

That's why the dual of Iterable is not... type Observable = () -> Generator

Can we stop using generators in the discussions here? They are just a syntactic way to create a collection of values - they're not actually related to the duality, the iteration protocol or iterable/iterator. I think they're creating a lot of confusion here.

@headinthebox is a busy guy so rather than waiting for him to weigh in you can just read his paper: http://csl.stanford.edu/~christos/pldi2010.fit/meijer.duality.pdf. TL;DR: he goes through the same process for C# Iterators. Rather than take the dual of each individual method on C# Iterators, he confirms the semantics are dual, and then crafts an Observable API that exposes the type's semantics in an ergonomic way.

I've read this paper several times in the past as well as several other of Erik's. He proves actual duality by inverting the arrows without changing the API first - that's what I'd like to see here. I think doing so would expose interesting semantics - like not needing dispose or terminate as symbols. I might be wrong but I'd like to see it :)

Taking the dual of each individual method is not only unnecessary, it doesn't necessarily produce an ergonomic API. For example it turns out to generally be more convenient to have a different callback for Return and Next in Observation because you can easily ignore either message if you don't care about it. As the observer is not in control it doesn't necessarily need to explicitly examine whether a value is the Return value every time. In other words this... ...is more ergonomic than this:

I agree, the same reason I like promises' .then over callbacks (err, value) but then we can't claim we're following duality of an API (this is fine).

The latter API is a footgun, because it forces the consumer to check a Boolean to handle the return value differently. If the developer forgets to do this, you may end up handling the return value the same way as the next value.

I agree, I think we need to first prove the duality formally and then opt to a nicer API. I just think that when we do we can uncover that we don't need some stuff like Symbol.observe, Symbol.dispose and Symbol.terminate.

Note that in iteration, the consumer is in control and therefore must check the "done" property in order to determine whether or not to continue. Of course the consumer has no choice, because the consumer is driving. This is why it is ergonomic to deliver both values at the same time.

Right, but when we dualize the one that gets done is the consumer too - but I agree it's less nice from an API PoV.

The current proposal adds a termination semantic to Observable, and in my last post I explained how we still have duality. We just cancel the Subscription semantic out of Iterable because we do not have pragmatic use cases for it.

Right, but isn't termination semantics just return? Isn't unsubscription really just modelling cancellation/termination? Why do we need both?

When I did the actual dualization the way Erik shows, this is what confused me. I think we can get a thinner API with fewer symbols and still express the duality. We can rename return on observers, which is fine.

@domenic
Member

domenic commented Jun 27, 2015

I just want to strongly second @benjamingr's points. It's not very reasonable to say that the duality is objective and non-handwavey... and then go on to start talking about how arbitrary and subjective the generator design is, and how we don't actually need to dualize it, but we can instead do something that's conceptually kind of similar and kind of dual.

If you want to prove the duality of observables to ES2015 generators - then you need to prove just that. If you want to create the API to be dual - you have to actually dualize the API rather than create an API you like based off C#'s and then claim observables are dual to that. We either claim duality and have actual duality or we stop claiming duality to iterators - but I'm not sure how we can have a non-dual API and still claim duality.

+9001

@zenparsing
Member Author

Apologies for not getting back to this discussion earlier.

@benjamingr Let's find a starting point. Do you agree that the ES6 generator interface is a dual of iteration?

When we call next(x) on an iterator, there are 3 possible "continuations" through which control flow will return to the caller, and by fiat they are expressed like this:

  • Next: return { value: y, done: false }
  • Complete: return { value: y, done: true }
  • Error: throw y

The ES6 generator interface expresses exactly these continuation types, except that we are pushing the continuation to the callee instead of returning it to the caller:

  • Next: generator.next(y)
  • Complete: generator.return(y)
  • Error: generator.throw(y)

Semantically, we have "reversed the arrows". It's not a literal reversal, because that would result in a horrible API. But it expresses the same thing.
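The correspondence between the two lists above can be sketched in code. `pushAll` and `source` are hypothetical names used only for illustration: the helper pulls from an iterator and forwards each of the three outcomes to the matching method on a generator-like observer.

```javascript
// Hypothetical helper: maps each continuation of iterator.next() to the
// corresponding method call on a generator-like observer.
function pushAll(iterator, observer) {
  while (true) {
    let result;
    try {
      result = iterator.next();
    } catch (e) {
      observer.throw(e);                // Error: throw y
      return;
    }
    if (result.done) {
      observer.return(result.value);    // Complete: { value: y, done: true }
      return;
    }
    observer.next(result.value);        // Next: { value: y, done: false }
  }
}

function* source() {
  yield 1;
  yield 2;
  return "end";
}

const seen = [];
pushAll(source(), {
  next(v) { seen.push(["next", v]); },
  return(v) { seen.push(["return", v]); },
  throw(e) { seen.push(["throw", e.message]); },
});

console.log(seen); // [["next", 1], ["next", 2], ["return", "end"]]
```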

Are we in agreement so far?

@benjamingr

@benjamingr Let's find a starting point.

Yes, Excellent.

Do you agree that the ES6 generator interface is a dual of iteration?

Please define the "generator interface", what I understand is the iterable interface (things having Symbol.iterator returning an iterator), iterator (next), extended iterator (next, return, throw).

When we call next(x) on an iterator, there are 3 possible "continuations" through which control flow will return to the caller, and by fiat they are expressed like this.

Yes.

The ES6 generator interface expresses exactly these continuation types, except that we are pushing the continuation to the callee instead of returning it to the caller:

I don't see how that is dual, nor how that reverses the arrows. I do understand that sort of kind of it's supposed to express a similar thing in continuation. I don't see how the type signatures match and worse, I don't understand how they're isomorphic to matching types.

.next is not the inverse of a return since it cannot signify completion, return does not signify done-ness it signifies an explicit request to complete (so not the same as completion), the return types of the methods do not match at all and so on.

Your post did help clarify what you're doing here - thanks.

I'd love to be convinced how that is dual to our iterator interface, I see how abnormal completion and .throw are similar but I'd love to see some formality like in @headinthebox 's work and I feel that such formality and reliance on mathematical duality (and creating an API on top of that) would be very helpful.

Now to clarify further, I'm not trying to undermine your or Jafar's work; I know you put a lot of work into the current API and design. I think a formal dualization will, however, teach us a lot about the problem of creating this API in JavaScript and will likely help resolve issues.

The way I see the dualization, we can definitely have onNext and onCompleted callbacks, but we do not need things like Symbol.dispose, Symbol.terminate, Symbol.observe and such.

@zenparsing
Member Author

I don't see how that is dual, nor how that reverses the arrows. I do understand that sort of kind of it's supposed to express a similar thing in continuation. I don't see how the type signatures match and worse, I don't understand how they're isomorphic to matching types.

If you're trying to create a literal type signature representation, then you're likely going to be frustrated. On the iteration side, we don't have an error type, per se. We have the error value coming back as a thrown exception. On the observation/generator side, the type information that you're looking for is embedded in the name of the method that we invoke.

As @jhusain says, those choices make perfect sense for javascript, and don't alter the fundamental semantic duality: all continuations returned from a call to next can also be represented by method calls to generators.

I'll expand on the need for a "terminate" method on generators in a bit...

@benjamingr

If you're trying to create a literal type signature representation, then you're likely going to be frustrated.

Why? Do you see any mistakes with the duality in #14 (comment) ?

On the iteration side, we don't have an error type, per se. We have the error value coming back as a thrown exception.

That's fine, you can see how @headinthebox deals with it by representing it with a Try[] type in his talks, we can preserve the duality during the proof by saying that the result of iteration methods is a Try[{value, done}] and invert that - for example, in https://channel9.msdn.com/Events/Lang-NEXT/Lang-NEXT-2014/Keynote-Duality this is demonstrated.
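One way to picture the Try[{value, done}] idea in JavaScript is to wrap the call to next() so that a thrown exception becomes part of the returned value. This is a sketch under assumptions: `tryNext` and the `{ ok, result, error }` record are hypothetical and are not taken from the talk or from any library.

```javascript
// Hypothetical Try encoding: a thrown exception becomes a plain value.
function tryNext(iterator) {
  try {
    return { ok: true, result: iterator.next() };
  } catch (error) {
    return { ok: false, error };
  }
}

function* failing() {
  yield 1;
  throw new Error("boom");
}

const it = failing();
const first = tryNext(it);  // { ok: true, result: { value: 1, done: false } }
const second = tryNext(it); // { ok: false, error: Error("boom") }
const third = tryNext(it);  // { ok: true, result: { value: undefined, done: true } }

console.log(first.ok, second.ok, third.result.done); // true false true
```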

On the observation/generator side, the type information that you're looking for is embedded in the name of the method that we invoke.

I'm not sure I understand this. Just to be perfectly clear, I'll spell it out: in your design, is the generator the observer? Is this "generator interface" the dual of the observer? If it's the observer, I'm still not convinced it's dual to the iterator at all. If it is, proving it is merely a matter of performing steps we can prove are type-isomorphic and then actually dualizing the result, so please do so. If it's not, I want to know why the duality is broken. I'm not saying duality is a goal here, just that if we're claiming it we should be able to prove it the way @headinthebox does.

As @domenic pointed out, saying that the API is different but duality is preserved is very handwavey.

as @jhusain says, those choices make perfect sense for JavaScript, and don't alter the fundamental semantic duality:

Why?

all continuations returned from a call to next can also be represented by method calls to generators.

I'm probably at fault here, but I really don't understand what that has to do with duality. Why does it represent the same way?

The generator interface represents arbitrary inverted flow control for the generators where you can, from the outside, force a return or a throw (the two ways to terminate flow) or to tell it to progress until it gives you another "control point".
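A minimal sketch of the inverted flow control described above, driving a generator from the outside with next, throw and return. The `worker` function and the `log` array are hypothetical names used only for illustration.

```javascript
// Illustrative only: a generator that records how it was driven from outside.
function* worker(log) {
  try {
    while (true) {
      try {
        const input = yield "ready";               // a "control point"
        log.push(`got ${input}`);
      } catch (err) {
        log.push(`recovered from ${err.message}`); // throw() resumes here
      }
    }
  } finally {
    log.push("cleanup");                           // return() runs this block
  }
}

const log = [];
const gen = worker(log);
gen.next();                          // prime: run up to the first yield
gen.next("a");                       // progress to the next control point
gen.throw(new Error("boom"));        // inject an error at the paused yield
const last = gen.return("bye");      // force completion from the outside
```

In this sketch the generator recovers from the injected error and keeps going, while the forced return only gives it a chance to run its finally block.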


Now let's play a nice mental game - there is no generator, there is no function paused in the background, there is just an iterator (let's say one you got from a host object, without a backing function). The API our iterator exposes has a next method, a return method and a throw method. It's for arbitrary iteration. All we're left with are function signatures and expected usage.

We have an iterator. What next does is pretty obvious: it progresses the iterator, gives us the result and tells us if it's done.

That's where things get tricky - What throw does is signal from the consumer that an error occurred in the consumer side and we're explicitly propagating it to the producer - this is an insanely powerful tool (props to whoever came up with it) since we can tell the producer about the error and it may choose to recover from it in a nifty way like re-sending values.

As for return, it's the ace card here. What does it signify? It tells the sequence we're done with it, this disinterest and "being done with" has many names, it's cancellation, it's disposing, it's a ThreadAbort, it's many things but most of all - it's a signal of disinterest. It can model cancellation semantics, dispose semantics, ThreadAbort semantics and so on because not only does it give us a chance to signal to the iterator we're disinterested in it - it also gives it the ability to give us "one final value".

If we're not talking about a generator but an arbitrary iterator, calling return can:

  • return {done: true, value: undefined } - to signal it was either already finished, or has now finished.
  • return {done: false, value: someValue } - to signal it got our message but did not choose to terminate - for example if it has other subscribers.
  • return {done: false, value: undefined } - to signal that it got our message of disinterest, has nothing else to tell us but is not terminating.
  • throw an error - to signal that we have performed an invalid operation on the generator.

All four are completely distinct in terms of what they represent. You can model termination semantics, cancellation semantics, dispose semantics and more with it pretty easily, not to mention that return(), return(undefined) and return(value) are all different in terms of what they represent. (Unlike throw(val) vs throw(), where the latter is just throwing undefined; like next() vs next(value), where the two are conceptually different.)
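As a rough sketch of those outcomes, here is a hand-written iterator (no generator behind it) that is shared by several consumers. `makeSharedIterator` and its consumer counting are assumptions made up for this example. It shows three of the listed results; the `{done: false, value: undefined}` case has the same shape as the second one.

```javascript
// Hypothetical iterator shared by `consumerCount` consumers; not a real API.
function makeSharedIterator(values, consumerCount) {
  let index = 0;
  let active = consumerCount;
  return {
    next() {
      if (active === 0 || index >= values.length) {
        return { done: true, value: undefined };
      }
      return { done: false, value: values[index++] };
    },
    return() {
      // This sketch treats an extra return() as an invalid operation.
      if (active === 0) throw new TypeError("no active consumers left");
      active -= 1;
      return active === 0
        ? { done: true, value: undefined }   // last consumer left: terminate
        : { done: false, value: active };    // acknowledged, but still running
    },
  };
}

const it = makeSharedIterator([1, 2, 3], 2);
const first = it.next();
const stillOpen = it.return();   // one consumer remains, so not done
const closedNow = it.return();   // last consumer gone, so done
const afterClose = it.next();
let misuse = null;
try { it.return(); } catch (err) { misuse = err; }
```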

@jhusain
Collaborator

jhusain commented Jun 28, 2015

It has been suggested that the only way to get the dual of Iterable is to take the dual of each individual method on the ES2015Iterable and the ES2015Generator. This is not sufficient to evaluate duality. I can see that the confusion stems from the fact that @headinthebox takes the method-by-method approach when explaining the duality of Iterable and Observable to C# developers. This works because the C# Iterator explicitly encodes all of the important semantics of Iteration in its type signature. As I will demonstrate this is not the case with ES2015Generators, which rely on convention to express co-routine semantics.

Let's say this was the definition of Generator (don't worry, we will get to the actual ES2015 definition soon):

interface Generator {
  val(IterationResult): IterationResult
}

Now I'm going to do exactly what @headinthebox did when he took the Observable/Iterable dual: bring exceptions into the type system. The fact that JavaScript can throw when val() is invoked needs to be captured in the type system if we are to consider it in the dual operation. This can be accomplished easily enough by adding an "error" to the IterationResult. If the "error" property is defined in the IterationResult we have an error and iteration/observation is complete.

Note that the IterationResult can be in one of three mutually exclusive states. The JS class is not expressive enough to capture this, so I will model it as a discriminated union. My notation was criticized for not being a real PL, so I will use F#:

type IterationResult =
| Next of Object
| Error of Object
| Return of Object

Now we see that a Generator can be expressed as the following F# function type:

IterationResult -> IterationResult // Generator

Note that the Generator is self-dual.

IterationResult -> IterationResult // dual

Now let's arbitrarily split the "val" function into three individual methods a la ES2015 Generator:

interface ES2015Generator {
  IterationResult next(Object);
  IterationResult throw(Object);
  IterationResult return(Object);
}

Note the following equivalency table:

| Generator fn | Generator interface |
| --- | --- |
| val({value: 3, done: false}) | next(3) |
| val({error: new Error()}) | throw(new Error()) |
| val({value: 5, done: true}) | return(5) |
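The equivalence in the table can be sketched in JavaScript as a pair of adapters. `toThreeMethods`, `toSingleFunction` and `echoVal` are hypothetical helpers, and the message object (an `error` property, or `value` plus `done`) is an assumed encoding of the discriminated union above.

```javascript
// Wrap a single `val(message)` function in the three-method shape.
function toThreeMethods(val) {
  return {
    next(value)   { return val({ value, done: false }); },
    throw(error)  { return val({ error }); },
    return(value) { return val({ value, done: true }); },
  };
}

// The other direction: collapse three methods into one `val`,
// dispatching on the shape of the message.
function toSingleFunction(generator) {
  return function val(message) {
    if ("error" in message) return generator.throw(message.error);
    if (message.done) return generator.return(message.value);
    return generator.next(message.value);
  };
}

// A stand-in `val` that only records what it was sent.
const seen = [];
const echoVal = (message) => {
  seen.push(message);
  return { value: undefined, done: false };
};

const threeMethods = toThreeMethods(echoVal);
threeMethods.next(3);
threeMethods.throw(new Error("oops"));
threeMethods.return(5);

// Round trip: single function -> three methods -> single function.
toSingleFunction(threeMethods)({ value: 7, done: false });
```

The dispatch in `toSingleFunction` is where the convention lives: which method gets called carries the same information as the tag on the message.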

Note that both of these types have all of the important semantics required for a coroutine: bidirectional communication of IterationResult. Now here comes the important point: These two Generator definitions are only semantically equivalent if we rely on convention. In this case the convention is that the callee infers the IterationResult of the caller based on the method that was invoked. Note that this convention is not captured in the type system of any individual method. The notion is completely outside of the type system and is captured nowhere. Therefore we should not expect to see it in the dual.

Here's the franken-type we end up with if we take the dual of each method on our convention-based ES2015Generator:

interface DualOfEachMethodOnES2015Generator {
  Object next(IterationResult);
  Object throw(IterationResult);
  Object return(IterationResult);
}

By taking the dual of each method in an interface which relies on convention to convey key semantics, we end up with something less semantically expressive than both Generator type definitions. In this supposed dual, the caller conveys their IterationStatus twice, both in their decision of which method to invoke, and the information in their input IterationResult. However we also see that the receiver has no way of conveying iteration status back to the caller, which reflects the semantics lost to convention. The type has become dysfunctional, because important semantics have been lost to convention and you can't take the dual of a convention.

This example clearly falsifies the notion that taking the dual of each method on the arbitrary design of an ES2015Generator type is instructive in any way. With all due respect to dissenting parties, this approach to evaluating duality already stood up to significant scrutiny at the last TC-39 meeting. I'm confident that it will continue to stand up to more intense scrutiny, and I'm capturing my argument here in the event that this happens. I believe it is possible to compartmentalize this argument and move on to discussing whether return(undefined) is sufficiently expressive to describe terminate semantics. I will do this in a future post.

@benjamingr

It has been suggested that the only way to get the dual of Iterable is to take the dual of each individual method on the ES2015Iterable and the ES2015Generator. This is not sufficient to evaluate duality.

Well, assuming you're talking about my messages - I've said in two separate messages, #14 (comment) and #14 (comment) that this is not actually required. We can take the iteration protocol and prove isomorphism (that is, that we've only renamed stuff - that there exists a bijection between the protocols) to another protocol and then dualize that. Any approach we can prove maps to the current protocol is fine.

I can see that the confusion stems from the fact that @headinthebox takes the method-by-method approach when explaining the duality of Iterable and Observable to C# developers.

Also in his papers - the transition is very straightforward. Duality is simply inverting the arrows, at least if we use that word the way Erik does or the way Math does - Take a function f:: v -> u and return a f':: u' -> v'.

This works because the C# Iterator explicitly encodes all of the important semantics of Iteration in its type signature. As I will demonstrate this is not the case with ES2015Generators, which rely on convention to express co-routine semantics.

Only if we claim the dual of the smaller (.next only) iteration protocol is the "ES2015Generator" protocol. It's entirely possible to start with the fuller iteration protocol (with .next, .return and .throw) and dualize that into observables as done in #14 (comment). The type information seems to encode everything I need that way, so it doesn't appear (to me) to be a problem of the protocol: if we dualize it we get a meaningful result, with an API we may dislike but can improve.

Now I'm going to do exactly what @headinthebox did when he took the Observable/Iterable dual: bring exceptions into the type system. The fact that JavaScript can throw when val() is invoked needs to be captured in the type system if we are to consider it in the dual operation. This can be accomplished easily enough by adding an "error" to the IterationResult. If the "error" property is defined in the IterationResult we have an error and iteration/observation is complete.

I agree with the idea but I don't agree with the construction. I don't think a Try[IterResult] is the same thing as the extended IterResult with an error property, since that would mean it can indicate completion if it threw outside, which is a semantic the dualization doesn't have. An object with value, done and error properties on the one hand, and either an object with value and done properties or an exception on the other, are two different types.

Note that the IterationResult can be in one of three mutually exclusive states. The JS class is not expressive enough to capture this, so I will model it as a discriminated union. My notation was criticized for not being a real PL, so I will use F#:

I'm fine with any notation I understand, thanks :)

Now, this is the part that confuses me, I'm not sure why that maps all the possible states of an iteration result - but you get to that later so let's do that.

Now let's arbitrarily split the "val" function into three individual methods a la ES2015 Generator:... Note the following equivalency table: ...

We only mapped a very small subset of what .next, .return and .throw can do. We have not distinguished (but can distinguish) return and next with and without a value.

Note that both of these types have all of the important semantics required for a coroutine: bidirectional communication of IterationResult.

Yes, I see that; they behave semantically differently for this goal but they both accomplish it.

Now here comes the important point: These two Generator definitions are only semantically equivalent if we rely on convention.

They don't sound semantically equivalent to me at all. The semantics of the first one and the type information is different. If we change the ES2015 actual protocol and remove semantics from it to the point it's no longer the ES2015 protocol but something else and then say we call it the ES2015 protocol and only use it as a subset - then we can show that they're equivalent.

In this case the convention is that the callee infers the IterationResult of the caller based on the method that was invoked.

It's more than that though: it's the fact that we're explicitly prohibiting a lot of semantics that Try[IterResult] captures. It also depends on how we type next and return, and whether or not we capture return semantics with or without "not passing an argument".

Note that this convention is not captured in the type system of any individual method. The notion is completely outside of the type system and is captured nowhere. Therefore we should not expect to see it in the dual.

Why is that necessary though? Why did we pick a notation for our protocol (we got to pick the type system too) and then say our duality is not captured anywhere in it?

Here's the franken-type we end up with if we take the dual of each method on our convention-based ES2015Generator:

I thought F# interfaces were defined with type and not the interface keyword which implements them, so I'm probably missing out on newer syntax here. I'm also not aware of a Try and Maybe type in F# (because of my personal lack of experience with it), so I'm going to do my best to work with the definition; let me know if I missed something:

interface DualOfEachMethodOnES2015Generator<T> {
  T next(Try[IterationResult]);
  T throw(Try[IterationResult]);
  Try[IterationResult] return(Maybe[T]); // note this was not inverted
}

By taking the dual of each method in an interface which relies on convention to convey key semantics, we end up with something less semantically expressive than both Generator type definitions.

I'm not saying that this protocol should be used for the user facing API, but that the user facing API should be based on something isomorphic to it or that we should stop claiming duality. I don't think it makes for a good API, but that it shows us interesting properties if we pursue duality.

In this supposed dual, the caller conveys their IterationStatus twice, both in their decision of which method to invoke, and the information in their input IterationResult.

Only if you add the error field like in your construction. I believe this is something that dualizing the type more accurately avoids.

However we also see that the receiver has no way of conveying iteration status back to the caller, which reflects the semantics lost to convention.

I'm not sure what you mean by this - how would the receiver convey iteration status? They have the return value - we can claim that the sent value of .next in the iteration protocol is Maybe[T] in fact (since we can call it with no arguments) and capture it like that. I'm not sure I understand what the T return value means here (maybe @headinthebox can shed some light).

The type has become dysfunctional, because important semantics have been lost to convention and you can't take the dual of a convention.

If we think that then duality is not something we're doing or a goal of this proposal. The proposal can safely not claim it and diverge more freely from the semantics of a dual API.

This example clearly falsifies the notion that taking the dual of each method on the arbitrary design of an ES2015Generator type is instructive in any way. With all due respect to dissenting parties, this approach to evaluating duality already stood up to significant scrutiny at the last TC-39 meeting.

Honestly I don't see how the scrutiny of TC-39 members is relevant here. I have mad respect for a lot of TC members (you included), but the fact that people scrutinised the design doesn't mean we should not address issues with it as they arise (as in this case).

I'm confident that it will continue to stand up to more intense scrutiny, and I'm capturing my argument here in the event that this happens. I believe it is possible to compartmentalize this argument and move on to discussing whether return(undefined) is sufficiently expressive to describe terminate semantics. I will do this in a future post.

Yes, discussing the expressiveness of return would be nice :)

@domenic
Member

domenic commented Jun 29, 2015

This example clearly falsifies the notion that taking the dual of each method on the arbitrary design of an ES2015Generator type is instructive in any way. With all due respect to dissenting parties, this approach to evaluating duality already stood up to significant scrutiny at the last TC-39 meeting.

This is not at all clear to me, as @benjamingr's arguments always manage to express during times when I'm on planes :). I am a TC member as well, and apologize for not being there at the meeting to voice my doubts, but it's been good to get some offline time to express them as well. We can't just steamroll over them by saying they're "clearly falsified." As @benjamingr says,

I'm not saying that this protocol should be used for the user facing API, but that the user facing API should be based on something isomorphic to it or that we should stop claiming duality.
...
If we think that then duality is not something we're doing or a goal of this proposal. The proposal can safely not claim it and diverge more freely from the semantics of a dual API.


@benjamingr, I am curious why you do not reverse return as well. I also disagree with your assertion that return() is different from return(undefined); the fact that these are distinguishable is, in my mind, an edge-case artifact of the language and not something you'd want to encode into anything type-system-esque. There's also the issue that errors are not of type T. (And that generators aren't typed anyway.)

As such I think a proper dual of an ES2015 generator would be

interface ES2015Generator {
  Try[IterationResult] next(any);
  Try[IterationResult] throw(any);
  Try[IterationResult] return(any);
}

interface DualOfES2015Generator {
  any next(Try[IterationResult]);
  any throw(Try[IterationResult]);
  any return(Try[IterationResult]);
}

Now for my take on it, while at my computer and not on mobile.

I think perhaps the duality that @jhusain is trying to express is not in fact one that has anything to do with generators, besides punning on their names. Here is one path that seems plausible to me.

First, ignore generators. Focus only on unidirectional iterators, without the ability for consumers to send values backward. In that case the interface is

interface ES2015Iterator {
  Try[IterationResult] next(); // note: no parameter
}

The dual is then of course

interface DualOfES2015Iterator {
  void next(Try[IterationResult]);
  // maybe the return type is actually Try[void]?
}

Our goal, because we like C# and think that its observables provide a productive way of programming, is to somehow justify the interface

interface Observer {
  void onValue(any);
  void onComplete(any);
  void onError(any);
  // maybe these should return Try[void]?
}

with an appeal to "duality".

To do this, it suffices to produce a bijection between Observer and DualOfES2015Iterator. I believe the intention is that you biject via

  • Exceptions e sent to iterator.next (via the Try[] type; we can't actually express this in JS directly, since JS uses exceptions instead of functional-style Try or Either) map to calls to observer.onError(e)
  • IterationResults in the form { value, done: true } sent to iterator.next map to calls to observer.onComplete(value)
  • IterationResults in the form { value, done: false } sent to iterator.next map to calls to observer.onValue(value).
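The three mappings above can be sketched as a pair of adapter functions. Since JS has no built-in Try type, this sketch assumes an encoding of `{ ok: true, result }` or `{ ok: false, error }`. `observerToDualNext`, `dualNextToObserver` and `drain` are hypothetical names and not part of any proposal.

```javascript
// Observer -> DualOfES2015Iterator.next (Try modeled as a tagged object).
function observerToDualNext(observer) {
  return function next(tried) {
    if (!tried.ok) return observer.onError(tried.error);
    const { value, done } = tried.result;
    return done ? observer.onComplete(value) : observer.onValue(value);
  };
}

// DualOfES2015Iterator.next -> Observer (the inverse mapping).
function dualNextToObserver(next) {
  return {
    onValue(value)    { next({ ok: true, result: { value, done: false } }); },
    onComplete(value) { next({ ok: true, result: { value, done: true } }); },
    onError(error)    { next({ ok: false, error }); },
  };
}

// Pull from an ordinary iterator and push each Try[IterationResult] into `next`.
function drain(iterator, next) {
  while (true) {
    let tried;
    try { tried = { ok: true, result: iterator.next() }; }
    catch (error) { tried = { ok: false, error }; }
    next(tried);
    if (!tried.ok || tried.result.done) return;
  }
}

const events = [];
const observer = {
  onValue(v)    { events.push(["value", v]); },
  onComplete(v) { events.push(["complete", v]); },
  onError(e)    { events.push(["error", e.message]); },
};
function* numbers() { yield 1; yield 2; return "end"; }
drain(numbers(), observerToDualNext(observer));
```

Note that nothing here flows back from the observer to the source, which matches the point that this dual carries no cancelation capability.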

The final step, which I think introduces most of the confusion, is to simply pun on the names that generators use, and rename onValue to next, onComplete to return, and onError to throw. These names in the observer context have nothing to do with generators. (And indeed, generators and observables are not a great match, as can be seen from the awkward example in the README.md, with its "priming".) There's some appealing "hey look they match"---but it's not a sign of anything deeper. No duality with generators. It's entirely about iterators.

That's my take. What it says about abrupt termination is then pretty clear: there is no capability for that built in to the system. Since iterators are a completely unidirectional protocol, with no way to communicate cancelation back to the source, then of course their dual observers will have the same limitation.

So that's the starting point. The question then, for observable fans, is how do you tack on cancelation semantics into something which does not have them---and which certainly does not arise naturally out of the duality with iterators, since iterators don't have such semantics either.

My guidance would be to do one of two things:

  1. Embrace cancelation tokens. They have issues, but are very very simple to integrate into any model.
  2. Figure out what adding cancelation semantics to iterators looks like. Then dualize that. Remember to separate the steps of dualization, bijection to something more convenient, and finally renaming into something punny.
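To make the first option concrete, here is a minimal sketch of what passing a cancelation token next to an observer could look like. `createCancelSource`, `countTo` and the token shape are invented for illustration and do not correspond to any existing library or proposal.

```javascript
// Hypothetical minimal token source.
function createCancelSource() {
  const callbacks = [];
  const token = {
    canceled: false,
    onCancel(callback) {
      if (token.canceled) callback();
      else callbacks.push(callback);
    },
  };
  return {
    token,
    cancel() {
      if (token.canceled) return;          // cancel is idempotent
      token.canceled = true;
      callbacks.splice(0).forEach((callback) => callback());
    },
  };
}

// A synchronous producer that takes the token alongside the observer.
function countTo(limit, observer, token) {
  for (let i = 1; i <= limit; i++) {
    if (token.canceled) return;            // stop silently after cancelation
    observer.onValue(i);
  }
  observer.onComplete();
}

const received = [];
const cancelSource = createCancelSource();
let cleanedUp = false;
cancelSource.token.onCancel(() => { cleanedUp = true; });

countTo(5, {
  onValue(v) {
    received.push(v);
    if (v === 3) cancelSource.cancel();    // consumer loses interest
  },
  onComplete() { received.push("done"); },
}, cancelSource.token);
```

The observer interface itself is untouched in this design; cancelation travels through a separate object, which is why it integrates easily but has to be threaded through every call by hand.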

@benjamingr

@benjamingr, I am curious why you do not reverse return as well.

Because I did not consider it as part of the duality, like IDisposable in the C# duality. As @headinthebox said, disposing is handwavey and he has not proven it himself in C# semantics. I viewed return as external to the duality (as a dispose mechanism in both cases), inverting the IDisposable trait didn't sound like something we'd want to do (unlike next). This is debatable.

I also disagree with your assertion that return() is different from return(undefined); the fact that these are distinguishable is, in my mind, an edge-case artifact of the language and not something you'd want to encode into anything type-system-esque.

Well, I agree that this is debatable, since you can look at it as function overloading or as a language artifact. We can either treat it as return:: Maybe value -> Try IterResult or as return:: value -> Try IterResult - both are fine by me. I think the distinction is important.
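For reference, a hand-written iterator can observe the difference between the two call forms by counting its arguments, as in the sketch below (the `calls` array is only for illustration). As far as I know, the built-in generator return method does not make this distinction and treats a missing argument as undefined.

```javascript
const calls = [];
const iterator = {
  next() { return { done: true, value: undefined }; },
  return(...args) {
    // args.length distinguishes return() from return(undefined).
    calls.push(args.length === 0 ? "no argument" : `argument: ${String(args[0])}`);
    return { done: true, value: args[0] };
  },
  [Symbol.iterator]() { return this; },
};

iterator.return();
iterator.return(undefined);
iterator.return(42);
```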

There's also the issue that errors are not of type T. (And that generators aren't typed anyway.)

Errors were not typed because JavaScript does not sport checked exceptions (like C#, unlike Java): a function does not have to declare the type of errors it throws, and that is not a part of its signature. Like the other choices, this is debatable. I picked an arbitrary (but valid) type system and inverted the arrows - any form of duality established on this basis (inverting the arrows of a typed Generator interface) sounds reasonable to me.

As such I think a proper dual of an ES2015 generator would be

That is also acceptable, we restrict less here but represent things more accurately. (I think that typing iteration to the same type is a reasonable assumption to make on the actual protocol but you're correct that this type is more accurate).

I think perhaps the duality that @jhusain is trying to express is not in fact one that has anything to do with generators, besides punning on their names. Here is one path that seems plausible to me.

I agree with this 100%. I'm not sure why generators are assumed to be the dual of iterators. Aside from a vague similarity between the return type of next and throw/return/next (which also exists in any other representation of flow control), I don't see the protocols as similar or dual.

I'd consider the next/return/throw protocol as the extended ES2015 protocol and then dualize that.

interface ES2015Iterator {

This is actually a much more similar representation to the duality Erik shows for C# iterators.

Our goal, because we like C# and think that its observables provide a productive way of programming, is to somehow justify the interface ... with an appeal to "duality".

Well, another (big) reason to justify the onNext onError onCompleted interface is that it's proved to be useful in libraries like Rx (and RxJS in JavaScript). It's actually also definitely dual to the iteration protocol you presented (with no value passed in next) - this is definitely something @headinthebox proved.

To do this, it suffices to produce a bijection between Observer and DualOfES2015Iterator. I believe the intention is that you biject via ... (via the Try[] type; we can't actually express this in JS directly, since JS uses exceptions instead of functional-style Try or Either)

This is perfectly fine and is what Erik does in C# and Scala too. Your mapping looks perfectly fine (and formalizable) too.

These names in the observer context have nothing to do with generators.

I also agree.

No duality with generators. It's entirely about iterators.

I also tend to think that but I'm not as convinced yet.

That's my take. What it says about abrupt termination is then pretty clear: there is no capability for that built in to the system. Since iterators are a completely unidirectional protocol, with no way to communicate cancelation back to the source, then of course their dual observers will have the same limitation.

Well, if we dualize the extended iteration protocol (generator protocol) as our iteration protocol, which has next/return/throw with values, we have bidirectional flow in both ends and termination. I'm not saying we should but this is a limitation that we got by taking next the way you did.

So that's the starting point. The question then, for observable fans, is how do you tack on cancelation semantics into something which does not have them---and which certainly does not arise naturally out of the duality with iterators, since iterators don't have such semantics either.

I think what @jhusain intends to do is to add cancellation semantics to the iteration protocol too. If we base it on a dualization of the generator protocol (extended iteration) we can have cancellation semantics. An alternative is to base it on what libraries do or what APIs we like better and not worry about duality at all.

Figure out what adding cancelation semantics to iterators looks like. Then dualize that. Remember to separate the steps of dualization, bijection to something more convenient, and finally renaming into something punny.

I think this is what @jhusain is doing. An alternative is to base it off extended iteration and not just iteration as the protocol. In either case we should stop calling the current design dual or based off duality.

Embrace cancelation tokens. They have issues, but are very very simple to integrate into any model.

I think this would make a very unergonomic API, but I think @headinthebox disagrees, since he said he would use them had he gotten a second chance.

@benlesh

benlesh commented Jun 29, 2015

@benjamingr I'm sure @headinthebox will add his two cents when he deems it necessary.

@benlesh

benlesh commented Jun 29, 2015

FWIW, I think cancellation tokens are a horrible API in comparison to something like a subscription function or object. This is primarily because they don't compose. They're just a really inelegant solution. When faced with an async API that gives me a cancellation token, I'm generally quick to wrap it in an Rx Observable to clean things up.

@benjamingr

Thanks for reminding me why I generally avoid these discussions with you @Blesh. Can we please agree to drop the tone?

If you read carefully Erik did in a previous comment in fact encourage tokens:

(In some sense using .NET cancellation tokens and cancellation token sources https://msdn.microsoft.com/en-us/library/dd997289(v=vs.110).aspx in Rx would be cleaner than disposables. But that is another discussion)

Update: I talked to Erik, he said he'll weigh in on this but has been busy these last couple of days so it'll take time. This is another reason not to rush any design choices on the API IMO.

@benlesh

benlesh commented Jun 29, 2015

Thanks for reminding me why I generally avoid these discussions with you @Blesh. Can we please agree to drop the tone?

@benjamingr you're an intelligent guy with a lot to contribute. There was no tone. Trying to gently remind you not to appeal to authority and keep your thoughts/contributions your own. As I said, I'm sure Erik will chime in at some point, and may very well confirm everything you've said; but I think mentioning him over and over is distracting and it pollutes your point.

You're free to avoid discussions with whomever, of course. I take no offense.

@domenic
Member

domenic commented Jun 30, 2015

I think what @jhusain intends to do is to add cancellation semantics to the iteration protocol too. If we base it on a dualization of the generator protocol (extended iteration) we can have cancellation semantics. An alternative is to base it on what libraries do or what APIs we like better and not worry about duality at all.

Yeah, things seem very confused to me.

As far as I can tell, generators already have an interface for expressing cancelation. Two, in fact, depending on whether you want cancelation or abortion (viz. return and throw).

The problem is, we already used up those names when we renamed onComplete to return, and onError to throw, while dualizing iterator. Now we're thinking "it would be nice to add a cancelation semantic ... we could pun on the return name from generators, which is similar ... but oh no, we already punned on that name!"

I definitely don't think the answer is to add a new method (terminate or whatever) to generators. Generators already have more than enough. The issue is that iterators do not. So let's say we extended iterators:

interface ESnextIterator {
  Try[IterationResult] next();
  Try[IterationResult] cancel(any); // cancelation reason argument

  // return type maybe should be void, but it's plausible that
  // iterators would want to respond to cancelation.
}

In a world without observables, it might even be natural to make ESnextIterator a subtype of ES2015Generator, by renaming:

interface ESnextIterator2 {
  Try[IterationResult] next();
  Try[IterationResult] return(any);
}
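A small sketch of an iterator in the ESnextIterator2 shape, where return doubles as the cancelation signal. `makeTicker` is a hypothetical example. It also shows that for-of already calls return when a loop is exited early, passing no argument.

```javascript
// An endless iterator whose return(reason) acts as cancelation.
function makeTicker(log) {
  let n = 0;
  return {
    next() {
      n += 1;
      return { done: false, value: n };
    },
    return(reason) {
      log.push(`canceled: ${String(reason)}`);
      return { done: true, value: undefined };
    },
    [Symbol.iterator]() { return this; },
  };
}

const log = [];
const collected = [];
for (const tick of makeTicker(log)) {
  if (tick > 3) break;       // early exit: the loop calls return() for us
  collected.push(tick);
}

// Explicit cancelation with a reason, outside of any loop.
makeTicker(log).return("not needed");
```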

But now we get ourselves into trouble, because we've already dualized ES2015Iterator into Observer, which is already using the name return. And we have the question: when dualizing ESnextIterator2, do we dualize return? Or do we treat cancelation as something tacked on the side of both iterator and observer, which isn't to be dualized?

And how does this all fit with the current design, where the responsibility for cancelation isn't even related to the observer, but instead belongs on this new subscription object?

To reemphasize, it's pretty clear that there's no need to extend generator semantics with a third "terminate" completion. But everything else is less clear to me.

@benjamingr

To reemphasize, it's pretty clear that there's no need to extend generator semantics with a third "terminate" completion. But everything else is less clear to me.

👍

I opened #33 to discuss whether or not we even want to use the generator interface.

@zenparsing
Member Author

Since discussion has moved elsewhere, and we have some other proposals in the works, I'm going to close this one.
