Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

-repeat doesn't stop after being disposed of #94

Closed
joshaber opened this issue Nov 1, 2012 · 36 comments · Fixed by #169
Closed

-repeat doesn't stop after being disposed of #94

joshaber opened this issue Nov 1, 2012 · 36 comments · Fixed by #169
Assignees
Labels

Comments

@joshaber
Copy link
Member

joshaber commented Nov 1, 2012

Noted in #93:

RACReplaySubject *subject = [RACReplaySubject subject];
[subject sendNext:@(1)];
[subject sendCompleted];
[[[subject repeat] take:1] subscribeNext:^(id x) { ... }];

Loops indefinitely.

@joshaber
Copy link
Member Author

joshaber commented Nov 4, 2012

Ugh. So this is a pretty annoying one. Since the repeat and replay happen on the same thread, it gets stuck replaying and repeating before -repeat's ever returned a disposable for -take: to dispose of.

It might be a bit more obvious in this example:

RACSubscribable *subscribable = [RACSubscribable createSubscribable:^ RACDisposable * (id<RACSubscriber> subscriber) {
    [subscriber sendNext:@1];
    [subscriber sendCompleted];
    return nil;
}];

__block RACDisposable *disposable = [[subscribable repeat] subscribeNext:^(id _) {
    NSLog(@"disposable: %@", disposable);
    [disposable dispose];
}];

Disposable will always be nil so the repeat can't be stopped. It gets stuck in the infinite loop within the original subscribe to -repeat before the disposable's ever returned. Bleh.

We had a similar problem with the old implementation of generators. I'm not really sure how to solve this generally.

I think Rx solves this by saying, you can never know what scheduler results might be delivered on, unless you manually specify it. That way they can split the repeat and replay so that the disposable's returned before the repeat actually happens.

@jspahrsummers
Copy link
Member

Maybe we could add a subscription variant that gives the disposable to the block? Seems like that might be generally useful anyways.

@joshaber
Copy link
Member Author

joshaber commented Nov 4, 2012

If I understand you right, the tricky thing is that we don't have the important disposable until after the subscribable's didSubscribe block has returned... and didSubscribe never returns in cases like these.

@Coneko
Copy link
Member

Coneko commented Nov 5, 2012

Generally speaking, are subscribers supposed to be able to dispose their own subscription?

@joshaber
Copy link
Member Author

joshaber commented Nov 5, 2012

Absolutely, they really have to to be able to end potentially unending subscribables.

@Coneko
Copy link
Member

Coneko commented Nov 5, 2012

What about timeliness? Is it enough if they're only able to dispose the subscription eventually and not immediately?

The general pattern used for implementing a lot of the methods on <RACSubscribable>, as in calling +createSubscribable, immediately sending values to the subscriber and only then returning the disposable, doesn't allow subscribers to dispose of the subscription until after all the initial values have been received. Especially noticeable on RACReplaySubject since it sends the whole replay before returning the disposable.

I was wondering if that behaviour was to be considered a bug too.

@joshaber
Copy link
Member Author

joshaber commented Nov 5, 2012

Yes, I think that's a good summary of the heart of this bug.

@Coneko
Copy link
Member

Coneko commented Nov 5, 2012

Oh ok, because I fixed -repeat by resubscribing asynchronously instead of synchronously, which avoids the buffer overflow, and gives the subscriber a chance to dispose of the subscription between a repeat and the next, guaranteeing the subscribable will stop sending stuff eventually, but of course it doesn't do anything to fix the underlying problem of being unable to unsubscribe during each repeat.

So it's just a band-aid, not a real fix. (https://gist.github.com/4019491)

@joshaber
Copy link
Member Author

joshaber commented Nov 5, 2012

I think a generalized version of that is the best fix. But dispatch_get_current_queue is going away/already gone so we can't use it. +[NSOperationQueue currentQueue] doesn't seem to make strong enough guarantees for us.

Which takes us back to the Rx position of saying you can't depend on the queue it's in unless specified.

@jspahrsummers
Copy link
Member

It's not generally safe to asynchronously dispatch to the current GCD queue, because it could be a temporary user-created queue that's going away. The block will still get executed (because GCD makes such a guarantee), but could result in crazy unpredictable behavior to the user.

@Coneko
Copy link
Member

Coneko commented Nov 5, 2012

Right, GCD does say queued blocks retain the queue they're queued on.

You could implement a policy of "if you call RAC methods from a RACScheduler, stuff gets delivered on the calling scheduler, otherwise stuff gets delivered on the main thread".
At least then chaining -deliverOn: and -subscribeOn: will still work like it does now.

@jspahrsummers
Copy link
Member

The "calling scheduler" is a nebulous concept, though. For an operation queue, that's +currentQueue. For a GCD queue, that's dispatch_get_current_queue. Those aren't interchangeable, since an operation queue may be backed by any number of GCD queues, and they're not required to stay the same.

@joshaber
Copy link
Member Author

joshaber commented Nov 5, 2012

Exactly, that's why it's so awkward. We can still make guarantees about -deliverOn: and -subscribeOn:, but they become much more necessary.

@Coneko
Copy link
Member

Coneko commented Nov 5, 2012

What I mean by "calling scheduler" is something like +currentQueue does. +currentQueue only returns a valid NSOperationQueue if you call it from code that was queued on a NSOperationQueue. Likewise, a +currentScheduler method would return a RACScheduler only if the code it was called from was scheduled on a RACScheduler in the first place.

The implementation could mix +currentQueue with associated objects and dispatch_queue_set_specific/dispatch_get_specific to get the current RACScheduler whether it used one or the other as backing. (Not sure how to solve the problem of clearing the reference to the RACScheduler without potentially getting blocks scheduled while it's in -dealloc)

The user would lose the ability to create new schedulers with arbitrary scheduling backing because of this.

Since all the scheduling of subscribables goes through RACScheduler anyway, that would help keep the scheduling predictable.

I'm not against explicitly specifying the scheduler on which to deliver, I just wouldn't want to have to specify it multiple times in a chain because each link of the chain can potentially reschedule on a different one.
At least, that's how I understand it would turn out if subscribers didn't give any guarantees about that.

@jspahrsummers
Copy link
Member

@Coneko I think there are still problems with that idea, because it'd be possible to get into a case where one thread or queue is technically associated with multiple schedulers.

This is mostly an issue with concurrent GCD queues. For example, an operation queue RACScheduler might use a global GCD queue for execution (as an implementation detail). Blocks scheduled on the +backgroundScheduler shouldn't see the operation queue scheduler, and vice versa, but there's no way to ensure this with dispatch_set_queue_specific.

@Coneko
Copy link
Member

Coneko commented Nov 25, 2012

Yes, the general idea isn't very robust, so the implementation has a lot of gotchas, but in your example the RACScheduler should never use a global GCD queue directly, instead it should use a private GCD queue, and set the target queue for it to the global GCD queue. That way dispatch_set_queue_specific can be called on a meaningful target.

I agree it's it's not perfect, and as I mentioned before it makes implementing custom RACSchedulers very error-prone, so it'll end up not being something a user of the framework is expected to be able to do.
It definitely brings more problems than it solves as long as the only real problem with the current implementation is with certain instances of -repeat.

@jspahrsummers
Copy link
Member

That's a good point about target queues. I'm curious if some mix of deferred scheduling, operation queue scheduling, etc. could still result in an incorrect currentScheduler, though.

@Coneko
Copy link
Member

Coneko commented Nov 26, 2012

I'd be backing my idea a lot more if unit testing these threading shenanigans reliably were possible.

@jspahrsummers
Copy link
Member

The implementation could mix +currentQueue with associated objects and dispatch_queue_set_specific/dispatch_get_specific to get the current RACScheduler whether it used one or the other as backing. (Not sure how to solve the problem of clearing the reference to the RACScheduler without potentially getting blocks scheduled while it's in -dealloc)

I found out today, while working on #138, that dispatch_get_specific only reads from the current queue and its target queues. If you dispatch_sync from one targetless queue to another targetless queue, you won't be able to read specific data from the former in a block running on the latter.

This makes it just as broken as dispatch_get_current_queue, and kind of hamstrings anything we could do with RACScheduler.

@Coneko
Copy link
Member

Coneko commented Nov 28, 2012

I understood dispatch_get_specific to work that way, but I didn't think it was a problem. It still falls under "only works if called from code running on a RACScheduler" clause right? After all dispatch_sync doesn't call the code from the scheduler's queue, it locks the queue and calls the code from another queue. You wouldn't expect it to work if you implemented something like that yourself.
It does mean you can't use it to implement RACScheduler, but not that it's broken from a caller's perspective.

@jspahrsummers
Copy link
Member

I think that means we'll end up in a lot of cases where +[RACScheduler currentScheduler] is nil. In particular, we won't have a reliable +deferredScheduler.

@jspahrsummers
Copy link
Member

I think the sanest way to resolve this would be to change the subscription API a bit. For example, if +createSignal: were changed to accept a subscriber and a block or pointer that would tell it if it were disposed, you could implement a pattern like the following:

return [RACSignal createSignal:^(id<RACSubscriber> subscriber, BOOL *stop) {
    while (YES) {
        if (*stop) break;
        [subscriber sendNext:RACUnit.defaultUnit];
    }

    [subscriber sendCompleted];
    return nil;
}];

@Coneko
Copy link
Member

Coneko commented Nov 30, 2012

I think that's a very good idea. Regardless of how the thing with the schedulers evolves, having to do all that dispatching/scheduling just to implement even something as simple as +return: properly is unwieldy.

@joshaber
Copy link
Member Author

@jspahrsummers I'm not sure how that'd solve the problem. You'd get stuck in the loop before you'd get access to the address of stop.

@jspahrsummers
Copy link
Member

With an API like that available, you could also implement stuff like:

- (void)subscribeNext:(void (^)(id value, BOOL *stop))next error:(void (^)(NSError *))error completed:(void (^)(void))completed;

I could take a pass at it. I'm pretty sure it would solve this issue.

@joshaber
Copy link
Member Author

But now we'd have two different ways of stopping a signal: with a disposable or with *stop = YES. Also, this problem is solved by the new scheduler work. I'm not sure what this would give us besides an uglier API.

@jspahrsummers
Copy link
Member

This API is uglier, but the new scheduler work feels like a hack to work around the timing of how disposables are created. Why not just solve when they're created/made available?

The idea of setting a BOOL *stop could also be encapsulated in a disposable:

- (RACDisposable *)subscribeNext:(void (^)(id, RACDisposable *))next;

@Coneko
Copy link
Member

Coneko commented Nov 30, 2012

I don't think adding new parameters to the subscription methods is right. Subscriptions should work correctly regardless of how they're disposed of, you can't give the user two ways of disposing them, and then have them behave differently.

Rather, change the +createSignal: method to take two blocks. One that returns a disposable, and one that creates the subscription. Call the one that returns the disposable first, return the disposable, then... oops.

I guess that would go back to the scheduler fix anyway.

Still, it would hide the scheduling complexity in the internal implementation, leaving more elegant APIs both for the user that creates the signal and the user that consumes it.

@joshaber
Copy link
Member Author

I don't see how any of these ideas would work.

You can't create/return a disposable without subscribing, and as soon as you subscribe you're going down the rabbit hole of infinite subscriptions. The re-subscribe has to be deferred. I don't see any other way.

@jspahrsummers
Copy link
Member

Maybe it was a mistake to bring up concrete APIs before really explaining my thought process.

The key I'm trying to communicate is this: you can return a disposable without subscribing, as long as:

  1. Infinite signals can watch its status.
  2. Subscribers can access it before the subscription invocation actually completes.

That's what the BOOL *stop argument was encapsulating, but it can be done in other ways too.

@Coneko
Copy link
Member

Coneko commented Nov 30, 2012

What I meant before was in fact returning the disposable immediately, but subscribing deferred to ensure the caller received the disposable in the meanwhile and was able to dispose it if needed.

@jspahrsummers : I agree it would work, I just think the API would be really ugly if all the methods that currently return RACDisposable * would have to accept a RACDisposable ** argument instead.

@joshaber
Copy link
Member Author

Again, this seems to just make the API uglier for a problem that can be solved other ways.

@jspahrsummers
Copy link
Member

Alright, well, I'm willing to give deferred subscription a shot. I just think it's going to be really surprising sometimes; for example, what happens to RACAbleWithStart and code dependent upon those values arriving immediately?

@joshaber
Copy link
Member Author

@jspahrsummers That's exactly what +subscriptionScheduler is solving. Since that (should) be on the main queue, it'd start immediately like it always has.

@jspahrsummers
Copy link
Member

👍

@ghost ghost assigned jspahrsummers Dec 2, 2012
jspahrsummers added a commit that referenced this issue Dec 5, 2012
In order to solve issue #94, it's necessary for the +iterativeScheduler to enqueue the block, return full control to the currently-scheduled block, and then let that block complete. Otherwise, we may still not get the disposable we need.
@joshaber joshaber closed this as completed Dec 7, 2012
@thenikso
Copy link
Contributor

thenikso commented Dec 7, 2012

👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
4 participants