Skip to content
This repository

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

"Safe" synchronous operations can deadlock in the face of target queues #1

Closed
andymatuschak opened this Issue August 02, 2012 · 11 comments

2 participants

Andy Matuschak Justin Spahr-Summers
Andy Matuschak

Try adding this test to SDQueueTests.m:

- (void)testTargetedQueueDeadlock
{
    dispatch_queue_t firstQueueT = dispatch_queue_create("1", DISPATCH_QUEUE_SERIAL);
    dispatch_queue_t secondQueueT = dispatch_queue_create("2", DISPATCH_QUEUE_SERIAL);

    dispatch_set_target_queue(firstQueueT, secondQueueT);

    SDQueue *firstQueue = [SDQueue queueWithGCDQueue:firstQueueT concurrent:NO private:YES];
    SDQueue *secondQueue = [SDQueue queueWithGCDQueue:secondQueueT concurrent:NO private:YES];

    __block BOOL finished = NO;

    [firstQueue runSynchronously:^{
        [secondQueue runSynchronously:^{
            [firstQueue runSynchronously:^{
                finished = YES;
            }];
        }];
    }];

    STAssertTrue(finished, @"");
}

I'm pretty sure the usage of dispatch_get_current_queue() throughout SDQueue.m isn't safe in the face of target queue usage.

This looks like a pretty contrived example, but it could get much worse much faster. I think it means that the SDQueue constructors which take a dispatch_queue_t argument are only safe if you made the queue yourself, and it never "escapes" your control.

If I provide an API like the following, and the client gives me a dispatch_queue_t which targets another, and then makes two of these, then I can't use any of the safe SDQueue operations:

@interface TestAPI : NSObject
- (instancetype)initWithQueue:(dispatch_queue_t)queue;
- (void)doThingInConcertWithOtherGuy:(TestAPI *)otherGuy;
@end

It might be worth just removing the constructors which take a dispatch_queue_t as argument. But that'd kind of unfortunate, since targeting semantics are actually useful. Maybe you could remove those constructors, then provide targeting semantics yourself on SDQueue which could then attempt to avoid cycles. But I'm not sure you could avoid cycles in a thread-safe way, since the target queues involved could change out from under you on various queues while you're traversing the graph. You'd have to be very careful about the guarantees made about the timing of your setTargetQueue: method: I doubt you could make the same guarantees that GCD makes about when the new target queue's applied.

Justin Spahr-Summers
Owner

Thanks for investigating this, and for the detailed write-up.

I can't find any documentation about what GCD does or doesn't guarantee when setting a new target queue – could you explain what those guarantees are?

Andy Matuschak
Andy Matuschak

Other relevant bits of information: dispatch_get_current_queue() returns the targeting queue, not the target queue; dispatch_queue_get_specific() will return information from the targeting queue but will fall back on the target queue's specific key/values if the targeting queue doesn't have a value for that key.

Justin Spahr-Summers
Owner

Alright. It's been a long and winding road trying to figure out a good solution, but I think I finally have something. SDQueue now has a targetQueue property, which is implemented as a combination of dispatch_set_target_queue() (so its semantics are inherited) and queue-specific data to track target queues.

As you noted it would, it does have weaker guarantees than dispatch_set_target_queue(), but I don't think they're that onerous. Basically, it might block waiting for a synchronous dispatch to complete (see below), and cannot prevent deadlocking if the calling code is executing on the receiver. If that's possible, the solution is to just do it on a global dispatch queue instead.

As for avoiding cycles, SDQueue will now freeze the whole chain of target queues before synchronous dispatches. This is a relatively cheap operation, but does lock out updating the targetQueue property during that time. I think this is a worthwhile tradeoff.

Let me know if you have any thoughts or catch any bugs!

Andy Matuschak

Very cool, Justin! I read over your new implementation in SDQueue.m, and I believe that your implementation of "safe" synchronous dispatch is now indeed much more safe. Very cool! This case still does break, though—the case in which the target queue of a non-empty queue is changed:

- (void)testNonEmptyTargetedQueueDeadlock {
    SDQueue *firstQueue = [[SDQueue alloc] initWithPriority:DISPATCH_QUEUE_PRIORITY_DEFAULT concurrent:NO label:@"1"];
    SDQueue *secondQueue = [[SDQueue alloc] initWithPriority:DISPATCH_QUEUE_PRIORITY_DEFAULT concurrent:NO label:@"2"];
    firstQueue.targetQueue = secondQueue;

    __block BOOL finished = NO;
    [firstQueue runSynchronously:^{
        [firstQueue runAsynchronously:^{
            [secondQueue runSynchronously:^{
                finished = YES;
            }];
        }];
        firstQueue.targetQueue = nil;
    }];
    [firstQueue runBarrierSynchronously:^{}];

    STAssertTrue(finished, @"");
}

That's because dispatch_set_target_queue has dispatch_barrier_async semantics: the SDTargetQueueKey of the receiver can get mutated before the underlying queue's target queue changes. So in this case, runSynchronously blithely executes dispatch_sync with the argument, even though the receiver is the target queue of a queue that's already running. You have to barrier the execution of the intermediate queue's block on the receiver's queue.

A few notes:

I don't think you should publish the +[SDQueue currentQueue] method. What could the client possibly use that for which wouldn't be horrifically unsafe? I'm glad you unpublished -[SDQueue isCurrentQueue], though!

Thinking about it more deeply, I think the semantics of using target queues while also using SDQueue's synchronous methods are very strange. You're specifying that computations on some queue Q must be executed within a particular environment by setting a target queue… but if Q is somewhere in the queue stack when runSynchronously: is called, that indication is ignored.

Andy Matuschak

For posterity, a few more notes about that last point:

Usage of a target queue has three correctness-affecting effects I can think of:

  • additional serialization semantics: a serial queue could target another serial queue, which would ensure that no code on the latter queue could be executing simultaneously with code on the former queue;
  • modification of the priority at which blocks should be executed;
  • and the semantics of dispatch_get_specific, since it will fall back to the target queue's value if the receiver doesn't have one.

I don't think that executing blocks re-entrantly on a conditional basis ruin the serialization semantics, since if the serial queue is already on the queue stack, those semantics will be enforced regardless of the queue on which the block's executed. For instance, if you have queue A targeting queue B, and you submit a block onto queue A which dispatch_syncs onto an unrelated serial queue C, which dispatch_syncs onto queue B, then that block can safely assume that no other code is executing on queue B, even if it's invoked re-entrantly from C.

The possibility of re-entrant code execution does indeed mean that code may get executed at a different priority than was intended. For instance, say you have queue A which targets the high-priority global queue, and queue B which targets the low-priority global queue. If you dispatch_sync onto B, and that block dispatch_syncs another block onto A, then the latter block will be executed at low priority, not high priority. This could have starvation/fairness ramifications in complex distributed systems, but I suspect it's not a big deal in practice in most Cocoa apps.

The semantics of dispatch_get_specific in this context could be pretty bad. For instance, this test fails, which could be quite surprising in practice:

- (void)testReentrantGetSpecific {
    SDQueue *firstQueue = [[SDQueue alloc] initWithPriority:DISPATCH_QUEUE_PRIORITY_DEFAULT concurrent:NO label:@"1"];
    SDQueue *secondQueue = [[SDQueue alloc] initWithPriority:DISPATCH_QUEUE_PRIORITY_DEFAULT concurrent:NO label:@"2"];
    firstQueue.targetQueue = secondQueue;

    static char key = 0;
    int value = 1;
    [firstQueue withGCDQueue:^(dispatch_queue_t queue, BOOL isCurrentQueue) {
        dispatch_queue_set_specific(queue, &key, (void *)value, NULL);
    }];

    __block BOOL readCorrectValue = NO;
    [secondQueue runSynchronously:^{
        [firstQueue runSynchronously:^{
            readCorrectValue = ((int)dispatch_get_specific(&key) == value);
        }];
    }];

    STAssertTrue(readCorrectValue, @"");
}

I don't think there's any real way to fix this, though. You should probably at least document the fact that dispatch_get_specific will not behave sanely with SDQueues.

Justin Spahr-Summers
Owner

For what it's worth, testNonEmptyTargetedQueueDeadlock is not actually valid, according to the documented semantics of the targetQueue property:

Because the setter for this property is synchronous, it will deadlock if the
calling code is executing on the receiver (directly or indirectly). If this
may be a possibility, consider setting this property in an asynchronous block
dispatched to a global queue.

However, the issue you're highlighting with it is indeed valid.

Andy Matuschak

Ah, good point!

Justin Spahr-Summers jspahrsummers referenced this issue from a commit August 15, 2012
Justin Spahr-Summers Remove +[SDQueue currentQueue]
As mentioned in #1, there's really no reason to use it that wouldn't be horribly unsafe.
7eac133
Justin Spahr-Summers
Owner

Finally quashed -testNonEmptyTargetedQueueDeadlock (well, the closest translation of it that honors the API). I think it should be pretty damn solid now, but let me know if you see anything else that might be problematic in the new implementation of setTargetQueue:.

Andy Matuschak andymatuschak closed this August 27, 2012
Andy Matuschak

I can't see any more holes! Well done, Justin.

I'm a big fan of the kind of comment which opens -setTargetQueue:. Good stuff. I really like the design pattern here where you using the intermediate queue to safely drain the queue and switch is target. That's something I can see myself using in other GCD-y systems.

Justin Spahr-Summers
Owner

:metal:

Justin Spahr-Summers jspahrsummers referenced this issue in ReactiveCocoa/ReactiveCocoa November 29, 2012
Closed

Scheduler improvements #150

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.