
[SR-14902] Binaries with Concurrency/Dispatch crash with _dispatch_queue_no_activate. #57249

@swift-ci

Description

Previous ID SR-14902
Radar rdar://problem/80383002
Original Reporter 3405691582 (JIRA User)
Type Bug
Environment

Linux (clean build and checkout at HEAD)

OpenBSD (with pr swiftlang/swift-corelibs-libdispatch#559)

Additional Detail from JIRA
Votes 0
Component/s
Labels Bug, Concurrency
Assignee None
Priority Medium

md5: 989fbed04827f8361dfc98879fa3bf00

Issue Description:

Presumably all Dispatch-based Concurrency binaries built at HEAD currently crash at _dispatch_queue_no_activate.

An easy reproduction on Linux is to execute one of the Concurrency tests, e.g.,

./llvm-project/llvm/utils/lit/lit.py -sv --param swift_site_config=./build/Ninja-DebugAssert/swift-linux-x86_64/test-linux-x86_64/lit.site.cfg swift/test/Concurrency/Runtime/async_let_throws.swift

The test will build ./build/Ninja-DebugAssert/swift-linux-x86_64/test-linux-x86_64/Concurrency/Runtime/Output/async_let_throws.swift.tmp/a.out and crash with SIGILL.

I am not completely sure why this happens, but based on some debugging I have suspicions as to what might be going awry. See, e.g., this (lightly redacted) GDB session:

(gdb) f
#0  _dispatch_queue_no_activate (dqu=...,
    allow_resume=0x2a529413ff0 <__OS_dispatch_queue_main_vtable>)
    at .../swift/swift-corelibs-libdispatch/src/init.c:652
652             DISPATCH_INTERNAL_CRASH(dx_type(dqu._dq), "dq_activate called");
(gdb) print *(*(struct dispatch_queue_global_s *)dqu._dq).do_vtable
$62 = {_os_obj_xref_dispose = 0x502, _os_obj_dispose = 0x0, _os_obj_vtable = {
    do_type = 1,
    do_kind = 0x2a5761bddb0 <jobInvoke(void*, void*, unsigned int)> "UH\211\345H\203\354@H\211}\370H\211u\360\211U\354H\213E\370H\211E\340H\213E\340H\211E\310H\213}\340\350$\027", do_dispose = 0x0, do_debug = 0x0, do_invoke = 0x0,
    dq_activate = 0x0, dq_wakeup = 0x0, dq_push = 0x0}}

Observe how do_kind, a const char *, holds the address of jobInvoke: a function pointer is sitting in a slot that Dispatch reads as a string. Swift tries to manufacture Dispatch-compatible objects when it uses Dispatch to implement Concurrency features (see Task.cpp#L257). This mismatch seems to suggest that the object is laid out one way on the Swift side and interpreted another way on the Dispatch side.

However, I suspect that the Swift side does not take into consideration that part of the object header in Dispatch is defined differently depending on whether USE_OBJC is 1 or 0 (see src/objc_internal.h#L174). What may also be relevant is that there is a similar deviation depending on whether OS_OBJECT_HAVE_OBJC1 is 1 or 0 (see src/object_internal.h#L436).
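To make the suspected failure mode concrete, here is a minimal sketch in C of how two sides that disagree about a vtable layout can end up with a function pointer in a string slot. The struct names, field order, and the job_invoke function are hypothetical stand-ins chosen for illustration. They are not the real libdispatch or Swift runtime definitions, and the sketch does not claim to reproduce the actual offsets involved.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only; NOT the real
 * libdispatch or Swift runtime definitions. */
typedef void (*invoke_fn)(void *, void *, unsigned int);

/* Layout the producer assumes: header, type, then the invoke slot. */
struct producer_vtable {
    void *header[2];
    unsigned long do_type;
    invoke_fn do_invoke;
};

/* Layout the consumer assumes: one extra slot (a name string)
 * sits before the invoke slot. */
struct consumer_vtable {
    void *header[2];
    unsigned long do_type;
    const char *do_kind;
    invoke_fn do_invoke;
};

static void job_invoke(void *a, void *b, unsigned int c) {
    (void)a; (void)b; (void)c;
}

/* Returns 1 when the two sides disagree on where do_invoke lives. */
int layouts_disagree(void) {
    return offsetof(struct producer_vtable, do_invoke)
        != offsetof(struct consumer_vtable, do_invoke);
}

/* Fills the vtable using the producer's layout, then reinterprets the
 * same bytes using the consumer's layout. Returns 1 when the function
 * pointer lands in do_kind and do_invoke is left null. */
int kind_slot_holds_function(void) {
    /* Assumption: code and data pointers have the same size here. */
    assert(sizeof(invoke_fn) == sizeof(const char *));

    struct producer_vtable p;
    memset(&p, 0, sizeof p);
    p.do_type = 1;
    p.do_invoke = job_invoke;

    struct consumer_vtable c;
    memset(&c, 0, sizeof c);
    memcpy(&c, &p, sizeof p); /* consumer reads the producer's bytes */

    invoke_fn expected = job_invoke;
    return c.do_type == 1
        && memcmp(&c.do_kind, &expected, sizeof expected) == 0
        && c.do_invoke == NULL;
}
```

In this sketch the consumer sees do_type = 1, a do_kind that is really the address of job_invoke, and a null do_invoke, which is the same pattern as in the GDB output above. Whether the real mismatch comes from the USE_OBJC-dependent header is still only a suspicion.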

(The main reason I am not 100% sure that this is the root cause of the crash is that I do not understand how _dispatch_queue_no_activate is reached if the suspect vtable has 0x0 for dq_activate, but the above analysis may nonetheless be relevant to a potential fix. I have not yet experimented with reordering the metadata in Task.cpp to bring the Swift and Dispatch layouts into alignment. I suspect this was not tripped earlier because the code as it stands at HEAD aligns correctly with the ObjC-enabled branch.)
