curl_multi cancel: heap corruption when AsyncCancellation interrupts curl_multi_select() #145

@EdmondDantes

Description

Repro (no harness, plain PHP, ~50 lines)

<?php
// Assumption: spawn(), delay() and await_all() come from the Async namespace
// (the script already calls \Async\delay() fully qualified in one place).
use function Async\spawn;
use function Async\delay;
use function Async\await_all;

// Local server: announces a 1 MiB body, then trickles 64 bytes every 50 ms.
$srv = stream_socket_server('tcp://127.0.0.1:0', $errno);
$addr = stream_socket_get_name($srv, false);

$accept = spawn(function() use ($srv) {
    while (true) {
        $c = @stream_socket_accept($srv, 30);
        if (!$c) return;
        // Consume the request headers up to the blank line.
        while (($l = @fgets($c)) !== false && $l !== "\r\n") {}
        @fwrite($c, "HTTP/1.1 200 OK\r\nContent-Length: 1048576\r\n\r\n");
        for ($i = 0; $i < 1000; $i++) { \Async\delay(50); if (@fwrite($c, str_repeat('x', 64)) === false) return; }
        @fclose($c);
    }
});

// Client: two easy handles on one multi handle, driven by exec/select.
$c = spawn(function() use ($addr) {
    $mh = curl_multi_init();
    $easy = [];
    foreach ([1, 2] as $_) {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, "http://$addr/");
        curl_setopt($ch, CURLOPT_WRITEFUNCTION, fn($ch, $d) => strlen($d));
        curl_multi_add_handle($mh, $ch);
        $easy[] = $ch;
    }
    try {
        do { curl_multi_exec($mh, $active); if ($active) curl_multi_select($mh, 1.0); } while ($active);
    } catch (\Async\AsyncCancellation $e) { echo "C: cancellation caught\n"; }
    finally { foreach ($easy as $ch) { @curl_multi_remove_handle($mh, $ch); @curl_close($ch); } @curl_multi_close($mh); }
});

// Killer: cancels the client after 30 ms, while it is inside curl_multi_select().
$k = spawn(function() use ($c) { delay(30); echo "K: cancel\n"; $c->cancel(); });
await_all([$c, $k]);
$accept->cancel();
echo "OK\n";

Observed (ASAN-ZTS build)

K: cancel
C: cancellation caught
Warning: Attempt to finalize a coroutine that is still in the queue in Unknown on line 0
zend_mm_heap corrupted
AddressSanitizer:DEADLYSIGNAL
==…==ERROR: AddressSanitizer: SEGV on unknown address …
    #1 zend_mm_panic Zend/zend_alloc.c:389
    #2 zend_mm_get_next_free_slot Zend/zend_alloc.c:1324
    #3 zend_mm_alloc_small Zend/zend_alloc.c:1408
    …
    #6 zend_array_dup Zend/zend_hash.c:2522
    #7 ZEND_BIND_STATIC_SPEC_CV_HANDLER Zend/zend_vm_execute.h:40871
    …
   #10 async_coroutine_execute ext/async/coroutine.c:527
   #11 execute_next_coroutine_from_fiber ext/async/scheduler.c:563

The cancellation IS caught by the coroutine: the user-level catch fires
and prints "C: cancellation caught". The crash is in the next coroutine
the scheduler picks up. It executes BIND_STATIC, which calls
zend_array_dup; the allocator finds the free list trashed and panics.
So the corruption happens during the unwind of the curl_multi state,
when the cancel lands inside `curl_multi_select()`.

Why it matters

curl_multi + cancellation is the canonical pattern for any HTTP gateway
that fans out parallel requests with a timeout / cancel watcher. With
this bug, every such cancellation has a chance to take the worker down
with heap corruption. It is all the more serious because it is a
memory-safety bug on a code path adjacent to the happy path.

Suspected locus: `ext/curl/curl_async.c` (same file as the chunked-body
bug fixed in #136). The "finalize a coroutine that is still in the
queue" warning hints at a curl-multi cancel handler that completes or
disposes of the coroutine while it is still in the scheduler run queue,
overwriting allocator metadata that a later allocation then trips over.

Found by

Drafting `fuzzy-tests/curl/curl_multi_chaos.feature` for umbrella #143.
4 of 7 scenarios (all the cancel-mid-multi-select ones) SEGV; the 3
no-cancel scenarios pass clean. Filed before shipping; the cancel
scenarios stay in the feature commented out under `# Blocked: #145`
until this is fixed.

Acceptance

The minimal repro above prints "OK" without firing AddressSanitizer
or zend_mm_panic. The four `curl_multi_chaos` cancel scenarios run
green on the fuzzy CI matrix.
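
The first acceptance criterion could be checked mechanically against the captured output of the repro. The following is a minimal sketch, not part of the existing test suite: the function name `reproOutputIsClean` is hypothetical, and the failure markers are simply the strings from the "Observed" output above (`zend_mm_heap corrupted` being the message printed when zend_mm_panic fires).

```php
<?php
// Hypothetical helper (name is illustrative): decides whether the captured
// output of the repro script satisfies the acceptance criterion.
function reproOutputIsClean(string $output): bool
{
    // Markers copied from the "Observed" section of this issue.
    $failureMarkers = [
        'zend_mm_heap corrupted', // printed on zend_mm_panic
        'AddressSanitizer',       // any ASAN report
    ];
    foreach ($failureMarkers as $marker) {
        if (strpos($output, $marker) !== false) {
            return false;
        }
    }
    // The repro's last statement echoes "OK\n", so require it as the final line.
    $lines = preg_split('/\R/', rtrim($output));
    return end($lines) === 'OK';
}
```

Fed the "Observed" output above, the helper returns false. It returns true only when no marker is present and the last line is "OK". It deliberately ignores the "finalize a coroutine" warning, since the acceptance text does not mention it; a stricter check could add that string to the marker list.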
