96 curl async file io #97

Merged
EdmondDantes merged 77 commits into main from 96-curl-async-file-io on Mar 12, 2026

Conversation

@EdmondDantes
Contributor

No description provided.

- tests/curl/011-curl_file_upload.phpt: reproducer for fiber assertion
  crash when curl_exec is used with CURLFile in async mode
- tests/common/test_router.php: add /upload endpoint for file upload tests

uv_fs_read/write/fsync/fstat are libuv requests (not handles) that keep
uv_loop_alive() true but were invisible to ZEND_ASYNC_ACTIVE_EVENT_COUNT.
The reactor exited prematurely while file I/O callbacks were still pending,
causing deadlocks in async curl file writes.

Add INCREASE/DECREASE_EVENT_COUNT around all four async file I/O operations
and their completion callbacks (io_file_read_cb, io_file_write_cb,
io_file_flush_cb, io_file_stat_cb).
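A minimal sketch of the counting pattern described above, in plain C without libuv. `reactor_t`, `fs_req_submit`, `fs_req_complete`, and `reactor_should_exit` are hypothetical names used only for illustration; the actual change wraps the uv_fs_* calls and their callbacks with the INCREASE/DECREASE_EVENT_COUNT macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the reactor's active-event counter. */
typedef struct {
    size_t active_events;
} reactor_t;

typedef struct fs_req fs_req_t;
typedef void (*fs_done_cb)(fs_req_t *req);

struct fs_req {
    reactor_t *reactor;
    fs_done_cb on_done;
    bool completed;
};

/* Submit: count the request before it is handed to the backend,
 * so the reactor knows work is still pending. */
static void fs_req_submit(reactor_t *r, fs_req_t *req, fs_done_cb cb)
{
    req->reactor = r;
    req->on_done = cb;
    req->completed = false;
    r->active_events++;
}

/* Completion callback path: release the count exactly once. */
static void fs_req_complete(fs_req_t *req)
{
    assert(req->reactor->active_events > 0);
    req->reactor->active_events--;
    req->completed = true;
    if (req->on_done != NULL) {
        req->on_done(req);
    }
}

/* The reactor may stop only when nothing is counted as pending. */
static bool reactor_should_exit(const reactor_t *r)
{
    return r->active_events == 0;
}
```

With this pairing, a request that is in flight keeps `reactor_should_exit()` false until its completion callback has run.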

Also add curl async write tests (012-022).

Document the PAUSE/unpause reliability issues in libcurl < 8.11.1
(timer_lastcall optimization, tempcount guard on cselect_bits,
CURLINFO_ACTIVESOCKET unreliability) and the applied solution:
sync read fallback for file uploads on old curl versions.

Tests:
- 023: write callback exception (crashes; known scheduler bug)
- 024: upload nonexistent file (PASS; verifies CURL_READFUNC_ABORT)
- 025: write to broken pipe (crashes; known async IO bug)

Plan: curl-plan/PLAN.md (phases for remaining async curl work)

Test 026: CURLOPT_WRITEHEADER writes response headers to file asynchronously.
Test 027: CURLOPT_HEADERFUNCTION user callback collects headers asynchronously.

…ons in the enqueue operation and other logic checks.

+ Proper error propagation has been added for CURL-type functions.

The sync_io optimization (direct read/write syscalls instead of libuv
async I/O) was incorrectly enabled on Linux (#ifndef PHP_WIN32) where
libuv uses cheap epoll, and disabled on Windows where libuv async I/O
goes through a helper process. This caused hangs and crashes when
non-blocking descriptors were used (e.g. broken pipe via proc_open).
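As a small illustration of why direct syscalls are a poor fit for non-blocking descriptors: on POSIX, a plain read() on an empty non-blocking pipe returns immediately with EAGAIN instead of waiting for data. The sketch below shows only that behavior; the helper name is hypothetical and this is not code from the PR.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Creates a pipe, makes the read end non-blocking, and attempts a read
 * while no data is available. Returns the errno observed, or -1 if the
 * setup itself failed. */
static int errno_from_empty_nonblocking_read(void)
{
    int fds[2];
    int flags, err;
    char byte;
    ssize_t n;

    if (pipe(fds) != 0) {
        return -1;
    }
    flags = fcntl(fds[0], F_GETFL);
    if (flags == -1 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    errno = 0;
    n = read(fds[0], &byte, 1);   /* does not wait: the pipe is empty */
    assert(n <= 0);               /* nothing was written, so no data can arrive */
    err = (n == -1) ? errno : 0;

    close(fds[0]);
    close(fds[1]);
    return err;
}
```

A caller that treats any negative return as a hard failure, or that retries in a tight loop, would misbehave here; an event-loop based path waits for readiness instead.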

- Flip all sync_io #ifdefs to #ifdef PHP_WIN32
- Rewrite sync I/O blocks with Windows CRT APIs (_read/_write/_lseeki64)
- Remove POSIX-specific EINTR/EAGAIN handling from Windows path
- Add tests for non-blocking pipe read/write in async context

- scheduler.c: Restore root_function.common.function_name initialization
  ("{core}scheduler") and its cleanup in shutdown. Was previously commented
  out, causing missing function name in stack traces.

- tests/simulate_run_tests.php: Diagnostic script that reproduces how
  run-tests.php launches tests via proc_open with pipes + "2>&1".
  Used to identify a bug where stream_select() returns ready for stderr
  (EOF due to 2>&1) but the code blindly reads from stdout, which blocks
  forever when a grandchild process (php -S) inherits the stdout pipe fd.
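The stream_select() pitfall above can be sketched in C with poll() as an analogue (the diagnostic script itself is PHP; `ready_pipes` and the READY_* constants are hypothetical names). The point is to read only from descriptors that were actually reported ready, rather than assuming stdout is readable whenever the call returns.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

enum { READY_STDOUT = 1, READY_STDERR = 2 };

/* Polls both pipe read ends without waiting and returns a bitmask of the
 * ones that are readable or at EOF. A caller should read only from the
 * descriptors flagged here. */
static int ready_pipes(int out_fd, int err_fd)
{
    struct pollfd fds[2] = {
        { .fd = out_fd, .events = POLLIN },
        { .fd = err_fd, .events = POLLIN },
    };
    int mask = 0;

    assert(out_fd >= 0 && err_fd >= 0);
    if (poll(fds, 2, 0) <= 0) {
        return 0;   /* nothing ready, or poll failed */
    }
    if (fds[0].revents & (POLLIN | POLLHUP)) {
        mask |= READY_STDOUT;
    }
    if (fds[1].revents & (POLLIN | POLLHUP)) {
        mask |= READY_STDERR;
    }
    return mask;
}
```

If the stderr pipe has reached EOF while another process still holds the stdout write end open, only READY_STDERR is set, and a blocking read on stdout at that point would wait indefinitely.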
The broken pipe scenario intentionally triggers "Send of N bytes failed
with errno=32 Broken pipe" notices from the scheduler. These are expected
side effects, not test failures. Suppress them via --INI-- section to keep
the test output clean and matching EXPECTF.

- Set SCHEDULER_CONTEXT during cancel_queued_coroutines so coroutine
  cancellation callbacks execute in scheduler context.
- Call process_resumed_coroutines after cancellation to flush any
  coroutines resumed by cancellation callbacks.
- Update 033-read_user_exception test expectation: exception from
  READFUNCTION callback now correctly propagates through await().

Regression test: ob_start() without explicit ob_end_flush() must
auto-flush buffered output at shutdown, even after file_put_contents()
triggers coroutine OB context creation.

Cover CURLFile uploads, write/read/header callbacks, file output,
exception propagation, concurrent uploads, and mixed callback modes
in curl_multi_* with async coroutines.

Tests: 036-045 (10 new tests)

Test curl_multi operations across multiple simultaneous coroutines:
- Two coroutines each with own curl_multi handle (046)
- Two coroutines with different callback modes: WRITEFUNCTION vs FILE (047)
- Mixed curl_exec and curl_multi in concurrent coroutines (048)
- Two coroutines each uploading CURLFile via curl_multi (049)
- Exception isolation: error in one coroutine doesn't affect another (050)

Update existing tests (043, 050) for new exception propagation behavior:
exceptions from user callbacks now throw from curl_multi_exec().

New tests for multi mode error scenarios:
- 051: READFUNCTION exception propagates to curl_multi_exec
- 052: HEADERFUNCTION exception propagates to curl_multi_exec
- 053: CURLFile nonexistent file returns error via curl_multi_info_read
- 054: CURLOPT_FILE broken pipe triggers CURLE_WRITE_ERROR
- 055: Connection error (invalid port) reports error correctly
- 056: XFERINFOFUNCTION exception propagates to curl_multi_exec

- 057: reuse same handle + same $fp across iterations
- 058: fclose($fp) between requests (early stream close)
- 059: different $fp on each iteration (open/close in loop)
- 060: concurrent coroutines each reusing a handle with $fp

New field `acting_coroutine` in zend_async_globals_t. When set,
zend_get_executed_filename_ex(), zend_get_executed_lineno(), and
get_active_function_name() use the coroutine's suspended execute_data
for error reporting instead of showing {core}scheduler() at line 0.

Scheduler resets acting_coroutine to NULL on each tick as safety net.
Macros: ZEND_ASYNC_ACT_AS_START(), ZEND_ASYNC_ACT_AS_END().

Leave root_function.common.function_name as NULL so the scheduler
frame is invisible in backtraces, matching standard Fiber behavior.

… offset

- Do not set ZEND_ASYNC_IO_EOF in io_file_read_cb and Windows sync fallback,
  as file EOF is temporary (another process can write more data)
- Replace lseek(SEEK_END) with fstat() in libuv_io_create for append mode
  to avoid moving the fd position as a side effect
- Use PHPWRITE_CORO in exec_read_cb for SYSTEM and PASSTHRU modes
  so output goes through the correct coroutine's output buffer
- Only create stderr pipe when std_error is requested; otherwise
  use UV_INHERIT_FD so child stderr goes to parent's stderr
- Store coroutine reference in async_exec_event_t for PHPWRITE_CORO
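The lseek(SEEK_END) versus fstat() point in the list above can be shown with a short POSIX sketch. The helper names are hypothetical and this is not the libuv_io_create code; it only demonstrates that fstat() reports the size without moving the descriptor's position, while lseek(SEEK_END) moves it.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* File size via fstat(): the descriptor's position is untouched. */
static off_t size_via_fstat(int fd)
{
    struct stat st;
    if (fstat(fd, &st) != 0) {
        return (off_t)-1;
    }
    return st.st_size;
}

/* File size via lseek(): side effect, the position moves to EOF. */
static off_t size_via_lseek(int fd)
{
    return lseek(fd, 0, SEEK_END);
}

/* Writes 5 bytes to a temp file, seeks to offset 2, queries the size with
 * the chosen method, and returns the position afterwards (-1 on setup error). */
static long position_after_size_query(int use_fstat)
{
    FILE *tmp = tmpfile();
    long pos = -1;
    int fd;

    if (tmp == NULL) {
        return -1;
    }
    fd = fileno(tmp);
    if (write(fd, "hello", 5) == 5 && lseek(fd, 2, SEEK_SET) == 2) {
        off_t size = use_fstat ? size_via_fstat(fd) : size_via_lseek(fd);
        assert(size == 5);
        pos = (long)lseek(fd, 0, SEEK_CUR);
    }
    fclose(tmp);
    return pos;
}
```

With fstat() the position stays at 2; with lseek(SEEK_END) it ends up at 5, which is the side effect the commit avoids.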
… sync

uv_fs_write/read with explicit offset uses pwrite/pread, which do not
move the kernel file offset. This breaks scenarios where multiple fds
share the same open file description (e.g. dup/redirect in proc_open):
each async_io_t tracked its own offset independently, causing writes
through one fd to overwrite data written through the other.

Fix: pass offset=-1 to uv_fs_write/uv_fs_read so libuv uses regular
write()/read() which move the kernel file offset. Update tracked offset
from kernel position via lseek(SEEK_CUR) after completion. Also sync
kernel offset in libuv_io_seek for fseek() calls.
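A minimal POSIX sketch of the overwrite described above, assuming two descriptors that share one open file description via dup(). `write_through_dup` is a hypothetical helper, not code from the PR. In positional mode each descriptor uses pwrite() with its own tracked offset; in the other mode both use write(), so the kernel offset is shared.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Writes "AAAA" through fd and "BBBB" through a dup of it, then reads the
 * file back into out. Returns the file length, or -1 on error. */
static int write_through_dup(int positional, char *out, size_t out_size)
{
    FILE *tmp = tmpfile();
    int fd, fd2, len = -1;
    ssize_t n;

    assert(out_size > 8);
    if (tmp == NULL) {
        return -1;
    }
    fd = fileno(tmp);
    fd2 = dup(fd);
    if (fd2 < 0) {
        fclose(tmp);
        return -1;
    }

    if (positional) {
        /* Each descriptor tracks its own offset and both start at 0,
         * so the second write lands on top of the first. */
        off_t off1 = 0, off2 = 0;
        if (pwrite(fd, "AAAA", 4, off1) != 4) goto done;
        if (pwrite(fd2, "BBBB", 4, off2) != 4) goto done;
    } else {
        /* write() advances the shared kernel offset, so the second
         * write continues where the first one ended. */
        if (write(fd, "AAAA", 4) != 4) goto done;
        if (write(fd2, "BBBB", 4) != 4) goto done;
    }

    n = pread(fd, out, out_size - 1, 0);
    if (n >= 0) {
        out[n] = '\0';
        len = (int)n;
    }
done:
    close(fd2);
    fclose(tmp);
    return len;
}
```

In positional mode the file ends up as `BBBB` (4 bytes); with plain write() it is `AAAABBBB` (8 bytes), which matches the behavior the fix relies on.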

Added tests for offset correctness with dup'd fds, sequential writes,
and seek+write+read.

On Windows, proc_open pipes that the child never writes to return
EBADF on _read(). The old sync path (PeekNamedPipe) silently handled
this. Treat EBADF as EOF instead of throwing an IO exception.

…read sizes

- libuv_io_read() now accepts char *buf; if non-NULL it is used directly
  (buf_owned=false) and dispose() skips the free, eliminating the double
  allocation that caused OOM when reading large files (e.g. 5 GB).
- Added bool buf_owned to async_io_req_t to track buffer ownership.
- Cap max_size to INT_MAX at the top of libuv_io_read so _read() and
  uv_buf_t.len (32-bit ULONG on Windows) never overflow; the streams
  loop retries for the remainder.
- Cap write count to INT_MAX in _write() paths of libuv_io_write,
  fixing file_put_contents() silent truncation on >2 GB writes on Windows.
- Fix passthru() binary data corruption on Windows: CRLF->LF stripping
  now applies only to SHELL_EXEC (which mirrors popen("rt")); all other
  modes including PASSTHRU receive raw bytes unchanged.
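The "cap each call, let the outer loop retry for the remainder" idea from the list above can be sketched as follows. All names here (`chunk_io_fn`, `read_capped`, `mem_read`) are hypothetical; the cap is a parameter so the behavior is visible with small values, whereas the commit caps at INT_MAX.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical backend that, like _read()/_write(), takes a 32-bit count. */
typedef int (*chunk_io_fn)(void *ctx, char *buf, unsigned int len);

static size_t clamp_chunk(size_t want, size_t cap)
{
    return want > cap ? cap : want;
}

/* Reads up to total bytes by issuing calls of at most cap bytes each.
 * Stops early on EOF or error and returns the number of bytes read. */
static size_t read_capped(chunk_io_fn io, void *ctx, char *buf,
                          size_t total, size_t cap)
{
    size_t done = 0;

    assert(cap > 0 && cap <= INT_MAX);
    while (done < total) {
        size_t chunk = clamp_chunk(total - done, cap);
        int n = io(ctx, buf + done, (unsigned int)chunk);
        if (n <= 0) {
            break;
        }
        done += (size_t)n;
    }
    return done;
}

/* In-memory source used only to exercise the loop. */
typedef struct {
    const char *data;
    size_t len;
    size_t pos;
} mem_src_t;

static int mem_read(void *ctx, char *buf, unsigned int len)
{
    mem_src_t *s = ctx;
    size_t left = s->len - s->pos;
    size_t n = len < left ? len : left;

    memcpy(buf, s->data + s->pos, n);
    s->pos += n;
    return (int)n;
}
```

The caller's buffer is written to directly, so no second allocation is needed regardless of how many capped calls the loop issues.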
- Replace all bare lseek/_lseeki64 calls with zend_lseek for
  cross-platform consistency
- Rewrite libuv_io_seek to accept whence parameter and return position,
  eliminating double lseek in php_stdiop_seek
- Initialize append-mode file offset by querying EOF at io_create time,
  then restoring fd to 0 to match POSIX O_APPEND ftell semantics
- On Windows, query real EOF via lseek(SEEK_END) before each async
  append write to avoid stale cached offsets
- Skip updating file.offset on seek in append mode to prevent
  corrupting subsequent writes after fseek
- Mark test 069 (concurrent two-coroutine append) as XFAIL: Windows
  WriteFile ignores CRT _O_APPEND when FILE_WRITE_DATA is present,
  and removing it breaks ftruncate

uv_spawn expects UTF-8 strings but PHP may pass strings in the current
code page (e.g. CP1251). Convert cmd and cwd from the active code page
to UTF-8 before spawning. Also remove quoted_cmd from struct since
uv_spawn copies all options internally - free temporaries immediately.

  - libuv_io_close: pure close without dispose, handles all types (STREAM + UDP)
  - libuv_io_event_dispose: calls close if needed, then callbacks_free + free
  - curl_async: don't close borrowed IO, just clear the pointer
  - Fix intptr_t for php_stream_cast fd to prevent stack corruption on x64
  - Add test 063-readdata_no_callback

  When curl calls php_stdiop_cast(PHP_STREAM_AS_STDIO) on a stream with
  async IO, the fd is dup'd to avoid dual ownership with libuv. However,
  the original fd was lost because stdiop_cast unconditionally set
  data->fd = SOCK_ERR. On Windows this caused the file to remain locked
  after fclose (Permission denied on reopen/unlink).

  Two fixes:
  1. stdiop_cast: preserve data->fd when async IO owns a dup'd copy
  2. stdiop_close: close dup'd FILE* for all IO types (not just streams),
     so the normal close logic below can close the original fd

EdmondDantes linked an issue on Mar 10, 2026 that may be closed by this pull request

io_close_cb (called by libuv after uv_close) was unconditionally
freeing the async_io_t object. Any code accessing the object after
ZEND_ASYNC_IO_CLOSE would hit use-after-free (e.g. dispose() in
php_stdiop_close, causing segfault with USE_ZEND_ALLOC=0).

Fix: use refcount to manage async_io_t lifetime. libuv_io_close
adds a ref before uv_close, io_close_cb calls dispose which
decrements it. The object is only freed when refcount reaches 0,
ensuring all users have released their references.
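A minimal sketch of the refcount scheme described above, without libuv. `io_obj_t` and the `io_*` functions are hypothetical stand-ins for async_io_t, libuv_io_close, io_close_cb, and dispose; the `freed` flag exists only so the example can observe when memory is released.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    unsigned refcount;
    int closed;
    int *freed;   /* demo only: set to 1 when the object is released */
} io_obj_t;

/* The creator holds the first reference. */
static io_obj_t *io_new(int *freed)
{
    io_obj_t *io = calloc(1, sizeof(*io));
    if (io != NULL) {
        io->refcount = 1;
        io->freed = freed;
    }
    return io;
}

/* Drops one reference; frees only when the last one is gone. */
static void io_dispose(io_obj_t *io)
{
    assert(io->refcount > 0);
    if (--io->refcount == 0) {
        *io->freed = 1;
        free(io);
    }
}

/* Close takes an extra reference on behalf of the pending close callback. */
static void io_close(io_obj_t *io)
{
    io->refcount++;
    io->closed = 1;
}

/* Runs later from the event loop; releases only the reference taken in io_close. */
static void io_close_cb(io_obj_t *io)
{
    io_dispose(io);
}
```

After `io_close()` and `io_close_cb()` the object is still valid for its owner, and it is released only when the owner calls `io_dispose()` as well.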
…or destroy

- Track all IO handles in active_io_handles HashTable
- On reactor shutdown: call on_detach callback, preserve orig_fd, close/dispose
- on_detach allows plain_wrapper to clear async_io pointer so streams work sync
- Fixes crash in executor_globals_dtor when streams outlive the reactor
- Fixes 15 false-positive memory leak reports in mysqli_debug tests
- Remove completed TODO entry

Split libuv_reactor_shutdown into detach_io + shutdown phases.
detach_io runs before RSHUTDOWN to disconnect IO handles from streams;
shutdown runs after shutdown_executor to destroy the reactor.

Add uv_run(UV_RUN_NOWAIT) before uv_loop_close to process pending
uv_close callbacks from poll events disposed during shutdown_executor
(e.g. curl free_obj), fixing memory leaks in curl_postfields_array test.

Runs only the failing bailout/exec/include tests with
debug + ZTS + ASAN to reproduce the libuv_io_close crash
in executor_globals_dtor.

Replace ZEND_ASYNC_DEACTIVATE with ZEND_ASYNC_INITIALIZE in
async_scheduler_main_coroutine_suspend(). DEACTIVATE set the
state to OFF too early — before php_request_shutdown() had a
chance to run REACTOR_DETACH_IO, causing it to be skipped on
subsequent repeat iterations.

Also add shutdown lifecycle documentation for --repeat mode.

…ith --repeat

JIT + --repeat 2 causes use-after-free in zend_jit_rope_end() on the
second iteration. This is a php-src JIT bug, not async-specific.
Reproduces with plain PHP code (no fibers/coroutines needed).

EdmondDantes merged commit 6a78f3b into main on Mar 12, 2026
1 check passed
EdmondDantes deleted the 96-curl-async-file-io branch on March 12, 2026 07:46
Development

Successfully merging this pull request may close these issues.

CURL ASYNC FILE IO

1 participant