Skip to content

More work#109

Merged
sgerbino merged 13 commits intocppalliance:developfrom
sgerbino:pr/perf-vf
Feb 9, 2026
Merged

More work#109
sgerbino merged 13 commits intocppalliance:developfrom
sgerbino:pr/perf-vf

Conversation

@sgerbino
Copy link
Collaborator

@sgerbino sgerbino commented Feb 9, 2026

Summary by CodeRabbit

  • New Features

    • Added reset functionality to TLS streams, enabling session reuse without reconstructing stream objects.
  • Bug Fixes

    • Enhanced system call reliability with automatic retry logic for interrupted calls.
    • Improved I/O handling for zero-byte reads in network operations.
  • Improvements

    • Optimized scheduler task cleanup and timer state management for better performance and correctness.

sgerbino and others added 13 commits February 7, 2026 03:46
… variants

New benchmark categories with corosio, Asio coroutine, and Asio callback
implementations:

- timer: schedule/cancel throughput, zero-delay fire rate, concurrent
  timers with staggered intervals
- accept_churn: sequential connect/accept/close, concurrent accept loops,
  burst connection storms
- fan_out: fork-join coordination at varying fan-out, nested two-level
  fan-out, concurrent independent parents
Add ioc.stop() to Asio timer threads so pending TCP operations
don't block shutdown for 60+ seconds. Set SO_LINGER(true, 0) on
client sockets to avoid TIME_WAIT accumulation when benchmarks
run back-to-back under --library all.
Wrap readv, sendmsg, and accept4 calls in do/while loops
that retry on EINTR. Signals can interrupt these syscalls
on any platform that uses signal-based profiling or timers.
Prepares for upcoming run_async work tracking fix in capy. Once
run_async correctly calls on_work_started/on_work_finished, ioc->run()
will block until all spawned coroutines complete. The explicit stop()
ensures benchmarks shut down promptly regardless of work tracking
semantics.
- Remove extra signal_one from run_task after splicing ops
  (cascading wake in do_one handles all thread signaling)
- Remove private queue check from more_handlers in reactor
  path (private queue is thread-local, waking others is wrong)
- Remove dead drain_private_queue function and call site
  (work_cleanup and task_cleanup already drain private queues)
- Move private queue drain into task_cleanup destructor so
  work count is flushed before ops become visible to others
- Make task_running_ atomic (read outside mutex by callback)
- Defer timerfd_settime via timerfd_stale_ flag to avoid
  syscalls for timers scheduled then cancelled without waiting
- Remove unnecessary (void)on_exit casts
The test lambdas (do_round, fuse callback) were coroutines that
called ioc.run() internally while already being driven by an
outer ioc.run(). The inner run creates its own scheduler context
and cannot see ops posted to the outer context's private queue,
causing a deadlock. Convert these to plain functions since they
only use ioc.run()/restart() to drive sub-coroutines synchronously.
@coderabbitai
Copy link

coderabbitai bot commented Feb 9, 2026

📝 Walkthrough

Walkthrough

This pull request makes multiple types of changes across the codebase: updating copyright email formatting from obfuscated to standard format throughout files, adding a reset() pure virtual method to TLS stream interfaces with concrete implementations in OpenSSL and WolfSSL backends, introducing EINTR retry loops for I/O operations in the epoll backend, and refactoring the epoll scheduler's task cleanup, timer management, and state handling mechanisms.

Changes

Cohort / File(s) Summary
Copyright Email Updates
example/*, include/boost/corosio/*, perf/profile/*, src/corosio/src/*, src/openssl/src/*, src/wolfssl/src/*
Email format standardized from "vinnie dot falco at gmail dot com" to "vinnie.falco@gmail.com" across 70+ files. No functional impact.
TLS Stream Reset API
include/boost/corosio/tls_stream.hpp, include/boost/corosio/openssl_stream.hpp, include/boost/corosio/wolfssl_stream.hpp, src/openssl/src/openssl_stream.cpp, src/wolfssl/src/wolfssl_stream.cpp
Added pure virtual reset() method to tls_stream interface for clearing session state; implemented in openssl_stream and wolfssl_stream with per-instance used_ flag tracking and SSL state clearing via SSL_clear and buffer draining.
EINTR Retry Handling
src/corosio/src/detail/epoll/acceptors.cpp, src/corosio/src/detail/epoll/op.hpp, src/corosio/src/detail/epoll/sockets.cpp
Wrapped I/O operations (readv, sendmsg, accept4) in retry loops to handle EINTR signals without propagating interruption errors.
Epoll Scheduler Refactoring
src/corosio/src/detail/epoll/scheduler.cpp, src/corosio/src/detail/epoll/scheduler.hpp
Replaced mutable bool with std::atomic for task_running_ state; introduced timerfd_stale_ flag for deferred timer updates; removed work-flush helpers and integrated per-task cleanup with explicit lock pointer; refactored reactor wake logic and task lifecycle.
Echo Server Behavioral Change
example/echo-server/echo_server.cpp
Modified read loop break condition to only trigger on error (ec); removed previous break on zero-byte reads (n == 0), allowing zero-byte operations to continue.

Sequence Diagram(s)

sequenceDiagram
    participant App as Application
    participant TLS as tls_stream
    participant OpenSSL as openssl_stream
    participant SSL as SSL Library
    
    App->>TLS: reset()
    activate TLS
    TLS->>OpenSSL: reset()
    activate OpenSSL
    OpenSSL->>SSL: SSL_clear()
    activate SSL
    SSL-->>OpenSSL: state cleared
    deactivate SSL
    OpenSSL->>OpenSSL: drain ext_bio_
    OpenSSL->>OpenSSL: reapply hostname
    OpenSSL->>OpenSSL: used_ = false
    OpenSSL-->>TLS: reset complete
    deactivate OpenSSL
    TLS-->>App: ready for new handshake
    deactivate TLS
Loading
sequenceDiagram
    participant Task as Task Thread
    participant Scheduler as epoll_scheduler
    participant Cleanup as task_cleanup
    participant Queue as private_queue
    participant Lock as task_lock
    
    Task->>Scheduler: run_task()
    activate Scheduler
    Scheduler->>Scheduler: check timerfd_stale_
    Note over Scheduler: Defer timerfd update if stale
    Scheduler->>Cleanup: construct task_cleanup
    activate Cleanup
    Scheduler->>Queue: process private work
    Note over Queue: Work processing continues
    Scheduler-->>Task: task running
    deactivate Scheduler
    Task->>Scheduler: task ends
    Cleanup->>Lock: acquire lock
    Cleanup->>Queue: drain remaining work
    Cleanup->>Queue: update outstanding_work
    Cleanup-->>Task: cleanup complete
    deactivate Cleanup
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • tests and more #34 — Modifies TLS stream implementations (openssl_stream and wolfssl_stream) at the class/API level, directly overlapping with the reset() method additions and per-instance state tracking in this PR.
  • Epoll #101 — Alters epoll scheduler internals including timerfd/timer update logic and task-running state management, directly related to the scheduler refactoring and atomic state changes in this PR.
  • Refactor details directory structure #32 — Refactors epoll backend files (scheduler, op, sockets, acceptors) with overlapping changes to reactor logic, EINTR handling, and task management as this PR's epoll subsystem modifications.

Poem

🐰 A rabbit hops through code so keen,
From dotted mails to format clean,
TLS streams now reset their state,
EINTR loops won't hesitate!
With scheduler spun in atomic grace,
We've tidied up this cozy place! ✨

🚥 Pre-merge checks | ✅ 1 | ❌ 1
❌ Failed checks (1 inconclusive)
Check name Status Explanation Resolution
Title check ❓ Inconclusive The title 'More work' is vague and generic, providing no meaningful information about the changeset's primary objectives. Replace with a descriptive title that summarizes the main changes, such as 'Normalize email format and add TLS stream reset methods' or 'Update copyright emails and implement stream reset functionality'.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@cppalliance-bot
Copy link

An automated preview of the documentation is available at https://109.corosio.prtest3.cppalliance.org/index.html

If more commits are pushed to the pull request, the docs will rebuild at the same URL.

2026-02-09 01:48:16 UTC

@codecov
Copy link

codecov bot commented Feb 9, 2026

Codecov Report

❌ Patch coverage is 88.63636% with 30 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.46%. Comparing base (ac5807f) to head (c493368).
⚠️ Report is 13 commits behind head on develop.

Files with missing lines Patch % Lines
include/boost/corosio/test/mocket.hpp 86.02% 13 Missing ⚠️
src/corosio/src/detail/epoll/scheduler.cpp 79.16% 5 Missing ⚠️
include/boost/corosio/openssl_stream.hpp 66.66% 4 Missing ⚠️
src/openssl/src/openssl_stream.cpp 71.42% 4 Missing ⚠️
src/corosio/src/detail/epoll/op.hpp 66.66% 2 Missing ⚠️
src/corosio/src/detail/make_err.cpp 66.66% 2 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #109      +/-   ##
===========================================
+ Coverage    80.06%   80.46%   +0.39%     
===========================================
  Files           65       65              
  Lines         5569     5594      +25     
===========================================
+ Hits          4459     4501      +42     
+ Misses        1110     1093      -17     
Files with missing lines Coverage Δ
include/boost/corosio/detail/scheduler.hpp 100.00% <ø> (ø)
include/boost/corosio/endpoint.hpp 94.33% <ø> (ø)
include/boost/corosio/io_buffer_param.hpp 100.00% <100.00%> (ø)
include/boost/corosio/io_object.hpp 100.00% <ø> (ø)
include/boost/corosio/io_stream.hpp 100.00% <ø> (ø)
include/boost/corosio/ipv4_address.hpp 100.00% <ø> (ø)
include/boost/corosio/ipv6_address.hpp 100.00% <ø> (ø)
include/boost/corosio/resolver.hpp 97.40% <ø> (ø)
include/boost/corosio/resolver_results.hpp 100.00% <ø> (ø)
include/boost/corosio/signal_set.hpp 93.33% <ø> (ø)
... and 34 more

... and 1 file with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ac5807f...c493368. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@example/echo-server/echo_server.cpp`:
- Around line 56-57: The loop using read_some currently breaks only on ec, which
can spin if read_some returns (no_error, 0); reintroduce the n == 0 guard or
explicitly treat zero-byte reads as EOF: in the loop that calls read_some
(variables n and ec), change the exit condition to break when ec is set OR n ==
0 (e.g., if (ec || n == 0) break), or, if you prefer error semantics, detect n
== 0 and set ec = capy::cond::eof before breaking; also add a brief comment
referencing shutdown_send and optionally log the zero-length read to aid
debugging.

In `@include/boost/corosio/io_buffer_param.hpp`:
- Around line 327-333: copy_to is declared noexcept but calls fn_ (the function
pointer bound to copy_impl) which is not noexcept, risking std::terminate if
iterator ops throw; make the exception contract consistent by either removing
noexcept from copy_to or marking copy_impl (and any functions assigned to fn_)
noexcept as well so fn_ has a noexcept target; update the signature of copy_impl
to noexcept (and adjust any internal error handling) or remove noexcept from
copy_to so exceptions can propagate, ensuring you change all related overloads
and assignments to fn_ to match.

In `@src/corosio/src/detail/epoll/scheduler.cpp`:
- Around line 855-880: The destructor of task_cleanup must clear the scheduler's
task_running_ flag to ensure it isn't left true on exceptions from run_task;
modify task_cleanup::~task_cleanup to, after ensuring ctx is non-null and with
the mutex held (acquire lock if !lock->owns_lock()), perform
scheduler->task_running_.store(false, std::memory_order_release) (or equivalent)
before touching outstanding work/private_queue so the reactor state is reset
even during stack unwinding; after adding this change, remove the later
redundant task_running_ store in do_one to avoid duplicate writes.

In `@src/corosio/src/tls/detail/context_impl.hpp`:
- Line 39: Add a class-level doc comment above the tls_context_data struct
describing its role as the central portable TLS configuration holder consumed by
TLS backends, how it relates to tls_context::impl and native_context_base, and
what kinds of fields it contains (e.g., certificate, key, verification flags).
Place the comment immediately above the declaration of tls_context_data, mirror
the style/format of the existing docs used for native_context_base and
tls_context::impl, and keep it concise (one or two sentences) to satisfy the
project's docstring guidelines.
🧹 Nitpick comments (5)
perf/profile/concurrent_io_bench.cpp (1)

24-41: Add required implementation overview block after includes.

This file contains non-trivial logic but lacks the required /* */ high-level overview comment immediately after the includes. Please add a short overview describing how the benchmark orchestrates the workload and any non-obvious behavior.

As per coding guidelines "Files containing non-trivial implementation logic should include a /* */ block comment after the includes that provides a high-level overview of how the implementation works."

src/corosio/src/detail/iocp/signals.hpp (1)

22-24: Nit: duplicate #include <system_error>.

Lines 22 and 24 both include <system_error>. Pre-existing, but worth cleaning up while you're touching this file.

src/corosio/src/detail/iocp/timers.cpp (1)

20-32: Reasonable strategy switch with clear rationale.

The comment on lines 23-24 explains the performance motivation well. Minor nit: the #if 0 block on lines 27-31 is unreachable (after the return). Consider removing it or moving it above the return as a commented-out alternative to reduce dead code.

src/corosio/src/detail/epoll/op.hpp (1)

377-385: Add EINTR retry loops to select backend I/O operations for consistency with epoll.

The select backend's perform_io() methods in src/corosio/src/detail/select/op.hpp lack EINTR retry loops in their system calls (getsockopt, readv, sendmsg, accept, and fcntl), while the epoll backend consistently retries interrupted operations. For robustness and cross-backend consistency, these operations should handle EINTR by retrying the syscall, similar to the epoll backend's pattern.

src/wolfssl/src/wolfssl_stream.cpp (1)

996-1001: Missing null check on impl_ before delegation.

Both openssl_stream::reset() and wolfssl_stream::reset() dereference impl_ without a null check. After a move operation, impl_ is nullptr (see line 951). While other methods (handshake, shutdown, etc.) have the same pattern, reset() is a synchronous non-coroutine call more likely to be invoked on a moved-from object by accident.

That said, this is consistent with the existing design across the class, so flagging as optional.

Comment on lines +56 to 57
if (ec)
break;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Removing the n == 0 guard may cause a busy loop if read_some can return zero bytes without an error.

If the peer performs a graceful close and read_some returns (no_error, 0) instead of an EOF error code, this loop will spin indefinitely — reading zero bytes, writing zero bytes, and repeating. The previous n == 0 check was a safety net against this.

The retrieved learning notes that shutdown(shutdown_send) guarantees an explicit EOF error code, but a normal peer close may not have the same contract. Please verify that read_some always surfaces EOF as an ec (e.g., capy::cond::eof) on all close paths before relying solely on the error code.

,

#!/bin/bash
# Check how read_some handles EOF / zero-byte returns in the codebase
rg -n -C5 'read_some' --type=cpp -g '!**/test/**' | head -80

# Check if there's documentation or implementation detail about EOF behavior
rg -n -C3 'eof\|zero.byte\|n == 0\|bytes_transferred.*== 0' --type=cpp | head -60
🤖 Prompt for AI Agents
In `@example/echo-server/echo_server.cpp` around lines 56 - 57, The loop using
read_some currently breaks only on ec, which can spin if read_some returns
(no_error, 0); reintroduce the n == 0 guard or explicitly treat zero-byte reads
as EOF: in the loop that calls read_some (variables n and ec), change the exit
condition to break when ec is set OR n == 0 (e.g., if (ec || n == 0) break), or,
if you prefer error semantics, detect n == 0 and set ec = capy::cond::eof before
breaking; also add a brief comment referencing shutdown_send and optionally log
the zero-length read to aid debugging.

Comment on lines +327 to +333
std::size_t
copy_to(
capy::mutable_buffer* dest,
std::size_t n) const noexcept
{
return fn_(bs_, dest, n);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

copy_to is noexcept but copy_impl is not — potential std::terminate on throwing iterators.

copy_to is declared noexcept (line 330), but the function pointer fn_ targets copy_impl (line 337), which is not noexcept. If a buffer sequence's iterator operations throw, this will call std::terminate rather than propagating the exception.

If this is intentional (assuming well-behaved buffer sequences never throw during iteration), consider also marking copy_impl as noexcept to make the contract explicit and catch violations at the source.

Suggested diff
     template<capy::ConstBufferSequence BS>
     static std::size_t
     copy_impl(
         void const* p,
         capy::mutable_buffer* dest,
-        std::size_t n)
+        std::size_t n) noexcept
     {
🤖 Prompt for AI Agents
In `@include/boost/corosio/io_buffer_param.hpp` around lines 327 - 333, copy_to is
declared noexcept but calls fn_ (the function pointer bound to copy_impl) which
is not noexcept, risking std::terminate if iterator ops throw; make the
exception contract consistent by either removing noexcept from copy_to or
marking copy_impl (and any functions assigned to fn_) noexcept as well so fn_
has a noexcept target; update the signature of copy_impl to noexcept (and adjust
any internal error handling) or remove noexcept from copy_to so exceptions can
propagate, ensuring you change all related overloads and assignments to fn_ to
match.

Comment on lines 855 to 880
struct task_cleanup
{
epoll_scheduler const* scheduler;
std::unique_lock<std::mutex>* lock;
scheduler_context* ctx;

~task_cleanup()
{
if (ctx && ctx->private_outstanding_work > 0)
if (!ctx)
return;

if (ctx->private_outstanding_work > 0)
{
scheduler->outstanding_work_.fetch_add(
ctx->private_outstanding_work, std::memory_order_relaxed);
ctx->private_outstanding_work = 0;
}

if (!ctx->private_queue.empty())
{
if (!lock->owns_lock())
lock->lock();
scheduler->completed_ops_.splice(ctx->private_queue);
}
}
};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

task_cleanup destructor: verify exception-safety of the lock acquisition path.

If run_task throws (e.g., epoll_wait error at line 939) before re-acquiring the lock at line 984, this destructor runs during stack unwinding and correctly re-acquires the lock at line 876. However, if run_task throws, task_running_ remains true (it's only reset at line 1024 in do_one, after run_task returns). This leaves the scheduler in a state where no thread can become the reactor.

Consider resetting task_running_ inside task_cleanup's destructor to ensure it's cleared even on exceptional exit.

Proposed fix
     ~task_cleanup()
     {
+        scheduler->task_running_.store(false, std::memory_order_relaxed);
+
         if (!ctx)
             return;

         if (ctx->private_outstanding_work > 0)

If you do this, also remove the redundant store in do_one:

             run_task(lock, ctx);
 
-            task_running_.store(false, std::memory_order_relaxed);
             completed_ops_.push(&task_op_);
🤖 Prompt for AI Agents
In `@src/corosio/src/detail/epoll/scheduler.cpp` around lines 855 - 880, The
destructor of task_cleanup must clear the scheduler's task_running_ flag to
ensure it isn't left true on exceptions from run_task; modify
task_cleanup::~task_cleanup to, after ensuring ctx is non-null and with the
mutex held (acquire lock if !lock->owns_lock()), perform
scheduler->task_running_.store(false, std::memory_order_release) (or equivalent)
before touching outstanding work/private_queue so the reactor state is reset
even during stack unwinding; after adding this change, remove the later
redundant task_running_ store in do_one to avoid duplicate writes.

virtual ~native_context_base() = default;
};

struct tls_context_data
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Add class-level documentation for tls_context_data.

The tls_context_data struct is a central data structure that holds all portable TLS configuration, yet it lacks class-level documentation. Other classes in this file (native_context_base, tls_context::impl) have doc comments, creating an inconsistency. A brief description explaining its purpose, relationship to tls_context, and how it's consumed by TLS backends would improve maintainability. As per coding guidelines, classes and functions should have docstrings that include a brief description of purpose.

📝 Suggested documentation
+/** Portable TLS configuration data.
+
+    Contains all TLS settings (credentials, verification, protocol
+    options) in a backend-agnostic format. Each TLS backend
+    (OpenSSL, WolfSSL) consumes this data to configure its
+    native context. Also maintains a cache of backend-specific
+    native contexts as an intrusive list.
+*/
 struct tls_context_data
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
struct tls_context_data
/** Portable TLS configuration data.
Contains all TLS settings (credentials, verification, protocol
options) in a backend-agnostic format. Each TLS backend
(OpenSSL, WolfSSL) consumes this data to configure its
native context. Also maintains a cache of backend-specific
native contexts as an intrusive list.
*/
struct tls_context_data
🤖 Prompt for AI Agents
In `@src/corosio/src/tls/detail/context_impl.hpp` at line 39, Add a class-level
doc comment above the tls_context_data struct describing its role as the central
portable TLS configuration holder consumed by TLS backends, how it relates to
tls_context::impl and native_context_base, and what kinds of fields it contains
(e.g., certificate, key, verification flags). Place the comment immediately
above the declaration of tls_context_data, mirror the style/format of the
existing docs used for native_context_base and tls_context::impl, and keep it
concise (one or two sentences) to satisfy the project's docstring guidelines.

@cppalliance-bot
Copy link

GCOVR code coverage report https://109.corosio.prtest3.cppalliance.org/gcovr/index.html
LCOV code coverage report https://109.corosio.prtest3.cppalliance.org/genhtml/index.html
Coverage Diff Report https://109.corosio.prtest3.cppalliance.org/diff-report/index.html

Build time: 2026-02-09 01:54:12 UTC

@sgerbino sgerbino merged commit 2e13bb5 into cppalliance:develop Feb 9, 2026
31 of 32 checks passed
@sgerbino sgerbino deleted the pr/perf-vf branch February 9, 2026 02:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants