
Another crash with stdexec::task (again probably premature coro destruction) #2047

@mika-fischer

Description


With main at 45c4cee, the reproducer at the end of this description crashes fairly reliably, especially with heap guards enabled in Application Verifier.

The chain of events seems to be:

  • The task awaits inline_affine_stopped_sender
  • stdexec::__as_awaitable::__sender_awaiter::await_suspend() starts the sender:
    constexpr auto
    await_suspend([[maybe_unused]] __std::coroutine_handle<> __continuation) noexcept
      -> __std::coroutine_handle<>
    {
        STDEXEC_ASSERT(this->__continuation_.handle() == __continuation);
        // Start the operation.
        STDEXEC::start(__opstate_);
  • The sender completes inline with set_stopped
  • Back in await_suspend(), the refcount check shows that the operation has already completed, so the awaiter calls __get_continuation():
    int const __old_refcount = this->__refcount_.fetch_sub(1, __std::memory_order_acq_rel);
    if (__old_refcount == 1)
    {
        // If the refcount was 1 before the decrement, then the operation has already
        // completed on the same thread and we are responsible for resuming the
        // continuation. Otherwise, we can let the receiver resume the continuation when
        // the operation completes.
        return this->__get_continuation();
  • __get_continuation() calls __continuation_.unhandled_stopped():
    constexpr auto __get_continuation() const noexcept -> __std::coroutine_handle<>
    {
        // If the operation was stopped (__result_ is valueless), we should use the
        // unhandled_stopped() continuation. Otherwise, should resume the __continuation_
        // as normal.
        return __result_.__is_valueless() ? __continuation_.unhandled_stopped()
                                          : __continuation_.handle();
    }
  • That enters task::__promise::unhandled_stopped():
    auto unhandled_stopped() const noexcept -> __std::coroutine_handle<>
    {
        return __state_->__canceled();
    }
  • task::__awaiter::__canceled() forwards stopped to the parent continuation:
    auto __canceled() noexcept -> __std::coroutine_handle<> final
    {
        this->__reset_callback();
        return this->__handle().promise().continuation().unhandled_stopped();
    }
  • The parent here is __connect_awaitable; its stopped handler calls set_stopped on the receiver:
    constexpr auto unhandled_stopped() noexcept -> __std::coroutine_handle<>
    {
        __get_opstate().__on_stopped();
        // Returning noop_coroutine here causes the __connect_awaitable
        // coroutine to never resume past its initial_suspend point
        return __std::noop_coroutine();
    }

    and
    constexpr void __on_stopped() noexcept
    {
        STDEXEC::set_stopped(static_cast<_Receiver&&>(__rcvr_));
    }
  • That reaches spawn’s receiver, which completes and destroys the spawn state:
    static void __do_complete(__spawn_state_base* __base) noexcept
    {
        auto* __self = static_cast<__spawn_state*>(__base);
        [[maybe_unused]]
        auto __assoc = std::move(__self->__assoc_);
        {
            using __traits = std::allocator_traits<_Alloc>::template rebind_traits<__spawn_state>;
            typename __traits::allocator_type __alloc(std::move(__self->__alloc_));
            __traits::destroy(__alloc, __self);
            __traits::deallocate(__alloc, __self, 1);
        }
    }
  • Destroying the spawn state destroys the task operation, which destroys the currently executing task coroutine frame, including the sender awaiter whose await_suspend() has not returned yet.
  • Control returns into __get_continuation() / await_suspend() and MSVC writes the returned coroutine handle through storage that lived in the now-destroyed coroutine frame:
    constexpr auto __get_continuation() const noexcept -> __std::coroutine_handle<>
    {
        // If the operation was stopped (__result_ is valueless), we should use the
        // unhandled_stopped() continuation. Otherwise, should resume the __continuation_
        // as normal.
        return __result_.__is_valueless() ? __continuation_.unhandled_stopped()
                                          : __continuation_.handle();
    }
  • Application Verifier catches that as INVALID_POINTER_WRITE_AVRF at __as_awaitable.hpp:135
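
The refcount check in await_suspend() from the chain above can be modelled in isolation. This is only a sketch of how I read that snippet, not stdexec code: `handshake`, `complete()` and `after_start()` are hypothetical names, and the initial refcount of 2 (one reference for the starter, one for the completion) is an assumption inferred from the comment in the snippet.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the awaiter's shared state; not stdexec code.
struct handshake {
    // Assumption: two parties hold a reference, the starter and the completion.
    std::atomic<int> refcount{2};

    // Called when the operation completes. Returns true if this call is the
    // last one out and therefore has to resume the continuation.
    bool complete() noexcept {
        return refcount.fetch_sub(1, std::memory_order_acq_rel) == 1;
    }

    // Called by the starter after start() has returned. Returns true if the
    // operation already completed, so the starter has to resume the continuation.
    bool after_start() noexcept {
        return refcount.fetch_sub(1, std::memory_order_acq_rel) == 1;
    }
};

// Inline completion: complete() runs inside start(), before after_start().
inline bool starter_resumes_on_inline_completion() {
    handshake h;
    bool const completion_resumes = h.complete();     // old refcount was 2
    bool const starter_resumes    = h.after_start();  // old refcount was 1
    return !completion_resumes && starter_resumes;
}

// Deferred completion: after_start() runs first, complete() runs later.
inline bool completion_resumes_on_deferred_completion() {
    handshake h;
    bool const starter_resumes    = h.after_start();  // old refcount was 2
    bool const completion_resumes = h.complete();     // old refcount was 1
    return !starter_resumes && completion_resumes;
}
```

With an inline completion the starter side always ends up responsible for the continuation, which is why the stopped path in this report runs entirely inside await_suspend().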

There appears to be a workaround for a very similar problem in older MSVC versions in coroutine.hpp:143-258. However, (a) the code path above does not use STDEXEC_CORO_DESTROY_AND_CONTINUE, and (b) we are using cl 19.44.35215, so the workaround is not active.
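
Setting the workaround aside, the last steps of the chain come down to the same rule as `delete this`: once a callback may have destroyed the object a member function is running on, only locals may be touched. The sketch below models that rule without coroutines. All names (`node`, `suspend_safely`, `run_model`) are hypothetical, and it is not meant as a fix for stdexec: in the real code path the returned handle is produced by the very call that destroys the frame, and the offending write is generated by the compiler.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical model; none of these names come from stdexec.
inline int nodes_destroyed = 0;

struct node {
    int next_token;                    // stands in for the handle to return
    std::function<void()> on_stopped;  // may destroy *this, like the stopped path above

    ~node() { ++nodes_destroyed; }

    // Hazardous shape (deliberately not implemented): call on_stopped() and
    // then read next_token, which would touch *this after it has been freed.

    // Safe shape: move everything that is needed into locals first.
    int suspend_safely() {
        int const token = next_token;
        auto callback = std::move(on_stopped);
        callback();    // *this may be gone after this line
        return token;  // only locals are used from here on
    }
};

// Builds a node whose stopped callback deletes the node itself.
inline int run_model() {
    auto* n = new node{42, {}};
    n->on_stopped = [n] { delete n; };
    return n->suspend_safely();
}
```

Calling `run_model()` destroys the node inside `suspend_safely()` and still returns the token, because nothing reads from the node after the callback has run.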

#include <stdexec/execution.hpp>
#include <exec/static_thread_pool.hpp>

#include <exception>  // std::terminate
#include <utility>    // std::move

struct inline_affine_stopped_sender {
    using sender_concept = stdexec::sender_tag;
    using completion_signatures = stdexec::completion_signatures<stdexec::set_stopped_t()>;

    template <class Receiver>
    struct operation {
        Receiver rcvr_;

        void start() & noexcept { stdexec::set_stopped(std::move(rcvr_)); }
    };

    template <class Receiver>
    auto connect(Receiver rcvr) && -> operation<Receiver> {
        return {std::move(rcvr)};
    }

    struct attrs {
        [[nodiscard]]
        static constexpr auto query(stdexec::__get_completion_behavior_t<stdexec::set_stopped_t>) noexcept {
            return stdexec::__completion_behavior::__inline_completion
                 | stdexec::__completion_behavior::__asynchronous_affine;
        }
    };

    [[nodiscard]]
    auto get_env() const noexcept -> attrs {
        return {};
    }
};

auto await_inline_stopped_sender() -> stdexec::task<void> {
    co_await inline_affine_stopped_sender{};
}

int main() {
    auto pool = exec::static_thread_pool(1);

    for (int iter = 0; iter < 100; ++iter) {
        auto scope = stdexec::counting_scope();
        stdexec::spawn(stdexec::starts_on(pool.get_scheduler(), await_inline_stopped_sender())
                           | stdexec::upon_error([](auto) noexcept { std::terminate(); }),
                       scope.get_token());
        stdexec::sync_wait(scope.join());
    }
}
