flow.async: perf: Concurrent_task_loop hierarchy should be de-virtualized and templatized. #72
Labels: `enhancement` (New feature or request), `from-akamai-pre-open` (Issue origin is Akamai, before opening source)

Filed by @ygoldfeld pre-open-source:
flow::async has been quite the work-horse for (projects) (EDIT: now also in Flow-IPC). It's held up under pressure IMO. However, as I make changes to it, I learn some lessons - and one of those is that I probably made a (non-fatal) design mistake in basing the Concurrent_task_loop hierarchy on a classic Java-style interface/virtual basis.
For ease of discussion, I'll pretend that the only way of posting a task is to use `L->post(F)`, or else `...async_X(..., asio_handler_via_op(L, O, F))`. The `schedule_*()` stuff is similar, as is `asio_handler_timed()`, respectively.

Currently there is a hierarchy:

* `Ctl::post()` is virtual and `override`s in the other 2.
  * Since it is virtual, it cannot take a tparam `Handler H`; it takes a `boost::function<void()>` (alias `Task`). This creates some unnecessary computing, as such an object must always be constructed (typically implicitly).
  * Furthermore, it is not possible to write something like `post()` that would not lose a bound executor (e.g., strand) of `H`; constructing a `function<>` always loses any bound executor - silently and evilly, as the resulting bugs are hard to detect (I speak from experience). While this is not a problem for public `post()` itself, it would be a problem if one wanted to wrap a handler with some added processing without losing its bound executor (like a strand).
* `asio_handler_via_op(L, O, F)` returns a handler that `L->post(O)`s the `H(params...)` call.
  * This is a perf loss; ideally it would return a handler that would just `bind_executor()` from the `O` (the strand within) in the case of `Cross_thread_task_loop`, without any added async delay from the added `post()`; and for `Segregated_thread_task_loop` it could just return the handler unchanged.
    * But it cannot do this by delegating different behavior depending on which `C_t_l` subclass it is, as any virtual method cannot take a tparam `Handler` (see above).
    * Note: This applies to any other thing like `asio_handler_timed()` and anything else of that nature we might want in the future.
What we should do IMO is to get rid of the C_t_l interface and replace it with a concept (it could still be rigorously documented like today). And then the individual subclasses wouldn't be subclasses anymore; they'd just behave the way a C_t_l must behave (duck typing).
This would make further development quite a bit more natural, I've found. I ran into issues (in implementing asio_handler_timed() and related) that wouldn't have been issues with this change. Vaguely speaking, the template-y nature of dealing with boost.asio handlers tends to come in conflict with the virtual interface design here.
And it would improve performance - in quite a critical place; who knows how many processor cycles the services that use flow.async could get back.
What would be lost? The following things actively use the virtual/interface design:

* Run-time selection of the concrete loop type through the common `C_t_l` interface.
  * But in reality - even though such use is documented in my doc headers - no code demands this ability that I know of. (EDIT: Flow is now open-source; that could change rapidly, in theory anyway.)
  * Moreover, even if run-time swap-in/swap-out is required somewhere, a virtual hierarchy can still exist by wrapping a subset of the modified classes. I.e., virtual on top of template = possible; but the reverse = not possible.
* The timed-loop decorator, which wraps any loop via the interface.
  * But it could without much pain just be rewritten to be templated. Hell, the main class (`Timed_concurrent_task_loop_impl<Accumulator>`) is already a template anyway; it would just gain an arg (ideally on ctor only, but failing that on the class).

In terms of priority, we can tackle this when there is a bit of free time. It's not urgent; the current setup works fine. But the sooner the better, both for field perf and before further flow.async changes are needed and/or more flow.async users exist.