R2 RFC for task_group dynamic dependencies #1664

kboyarinov · 2025-03-03T17:09:40Z

Description

Add RFC describing the semantics for concrete task_group dynamic dependencies APIs


Co-authored-by: Aleksei Fedotov <aleksei.fedotov@intel.com>

Co-authored-by: Alexandra <alexandra.epanchinzeva@intel.com>

Co-authored-by: Alexey Kukanov <alexey.kukanov@intel.com>

Co-authored-by: Konstantin Boyarinov <konstantin.boyarinov@intel.com>

* Remove concrete proposals from the RFC
* Apply review comments

…k_group_dynamic_dependencies

…fc-dynamic-dependencies-r2

rfcs/proposed/task_group_dynamic_dependencies/task_handle_semantics.md

akukanov · 2025-03-10T14:30:53Z

rfcs/proposed/task_group_dynamic_dependencies/task_handle_semantics.md

+    * Semantics for ``task_group::run``, ``task_group::run_and_wait``, ``task_arena::enqueue`` and ``this_task_arena::enqueue`` should be defined
+      when the input ``task_handle`` is handling a task in various states.
+    * Semantics for returning a ``task_handle`` handling a task in various states from the body of the task should be described (while using the existing
+      preview extensions for ``task_group``). 


Unfortunately, these two sub-items are not sufficient. First and foremost, the task_handle class needs to evolve from the current move-only type to a type that can be safely shared between multiple threads and across program scopes.

Addressed by introducing the task_tracker.

akukanov · 2025-03-10T14:33:47Z

rfcs/proposed/task_group_dynamic_dependencies/task_handle_semantics.md

+(e.g. using ``task_group::run(std::move(task_handle))``) looks misleading even if some guarantees are provided for the referred handle object. It also creates an error-prone
+for TBB developers - any move-construct or move-assign from the accepted handle will break the guarantee. Code analyzers?


It also creates an error-prone for TBB developers

An error-prone what? And whom specifically do you refer to as TBB developers?

Also, I would suggest using "move construction" and "move assignment" here, with no hyphen (the hyphen is needed to form adjectives etc., e.g. "move-assignable", "move-assigned"). This would be consistent with how the C++ standard uses the terms.

I meant it was easy to make a mistake and break the guarantees while having the non-const rvalue reference in the argument to run. This issue should be addressed by introducing the task_tracker.

akukanov · 2025-03-10T14:57:13Z

rfcs/proposed/task_group_dynamic_dependencies/task_handle_semantics.md

+(e.g. using ``task_group::run(std::move(task_handle))``) looks misleading even if some guarantees are provided for the referred handle object. It also creates an error-prone
+for TBB developers - any move-construct or move-assign from the accepted handle will break the guarantee. Code analyzers?
+
+To handle this, the proposal is to extend all functions that take the task handled by the ``task_handle`` with the new overload taking an non-const lvalue reference and provide the following guarantees:


I am afraid that taking a non-const lvalue reference is error-prone.

Consider one of the examples from the umbrella RFC (a bit simplified here):

    tbb::task_handle add_another_task(tbb::task_group& tg, int work_id) {
        tbb::task_handle new_task = tg.defer([=] { do_work(work_id); });
        // ...
        tg.run(new_task); // takes the handle by reference
        // Return the newly created task to the caller
        return new_task;
    }

I hope the problem is obvious.

If we want to support this pattern, taking task_handle by reference seems a bad idea. Of course it can be "copied" internally, or the actual task pointer can be extracted and stored somewhere, but the semantics of taking a handle by lvalue reference just does not seem right.

We can, and probably should, think of it in terms of ownership. The current design supports only a single owner, and this is reflected in the move semantics for submission: the ownership is transferred to the task scheduler. We want either shared ownership, which can be represented by copies of the task handle, or some form of weak reference/observation (see https://en.cppreference.com/w/cpp/memory/weak_ptr). The API design should conform to the chosen approach.

It seems that the required ownership semantics are somewhere in the middle: we need "unique" ownership to submit the task and "weak" ownership to track the task's progress and set dependencies. One option may be to define two handles for the task:

The current task_handle, which owns the task and can be used to perform any action on it: submit it for execution or add dependencies.

A new handle (let's say task_view) that does not own the task but can be used to track progress or add dependencies.

task_handle would be almost unchanged. task_view would be copyable, and any copy would be able to act on the underlying task, which is owned either by the task_handle or by the TBB scheduler.

task_view would be constructible from task_handle but not the other way around.

Submitting functions (task_group::run and others) would also remain unchanged and accept only a task_handle argument.

Functions for adding dependencies would provide overloads taking both task_handle and task_view where a task in any state is accepted, and only task_handle where only a task in the created state is allowed (like the successor side of make_edge).
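The two-handle split described above can be sketched in standard C++ without any TBB dependency. Everything below is a hypothetical stand-in for illustration: `task_record`, `task_view`, this toy `task_handle`, and `submit` are assumptions, not oneTBB classes or functions, and the `shared_ptr` only keeps the bookkeeping record alive (a real implementation would more likely reference-count the task itself).

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-ins for illustration; these are NOT the oneTBB classes.
enum class task_state { created, submitted };

struct task_record {
    task_state state = task_state::created;
};

// Copyable observer: can query the task but can never submit it.
class task_view {
    std::shared_ptr<task_record> rec_;
public:
    explicit task_view(std::shared_ptr<task_record> rec) : rec_(std::move(rec)) {}
    task_state state() const { return rec_->state; }
};

// Move-only owner: the only type accepted by submit().
class task_handle {
    std::shared_ptr<task_record> rec_ = std::make_shared<task_record>();
    friend void submit(task_handle&& h);
public:
    task_handle() = default;
    task_handle(task_handle&&) = default;
    task_handle& operator=(task_handle&&) = default;
    task_handle(const task_handle&) = delete;
    task_handle& operator=(const task_handle&) = delete;

    explicit operator bool() const { return rec_ != nullptr; }
    task_view view() const { return task_view(rec_); }
};

// Takes the handle by rvalue reference, so ownership visibly leaves the caller.
void submit(task_handle&& h) {
    task_handle owned = std::move(h);   // the toy "scheduler" now owns the task
    owned.rec_->state = task_state::submitted;
}

bool demo_ownership() {
    task_handle h;
    assert(h && "a fresh handle owns its task");
    task_view v1 = h.view();
    task_view v2 = v1;                  // views can be copied freely
    submit(std::move(h));               // h is empty afterwards
    return !h && v1.state() == task_state::submitted
              && v2.state() == task_state::submitted;
}
```

In this sketch there is no conversion from `task_view` back to `task_handle`, which mirrors the point that the observing type should not be "promotable" to an owner.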

I have come to essentially the same conclusion. Even though a new class is an API complication, it seems that trying to extend task_handle semantics and adjust all the related APIs would complicate everything even more.

I would not use task_view for the name, as it is not really a view of a task. Rather, it's a weak_task_handle, though that is also not ideal because, unlike weak_ptr, this type should not allow "promotion" to a "strong" task handle.

task_tracker with the semantics stated above was introduced.

I also don't like having another class, but I agree that it simplifies thinking about the problem, so while complicating the API, it makes it easier to reason about.

akukanov · 2025-03-10T15:16:57Z

rfcs/proposed/task_group_dynamic_dependencies/task_handle_semantics.md

+Extra care must be taken while working with this method anyway since even if it returns ``false``, it may be unsafe to submit the ``task_handle`` since the state 
+can be changed by the other thread.


This seems related to the ownership question. Maybe some form of "exclusive" ownership is needed to submit the task.

Addressed by introducing the task_tracker
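The concern about querying the state and then submitting can be illustrated with a small standard C++ sketch. The names `task_record`, `was_submitted`, and `try_claim_for_submission` are hypothetical and exist only for this illustration; the RFC does not define them. The idea shown is that a separate "check" call can be stale by the time the caller acts on it, whereas a single atomic compare-and-exchange grants exclusive submission rights to exactly one caller.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical bookkeeping for illustration; not part of oneTBB.
enum class task_state : int { created, submitted };

struct task_record {
    std::atomic<task_state> state{task_state::created};

    // Check-then-act pattern: the answer may already be stale when the
    // caller looks at it, because another thread can submit in between.
    bool was_submitted() const {
        return state.load() == task_state::submitted;
    }

    // Atomic claim: only the caller that flips created -> submitted wins.
    bool try_claim_for_submission() {
        task_state expected = task_state::created;
        return state.compare_exchange_strong(expected, task_state::submitted);
    }
};

// Counts how many of `attempts` claims succeed on one task record.
int count_successful_claims(int attempts) {
    assert(attempts >= 0);
    task_record rec;
    int wins = 0;
    for (int i = 0; i < attempts; ++i) {
        if (rec.try_claim_for_submission()) ++wins;
    }
    return wins;
}
```

The demo function runs on one thread to stay deterministic, but the same `compare_exchange_strong` call is what would make the claim safe if the attempts came from different threads.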

vossmjp · 2025-03-15T00:14:29Z

rfcs/proposed/task_group_dynamic_dependencies/task_handle_semantics.md

+<img src="transferring_between_two_separate_tracking_new_successors.png" width=800>
+
+Such an approach can be beneficial if ``current`` task is kind of generator task that collects the set of successors on each iteration of the loop
+and then transfers it to the newly created task.


Can you provide a sketch of such a generator example that shows where the loop would be and how additional successors would be added to the current task from an outside task or thread? I think the more common use case would be recursive decomposition, such as the merge sort example included in the umbrella RFC, where a task body refines its work into a subgraph. So an initial node N becomes N -> N1. N1, when executed, becomes N1 -> N2, etc. When viewed from outside the task, the original N represents a complete piece of work, so outside tasks or threads would want to make successors of the complete work represented by N's new subgraph. So an outside task might want to add N->M or, if N is in some state of refinement, N->N1->N2->M. In those cases, I can't think of a reason why an outside task or thread would want to insert an additional successor of N while ignoring its refinement into a subgraph, such as N->{M,N1->N2}. Maybe your generator case is such a counterexample, but I'm not sure I can immediately imagine practical generator scenarios. Do you have some in mind?

I have thought about it once more, and it seems that the generator example I had in mind is not realistic. Initially it was something like:

    tbb::task_group tg;
    tbb::task_tracker tracker;
    tbb::task_handle handle = tg.defer([] {
        while (exit_condition) {
            tbb::task_handle new_task = tg.defer(...);
            tbb::task_group::current_task::transfer_successors_to(new_task);
            tg.run(new_task);
        }
    });
    tracker = handle;
    tg.run(std::move(handle));
    // Adding successors to tracker for each iteration of the loop

But the loop inside the task would have to wait somehow for new successors to be added, which seems inefficient.
I now think it looks more like finding a problem for the solution :)
I have removed this paragraph from the RFC for now.

The add_successor API was removed from the proposal. I have added a separate section describing alternatives.

vossmjp · 2025-03-27T16:24:00Z

rfcs/proposed/task_group_dynamic_dependencies/task_handle_semantics.md

+### Adding successors to the current task
+
+Consider use-case of parallel wavefront pattern on the 2-d grid. Each cell is computed as part of a separate task in ``task_group``. Each cell task computes
+itself and creates more tasks to process the cell below and the cell on the right.


Usually the executing task does not create the other tasks in a wavefront, since there are typically multiple predecessors. For example, the task to the right of a task A, let's call it C, may also be the task below B. More typically, wavefront tasks are pre-allocated and dependencies set from outside of the tasks themselves. Otherwise, who is responsible for creating C? A or B? What if A creates C, adds itself as a predecessor, and completes before B executes? Would there be an edge from B to C?

Yeah, definitely there should be some mechanism to synchronize on the task creation. So, at least there should be some pre-allocated and pre-set counter somewhere that would gate a particular task instantiation and/or execution.
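The counter-based gating mentioned above can be sketched as follows. This is a sequential model in standard C++, not oneTBB code: the function name `run_wavefront` and the queue standing in for the scheduler are assumptions made for the illustration. Each cell starts with a counter equal to its number of predecessors (the cell above and the cell to the left); whichever predecessor brings the counter to zero is the one that releases the cell, which answers the "A or B?" question without either task having to create C.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

using cell = std::pair<std::size_t, std::size_t>;  // (row, column)

// Returns the order in which cells of an n x n wavefront are released.
// In a parallel version the counters would have to be atomic.
std::vector<cell> run_wavefront(std::size_t n) {
    // Pre-set counters: one per predecessor (cell above, cell on the left).
    std::vector<std::vector<int>> pending(n, std::vector<int>(n, 0));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            pending[i][j] = (i > 0 ? 1 : 0) + (j > 0 ? 1 : 0);

    std::queue<cell> ready;   // stands in for the task scheduler
    std::vector<cell> order;
    if (n > 0) ready.push(cell(0, 0));

    while (!ready.empty()) {
        const cell c = ready.front();
        ready.pop();
        order.push_back(c);   // "compute" the cell
        const std::size_t i = c.first, j = c.second;
        // The last finishing predecessor releases the dependent cell.
        if (i + 1 < n && --pending[i + 1][j] == 0) ready.push(cell(i + 1, j));
        if (j + 1 < n && --pending[i][j + 1] == 0) ready.push(cell(i, j + 1));
    }
    assert(order.size() == n * n);  // every cell is released exactly once
    return order;
}
```

For example, `run_wavefront(3)` releases all nine cells, starting with the top-left cell and ending with the bottom-right one.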

We have found an example of a divide-and-conquer wavefront where edges are made from within an executing task and also from external tasks, so my previous comment is moot.

The add_successor API was removed from the proposal. I have added a separate section describing alternatives.
I have also added the "Advanced examples" section describing all the wavefront patterns we found, with possible implementations.

vossmjp · 2025-03-27T16:26:25Z

rfcs/proposed/task_group_dynamic_dependencies/task_handle_semantics.md

+
+<img src="transferring_between_two_separate_tracking_new_successors.png" width=800>
+
+Alternative approach is to keep tracking ``current`` and ``target`` together after transferring. This requires introducing the new state of task - a `proxy` state.


I can think of examples where combining tracking is useful. I think we need a counterexample where separating is useful. Otherwise, the choice is obvious.

It seems I missed the use case where adding successors to the task being executed is useful. If there is no such use case, I believe it would be much easier to disallow adding successors to tasks that are being executed. Then we would have only tbb::task_group::current_task::transfer_successors_to(task_handle), where the task_handle can be only in the created or submitted state.

The latest version proposes the combined tracking. Added the "Eager + classic" wavefront in the advanced examples section as a motivation.
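One way to picture combined tracking is a forwarding pointer: after the transfer, the original task acts as a proxy, and any successor added through it lands on the task that received the successors. The sketch below is a single-threaded toy model in standard C++; `node`, `resolve`, `add_successor`, and `transfer_successors` are hypothetical names for this illustration and do not reflect how oneTBB implements or names anything.

```cpp
#include <cassert>
#include <vector>

// Toy task node for illustration only.
struct node {
    std::vector<node*> successors;
    node* forward_to = nullptr;   // non-null once the node is a "proxy"
    int pending_predecessors = 0;
};

// Follows the proxy chain to the node that currently holds the successors.
node* resolve(node* n) {
    while (n->forward_to != nullptr) n = n->forward_to;
    return n;
}

// Adds an edge pred -> succ; edges added to a proxy are forwarded.
void add_successor(node* pred, node* succ) {
    resolve(pred)->successors.push_back(succ);
    ++succ->pending_predecessors;
}

// Moves all successors of `current` to `target` and keeps tracking combined.
void transfer_successors(node* current, node* target) {
    assert(current != target);
    for (node* s : current->successors) target->successors.push_back(s);
    current->successors.clear();
    current->forward_to = target;
}

bool demo_combined_tracking() {
    node n, n1, m, k;
    add_successor(&n, &m);          // N -> M
    transfer_successors(&n, &n1);   // N refines itself: N1 -> M
    add_successor(&n, &k);          // late edge through N is forwarded to N1
    return n.successors.empty()
        && n1.successors.size() == 2
        && n1.successors[0] == &m && n1.successors[1] == &k
        && m.pending_predecessors == 1 && k.pending_predecessors == 1;
}
```

With separate tracking, the late edge in the demo would instead attach to `n` itself, so K would no longer wait for the refined subgraph rooted at N1.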

…ntics.md Co-authored-by: Mike Voss <michaelj.voss@intel.com>

aleksei-fedotov

I think we are overcomplicating things without proof that we actually need to support all the described behaviors.

aleksei-fedotov · 2025-04-17T11:08:27Z

rfcs/proposed/task_group_dynamic_dependencies/task_handle_semantics.md

+and the task handled by ``successor_task_handle`` -
+``task_group::make_edge(predecessor_task_handle, successor_task_handle)``. 
+
+As it was stated in the parent RFC document, we would like to allow adding predecessors in any state described above and to limit the successor to be a task in created state since it can be too late to add predecessors to


Did we forget to allow adding predecessors to a task in the submitted state as well? If the user knows that the successor's execution cannot start even though it is in the submitted state (e.g. because its predecessors are still in the created state), I think it may be reasonable to allow adding more predecessors to such a submitted task.

Suggested change

As it was stated in the parent RFC document, we would like to allow adding predecessors in any state described above and to limit the successor to be a task in created state since it can be too late to add predecessors to

As it was stated in the parent RFC document, we would like to allow adding predecessors in any state described above and to limit the successor to be a task in created or submitted state since it can be too late to add predecessors to

But if the user doesn't know that there are other predecessors in the created state, then adding a predecessor to a submitted task results in a race, where it may or may not become a predecessor quickly enough. Is such a race desirable? I understand the case you've described, but I'm not sure we should allow it. Whichever way we decide to go, this should be an open question that we want answered if it is released as a preview.

Yeah. That's the point. If we want to build an API that rules out all these unsafe peculiarities, then I would just start with the existing task_handle, not even allowing successors to be specified for already submitted tasks, only for created ones. Let's then see whether that is enough for users.

Added this as an open question.
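Both sides of this open question can be seen in a small reference-counting model. The sketch below is standard C++ with hypothetical names (`successor_record`, `try_add_predecessor`, `release`); it assumes, for illustration only, that a task holds one extra "guard" reference while it is in the created state and starts when the count reaches zero. A late edge succeeds only while some other reference still blocks the task; otherwise the attempt loses the race.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical successor-side bookkeeping; not the oneTBB implementation.
struct successor_record {
    // Unfinished predecessors plus one guard held until submission.
    std::atomic<int> pending{1};

    // Increment-if-not-zero: fails once the task has been allowed to start.
    bool try_add_predecessor() {
        int cur = pending.load();
        while (cur != 0) {
            if (pending.compare_exchange_weak(cur, cur + 1)) return true;
        }
        return false;
    }

    // Drops one reference; returns true if the caller must start the task.
    bool release() {
        const int prev = pending.fetch_sub(1);
        assert(prev > 0);
        return prev == 1;
    }
};

// A pending predecessor blocks the submitted task, so a late edge is safe.
bool late_edge_while_blocked() {
    successor_record s;
    const bool a_added = s.try_add_predecessor();  // edge A -> S, S created
    const bool started = s.release();              // S submitted, guard dropped
    const bool b_added = s.try_add_predecessor();  // edge B -> S, S submitted
    const bool after_a = s.release();              // A completes
    const bool after_b = s.release();              // B completes, S may start
    return a_added && !started && b_added && !after_a && after_b;
}

// Nothing blocks the submitted task, so the late edge arrives too late.
bool late_edge_too_late() {
    successor_record s;
    const bool started = s.release();              // S submitted and started
    const bool added = s.try_add_predecessor();    // edge cannot be added
    return started && !added;
}
```

Whether the second outcome should be reported to the user, silently ignored, or made impossible by restricting successors to the created state is exactly the design choice left open in the thread.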

aleksei-fedotov · 2025-04-17T11:25:06Z

rfcs/proposed/task_group_dynamic_dependencies/task_handle_semantics.md

+### Adding successors to the current task
+
+Consider use-case of parallel wavefront pattern on the 2-d grid. Each cell is computed as part of a separate task in ``task_group``. Each cell task computes
+itself and creates more tasks to process the cell below and the cell on the right.


Yeah, definitely there should be some mechanism to synchronize on the task creation. So, at least there should be some pre-allocated and pre-set counter somewhere that would gate a particular task instantiation and/or execution.

aleksei-fedotov · 2025-04-17T12:01:14Z

rfcs/proposed/task_group_dynamic_dependencies/task_handle_semantics.md

+If there is a strong dependency between the computation of the current cell and the computations of the following cells, it is required to add
+currently executed task as a predecessor of the tasks representing the cells below and on the right. Since there is no ``task_handle`` or ``task_tracker`` representing the
+currently executed task, the ``make_edge`` function described above cannot be used to set these dependencies. 


Since we are speaking here about tasks created right inside the task being executed, there is no straightforward way for other tasks to learn their task_handles automatically. Therefore, the dependency described here could be made implicitly by submitting these tasks as the last step of the task being executed. Not to mention that the body of one of the created tasks can simply be run directly, without any task creation and therefore without any task submission at all.

This API was removed from the latest version of the proposal because the dependent tasks can be spawned after the computations are done.

aleksei-fedotov · 2025-04-17T12:17:31Z

rfcs/proposed/task_group_dynamic_dependencies/task_handle_semantics.md

+
+<img src="transferring_between_two_separate_tracking_new_successors.png" width=800>
+
+Alternative approach is to keep tracking ``current`` and ``target`` together after transferring. This requires introducing the new state of task - a `proxy` state.


It seems I missed the use case where adding successors to the task being executed is useful. If there is no such use case, I believe it would be much easier to disallow adding successors to tasks that are being executed. Then we would have only tbb::task_group::current_task::transfer_successors_to(task_handle), where the task_handle can be only in the created or submitted state.

aleksei-fedotov · 2025-04-17T12:28:39Z

rfcs/proposed/task_group_dynamic_dependencies/task_handle_semantics.md

+    task_tracker(const task_tracker& other);
+    task_tracker(task_tracker&& other);


For which use cases do we need to allow copying and/or moving instances of this class? Isn't it enough to only be able to create a task_tracker from a task_handle?
I have a similar question about the assignment operators.

I think it can be useful for storing task_tracker objects in containers. I have added a section with advanced examples and possible implementations. The recursive eager wavefront example also shows the necessity of copy semantics (for storing the trackers of the parent subtasks in the child task).

aleksei-fedotov · 2025-04-17T13:36:14Z

rfcs/proposed/task_group_dynamic_dependencies/task_handle_semantics.md

+The alternative approaches are to keep only the ``task_handle`` as the only was to track the task, set the
+dependencies and submit the task for execution.
+
+### ``task_handle`` as a unique owner


It seems that task_handle as oneTBB has it now should be enough to cover all the mentioned use cases, given a few extensions that allow adding dependencies between valid task_handles. I am not sure we need to answer right now the question about setting an already moved-from task_handle as a predecessor. We can just prohibit this behavior in the first iteration of this extension for task dynamic dependencies until we find the use cases and users who will benefit from it.
And even if we prove that such support is necessary, we may just relax the wording about a moved-from task_handle by saying that the move operation only means the task cannot be submitted multiple times, while still allowing it to be set as a predecessor for other tasks. I believe this will not be confusing to users because "the edges" between tasks are a weak relationship by nature: whether they move with the body of the task or stay with the task_handle after the move has no influence on the execution of the tasks.

The recursive eager wavefront example added to the "Advanced examples" section shows the necessity of creating predecessor-successor dependencies between a task in any state and a newly created task.
It also shows the necessity of copy and move semantics for task_tracker. If this were implemented using task_handle, we would need to make it a shared owner of the task and define the behavior when one of the copies is scheduled for execution.

isaevil · 2025-06-04T13:50:25Z