Bound allocations for constrained task_arena #1721

Open
wants to merge 2 commits into
base: master
3 changes: 1 addition & 2 deletions rfcs/proposed/numa_support/README.md
@@ -136,8 +136,7 @@ See [sub-RFC for creation of NUMA-constrained arenas](create-numa-arenas.md)

### NUMA-aware allocation

Define allocators or other features that simplify the process of allocating or placing data onto
specific NUMA nodes.
See [sub-RFC for constraining thread allocations](allocations-bound-to-constrained-arena.org)

### Simplified approaches to associate task distribution with data placement

101 changes: 101 additions & 0 deletions rfcs/proposed/numa_support/allocations-bound-to-constrained-arena.org
@@ -0,0 +1,101 @@
#+TITLE: Bind Allocations for Constrained Threads

This is a sub-RFC of [[file:README.md][the general RFC about better NUMA support in oneTBB]].

* Introduction
oneTBB allows binding threads that join a ~tbb::task_arena~ for task execution to a particular CPU
mask. In most cases the mask corresponds to a single NUMA node on the platform, but it can also be
associated with specific core types that do not necessarily correspond to a single NUMA node. The
binding settings are specified using the [[https://github.com/uxlfoundation/oneTBB/blob/2df02d2ac710ff22a917d008dc04d7a21084e32e/include/oneapi/tbb/info.h#L36-L65][~tbb::task_arena::constraints~]] structure. These settings
affect pinning of software threads onto hardware cores and give no explicit guidance about where
memory is physically allocated by these pinned threads, effectively relying on the OS settings or
preferences set up earlier by a user.

The motivation is to introduce a handle that would allow users to explicitly specify that memory
allocations done by the threads should also follow the constraints of the ~tbb::task_arena~ they
join.

* Proposal
Introduce an interface that will indicate that memory allocations by threads should preferably be
bound to the constraint settings of the ~tbb::task_arena~ instance.

Since the functionality represents an additional constraint, it is reasonable to extend the existing
constraints struct with the new interface for this feature.

Therefore, the interface is an extension to the ~tbb::task_arena::constraints~ struct:
#+begin_src C++
namespace tbb {
namespace detail {
namespace d1 {

using numa_node_id = int;
using core_type_id = int;

struct constraints {
#if !__TBB_CPP20_PRESENT
    constraints(numa_node_id id = -1, int maximal_concurrency = -1,
                bool bind_memory_allocations = false) // <-- new parameter
        : numa_id(id)
        , max_concurrency(maximal_concurrency)
        , core_type(-1)
        , max_threads_per_core(-1)
        , bind_memory_allocations(bind_memory_allocations) // <-- new member
    {}
#endif /*!__TBB_CPP20_PRESENT*/

    constraints& set_numa_id(numa_node_id id) {
        numa_id = id;
        return *this;
    }
    /* ... similar setters for other parameters ... */

    // New method to set memory allocation binding
    constraints& set_bind_memory_allocations(bool bind = true) {
        bind_memory_allocations = bind;
        return *this;
    }

    numa_node_id numa_id = -1;
    /* ... other fields ... */
    bool bind_memory_allocations = false; // <-- new member
};

} // namespace d1
} // namespace detail
} // namespace tbb
#+end_src
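
To illustrate how the proposed setter might read at a call site, here is a minimal sketch. It uses a
standalone mirror of the struct in a hypothetical ~sketch_usage~ namespace rather than the real oneTBB
header, because the new member is only proposed here. The helper ~make_numa_local_constraints~ is
likewise an assumption made for illustration.

```cpp
// Standalone mirror of the proposed struct, for illustration only.
// This is NOT the oneTBB header; the new member is only a proposal.
#include <cassert>

namespace sketch_usage {

using numa_node_id = int;

struct constraints {
    constraints& set_numa_id(numa_node_id id) {
        numa_id = id;
        return *this;
    }
    constraints& set_max_concurrency(int maximal_concurrency) {
        max_concurrency = maximal_concurrency;
        return *this;
    }
    // Proposed setter: calling it without arguments switches the preference on.
    constraints& set_bind_memory_allocations(bool bind = true) {
        bind_memory_allocations = bind;
        return *this;
    }

    numa_node_id numa_id = -1;
    int max_concurrency = -1;
    bool bind_memory_allocations = false; // proposed member, off by default
};

// Hypothetical helper: constraints for an arena pinned to one NUMA node that
// also asks for its threads' allocations to prefer that node.
inline constraints make_numa_local_constraints(numa_node_id node, int threads) {
    return constraints{}
        .set_numa_id(node)
        .set_max_concurrency(threads)
        .set_bind_memory_allocations();
}

} // namespace sketch_usage
```

If the proposal is accepted in this shape, an object built this way would presumably be passed to
the ~tbb::task_arena~ constructor in the same way as existing constraints.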

Implementation-wise the feature relies on the HWLOC library, in particular on its [[https://hwloc.readthedocs.io/en/stable/group__hwlocality__membinding.html#ga020951efa0ce3862bd4faec295501a7f][~hwloc_set_membind~]] and
[[https://hwloc.readthedocs.io/en/stable/group__hwlocality__membinding.html#gae21f0a1a884929c784bebf070252aa56][~hwloc_get_membind~]] functions.
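
One plausible reading of why both a getter and a setter are needed is a save/apply/restore pattern:
remember the thread's memory binding when it joins the arena, apply the arena's binding, and put the
old one back when the thread leaves. The sketch below shows only that pattern and is not the
implementation from this pull request. ~scoped_membind~, ~node_set~ and ~fake_binder~ are hypothetical
names; a real binder would presumably wrap ~hwloc_get_membind~ and ~hwloc_set_membind~ for the calling
thread, which is not shown here so that the example stays self-contained.

```cpp
#include <cassert>

namespace sketch_guard {

// Hypothetical stand-in for a memory-binding target (e.g. a NUMA node index).
using node_set = int;

// Saves the current memory binding, applies the requested one, and restores
// the saved binding on destruction. `Binder` is a hypothetical policy type
// that must provide `node_set get()` and `bool set(node_set)`.
template <typename Binder>
class scoped_membind {
public:
    scoped_membind(Binder& binder, node_set target)
        : my_binder(binder)
        , my_saved(binder.get())        // read the old binding first
        , my_active(binder.set(target)) // then try to apply the new one
    {}

    ~scoped_membind() {
        // Restore only if the new binding was actually applied.
        if (my_active) my_binder.set(my_saved);
    }

    scoped_membind(const scoped_membind&) = delete;
    scoped_membind& operator=(const scoped_membind&) = delete;

    // False when the binding could not be applied; allocations then follow
    // whatever policy was in effect before.
    bool active() const { return my_active; }

private:
    Binder& my_binder;
    node_set my_saved;
    bool my_active;
};

// Fake binder used only to exercise the guard without touching hardware.
struct fake_binder {
    node_set current = -1;  // -1 models "no explicit binding"
    bool supported = true;  // false models a platform without binding support

    node_set get() const { return current; }
    bool set(node_set s) {
        if (!supported) return false;
        current = s;
        return true;
    }
};

} // namespace sketch_guard
```

Because binding is best effort, the guard records whether it succeeded instead of failing hard.
Exposing that flag is one possible way to approach the open question about reporting that binding
is not possible.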

** Alternatives
Since there is no guarantee that the allocations will actually be bound, the name of the feature
could convey a preference rather than strict enforcement. Although this will be explained in the
documentation, from the code readability standpoint it is better to have interfaces that accurately
describe the actual behavior.

*** Naming
Naming alternatives for the parameter and struct field with variations in square brackets ~[]~:
- ~prefer_local_allocations[_first]~
- ~prefer_bound_allocations[_first]~
- ~prefer_local_memory~

Alternatives for the word ~prefer~:
- ~opt_for~
- ~favor~
- ~try~

*** Setter Method
Because the feature represents a toggle (i.e. can be encoded by a single boolean variable), it might
make sense to have a setter that only switches the feature ON. For example:
~[prefer_]bind_memory_allocations()~
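
A minimal sketch of this alternative, again on a standalone mirror struct in a hypothetical
~sketch_toggle~ namespace rather than the real oneTBB header. It uses the prefixed spelling because a
C++ class cannot declare a member function and a data member with the same name, so the unprefixed
method would require renaming the ~bind_memory_allocations~ field.

```cpp
#include <cassert>

namespace sketch_toggle {

// Standalone mirror for illustration only; not the oneTBB header.
struct constraints {
    // Alternative setter: takes no argument and can only switch the
    // preference on. Switching it off again would need a direct field write.
    constraints& prefer_bind_memory_allocations() {
        bind_memory_allocations = true;
        return *this;
    }

    bool bind_memory_allocations = false;
};

} // namespace sketch_toggle
```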

*** Default Value
Currently it is not proven that the feature makes any difference performance-wise. Depending on the
performance results it can be switched ON or OFF by default. It might also be left unimplemented
(i.e. archived) if study results show that the feature does not help improve performance.

* Open Questions
1. Naming.
2. Should the feature indicate (e.g. by means of error reporting) when memory binding is not
possible?