57 changes: 41 additions & 16 deletions affinity/cpp-20/d0796r2.md
@@ -14,12 +14,16 @@

# Changelog

### P0796r2 (RAP)

* Introduced a free function for retrieving the execution resource underlying the current thread of execution.

### P0796r1 (JAX)

* Introduce proposed wording.
* Based on feedback from SG1, introduced a pair-wise interface for querying the relative affinity between execution resources.
* Introduce an interface for retrieving an allocator or polymorphic memory resource.
* Based on feedback from SG1, remove requirement for a hierarchical system topology structure, which doesn't require a root resouce.
* Based on feedback from SG1, remove requirement for a hierarchical system topology structure, which doesn't require a root resource.

### P0796r0 (ABQ)

@@ -180,12 +184,12 @@ An `execution_resource` is a light weight structure which acts as an identifier

### System topology

The system topology is made up of a number of system level `execution_resource`s, which can be queried through `this_system::resource` which returns a `std::vector`. The `execution_resources` available within the system can be initialised dynamically by a runtime library, however must be done so before `main` is called, given that after that point the system topology cannot change.
The system topology is made up of a number of system level `execution_resource`s, which can be queried through `this_system::get_resources` which returns a `std::vector`. The `execution_resources` available within the system can be initialised dynamically by a runtime library, however must be done so before `main` is called, given that after that point the system topology cannot change.

Below *(Listing 2)* is an example of iterating over the system level resources and printing out their capabilities.

```cpp
for (auto res : execution::this_system::resources()) {
for (auto res : execution::this_system::get_resources()) {
std::cout << res.name() << '\n';
std::cout << res.can_place_memory() << '\n';
std::cout << res.can_place_agents() << '\n';
@@ -194,14 +198,18 @@ for (auto res : execution::this_system::resources()) {
```
*Listing 2: Example of querying all the system level execution resources*

### Current resource

The `execution_resource` which underlies the current thread of execution can be queried through `this_thread::get_resource`.

### Querying relative affinity

The `affinity_query` class template provides an abstraction for a relative affinity value between two `execution_resource`s, derived from a particular `affinity_operation` and `affinity_metric`. The `affinity_query` is templated by `affinity_operation` and `affinity_metric` and is constructed from two `execution_resource`s. An `affinity_query` does not mean much on it's own, instead a relative magnitude of affinity can be queried by using comparison operators. If nessesary the value of an `affinity_query` can also be queried through `native_affinity`, though the return value of this is implementation defined.
The `affinity_query` class template provides an abstraction for a relative affinity value between two `execution_resource`s, derived from a particular `affinity_operation` and `affinity_metric`. The `affinity_query` is templated by `affinity_operation` and `affinity_metric` and is constructed from two `execution_resource`s. An `affinity_query` does not mean much on its own; instead, a relative magnitude of affinity can be queried by using comparison operators. If necessary, the value of an `affinity_query` can also be queried through `native_affinity`, though the return value of this is implementation defined.

Below *(Listing 3)* is an example of how you can query the relative affinity between two `execution_resource`s.

```cpp
auto systemLevelResources = execution::this_system::resources();
auto systemLevelResources = execution::this_system::get_resources();
auto memberResources = systemLevelResources.resources();

auto relativeLatency01 = execution::affinity_query<execution::affinity_operation::read,
@@ -223,15 +231,15 @@ The `execution_context` class provides an abstraction for managing a number of l
Below *(Listing 4)* is an example of how this extended interface could be used to construct an *execution context* from an *execution resource* which is retrieved from the *system’s resource topology*. Once an *execution context* is constructed it can then still be queried for its *execution resource* and then that *execution resource* can be further partitioned.

```cpp
auto &resources = execution::this_system::resources();
auto &resources = execution::this_system::get_resources();

execution::execution_context execContext(resources[0]);

auto &systelLevelResource = execContext.resource();
auto &systemLevelResource = execContext.resource();

// resources[0] should be equal to systemLevelResource

for (auto res : systelLevelResource.resources()) {
for (auto res : systemLevelResource.resources()) {
std::cout << res.name() << '\n';
}
```
@@ -242,11 +250,11 @@ for (auto res : systelLevelResource.resources()) {
When creating an `execution_context` from a given `execution_resource`, the executors and allocators associated with it are bound to that `execution_resource`. For example: when creating an `execution_context` from a CPU socket resource, all executors associated with the given socket will spawn execution agents with affinity to the socket partition of the system *(Listing 5)*.

```cpp
auto cList = std::execution::this_system::resources();
auto cList = std::execution::this_system::get_resources();
// findASocketResource is a user-defined function that finds a
// resource that is a CPU socket in the given resource list
auto& socket = findASocketResource(cList);
execution_contexteC{socket} // Associated with the socket
execution_context eC{socket}; // Associated with the socket
auto executor = eC.executor(); // By transitivity, associated with the socket too
auto socketAllocator = eC.allocator(); // Retrieve an allocator to the closest memory node
std::vector<int, decltype(socketAllocator)> v1(100, socketAllocator);
@@ -343,13 +351,20 @@ If a particular policy or algorithm requires to access placement information, th

};

} // execution

/* This system */

namespace this_system {
std::vector<execution_resource> get_resources() noexcept;
}

} // execution
/* This thread */

namespace this_thread {
std::experimental::execution::execution_resource get_resource() noexcept;
}

} // experimental
} // std

@@ -496,7 +511,7 @@ The `affinity_query` class template provides an abstraction for a relative affin
friend expected<size_t, error_type> operator>=(const affinity_query&, const affinity_query&);

*Returns:* An `expected<size_t, error_type>` where,
* if the affinity query was succesful, the value of type `size_t` represents the magnitude of the relative affinity;
* if the affinity query was successful, the value of type `size_t` represents the magnitude of the relative affinity;
* if the affinity query was not successful, the error is a value of type `error_type` which represents the reason the affinity query failed.

> [*Note:* An affinity query is permitted to fail if affinity between the two execution resources cannot be calculated for any reason, for example because the resources are from different vendors or communication between the resources is not possible. *--end note*]
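
As a rough illustration of how a caller might handle a comparison that can fail, the sketch below mocks the idea with standard C++17 only. Everything in the `sketch` namespace is hypothetical and not part of the proposed wording: `expected_size` stands in for `expected<size_t, error_type>`, `error_type` is assumed to be a string, the `vendor` and `native_cost` members are invented to drive the example, and the rule used to compute the magnitude is an assumption made purely for demonstration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <variant>

namespace sketch {

// Hypothetical error type; assumed to be a string for this sketch.
using error_type = std::string;

// Minimal stand-in for expected<size_t, error_type>, built on std::variant.
class expected_size {
public:
  static expected_size success(std::size_t v) {
    return expected_size(state(std::in_place_index<0>, v));
  }
  static expected_size failure(error_type e) {
    return expected_size(state(std::in_place_index<1>, std::move(e)));
  }
  bool has_value() const noexcept { return state_.index() == 0; }
  std::size_t value() const { return std::get<0>(state_); }
  const error_type& error() const { return std::get<1>(state_); }

private:
  using state = std::variant<std::size_t, error_type>;
  explicit expected_size(state s) : state_(std::move(s)) {}
  state state_;
};

// Hypothetical mock of an affinity query between two resources.
struct affinity_query {
  std::string vendor;       // invented: decides whether two queries are comparable
  std::size_t native_cost;  // invented: native affinity value, lower means closer

  // Assumption: the comparison fails for different vendors; otherwise the
  // magnitude is how far lhs is below rhs (0 if lhs is not below rhs).
  friend expected_size operator<(const affinity_query& lhs,
                                 const affinity_query& rhs) {
    if (lhs.vendor != rhs.vendor) {
      return expected_size::failure("resources are of different vendors");
    }
    const std::size_t magnitude =
        lhs.native_cost < rhs.native_cost ? rhs.native_cost - lhs.native_cost : 0;
    return expected_size::success(magnitude);
  }
};

}  // namespace sketch
```

With this mock, a caller would first check `has_value()` on the result of `a < b` and only then read `value()`; on failure, `error()` carries the reason. The real semantics of the magnitude are left to the proposal and its implementations.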
@@ -505,16 +520,26 @@ The `affinity_query` class template provides an abstraction for a relative affin

## Free functions

The free function `this_system::resources` is provided for retrieving the `execution_resource`s which encapsulate the hardware platforms available within the system, these are refered to as the *system level resources*.
### `this_system::get_resources`

The free function `this_system::get_resources` is provided for retrieving the `execution_resource`s which encapsulate the hardware platforms available within the system, these are referred to as the *system level resources*.

std::vector<execution_resource> get_resources() noexcept;

*Returns:* A std::vector containing all *system level resources*.

*Requires:* If `this_system::resources().size() > 0`, `this_system::resources()[0]` be the `execution_resource` use by `std::thread`. The value returned by `this_system::resources()` be the same at any point after the invocation of `main`.
*Requires:* If `this_system::get_resources().size() > 0`, `this_system::get_resources()[0]` shall be the `execution_resource` used by `std::thread`. The value returned by `this_system::get_resources()` shall be the same at any point after the invocation of `main`.

> [*Note:* Returning a `std::vector` allows users to potentially manipulate the container of `execution_resource`s after it is returned. We may want to replace this at a later date with a more restrictive alternative type, such as a range. *--end note*]

### `this_thread::get_resource`

The free function `this_thread::get_resource` is provided for retrieving the `execution_resource` underlying the current thread of execution.

std::experimental::execution::execution_resource get_resource() noexcept;

*Returns:* The `execution_resource` underlying the current thread of execution.
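
To make the relationship between the two free functions concrete, here is a minimal mock, assuming a simplified `execution_resource` that only carries a name. The `sketch` namespace, the resource names `"cpu0"` and `"gpu0"`, and the equality operator are hypothetical and exist only for this example. The mock also assumes that the current thread runs on the first system level resource, mirroring the requirement that `this_system::get_resources()[0]` is the resource used by `std::thread`; a real implementation would query the platform instead.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical, simplified stand-in for the proposed execution_resource.
class execution_resource {
public:
  explicit execution_resource(std::string name) : name_(std::move(name)) {}
  const std::string& name() const noexcept { return name_; }

  // Invented for the example: two resources are equal if their names match.
  friend bool operator==(const execution_resource& a,
                         const execution_resource& b) noexcept {
    return a.name_ == b.name_;
  }

private:
  std::string name_;
};

namespace this_system {
// Mock topology with two made-up system level resources.
inline std::vector<execution_resource> get_resources() noexcept {
  return {execution_resource{"cpu0"}, execution_resource{"gpu0"}};
}
}  // namespace this_system

namespace this_thread {
// Assumption: every thread in this mock runs on the first system level resource.
inline execution_resource get_resource() noexcept {
  return this_system::get_resources()[0];
}
}  // namespace this_thread

}  // namespace sketch
```

In this mock, `sketch::this_thread::get_resource()` compares equal to `sketch::this_system::get_resources()[0]`, which is the property a caller could rely on to find the current thread's position in the system topology.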

# Future Work

## Migrating data from memory allocated in one partition to another
@@ -535,7 +560,7 @@ With the ability to place memory with affinity comes the ability to define algor

## Level of abstraction

The current proposal provides an interface for querying whether an `execution_resource` can allocate and/or execute work, it can provide the concurrency it supports and it can provide a name. We also provide the `affinity_query` structure for querying the relative affinity metrics between two `execution_resource`s. However this may not be enough information for users to take full advance of the system, they may also want to know what kind of memory is available or the properties by which work is executed. It was decided that attempting to enumerate the various hardware components would not be ideal as that would make it harder for implementers to support new hardware. It has been discussed that a better approach would be to parameterise the additional properties of hardware such that hardware queries could be much more generic.
The current proposal provides an interface for querying whether an `execution_resource` can allocate and/or execute work, the concurrency it supports, and its name. We also provide the `affinity_query` structure for querying the relative affinity metrics between two `execution_resource`s. However, this may not be enough information for users to take full advantage of the system; they may also want to know what kind of memory is available or the properties by which work is executed. It was decided that attempting to enumerate the various hardware components would not be ideal, as that would make it harder for implementors to support new hardware. It has been discussed that a better approach would be to parameterize the additional properties of hardware such that hardware queries could be much more generic.

We may wish to mirror the design of the executors proposal and have a generic query interface using properties for querying information about an `execution_resource`. It’s expected that an implementation may provide additional nonstandard queries that are specific to that implementation.

@@ -545,7 +570,7 @@ We may wish to mirror the design of the executors proposal and have a generic qu

## Dynamic topology discovery

The current proposal requires that all `execution_resource`s are initialised before `main` is called, therefore not allowing an `execution_resource` to become available or go offline at runtime. We may wish to support this in the future, however this is outside of the scope of this paper.
The current proposal requires that all `execution_resource`s are initialized before `main` is called, therefore not allowing an `execution_resource` to become available or go offline at runtime. We may wish to support this in the future, however this is outside of the scope of this paper.

| Straw Poll |
|------------|