levelzero: expose subdevices as sub-osdevices
ze0 may contain ze0.0 and ze0.1 if the hardware contains 2 subdevices.

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
bgoglin committed Nov 30, 2021
1 parent f90ce3b commit 347039e
Showing 5 changed files with 75 additions and 6 deletions.
7 changes: 5 additions & 2 deletions NEWS
@@ -31,8 +31,11 @@ Version 2.7.0
Thanks to Jonathan Cameron for the help.
- HWLOC_DONT_MERGE_CLUSTER_GROUPS=1 may be set in the environment
to prevent these groups from being merged with identical caches, etc.
- + Add many new attributes in LevelZero devices to describe their types
- and numbers of slices, subslices, execution units and threads.
+ + Improve the oneAPI LevelZero backend:
+ - Expose subdevices such as "ze0.1" inside root OS devices ("ze0")
+ when the hardware contains multiple subdevices.
+ - Add many new attributes to describe device type, and the
+ numbers of slices, subslices, execution units and threads.
* Tools
+ Add --grey and --palette options to switch lstopo to greyscale or
white-background-only graphics, or to tune individual colors.
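
As an illustration of the naming scheme in the NEWS entry above, a consumer could split a name such as "ze0.1" into its root device index and its subdevice index. The sketch below is not part of this commit or of hwloc: `parse_ze_name` and `struct ze_name` are hypothetical helpers, and the only thing taken from the commit is the "ze%u" / "ze%u.%u" name format.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper type: not part of hwloc. */
struct ze_name {
  unsigned device; /* index N in "zeN" or "zeN.M" */
  int subdevice;   /* index M in "zeN.M", or -1 for a root device */
};

/* Parse "zeN" or "zeN.M"; return 0 on success, -1 if the name does not match. */
static int parse_ze_name(const char *name, struct ze_name *out)
{
  unsigned dev, sub;
  int end = 0;

  assert(name != NULL && out != NULL);

  /* %n records how many characters were consumed, so trailing junk is rejected. */
  if (sscanf(name, "ze%u.%u%n", &dev, &sub, &end) == 2 && name[end] == '\0') {
    out->device = dev;
    out->subdevice = (int) sub;
    return 0;
  }
  end = 0;
  if (sscanf(name, "ze%u%n", &dev, &end) == 1 && name[end] == '\0') {
    out->device = dev;
    out->subdevice = -1;
    return 0;
  }
  return -1;
}
```

A caller would typically pass the `name` field of an OS device object whose subtype is "LevelZero" and treat a negative `subdevice` as "this is a root device".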
15 changes: 13 additions & 2 deletions doc/hwloc.doxy
@@ -1482,6 +1482,9 @@ are directly attached near their NUMA node
Also, if hwloc could not discover PCI for some reason, PCI-related
OS devices may also be attached directly to normal objects.

Finally, OS <em>subdevices</em> may be exposed as OS devices that are children
of another OS device. This is the case for LevelZero subdevices, for instance.

hwloc first tries to discover OS devices from the operating system,
e.g. <em>eth0</em>, <em>sda</em> or <em>mlx4_0</em>.
However, this ability is currently only available on Linux for some
@@ -1533,7 +1536,8 @@ components when I/O discovery is enabled and supported.
<em>opencl1d3</em> for the fourth device of the second OpenCL platform
("OpenCL" subtype, from the OpenCL component)</li>
<li><em>ze0</em> for the first Level Zero device
- ("LevelZero" subtype, from the levelzero component, using the oneAPI Level Zero library)</li>
+ ("LevelZero" subtype, from the levelzero component, using the oneAPI Level Zero library),
+ and <em>ze0.1</em> for its second subdevice (if any).</li>
<li><em>cuda0</em> for the first NVIDIA CUDA device
("CUDA" subtype, from the CUDA component, using the NVIDIA CUDA Library)</li>
<li><em>ve0</em> for the first NEC Vector Engine device
@@ -1981,14 +1985,20 @@ and SectorSize (in bytes).
<dd>The index of the Level Zero driver within the list of drivers,
and the index of the device within the list of devices managed by this driver.
</dd>
<dt>LevelZeroSubdevices (LevelZero OS devices)</dt>
<dd>The number of subdevices below this OS device.
</dd>
<dt>LevelZeroSubdeviceID (LevelZero OS subdevices)</dt>
<dd>The index of this subdevice within its parent.
</dd>
<dt>LevelZeroDeviceType (LevelZero OS devices or subdevices)</dt>
<dd>A string describing the type of device, for instance "GPU", "CPU", "FPGA", etc.
</dd>
<dt>LevelZeroDeviceNumSlices, LevelZeroDeviceNumSubslicesPerSlice, LevelZeroDeviceNumEUsPerSubslice, LevelZeroDeviceNumThreadsPerEU (LevelZero OS devices or subdevices)</dt>
<dd>The number of slices in the devices, of subslices per slice,
of execution units (EU) per subslice, and of threads per EU.
</dd>
- <dt>LevelZeroCQGroups, LevelZeroCQGroup2 (LevelZero OS devices)</dt>
+ <dt>LevelZeroCQGroups, LevelZeroCQGroup2 (LevelZero OS devices/subdevices)</dt>
<dd>The number of completion queue groups, and the description of the third group
(as <tt>N*0xX</tt> where <tt>N</tt> is the number of queues in the group,
and <tt>0xX</tt> is the hexadecimal bitmask of <tt>ze_command_queue_group_property_flag_t</tt>
@@ -3041,6 +3051,7 @@ environment variable (see \ref envvar).
<dd>
This component creates co-processor OS device objects such as
<em>ze0</em> for describing oneAPI Level Zero devices.
It may also create sub-OS-devices such as <em>ze0.0</em> inside those devices.
<b>It may be built as a plugin</b>.
</dd>
<dt>cuda</dt>
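
The `N*0xX` format described in the LevelZeroCQGroups entry of the doc/hwloc.doxy changes above can be decoded by a consumer in a few lines of C. This is a minimal sketch, not hwloc code: `parse_cqgroup` is a hypothetical helper, and it assumes the attribute value contains nothing but the count, the `*` separator and the hexadecimal bitmask.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of hwloc): decode an "N*0xX" attribute value,
 * where N is the number of queues and 0xX the hexadecimal flag bitmask.
 * Returns 0 on success, -1 if the string does not match the format.
 * The outputs may be partially written when -1 is returned.
 */
static int parse_cqgroup(const char *value, unsigned *nqueues, unsigned *flags)
{
  int end = 0;

  assert(value != NULL && nqueues != NULL && flags != NULL);

  /* "%x" accepts an optional "0x" prefix; "%n" lets us reject trailing characters. */
  if (sscanf(value, "%u*%x%n", nqueues, flags, &end) != 2 || value[end] != '\0')
    return -1;
  return 0;
}
```

The string to decode would come from the info attributes of a LevelZero OS device or subdevice, for example the value stored under "LevelZeroCQGroup2".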
54 changes: 52 additions & 2 deletions hwloc/topology-levelzero.c
@@ -21,6 +21,7 @@ hwloc__levelzero_properties_get(ze_device_handle_t h, hwloc_obj_t osdev,
ze_result_t res;
ze_device_properties_t prop;
zes_device_properties_t prop2;
int is_subdevice = 0;

memset(&prop, 0, sizeof(prop));
res = zeDeviceGetProperties(h, &prop);
@@ -50,8 +51,15 @@ hwloc__levelzero_properties_get(ze_device_handle_t h, hwloc_obj_t osdev,
hwloc_obj_add_info(osdev, "LevelZeroDeviceNumEUsPerSubslice", tmp);
snprintf(tmp, sizeof(tmp), "%u", prop.numThreadsPerEU);
hwloc_obj_add_info(osdev, "LevelZeroDeviceNumThreadsPerEU", tmp);

if (prop.flags & ZE_DEVICE_PROPERTY_FLAG_SUBDEVICE)
is_subdevice = 1;
}

if (is_subdevice)
/* no need for the following info attrs in subdevices */
return;

/* try to get additional info from sysman if enabled */
memset(&prop2, 0, sizeof(prop2));
res = zesDeviceGetProperties(h, &prop2);
@@ -124,7 +132,7 @@ hwloc_levelzero_discover(struct hwloc_backend *backend, struct hwloc_disc_status
enum hwloc_type_filter_e filter;
ze_result_t res;
ze_driver_handle_t *drh;
- uint32_t nbdrivers, i, zeidx;
+ uint32_t nbdrivers, i, k, zeidx;
int sysman_maybe_missing = 0; /* 1 if ZES_ENABLE_SYSMAN=1 was NOT set early, 2 if ZES_ENABLE_SYSMAN=0 */
char *env;

@@ -191,7 +199,8 @@ hwloc_levelzero_discover(struct hwloc_backend *backend, struct hwloc_disc_status
for(j=0; j<nbdevices; j++) {
zes_pci_properties_t pci;
zes_device_handle_t sdvh = dvh[j];
- hwloc_obj_t osdev, parent;
+ uint32_t nr_subdevices;
+ hwloc_obj_t osdev, parent, *subosdevs = NULL;

osdev = hwloc_alloc_setup_object(topology, HWLOC_OBJ_OS_DEVICE, HWLOC_UNKNOWN_INDEX);
snprintf(buffer, sizeof(buffer), "ze%u", zeidx); // ze0d0 ?
@@ -210,6 +219,41 @@ hwloc_levelzero_discover(struct hwloc_backend *backend, struct hwloc_disc_status

hwloc__levelzero_cqprops_get(dvh[j], osdev);

nr_subdevices = 0;
res = zeDeviceGetSubDevices(dvh[j], &nr_subdevices, NULL);
/* returns ZE_RESULT_ERROR_INVALID_ARGUMENT if there are no subdevices */
if (res == ZE_RESULT_SUCCESS && nr_subdevices > 0) {
zes_device_handle_t *subh;
char tmp[64];
snprintf(tmp, sizeof(tmp), "%u", nr_subdevices);
hwloc_obj_add_info(osdev, "LevelZeroSubdevices", tmp);
subh = malloc(nr_subdevices * sizeof(*subh));
subosdevs = malloc(nr_subdevices * sizeof(*subosdevs));
if (subosdevs && subh) {
zeDeviceGetSubDevices(dvh[j], &nr_subdevices, subh);
for(k=0; k<nr_subdevices; k++) {
subosdevs[k] = hwloc_alloc_setup_object(topology, HWLOC_OBJ_OS_DEVICE, HWLOC_UNKNOWN_INDEX);
snprintf(buffer, sizeof(buffer), "ze%u.%u", zeidx, k);
subosdevs[k]->name = strdup(buffer);
subosdevs[k]->depth = HWLOC_TYPE_DEPTH_UNKNOWN;
subosdevs[k]->attr->osdev.type = HWLOC_OBJ_OSDEV_COPROC;
subosdevs[k]->subtype = strdup("LevelZero");
hwloc_obj_add_info(subosdevs[k], "Backend", "LevelZero");
snprintf(tmp, sizeof(tmp), "%u", k);
hwloc_obj_add_info(subosdevs[k], "LevelZeroSubdeviceID", tmp);

hwloc__levelzero_properties_get(dvh[j], subosdevs[k], sysman_maybe_missing);

hwloc__levelzero_cqprops_get(subh[k], subosdevs[k]);
}
} else {
free(subosdevs);
subosdevs = NULL;
nr_subdevices = 0;
}
free(subh);
}

parent = NULL;
res = zesDevicePciGetProperties(sdvh, &pci);
if (res == ZE_RESULT_SUCCESS) {
@@ -227,6 +271,12 @@ hwloc_levelzero_discover(struct hwloc_backend *backend, struct hwloc_disc_status
parent = hwloc_get_root_obj(topology);

hwloc_insert_object_by_parent(topology, parent, osdev);
if (nr_subdevices) {
for(k=0; k<nr_subdevices; k++)
if (subosdevs[k])
hwloc_insert_object_by_parent(topology, osdev, subosdevs[k]);
free(subosdevs);
}
zeidx++;
}
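
The subdevice discovery above uses a two-call pattern: a first call with a NULL array returns the count, and a second call fills a caller-allocated array. The sketch below shows that pattern in isolation, with the result of both calls checked. It does not call the real library: `fake_get_subdevices` is a hypothetical stand-in that only mimics the calling convention of zeDeviceGetSubDevices, handles are plain integers, and the device is assumed to have 2 subdevices.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for zeDeviceGetSubDevices(): if handles is NULL,
 * store the number of available entries in *count; otherwise fill at most
 * *count entries and update *count. Returns 0 on success.
 */
static int fake_get_subdevices(uint32_t *count, int *handles)
{
  const uint32_t available = 2; /* pretend the device has 2 subdevices */
  uint32_t i;

  if (!count)
    return -1;
  if (!handles) {
    *count = available;
    return 0;
  }
  if (*count > available)
    *count = available;
  for (i = 0; i < *count; i++)
    handles[i] = 100 + (int) i; /* made-up handle values */
  return 0;
}

/* Query the count, allocate, then fetch. On success returns a malloc'ed array
 * (to be freed by the caller) and stores its length in *nr. Returns NULL with
 * *nr set to 0 if there is nothing to enumerate or if any step fails.
 */
static int *enumerate_subdevices(uint32_t *nr)
{
  uint32_t count = 0;
  int *handles;

  assert(nr != NULL);
  *nr = 0;

  if (fake_get_subdevices(&count, NULL) != 0 || count == 0)
    return NULL;

  handles = malloc(count * sizeof(*handles));
  if (!handles)
    return NULL;

  /* Check the second call too: it may fail or report fewer entries. */
  if (fake_get_subdevices(&count, handles) != 0 || count == 0) {
    free(handles);
    return NULL;
  }

  *nr = count;
  return handles;
}
```

Returning NULL together with a zero count gives the caller a single condition to test before iterating, which plays the same role as resetting `nr_subdevices` to 0 on allocation failure in the hunk above.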

4 changes: 4 additions & 0 deletions tests/hwloc/ports/include/levelzero/level_zero/ze_api.h
@@ -24,8 +24,11 @@ typedef enum _ze_device_type {
ZE_DEVICE_TYPE_VPU = 5
} ze_device_type_t;

#define ZE_DEVICE_PROPERTY_FLAG_SUBDEVICE (1<<1)

typedef struct ze_device_properties {
ze_device_type_t type;
unsigned flags;
uint32_t numThreadsPerEU;
uint32_t numEUsPerSubslice;
uint32_t numSubslicesPerSlice;
@@ -41,5 +44,6 @@ typedef struct ze_command_queue_group_properties {

extern ze_result_t zeDeviceGetCommandQueueGroupProperties(ze_driver_handle_t, uint32_t *, ze_command_queue_group_properties_t *);

extern ze_result_t zeDeviceGetSubDevices(ze_device_handle_t, uint32_t *, ze_device_handle_t*);

#endif /* HWLOC_PORT_L0_ZE_API_H */
1 change: 1 addition & 0 deletions tests/hwloc/ports/include/levelzero/level_zero/zes_api.h
@@ -16,6 +16,7 @@ typedef struct {
char *modelName;
char *serialNumber;
char *boardNumber;
unsigned numSubdevices;
} zes_device_properties_t;

typedef struct {