opencl/nvidia: properly detect the PCI domain of NVIDIA GPUs
Instead of hardwiring it to 0.

NVIDIA finally implemented this in CUDA 10.2. I couldn't find it
in their OpenCL extensions, but they told me it's implemented.
It's the very next extension value (0x400a), it reports the right
result, and following extension values (0x400b+) still report errors.

Thanks to François Rue and the wonderful PlaFRIM team for the help.

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
bgoglin committed Dec 4, 2019
1 parent 67246b9 commit 91701af
Showing 1 changed file with 9 additions and 4 deletions.
13 changes: 9 additions & 4 deletions include/hwloc/opencl.h
@@ -1,5 +1,5 @@
 /*
- * Copyright © 2012-2018 Inria. All rights reserved.
+ * Copyright © 2012-2019 Inria. All rights reserved.
  * Copyright © 2013, 2018 Université Bordeaux. All right reserved.
  * See COPYING in top-level directory.
  */
@@ -52,6 +52,7 @@ typedef union {
 /* needs "cl_nv_device_attribute_query" device extension, but not strictly required for clGetDeviceInfo() */
 #define HWLOC_CL_DEVICE_PCI_BUS_ID_NV 0x4008
 #define HWLOC_CL_DEVICE_PCI_SLOT_ID_NV 0x4009
+#define HWLOC_CL_DEVICE_PCI_DOMAIN_ID_NV 0x400A
 
 
 /** \defgroup hwlocality_opencl Interoperability with OpenCL
@@ -74,7 +75,7 @@ hwloc_opencl_get_device_pci_busid(cl_device_id device,
                                   unsigned *domain, unsigned *bus, unsigned *dev, unsigned *func)
 {
   hwloc_cl_device_topology_amd amdtopo;
-  cl_uint nvbus, nvslot;
+  cl_uint nvbus, nvslot, nvdomain;
   cl_int clret;
 
   clret = clGetDeviceInfo(device, HWLOC_CL_DEVICE_TOPOLOGY_AMD, sizeof(amdtopo), &amdtopo, NULL);
@@ -91,8 +92,12 @@ hwloc_opencl_get_device_pci_busid(cl_device_id device,
   if (CL_SUCCESS == clret) {
     clret = clGetDeviceInfo(device, HWLOC_CL_DEVICE_PCI_SLOT_ID_NV, sizeof(nvslot), &nvslot, NULL);
     if (CL_SUCCESS == clret) {
-      /* FIXME: PCI bus only uses 8bit, assume nvidia hardcodes the domain in higher bits */
-      *domain = nvbus >> 8;
+      clret = clGetDeviceInfo(device, HWLOC_CL_DEVICE_PCI_DOMAIN_ID_NV, sizeof(nvdomain), &nvdomain, NULL);
+      if (CL_SUCCESS == clret) { /* available since CUDA 10.2 */
+        *domain = nvdomain;
+      } else {
+        *domain = 0;
+      }
       *bus = nvbus & 0xff;
       /* non-documented but used in many other projects */
       *dev = nvslot >> 3;
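
The fallback introduced by the last hunk can be sketched in isolation, without an OpenCL runtime. This is a minimal illustration rather than the hwloc code itself: the `sketch_*` names, the callback type standing in for `clGetDeviceInfo()`, and the two stub "drivers" are all hypothetical. The function-number decoding (`nvslot & 0x7`) is an assumption based on the standard PCI device/function layout, since the diff is cut off after the `*dev` line.

```c
#include <stdint.h>

/* Hypothetical status codes standing in for CL_SUCCESS and an OpenCL error. */
#define SKETCH_OK   0
#define SKETCH_ERR (-1)

/* Attribute values taken from the diff above. */
#define SKETCH_PCI_BUS_ID_NV    0x4008
#define SKETCH_PCI_SLOT_ID_NV   0x4009
#define SKETCH_PCI_DOMAIN_ID_NV 0x400A

/* Hypothetical callback standing in for one clGetDeviceInfo() query. */
typedef int (*sketch_query_fn)(unsigned attr, uint32_t *value);

struct sketch_pci_id { unsigned domain, bus, dev, func; };

/* Mirrors the patched logic: bus and slot are mandatory, the domain is
 * optional and falls back to 0 when the query fails. */
static int sketch_get_pci_id(sketch_query_fn query, struct sketch_pci_id *id)
{
    uint32_t nvbus, nvslot, nvdomain;

    if (query(SKETCH_PCI_BUS_ID_NV, &nvbus) != SKETCH_OK)
        return -1;
    if (query(SKETCH_PCI_SLOT_ID_NV, &nvslot) != SKETCH_OK)
        return -1;

    if (query(SKETCH_PCI_DOMAIN_ID_NV, &nvdomain) == SKETCH_OK)
        id->domain = nvdomain;   /* driver knows the domain attribute */
    else
        id->domain = 0;          /* older driver: keep the previous default */

    id->bus  = nvbus & 0xff;
    id->dev  = nvslot >> 3;      /* as in the diff */
    id->func = nvslot & 0x7;     /* assumption: standard PCI devfn encoding */
    return 0;
}

/* Stub "driver" that answers all three queries. */
static int sketch_new_driver(unsigned attr, uint32_t *value)
{
    switch (attr) {
    case SKETCH_PCI_BUS_ID_NV:    *value = 0x3b; return SKETCH_OK;
    case SKETCH_PCI_SLOT_ID_NV:   *value = (2u << 3) | 1u; return SKETCH_OK;
    case SKETCH_PCI_DOMAIN_ID_NV: *value = 1; return SKETCH_OK;
    default:                      return SKETCH_ERR;
    }
}

/* Stub "driver" that rejects the domain query, like a pre-10.2 install. */
static int sketch_old_driver(unsigned attr, uint32_t *value)
{
    if (attr == SKETCH_PCI_DOMAIN_ID_NV)
        return SKETCH_ERR;
    return sketch_new_driver(attr, value);
}
```

Calling `sketch_get_pci_id()` with `sketch_old_driver` still succeeds and reports domain 0, which is why the patch stays compatible with drivers that predate the new attribute.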
