StarPU has found :
32 STARPU_CPU_WORKER workers:
        CPU 0
        CPU 1
        CPU 2
        CPU 3
        CPU 4
        CPU 5
        CPU 6
        CPU 7
        CPU 8
        CPU 9
        CPU 10
        CPU 11
        CPU 12
        CPU 13
        CPU 14
        CPU 15
        CPU 16
        CPU 17
        CPU 18
        CPU 19
        CPU 20
        CPU 21
        CPU 22
        CPU 23
        CPU 24
        CPU 25
        CPU 26
        CPU 27
        CPU 28
        CPU 29
        CPU 30
        CPU 31
4 STARPU_CUDA_WORKER workers:
        CUDA 0.0 (Tesla V100-SXM2-16GB 14.2 GiB 1a:00.0)
        CUDA 1.0 (Tesla V100-SXM2-16GB 14.2 GiB 1c:00.0)
        CUDA 2.0 (Tesla V100-SXM2-16GB 14.2 GiB 1d:00.0)
        CUDA 3.0 (Tesla V100-SXM2-16GB 14.2 GiB 1e:00.0)
No STARPU_OPENCL_WORKER worker

topology ... (hwloc logical indexes)
numa 0  pack 0  core 0   PU 0   CPU 0
                core 1   PU 1   CPU 1
                core 2   PU 2   CPU 2
                core 3   PU 3   CPU 3
                core 4   PU 4   CPU 4
                core 5   PU 5   CPU 5
                core 6   PU 6   CPU 6
                core 7   PU 7   CPU 7
                core 8   PU 8   CPU 8
                core 9   PU 9   CPU 9
                core 10  PU 10  CPU 10
                core 11  PU 11  CPU 11
                core 12  PU 12  CPU 12
                core 13  PU 13  CPU 13
                core 14  PU 14  CPU 14
                core 15  PU 15  CPU 15
                core 16  PU 16  CPU 16
                core 17  PU 17  CPU 17
numa 1  pack 1  core 18  PU 18  CPU 18
                core 19  PU 19  CPU 19
                core 20  PU 20  CPU 20
                core 21  PU 21  CPU 21
                core 22  PU 22  CPU 22
                core 23  PU 23  CPU 23
                core 24  PU 24  CPU 24
                core 25  PU 25  CPU 25
                core 26  PU 26  CPU 26
                core 27  PU 27  CPU 27
                core 28  PU 28  CPU 28
                core 29  PU 29  CPU 29
                core 30  PU 30  CPU 30
                core 31  PU 31  CPU 31
                core 32  PU 32  CUDA 0.0 (Tesla V100-SXM2-16GB 14.2 GiB 1a:00.0)
                core 33  PU 33  CUDA 1.0 (Tesla V100-SXM2-16GB 14.2 GiB 1c:00.0)
                core 34  PU 34  CUDA 2.0 (Tesla V100-SXM2-16GB 14.2 GiB 1d:00.0)
                core 35  PU 35  CUDA 3.0 (Tesla V100-SXM2-16GB 14.2 GiB 1e:00.0)

bandwidth (MB/s) and latency (us)...
from/to  NUMA 0  CUDA 0  CUDA 1  CUDA 2  CUDA 3
NUMA 0   0       12203   12223   12220   12222
CUDA 0   12979   0       47413   47744   47742
CUDA 1   12979   47751   0       47761   47764
CUDA 2   12979   47750   47746   0       47639
CUDA 3   12980   47769   47765   47757   0

NUMA 0   0       9       9       9       9
CUDA 0   8       0       10      10      10
CUDA 1   8       10      0       10      10
CUDA 2   8       10      10      0       10
CUDA 3   8       10      10      10      0

GPU     NUMA in preference order (logical index), host-to-device, device-to-host
CUDA_0  0
CUDA_1  0
CUDA_2  0
CUDA_3  0