StarPU has found :
56 STARPU_CPU_WORKER workers:
	CPU 0
	CPU 1
	CPU 2
	CPU 3
	CPU 4
	CPU 5
	CPU 6
	CPU 7
	CPU 8
	CPU 9
	CPU 10
	CPU 11
	CPU 12
	CPU 13
	CPU 14
	CPU 15
	CPU 16
	CPU 17
	CPU 18
	CPU 19
	CPU 20
	CPU 21
	CPU 22
	CPU 23
	CPU 24
	CPU 25
	CPU 26
	CPU 27
	CPU 28
	CPU 29
	CPU 30
	CPU 31
	CPU 32
	CPU 33
	CPU 34
	CPU 35
	CPU 36
	CPU 37
	CPU 38
	CPU 39
	CPU 40
	CPU 41
	CPU 42
	CPU 43
	CPU 44
	CPU 45
	CPU 46
	CPU 47
	CPU 48
	CPU 49
	CPU 50
	CPU 51
	CPU 52
	CPU 53
	CPU 54
	CPU 55
8 STARPU_CUDA_WORKER workers:
	CUDA 0.0 (NVIDIA A100-SXM4-80GB 71.2 GiB 27:00.0)
	CUDA 1.0 (NVIDIA A100-SXM4-80GB 71.2 GiB 2a:00.0)
	CUDA 2.0 (NVIDIA A100-SXM4-80GB 71.2 GiB 51:00.0)
	CUDA 3.0 (NVIDIA A100-SXM4-80GB 71.2 GiB 57:00.0)
	CUDA 4.0 (NVIDIA A100-SXM4-80GB 71.2 GiB 9e:00.0)
	CUDA 5.0 (NVIDIA A100-SXM4-80GB 71.2 GiB a4:00.0)
	CUDA 6.0 (NVIDIA A100-SXM4-80GB 71.2 GiB c7:00.0)
	CUDA 7.0 (NVIDIA A100-SXM4-80GB 71.2 GiB ca:00.0)
No STARPU_OPENCL_WORKER worker

topology ... (hwloc logical indexes)
numa  0	pack  0	core 0	PU 0	CPU 0
		core 1	PU 1	CPU 1
		core 2	PU 2	CPU 2
		core 3	PU 3	CPU 3
		core 4	PU 4	CPU 4
		core 5	PU 5	CPU 5
		core 6	PU 6	CPU 6
		core 7	PU 7	CPU 7
		core 8	PU 8	CPU 8
		core 9	PU 9	CPU 9
		core 10	PU 10	CPU 10
		core 11	PU 11	CPU 11
		core 12	PU 12	CPU 12
		core 13	PU 13	CPU 13
		core 14	PU 14	CPU 14
		core 15	PU 15	CPU 15
		core 16	PU 16	CPU 16
		core 17	PU 17	CPU 17
		core 18	PU 18	CPU 18
		core 19	PU 19	CPU 19
		core 20	PU 20	CPU 20
		core 21	PU 21	CPU 21
		core 22	PU 22	CPU 22
		core 23	PU 23	CPU 23
		core 24	PU 24	CPU 24
		core 25	PU 25	CPU 25
		core 26	PU 26	CPU 26
		core 27	PU 27	CPU 27
		core 28	PU 28	CPU 28
		core 29	PU 29	CPU 29
		core 30	PU 30	CPU 30
		core 31	PU 31	CPU 31
numa  1	pack  1	core 32	PU 32	CPU 32
		core 33	PU 33	CPU 33
		core 34	PU 34	CPU 34
		core 35	PU 35	CPU 35
		core 36	PU 36	CPU 36
		core 37	PU 37	CPU 37
		core 38	PU 38	CPU 38
		core 39	PU 39	CPU 39
		core 40	PU 40	CPU 40
		core 41	PU 41	CPU 41
		core 42	PU 42	CPU 42
		core 43	PU 43	CPU 43
		core 44	PU 44	CPU 44
		core 45	PU 45	CPU 45
		core 46	PU 46	CPU 46
		core 47	PU 47	CPU 47
		core 48	PU 48	CPU 48
		core 49	PU 49	CPU 49
		core 50	PU 50	CPU 50
		core 51	PU 51	CPU 51
		core 52	PU 52	CPU 52
		core 53	PU 53	CPU 53
		core 54	PU 54	CPU 54
		core 55	PU 55	CPU 55
		core 56	PU 56	CUDA 0.0 (NVIDIA A100-SXM4-80GB 71.2 GiB 27:00.0)
		core 57	PU 57	CUDA 1.0 (NVIDIA A100-SXM4-80GB 71.2 GiB 2a:00.0)
		core 58	PU 58	CUDA 2.0 (NVIDIA A100-SXM4-80GB 71.2 GiB 51:00.0)
		core 59	PU 59	CUDA 3.0 (NVIDIA A100-SXM4-80GB 71.2 GiB 57:00.0)
		core 60	PU 60	CUDA 4.0 (NVIDIA A100-SXM4-80GB 71.2 GiB 9e:00.0)
		core 61	PU 61	CUDA 5.0 (NVIDIA A100-SXM4-80GB 71.2 GiB a4:00.0)
		core 62	PU 62	CUDA 6.0 (NVIDIA A100-SXM4-80GB 71.2 GiB c7:00.0)
		core 63	PU 63	CUDA 7.0 (NVIDIA A100-SXM4-80GB 71.2 GiB ca:00.0)

bandwidth (MB/s) and latency (us)...
from/to  NUMA 0  CUDA 0  CUDA 1  CUDA 2  CUDA 3  CUDA 4  CUDA 5  CUDA 6  CUDA 7
NUMA 0        0   25093   25173   25155   25171   25103   25098   25081   25070
CUDA 0    23839       0  237093  241591  241592  241860  244171  244921  243622
CUDA 1    23839  244506       0  242319  242414  245014  243011  244269  244551
CUDA 2    23840  244862  245277       0  242914  244460  244745  244587  244632
CUDA 3    23840  241273  241860  103391       0  244297  243507  243558  244466
CUDA 4    23836  241772  244534  243944  247462       0  244066  243014  243584
CUDA 5    23835  241711  241605  243762  244878  247284       0  243749  243871
CUDA 6    23834  241188  243913  244744  244161  243936  246822       0  243548
CUDA 7    23835  241782  241208  243598  244189  243356  243619  244084       0

NUMA 0        0      11      11      11      11      12      12      12      12
CUDA 0       10       0      13      13      13      13      13      13      13
CUDA 1       10      13       0      13      13      13      13      13      13
CUDA 2       10      13      13       0      13      13      13      13      13
CUDA 3       10      13      13      12       0      13      13      13      13
CUDA 4       11      14      13      13      13       0      13      13      13
CUDA 5       11      13      14      13      13      13       0      13      13
CUDA 6       11      14      13      13      13      13      13       0      13
CUDA 7       11      14      14      13      13      13      13      13       0

GPU	NUMA in preference order (logical index), host-to-device, device-to-host
CUDA_0	 0
CUDA_1	 0
CUDA_2	 0
CUDA_3	 0
CUDA_4	 0
CUDA_5	 0
CUDA_6	 0
CUDA_7	 0