Commit
[LIBOMPTARGET] Adding AMD to llvm-omp-device-info
Adding device information print for AMD devices on the `llvm-omp-device-info` command line tool. The output is inspired by the rocminfo command line tool. This commit adds missing HSA functions, enums and structs needed to query additional information from the HSA agents.

A generic message for the `generic-elf-64bit` plugin is also added.

Example of an output:

```
llvm-omp-device-info
Device (0):
  This is a generic-elf-64bit device
Device (1):
  This is a generic-elf-64bit device
Device (2):
  This is a generic-elf-64bit device
Device (3):
  This is a generic-elf-64bit device
Device (4):
  HSA Runtime Version: 1.1
  HSA OpenMP Device Number: 0
  Device Name: gfx906
  Vendor Name: AMD
  Device Type: GPU
  Max Queues: 128
  Queue Min Size: 64
  Queue Max Size: 131072
  Cache:
    L0: 16384 bytes
    L1: 8388608 bytes
  Cacheline Size: 64
  Max Clock Freq(MHz): 1725
  Compute Units: 60
  SIMD per CU: 4
  Fast F16 Operation: TRUE
  Wavefront Size: 64
  Workgroup Max Size: 1024
  Workgroup Max Size per Dimension:
    x: 1024
    y: 1024
    z: 1024
  Max Waves Per CU: 40
  Max Work-item Per CU: 2560
  Grid Max Size: 4294967295
  Grid Max Size per Dimension:
    x: 4294967295
    y: 4294967295
    z: 4294967295
  Max fbarriers/Workgrp: 32
  Memory Pools:
    Pool GLOBAL; FLAGS: COARSE GRAINED, :
      Size: 34342961152 bytes
      Allocatable: TRUE
      Runtime Alloc Granule: 4096 bytes
      Runtime Alloc alignment: 4096 bytes
      Accessable by all: FALSE
    Pool GLOBAL; FLAGS: FINE GRAINED, :
      Size: 34342961152 bytes
      Allocatable: TRUE
      Runtime Alloc Granule: 4096 bytes
      Runtime Alloc alignment: 4096 bytes
      Accessable by all: FALSE
    Pool GROUP:
      Size: 65536 bytes
      Allocatable: FALSE
      Runtime Alloc Granule: 0 bytes
      Runtime Alloc alignment: 0 bytes
      Accessable by all: FALSE
Device (5):
  HSA Runtime Version: 1.1
  HSA OpenMP Device Number: 1
  Device Name: gfx906
  Vendor Name: AMD
  Device Type: GPU
  Max Queues: 128
  Queue Min Size: 64
  Queue Max Size: 131072
  Cache:
    L0: 16384 bytes
    L1: 8388608 bytes
  Cacheline Size: 64
  Max Clock Freq(MHz): 1725
  Compute Units: 60
  SIMD per CU: 4
  Fast F16 Operation: TRUE
  Wavefront Size: 64
  Workgroup Max Size: 1024
  Workgroup Max Size per Dimension:
    x: 1024
    y: 1024
    z: 1024
  Max Waves Per CU: 40
  Max Work-item Per CU: 2560
  Grid Max Size: 4294967295
  Grid Max Size per Dimension:
    x: 4294967295
    y: 4294967295
    z: 4294967295
  Max fbarriers/Workgrp: 32
  Memory Pools:
    Pool GLOBAL; FLAGS: COARSE GRAINED, :
      Size: 34342961152 bytes
      Allocatable: TRUE
      Runtime Alloc Granule: 4096 bytes
      Runtime Alloc alignment: 4096 bytes
      Accessable by all: FALSE
    Pool GLOBAL; FLAGS: FINE GRAINED, :
      Size: 34342961152 bytes
      Allocatable: TRUE
      Runtime Alloc Granule: 4096 bytes
      Runtime Alloc alignment: 4096 bytes
      Accessable by all: FALSE
    Pool GROUP:
      Size: 65536 bytes
      Allocatable: FALSE
      Runtime Alloc Granule: 0 bytes
      Runtime Alloc alignment: 0 bytes
      Accessable by all: FALSE
Device (6):
  HSA Runtime Version: 1.1
  HSA OpenMP Device Number: 2
  Device Name: gfx906
  Vendor Name: AMD
  Device Type: GPU
  Max Queues: 128
  Queue Min Size: 64
  Queue Max Size: 131072
  Cache:
    L0: 16384 bytes
    L1: 8388608 bytes
  Cacheline Size: 64
  Max Clock Freq(MHz): 1725
  Compute Units: 60
  SIMD per CU: 4
  Fast F16 Operation: TRUE
  Wavefront Size: 64
  Workgroup Max Size: 1024
  Workgroup Max Size per Dimension:
    x: 1024
    y: 1024
    z: 1024
  Max Waves Per CU: 40
  Max Work-item Per CU: 2560
  Grid Max Size: 4294967295
  Grid Max Size per Dimension:
    x: 4294967295
    y: 4294967295
    z: 4294967295
  Max fbarriers/Workgrp: 32
  Memory Pools:
    Pool GLOBAL; FLAGS: COARSE GRAINED, :
      Size: 34342961152 bytes
      Allocatable: TRUE
      Runtime Alloc Granule: 4096 bytes
      Runtime Alloc alignment: 4096 bytes
      Accessable by all: FALSE
    Pool GLOBAL; FLAGS: FINE GRAINED, :
      Size: 34342961152 bytes
      Allocatable: TRUE
      Runtime Alloc Granule: 4096 bytes
      Runtime Alloc alignment: 4096 bytes
      Accessable by all: FALSE
    Pool GROUP:
      Size: 65536 bytes
      Allocatable: FALSE
      Runtime Alloc Granule: 0 bytes
      Runtime Alloc alignment: 0 bytes
      Accessable by all: FALSE
Device (7):
  HSA Runtime Version: 1.1
  HSA OpenMP Device Number: 3
  Device Name: gfx906
  Vendor Name: AMD
  Device Type: GPU
  Max Queues: 128
  Queue Min Size: 64
  Queue Max Size: 131072
  Cache:
    L0: 16384 bytes
    L1: 8388608 bytes
  Cacheline Size: 64
  Max Clock Freq(MHz): 1725
  Compute Units: 60
  SIMD per CU: 4
  Fast F16 Operation: TRUE
  Wavefront Size: 64
  Workgroup Max Size: 1024
  Workgroup Max Size per Dimension:
    x: 1024
    y: 1024
    z: 1024
  Max Waves Per CU: 40
  Max Work-item Per CU: 2560
  Grid Max Size: 4294967295
  Grid Max Size per Dimension:
    x: 4294967295
    y: 4294967295
    z: 4294967295
  Max fbarriers/Workgrp: 32
  Memory Pools:
    Pool GLOBAL; FLAGS: COARSE GRAINED, :
      Size: 34342961152 bytes
      Allocatable: TRUE
      Runtime Alloc Granule: 4096 bytes
      Runtime Alloc alignment: 4096 bytes
      Accessable by all: FALSE
    Pool GLOBAL; FLAGS: FINE GRAINED, :
      Size: 34342961152 bytes
      Allocatable: TRUE
      Runtime Alloc Granule: 4096 bytes
      Runtime Alloc alignment: 4096 bytes
      Accessable by all: FALSE
    Pool GROUP:
      Size: 65536 bytes
      Allocatable: FALSE
      Runtime Alloc Granule: 0 bytes
      Runtime Alloc alignment: 0 bytes
      Accessable by all: FALSE
```

Differential Revision: https://reviews.llvm.org/D126836
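
As a reading aid for the sample output, the sketch below shows how one of the printed values relates to two others: the reported `Max Work-item Per CU` (2560) is consistent with `Max Waves Per CU` (40) multiplied by `Wavefront Size` (64). This is an illustration only, not the code from this commit. The `AgentSummary` struct and the `maxWorkItemsPerCU` and `formatLine` helpers are hypothetical names, and the assumption that the tool derives the value this way is inferred from the numbers alone.

```cpp
#include <cstdint>
#include <string>

// Hypothetical holder for a few per-agent values taken from the sample
// output. The field names are illustrative, not the plugin's own.
struct AgentSummary {
  std::string Name;       // e.g. "gfx906"
  uint32_t WavefrontSize; // "Wavefront Size"
  uint32_t MaxWavesPerCU; // "Max Waves Per CU"
};

// Assumed relationship: work-items per compute unit equals the number of
// resident waves times the number of work-items in each wave.
uint32_t maxWorkItemsPerCU(const AgentSummary &A) {
  return A.WavefrontSize * A.MaxWavesPerCU;
}

// Formats one "Label: value" line in the style of the sample output.
std::string formatLine(const std::string &Label, uint32_t Value) {
  return Label + ": " + std::to_string(Value);
}
```

With the gfx906 values from the sample (`WavefrontSize = 64`, `MaxWavesPerCU = 40`), `formatLine("Max Work-item Per CU", maxWorkItemsPerCU(A))` yields the string `Max Work-item Per CU: 2560`, matching the corresponding line in the output.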