GPU Utilization is N/A on Intel Data Center GPU Max 1550? #98

@colleeneb

Hello,

With xpu-smi/1.2.39, GPU Utilization is reported as "N/A" even though we know an application is running on the GPU.

For example, we see:

xpu-smi dump -d 0,1,2,3,4,5 -m 0,1,2,3,4,5
Timestamp, DeviceId, GPU Utilization (%), GPU Power (W), GPU Frequency (MHz), GPU Core Temperature (Celsius Degree), GPU Memory Temperature (Celsius Degree), GPU Memory Utilization (%)
08:08:16.817,    0,  N/A, 281.13, 1600.00, 35.50, 31.50,  N/A
08:08:16.817,    1,  N/A, 286.48, 1600.00, 38.50, 35.00,  N/A
08:08:16.817,    2,  N/A, 281.28, 1600.00, 37.50, 31.50,  N/A
08:08:16.817,    3,  N/A, 293.19, 1600.00, 37.50, 32.50,  N/A
08:08:16.817,    4,  N/A, 292.32, 1600.00, 42.50, 36.00,  N/A
08:08:16.817,    5,  N/A, 284.67, 1600.00, 40.50, 37.50,  N/A
08:08:17.817,    0,  N/A, 293.78, 1600.00, 35.50, 32.00, 20.41
08:08:17.818,    1,  N/A, 287.05, 1600.00, 38.50, 35.00, 1.67
08:08:17.818,    2,  N/A, 281.70, 1600.00, 37.00, 32.00, 1.31
08:08:17.818,    3,  N/A, 293.61, 1600.00, 38.00, 33.00, 0.75
08:08:17.818,    4,  N/A, 292.31, 1600.00, 42.50, 36.00, 0.63
08:08:17.818,    5,  N/A, 284.77, 1600.00, 41.00, 37.00, 0.51
08:08:18.817,    0,  N/A, 409.54, 1600.00, 40.50, 34.50, 20.41
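
To make the pattern in this output easier to quantify, here is a minimal Python sketch that counts how often each metric column reports N/A. It is not part of xpu-smi: the function name `missing_metric_counts` and the `SAMPLE` constant are hypothetical, and the parser assumes the comma-separated layout shown above (a header line followed by one row per device per sample). The sample holds only the three device 0 rows from the output above.

```python
import csv
import io

# Header plus the three device-0 rows copied from the dump output above.
SAMPLE = """\
Timestamp, DeviceId, GPU Utilization (%), GPU Power (W), GPU Frequency (MHz), GPU Core Temperature (Celsius Degree), GPU Memory Temperature (Celsius Degree), GPU Memory Utilization (%)
08:08:16.817,    0,  N/A, 281.13, 1600.00, 35.50, 31.50,  N/A
08:08:17.817,    0,  N/A, 293.78, 1600.00, 35.50, 32.00, 20.41
08:08:18.817,    0,  N/A, 409.54, 1600.00, 40.50, 34.50, 20.41
"""


def missing_metric_counts(dump_text):
    """Return (row_count, {metric name: number of rows reporting N/A}).

    Hypothetical helper: assumes the first line is the header and that
    the first two columns are Timestamp and DeviceId.
    """
    reader = csv.reader(io.StringIO(dump_text), skipinitialspace=True)
    header = [name.strip() for name in next(reader)]
    metrics = header[2:]  # skip Timestamp and DeviceId
    counts = {name: 0 for name in metrics}
    rows = 0
    for row in reader:
        if len(row) != len(header):
            continue  # ignore blank or truncated lines
        rows += 1
        for name, value in zip(metrics, row[2:]):
            if value.strip() == "N/A":
                counts[name] += 1
    return rows, counts


if __name__ == "__main__":
    rows, counts = missing_metric_counts(SAMPLE)
    for name, missing in counts.items():
        print(f"{name}: {missing} of {rows} rows are N/A")
```

On these three rows, GPU Utilization is N/A in all three, while GPU Memory Utilization is N/A only in the first. The same function could be pointed at a longer capture saved to a file.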

Our system is Aurora at Argonne National Laboratory, with Intel Data Center GPU Max 1550 (PVC) GPUs; the UMD and KMD are both 1099.12. If you need more information, let us know.

Thanks,
Colleen
