Switch GPU Util metric to DCGM_FI_PROF_GR_ENGINE_ACTIVE in NVIDIA DCGM Metrics Dashboard #341
Labels
enhancement
New feature or request
Is this a new feature, an improvement, or a change to existing functionality?
Improvement
Please provide a clear description of the problem this feature solves
The NVIDIA DCGM Metrics Dashboard on OpenShift 4.15 uses the DCGM_FI_DEV_GPU_UTIL metric, which only shows 0% or 100% GPU utilization. The more accurate metric is DCGM_FI_PROF_GR_ENGINE_ACTIVE. The dashboard needs to switch to reporting DCGM_FI_PROF_GR_ENGINE_ACTIVE for GPU utilization.
Feature Description
From a user perspective, I need to see accurate GPU utilization when running a GPU workload, not just 0% or 100%. The current counter in the NVIDIA DCGM Dashboard on OpenShift uses an older metric, DCGM_FI_DEV_GPU_UTIL. The dashboard needs to switch to reporting DCGM_FI_PROF_GR_ENGINE_ACTIVE for GPU utilization.
Describe your ideal solution
Switch the dashboard to report the DCGM_FI_PROF_GR_ENGINE_ACTIVE metric for GPU utilization.
Additional context
No response
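One detail that may matter for the switch: as far as I understand the DCGM field definitions, DCGM_FI_DEV_GPU_UTIL is reported as a percentage (0 to 100), while DCGM_FI_PROF_GR_ENGINE_ACTIVE is a ratio (0.0 to 1.0), so a panel that expects percent would probably need to scale the new metric by 100. The sketch below illustrates that under this assumption. The helper names `gpu_util_percent_expr` and `ratio_to_percent` are hypothetical and not part of the dashboard or of DCGM; only the two metric names come from this issue.

```python
# Hypothetical helpers for illustration; only the metric names come from the issue.
OLD_METRIC = "DCGM_FI_DEV_GPU_UTIL"           # assumed to be a percentage (0-100)
NEW_METRIC = "DCGM_FI_PROF_GR_ENGINE_ACTIVE"  # assumed to be a ratio (0.0-1.0)


def gpu_util_percent_expr(metric: str = NEW_METRIC) -> str:
    """Return a PromQL expression that yields GPU utilization in percent."""
    if metric == NEW_METRIC:
        # Scale the ratio so an existing 0-100 panel axis still applies.
        return f"{metric} * 100"
    # The old metric is already assumed to be in percent; use it unchanged.
    return metric


def ratio_to_percent(ratio: float) -> float:
    """Convert a 0.0-1.0 engine-active sample to a percentage, clamped to 0-100."""
    return max(0.0, min(1.0, ratio)) * 100.0


if __name__ == "__main__":
    print(gpu_util_percent_expr())
    print(ratio_to_percent(0.25))
```

If the dashboard panel is already configured with a 0 to 1 "percentunit" style axis, the scaling would not be needed and the bare metric name could be used instead.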