msg="Failed to collect metrics with error: Failed to transform metrics for transform unsupported KubernetesGPUIDType for MetricID 'device_name': podMapper"
#145
Closed
suchisur opened this issue
Mar 14, 2023
· 1 comment
Tried the suggestion in #27: basically added - name: "DCGM_EXPORTER_KUBERNETES_GPU_ID_TYPE" value: "device-name". On doing this, I run into the above-mentioned error, as seen in the dcgm-exporter DaemonSet pod logs.
P.S. We are using time-slicing, and each node has one GPU attached.
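For reference, the env entry described above would sit in the dcgm-exporter DaemonSet roughly as sketched below. Only the variable name and value come from this report; the container name and the surrounding structure are assumptions based on the usual DaemonSet layout, not the actual manifest used here.

```yaml
# Sketch of the relevant part of a dcgm-exporter DaemonSet spec.
# Only the env entry is taken from the report; everything else is assumed.
spec:
  template:
    spec:
      containers:
        - name: dcgm-exporter   # assumed container name
          env:
            - name: "DCGM_EXPORTER_KUBERNETES_GPU_ID_TYPE"
              value: "device-name"
```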
suchisur changed the title on Mar 14, 2023
from: I mounted the /proc on the node to /proc on the dcgm exporter pod and can view the processes on doing nvidia-smi now, however on prometheus no per pod metrics are available. Tried this : #27, basically added - name: "DCGM_EXPORTER_KUBERNETES_GPU_ID_TYPE" value: "device-name" On doing this, i run into: msg="Failed to collect metrics with error: Failed to transform metrics for transform unsupported KubernetesGPUIDType for MetricID 'device_name': podMapper"
to: msg="Failed to collect metrics with error: Failed to transform metrics for transform unsupported KubernetesGPUIDType for MetricID 'device_name': podMapper"