Error setting watches. Result: -33: This request is serviced by a module of DCGM that is not currently loaded #50
Comments
The DCP family of metrics (field IDs 1001-1015) is not supported on RTX GPUs. The profiling module is not loaded when no supported GPU is detected. WBR,
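As a rough illustration of the range mentioned in this reply, the sketch below separates profiling (DCP) field IDs from other field IDs before a watch is requested. It assumes the DCP range is exactly 1001-1015, as quoted above; the helper name `split_fields` and the sample IDs 150 and 155 are illustrative only and are not taken from this thread.

```python
# Assumption: DCP (profiling) field IDs span 1001-1015 inclusive,
# as stated in the reply above. Adjust if your DCGM version differs.
DCP_FIELD_RANGE = range(1001, 1016)


def split_fields(field_ids):
    """Split field IDs into (regular, profiling) lists, preserving order.

    Hypothetical helper for illustration; it is not part of DCGM.
    """
    regular = [f for f in field_ids if f not in DCP_FIELD_RANGE]
    profiling = [f for f in field_ids if f in DCP_FIELD_RANGE]
    return regular, profiling


if __name__ == "__main__":
    # 1011 is the field requested in the original post; 150 and 155 are sample IDs.
    regular, profiling = split_fields([150, 155, 1011])
    print("regular:", regular)
    print("profiling (needs the profiling module):", profiling)
```

On a GPU without profiling support, only the `regular` list would be worth passing to a watch request; the `profiling` list is what triggers the -33 error described in this issue.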
Dear Nik, Thanks for the prompt reply. Is support for those RTX GPUs possible or planned? We are trying to use DCGM for performance modeling research and would appreciate your help. Best regards,
By the way, I also tried DCGM profiling on a GTX 1650 SUPER and observed the same error: "Error setting watches. Result: -33: This request is serviced by a module of DCGM that is not currently loaded". Do GTX cards support profiling with DCGM? Best regards,
The DCP metrics are supported only on Datacenter-grade and Quadro GPUs. Neither RTX nor GTX GPUs are supported.
@nikkon-dev How about the NVIDIA RTX A4000? The NVIDIA RTX series is the new generation of Quadro GPUs, but unfortunately DCGM does not seem to be compatible with it. For details, see the description on NVIDIA's web page: https://www.nvidia.com/en-us/design-visualization/quadro/
Could you share the |
@nikkon-dev FYR. Thanks!
Hi,
I tried to monitor some fields of my GPUs (RTX 3090). The configuration is as follows:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.57       Driver Version: 515.57       CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:18:00.0 Off |                  N/A |
| 30%   42C    P8    26W / 350W |     17MiB / 24576MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  On   | 00000000:3B:00.0 Off |                  N/A |
| 31%   42C    P8    23W / 350W |      7MiB / 24576MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA GeForce ...  On   | 00000000:86:00.0 Off |                  N/A |
| 31%   42C    P8    23W / 350W |      7MiB / 24576MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  NVIDIA GeForce ...  On   | 00000000:AF:00.0 Off |                  N/A |
| 31%   43C    P8    38W / 350W |      7MiB / 24576MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
When I ran the following command:
dcgmi dmon -d 100 -e 1011 --host 127.0.0.1:39999
the system reported:
Error setting watches. Result: -33: This request is serviced by a module of DCGM that is not currently loaded
Any suggestions?
Thank you very much and looking forward to your reply!
Best regards,
Qiang Wang
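For anyone scripting around dcgmi, here is a minimal sketch of how the error line quoted in this post could be detected in captured output. The regular expression is inferred only from the single message shown above ("Result: -33: ..."), so the general format of dcgmi error lines is an assumption, and the function names are hypothetical.

```python
import re

# Assumption: error lines look like "<text> Result: <code>: <message>",
# based solely on the one message quoted in this issue.
ERROR_RE = re.compile(r"Result:\s*(-?\d+):\s*(.+)")

# Code observed in this thread for "module ... not currently loaded".
MODULE_NOT_LOADED = -33


def parse_dcgmi_error(line):
    """Return (code, message) if the line matches the assumed format, else None."""
    match = ERROR_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), match.group(2).strip()


def is_module_not_loaded(line):
    """True if the line reports the -33 'module not loaded' error."""
    parsed = parse_dcgmi_error(line)
    return parsed is not None and parsed[0] == MODULE_NOT_LOADED


if __name__ == "__main__":
    sample = ("Error setting watches. Result: -33: This request is serviced "
              "by a module of DCGM that is not currently loaded")
    print(parse_dcgmi_error(sample))
    print(is_module_not_loaded(sample))
```

A wrapper script could use `is_module_not_loaded` to retry without the profiling field IDs rather than failing outright.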