How does eco2AI handle using single GPUs on a multi GPU rack? #11
Comments
Hello, @maltefranke! Thank you for your attention to eco2ai and for your question! Here are the answers:

eco2ai tracks the full GPU consumption of the system, but only the CPU processes involved in the current code execution. Unfortunately, we currently have no way to track only the GPU resources used by your code. However, you can configure the Tracker class with the parameter `cpu_processes`: set it to "current" and the Tracker will calculate CPU utilization only for the currently running process, or to "all" and it will calculate full CPU utilization.

We use the Python library pynvml to track Nvidia GPUs. By specifying CUDA_VISIBLE_DEVICES=gpu_id, you instruct your Python script to recognize only the GPU with ID gpu_id, rather than all the GPUs in your PC/server system. It is difficult for me to give a specific recipe for making eco2ai recognize only a certain GPU. However, if the other GPUs are not being used, this affects only the number of GPUs reported, not the measured energy consumption.

I also want to thank you, because your issue has highlighted a potential improvement for our library; I believe many other users face the same problem. In future versions we will add the option to track a specific GPU, if it is technically possible. If you have additional questions, I will be happy to answer them.
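The distinction above can be sketched in a few lines: CUDA_VISIBLE_DEVICES limits which physical GPUs a process sees, while an NVML-based tracker enumerates all of them. This is a minimal illustration, not eco2ai code; the helper name `visible_gpu_indices` is hypothetical.

```python
import os

def visible_gpu_indices(total_gpus, env=None):
    """Return the GPU indices a CUDA process would see, given
    CUDA_VISIBLE_DEVICES. If the variable is unset, all GPUs are visible."""
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return list(range(total_gpus))
    return [int(i) for i in raw.split(",") if i.strip()]

# On a 4-GPU rack with CUDA_VISIBLE_DEVICES=2, the training script sees
# one GPU, but pynvml-based tracking still enumerates all four devices.
print(visible_gpu_indices(4, env={"CUDA_VISIBLE_DEVICES": "2"}))  # [2]
print(visible_gpu_indices(4, env={}))                             # [0, 1, 2, 3]
```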
Thank you very much for your thorough reply! Unfortunately, other models have been running on the other 3 GPUs at the same time, so the emissions are entangled, if I understand your explanation correctly. I can also work with a Slurm system and request specific GPUs, which in my first attempts seems to work as intended with eco2ai (tracking only the requested GPUs). Nonetheless, not everyone has that luxury, and I believe adding GPU recognition from CUDA_VISIBLE_DEVICES would be a valuable addition to your project.
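The improvement suggested here could, in principle, filter the tracker's per-GPU readings by the indices in CUDA_VISIBLE_DEVICES. A minimal sketch of that filtering logic under stated assumptions — the function name `tracked_power` and the readings list are hypothetical, not part of eco2ai's API:

```python
import os

def tracked_power(per_gpu_power_watts, env=None):
    """Sum power draw only over the GPUs exposed via CUDA_VISIBLE_DEVICES.

    per_gpu_power_watts holds one reading per physical GPU, e.g. as an
    NVML-based tracker would collect for each device handle.
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        # No restriction: count every GPU in the system.
        visible = range(len(per_gpu_power_watts))
    else:
        visible = [int(i) for i in raw.split(",") if i.strip()]
    return sum(per_gpu_power_watts[i] for i in visible)

# Four V100s, but only GPU 2 is requested for this job:
print(tracked_power([110.0, 95.0, 250.0, 105.0],
                    env={"CUDA_VISIBLE_DEVICES": "2"}))  # 250.0
```

With such a filter, emissions from jobs running on the other GPUs would no longer be entangled with the tracked job's total.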
Hey there!
First of all, thank you for the great effort to make AI emissions more transparent, and for drawing more attention to this important topic! I have been using your tool to track the emissions for (small) language models and have a couple of questions regarding the usage of your tool.
Setup:
I have a cluster system with 4 GPUs (Tesla V100) available. I run my models with the CUDA_VISIBLE_DEVICES=gpu_id prefix in the command line to only use 1 specific GPU for my training.
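For reference, the launch pattern looks like this — a trivial inline script stands in for the real training command, which is not shown in this thread:

```shell
# Expose only physical GPU 2 to the child process; everything after the
# assignment runs with that restricted view. Substitute your actual
# training command for the inline python here.
CUDA_VISIBLE_DEVICES=2 python3 -c 'import os; print(os.environ["CUDA_VISIBLE_DEVICES"])'
```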
Problem:
In the output CSV, eco2ai reports that 4 GPUs were tracked, even though I restrict training to a single GPU.
Questions:
Thank you very much in advance for your help!