How does eco2AI handle using single GPUs on a multi GPU rack? #11
Comments
Hello, @maltefranke! Thank you for your attention to eco2ai and for your question! Here are the answers:

eco2ai tracks the full GPU consumption of the system, but only the CPU processes involved in the current code execution. Unfortunately, we currently have no way to track only the GPU resources used by your code. However, you can configure the Tracker class with the parameter `cpu_processes`: set it to "current" and the Tracker will calculate CPU utilization only for the currently running process, or to "all" and it will calculate full CPU utilization.

We use the Python library pynvml to track Nvidia GPUs. By specifying CUDA_VISIBLE_DEVICES=gpu_id, you instruct your Python script to recognize only the GPU with ID gpu_id, rather than all the GPUs in your PC/server system. It is difficult for me to give a specific recipe for making eco2ai recognize only a certain GPU. However, if the other GPUs are not being used, this affects only the number of GPUs reported, not the measured energy consumption.

I also want to thank you, because your issue has highlighted a potential improvement for our library; I believe many other users face the same problem. In future versions we will add the option to track a specific GPU, if it is technically possible. If you have additional questions, I will be happy to answer them.
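The distinction above can be sketched in a few lines: CUDA_VISIBLE_DEVICES limits which physical GPUs a process sees, while an NVML-based tracker enumerates all of them. This is a minimal illustration, not eco2ai code; the helper name `visible_gpu_indices` is hypothetical.

```python
import os

def visible_gpu_indices(total_gpus, env=None):
    """Return the GPU indices a CUDA process would see, given
    CUDA_VISIBLE_DEVICES. If the variable is unset, all GPUs are visible."""
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return list(range(total_gpus))
    return [int(i) for i in raw.split(",") if i.strip()]

# On a 4-GPU rack with CUDA_VISIBLE_DEVICES=2, the training script sees
# one GPU, but pynvml-based tracking still enumerates all four devices.
print(visible_gpu_indices(4, env={"CUDA_VISIBLE_DEVICES": "2"}))  # [2]
print(visible_gpu_indices(4, env={}))                             # [0, 1, 2, 3]
```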
Thank you very much for your thorough reply! Unfortunately, other models have been running on the other 3 GPUs at the same time, so the emissions are entangled, if I understand your explanation correctly. I can also work with a Slurm system and request specific GPUs, which in my first attempts seems to work as intended with eco2ai (tracking only the requested GPUs). Nonetheless, not everyone has that luxury, and I believe adding GPU recognition from CUDA_VISIBLE_DEVICES would be a valuable addition to your project.
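The improvement suggested here could, in principle, filter the tracker's per-GPU readings by the indices in CUDA_VISIBLE_DEVICES. A minimal sketch of that filtering logic under stated assumptions — the function name `tracked_power` and the readings list are hypothetical, not part of eco2ai's API:

```python
import os

def tracked_power(per_gpu_power_watts, env=None):
    """Sum power draw only over the GPUs exposed via CUDA_VISIBLE_DEVICES.

    per_gpu_power_watts holds one reading per physical GPU, e.g. as an
    NVML-based tracker would collect for each device handle.
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        # No restriction: count every GPU in the system.
        visible = range(len(per_gpu_power_watts))
    else:
        visible = [int(i) for i in raw.split(",") if i.strip()]
    return sum(per_gpu_power_watts[i] for i in visible)

# Four V100s, but only GPU 2 is requested for this job:
print(tracked_power([110.0, 95.0, 250.0, 105.0],
                    env={"CUDA_VISIBLE_DEVICES": "2"}))  # 250.0
```

With such a filter, emissions from jobs running on the other GPUs would no longer be entangled with the tracked job's total.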
Hey there!
First of all, thank you for the great effort to make AI emissions more transparent, and for drawing more attention to this important topic! I have been using your tool to track the emissions for (small) language models and have a couple of questions regarding the usage of your tool.
Setup:
I have a cluster system with 4 GPUs (Tesla V100) available. I run my models with the CUDA_VISIBLE_DEVICES=gpu_id prefix in the command line to only use 1 specific GPU for my training.
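For reference, the launch pattern looks like this — a trivial inline script stands in for the real training command, which is not shown in this thread:

```shell
# Expose only physical GPU 2 to the child process; everything after the
# assignment runs with that restricted view. Substitute your actual
# training command for the inline python here.
CUDA_VISIBLE_DEVICES=2 python3 -c 'import os; print(os.environ["CUDA_VISIBLE_DEVICES"])'
```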
Problem:
In the output CSV, eco2ai reports that 4 GPUs were tracked, even though I restrict training to a single GPU.
Questions:
Thank you very much in advance for your help!