Not Using Second GPU #52
I'm seeing the same thing. I've tried running the code on a Mac Pro with two AMD Radeon Pro Vega II Duo cards (4 GPUs). With the MNIST and CIFAR datasets it uses only one GPU, and only at 30-50%. |
@BatmanDZ can you try increasing the batch size? That increases GPU utilization for me. |
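In Keras terms, that suggestion amounts to passing a larger batch_size to model.fit. A minimal sketch with toy data, where 1024 is just an illustrative value and not something anyone in this thread recommended:

```python
import numpy as np
import tensorflow as tf

# Toy data, only to illustrate the knob being discussed: batch_size.
x = np.random.rand(60000, 784).astype("float32")
y = np.random.randint(0, 10, size=(60000,))

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dense(10),
])
model.compile(optimizer="adam",
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))

# Keras defaults to batch_size=32; a larger value gives the GPU more work per step.
model.fit(x, y, epochs=1, batch_size=1024)
```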
Yes, I increased it up to the maximum. It did not help: GPU usage did not increase and only one GPU was used.
|
Can you post your model definition? Is it similar to the CNN posted in #25? |
Yes, it’s exactly the same.
|
Train on 60 steps, validate on 10 steps |
With a batch size of 5000 it was worse: about 12% GPU usage and 36s per step. |
It might be that the model is too small to benefit from running on the GPU. |
I tried that. GPU usage increased to 50% and the time per epoch decreased to 24s. |
@anagrath Thank you for reporting this issue. Could you please send us more information about your config? Also, are you setting device type to 'any'? What batch size are you using? |
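For anyone following along: the "device type" in question is, as I understand this fork's README, set through the mlcompute module. A minimal sketch of the quickstart script with that selector added; this is an illustration, not the poster's actual code:

```python
import tensorflow as tf
from tensorflow.python.compiler.mlcompute import mlcompute

# 'any' lets the framework choose a device; 'cpu' and 'gpu' force a backend.
mlcompute.set_mlc_device(device_name="any")

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Same shape as the TensorFlow beginner quickstart model.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10),
])
model.compile(optimizer="adam",
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=["accuracy"])

model.fit(x_train, y_train, epochs=5, batch_size=512)
model.evaluate(x_test, y_test, verbose=2)
```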
I am new to TensorFlow. It has been 20 years since I last played with neural nets, so I am sure the terminology has changed, and I picked up a book a few days ago that I am now working through. What I did was take the tutorial here: https://www.tensorflow.org/tutorials/quickstart/beginner and turn it into a file. Then I imported the mlcompute module and set the device to 'any'. I have played around with the any/cpu/gpu settings. After I filed the issue, because other people seemed to be experiencing it too and there was fast uptake, I assumed it would get resolved at some point. So I began modifying the code to see if I could start getting information on my own problem (marketing data that I downloaded from one of our providers). Here is the sorely misguided code I am currently using:
As you can see, I started to extract various settings out to the command line, including cpu/gpu/any. I am also now running three layers. Since it was not using all the resources, I figured I could run the same experiment with different settings to see which works better on my data, though even when I fill up GPU 1 it does not switch over to GPU 2 (the same thing I saw when I ran the MNIST tutorial setup multiple times). It would be interesting if we could specify which GPU on that line, so that one configuration could run on one GPU while another runs on the second. In the end, I have very little clue what I am supposed to be doing :) so I do not know if I am doing it right or if the above suggestion even makes sense. Maybe in another week, when I finish the TensorFlow book, I will have a better idea; the early examples in the book use scikit-learn, and the TensorFlow examples come much later. It may take me a minute to catch up to you guys. |
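On the idea of pinning a run to a particular GPU: in stock TensorFlow, a process can be limited to one card by restricting the visible devices before any ops run. Whether this has any effect with the mlcompute backend in this fork I have not verified; the sketch below only shows the upstream mechanism.

```python
import tensorflow as tf

# Expose only one physical GPU to this process. Index 1 is illustrative;
# a second process started with index 0 would then occupy the other card.
gpus = tf.config.list_physical_devices("GPU")
if len(gpus) > 1:
    tf.config.set_visible_devices(gpus[1], "GPU")

# From here on, TensorFlow only sees the selected GPU.
print(tf.config.get_visible_devices("GPU"))
```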
I should also say that if you have an example that you need me to run, I am happy to try to run it to help get the issue resolved. |
@anagrath Thank you for posting the code. Could you tell me which command line arguments you are running this code with? |
I have been running different numbers, sometimes simultaneously (in different tabs). I was doing powers of 2 and started to get diminishing returns on accuracy around 4096. Again, I am running multiple instances with different numbers to use up more resources and get answers faster. Here are some example runs: 300 512 1024 512 cpu |
I have created a simple training script using the MNIST data and the TensorFlow getting-started tutorial. It uses approximately 30% GPU with 800% CPU when running. I ran the script several times at the same time to see if it would use a second GPU; it takes my first GPU to 100% but does not switch to my second GPU for additional processes.
Was this supposed to load-balance over multiple GPUs?
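As far as I know, TensorFlow does not spread a single training run across GPUs on its own; multi-GPU data parallelism has to be requested explicitly, for example with tf.distribute.MirroredStrategy in stock TensorFlow. I do not know whether the mlcompute backend in this fork supports it, so the sketch below only shows what it looks like upstream.

```python
import tensorflow as tf

(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train / 255.0

# MirroredStrategy replicates the model on every visible GPU and splits each
# batch across the replicas (data parallelism).
strategy = tf.distribute.MirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)

with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10),
    ])
    model.compile(optimizer="adam",
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                  metrics=["accuracy"])

# The global batch is divided across replicas, so it is common to scale it
# with the number of GPUs.
model.fit(x_train, y_train, epochs=5,
          batch_size=256 * strategy.num_replicas_in_sync)
```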