What is the known working GPU config? #84

Closed · ghost opened this issue Aug 5, 2022 · 5 comments
Labels
GPU Everything to do with GPU, CUDA, cuDNN

Comments

ghost commented Aug 5, 2022

I am using an Amazon pre-built Ubuntu 16 Deep Learning AMI, which ships with CUDA 10, 10.1, 10.2, and 11.

I am using Mambaforge with Python 3.6 or 3.7.

TensorFlow 2 is used automatically. I plan to try TensorFlow 1.x next.

The process is loaded into GPU memory, but the GPU is never used.

Is there a known working full-stack config for eynollah on the GPU (OS, CUDA, Python, TensorFlow versions, etc.) that you don't mind sharing?

Thanks,
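A minimal check to see whether TensorFlow detects the GPU at all (the exact output depends on the installed build):

```python
import tensorflow as tf

# Should list at least one PhysicalDevice if TF can use the GPU;
# an empty list means the CUDA/cuDNN stack is not usable by this TF build.
print(tf.config.list_physical_devices("GPU"))
print(tf.test.is_built_with_cuda())  # False for a CPU-only TensorFlow wheel
```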

cneud (Member) commented Aug 23, 2022

Hi @mach881040, for me it works well with an NVIDIA 2070S GPU on Ubuntu 18.04, Python 3.7, TensorFlow 2.4.1 and CUDA 10.1. Note that there is also still a lot of room for improvement with regard to GPU utilization - we hope to optimize this, but for our use case the quality of results is much more important than throughput speed.

bertsky (Contributor) commented Feb 11, 2023

> The process is loaded into GPU memory, but the GPU is never used.

I can confirm this with Ubuntu 22.04, Python 3.8, TF 2.10. It's not about low utilisation: the OP says no utilisation, and that's what I see, too. Memory consumption is only 107 MB (and not increasing), and GPU utilisation never rises above 0%.
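To distinguish "TF cannot use the GPU" from "eynollah never places work on it", one can force a small op onto the device; a minimal sketch:

```python
import tensorflow as tf

# If no GPU is usable, this either raises an error or silently falls
# back to CPU, depending on soft device placement; b.device shows
# where the op actually ran (expect ".../device:GPU:0").
with tf.device("/GPU:0"):
    a = tf.random.uniform((1024, 1024))
    b = tf.matmul(a, a)
print(b.device)
```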

bertsky (Contributor) commented Feb 11, 2023

Sorry, error on my part. The cause was an incomplete CUDA/TF installation. I probably ran into #72 as well.

(I am on CUDA 11.7 though, and now it does work. So the note in the README might not be correct.)
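One way to check which CUDA/cuDNN versions a given TensorFlow wheel was built against, when the locally installed toolkit differs (tf.sysconfig.get_build_info is available from TF 2.3 on; the keys may be absent on CPU-only builds):

```python
import tensorflow as tf

# Reports the CUDA/cuDNN versions this TF wheel was compiled against,
# which must be compatible with the locally installed libraries.
info = tf.sysconfig.get_build_info()
print(info.get("cuda_version"), info.get("cudnn_version"))
```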

bertsky (Contributor) commented Feb 11, 2023

BTW, is there a particular reason for keeping the TF1-style session management? I found that if I remove it completely (including the explicit GC calls) and avoid repeated load_model calls by storing the model references on the Eynollah instance, it gets about 9% faster on average (while max RSS of course increases from 4 GB to 7 GB).
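A minimal sketch of that caching idea (attribute and method names are illustrative, not eynollah's actual code):

```python
from tensorflow.keras.models import load_model

class Eynollah:
    def __init__(self):
        # cache of loaded Keras models, keyed by file path
        self._models = {}

    def get_model(self, path):
        # load each model once and reuse the reference on later calls,
        # trading higher resident memory for fewer disk reads and graph rebuilds
        if path not in self._models:
            self._models[path] = load_model(path)
        return self._models[path]
```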

cneud added the GPU label Mar 31, 2023
cneud (Member) commented May 13, 2023

> BTW, is there a particular reason for keeping the TF1-style session management? I found that if I remove it completely (including the explicit GC calls) and avoid repeated load_model calls by storing the model references on the Eynollah instance, it gets about 9% faster on average (while max RSS of course increases from 4 GB to 7 GB).

This should already be fixed with 7345f6b (which has since been merged), right?

The working config for (limited) GPU use is now documented in the README.
