
Support GPU computing and progress track? #478

Closed
dyuliu opened this issue Jun 25, 2021 · 2 comments · Fixed by #588
Labels: question (General question about the software)

Comments

@dyuliu
Contributor

dyuliu commented Jun 25, 2021

General problem description:

I tested the SDV single-table models; my training data has shape (10000, 7).

This is a fairly small dataset. Note that we used all the default hyperparameter settings.

GaussianCopula is quick, finishing in just a few minutes.

TVAE returned results in around 5.5 hours.

CopulaGAN and CTGAN are slow. I've waited more than 20 hours and they are still running.

(1)
Is there a way to simply pass a parameter or change a setting so that the GAN models train on a GPU?

(2)
Given the long training process, I would strongly suggest letting us track training progress. For example, you could report how many epochs have been trained as well as the total expected number of epochs.

It would also be friendlier to accept a parameter on the model so that we can dump the training log to a file.

@dyuliu dyuliu added the pending review and question (General question about the software) labels Jun 25, 2021
@csala
Contributor

csala commented Jun 25, 2021

Hi @dyuliu, thanks for the comments!

Something strange must be happening there, because by default all the deep learning models try to use any CUDA device that is available to them.
Can you check whether torch.cuda.is_available() returns True, and whether nvidia-smi reports any GPU usage while CTGAN (or any of the other models) is running?
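A quick way to run that check, as a sketch (assumes PyTorch is installed):

```python
# Diagnostic sketch: confirm that PyTorch can see a CUDA device.
# If this prints False, the models will silently fall back to CPU training.
import torch

print(torch.cuda.is_available())
if torch.cuda.is_available():
    # Name of the first visible GPU
    print(torch.cuda.get_device_name(0))
```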

Regarding the parameters you suggest in (1) and (2), both of them are already available (you can see them in the API Reference):

  • For (1), all the DL models have a cuda argument, which defaults to True and lets you either enable/disable CUDA usage or explicitly tell the model which CUDA device to use (in case more than one is available).
  • For (2), the CTGAN-based models have a verbose parameter, which defaults to False; when set to True it makes CTGAN print the progress and losses at every epoch. Note that this one is not available on TVAE yet, though.

@dyuliu
Contributor Author

dyuliu commented Jun 25, 2021

Hi @csala, this is very helpful.

An additional question regarding the GPU: do you have any version requirements for CUDA or cuDNN?

It would be helpful if the library's documentation had a paragraph briefly introducing this, and also pointed out the CUDA version requirements, etc.
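For what it's worth, the CUDA and cuDNN versions that a given PyTorch build was compiled against can be queried directly; a small sketch (assumes PyTorch is installed):

```python
import torch

print(torch.__version__)               # PyTorch release
print(torch.version.cuda)              # CUDA version of this build (None on CPU-only builds)
print(torch.backends.cudnn.version())  # cuDNN version, or None if unavailable
```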
