Small models for 11GB GPUs #26

Closed
justanhduc opened this issue Nov 30, 2021 · 7 comments

@justanhduc

Hi. Thanks for open-sourcing this amazing project. I am trying to train the network, but I get an OOM error since I don't have a 16GB GPU. Could you please let me know which smaller models I can try on an 11GB GPU? Thanks so much!

@rinongal
Owner

Hey!

If you want to decrease memory use, the following are all viable options:

  1. Disable the layer-freezing module by setting auto_layer_iters to 0. If you're only doing texture-based changes, you probably don't need to freeze layers, and this can save you a good chunk of memory.
  2. Use a lower-resolution model (FFHQ 256, LSUN Church, etc.).
  3. Only use one of the two CLIP models (ViT-B/32 is better for global textures; ViT-B/16 is a bit better for local textures and shapes).
  4. Decrease n_sample (number of output images during training).

If you just want to play with the model and don't want to do things like dogs to cats, I'd start with options (1) and (4) since they might be enough. We managed to train an FFHQ 1024x1024 model on a 1080 Ti, so 11GB should probably be doable.
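
As a concrete starting point, here's a minimal sketch of a launch combining (1), (3) and (4). The flag names are assumptions based on the list above, so verify them against train.py before running:

```python
# Minimal sketch of a low-memory training launch. The flag names below are
# assumptions; check train.py in this repo for the actual argument names.
import subprocess

subprocess.run([
    "python", "train.py",
    "--auto_layer_iters", "0",    # option (1): disable the layer-freezing module
    "--clip_models", "ViT-B/32",  # option (3): keep only the global-texture CLIP model
    "--n_sample", "4",            # option (4): fewer output images during training
], check=True)
```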

@justanhduc
Author

Hi @rinongal. Thanks for your tips. Indeed, (1) alone already saved a lot of memory and made training fit on a single 11GB GPU. However, the output quality seems not as good as the original version, which I checked via Colab. (3) didn't make much difference as far as I observed, and (4) alone couldn't make training fit either. I guess (2) would then be the most suitable option if I want to keep the same translation quality, am I correct?

@rinongal
Owner

rinongal commented Dec 1, 2021

You could try combining (1) with a lower learning rate and more iterations. Some previous issues reported better results from reduced learning rates when training with style-image targets; it might help in your case as well.
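
For example (a rough sketch; the flag names and baseline values here are assumptions, not the repo's defaults):

```python
# Rough sketch, assuming flags along these lines exist in train.py:
# halve the learning rate and double the iterations relative to your baseline.
base_lr = 0.002     # placeholder for your current learning rate
base_iters = 300    # placeholder for your current iteration count

tuned_flags = [
    "--lr", str(base_lr / 2),        # lower lr for more stable style updates
    "--iter", str(base_iters * 2),   # more steps to compensate for smaller updates
    "--auto_layer_iters", "0",       # keep option (1) so training fits in 11GB
]
print(" ".join(tuned_flags))
```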

Other than that, I'm afraid (2) might be your best option for reducing memory requirements.

@rinongal
Owner

rinongal commented Dec 1, 2021

What options did you run in the Colab, btw? The layer freezing isn't enabled there by default (it's only turned on if you click on improve shape).

@justanhduc
Author

> What options did you run in the Colab, btw? The layer freezing isn't enabled there by default (it's only turned on if you click on improve shape).

Oops, the results I used as a reference weren't with improve shape. I thought improve shape would only enable mixing noise. So is the config without improve shape in Colab exactly equivalent to (1)?

@rinongal
Owner

rinongal commented Dec 1, 2021

The config without improve shape in Colab is (1), plus only ViT-B/32 (so (3)), and no mixing.
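
In flag form, that would look roughly like this (names are assumptions again; check train.py):

```python
# Hedged sketch of the Colab default (no "improve shape"): layer freezing
# disabled, a single ViT-B/32 CLIP model, and no mixing. Flag names are
# assumptions based on this thread, not verified against train.py.
colab_default_flags = [
    "--auto_layer_iters", "0",    # (1): layer freezing off
    "--clip_models", "ViT-B/32",  # (3): single CLIP model
    "--mixing", "0.0",            # no style mixing
]
```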

@rinongal
Owner

Closing due to lack of activity. Feel free to re-open if you need additional help.
