
What params should I use for the 128 to 1024 task, and for the 512 to 1024 task? How should I choose the Unet architecture? #73

Closed
roey1rg opened this issue Jul 29, 2022 · 5 comments

Comments

@roey1rg

roey1rg commented Jul 29, 2022

Specifically, I'm asking about the following section in the config file:

"unet": {
    "in_channel": 6,
    "out_channel": 3,
    "inner_channel": 64,
    "channel_multiplier": [?????????],
    "attn_res": [?????????],
    "res_blocks": ?????????,
    "dropout": ?????????
},
@xenova

xenova commented Jul 31, 2022

I assume you've looked at the hyperparameters used in the paper (https://arxiv.org/pdf/2104.07636.pdf), so for reference, here is what they used:
[Image: table of hyperparameters from the SR3 paper]

Regarding dropout, they mention:

We use a dropout rate of 0.2 for 16×16 → 128×128 super-resolution models, but otherwise, we do not use dropout.

Although your sizes are slightly different, I would imagine you could "interpolate/extrapolate" and choose hyperparameters accordingly. The general trend appears to be that more parameters are needed for very-small-to-medium sized images, and fewer are required for medium-to-large images.

We use 625M parameters for our 64×64 → {256×256, 512×512} models, 550M parameters for the 16×16 → 128×128 models, and 150M parameters for 256×256 → 1024×1024 model.

If I had to estimate, you could probably use the following:

  • 512x512 -> 1024x1024
    • Channel Dim = 32
    • Depth multipliers = {1, 2, 4, 8, 16, 32, 32}
    • ResNet blocks = 2 # Maybe 3
    • Dropout = 0
  • 128x128 -> 1024x1024
    • Channel Dim = 16 # I haven't seen anything lower than this
    • Depth multipliers = {1, 2, 4, 8, 16, 16, 32, 32, 32} # Might need to play around with this
    • ResNet blocks = 2
    • Dropout = 0

Of course, as with any ML task, hyperparameter tuning is highly task-specific, so you will undoubtedly need to play around with the values.
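
If it helps, here is roughly how the 512x512 -> 1024x1024 estimate above might map onto the config snippet from your question. I'm assuming "Channel Dim" corresponds to inner_channel, and the attn_res value is just a guess on my part rather than something from the paper or this discussion:

"unet": {
    "in_channel": 6,
    "out_channel": 3,
    "inner_channel": 32,
    "channel_multiplier": [1, 2, 4, 8, 16, 32, 32],
    "attn_res": [16],
    "res_blocks": 2,
    "dropout": 0
},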

@roey1rg
Author

roey1rg commented Aug 2, 2022

Thank you for the detailed answer!
I tried to run the 512-to-1024 setup you proposed and ran into CUDA out-of-memory errors, so I changed the depth multipliers from
[1, 2, 4, 8, 16, 32, 32] to
[1, 2, 4, 8, 8, 16, 32] and I managed to start the training.
Do you think that is reasonable?

I'm using 4 GPUs with 24GB of memory each and I set the batch size to 4 (one image per GPU)
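
In config terms, the unet block I ended up training with is the same as the sketch above except for this line (assuming the same key names as in my original snippet), with the batch size of 4 split across the 4 GPUs:

    "channel_multiplier": [1, 2, 4, 8, 8, 16, 32],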

@xenova

xenova commented Aug 2, 2022

Sure, that sounds reasonable 👍 Let me know how it goes. 👌

@Janspiry
Owner

Feel free to reopen the issue if there is any question.

@huchi00057

huchi00057 commented Feb 8, 2023

  • Channel Dim = 32
  • Depth multipliers = {1, 2, 4, 8, 16, 32, 32}

Thanks for the explanation! But I still have a question.

Which parameter represents Channel Dim in this code?
