
What params should I use for the 128 to 1024 task, and for the 512 to 1024 task? How should I choose the Unet architecture? #73

Closed
roey1rg opened this issue Jul 29, 2022 · 5 comments

Comments

@roey1rg

roey1rg commented Jul 29, 2022

Specifically, I'm asking about the following section in the config file:

"unet": {
    "in_channel": 6,
    "out_channel": 3,
    "inner_channel": 64,
    "channel_multiplier": [?????????],
    "attn_res": [?????????],
    "res_blocks": ?????????,
    "dropout": ?????????
},
@xenova

xenova commented Jul 31, 2022

I assume you've looked at the hyperparameters used in the paper (https://arxiv.org/pdf/2104.07636.pdf), so for reference, here is what they used:
[Image: table of hyperparameters from the SR3 paper]

Regarding dropout, they mention:

We use a dropout rate of 0.2 for 16×16 → 128×128 super-resolution models, but otherwise, we do not use dropout.

Although your sizes are slightly different, I would imagine you could "interpolate/extrapolate" and choose hyperparameters accordingly. The general trend appears to be that more parameters are needed for very-small-to-medium sized images, and fewer are required for medium-to-large images.

We use 625M parameters for our 64×64 → {256×256, 512×512} models, 550M parameters for the 16×16 → 128×128 models, and 150M parameters for 256×256 → 1024×1024 model.

If I had to estimate, you could probably use the following:

  • 512x512 -> 1024x1024
    • Channel Dim = 32
    • Depth multipliers = {1, 2, 4, 8, 16, 32, 32}
    • ResNet blocks = 2 # Maybe 3
    • Dropout = 0
  • 128x128 -> 1024x1024
    • Channel Dim = 16 # I haven't seen anything lower than this
    • Depth multipliers = {1, 2, 4, 8, 16, 16, 32, 32, 32} # Might need to play around with this
    • ResNet blocks = 2
    • Dropout = 0

Of course, as with any ML task, hyperparameter tuning is highly task-specific, so you will undoubtedly need to play around with the values.
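
If it helps, here is roughly how the 512x512 -> 1024x1024 estimate above might map onto the config snippet from your question. I'm assuming "Channel Dim" corresponds to inner_channel, and the attn_res value is just a guess on my part rather than something from the paper or this discussion:

"unet": {
    "in_channel": 6,
    "out_channel": 3,
    "inner_channel": 32,
    "channel_multiplier": [1, 2, 4, 8, 16, 32, 32],
    "attn_res": [16],
    "res_blocks": 2,
    "dropout": 0
},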

@roey1rg
Author

roey1rg commented Aug 2, 2022

Thank you for the detailed answer!
I tried to run the 512-to-1024 setup you proposed and ran into CUDA out-of-memory errors, so I changed the depth multipliers from
[1, 2, 4, 8, 16, 32, 32] to
[1, 2, 4, 8, 8, 16, 32] and I managed to start the training.
Do you think that is reasonable?

I'm using 4 GPUs with 24GB of memory each and I set the batch size to 4 (one image per GPU)
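
In config terms, the unet block I ended up training with is the same as the sketch above except for this line (assuming the same key names as in my original snippet), with the batch size of 4 split across the 4 GPUs:

    "channel_multiplier": [1, 2, 4, 8, 8, 16, 32],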

@xenova

xenova commented Aug 2, 2022

Sure, that sounds reasonable 👍 Let me know how it goes. 👌

@Janspiry
Owner

Feel free to reopen the issue if there is any question.

@huchi00057

huchi00057 commented Feb 8, 2023

  • Channel Dim = 32
  • Depth multipliers = {1, 2, 4, 8, 16, 32, 32}

Thanks for the explanation! But I still have a question.

Which parameter represents Channel Dim in this code?
