Right set of UNet hyperparameters when training DDPM

Hi there!
I am currently training a DDPM model on a custom image dataset following the cool [unconditional_image_generation](https://github.com/huggingface/diffusers/blob/main/examples/unconditional_image_generation/train_unconditional.py) example script.

Since I don't have the compute for comprehensive hyperparameter tuning of my architecture, I was wondering whether there are any common rules of thumb for designing the `UNet` denoiser: **width and depth of the residual blocks, number and positions of the attention blocks, etc., as a function of the number of samples in the training set and of their resolution**.
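To make the knobs in question concrete, here is a minimal sketch of how they map onto the constructor arguments of `UNet2DModel` in `diffusers`. The helper `sketch_unet_kwargs` and its `attn_resolutions` parameter are hypothetical names introduced only for illustration; the default widths mirror what the example script appeared to use when this was written, and placing attention at the 16x16 feature map is only an assumed starting point, not a recommendation.

```python
def sketch_unet_kwargs(resolution, widths=(128, 128, 256, 256, 512, 512),
                       layers_per_block=2, attn_resolutions=(16,)):
    """Build keyword arguments for diffusers' UNet2DModel (hypothetical helper).

    widths            -> `block_out_channels`: channel width of each level
    layers_per_block  -> number of residual layers per level (depth)
    attn_resolutions  -> feature-map sizes at which attention blocks are used
    """
    down_types = []
    feature_res = resolution
    for level in range(len(widths)):
        # Use an attention block when this level sees a listed feature-map size.
        use_attn = feature_res in attn_resolutions
        down_types.append("AttnDownBlock2D" if use_attn else "DownBlock2D")
        # Every level except the last halves the spatial resolution.
        if level < len(widths) - 1:
            feature_res //= 2

    # The up path mirrors the down path in reverse order.
    up_types = tuple(t.replace("Down", "Up") for t in reversed(down_types))

    return dict(
        sample_size=resolution,
        in_channels=3,   # assumption: RGB images
        out_channels=3,
        layers_per_block=layers_per_block,
        block_out_channels=tuple(widths),
        down_block_types=tuple(down_types),
        up_block_types=up_types,
    )


# Usage (requires `diffusers`; left commented so the sketch runs without it):
# from diffusers import UNet2DModel
# model = UNet2DModel(**sketch_unet_kwargs(256))
```

With these assumptions, the things I would be tuning are `widths`, `layers_per_block`, and `attn_resolutions`, which is why I am asking how they should scale with dataset size and image resolution.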

If anyone has extensive experience training diffusion models (DMs), it would be super cool to share insights here or in a dedicated blog post, like the one [discussing hyperparameter choices when training Dreambooth](https://huggingface.co/blog/dreambooth).

Thank you! 🤗

Right set of UNet hyperparameters when training DDPM #1318

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Right set of UNet hyperparameters when training DDPM #1318

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions