
Quality of the generated images #4

Open
stsavian opened this issue Feb 15, 2023 · 3 comments
stsavian commented Feb 15, 2023

Dear @cloneofsimo and @SilenceMonk, thanks so much for this code! It is very helpful and precisely the missing piece I needed to understand diffusion models better. I also appreciated that I could run the CIFAR10 training without any code modification.
I am playing around with your code to better understand the guided_diffusion repository, which I find too complex and need to simplify.

I have trained on CIFAR10 and obtained the following results after 100 epochs:
[Image: ddpm_sample_cifar99]

As you can see, the prediction quality seems quite far from the ground truth.
I plan to extend your code to higher-resolution images; however, I am hesitant, as I cannot tell whether the network is actually learning. I would like to extend the code while maintaining convergence.

i) Is this behavior normal? Is there some critical hyperparameter to tune to obtain clearer images?
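(As an aside on hyperparameter-adjacent tricks: one thing that is known to strongly affect DDPM sample quality is keeping an exponential moving average (EMA) of the model weights during training and sampling from the averaged copy, as in the original DDPM paper. A minimal pure-Python sketch of the update rule, with illustrative names and plain floats standing in for parameter tensors:)

```python
def ema_update(ema_params, model_params, decay=0.9999):
    # One EMA step: ema <- decay * ema + (1 - decay) * current.
    # Plain lists of floats for illustration; in practice you would
    # loop over the model's parameter tensors after each optimizer step.
    return [decay * e + (1.0 - decay) * p
            for e, p in zip(ema_params, model_params)]

# Toy usage with an aggressive decay so the effect is visible:
ema = [0.0, 0.0]
for _ in range(3):
    current = [1.0, 2.0]  # pretend these are freshly updated weights
    ema = ema_update(ema, current, decay=0.5)
print(ema)  # [0.875, 1.75]
```

(Sampling from the EMA copy rather than the raw weights typically smooths out exactly the kind of epoch-to-epoch instability described below.)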

UPDATE: I have trained on celebA and obtained the following results after 21 epochs (approx. 14 hours on a 3090):
[Image: ddpm_sample_celeba021]
The celebA results already look better than the CIFAR10 ones, but I might need more training epochs, because the generated images are still far from the ground truth.

Still referring to the celebA results, you can see in the following image that the generated images sometimes collapse to a constant-color background (below is celebA after 19 epochs):
[Image: ddpm_sample_celeba020]
This issue is similar to openai/guided-diffusion#81 .

Furthermore, the training does not progress monotonically: at epoch 22 of celebA, the network again outputs smooth predictions with no structure.

[Image: ddpm_sample_celeba022]

So overall I am not getting the training stability I was expecting. These results are (unfortunately) consistent with my issues with the guided_diffusion repository: openai/guided-diffusion#42.

ii) Do you have any comment that could help overcome this issue?

Thanks again for your help!
Stefano

@cloneofsimo (Owner)

There are many more considerations involved in making this work on a larger-scale dataset like CelebA.

Please don't expect it to work stably on anything beyond the MNIST and perhaps CIFAR10 examples. I'm wondering why CIFAR10 didn't work for you, though; you should be able to reproduce the results in the readme.

Things you should probably do to get this code to work beyond toy datasets:

  • Better optimizer, batch size, and noise scheduler
  • Better model, probably one that isn't as naïve as this one
  • Loss reweighting with a timestep distribution
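(On the loss-reweighting point: one published scheme is "P2 weighting", which scales the per-timestep epsilon-prediction MSE by a function of the signal-to-noise ratio so that easy, high-SNR timesteps are down-weighted. A minimal numpy sketch, assuming a standard linear beta schedule; the function names here are illustrative, not from this repo:)

```python
import numpy as np

def linear_beta_schedule(T=1000, beta_1=1e-4, beta_T=0.02):
    # Standard DDPM linear noise schedule.
    return np.linspace(beta_1, beta_T, T)

def snr(betas):
    # Per-timestep signal-to-noise ratio: alpha_bar_t / (1 - alpha_bar_t).
    alphas_cumprod = np.cumprod(1.0 - betas)
    return alphas_cumprod / (1.0 - alphas_cumprod)

def p2_loss_weights(betas, gamma=1.0, k=1.0):
    # P2-style reweighting: 1 / (k + SNR)^gamma down-weights easy
    # (high-SNR, low-noise) timesteps so training focuses on the
    # noisier, perceptually important ones.
    return 1.0 / (k + snr(betas)) ** gamma

betas = linear_beta_schedule()
weights = p2_loss_weights(betas)
# weights[t] multiplies the per-sample epsilon-MSE at timestep t
# before averaging over the batch.
```

(With these defaults the weights grow monotonically with t, since SNR decreases as more noise is added.)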

@stsavian (Author)

@cloneofsimo thanks so much for your advice!

@paulaceccon
I couldn't reproduce the results for CIFAR10 either. Strangely, running the code from this repo gives me only noise images (I'm trying to find the issue). Given that it works for MNIST, and the only difference is the UNet, I'd guess the problem might be there...

[Image: download]
[Image: ddpm_sample_cifar0]
