
After training the sample pictures get some weird color tints #81

Open
benjamin-bertram opened this issue Nov 7, 2022 · 11 comments

@benjamin-bertram

Sometimes during training I get weird color tints in my sampled pictures, while the original data has no tints at all.
Is there a reason for this, and how can I avoid it?

Original data is like:
[image: 40SMA0013_1Y5_01]

And the output is:
[image: 8c0b5bd2-c316-4c06-968c-5d4d2b0607b3]

@smy1999

smy1999 commented Dec 8, 2022

Have you solved the problem? I've got the same one.

@zengxianyu

I also found degradation in image quality after fine-tuning on the same dataset (I'm using LSUN horse at 256×256 resolution).

@stsavian

I am also facing a similar problem!

@osmanmusa

Same here ...

@stsavian

stsavian commented Feb 2, 2023

Hello, I have found that predicting the target ($x_0$) instead of the noise ($\epsilon$) dramatically reduces the phenomenon.
Have you tried setting predict_xstart to true?

Looking forward to your feedback,
Stefano
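
For reference, here is a minimal sketch of what this flag changes, written against the standard DDPM objective. It is an illustration, not the repo's actual training code; `model` and `alphas_cumprod` are assumed to be supplied by the training loop.

```python
import torch

def ddpm_loss(model, x_0, alphas_cumprod, predict_xstart=False):
    # Sample a timestep per image and noise; form x_t via the forward process
    # x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * eps.
    b = x_0.shape[0]
    t = torch.randint(0, len(alphas_cumprod), (b,), device=x_0.device)
    a_bar = alphas_cumprod[t].view(b, 1, 1, 1)
    eps = torch.randn_like(x_0)
    x_t = a_bar.sqrt() * x_0 + (1.0 - a_bar).sqrt() * eps
    # The flag only changes the regression target, not the network:
    target = x_0 if predict_xstart else eps
    return ((model(x_t, t) - target) ** 2).mean()
```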

@ONobody

ONobody commented Mar 2, 2023

@zengxianyu How do you fine-tune? Thanks.

@tobiasbrinker

Hi, for me, predicting only the mean (instead of the mean and variance) by setting learn_sigma=False solved the problem.
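
For context, a sketch of what learn_sigma typically controls in guided-diffusion-style UNets (an illustration of the convention, not the repo's exact code): with learn_sigma=True the network outputs 2*C channels, the extra C parameterizing the per-pixel variance, while with learn_sigma=False it outputs only the C mean-prediction channels and the reverse-process variance is fixed from the noise schedule.

```python
import torch

def split_model_output(out: torch.Tensor, learn_sigma: bool):
    # out has shape (B, 2*C, H, W) when learn_sigma=True, else (B, C, H, W).
    if learn_sigma:
        # First C channels: the mean prediction (eps or x_0);
        # last C channels: the learned per-pixel variance term.
        mean_pred, var_pred = torch.chunk(out, 2, dim=1)
        return mean_pred, var_pred
    # learn_sigma=False: only the mean is learned; the variance comes
    # from the fixed noise schedule instead.
    return out, None
```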

@Walleeeda

Just training longer fixed it for me.

@sibasmarak

sibasmarak commented Oct 31, 2023

@stsavian @Walleeeda

For me, neither training longer nor predict_xstart=True has solved the problem (I am using the LSUN Church Outdoor dataset). I am training with learn_sigma=False now, although I had kept it for last since the paper shows that predicting the variance should help.

Update: none of the suggested solutions here works for me; I always get weird tints.

Are there any additional tricks to use while sampling from models that have been trained with predict_xstart=True? Currently, the samples are just pitch-black images. Also, it is worth mentioning that the loss $q_0 \ll q_3$ in this case (the reverse of the default case, predict_xstart=False).

@MitcML

MitcML commented Nov 2, 2023

Same here; I am using the LSUN bedroom model.

@sibasmarak

Hi, I have solved the problem (technically it was @stsavian's idea, but I will add my own observations).

TL;DR
The solution is to predict $x_0$ (predict_xstart=True) while also trying out several hyperparameters (notably image_size, num_channels, num_head_channels). Also, rescale_learned_sigmas=False worked better for me.
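
As a concrete illustration, such a flag combination might look like the sketch below. The names follow the repo's argument conventions as used in this thread, but the values are hypothetical placeholders rather than a verified working configuration.

```python
# Hypothetical flag combination following the TL;DR above; values are
# placeholders to sweep, not the exact settings from this thread.
flags = dict(
    image_size=64,              # must match your data; worth sweeping
    num_channels=128,           # UNet base width; worth sweeping
    num_head_channels=64,       # attention head size; worth sweeping
    predict_xstart=True,        # regress x_0 directly instead of the noise
    learn_sigma=False,          # optional: predict the mean only
    rescale_learned_sigmas=False,
)
```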


Some prior context: my custom dataset has a black background, with the content in different colours (imagine MNIST, but the digits are differently coloured and the images have three channels).

The sampling process calls q_posterior_mean(), which requires $x_0$ (a.k.a. $x_{start}$). The default training setting derives $x_0$ from the predicted noise (see here), which is not that accurate: I observed that, from the noise, the model predicts an $x_0$ with a uniform background but cannot recover the exact background colour. However, this default setting might work well on a dataset with enough background diversity, trained for longer steps with proper hyperparameter settings.
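
The step linked above amounts to inverting the forward process $x_t = \sqrt{\bar\alpha_t} x_0 + \sqrt{1-\bar\alpha_t} \epsilon$ for $x_0$. A sketch of that algebra (a paraphrase, not the repo's exact code):

```python
import torch

def xstart_from_eps(x_t, t, eps, alphas_cumprod):
    # Invert x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * eps for x_0.
    a_bar = alphas_cumprod[t].view(-1, 1, 1, 1)
    # When t is large, a_bar_t is small, so any error in the predicted eps
    # is strongly amplified here; this is one reason the indirect x_0
    # estimate can miss details like the exact background colour.
    return (x_t - (1.0 - a_bar).sqrt() * eps) / a_bar.sqrt()
```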

The other training setting (predict_xstart=True) predicts $x_0$ directly instead of the noise, and is hence better at estimating $x_0$ during sampling. However, there can be training instability (limited model expressivity, NaN loss). For me, with incorrect hyperparameter settings it collapsed entirely into all-black samples with no content.
