
Sampling algorithm differs from paper. #5

Open
ariel415el opened this issue May 20, 2021 · 5 comments

@ariel415el

ariel415el commented May 20, 2021

Hi,
I want to elaborate on #2:
The sampling algorithm in your code is a bit different from what is shown in the paper.

The paper suggests this sample step:
[screenshot of the sampling step from the paper]

while you do this:
[screenshot of the sampling step as implemented in the code]

The clipping is done here

x_recon = tf.clip_by_value(x_recon, -1., 1.)

Now I checked, and indeed, without the clipping the two equations are the same.
Can you give any interpretation or intuition for the clipping and why it is needed?
It seems to be crucial for training, yet it is not mentioned in the paper.
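
For reference, here is the check, written out with the standard DDPM quantities ($\alpha_t = 1-\beta_t$, $\bar\alpha_t = \prod_{s\le t}\alpha_s$); the two equations below are my reconstruction of what the screenshots above show.

The paper's step (Algorithm 2):

$x_{t-1} = \frac{1}{\sqrt{\alpha_t}}\Big(x_t - \frac{1-\alpha_t}{\sqrt{1-\bar\alpha_t}}\,\epsilon_\theta(x_t,t)\Big) + \sigma_t z$

The code's step (predict $x_0$, optionally clip it, then take the posterior mean of $q(x_{t-1}\mid x_t, x_0)$):

$\hat{x}_0 = \frac{x_t - \sqrt{1-\bar\alpha_t}\,\epsilon_\theta(x_t,t)}{\sqrt{\bar\alpha_t}}, \qquad x_{t-1} = \frac{\sqrt{\bar\alpha_{t-1}}\,\beta_t}{1-\bar\alpha_t}\,\hat{x}_0 + \frac{\sqrt{\alpha_t}\,(1-\bar\alpha_{t-1})}{1-\bar\alpha_t}\,x_t + \sigma_t z$

Substituting the unclipped $\hat{x}_0$ and using $\sqrt{\bar\alpha_{t-1}}/\sqrt{\bar\alpha_t} = 1/\sqrt{\alpha_t}$, the coefficient on $x_t$ collapses to $\big(\beta_t + \alpha_t(1-\bar\alpha_{t-1})\big)/\big(\sqrt{\alpha_t}(1-\bar\alpha_t)\big) = 1/\sqrt{\alpha_t}$ and the coefficient on $\epsilon_\theta$ to $-(1-\alpha_t)/\big(\sqrt{\alpha_t}\sqrt{1-\bar\alpha_t}\big)$, which is exactly the paper's update. The clipping is the only difference.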

Thanks

@malekinho8

Is there any update on this? In my experience this detail has been crucial in determining sample quality, yet it seems to be largely unaddressed with regard to diffusion models. Does anyone have any insight on this?

@Kaffaljidhmah2

In https://huggingface.co/blog/annotated-diffusion, the author says:

Note that the code above is a simplified version of the original implementation. We found our simplification (which is in line with Algorithm 2 in the paper) to work just as well as the original, more complex implementation, which employs clipping.
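
For concreteness, here is a minimal NumPy sketch of that simplified, no-clipping update from Algorithm 2. The names (betas, alphas = 1 - betas, alphas_cumprod = cumprod(alphas)) are mine, not the repo's, and model(x, t) is assumed to return the predicted noise:

import numpy as np

def p_sample_simple(model, x_t, t, betas, alphas, alphas_cumprod, rng):
    # Predict the noise, then form the Algorithm 2 mean directly (no clipping).
    eps = model(x_t, t)
    mean = (x_t - betas[t] / np.sqrt(1.0 - alphas_cumprod[t]) * eps) / np.sqrt(alphas[t])
    if t == 0:
        return mean  # no noise is added on the last step
    sigma_t = np.sqrt(betas[t])  # one of the two sigma_t choices discussed in the paper
    return mean + sigma_t * rng.standard_normal(x_t.shape)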

@varun-ml

The issue is that the predictions are often out of range, so the authors are trying to impose some sort of correction to get meaningful samples. To do that, they restrict x_reconstructed to the range -1 to +1 by clipping. Here is how they generate samples:

  1. Get the error prediction at step t
  2. Get the reconstructed image, i.e. x_recon, using the error prediction
  3. Clip x_recon, since we know x is in the range -1 to +1
  4. Using the clipped x_recon, generate x_{t-1}

Step 2 is done using the equation

$x_{recon} = \big(x_t - \sqrt{1-\bar\alpha_t}\,\epsilon_\theta(x_t, t)\big) / \sqrt{\bar\alpha_t}$

Step 4 is done using

$x_{t-1} = \frac{\sqrt{\bar\alpha_{t-1}}\,\beta_t}{1-\bar\alpha_t}\,x_{recon} + \frac{\sqrt{\alpha_t}\,(1-\bar\alpha_{t-1})}{1-\bar\alpha_t}\,x_t + \sigma_t z$

This is a hack and will lead to increased density at -1 and +1.
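
A minimal NumPy sketch of those four steps (names like alphas_cumprod_prev are mine, not the repo's; model(x, t) is assumed to return the predicted noise):

import numpy as np

def p_sample_clipped(model, x_t, t, betas, alphas, alphas_cumprod, alphas_cumprod_prev, rng):
    # 1. Get the error (noise) prediction at step t
    eps = model(x_t, t)
    # 2. Reconstruct x_0 from x_t and the predicted noise
    x_recon = (x_t - np.sqrt(1.0 - alphas_cumprod[t]) * eps) / np.sqrt(alphas_cumprod[t])
    # 3. Clip, since the training data was scaled to [-1, 1]
    x_recon = np.clip(x_recon, -1.0, 1.0)
    # 4. Posterior mean of q(x_{t-1} | x_t, x_0), evaluated at the clipped x_recon
    coef_x0 = np.sqrt(alphas_cumprod_prev[t]) * betas[t] / (1.0 - alphas_cumprod[t])
    coef_xt = np.sqrt(alphas[t]) * (1.0 - alphas_cumprod_prev[t]) / (1.0 - alphas_cumprod[t])
    mean = coef_x0 * x_recon + coef_xt * x_t
    if t == 0:
        return mean
    sigma_t = np.sqrt(betas[t])
    return mean + sigma_t * rng.standard_normal(x_t.shape)

Dropping the np.clip line makes this algebraically identical to the Algorithm 2 update; with it, out-of-range predictions get pulled back into the data range, which is where the extra density at -1 and +1 comes from.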

@ndvbd

ndvbd commented Jan 24, 2023

I don't see the definition of σ_t in the paper. Where is it mentioned and defined? And why do we need to add noise in the reverse process?

@wanghao-cst

> I don't see the definition of σ_t in the paper. Where is it mentioned and defined? And why do we need to add noise in the reverse process?

To make it a normal distribution.
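
If it helps: as far as I can tell, σ_t is introduced in Section 3.2 of the DDPM paper, where the reverse transition is modeled as $p_\theta(x_{t-1}\mid x_t) = \mathcal{N}\big(x_{t-1};\,\mu_\theta(x_t, t),\,\sigma_t^2 I\big)$, and the authors report that $\sigma_t^2 = \beta_t$ and $\sigma_t^2 = \tilde\beta_t = \frac{1-\bar\alpha_{t-1}}{1-\bar\alpha_t}\beta_t$ work about equally well. The noise $\sigma_t z$ is added because each reverse step is a sample from that Gaussian rather than just its mean; only the final step (t = 1 in Algorithm 2) is taken without noise.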
