
Regarding L2 norm clamping in Diffusion Prior #68

Closed
xiankgx opened this issue May 6, 2022 · 8 comments

@xiankgx

xiankgx commented May 6, 2022

Why do we clamp only during sampling and not during training? Shouldn't they match? Please enlighten me.

https://github.com/lucidrains/DALLE2-pytorch/blob/main/dalle2_pytorch/dalle2_pytorch.py#L843-L844

https://github.com/lucidrains/DALLE2-pytorch/blob/main/dalle2_pytorch/dalle2_pytorch.py#L859-L860

https://github.com/lucidrains/DALLE2-pytorch/blob/main/dalle2_pytorch/dalle2_pytorch.py#L885-L900
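To make the question concrete, here is a minimal sketch of the pattern (the names `image_embed_scale`, `clamp_pred_x0`, and `training_loss` are illustrative, not the repo's exact code): the predicted embedding is projected back onto a fixed-norm sphere only inside the sampling loop, while the training loss is computed on the raw prediction.

```python
import torch.nn.functional as F

def l2norm(t):
    # unit-normalize along the embedding dimension
    return F.normalize(t, dim=-1)

image_embed_scale = 512 ** 0.5  # hypothetical scale, e.g. sqrt(embedding dim)

def clamp_pred_x0(pred_x0):
    # sampling-time clamping: project the x0 prediction onto the
    # sphere of radius image_embed_scale before the next denoising step
    return l2norm(pred_x0) * image_embed_scale

def training_loss(pred, target):
    # training-time objective: plain regression, no clamping applied
    return F.mse_loss(pred, target)
```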

@xiankgx
Author

xiankgx commented May 6, 2022

Also, here we multiply by a scale without first applying l2norm.

https://github.com/lucidrains/DALLE2-pytorch/blob/main/dalle2_pytorch/dalle2_pytorch.py#L986

which is OK if we use XClip, because we are doing l2norm here:

https://github.com/lucidrains/DALLE2-pytorch/blob/main/dalle2_pytorch/dalle2_pytorch.py#L180

But we are not doing l2norm when using OpenAI CLIP.

https://github.com/lucidrains/DALLE2-pytorch/blob/main/dalle2_pytorch/dalle2_pytorch.py#L274-L275
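To illustrate the mismatch (a sketch under the assumptions above; the random tensors stand in for the actual encoder outputs):

```python
import torch
import torch.nn.functional as F

def l2norm(t):
    return F.normalize(t, dim=-1)

image_embed_scale = 512 ** 0.5  # illustrative, as in the sketch above

# XClip path: the embedding is unit-normalized before the prior scales it,
# so the scaled embedding always has norm == image_embed_scale
xclip_embed = l2norm(torch.randn(1, 512))
scaled = xclip_embed * image_embed_scale

# OpenAI CLIP adapter path (as quoted): the raw encoder output is scaled,
# so the resulting norm depends on the input
openai_embed = torch.randn(1, 512)  # stands in for clip.encode_image(image)
scaled_raw = openai_embed * image_embed_scale
```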

@lucidrains
Owner

@xiankgx good idea! i've added it here 14e63a3, although i think the whole l2norm clamping thing is not proven out yet
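(a sketch of the idea only, not the contents of the commit itself: the same clamp applied to the x0 prediction inside the training objective, so training and sampling match)

```python
import torch.nn.functional as F

def clamped_training_loss(pred_x0, target, image_embed_scale=512 ** 0.5):
    # hypothetical: clamp the prediction during training the same way
    # it is clamped during sampling
    pred_x0 = F.normalize(pred_x0, dim=-1) * image_embed_scale
    return F.mse_loss(pred_x0, target)
```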

@lucidrains
Owner

Also, here we multiply by a scale without first applying l2norm.

https://github.com/lucidrains/DALLE2-pytorch/blob/main/dalle2_pytorch/dalle2_pytorch.py#L986

which is OK if we use XClip, because we are doing l2norm here:

https://github.com/lucidrains/DALLE2-pytorch/blob/main/dalle2_pytorch/dalle2_pytorch.py#L180

But we are not doing l2norm when using OpenAI CLIP.

https://github.com/lucidrains/DALLE2-pytorch/blob/main/dalle2_pytorch/dalle2_pytorch.py#L213

https://github.com/lucidrains/DALLE2-pytorch/blob/main/dalle2_pytorch/dalle2_pytorch.py#L213 ohh, this isn't OpenAIClip, it is actually from CoCa https://arxiv.org/abs/2205.01917, which debuted yesterday. i think it is a better version of clip

however, it is unclear from the CoCa paper whether they apply l2norm before the cosine-similarity contrastive loss

in the paper, it seems they use layernorms on both the image and text cls tokens, but i'm not sure if the l2norm is present
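for reference, the setup in question looks roughly like this (a sketch; whether CoCa inserts the explicit l2norm after its layernorms is exactly the open question):

```python
import torch.nn.functional as F

def contrastive_logits(image_cls, text_cls, temperature=0.07):
    # layernorm on the cls tokens does not yield unit vectors, so an
    # explicit l2norm is still needed for a true cosine similarity
    image_embed = F.normalize(image_cls, dim=-1)
    text_embed = F.normalize(text_cls, dim=-1)
    return image_embed @ text_embed.t() / temperature
```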

@xiankgx
Author

xiankgx commented May 6, 2022

Sorry, wrong line quote.

@xiankgx
Author

xiankgx commented May 6, 2022

Lol, don't take my word for it, I'm a newbie in diffusion models.

@lucidrains
Owner

newbie

@xiankgx same, i think we all are, except for a few researchers around the world and maybe @crowsonkb lol

you are right! https://github.com/openai/CLIP/blob/main/clip/model.py#L364 they normalized it outside of the encoding functions, let me fix it now 🙏
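for anyone following along: in the linked OpenAI code the normalization happens in CLIP.forward, not in encode_image / encode_text, so an adapter that calls the encoders directly has to replicate it. roughly (a sketch, not the actual fix):

```python
import torch.nn.functional as F

# OpenAI's CLIP.forward normalizes after calling the encoders, roughly:
#   image_features = image_features / image_features.norm(dim=1, keepdim=True)
# an adapter calling encode_image directly must add that step itself:
def adapter_embed_image(clip_model, image):
    image_embed = clip_model.encode_image(image)
    return F.normalize(image_embed, dim=-1)
```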

@xiankgx
Author

xiankgx commented May 6, 2022

Maybe we can ask crowsonkb for advice.

