
Diffusion prior training trial run #86

Closed

xiankgx opened this issue May 11, 2022 · 3 comments

xiankgx commented May 11, 2022

All l2norm-related settings in DiffusionPrior() are set to False, as in the defaults, and image_embed_scale is set to 1.0 to disable any scaling. CLIP image embeddings are from OpenAI's ViT-B/32 without any l2norm; the dataset is cc3m. @lucidrains What do you think of the results? Seems to be working?
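(For readers unfamiliar with these two knobs: this is not the repository's code, just a plain-Python sketch of what the disabled operations would do, under the standard definitions of L2 normalization and embedding scaling.)

```python
import math

def l2norm(vec):
    # Rescale a vector to unit L2 norm -- the operation the l2norm flags
    # would apply to the embeddings when enabled.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

# With the settings described above, neither transform changes the embedding:
# - l2norm flags are False, so embeddings are consumed as-is;
# - image_embed_scale = 1.0, so multiplying by the scale is a no-op.
image_embed = [0.5, -1.0, 2.0]   # toy stand-in for a CLIP image embedding
image_embed_scale = 1.0
scaled = [x * image_embed_scale for x in image_embed]
assert scaled == image_embed  # scaling disabled
```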

L1 training loss
[plot]

Training batch CLIP image embed to sampled CLIP image embed (from training batch text) L2 loss
[plot]

Training batch CLIP text embed to sampled CLIP image embed (from training batch text) softmax accuracy
[plot]

Training batch CLIP text embed to sampled CLIP image embed (from training batch text) cosine similarity
[plot]
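The three evaluation metrics tracked above can be sketched in plain Python. These are not the repository's implementations, just minimal versions under the usual definitions: mean squared L2 distance, cosine similarity, and top-1 contrastive ("softmax") accuracy, where each text embedding is counted correct if its own sampled image embedding is the most similar one in the batch.

```python
import math

def l2_loss(a, b):
    # Mean squared L2 distance between a target embedding and a sampled one.
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def cosine_similarity(a, b):
    # Cosine of the angle between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def softmax_accuracy(text_embeds, sampled_image_embeds):
    # Top-1 contrastive accuracy: for each text embedding i, check whether
    # sampled_image_embeds[i] is its most-similar embedding in the batch.
    hits = 0
    for i, t in enumerate(text_embeds):
        sims = [cosine_similarity(t, im) for im in sampled_image_embeds]
        if max(range(len(sims)), key=sims.__getitem__) == i:
            hits += 1
    return hits / len(text_embeds)
```

A rising softmax accuracy and cosine similarity alongside a falling L2 loss, as in the plots above, is the pattern you would expect if the prior is learning to map text embeddings to matching image embeddings.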

lucidrains (Owner) commented May 13, 2022

@xiankgx yea i think it works :) also, @rom1504 told me some group out there has already trained a prior using the code in this repository, for their own CLIP generations, preprint on the way

egeozsoy commented
Any recommendations on batch size?

lucidrains (Owner) commented

@egeozsoy
[Screenshot: table 3 from the paper's appendix]

table 3 in the appendix says they used 4096. but evidently people are training it successfully on much smaller batch sizes


3 participants