All l2norm-related settings in `DiffusionPrior()` are set to `False` (the defaults), and `image_embed_scale` is set to `1.0` to disable any scaling. The CLIP image embeddings are from OpenAI's ViT-B/32 without any l2norm, and the dataset is CC3M. @lucidrains What do you think of the results? Seems to be working?
Plots (captions only):
- L1 training loss
- L2 loss between the training-batch CLIP image embed and the CLIP image embed sampled from the training-batch text
- Softmax accuracy between the training-batch CLIP text embed and the CLIP image embed sampled from the training-batch text
- Cosine similarity between the training-batch CLIP text embed and the CLIP image embed sampled from the training-batch text
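For reference, the last two diagnostics above can be computed from a batch of embeddings along these lines. This is a hedged sketch, not code from this repository: `embed_metrics` is a hypothetical helper, and it assumes "softmax accuracy" means each text embed's nearest neighbour among the sampled image embeds is its own matched sample (the usual CLIP-style contrastive accuracy).

```python
import numpy as np

def embed_metrics(text_embeds, sampled_image_embeds):
    """Hypothetical helper (not from the repo): given a batch of CLIP text
    embeddings and the image embeddings the prior sampled from those texts,
    return (softmax accuracy, mean cosine similarity of matched pairs)."""
    # Normalize rows so dot products become cosine similarities.
    t = text_embeds / np.linalg.norm(text_embeds, axis=-1, keepdims=True)
    s = sampled_image_embeds / np.linalg.norm(sampled_image_embeds, axis=-1, keepdims=True)
    sims = t @ s.T  # (batch, batch) text-to-sample similarity matrix
    # Accuracy: each text embed should be most similar to its own sample (the diagonal).
    accuracy = float(np.mean(np.argmax(sims, axis=-1) == np.arange(len(sims))))
    # Mean cosine similarity of matched text/sample pairs.
    mean_cos = float(np.mean(np.diag(sims)))
    return accuracy, mean_cos
```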
@xiankgx yea i think it works :) also, @rom1504 told me some group out there has already trained a prior using the code in this repository, for their own CLIP generations, preprint on the way