All l2norm-related settings in `DiffusionPrior()` are set to `False` (the defaults), and `image_embed_scale` is set to `1.0` to disable any scaling. The CLIP image embeddings are from OpenAI's ViT-B/32 without any l2norm, and the dataset is CC3M. @lucidrains What do you think of the results? Seems to be working?
Plots (captions only):
- L1 training loss
- L2 loss between the training-batch CLIP image embed and the CLIP image embed sampled from the training-batch text
- Softmax accuracy between the training-batch CLIP text embed and the CLIP image embed sampled from the training-batch text
- Cosine similarity between the training-batch CLIP text embed and the CLIP image embed sampled from the training-batch text
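For reference, the last two diagnostics above can be computed from a batch of embeddings along these lines. This is a hedged sketch, not code from this repository: `embed_metrics` is a hypothetical helper, and it assumes "softmax accuracy" means each text embed's nearest neighbour among the sampled image embeds is its own matched sample (the usual CLIP-style contrastive accuracy).

```python
import numpy as np

def embed_metrics(text_embeds, sampled_image_embeds):
    """Hypothetical helper (not from the repo): given a batch of CLIP text
    embeddings and the image embeddings the prior sampled from those texts,
    return (softmax accuracy, mean cosine similarity of matched pairs)."""
    # Normalize rows so dot products become cosine similarities.
    t = text_embeds / np.linalg.norm(text_embeds, axis=-1, keepdims=True)
    s = sampled_image_embeds / np.linalg.norm(sampled_image_embeds, axis=-1, keepdims=True)
    sims = t @ s.T  # (batch, batch) text-to-sample similarity matrix
    # Accuracy: each text embed should be most similar to its own sample (the diagonal).
    accuracy = float(np.mean(np.argmax(sims, axis=-1) == np.arange(len(sims))))
    # Mean cosine similarity of matched text/sample pairs.
    mean_cos = float(np.mean(np.diag(sims)))
    return accuracy, mean_cos
```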
@xiankgx yea i think it works :) also, @rom1504 told me some group out there has already trained a prior using the code in this repository, for their own CLIP generations, preprint on the way