Typical training results #61

Open · edend10 opened this issue Mar 1, 2021 · 17 comments

Comments

@edend10

edend10 commented Mar 1, 2021

Hi, great repo and thanks for sharing your work!

I'm trying your bird dataset example from the Colab with OpenAI's pretrained VAE. I haven't been able to get meaningful results so far, either on the Colab or on my own VM (Tesla T4 GPU).

I'm 13 epochs into train_dalle.py and only seeing these kinds of results:
[image: sample generations after 13 epochs]

On my VM I ran $ python train_dalle.py --image_text_folder /parent/to/birds/dataset/directory without changing any of the code (I only replaced wandb with another experiment-tracking framework, but I doubt that should make a difference).

Should the bird dataset work better with the pretrained VAE? Can you share some results or common training parameters/times/number of epochs?

@lucidrains
Owner

@edend10 Hi Eden! Thanks for trying out the repository! I may have found a bug with the pretrained VAE wrapper, fixed in the latest commit https://github.com/lucidrains/DALLE-pytorch/blob/0.2.2/dalle_pytorch/vae.py#L82 🙏 I'll be training this myself this week and ironing out any remaining issues (other than data and scale, of course).

@edend10
Author

edend10 commented Mar 1, 2021

Thanks for the response, @lucidrains!
Ohh interesting, I'll check out the changes and try them out. Will look out for more updates!

@CDitzel

CDitzel commented Mar 2, 2021

What are those two mapping functions for, anyway?

Are they just for transforming the pixel value range of the input data, as they do over at OpenAI?
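
For context, the two mapping functions in OpenAI's released dall_e code are small affine remaps of the pixel range, used because their VAE was trained with a logit-Laplace reconstruction loss. A paraphrased sketch (validation checks omitted):

```python
import torch

logit_laplace_eps = 0.1  # constant from OpenAI's dall_e code

def map_pixels(x: torch.Tensor) -> torch.Tensor:
    # squeeze [0, 1] pixels into [eps, 1 - eps] before encoding
    return (1 - 2 * logit_laplace_eps) * x + logit_laplace_eps

def unmap_pixels(x: torch.Tensor) -> torch.Tensor:
    # invert the mapping after decoding and clamp back to [0, 1]
    return torch.clamp((x - logit_laplace_eps) / (1 - 2 * logit_laplace_eps), 0, 1)
```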

@AlexanderRayCarlson

Hello! Thank you for this excellent work. I seem to be getting something similar: abstract sorts of blue squares when training in the Colab notebook. It looks like the package (0.2.2) is updated with the latest fix. Is there anything else I need to do at the moment?

@awilson9

awilson9 commented Mar 5, 2021

This is still happening for me as well with the pretrained VAE on 0.2.2.

@afiaka87
Contributor

afiaka87 commented Mar 9, 2021

This is an early output (2 epochs) from the new code that removes the normalization from train_dalle.py. Was that the necessary fix, @lucidrains?

DEPTH = 6
BATCH_SIZE = 8

[image: early training sample]

"a female mannequin"
[image: generated sample for the prompt above]

Much more cohesive and a much stronger start now. No strange blueness, at the very least.
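
In practice, the change amounts to leaving the dataloader output in [0, 1] and letting the OpenAI VAE wrapper apply its own pixel mapping internally. A sketch of what that looks like (the transform pipeline here is an assumption, not the repo's literal code):

```python
import torchvision.transforms as T

transform = T.Compose([
    T.Resize(256),
    T.CenterCrop(256),
    T.ToTensor(),  # float tensors in [0, 1], which the VAE wrapper expects
    # T.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),  # removed: the wrapper remaps pixels itself
])
```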

@liuqk3

liuqk3 commented Mar 10, 2021

Hi @afiaka87, amazing results! Can you share more details about your configuration, such as the dataset, learning rate, LR scheduler, and number of text and image tokens (8192 image tokens, right?)? Thanks.

@afiaka87
Contributor

afiaka87 commented Mar 10, 2021

Hi @afiaka87, amazing results! Can you share more details about your configuration, such as the dataset, learning rate, LR scheduler, and number of text and image tokens (8192 image tokens, right?)? Thanks.

I should mention that the dataset I'm using consists of images released by OpenAI with their DALL-E. The mannequin image is not being generated from text alone; it's from an image-text pair. Anyway, my point is that my dataset is bad and I'm mostly just messing around. It's probably the case that using images generated by DALL-E itself is bound to converge much quicker than usual.

I'm using the defaults in train_dalle.py except for the BATCH_SIZE and DEPTH. Pretrained OpenAI VAE, top_k=0.9, and reversible=True. I tried mixing attention layers but it adds memory. (Edit: I don't think it does, actually; I'm training with all four attn_types currently.)

I'm working on creating a hyperparameter sweep with wandb currently. I think a learning rate of 2e-4 might be better for depth greater than 12 or so.

I still can't get a stable learning rate with 64 depth.
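
Putting that configuration together, the model setup would look roughly like this (a sketch against the DALLE-pytorch API of the time; num_text_tokens is an assumed default):

```python
import torch
from dalle_pytorch import OpenAIDiscreteVAE, DALLE

vae = OpenAIDiscreteVAE()  # pretrained OpenAI discrete VAE (8192 image tokens)

dalle = DALLE(
    dim = 512,
    vae = vae,
    num_text_tokens = 10000,  # assumed default vocab size
    text_seq_len = 256,
    depth = 6,
    heads = 8,
    dim_head = 64,
    reversible = True,  # reversible layers to save memory
    attn_types = ('full', 'axial_row', 'axial_col', 'conv_like')  # the four mixed attention types
)
```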

@afiaka87
Contributor

afiaka87 commented Mar 10, 2021

Edit: you can find the whole training session here: https://wandb.ai/afiaka87/dalle-pytorch-openai-samples/reports/Training-on-OpenAI-DALL-E-Generated-Images--Vmlldzo1MTk2MjQ?accessToken=89u5e10c2oag5mlv46xm2sz6orkyqdlwjrsj8vd95oz8ke3ez6v8v2fh07klk6j1

I'm starting over because there have been updates to the main branch.

Original post:

"a professional high quality emoji of a spider starfish chimera . a spider imitating a starfish . a spider made of starfish . a professional emoji ."

[image: generated samples for the prompt above]

Left it running at 16 depth, 8 heads, batch size of 12, learning_rate=2e-4. The loss is going down at a steady, consistent rate. (Edit: just kidding! It seems to get stuck at around ~6.0 on this run, which seems high?)

DEPTH: 16
HEADS: 8
TOP_K: 0.85
EPOCHS: 27
SHUFFLE: True
DIM_HEAD: 64
MODEL_DIM: 512
BATCH_SIZE: 12
REVERSIBLE: true
TEXT_SEQ_LEN: 256
LEARNING_RATE: 0.0002
GRAD_CLIP_NORM: 0.5
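
For what it's worth, TOP_K above presumably maps to the filter_thres argument used at sampling time; a minimal generation sketch (the text tokens are placeholders):

```python
import torch

# assumes `dalle` is a trained DALLE instance from dalle_pytorch
text = torch.randint(0, 10000, (1, 256))  # placeholder token ids, shape (batch, text_seq_len)
images = dalle.generate_images(text, filter_thres = 0.85)  # top-k filtering at 0.85
print(images.shape)  # e.g. (1, 3, 256, 256)
```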

@afiaka87
Contributor

afiaka87 commented Mar 10, 2021

Edit: here, I used Weights & Biases to create a report. The link below has all the images generated (every 100th iteration) across 27,831 total iterations. This one should work, I think:

https://wandb.ai/afiaka87/dalle-pytorch-openai-samples/reports/Training-on-OpenAI-DALL-E-Generated-Images--Vmlldzo1MTk2MjQ?accessToken=89u5e10c2oag5mlv46xm2sz6orkyqdlwjrsj8vd95oz8ke3ez6v8v2fh07klk6j1

@tommy19970714

@afiaka87 Thank you for sharing your Weights & Biases report!
But I can't see it because its project is private.
Can you make it visible to us?
[screenshot: private W&B project page]

@afiaka87
Contributor

afiaka87 commented Mar 11, 2021

Hi @afiaka87, amazing results! Can you share more details about your configuration, such as the dataset, learning rate, LR scheduler, and number of text and image tokens (8192 image tokens, right?)? Thanks.

Just for more info on the dataset itself: it is roughly 1,100,000 256x256 image-text pairs that were generated by OpenAI's DALL-E. They presented roughly ~30k unique text prompts, for which they posted the top 32 of 512 generations on https://openai.com/blog/dall-e/. Many images were corrupt, and not every prompt has a full 32 examples, but the total winds up being about 1.1 million images. If you look at many of the examples on that page, you'll see that DALL-E (in that form, at least) can and will make mistakes, and those mistakes are in this dataset too. Anyway, I'm just messing around and having fun training; this is definitely not going to produce a good model or anything.

There are also a large number of images in the dataset which are intended to be used with the "mask" feature. I don't know if that's possible yet in DALLE-pytorch, though. Anyway, that can't be helping much.

One valuable thing I've taken from this is that it seems to take at least ~2,000 iterations at a batch size of 4 to approach any sort of coherent reproductions. The exact number probably varies, but in terms of "knowing when to start over", roughly 3,000 steps might be a good soft target.

@tommy19970714

Thank you for sharing your results!
I will refer to your parameters.

@afiaka87
Contributor

@tommy19970714

I did a hyperparameter sweep with Weights & Biases: forty-eight 1,200-iteration runs of dalle-pytorch, varying learning rate, depth, and heads (minimizing the total loss at the end of each run).

#84 (comment)
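
A sweep along those lines could be set up like this (a sketch only; the metric name, ranges, and search method are assumptions, not the actual sweep config):

```python
import wandb

def train():
    # hypothetical wrapper around the train_dalle.py training loop;
    # wandb.init() exposes the sweep-assigned hyperparameters via run.config
    run = wandb.init()
    lr, depth, heads = run.config.learning_rate, run.config.depth, run.config.heads
    # ... build DALLE with these values, train for 1200 iterations, log 'loss' ...

sweep_config = {
    'method': 'bayes',  # assumed search strategy
    'metric': {'name': 'loss', 'goal': 'minimize'},  # minimize final training loss
    'parameters': {
        'learning_rate': {'min': 1e-4, 'max': 5e-4},
        'depth': {'values': [8, 16, 24, 32]},
        'heads': {'values': [4, 8, 16]},
    },
}

sweep_id = wandb.sweep(sweep_config, project='dalle-pytorch-sweep')
wandb.agent(sweep_id, function=train, count=48)  # one agent running all 48 trials
```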

@afiaka87
Contributor

The most important thing to note here is that the learning rate actually needs to go up to about 0.0005 when dealing with a depth of ~26-32.

@afiaka87
Contributor

I've done a much longer training session on that same dataset here:

#86
