almost noise image generated #8

Open · Ishihara-Masabumi opened this issue Sep 8, 2022 · 39 comments
@Ishihara-Masabumi

After training for 100 epochs, I tried to infer with the following command line.

!python imagen.py --imagen model.pth --tags "1girl, red_hair" --output red_hair.png

Then, the generated image, red_hair.png, is as follows:

[image: red_hair]

It is almost pure noise!
Could you please tell me how to generate a red_hair girl image?

@deepglugs (Owner)

deepglugs commented Sep 8, 2022 via email

@Ishihara-Masabumi (Author)

Thank you.
Then, is the following command line correct?

python imagen.py --train --source ./datasets --imagen model1.pth --sample_unet 1 --train_unet 1

@deepglugs (Owner)

deepglugs commented Sep 8, 2022

--sample_unet is unnecessary during training; samples will always be produced by the unet being trained. It is only used for sampling outside of training.

Also, I made a typo in my first reply. To train unet2 you need --train_unet=2. So:

python imagen.py --train --source ./datasets --imagen model1.pth --train_unet 1 # train unet1
python imagen.py --train --source ./datasets --imagen model1.pth --train_unet 2 # train unet2

python imagen.py --imagen model1.pth --sample_unet=1 --tags "1girl, red_hair" # sample from unet1
python imagen.py --imagen model1.pth --sample_unet=2 --tags "1girl, red_hair" # sample from unet2

@Ishihara-Masabumi (Author)

Thanks!
BTW, this time the following image was generated.

[image: red_hair1]

It looks as if it contains something meaningful, but it is very indistinct.
Is this the limit?

@deepglugs (Owner)

Try lowering --cond_scale to 1.0 or 1.1. This effectively turns off the prompt-conditioning guidance, but it should give you an idea of the quality your model is capable of at the current training step.
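
For reference, here is a minimal sketch of how cond_scale typically enters classifier-free guidance (this mirrors what imagen-pytorch does internally; the function below is illustrative, not this repo's actual code):

import torch

def apply_cond_scale(cond_pred: torch.Tensor,
                     uncond_pred: torch.Tensor,
                     cond_scale: float) -> torch.Tensor:
    # Classifier-free guidance: push the prediction away from the
    # unconditional one. At cond_scale == 1.0 the result is just the
    # plain conditional prediction, i.e. no guidance amplification.
    return uncond_pred + (cond_pred - uncond_pred) * cond_scale

So at 1.0 you see roughly what the unet can do on its own, without the guidance term amplifying an under-trained text conditioning.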

@Ishihara-Masabumi (Author)

I generated the image using the following command line.

python imagen.py --imagen model1.pth --tags "1girl, red_hair" --output red_hair2.png --sample_unet 2 --cond_scale 1.0

The generated image is as follows:

[image: red_hair2]

Is it OK?

@deepglugs (Owner)

Looks like it needs more training. What does --sample_unet=1 look like?

@Ishihara-Masabumi (Author)

Ishihara-Masabumi commented Sep 12, 2022

The generated image after 488 epochs of training is as follows:

[image: red_hair3]

Is this OK for a generated image?

@deepglugs (Owner)

No. That's a very strange image. Mine look like this after a night of training unet2 (cond_scale 1.1):

[image: imagen_24_663_loss0.128552]

Early on, after only a few epochs, they should look like this:

[image: imagen_21_90_loss0.754791]

So check your data and settings (sample_steps, cond_scale).

@Ishihara-Masabumi (Author)

Thank you for the information.
BTW, my training and sampling command lines are as follows:

python imagen.py --train --epochs 1000 --source ./datasets --imagen model3.pth --train_unet 2
python imagen.py --imagen model3.pth --tags "1girl, red_hair" --output red_hair3.png --sample_unet 2 --cond_scale 1.1

Is there anything wrong with it?

@deepglugs (Owner)

The command looks reasonable. How many images are in your dataset?

@Ishihara-Masabumi (Author)

My dataset is the same holo dataset from you.
It has 261 images and 263 tag files.

@deepglugs (Owner)

Ahh. You'll need a lot more data. The smallest dataset I've trained has 18k images.

You can try a tag combination closer to what you've trained with, but I would get more data. Maybe several thousand at least. gel_fetch has a "start_id" you can use to pull additional data. Set it to 1+.

@Ishihara-Masabumi (Author)

I tried to fetch images and texts using the following command line.
But the result was "added 0 images".

python3 gel_fetch.py --tags "holo2" --txt holo2/tags --img holo2/imgs --danbooru --num 18000 --start_id 1
added 0 images

@deepglugs (Owner)

deepglugs commented Sep 13, 2022 via email

@Ishihara-Masabumi (Author)

I'm sorry, I made a mistake.
I then tried to fetch images and tags again with the following command line.

python3 gel_fetch.py --tags "holo" --txt holo2/tags --img holo2/imgs --danbooru --num 18000 --start_id 1

But the number of images I got was just 263.
Please tell me how to get more than 1000 images.

@deepglugs (Owner)

deepglugs commented Sep 13, 2022 via email

@Ishihara-Masabumi (Author)

Both of the following command lines fetched no images; the output ended with the messages below.

python3 gel_fetch.py --tags "holo" --txt holo2/tags --img holo2/imgs --danbooru --num 18000 --start_id 2
python3 gel_fetch.py --tags "holo" --txt holo2/tags --img holo2/imgs --danbooru --num 18000 --start_id 3


https://cdn.donmai.us/sample/e5/17/sample-e51747e7fa932bcd899f43a6fa12bc3a.jpg
skipping 4445577 because it already exists
https://cdn.donmai.us/sample/62/53/sample-6253c9e5989dae54f1f1a4753b79633d.jpg
skipping 4440381 because it already exists
https://cdn.donmai.us/sample/4f/17/sample-4f172e3dab7cad4b5a1112792546f3fd.jpg
skipping 4422398 because it already exists
https://cdn.donmai.us/sample/02/24/sample-0224f7d54d89ed1e4b5483eda35d467b.jpg
skipping 4422396 because it already exists
https://cdn.donmai.us/sample/e7/ba/sample-e7bacfe3b934a5c46be75befccf3154a.jpg
skipping 4416289 because it already exists
https://cdn.donmai.us/sample/88/76/sample-887641b06505644e11df8e7ee2d4e78d.jpg
skipping 4403609 because it already exists
https://cdn.donmai.us/sample/31/61/sample-3161f75f68d08b7638eb9b439c8166d5.jpg
skipping 4403608 because it already exists
added 0 images

@deepglugs (Owner)

Looks like you may have got all of Holo. You can try other tags. "red_hair" will probably get you a lot more.

@Ishihara-Masabumi (Author)

Then, please tell me all of the tag names, such as holo, red_hair, and so on.

@deepglugs (Owner)

There are thousands of tags. Here are some of the most popular:

https://danbooru.donmai.us/tags?commit=Search&search%5Bhide_empty%5D=yes&search%5Border%5D=count

@Ishihara-Masabumi (Author)

Using 2466 images, I tried to train the imagen model.
After that, I tried to generate a "1girl, red_hair" image; the result is below.

[image: red_hair4]

Is this due to a lack of training images?

@zhaobingbingbing

Hi, I would like to train the 256x256 unet separately.
For training, I use something like:

python imagen.py --train --source --tags_source --imagen yourmodel.pth --train_unet 2 --no_elu

For sampling, something like:

python imagen.py --imagen yourmodel.pth --sample_unet 2 --tags "1girl, red_hair" --output ./red_hair.png --cond_scale 1.0

I use 100k image and txt pairs as the dataset, and the loss looks right, but I cannot generate meaningful images. Do you know the reason?

[image: red_hair8]

@deepglugs (Owner)

Did you train unet1 as well? Usually, you need to train unet1 a lot and then train unet2. So something like:

python imagen.py --train --source dataset --imagen yourmodel.pth --train_unet 1 --no_elu --epochs=80

then

python imagen.py --train --source dataset --imagen yourmodel.pth --train_unet 2 --no_elu --start_epoch=81 --epochs=160
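
Under the hood this corresponds to imagen-pytorch's per-unet training; here is a rough sketch with placeholder data and hyperparameters (illustrative only, not this repo's actual code):

import torch
from imagen_pytorch import Unet, Imagen, ImagenTrainer

# Two-stage cascade: unet1 generates 64px images, unet2 upscales them to 256px.
unet1 = Unet(dim=128, cond_dim=512, dim_mults=(1, 2, 4))
unet2 = Unet(dim=128, cond_dim=512, dim_mults=(1, 2, 4))

imagen = Imagen(unets=(unet1, unet2), image_sizes=(64, 256), timesteps=1000)
trainer = ImagenTrainer(imagen)

images = torch.rand(4, 3, 256, 256)   # placeholder batch
texts = ["1girl, red_hair"] * 4       # placeholder captions

# Only the unet selected by unet_number is trained on this step,
# which is what --train_unet 1 / --train_unet 2 choose.
loss = trainer(images, texts=texts, unet_number=1)
trainer.update(unet_number=1)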

@zhaobingbingbing

I did not train unet1, but training unet2 separately should be possible.
I noticed some tips about this in lucidrains/imagen-pytorch:

[image: screenshot of the tips]

@deepglugs (Owner)

Ahh, okay. I have a commit locally that supports nullunet. I'll push that now.

@deepglugs (Owner)

deepglugs commented Sep 15, 2022

I pushed. There's now a --null_unet1 argument for training and a --start_image for sampling. Sampling during training is not supported, so be sure to use --no_sample.

For sampling I use:

python3 imagen.py --imagen danbooru_320_sr.pth --sample_unet=2 --size 256 --start_image something_64px.png --output something_sr_256.png --tags "1girl, blonde_hair, red_bikini" --cond_scale=1.1 --replace
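
For context, --null_unet1 and --start_image roughly correspond to imagen-pytorch's NullUnet placeholder and its option to start sampling at the second unet from an existing low-resolution image; a sketch under those assumptions (parameter names follow the library's API as I understand it, filenames are placeholders):

from PIL import Image
from torchvision.transforms.functional import to_tensor
from imagen_pytorch import Unet, NullUnet, Imagen, ImagenTrainer

unet1 = NullUnet()   # stands in for the skipped base unet
unet2 = Unet(dim=128, cond_dim=512, dim_mults=(1, 2, 4, 8))

imagen = Imagen(unets=(unet1, unet2), image_sizes=(64, 256))
trainer = ImagenTrainer(imagen)

# A 64px image to be super-resolved, analogous to --start_image.
start = to_tensor(Image.open("something_64px.png").convert("RGB")).unsqueeze(0)

samples = trainer.sample(
    texts=["1girl, blonde_hair, red_bikini"],
    start_at_unet_number=2,        # skip the null base unet
    start_image_or_video=start,    # low-res input for the SR unet
    cond_scale=1.1,
)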

@Ishihara-Masabumi (Author)

Do you mean to use --no_sample instead of --start_image?
Is that right?

@deepglugs (Owner)

--no_sample is for training. For sampling (inference), use --start_image.

@Ishihara-Masabumi (Author)

Then, in your option '--start_image something_64px.png', what is something_64px.png, and why is it needed?

@Ishihara-Masabumi (Author)

Using your new imagen.py, the following error messages occurred.

!python imagen.py --train --source datasets --imagen model5.pth --train_unet 1 --no_sample --no_elu --epochs=80
!python imagen.py --train --source datasets --imagen model5.pth --train_unet 2 --no_sample --no_elu --start_epoch=81 --epochs=160
image_sizes=[64, 64]
The base dimension of your u-net should ideally be no smaller than 128, as recommended by a professional DDPM trainer https://nonint.com/2022/05/04/friends-dont-let-friends-train-small-diffusion-models/
Fetching image indexes in datasets...
2466 images
2485 tags
Traceback (most recent call last):
  File "imagen.py", line 956, in <module>
    main()
  File "imagen.py", line 187, in main
    train(args)
  File "imagen.py", line 647, in train
    use_text_encodings=args.embeddings is not None)
TypeError: __init__() got an unexpected keyword argument 'styles'
image_sizes=[64, 64]
The base dimension of your u-net should ideally be no smaller than 128, as recommended by a professional DDPM trainer https://nonint.com/2022/05/04/friends-dont-let-friends-train-small-diffusion-models/
Fetching image indexes in datasets...
2466 images
2485 tags
Traceback (most recent call last):
  File "imagen.py", line 956, in <module>
    main()
  File "imagen.py", line 187, in main
    train(args)
  File "imagen.py", line 647, in train
    use_text_encodings=args.embeddings is not None)
TypeError: __init__() got an unexpected keyword argument 'styles'

@zhaobingbingbing

You can just delete line 647 and try again.

@Ishihara-Masabumi (Author)

After deleting line 647, the following error message occurred.

image_sizes=[64, 64]
The base dimension of your u-net should ideally be no smaller than 128, as recommended by a professional DDPM trainer https://nonint.com/2022/05/04/friends-dont-let-friends-train-small-diffusion-models/
Fetching image indexes in datasets...
2466 images
2485 tags
Traceback (most recent call last):
  File "imagen.py", line 956, in <module>
    main()
  File "imagen.py", line 187, in main
    train(args)
  File "imagen.py", line 646, in train
    no_preload=True)
TypeError: __init__() got an unexpected keyword argument 'styles'
image_sizes=[64, 64]
The base dimension of your u-net should ideally be no smaller than 128, as recommended by a professional DDPM trainer https://nonint.com/2022/05/04/friends-dont-let-friends-train-small-diffusion-models/
Fetching image indexes in datasets...
2466 images
2485 tags
Traceback (most recent call last):
  File "imagen.py", line 956, in <module>
    main()
  File "imagen.py", line 187, in main
    train(args)
  File "imagen.py", line 646, in train
    no_preload=True)
TypeError: __init__() got an unexpected keyword argument 'styles'

@zhaobingbingbing

Try deleting the lines that raise errors until it works.

@Ishihara-Masabumi (Author)

Ishihara-Masabumi commented Sep 15, 2022

That's OK.
BTW, what is something_64px.png in '--start_image something_64px.png', and why is it needed?
I don't have a file like something_64px.png.

@deepglugs (Owner)

> That's OK. BTW, what is something_64px.png in '--start_image something_64px.png', and why is it needed? I don't have a file like something_64px.png.

You can pick any image you want as your start image. Just resize it to 64x64 (although, it'll probably work with something bigger if you have more memory).
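
For example, a quick way to make a 64x64 start image with Pillow (filenames here are just placeholders):

from PIL import Image

# Downscale any source image to 64x64 for use with --start_image.
img = Image.open("girl.png").convert("RGB").resize((64, 64), Image.LANCZOS)
img.save("girl_64px.png")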

@Ishihara-Masabumi (Author)

Hi, I got two error messages.
The first error is as follows:

!python imagen.py --imagen model5.pth --sample_unet=2 --size 256 --start_image girl.png --output girl_256.png --tags "1girl, blonde_hair" --cond_scale=1.1 --replace

usage: imagen.py [-h] [--source SOURCE] [--tags_source TAGS_SOURCE]
                 [--cond_images COND_IMAGES] [--style STYLE]
                 [--embeddings EMBEDDINGS] [--tags TAGS] [--vocab VOCAB]
                 [--size SIZE] [--sample_steps SAMPLE_STEPS]
                 [--num_unets NUM_UNETS] [--vocab_limit VOCAB_LIMIT]
                 [--epochs EPOCHS] [--imagen IMAGEN] [--output OUTPUT]
                 [--replace] [--unet_dims UNET_DIMS] [--unet2_dims UNET2_DIMS]
                 [--dim_mults DIM_MULTS] [--start_size START_SIZE]
                 [--sample_unet SAMPLE_UNET] [--device DEVICE]
                 [--text_encoder TEXT_ENCODER] [--cond_scale COND_SCALE]
                 [--no_elu] [--num_samples NUM_SAMPLES]
                 [--init_image INIT_IMAGE] [--skip_steps SKIP_STEPS]
                 [--sigma_max SIGMA_MAX] [--full_load] [--no_memory_efficient]
                 [--print_params] [--unet_size_mult UNET_SIZE_MULT]
                 [--self_cond] [--batch_size BATCH_SIZE]
                 [--micro_batch_size MICRO_BATCH_SIZE]
                 [--samples_out SAMPLES_OUT] [--train] [--train_encoder]
                 [--shuffle_tags] [--train_unet TRAIN_UNET]
                 [--random_drop_tags RANDOM_DROP_TAGS] [--fp16] [--bf16]
                 [--workers WORKERS] [--no_text_transform]
                 [--start_epoch START_EPOCH] [--no_patching]
                 [--create_embeddings] [--verify_images]
                 [--pretrained PRETRAINED] [--no_sample] [--lr LR]
                 [--loss LOSS] [--sample_rate SAMPLE_RATE] [--wandb] [--is_t5]
                 [--webdataset]
imagen.py: error: unrecognized arguments: --start_image girl.png

That is, there is no --start_image option.

The second error is as follows:

!python imagen.py --imagen model5.pth --sample_unet=2 --size 256 --output girl_256.png --tags "1girl, blonde_hair" --cond_scale=1.1 --replace

The base dimension of your u-net should ideally be no smaller than 128, as recommended by a professional DDPM trainer https://nonint.com/2022/05/04/friends-dont-let-friends-train-small-diffusion-models/
loading non-EMA version of unets
image sizes: [64, 256]
Traceback (most recent call last):
  File "imagen.py", line 956, in <module>
    main()
  File "imagen.py", line 189, in main
    sample(args)
  File "imagen.py", line 264, in sample
    stop_at_unet_number=args.sample_unet)
  File "/usr/local/lib/python3.7/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/imagen_pytorch/imagen_pytorch.py", line 100, in inner
    out = fn(model, *args, **kwargs)
TypeError: sample() got an unexpected keyword argument 'sigma_max'

@deepglugs (Owner)

You may need to update imagen-pytorch and pull deep-imagen again.

@deepglugs (Owner)

deepglugs commented Oct 11, 2022 via email
