
Add inpainting support to pivotal tuning #152

Merged
merged 3 commits into cloneofsimo:develop from levi/inpaint on Feb 7, 2023

Conversation

@levi (Contributor) commented Jan 29, 2023

Still playing around with the results of this, but wanted to put it up to get feedback on the implementation.

Mainly, this integrates ShivamShrirao/diffusers' approach to training DreamBooth with inpainting: it generates a random mask of cutout rectangles for each image in the training set and then combines it with the noisy latents during training.
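
For anyone skimming, here is a minimal sketch of the idea (illustrative only: the names and exact shapes below are mine, not necessarily the ones in this PR):

```python
import torch
import torch.nn.functional as F

def random_rect_mask(height: int, width: int, num_rects: int = 3) -> torch.Tensor:
    """Return a (1, 1, H, W) mask with 1s inside randomly placed rectangles."""
    mask = torch.zeros(1, 1, height, width)
    for _ in range(num_rects):
        rect_h = int(torch.randint(height // 8, height // 2, (1,)))
        rect_w = int(torch.randint(width // 8, width // 2, (1,)))
        top = int(torch.randint(0, height - rect_h, (1,)))
        left = int(torch.randint(0, width - rect_w, (1,)))
        mask[..., top:top + rect_h, left:left + rect_w] = 1.0
    return mask

# During the training step, the mask is downscaled to latent resolution and
# concatenated with the noisy latents and the masked-image latents, matching
# the 4 + 1 + 4 = 9 input channels of the SD inpainting UNet:
mask = random_rect_mask(512, 512)
mask_latent = F.interpolate(mask, size=(64, 64))  # latent size = pixel size / 8
# model_input = torch.cat([noisy_latents, mask_latent, masked_image_latents], dim=1)
```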

I kept the naming of the mask example attribute distinct to avoid any conflicts with masked score estimation (MSE). There is potentially useful overlap here, such as sharing a segmentation mask between inpainting training and MSE.

The mask generation function is pretty simple; results are TBD on my end. The SD2 inpainting model card describes using the random mask generation strategy defined by LAMA: chained polygons (clusters of rectangles) and wide rectangles of various sizes. It would be cool to implement this in a future PR. Source here: https://github.com/saic-mdal/lama/blob/358536640559121052e45f307982ee9969ae269b/saicinpainting/training/data/masks.py#L176
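
For reference, a rough sketch of what that LAMA-style strategy could look like (parameter names and ranges here are illustrative, not LAMA's actual config):

```python
import numpy as np
import cv2

def lama_style_mask(height, width, max_strokes=4, max_vertices=8,
                    max_thickness=60, max_rects=2):
    """Irregular mask: thick random polylines ("chained polygons") plus wide rectangles."""
    mask = np.zeros((height, width), dtype=np.float32)
    # Chained polygons: a random walk rendered as thick line segments.
    for _ in range(np.random.randint(1, max_strokes + 1)):
        x, y = int(np.random.randint(0, width)), int(np.random.randint(0, height))
        thickness = int(np.random.randint(10, max_thickness))
        for _ in range(np.random.randint(1, max_vertices + 1)):
            angle = np.random.uniform(0, 2 * np.pi)
            length = np.random.randint(20, max(width, height) // 4)
            nx = int(np.clip(x + length * np.cos(angle), 0, width - 1))
            ny = int(np.clip(y + length * np.sin(angle), 0, height - 1))
            cv2.line(mask, (x, y), (nx, ny), color=1.0, thickness=thickness)
            x, y = nx, ny
    # Wide rectangles of various sizes.
    for _ in range(np.random.randint(0, max_rects + 1)):
        rect_w = np.random.randint(width // 4, width // 2)
        rect_h = np.random.randint(height // 8, height // 4)
        top = np.random.randint(0, height - rect_h)
        left = np.random.randint(0, width - rect_w)
        mask[top:top + rect_h, left:left + rect_w] = 1.0
    return mask
```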

@levi force-pushed the levi/inpaint branch 2 times, most recently from 9ce3f86 to 5eb9880 on January 29, 2023 20:14
@cloneofsimo (Owner) commented

Whoa, that's pretty cool. I'll have a look, thank you for this PR!

@cloneofsimo changed the base branch from master to develop on January 30, 2023 02:20
@cloneofsimo (Owner) commented

Hey @levi, can you show us some example outputs? The PR looks good to me, but it would be cool if there were some examples.

@levi (Contributor, Author) commented Jan 31, 2023

Yeah I’m currently in the process of hooking this up with my inference tool. Will have some examples in a few days.

@cloneofsimo (Owner) commented

Hey @levi, have you had success with these models?

@levi (Contributor, Author) commented Feb 4, 2023

> Hey @levi, have you had success with these models?

I sent you an email a few days ago with some questions. Let me know. I'll have time this weekend to run some tests.

@cloneofsimo (Owner) commented Feb 4, 2023

Yes, I checked the email, thanks. I just realized that the following params are better. Note that these params weren't used to make the images in the README; those were made with the default params in the example scripts folder. The ones below are what I've used to make images recently.

```bash
export MODEL_NAME="runwayml/stable-diffusion-v1-5"
export INSTANCE_DIR="./data/data_captioned"
export OUTPUT_DIR="./exps/krk_captioned_scale2"

lora_pti \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --train_text_encoder \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=2 \
  --gradient_checkpointing \
  --scale_lr \
  --learning_rate_unet=2e-4 \
  --learning_rate_text=1e-6 \
  --learning_rate_ti=5e-4 \
  --color_jitter \
  --lr_scheduler="linear" \
  --lr_warmup_steps=0 \
  --lr_scheduler_lora="constant" \
  --lr_warmup_steps_lora=100 \
  --placeholder_tokens="<s1>|<s2>" \
  --placeholder_token_at_data="<krk>|<s1><s2>" \
  --save_steps=100 \
  --max_train_steps_ti=700 \
  --max_train_steps_tuning=700 \
  --perform_inversion=True \
  --clip_ti_decay \
  --weight_decay_ti=0.000 \
  --weight_decay_lora=0.000 \
  --device="cuda:0" \
  --lora_rank=8 \
  --use_face_segmentation_condition \
  --lora_dropout_p=0.1 \
  --lora_scale=2.0
```

@levi (Contributor, Author) commented Feb 5, 2023

I added a notebook that follows the same inference prompts as the main inference notebook. The results are pretty bad right now.

lora_pti_inpainting_example

With LoRA scale 1.0, the output looks like it's undertrained, so I'm increasing the training steps to 3k and will report back with any changes. Are there any parameter adjustments you recommend?

[image]

@levi (Contributor, Author) commented Feb 6, 2023

Looking a lot better after 3k steps!

[image]

@levi (Contributor, Author) commented Feb 6, 2023

Wow, I'm surprisingly impressed with a scale of 1.0.

[image]

@levi (Contributor, Author) commented Feb 6, 2023

@cloneofsimo ok, I updated the inference notebook to show some examples using the krk training params. The model looks good after 3000 training steps; full training params are in the inpainting_example.sh script. I wasn't able to get my Disney-style model to look good, so I ended up removing it from the PR. However, it's likely a parameter or two away from excellent results.

@cloneofsimo (Owner) commented

Thanks @levi! This looks awesome. Just speculation, but I think generating inpainting masks biased toward a user-preferred area might speed things up further, since here we really only want the extra training on the face regions. I'll test this out myself as well. LGTM!
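
A hypothetical sketch of that idea (an illustration, not code from this PR): center the random cutouts on pixels sampled from a user-supplied region mask, e.g. the face segmentation mask already produced for --use_face_segmentation_condition:

```python
import torch

def region_biased_rect_mask(region_mask: torch.Tensor, num_rects: int = 3) -> torch.Tensor:
    """region_mask: (H, W) tensor marking the preferred area (e.g. a face) with nonzeros."""
    height, width = region_mask.shape
    ys, xs = torch.nonzero(region_mask > 0, as_tuple=True)
    assert len(ys) > 0, "region mask is empty"
    mask = torch.zeros(1, 1, height, width)
    for _ in range(num_rects):
        # Center each rectangle on a random pixel inside the preferred region.
        idx = int(torch.randint(0, len(ys), (1,)))
        cy, cx = int(ys[idx]), int(xs[idx])
        rect_h = int(torch.randint(height // 8, height // 4, (1,)))
        rect_w = int(torch.randint(width // 8, width // 4, (1,)))
        top = max(0, cy - rect_h // 2)
        left = max(0, cx - rect_w // 2)
        mask[..., top:top + rect_h, left:left + rect_w] = 1.0
    return mask
```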

@cloneofsimo merged commit 6f82996 into cloneofsimo:develop on Feb 7, 2023