Training is successful, but the output image does not contain the person whose 10 images I used to train it #1158

Closed
nayan-dhabarde opened this issue Nov 21, 2023 · 5 comments

Comments

@nayan-dhabarde

command = [
    # One shell command string; adjacent string literals are concatenated.
    'accelerate launch --config_file "config.yaml" train_dreambooth.py '
    '--pretrained_model_name_or_path="SG161222/Realistic_Vision_V4.0_noVAE" '
    '--instance_data_dir="smart_cropped_images" '
    f'--class_data_dir="{dataset}" '
    '--output_dir="trained_model" '
    '--train_text_encoder '
    f'--instance_prompt="a photo of a nayansks {promptGender}" '
    f'--class_prompt="a photo of a {promptGender}" '
    '--with_prior_preservation --prior_loss_weight=1.0 '
    '--resolution=768 '
    '--train_batch_size=1 '
    '--gradient_accumulation_steps=1 '
    '--checkpointing_steps=1500 '
    '--learning_rate=1e-6 '
    '--lr_scheduler="constant" '
    '--lr_warmup_steps=0 '
    '--use_lora '
    '--lora_r 16 '
    '--lora_alpha 27 '
    '--lora_text_encoder_r 16 '
    '--lora_text_encoder_alpha 17 '
    '--max_train_steps=1500 '
    '--num_class_images=1500'
]

I was using the same script for training with Dreambooth only (without the LoRA params), and that setup works fine for me.

I have also tried the diffusers Dreambooth + LoRA solution, but it never worked for me either.

Now I have tried this repo and got the same failure.

Can you please help me understand what could have gone wrong?

The inference script is exactly the same as the one used here:

https://github.com/huggingface/peft/blob/main/examples/lora_dreambooth/lora_dreambooth_inference.ipynb

MODEL_NAME = "SG161222/Realistic_Vision_V4.0_noVAE"  
INSTANCE_PROMPT = "a photo of a nayansks man"

pipe = get_lora_sd_pipeline("trained_model", adapter_name="man")

prompt = "a photo of a nayansks man"
negative_prompt = "low quality, blurry, unfinished"
image = pipe(prompt, num_inference_steps=50, guidance_scale=7, negative_prompt=negative_prompt).images[0]
image

These steps are also taken from
https://huggingface.co/docs/peft/task_guides/dreambooth_lora

@younesbelkada
Contributor

I experimented a bit with Dreambooth + PEFT, and I think you need to play a bit with the hyperparameters; see, for example, this thread: huggingface/diffusers#5840

@nayan-dhabarde
Author

@younesbelkada As discussed in huggingface/diffusers#5840, using a higher learning rate and changing the LR scheduler to cosine does work: the trained subject now appears in the results.

I will try other hyperparameters. I am just wondering whether this script can achieve results similar to those achieved with the Dreambooth-only script.
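
As a rough sketch of the change described above, the two flags from the original command can be overridden like this. The helper `with_overrides` is hypothetical, and `1e-4` is only an illustrative value: the thread says "higher learning rate" without giving a number.

```python
def with_overrides(flags, **overrides):
    """Return CLI flags with the given values replaced or added (hypothetical helper)."""
    merged = {**flags, **overrides}
    return [f"--{name}={value}" for name, value in merged.items()]


# Values taken from the original command in this issue.
original = {"learning_rate": "1e-6", "lr_scheduler": "constant", "lr_warmup_steps": 0}

# Assumption: 1e-4 is an illustrative higher learning rate, not a value from the thread.
tuned = with_overrides(original, learning_rate="1e-4", lr_scheduler="cosine")
print(" ".join(tuned))
```

The remaining flags of the training command would stay unchanged; only the learning rate and scheduler differ from the failing run.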

@nayan-dhabarde
Author

@younesbelkada This video shows the default kohya_ss parameters, which have yielded good results for people.
Here is a screenshot from the video: [image]
Here is the link to the video:
https://www.youtube.com/watch?v=TpuDOsuKIBo&t=9s

@younesbelkada
Contributor

Thanks very much @nayan-dhabarde!
By the way, we have also merged #1189, which should enable different initialization strategies. We found that with the "gaussian" init strategy you can use a smaller LR.
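
A minimal sketch of what selecting that strategy could look like, assuming it is exposed through the `init_lora_weights` argument of PEFT's `LoraConfig`. The `r` and `lora_alpha` values come from the command in this issue; the `target_modules` list and the helper name `make_unet_lora_config` are assumptions for illustration, not the training script's actual settings.

```python
# Keyword arguments for the UNet LoRA adapter.
LORA_KWARGS = {
    "r": 16,                             # --lora_r from the original command
    "lora_alpha": 27,                    # --lora_alpha from the original command
    "target_modules": ["to_q", "to_v"],  # assumption: attention projections only
    "init_lora_weights": "gaussian",     # init strategy mentioned above
}


def make_unet_lora_config():
    """Build a LoraConfig from LORA_KWARGS (hypothetical helper)."""
    # Imported lazily so the kwargs can be inspected without peft installed.
    from peft import LoraConfig

    return LoraConfig(**LORA_KWARGS)
```

Whether a smaller learning rate then suffices would still need to be checked against the training results.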

@nayan-dhabarde
Author

@younesbelkada That looks worth checking. I will try it.
