From f62024f7ee3284e17643b4cf3e4fcdcd4f9cd9c9 Mon Sep 17 00:00:00 2001
From: linoytsaban
Date: Thu, 17 Oct 2024 23:15:59 +0300
Subject: [PATCH 1/4] fix arg naming

---
 examples/advanced_diffusion_training/README_flux.md | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/examples/advanced_diffusion_training/README_flux.md b/examples/advanced_diffusion_training/README_flux.md
index e755fd8b61e0..79321ec1b755 100644
--- a/examples/advanced_diffusion_training/README_flux.md
+++ b/examples/advanced_diffusion_training/README_flux.md
@@ -96,7 +96,7 @@ Please keep the following points in mind:
 To activate pivotal tuning for both encoders, add the flag `--enable_t5_ti`.
 * When not fine-tuning the text encoders, we ALWAYS precompute the text embeddings to save memory.
 * **pure textual inversion** - to support the full range from pivotal tuning to textual inversion we introduce `--train_transformer_frac` which controls the amount of epochs the transformer LoRA layers are trained. By default, `--train_transformer_frac==1`, to trigger a textual inversion run set `--train_transformer_frac==0`. Values between 0 and 1 are supported as well, and we welcome the community to experiment w/ different settings and share the results!
-* **token initializer** - similar to the original textual inversion work, you can specify a token of your choosing as the starting point for training. By default, when enabling `--train_text_encoder_ti`, the new inserted tokens are initialized randomly. You can specify a token in `--initializer_token` such that the starting point for the trained embeddings will be the embeddings associated with your chosen `--initializer_token`.
+* **token initializer** - similar to the original textual inversion work, you can specify a concept of your choosing as the starting point for training. By default, when enabling `--train_text_encoder_ti`, the new inserted tokens are initialized randomly. You can specify a token in `--initializer_concept` such that the starting point for the trained embeddings will be the embeddings associated with your chosen `--initializer_token`.
 
 ## Training examples
 
@@ -147,7 +146,6 @@ accelerate launch train_dreambooth_lora_flux_advanced.py \
   --pretrained_model_name_or_path=$MODEL_NAME \
   --dataset_name=$DATASET_NAME \
   --instance_prompt="3d icon in the style of TOK" \
-  --validation_prompt="a TOK icon of an astronaut riding a horse, in the style of TOK" \
   --output_dir=$OUTPUT_DIR \
   --caption_column="prompt" \
   --mixed_precision="bf16" \
@@ -165,7 +164,7 @@ accelerate launch train_dreambooth_lora_flux_advanced.py \
   --lr_scheduler="constant" \
   --lr_warmup_steps=0 \
   --rank=8 \
-  --max_train_steps=1000 \
+  --max_train_steps=700 \
   --checkpointing_steps=2000 \
   --seed="0" \
   --push_to_hub
@@ -190,7 +189,6 @@ accelerate launch train_dreambooth_lora_flux_advanced.py \
   --pretrained_model_name_or_path=$MODEL_NAME \
   --dataset_name=$DATASET_NAME \
   --instance_prompt="3d icon in the style of TOK" \
-  --validation_prompt="a TOK icon of an astronaut riding a horse, in the style of TOK" \
   --output_dir=$OUTPUT_DIR \
   --caption_column="prompt" \
   --mixed_precision="bf16" \
@@ -209,7 +207,7 @@ accelerate launch train_dreambooth_lora_flux_advanced.py \
   --lr_scheduler="constant" \
   --lr_warmup_steps=0 \
   --rank=8 \
-  --max_train_steps=1000 \
+  --max_train_steps=700 \
   --checkpointing_steps=2000 \
   --seed="0" \
   --push_to_hub
@@ -229,7 +227,6 @@ accelerate launch train_dreambooth_lora_flux_advanced.py \
   --pretrained_model_name_or_path=$MODEL_NAME \
   --dataset_name=$DATASET_NAME \
   --instance_prompt="3d icon in the style of TOK" \
-  --validation_prompt="a TOK icon of an astronaut riding a horse, in the style of TOK" \
   --output_dir=$OUTPUT_DIR \
   --caption_column="prompt" \
   --mixed_precision="bf16" \
@@ -249,7 +246,7 @@ accelerate launch train_dreambooth_lora_flux_advanced.py \
   --lr_scheduler="constant" \
   --lr_warmup_steps=0 \
   --rank=8 \
-  --max_train_steps=1000 \
+  --max_train_steps=700 \
   --checkpointing_steps=2000 \
   --seed="0" \
   --push_to_hub

From e23a29b21f6f1094c130a704eb15ade622316f47 Mon Sep 17 00:00:00 2001
From: linoytsaban
Date: Thu, 17 Oct 2024 23:25:04 +0300
Subject: [PATCH 2/4] fix arg naming

---
 examples/advanced_diffusion_training/README_flux.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/examples/advanced_diffusion_training/README_flux.md b/examples/advanced_diffusion_training/README_flux.md
index 79321ec1b755..8de1e76e70de 100644
--- a/examples/advanced_diffusion_training/README_flux.md
+++ b/examples/advanced_diffusion_training/README_flux.md
@@ -270,8 +270,8 @@ pipe = AutoPipelineForText2Image.from_pretrained("black-forest-labs/FLUX.1-dev",
 pipe.load_lora_weights(repo_id, weight_name="pytorch_lora_weights.safetensors")
 ```
 2. now we load the pivotal tuning embeddings
-💡note that if you didn't enable `--enable_t5_ti`, you only load the embeddings to the CLIP encoder
-
+> [!NOTE] #1 if `--enable_t5_ti` wasn't passed, we only load the embeddings to the CLIP encoder.
+> [!NOTE] #2 the number of tokens (i.e. ,...,) is either determined by --num_new_tokens_per_abstraction or by --initializer_concept. Make sure to update inference code accordingly :)
 ```python
 text_encoders = [pipe.text_encoder, pipe.text_encoder_2]
 tokenizers = [pipe.tokenizer, pipe.tokenizer_2]

From 93d69e7381ccc7c5fdae5db9376be9bf6ad20ca7 Mon Sep 17 00:00:00 2001
From: linoytsaban
Date: Thu, 17 Oct 2024 23:27:57 +0300
Subject: [PATCH 3/4] fix arg naming

---
 examples/advanced_diffusion_training/README_flux.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/examples/advanced_diffusion_training/README_flux.md b/examples/advanced_diffusion_training/README_flux.md
index 8de1e76e70de..ad2075c5de31 100644
--- a/examples/advanced_diffusion_training/README_flux.md
+++ b/examples/advanced_diffusion_training/README_flux.md
@@ -96,7 +96,7 @@ Please keep the following points in mind:
 To activate pivotal tuning for both encoders, add the flag `--enable_t5_ti`.
 * When not fine-tuning the text encoders, we ALWAYS precompute the text embeddings to save memory.
 * **pure textual inversion** - to support the full range from pivotal tuning to textual inversion we introduce `--train_transformer_frac` which controls the amount of epochs the transformer LoRA layers are trained. By default, `--train_transformer_frac==1`, to trigger a textual inversion run set `--train_transformer_frac==0`. Values between 0 and 1 are supported as well, and we welcome the community to experiment w/ different settings and share the results!
-* **token initializer** - similar to the original textual inversion work, you can specify a concept of your choosing as the starting point for training. By default, when enabling `--train_text_encoder_ti`, the new inserted tokens are initialized randomly. You can specify a token in `--initializer_concept` such that the starting point for the trained embeddings will be the embeddings associated with your chosen `--initializer_token`.
+* **token initializer** - similar to the original textual inversion work, you can specify a concept of your choosing as the starting point for training. By default, when enabling `--train_text_encoder_ti`, the new inserted tokens are initialized randomly. You can specify a token in `--initializer_concept` such that the starting point for the trained embeddings will be the embeddings associated with your chosen `--initializer_concept`.
 
 ## Training examples
 
@@ -271,6 +271,7 @@ pipe.load_lora_weights(repo_id, weight_name="pytorch_lora_weights.safetensors")
 ```
 2. now we load the pivotal tuning embeddings
 > [!NOTE] #1 if `--enable_t5_ti` wasn't passed, we only load the embeddings to the CLIP encoder.
+
 > [!NOTE] #2 the number of tokens (i.e. ,...,) is either determined by --num_new_tokens_per_abstraction or by --initializer_concept. Make sure to update inference code accordingly :)
 ```python
 text_encoders = [pipe.text_encoder, pipe.text_encoder_2]
 tokenizers = [pipe.tokenizer, pipe.tokenizer_2]

From 99344c63782459e848faae81ed3e0c8b6cb99ad9 Mon Sep 17 00:00:00 2001
From: linoytsaban
Date: Thu, 17 Oct 2024 23:29:27 +0300
Subject: [PATCH 4/4] fix arg naming

---
 examples/advanced_diffusion_training/README_flux.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/advanced_diffusion_training/README_flux.md b/examples/advanced_diffusion_training/README_flux.md
index ad2075c5de31..8817431bede5 100644
--- a/examples/advanced_diffusion_training/README_flux.md
+++ b/examples/advanced_diffusion_training/README_flux.md
@@ -272,7 +272,7 @@ pipe.load_lora_weights(repo_id, weight_name="pytorch_lora_weights.safetensors")
 2. now we load the pivotal tuning embeddings
 > [!NOTE] #1 if `--enable_t5_ti` wasn't passed, we only load the embeddings to the CLIP encoder.
 
-> [!NOTE] #2 the number of tokens (i.e. ,...,) is either determined by --num_new_tokens_per_abstraction or by --initializer_concept. Make sure to update inference code accordingly :)
+> [!NOTE] #2 the number of tokens (i.e. ,...,) is either determined by `--num_new_tokens_per_abstraction` or by `--initializer_concept`. Make sure to update inference code accordingly :)
 ```python
 text_encoders = [pipe.text_encoder, pipe.text_encoder_2]
 tokenizers = [pipe.tokenizer, pipe.tokenizer_2]
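
To make the two notes concrete, here is a minimal, hypothetical sketch of the inference-side loading: the pivotal-tuning embeddings always go into the CLIP text encoder, and into the T5 encoder only when training was launched with `--enable_t5_ti`. The sketch is not part of the patch; the repo name, the `embeddings.safetensors` filename, its `clip_l`/`t5` keys, the `<s0>`/`<s1>` token names, and the `add_trained_tokens` helper are all assumptions about what a training run saves, so adjust them to your own setup.

```python
import torch
from diffusers import AutoPipelineForText2Image
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

# hypothetical repo produced by a training run with --push_to_hub
repo_id = "your-username/3d-icon-flux-lora"

pipe = AutoPipelineForText2Image.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")
pipe.load_lora_weights(repo_id, weight_name="pytorch_lora_weights.safetensors")

# assumed file and key names for the saved pivotal-tuning embeddings - adjust to your run
embedding_path = hf_hub_download(repo_id=repo_id, filename="embeddings.safetensors", repo_type="model")
state_dict = load_file(embedding_path)

def add_trained_tokens(tokenizer, text_encoder, tokens, embeddings):
    # register the new tokens, grow the embedding matrix, then copy in the trained vectors
    tokenizer.add_tokens(tokens)
    text_encoder.resize_token_embeddings(len(tokenizer))
    token_ids = tokenizer.convert_tokens_to_ids(tokens)
    embedding_matrix = text_encoder.get_input_embeddings().weight
    with torch.no_grad():
        for token_id, emb in zip(token_ids, embeddings):
            embedding_matrix.data[token_id] = emb.to(embedding_matrix.device, embedding_matrix.dtype)

# illustrative token names; the real ones depend on --num_new_tokens_per_abstraction / --initializer_concept
tokens = ["<s0>", "<s1>"]
# NOTE #1: the CLIP encoder always receives the new embeddings
add_trained_tokens(pipe.tokenizer, pipe.text_encoder, tokens, state_dict["clip_l"])
# the T5 encoder only receives them if training used --enable_t5_ti
if "t5" in state_dict:
    add_trained_tokens(pipe.tokenizer_2, pipe.text_encoder_2, tokens, state_dict["t5"])

image = pipe("a 3d icon in the style of <s0><s1>, an astronaut riding a horse").images[0]
```

If the training script ships its own loader for these embeddings (for example via `load_textual_inversion`), prefer that path; the manual version above is only meant to show which encoder each set of embeddings belongs to.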