From f62024f7ee3284e17643b4cf3e4fcdcd4f9cd9c9 Mon Sep 17 00:00:00 2001
From: linoytsaban
Date: Thu, 17 Oct 2024 23:15:59 +0300
Subject: [PATCH 1/4] fix arg naming

---
 examples/advanced_diffusion_training/README_flux.md | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/examples/advanced_diffusion_training/README_flux.md b/examples/advanced_diffusion_training/README_flux.md
index e755fd8b61e0..79321ec1b755 100644
--- a/examples/advanced_diffusion_training/README_flux.md
+++ b/examples/advanced_diffusion_training/README_flux.md
@@ -96,7 +96,7 @@ Please keep the following points in mind:
 To activate pivotal tuning for both encoders, add the flag `--enable_t5_ti`.
 * When not fine-tuning the text encoders, we ALWAYS precompute the text embeddings to save memory.
 * **pure textual inversion** - to support the full range from pivotal tuning to textual inversion we introduce `--train_transformer_frac` which controls the amount of epochs the transformer LoRA layers are trained. By default, `--train_transformer_frac==1`, to trigger a textual inversion run set `--train_transformer_frac==0`. Values between 0 and 1 are supported as well, and we welcome the community to experiment w/ different settings and share the results!
-* **token initializer** - similar to the original textual inversion work, you can specify a token of your choosing as the starting point for training. By default, when enabling `--train_text_encoder_ti`, the new inserted tokens are initialized randomly. You can specify a token in `--initializer_token` such that the starting point for the trained embeddings will be the embeddings associated with your chosen `--initializer_token`.
+* **token initializer** - similar to the original textual inversion work, you can specify a concept of your choosing as the starting point for training. By default, when enabling `--train_text_encoder_ti`, the new inserted tokens are initialized randomly. You can specify a token in `--initializer_concept` such that the starting point for the trained embeddings will be the embeddings associated with your chosen `--initializer_token`.
 
 ## Training examples
 
@@ -147,7 +146,6 @@ accelerate launch train_dreambooth_lora_flux_advanced.py \
   --pretrained_model_name_or_path=$MODEL_NAME \
   --dataset_name=$DATASET_NAME \
   --instance_prompt="3d icon in the style of TOK" \
-  --validation_prompt="a TOK icon of an astronaut riding a horse, in the style of TOK" \
   --output_dir=$OUTPUT_DIR \
   --caption_column="prompt" \
   --mixed_precision="bf16" \
@@ -165,7 +164,7 @@ accelerate launch train_dreambooth_lora_flux_advanced.py \
   --lr_scheduler="constant" \
   --lr_warmup_steps=0 \
   --rank=8 \
-  --max_train_steps=1000 \
+  --max_train_steps=700 \
   --checkpointing_steps=2000 \
   --seed="0" \
   --push_to_hub
@@ -190,7 +189,6 @@ accelerate launch train_dreambooth_lora_flux_advanced.py \
   --pretrained_model_name_or_path=$MODEL_NAME \
   --dataset_name=$DATASET_NAME \
   --instance_prompt="3d icon in the style of TOK" \
-  --validation_prompt="a TOK icon of an astronaut riding a horse, in the style of TOK" \
   --output_dir=$OUTPUT_DIR \
   --caption_column="prompt" \
   --mixed_precision="bf16" \
@@ -209,7 +207,7 @@ accelerate launch train_dreambooth_lora_flux_advanced.py \
   --lr_scheduler="constant" \
   --lr_warmup_steps=0 \
   --rank=8 \
-  --max_train_steps=1000 \
+  --max_train_steps=700 \
   --checkpointing_steps=2000 \
   --seed="0" \
   --push_to_hub
@@ -229,7 +227,6 @@ accelerate launch train_dreambooth_lora_flux_advanced.py \
   --pretrained_model_name_or_path=$MODEL_NAME \
   --dataset_name=$DATASET_NAME \
   --instance_prompt="3d icon in the style of TOK" \
-  --validation_prompt="a TOK icon of an astronaut riding a horse, in the style of TOK" \
   --output_dir=$OUTPUT_DIR \
   --caption_column="prompt" \
   --mixed_precision="bf16" \
@@ -249,7 +246,7 @@ accelerate launch train_dreambooth_lora_flux_advanced.py \
   --lr_scheduler="constant" \
   --lr_warmup_steps=0 \
   --rank=8 \
-  --max_train_steps=1000 \
+  --max_train_steps=700 \
   --checkpointing_steps=2000 \
   --seed="0" \
   --push_to_hub

From e23a29b21f6f1094c130a704eb15ade622316f47 Mon Sep 17 00:00:00 2001
From: linoytsaban
Date: Thu, 17 Oct 2024 23:25:04 +0300
Subject: [PATCH 2/4] fix arg naming

---
 examples/advanced_diffusion_training/README_flux.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/examples/advanced_diffusion_training/README_flux.md b/examples/advanced_diffusion_training/README_flux.md
index 79321ec1b755..8de1e76e70de 100644
--- a/examples/advanced_diffusion_training/README_flux.md
+++ b/examples/advanced_diffusion_training/README_flux.md
@@ -270,8 +270,8 @@ pipe = AutoPipelineForText2Image.from_pretrained("black-forest-labs/FLUX.1-dev",
 pipe.load_lora_weights(repo_id, weight_name="pytorch_lora_weights.safetensors")
 ```
 2. now we load the pivotal tuning embeddings
-💡note that if you didn't enable `--enable_t5_ti`, you only load the embeddings to the CLIP encoder
-
+> [!NOTE] #1 if `--enable_t5_ti` wasn't passed, we only load the embeddings to the CLIP encoder.
+> [!NOTE] #2 the number of tokens (i.e. ,...,) is either determined by --num_new_tokens_per_abstraction or by --initializer_concept. Make sure to update inference code accordingly :)
 ```python
 text_encoders = [pipe.text_encoder, pipe.text_encoder_2]
 tokenizers = [pipe.tokenizer, pipe.tokenizer_2]

From 93d69e7381ccc7c5fdae5db9376be9bf6ad20ca7 Mon Sep 17 00:00:00 2001
From: linoytsaban
Date: Thu, 17 Oct 2024 23:27:57 +0300
Subject: [PATCH 3/4] fix arg naming

---
 examples/advanced_diffusion_training/README_flux.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/examples/advanced_diffusion_training/README_flux.md b/examples/advanced_diffusion_training/README_flux.md
index 8de1e76e70de..ad2075c5de31 100644
--- a/examples/advanced_diffusion_training/README_flux.md
+++ b/examples/advanced_diffusion_training/README_flux.md
@@ -96,7 +96,7 @@ Please keep the following points in mind:
 To activate pivotal tuning for both encoders, add the flag `--enable_t5_ti`.
 * When not fine-tuning the text encoders, we ALWAYS precompute the text embeddings to save memory.
 * **pure textual inversion** - to support the full range from pivotal tuning to textual inversion we introduce `--train_transformer_frac` which controls the amount of epochs the transformer LoRA layers are trained. By default, `--train_transformer_frac==1`, to trigger a textual inversion run set `--train_transformer_frac==0`. Values between 0 and 1 are supported as well, and we welcome the community to experiment w/ different settings and share the results!
-* **token initializer** - similar to the original textual inversion work, you can specify a concept of your choosing as the starting point for training. By default, when enabling `--train_text_encoder_ti`, the new inserted tokens are initialized randomly. You can specify a token in `--initializer_concept` such that the starting point for the trained embeddings will be the embeddings associated with your chosen `--initializer_token`.
+* **token initializer** - similar to the original textual inversion work, you can specify a concept of your choosing as the starting point for training. By default, when enabling `--train_text_encoder_ti`, the new inserted tokens are initialized randomly. You can specify a token in `--initializer_concept` such that the starting point for the trained embeddings will be the embeddings associated with your chosen `--initializer_concept`.
 
 ## Training examples
 
@@ -271,6 +271,7 @@ pipe.load_lora_weights(repo_id, weight_name="pytorch_lora_weights.safetensors")
 ```
 2. now we load the pivotal tuning embeddings
 > [!NOTE] #1 if `--enable_t5_ti` wasn't passed, we only load the embeddings to the CLIP encoder.
+
 > [!NOTE] #2 the number of tokens (i.e. ,...,) is either determined by --num_new_tokens_per_abstraction or by --initializer_concept. Make sure to update inference code accordingly :)
 ```python
 text_encoders = [pipe.text_encoder, pipe.text_encoder_2]
 tokenizers = [pipe.tokenizer, pipe.tokenizer_2]

From 99344c63782459e848faae81ed3e0c8b6cb99ad9 Mon Sep 17 00:00:00 2001
From: linoytsaban
Date: Thu, 17 Oct 2024 23:29:27 +0300
Subject: [PATCH 4/4] fix arg naming

---
 examples/advanced_diffusion_training/README_flux.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/advanced_diffusion_training/README_flux.md b/examples/advanced_diffusion_training/README_flux.md
index ad2075c5de31..8817431bede5 100644
--- a/examples/advanced_diffusion_training/README_flux.md
+++ b/examples/advanced_diffusion_training/README_flux.md
@@ -272,7 +272,7 @@ pipe.load_lora_weights(repo_id, weight_name="pytorch_lora_weights.safetensors")
 2. now we load the pivotal tuning embeddings
 > [!NOTE] #1 if `--enable_t5_ti` wasn't passed, we only load the embeddings to the CLIP encoder.
 
-> [!NOTE] #2 the number of tokens (i.e. ,...,) is either determined by --num_new_tokens_per_abstraction or by --initializer_concept. Make sure to update inference code accordingly :)
+> [!NOTE] #2 the number of tokens (i.e. ,...,) is either determined by `--num_new_tokens_per_abstraction` or by `--initializer_concept`. Make sure to update inference code accordingly :)
 ```python
 text_encoders = [pipe.text_encoder, pipe.text_encoder_2]
 tokenizers = [pipe.tokenizer, pipe.tokenizer_2]
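
To make the two notes concrete, here is a minimal, hypothetical sketch of the inference-side loading: the pivotal-tuning embeddings always go into the CLIP text encoder, and into the T5 encoder only when training was launched with `--enable_t5_ti`. The sketch is not part of the patch; the repo name, the `embeddings.safetensors` filename, its `clip_l`/`t5` keys, the `<s0>`/`<s1>` token names, and the `add_trained_tokens` helper are all assumptions about what a training run saves, so adjust them to your own setup.

```python
import torch
from diffusers import AutoPipelineForText2Image
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

# hypothetical repo produced by a training run with --push_to_hub
repo_id = "your-username/3d-icon-flux-lora"

pipe = AutoPipelineForText2Image.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")
pipe.load_lora_weights(repo_id, weight_name="pytorch_lora_weights.safetensors")

# assumed file and key names for the saved pivotal-tuning embeddings - adjust to your run
embedding_path = hf_hub_download(repo_id=repo_id, filename="embeddings.safetensors", repo_type="model")
state_dict = load_file(embedding_path)

def add_trained_tokens(tokenizer, text_encoder, tokens, embeddings):
    # register the new tokens, grow the embedding matrix, then copy in the trained vectors
    tokenizer.add_tokens(tokens)
    text_encoder.resize_token_embeddings(len(tokenizer))
    token_ids = tokenizer.convert_tokens_to_ids(tokens)
    embedding_matrix = text_encoder.get_input_embeddings().weight
    with torch.no_grad():
        for token_id, emb in zip(token_ids, embeddings):
            embedding_matrix.data[token_id] = emb.to(embedding_matrix.device, embedding_matrix.dtype)

# illustrative token names; the real ones depend on --num_new_tokens_per_abstraction / --initializer_concept
tokens = ["<s0>", "<s1>"]
# NOTE #1: the CLIP encoder always receives the new embeddings
add_trained_tokens(pipe.tokenizer, pipe.text_encoder, tokens, state_dict["clip_l"])
# the T5 encoder only receives them if training used --enable_t5_ti
if "t5" in state_dict:
    add_trained_tokens(pipe.tokenizer_2, pipe.text_encoder_2, tokens, state_dict["t5"])

image = pipe("a 3d icon in the style of <s0><s1>, an astronaut riding a horse").images[0]
```

If the training script ships its own loader for these embeddings (for example via `load_textual_inversion`), prefer that path; the manual version above is only meant to show which encoder each set of embeddings belongs to.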