Labels: bug (Something isn't working)
Description
Describe the bug
`unload_textual_inversion` removes the inversion tokens only from the pipeline's `tokenizer` and `text_encoder`; tokens loaded into `tokenizer_2` and `text_encoder_2` are left in place.
I can submit a PR if you'd like.
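To make the failure mode concrete, here is a self-contained toy simulation of the behaviour (plain Python with illustrative class names, not the diffusers API): a pipeline holds two tokenizers, but unload only clears the first, so re-loading into the second collides.

```python
# Toy model of the bug: two tokenizer vocabularies, but unload() only
# cleans the first one, so a second load() into tokenizer_2 collides.
class ToyTokenizer:
    def __init__(self):
        self.vocab = set()

    def add(self, token):
        if token in self.vocab:
            raise ValueError(f"Token {token} already in tokenizer vocabulary.")
        self.vocab.add(token)


class ToyPipeline:
    def __init__(self):
        self.tokenizer = ToyTokenizer()
        self.tokenizer_2 = ToyTokenizer()

    def load_textual_inversion(self, tokens, tokenizer):
        for t in tokens:
            tokenizer.add(t)

    def unload_textual_inversion(self):
        # mirrors the reported behaviour: only the first tokenizer is cleared
        self.tokenizer.vocab.clear()


pipe = ToyPipeline()
pipe.load_textual_inversion(["<s0>", "<s1>"], pipe.tokenizer)
pipe.load_textual_inversion(["<s0>", "<s1>"], pipe.tokenizer_2)
pipe.unload_textual_inversion()
pipe.load_textual_inversion(["<s0>", "<s1>"], pipe.tokenizer)  # succeeds
try:
    pipe.load_textual_inversion(["<s0>", "<s1>"], pipe.tokenizer_2)
except ValueError as e:
    print(e)  # → Token <s0> already in tokenizer vocabulary.
```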
Reproduction
Following the Advanced LoRA training inference examples:
```python
from safetensors.torch import load_file

# load embeddings into the text encoders
state_dict = load_file(embedding_path)

# note: we load the tokens <s0><s1>; "TOK" was only a placeholder, and training
# was performed with the newly initialized tokens <s0><s1>

# load embeddings of text_encoder 1 (CLIP ViT-L/14)
pipe.load_textual_inversion(state_dict["clip_l"], token=["<s0>", "<s1>"], text_encoder=pipe.text_encoder, tokenizer=pipe.tokenizer)
# load embeddings of text_encoder 2 (CLIP ViT-G/14)
pipe.load_textual_inversion(state_dict["clip_g"], token=["<s0>", "<s1>"], text_encoder=pipe.text_encoder_2, tokenizer=pipe.tokenizer_2)
```
And then trying to unload and load again:
```python
pipe.unload_textual_inversion()
# load embeddings of text_encoder 1 (CLIP ViT-L/14)
pipe.load_textual_inversion(state_dict["clip_l"], token=["<s0>", "<s1>"], text_encoder=pipe.text_encoder, tokenizer=pipe.tokenizer)
# load embeddings of text_encoder 2 (CLIP ViT-G/14)
pipe.load_textual_inversion(state_dict["clip_g"], token=["<s0>", "<s1>"], text_encoder=pipe.text_encoder_2, tokenizer=pipe.tokenizer_2)
```
will raise the following exception:
```
ValueError: Token <s0> already in tokenizer vocabulary. Please choose a different token name or remove <s0> and embedding from the tokenizer and text encoder.
```
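A fix (or a manual workaround) would have to do for `tokenizer_2`/`text_encoder_2` what the current unload does for the first pair: drop the appended tokens from the vocabulary and shrink the text encoder's input-embedding matrix to match. Below is a minimal sketch of that shrink step on a toy vocabulary dict and a bare `torch.nn.Embedding`; it assumes (as with textual inversion) that the tokens were appended at the end of the vocabulary, and it is not the diffusers internals, only the shape of the operation:

```python
import torch
from torch import nn


def unload_tokens(embedding: nn.Embedding, vocab: dict, tokens: list) -> nn.Embedding:
    """Drop trailing added tokens from `vocab` and shrink the embedding to match.

    Toy sketch: assumes the textual-inversion tokens occupy the last slots
    of the vocabulary, which is where newly added tokens are appended.
    """
    ids = sorted(vocab[t] for t in tokens)
    if ids != list(range(len(vocab) - len(tokens), len(vocab))):
        raise ValueError("tokens must be the last vocabulary entries")
    for t in tokens:
        del vocab[t]
    new_embedding = nn.Embedding(len(vocab), embedding.embedding_dim)
    new_embedding.weight.data = embedding.weight.data[: len(vocab)].clone()
    return new_embedding


# demo: 4-token vocab with two inversion tokens appended at the end
vocab = {"a": 0, "b": 1, "<s0>": 2, "<s1>": 3}
embedding = nn.Embedding(len(vocab), 8)
embedding = unload_tokens(embedding, vocab, ["<s0>", "<s1>"])
print(embedding.weight.shape)  # → torch.Size([2, 8])
```

A complete fix would run this over both tokenizer/encoder pairs, which is what the current single-pair unload misses.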
Logs
```
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/lora_advanced/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/ubuntu/miniconda3/envs/lora_advanced/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/ubuntu/.vscode-server/extensions/ms-python.debugpy-2024.0.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/__main__.py", line 39, in <module>
cli.main()
File "/home/ubuntu/.vscode-server/extensions/ms-python.debugpy-2024.0.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 430, in main
run()
File "/home/ubuntu/.vscode-server/extensions/ms-python.debugpy-2024.0.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 284, in run_file
runpy.run_path(target, run_name="__main__")
File "/home/ubuntu/.vscode-server/extensions/ms-python.debugpy-2024.0.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 321, in run_path
return _run_module_code(code, init_globals, run_name,
File "/home/ubuntu/.vscode-server/extensions/ms-python.debugpy-2024.0.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 135, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/home/ubuntu/.vscode-server/extensions/ms-python.debugpy-2024.0.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 124, in _run_code
exec(code, run_globals)
File "/home/ubuntu/dev/tailored_generation/tailored_generation/finetune/eval.py", line 76, in <module>
pipe.load_textual_inversion(state_dict["clip_g"], token=["<s0>", "<s1>"], text_encoder=pipe.text_encoder_2, tokenizer=pipe.tokenizer_2)
File "/home/ubuntu/miniconda3/envs/lora_advanced/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
File "/home/ubuntu/miniconda3/envs/lora_advanced/lib/python3.10/site-packages/diffusers/loaders/textual_inversion.py", line 402, in load_textual_inversion
tokens, embeddings = self._retrieve_tokens_and_embeddings(tokens, state_dicts, tokenizer)
File "/home/ubuntu/miniconda3/envs/lora_advanced/lib/python3.10/site-packages/diffusers/loaders/textual_inversion.py", line 229, in _retrieve_tokens_and_embeddings
raise ValueError(
ValueError: Token <s0> already in tokenizer vocabulary. Please choose a different token name or remove <s0> and embedding from the tokenizer and text encoder.
```
System Info
- `diffusers` version: 0.26.2
- Platform: Linux-5.15.0-1036-aws-x86_64-with-glibc2.31
- Python version: 3.10.13
- PyTorch version (GPU?): 2.2.0+cu121 (True)
- Huggingface_hub version: 0.20.3
- Transformers version: 4.37.2
- Accelerate version: 0.27.0
- xFormers version: not installed
- Using GPU in script?: NVIDIA A10G
- Using distributed or parallel set-up in script?: no
Who can help?
@yiyixuxu