
[Bug]: Creating new embedding.pt fails pickle check #15214

Open
3 of 6 tasks
BBird6 opened this issue Mar 11, 2024 · 8 comments
Labels
bug-report Report of a bug, yet to be confirmed

Comments

@BBird6

BBird6 commented Mar 11, 2024

Checklist

  • The issue exists after disabling all extensions
  • The issue exists on a clean installation of webui
  • The issue is caused by an extension, but I believe it is caused by a bug in the webui
  • The issue exists in the current version of the webui
  • The issue has not been reported before recently
  • The issue has been reported before but has not been fixed yet

What happened?

Whenever I create a new embedding, the pickle check fails to verify the newly created file.
Old embeddings are read without any problem.
(This is my first new TI training since the 1.8.0 update)

Steps to reproduce the problem

  1. Training tab
  2. Create embedding

What should have happened?

.pt file should pass the pickle check

What browsers do you use to access the UI ?

Google Chrome

Sysinfo

sysinfo-2024-03-11-06-37.json

Console logs

venv "C:\Users\black\stable-diffusion-webui\stable-diffusion-webui\venv\Scripts\Python.exe"
Python 3.10.11 (tags/v3.10.11:7d4cc5a, Apr  5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)]
Version: v1.8.0
Commit hash: bef51aed032c0aaa5cfd80445bc4cf0d85b408b5
Launching Web UI with arguments: --xformers
2024-03-11 07:46:14.167317: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
WARNING:tensorflow:From C:\Users\black\stable-diffusion-webui\stable-diffusion-webui\venv\lib\site-packages\keras\src\losses.py:2976: The name tf.losses.sparse_softmax_cross_entropy is deprecated. Please use tf.compat.v1.losses.sparse_softmax_cross_entropy instead.

Loading weights [6ce0161689] from C:\Users\black\stable-diffusion-webui\stable-diffusion-webui\models\Stable-diffusion\v1-5-pruned-emaonly.safetensors
Creating model from config: C:\Users\black\stable-diffusion-webui\stable-diffusion-webui\configs\v1-inference.yaml
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 16.2s (prepare environment: 2.2s, import torch: 4.4s, import gradio: 1.0s, setup paths: 6.3s, initialize shared: 0.2s, other imports: 0.6s, list SD models: 0.1s, load scripts: 0.6s, create ui: 0.5s, gradio launch: 0.3s).
Applying attention optimization: xformers... done.
*** Error verifying pickled file from C:\Users\black\stable-diffusion-webui\stable-diffusion-webui\embeddings\Test_embed.pt
*** The file may be malicious, so the program is not going to read it.
*** You can skip this check with --disable-safe-unpickle commandline argument.
***
    Traceback (most recent call last):
      File "C:\Users\black\stable-diffusion-webui\stable-diffusion-webui\modules\safe.py", line 137, in load_with_extra
        check_pt(filename, extra_handler)
      File "C:\Users\black\stable-diffusion-webui\stable-diffusion-webui\modules\safe.py", line 84, in check_pt
        check_zip_filenames(filename, z.namelist())
      File "C:\Users\black\stable-diffusion-webui\stable-diffusion-webui\modules\safe.py", line 76, in check_zip_filenames
        raise Exception(f"bad file inside {filename}: {name}")
    Exception: bad file inside C:\Users\black\stable-diffusion-webui\stable-diffusion-webui\embeddings\Test_embed.pt: Test_embed/byteorder

---
*** Error loading embedding Test_embed.pt
    Traceback (most recent call last):
      File "C:\Users\black\stable-diffusion-webui\stable-diffusion-webui\modules\textual_inversion\textual_inversion.py", line 203, in load_from_dir
        self.load_from_file(fullfn, fn)
      File "C:\Users\black\stable-diffusion-webui\stable-diffusion-webui\modules\textual_inversion\textual_inversion.py", line 184, in load_from_file
        embedding = create_embedding_from_data(data, name, filename=filename, filepath=path)
      File "C:\Users\black\stable-diffusion-webui\stable-diffusion-webui\modules\textual_inversion\textual_inversion.py", line 284, in create_embedding_from_data
        if 'string_to_param' in data:  # textual inversion embeddings
    TypeError: argument of type 'NoneType' is not iterable

---
Model loaded in 3.6s (load weights from disk: 0.1s, create model: 0.5s, apply weights to model: 2.1s, load textual inversion embeddings: 0.5s, calculate empty prompt: 0.1s).

Additional information

No response

BBird6 added the bug-report label on Mar 11, 2024
@Aleh2

Aleh2 commented Mar 19, 2024

Just ran into this issue myself. It's genuinely bizarre.

@aleksusklim

In my EmbeddingMerge extension I am using this function internally to create a template embedding file for SD1 models.

Two users are telling me that they also have this "unsafe pickle" error in my extension: one of them decided to add --disable-safe-unpickle, while another one is asking me to change .pt to .safetensors format to fix this error.

The strange thing is that for many other people there are no errors whatsoever, myself included!
I even tried to add torch.load(…, weights_only=True) when loading a properly saved embedding, but it didn't help.
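For reference, the loading attempt looked roughly like this (a sketch, not the exact extension code; the path is only a placeholder):

import torch

# Sketch of the attempt described above (the real call sits inside the
# EmbeddingMerge extension; the path here is only a placeholder).
# weights_only=True restricts unpickling to tensors and plain containers,
# but as noted above it didn't help with the error.
data = torch.load("Test_embed.pt", map_location="cpu", weights_only=True)
print(type(data))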

Even though the error says something about byteorder, that user has a pretty standard Intel machine with Windows 10.

Since the same issue is happening with vanilla WebUI after creating a new embedding in the Train tab, I believe it should be fixed here upstream, rather than in my own extension. Right?
(Putting aside the fact that "nobody uses TI train in WebUI anymore" and that here it also throws a completely different error for SDXL models since no training support exists for them)

The core problem might be deeper than just unsafe pickles, because it is not even happening for the majority of users.
Does anybody have a clue what might be causing this?

@BBird6
Author

BBird6 commented Mar 26, 2024

Yeah, it is the byteorder tag which makes it different from older embeddings.
Yesterday I installed version 1.7 of Automatic1111, created the .pt file there, and it didn't have the byteorder entry in the file.
Then I took the file over to the 1.8 embeddings folder, and started training there.
On the first step of writing the .pt after training, the 1.8 version added the byteorder entry back, making the file unusable.
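A quick way to see the difference is to list the members of the archive; a minimal sketch (the path is just an example):

import zipfile

# A .pt file saved with torch's newer zip-based format is an ordinary zip
# archive, so listing its members shows whether the extra "byteorder"
# entry (the one modules/safe.py rejects) is present.
path = r"embeddings\Test_embed.pt"  # example path
with zipfile.ZipFile(path) as z:
    names = z.namelist()
for name in names:
    print(name)
print("byteorder entry present:", any(n.endswith("/byteorder") for n in names))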

It has nothing to do with your extension, as I don't use it.
--disable-safe-unpickle is not an option for me, as v1.8 won't even read it as a valid embedding.

@evilspoons

I am running into the same thing. I have a very loose grasp of what is happening in training - I just follow a tutorial - and the tutorial I used on 1.7.x doesn't work on 1.8.x because of this exact same issue. My old embeddings still work fine, but I can't create any new ones. Really annoying.

@Tower13Studios

> Two users are telling me that they also have this "unsafe pickle" error in my extension: one of them decided to add --disable-safe-unpickle, while another one is asking me to change .pt to .safetensors format to fix this error.

--disable-safe-unpickle just ignores the error messages; if you created an embedding for publishing, everyone else would run into the error with this embedding.

Converting to safetensors doesn't work; the script goes straight to a callback error due to the corrupted file.

The byteorder entry seems to be the difference, but the .data folder also looks odd compared to my working files.

@LingXuanYin

I'm trying to fix this. If you are in a hurry, try editing modules/textual_inversion/textual_inversion.py at line 64 and line 71 like this:
[screenshot of the edited lines]
After doing this, restart the webui and recreate the embedding.
This could cause some performance issues, but training and inference should be correct now.
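In a standalone sketch, assuming the edit switches the torch.save calls to the legacy (non-zip) serialization (the actual lines in the screenshot may differ), it would look roughly like this; the embedding dict below is only an illustration:

import torch

# Assumed form of the workaround (the exact edited lines may differ):
# saving with the legacy, non-zipfile serialization produces a .pt file
# without the "byteorder" record that trips modules/safe.py.
# The dict below is only an illustration of an embedding payload.
embedding_data = {
    "string_to_param": {"*": torch.zeros(1, 768)},
    "name": "Test_embed",
    "step": 0,
}
torch.save(embedding_data, "Test_embed.pt", _use_new_zipfile_serialization=False)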

@MisterSeajay
Contributor

I'm seeing the same problem using...

Python 3.10.11 (tags/v3.10.11:7d4cc5a, Apr  5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)]
Version: v1.9.0-62-gddb28b33
Commit hash: ddb28b33a3561a360b429c76f28f7ff1ffe282a0
CUDA 12.1
Launching Web UI with arguments: --data-dir=S:\GenAI\ImageGen\data --clip-models-path=S:\GenAI\ImageGen\data\models\clip --ckpt-dir=S:\GenAI\ImageGen\data\models\checkpoints --embeddings-dir=S:\GenAI\ImageGen\data\models\embeddings --vae-dir=S:\GenAI\ImageGen\data\models\vaeE --styles-file=S:\GenAI\ImageGen\data\*.csv --enable-insecure-extension-access --listen --theme=dark --no-half-vae --xformers

Another symptom I didn't see mentioned: when creating an embedding, the UI doesn't give any hint of a problem, but when you switch from Create embedding to the Train sub-tab, the new embedding isn't present in the list of available embeddings.

@MisterSeajay
Contributor

MisterSeajay commented May 22, 2024

So, the PyTorch tutorial for saving and loading models mentions that the zip format changed in PyTorch 1.6+:

> The 1.6 release of PyTorch switched torch.save to use a new zip file-based format. torch.load still retains the ability to load files in the old format. If for any reason you want torch.save to use the old format, pass the kwarg parameter _use_new_zipfile_serialization=False.

... and I can confirm that the workaround above by @LingXuanYin works. At least, I'm now able to create an embedding file without triggering this error and the new embedding DOES show on the list of Embeddings in the Train tab.

I'm unclear whether #15774 is a fix (rather than just a workaround). Would a better solution be to make it possible to load .pt files with the new serialization format? I can only assume that's much more work.
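For context, a sketch of what accepting the new format might involve. This is not the actual code in modules/safe.py nor the change in #15774, just an illustration: the failing check is a filename allowlist over the zip members, so the new format's extra entries (such as <name>/byteorder) would need to be tolerated.

import re
import zipfile

# Illustrative only (not the real modules/safe.py nor the #15774 patch):
# an allowlist in the spirit of check_zip_filenames, extended so that the
# "byteorder" member written by newer torch versions is accepted while
# anything unexpected still raises. The thread also mentions an odd .data
# folder, so a real fix may need to cover more entries than this sketch.
allowed_zip_names_re = re.compile(r"^([^/]+)/((data/\d+)|version|byteorder|(data\.pkl))$")

def check_zip_filenames(filename, names):
    for name in names:
        if not allowed_zip_names_re.match(name):
            raise Exception(f"bad file inside {filename}: {name}")

with zipfile.ZipFile("Test_embed.pt") as z:  # example path
    check_zip_filenames("Test_embed.pt", z.namelist())
print("all zip members accepted")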
