
Dreambooth extension causes BLIP interrogation to give an error (if number of beams is changed to anything greater than 1) #1110

Closed
Vigilence opened this issue Mar 23, 2023 · 2 comments
Labels
new Just added, you should probably sort this.

Comments


Vigilence commented Mar 23, 2023

⚠️ If you do not follow the template, your issue may be closed without a response ⚠️

Kindly read and fill this form in its entirety.

0. Initial troubleshooting

Please check each of these before opening an issue. If you've checked them, delete this section of your bug report. Have you:

  • Updated the Stable-Diffusion-WebUI to the latest version? Yes
  • Updated Dreambooth to the latest revision? Yes
  • Completely restarted the stable-diffusion-webUI, not just reloaded the UI? Yes
  • Read the Readme? Yes

1. Please find the following lines in the console and paste them below.

#######################################################################################################
Initializing Dreambooth
If submitting an issue on github, please provide the below text for debugging purposes:

Python revision: 3.10.6 (tags/v3.10.6:9c7b4bd, Aug  1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Dreambooth revision: da2e40415f1cb63cc4de46d6dc97eb8676c6e30c
SD-WebUI revision: a9fed7c364061ae6efb37f797b6b522cb3cf7aa2

Successfully installed accelerate-0.17.1
Successfully installed requests-2.28.2
Successfully installed fastapi-0.90.1 starlette-0.23.1
Successfully installed gitpython-3.1.31
Successfully installed transformers-4.27.2

[+] torch version 1.13.1+cu117 installed.
[+] torchvision version 0.14.1+cu117 installed.
[+] xformers version 0.0.17.dev476 installed.
[+] accelerate version 0.17.1 installed.
[+] diffusers version 0.14.0 installed.
[+] transformers version 4.27.2 installed.
[+] bitsandbytes version 0.35.4 installed.

#######################################################################################################

2. Describe the bug

Installing the latest Dreambooth update causes the built-in CLIP/BLIP interrogator to give an error whenever the number of beams is set to any value greater than 1 (the default). To narrow down which extension causes this issue, I created a fresh installation of Automatic1111 and installed only Dreambooth.

Steps to reproduce:

  1. Create a new installation of Automatic1111.
  2. Upload an image in img2img and press CLIP interrogate (works fine).
  3. Go to Settings → Interrogation settings and change "Interrogate: num_beams for BLIP" to any number greater than 1.
  4. Upload an image in img2img and press BLIP interrogate (gives an error).
  5. Close Automatic1111.
  6. Uninstall Dreambooth and delete the "sd_dreambooth_extension" folder.
  7. Restart Automatic1111.
  8. Upload an image in img2img and press BLIP interrogate (with beams greater than 1). It now works.

Screenshots/Config
If the issue is specific to an error while training, please provide a screenshot of training parameters or the
db_config.json file from /models/dreambooth/MODELNAME/db_config.json

BLIP interrogation error when Dreambooth is installed (and never used before):
(screenshot: 2023-03-23_0-42-43)

BLIP interrogation when Dreambooth is uninstalled and the folder deleted:
(screenshot: 2023-03-23_0-46-24)

Interrogation settings:
(screenshot: 2023-03-23_0-46-45)

3. Provide logs

If a crash has occurred, please provide the entire stack trace from the log, including the last few log messages before the crash occurred.

Launching Web UI with arguments: --medvram --xformers --theme dark
Loading weights [7dce63578a] from E:\Automatic1111\stable-diffusion-webui\models\Stable-diffusion\aaa_eGV10.ckpt
Creating model from config: E:\Automatic1111\stable-diffusion-webui\configs\v1-inference.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Applying xformers cross attention optimization.
Textual inversion embeddings loaded(6): bad-artist-anime, bad-artist, easynegative, pureerosface_v1, Style-Moana-neg, ulzzang-6500-v1.1
Textual inversion embeddings skipped(7): embellish1, embellish2, embellish3, nartfixer, nfixer, nfixernext, nrealfixer
Model loaded in 5.0s (load weights from disk: 2.8s, create model: 0.5s, apply weights to model: 0.8s, apply half(): 0.9s).
CUDA SETUP: Loading binary E:\Automatic1111\stable-diffusion-webui\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cudaall.dll...
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 17.4s (import gradio: 2.1s, import ldm: 1.1s, other imports: 2.1s, list extensions: 0.2s, load scripts: 4.8s, load SD checkpoint: 5.3s, create ui: 1.4s, gradio launch: 0.2s).
load checkpoint from E:\Automatic1111\stable-diffusion-webui\models\BLIP\model_base_caption_capfilt_large.pth
Error interrogating
Traceback (most recent call last):
  File "E:\Automatic1111\stable-diffusion-webui\modules\interrogate.py", line 198, in interrogate
    caption = self.generate_caption(pil_image)
  File "E:\Automatic1111\stable-diffusion-webui\modules\interrogate.py", line 183, in generate_caption
    caption = self.blip_model.generate(gpu_image, sample=False, num_beams=shared.opts.interrogate_clip_num_beams, min_length=shared.opts.interrogate_clip_min_length, max_length=shared.opts.interrogate_clip_max_length)
  File "E:\Automatic1111\stable-diffusion-webui\repositories\BLIP\models\blip.py", line 156, in generate
    outputs = self.text_decoder.generate(input_ids=input_ids,
  File "E:\Automatic1111\stable-diffusion-webui\venv\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "E:\Automatic1111\stable-diffusion-webui\venv\lib\site-packages\transformers\generation\utils.py", line 1490, in generate
    return self.beam_search(
  File "E:\Automatic1111\stable-diffusion-webui\venv\lib\site-packages\transformers\generation\utils.py", line 2749, in beam_search
    outputs = self(
  File "E:\Automatic1111\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "E:\Automatic1111\stable-diffusion-webui\repositories\BLIP\models\med.py", line 886, in forward
    outputs = self.bert(
  File "E:\Automatic1111\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "E:\Automatic1111\stable-diffusion-webui\repositories\BLIP\models\med.py", line 781, in forward
    encoder_outputs = self.encoder(
  File "E:\Automatic1111\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "E:\Automatic1111\stable-diffusion-webui\repositories\BLIP\models\med.py", line 445, in forward
    layer_outputs = layer_module(
  File "E:\Automatic1111\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "E:\Automatic1111\stable-diffusion-webui\repositories\BLIP\models\med.py", line 361, in forward
    cross_attention_outputs = self.crossattention(
  File "E:\Automatic1111\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "E:\Automatic1111\stable-diffusion-webui\repositories\BLIP\models\med.py", line 277, in forward
    self_outputs = self.self(
  File "E:\Automatic1111\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "E:\Automatic1111\stable-diffusion-webui\repositories\BLIP\models\med.py", line 178, in forward
    attention_scores = torch.matmul(query_layer, key_layer.transpose(-1, -2))
RuntimeError: The size of tensor a (13) must match the size of tensor b (169) at non-singleton dimension 0
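
The failing line is the attention-score matmul in BLIP's cross-attention, and the batch sizes in the message are suggestive: 169 = 13 × 13, which is consistent with the encoder (image) states being beam-expanded twice — once by the BLIP repo and once again by the newer transformers beam-search path. The shapes below are hypothetical stand-ins chosen only to match the error message; this is a minimal NumPy sketch of the broadcasting failure, not BLIP code:

```python
import numpy as np

num_beams = 13  # hypothetical beam count matching the error message

# Decoder query states: batch expanded once, to num_beams.
query = np.random.rand(num_beams, 1, 64)

# Encoder (image) key states: hypothetically expanded twice, to num_beams**2.
key = np.random.rand(num_beams * num_beams, 197, 64)

try:
    # Same operation as torch.matmul(query_layer, key_layer.transpose(-1, -2)):
    np.matmul(query, key.transpose(0, 2, 1))
except ValueError as e:
    # Batch dims 13 and 169 cannot be broadcast against each other.
    print("shape mismatch:", e)
```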

4. Environment

What OS? Windows 10

If Windows - WSL or native? Native

What GPU are you using? 2080 8 GB

Vigilence added the "new" label Mar 23, 2023
ArrowM added a commit that referenced this issue Mar 23, 2023
Reverting transformers for #1110
ArrowM (Collaborator) commented Mar 23, 2023

It's the transformers library. I updated the dev branch to transformers==4.26.1, which seems to work. The dev branch will eventually be merged into main. Please check out the dev branch, or modify the transformers line of extensions/sd_dreambooth_extension/requirements.txt to read:

transformers==4.26.1
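
The edit above can also be scripted. Here is a minimal sketch; `pin_transformers` is a hypothetical helper (not part of the extension) that rewrites whatever transformers requirement is present to the pinned version:

```python
import re

def pin_transformers(requirements_text: str, version: str = "4.26.1") -> str:
    """Replace any transformers requirement line with a pinned version."""
    out = []
    for line in requirements_text.splitlines():
        # Match lines like "transformers", "transformers==X", "transformers>=X".
        if re.match(r"^transformers([=<>!~ ]|$)", line.strip()):
            out.append(f"transformers=={version}")
        else:
            out.append(line)
    return "\n".join(out)

print(pin_transformers("accelerate==0.17.1\ntransformers>=4.27.2\n"))
# prints:
# accelerate==0.17.1
# transformers==4.26.1
```

After updating the file, reinstall inside the WebUI's venv (e.g. `pip install -r extensions/sd_dreambooth_extension/requirements.txt`) and restart fully.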

@ArrowM ArrowM closed this as completed Mar 23, 2023
Vigilence (Author) commented Mar 23, 2023

Your solution fixed the issue on my end, thank you!
