Stable Diffusion 3 support #16030

Open
wants to merge 24 commits into
base: dev

@AUTOMATIC1111
Owner

AUTOMATIC1111 commented Jun 16, 2024

Description

  • add support for infinite length prompts
  • add support for using optimized attention (currently uses SDP regardless of the setting, with some einsums inside) (do we even need that?)
  • add support for not loading T5
  • add support for counting tokens
  • add support for DDIM and other timesteps samplers
  • investigate why DPM++ 2M Karras isn't working well
  • go through generation with profiler
  • support img2img
  • support medvram

Screenshots/videos:

firefox_D27dsqzVE7

@protector131090

protector131090 commented Jun 16, 2024

Error on loading model. AttributeError: 'NoneType' object has no attribute 'is_sdxl'

@AUTOMATIC1111
Owner Author

protector131090: Share the full stack trace from the console.

@protector131090

protector131090: Share the full stack trace from the console.

Traceback (most recent call last):
File "C:\sd.webui\copy\venv\lib\site-packages\gradio\routes.py", line 488, in run_predict
output = await app.get_blocks().process_api(
File "C:\sd.webui\copy\venv\lib\site-packages\gradio\blocks.py", line 1429, in process_api
inputs = self.preprocess_data(fn_index, inputs, state)
File "C:\sd.webui\copy\venv\lib\site-packages\gradio\blocks.py", line 1222, in preprocess_data
self.validate_inputs(fn_index, inputs)
File "C:\sd.webui\copy\venv\lib\site-packages\gradio\blocks.py", line 1209, in validate_inputs
raise ValueError(
ValueError: An event handler (modelmerger) didn't receive enough input values (needed: 16, got: 0).
Check if the event handler calls a Javascript function, and make sure its return value is correct.
Wanted inputs:
[label, dropdown, dropdown, dropdown, radio, slider, checkbox, textbox, radio, radio, dropdown, textbox, checkbox, checkbox, checkbox, textbox]
Received inputs:
[]

@AUTOMATIC1111
Owner Author

Okay, well, you're trying to merge models, not generate images.

@protector131090

protector131090 commented Jun 16, 2024

Screenshot (2873)

Okay, well, you're trying to merge models, not generate images.

What do you mean? I switch models in the UI as usual. Here is the log for: switch to epic realism, generate, switch to Realvis 4, generate, then switch to sd3 medium:

Applying attention optimization: xformers... done.
Model loaded in 8.6s (load weights from disk: 0.2s, create model: 0.4s, apply weights to model: 7.4s, move model to device: 0.4s).
Using already loaded model epicrealism_naturalSinRC1VAE.safetensors [84d76a0328]: done in 3.6s (send model to cpu: 2.6s, send model to device: 0.9s)
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:03<00:00, 6.45it/s]
Using already loaded model realvisxlV40_v40Bakedvae.safetensors [912c9dc74f]: done in 3.5s (send model to cpu: 0.7s, send model to device: 2.8s)
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:04<00:00, 4.13it/s]
Reusing loaded model epicrealism_naturalSinRC1VAE.safetensors [84d76a0328] to load comfy\sd3_medium.safetensors [cc236278d2]
Loading weights [cc236278d2] from C:\sd.webui\copy\models\Stable-diffusion\comfy\sd3_medium.safetensors
Creating model from config: C:\sd.webui\copy\configs\sd3-inference.yaml
mmdit initializing with: input_size=None, patch_size=2, in_channels=16, depth=24, mlp_ratio=4.0, learn_sigma=False, adm_in_channels=2048, context_embedder_config={'target': 'torch.nn.Linear', 'params': {'in_features': 4096, 'out_features': 1536}}, register_length=0, attn_mode='torch', rmsnorm=False, scale_mod_only=False, swiglu=False, out_channels=None, pos_embed_scaling_factor=None, pos_embed_offset=None, pos_embed_max_size=192, num_patches=36864, qk_norm=None, qkv_bias=True, dtype=torch.float16, device='cpu'
creating model quickly: TypeError
Traceback (most recent call last):
File "C:\Program Files\Python310\lib\threading.py", line 973, in _bootstrap
self._bootstrap_inner()
File "C:\Program Files\Python310\lib\threading.py", line 1016, in _bootstrap_inner
self.run()
File "C:\sd.webui\copy\venv\lib\site-packages\anyio_backends_asyncio.py", line 807, in run
result = context.run(func, *args)
File "C:\sd.webui\copy\venv\lib\site-packages\gradio\utils.py", line 707, in wrapper
response = f(*args, **kwargs)
File "C:\sd.webui\copy\modules\ui_settings.py", line 316, in
fn=lambda value, k=k: self.run_settings_single(value, key=k),
File "C:\sd.webui\copy\modules\ui_settings.py", line 95, in run_settings_single
if value is None or not opts.set(key, value):
File "C:\sd.webui\copy\modules\options.py", line 165, in set
option.onchange()
File "C:\sd.webui\copy\modules\call_queue.py", line 14, in f
res = func(*args, **kwargs)
File "C:\sd.webui\copy\modules\initialize_util.py", line 181, in
shared.opts.onchange("sd_model_checkpoint", wrap_queued_call(lambda: sd_models.reload_model_weights()), call=False)
File "C:\sd.webui\copy\modules\sd_models.py", line 969, in reload_model_weights
load_model(checkpoint_info, already_loaded_state_dict=state_dict)
File "C:\sd.webui\copy\modules\sd_models.py", line 809, in load_model
sd_model = instantiate_from_config(sd_config.model, state_dict)
File "C:\sd.webui\copy\modules\sd_models.py", line 764, in instantiate_from_config
return constructor(**params)
File "C:\sd.webui\copy\modules\models\sd3\sd3_model.py", line 136, in init
self.cond_stage_model = SD3Cond()
File "C:\sd.webui\copy\modules\models\sd3\sd3_model.py", line 61, in init
self.tokenizer = SD3Tokenizer()
File "C:\sd.webui\copy\modules\models\sd3\other_impls.py", line 207, in init
self.t5xxl = T5XXLTokenizer()
File "C:\sd.webui\copy\modules\models\sd3\other_impls.py", line 305, in init
super().init(pad_with_end=False, tokenizer=T5TokenizerFast.from_pretrained("google/t5-v1_1-xxl"), has_start_token=False, pad_to_max_length=False, max_length=99999999, min_length=77)
File "C:\sd.webui\copy\venv\lib\site-packages\transformers\tokenization_utils_base.py", line 1825, in from_pretrained
return cls._from_pretrained(
File "C:\sd.webui\copy\venv\lib\site-packages\transformers\tokenization_utils_base.py", line 1988, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
File "C:\sd.webui\copy\venv\lib\site-packages\transformers\models\t5\tokenization_t5_fast.py", line 133, in init
super().init(
File "C:\sd.webui\copy\venv\lib\site-packages\transformers\tokenization_utils_fast.py", line 114, in init
fast_tokenizer = convert_slow_tokenizer(slow_tokenizer)
File "C:\sd.webui\copy\venv\lib\site-packages\transformers\convert_slow_tokenizer.py", line 1307, in convert_slow_tokenizer
return converter_class(transformer_tokenizer).converted()
File "C:\sd.webui\copy\venv\lib\site-packages\transformers\convert_slow_tokenizer.py", line 445, in init
from .utils import sentencepiece_model_pb2 as model_pb2
File "C:\sd.webui\copy\venv\lib\site-packages\transformers\utils\sentencepiece_model_pb2.py", line 91, in
_descriptor.EnumValueDescriptor(
File "C:\sd.webui\copy\venv\lib\site-packages\google\protobuf\descriptor.py", line 789, in new
_message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors cannot be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:

  1. Downgrade the protobuf package to 3.20.x or lower.
  2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates

Failed to create model quickly; will retry using slow method.
mmdit initializing with: input_size=None, patch_size=2, in_channels=16, depth=24, mlp_ratio=4.0, learn_sigma=False, adm_in_channels=2048, context_embedder_config={'target': 'torch.nn.Linear', 'params': {'in_features': 4096, 'out_features': 1536}}, register_length=0, attn_mode='torch', rmsnorm=False, scale_mod_only=False, swiglu=False, out_channels=None, pos_embed_scaling_factor=None, pos_embed_offset=None, pos_embed_max_size=192, num_patches=36864, qk_norm=None, qkv_bias=True, dtype=torch.float16, device='cpu'
changing setting sd_model_checkpoint to comfy\sd3_medium.safetensors [cc236278d2]: TypeError
Traceback (most recent call last):
File "C:\sd.webui\copy\modules\options.py", line 165, in set
option.onchange()
File "C:\sd.webui\copy\modules\call_queue.py", line 14, in f
res = func(*args, **kwargs)
File "C:\sd.webui\copy\modules\initialize_util.py", line 181, in
shared.opts.onchange("sd_model_checkpoint", wrap_queued_call(lambda: sd_models.reload_model_weights()), call=False)
File "C:\sd.webui\copy\modules\sd_models.py", line 969, in reload_model_weights
load_model(checkpoint_info, already_loaded_state_dict=state_dict)
File "C:\sd.webui\copy\modules\sd_models.py", line 818, in load_model
sd_model = instantiate_from_config(sd_config.model, state_dict)
File "C:\sd.webui\copy\modules\sd_models.py", line 764, in instantiate_from_config
return constructor(**params)
File "C:\sd.webui\copy\modules\models\sd3\sd3_model.py", line 136, in init
self.cond_stage_model = SD3Cond()
File "C:\sd.webui\copy\modules\models\sd3\sd3_model.py", line 61, in init
self.tokenizer = SD3Tokenizer()
File "C:\sd.webui\copy\modules\models\sd3\other_impls.py", line 207, in init
self.t5xxl = T5XXLTokenizer()
File "C:\sd.webui\copy\modules\models\sd3\other_impls.py", line 305, in init
super().init(pad_with_end=False, tokenizer=T5TokenizerFast.from_pretrained("google/t5-v1_1-xxl"), has_start_token=False, pad_to_max_length=False, max_length=99999999, min_length=77)
File "C:\sd.webui\copy\venv\lib\site-packages\transformers\tokenization_utils_base.py", line 1825, in from_pretrained
return cls._from_pretrained(
File "C:\sd.webui\copy\venv\lib\site-packages\transformers\tokenization_utils_base.py", line 1988, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
File "C:\sd.webui\copy\venv\lib\site-packages\transformers\models\t5\tokenization_t5_fast.py", line 133, in init
super().init(
File "C:\sd.webui\copy\venv\lib\site-packages\transformers\tokenization_utils_fast.py", line 114, in init
fast_tokenizer = convert_slow_tokenizer(slow_tokenizer)
File "C:\sd.webui\copy\venv\lib\site-packages\transformers\convert_slow_tokenizer.py", line 1307, in convert_slow_tokenizer
return converter_class(transformer_tokenizer).converted()
File "C:\sd.webui\copy\venv\lib\site-packages\transformers\convert_slow_tokenizer.py", line 445, in init
from .utils import sentencepiece_model_pb2 as model_pb2
File "C:\sd.webui\copy\venv\lib\site-packages\transformers\utils\sentencepiece_model_pb2.py", line 28, in
DESCRIPTOR = _descriptor.FileDescriptor(
File "C:\sd.webui\copy\venv\lib\site-packages\google\protobuf\descriptor.py", line 1072, in new
return _message.default_pool.AddSerializedFile(serialized_pb)
TypeError: Couldn't build proto file into descriptor pool: duplicate file name sentencepiece_model.proto

Traceback (most recent call last):
File "C:\sd.webui\copy\venv\lib\site-packages\gradio\routes.py", line 488, in run_predict
output = await app.get_blocks().process_api(
File "C:\sd.webui\copy\venv\lib\site-packages\gradio\blocks.py", line 1431, in process_api
result = await self.call_function(
File "C:\sd.webui\copy\venv\lib\site-packages\gradio\blocks.py", line 1103, in call_function
prediction = await anyio.to_thread.run_sync(
File "C:\sd.webui\copy\venv\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "C:\sd.webui\copy\venv\lib\site-packages\anyio_backends_asyncio.py", line 877, in run_sync_in_worker_thread
return await future
File "C:\sd.webui\copy\venv\lib\site-packages\anyio_backends_asyncio.py", line 807, in run
result = context.run(func, *args)
File "C:\sd.webui\copy\venv\lib\site-packages\gradio\utils.py", line 707, in wrapper
response = f(*args, **kwargs)
File "C:\sd.webui\copy\extensions\sd-webui-animatediff\scripts\animatediff_ui.py", line 273, in refresh_models
new_model_list = self.get_model_list()
File "C:\sd.webui\copy\extensions\sd-webui-animatediff\scripts\animatediff_ui.py", line 269, in get_model_list
return [f for f in os.listdir(model_dir) if f != ".gitkeep" and not any(tag in f for tag in get_sd_rm_tag())]
File "C:\sd.webui\copy\extensions\sd-webui-animatediff\scripts\animatediff_ui.py", line 269, in
return [f for f in os.listdir(model_dir) if f != ".gitkeep" and not any(tag in f for tag in get_sd_rm_tag())]
File "C:\sd.webui\copy\extensions\sd-webui-animatediff\scripts\animatediff_ui.py", line 261, in get_sd_rm_tag
if shared.sd_model.is_sdxl:
AttributeError: 'NoneType' object has no attribute 'is_sdxl'
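
Note on the traceback above: it points at `get_sd_rm_tag` in the animatediff extension reading `shared.sd_model.is_sdxl` while `shared.sd_model` is `None`, presumably because the SD3 load earlier in the log failed. A minimal sketch of a defensive check follows. It assumes nothing about the extension beyond that one line, and `model_is_sdxl` is a hypothetical helper, not part of webui or the extension.

```python
from types import SimpleNamespace


def model_is_sdxl(sd_model):
    """Return True only when a model is loaded and flags itself as SDXL.

    Hypothetical helper: `sd_model` stands in for `shared.sd_model`,
    which can be None after a failed checkpoint load.
    """
    # getattr with a default returns False for None instead of raising
    # AttributeError, and also covers models without an is_sdxl attribute.
    return bool(getattr(sd_model, "is_sdxl", False))


if __name__ == "__main__":
    print(model_is_sdxl(None))                            # no model loaded
    print(model_is_sdxl(SimpleNamespace(is_sdxl=True)))   # SDXL-like object
```

With a guard of this kind the UI callback would degrade to "not SDXL" rather than crash, so the underlying load failure stays the only error in the console.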

@AUTOMATIC1111
Owner Author

Well, that now does look like the right stack trace. Can you try running with the environment variable PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (i.e. add set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python to webui-user.bat) to see if that changes anything?
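
For anyone launching from Python rather than from webui-user.bat, the same workaround can be sketched as below. This assumes the variable has to be in place before `google.protobuf` is first imported, since that is when the implementation is chosen as far as I know. `force_pure_python_protobuf` is a hypothetical helper name, and the protobuf error message itself warns that pure-Python parsing is much slower.

```python
import os

ENV_VAR = "PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"


def force_pure_python_protobuf(env=None):
    """Set the protobuf implementation variable to "python".

    `env` defaults to os.environ; a plain dict can be passed for testing.
    Hypothetical helper, not part of webui or protobuf.
    """
    env = os.environ if env is None else env
    env[ENV_VAR] = "python"
    return env[ENV_VAR]


if __name__ == "__main__":
    # Call this before anything imports transformers or google.protobuf.
    print(ENV_VAR, "=", force_pure_python_protobuf())
```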

@protector131090

protector131090 commented Jun 16, 2024

Well, that now does look like the right stack trace. Can you try running with the environment variable PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (i.e. add set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python to webui-user.bat) to see if that changes anything?

@echo off

set PYTHON=
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS=--xformers --ckpt-dir "S:\SD CHECKPOINTS" --disable-safe-unpickle --api
set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python

call webui.bat

venv "C:\sd.webui\copy\venv\Scripts\Python.exe"
Python 3.10.10 (tags/v3.10.10:aad5f6a, Feb 7 2023, 17:20:36) [MSC v.1929 64 bit (AMD64)]
Version: v1.9.4-173-g7ee2114c
Commit hash: 7ee2114
[Auto-Photoshop-SD] Attempting auto-update...
[Auto-Photoshop-SD] switch branch to extension branch.
checkout_result: Your branch is up to date with 'origin/master'.

[Auto-Photoshop-SD] Current Branch.
branch_result: * master

[Auto-Photoshop-SD] Fetch upstream.
fetch_result:
[Auto-Photoshop-SD] Pull upstream.
pull_result: Already up to date.
Launching Web UI with arguments: --xformers --ckpt-dir S:\SD CHECKPOINTS --disable-safe-unpickle --api
python_server_full_path: C:\sd.webui\copy\extensions\Auto-Photoshop-StableDiffusion-Plugin\server/python_server
[-] ADetailer initialized. version: 24.5.1, num models: 10
[AddNet] Updating model hashes...
0it [00:00, ?it/s]
[AddNet] Updating model hashes...
0it [00:00, ?it/s]
ControlNet preprocessor location: C:\sd.webui\webui\extensions\sd-webui-controlnet\annotator\downloads
2024-06-16 10:23:22,252 - ControlNet - INFO - ControlNet v1.1.449
Loading weights [912c9dc74f] from S:\SD CHECKPOINTS\realvisxlV40_v40Bakedvae.safetensors
Creating model from config: C:\sd.webui\copy\repositories\generative-models\configs\inference\sd_xl_base.yaml
Applying attention optimization: xformers... done.
Model loaded in 3.5s (load weights from disk: 0.3s, create model: 0.3s, apply weights to model: 2.5s, calculate empty prompt: 0.1s).
2024-06-16 10:23:26,357 - ControlNet - INFO - ControlNet UI callback registered.
C:\sd.webui\copy\extensions\sd-webui-additional-networks\scripts\metadata_editor.py:399: GradioDeprecationWarning: The style method is deprecated. Please set these arguments in the constructor instead.
with gr.Row().style(equal_height=False):
C:\sd.webui\copy\extensions\sd-webui-additional-networks\scripts\metadata_editor.py:521: GradioDeprecationWarning: The style method is deprecated. Please set these arguments in the constructor instead.
cover_image = gr.Image(
Running on local URL: http://127.0.0.1:7860

To create a public link, set share=True in launch().
Startup time: 25.6s (prepare environment: 11.0s, import torch: 3.0s, import gradio: 0.8s, setup paths: 0.8s, initialize shared: 0.2s, other imports: 0.4s, load scripts: 2.7s, create ui: 5.3s, gradio launch: 0.3s, add APIs: 0.9s).
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:03<00:00, 6.59it/s]
Loading model comfy\sd3_medium.safetensors [cc236278d2] (2 out of 2)
Loading weights [cc236278d2] from C:\sd.webui\copy\models\Stable-diffusion\comfy\sd3_medium.safetensors
Creating model from config: C:\sd.webui\copy\configs\sd3-inference.yaml
mmdit initializing with: input_size=None, patch_size=2, in_channels=16, depth=24, mlp_ratio=4.0, learn_sigma=False, adm_in_channels=2048, context_embedder_config={'target': 'torch.nn.Linear', 'params': {'in_features': 4096, 'out_features': 1536}}, register_length=0, attn_mode='torch', rmsnorm=False, scale_mod_only=False, swiglu=False, out_channels=None, pos_embed_scaling_factor=None, pos_embed_offset=None, pos_embed_max_size=192, num_patches=36864, qk_norm=None, qkv_bias=True, dtype=torch.float16, device='cpu'
changing setting sd_model_checkpoint to comfy\sd3_medium.safetensors [cc236278d2]: OutOfMemoryError
Traceback (most recent call last):
File "C:\sd.webui\copy\modules\options.py", line 165, in set
option.onchange()
File "C:\sd.webui\copy\modules\call_queue.py", line 14, in f
res = func(*args, **kwargs)
File "C:\sd.webui\copy\modules\initialize_util.py", line 181, in
shared.opts.onchange("sd_model_checkpoint", wrap_queued_call(lambda: sd_models.reload_model_weights()), call=False)
File "C:\sd.webui\copy\modules\sd_models.py", line 950, in reload_model_weights
sd_model = reuse_model_from_already_loaded(sd_model, checkpoint_info, timer)
File "C:\sd.webui\copy\modules\sd_models.py", line 916, in reuse_model_from_already_loaded
load_model(checkpoint_info)
File "C:\sd.webui\copy\modules\sd_models.py", line 834, in load_model
load_model_weights(sd_model, checkpoint_info, state_dict, timer)
File "C:\sd.webui\copy\modules\sd_models.py", line 436, in load_model_weights
model.load_state_dict(state_dict, strict=False)
File "C:\sd.webui\copy\modules\sd_disable_initialization.py", line 223, in
module_load_state_dict = self.replace(torch.nn.Module, 'load_state_dict', lambda *args, **kwargs: load_state_dict(module_load_state_dict, *args, **kwargs))
File "C:\sd.webui\copy\modules\sd_disable_initialization.py", line 221, in load_state_dict
original(module, state_dict, strict=strict)
File "C:\sd.webui\copy\venv\lib\site-packages\torch\nn\modules\module.py", line 2138, in load_state_dict
load(self, state_dict)
File "C:\sd.webui\copy\venv\lib\site-packages\torch\nn\modules\module.py", line 2126, in load
load(child, child_state_dict, child_prefix)
File "C:\sd.webui\copy\venv\lib\site-packages\torch\nn\modules\module.py", line 2126, in load
load(child, child_state_dict, child_prefix)
File "C:\sd.webui\copy\venv\lib\site-packages\torch\nn\modules\module.py", line 2126, in load
load(child, child_state_dict, child_prefix)
[Previous line repeated 7 more times]
File "C:\sd.webui\copy\venv\lib\site-packages\torch\nn\modules\module.py", line 2120, in load
module._load_from_state_dict(
File "C:\sd.webui\copy\modules\sd_disable_initialization.py", line 225, in
linear_load_from_state_dict = self.replace(torch.nn.Linear, '_load_from_state_dict', lambda *args, **kwargs: load_from_state_dict(linear_load_from_state_dict, *args, **kwargs))
File "C:\sd.webui\copy\modules\sd_disable_initialization.py", line 191, in load_from_state_dict
module._parameters[name] = torch.nn.parameter.Parameter(torch.zeros_like(param, device=device, dtype=dtype), requires_grad=param.requires_grad)
File "C:\sd.webui\copy\venv\lib\site-packages\torch_meta_registrations.py", line 4507, in zeros_like
res = aten.empty_like.default(
File "C:\sd.webui\copy\venv\lib\site-packages\torch_ops.py", line 448, in call
return self.op(*args, **kwargs or {})
File "C:\sd.webui\copy\venv\lib\site-packages\torch\_refs\__init__.py", line 4681, in empty_like
return torch.empty_permuted(
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 160.00 MiB. GPU 0 has a total capacty of 23.99 GiB of which 0 bytes is free. Of the allocated memory 22.79 GiB is allocated by PyTorch, and 184.33 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Traceback (most recent call last):
File "C:\sd.webui\copy\venv\lib\site-packages\gradio\routes.py", line 488, in run_predict
output = await app.get_blocks().process_api(
File "C:\sd.webui\copy\venv\lib\site-packages\gradio\blocks.py", line 1431, in process_api
result = await self.call_function(
File "C:\sd.webui\copy\venv\lib\site-packages\gradio\blocks.py", line 1103, in call_function
prediction = await anyio.to_thread.run_sync(
File "C:\sd.webui\copy\venv\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "C:\sd.webui\copy\venv\lib\site-packages\anyio_backends_asyncio.py", line 877, in run_sync_in_worker_thread
return await future
File "C:\sd.webui\copy\venv\lib\site-packages\anyio_backends_asyncio.py", line 807, in run
result = context.run(func, *args)
File "C:\sd.webui\copy\venv\lib\site-packages\gradio\utils.py", line 707, in wrapper
response = f(*args, **kwargs)
File "C:\sd.webui\copy\extensions\sd-webui-animatediff\scripts\animatediff_ui.py", line 273, in refresh_models
new_model_list = self.get_model_list()
File "C:\sd.webui\copy\extensions\sd-webui-animatediff\scripts\animatediff_ui.py", line 269, in get_model_list
return [f for f in os.listdir(model_dir) if f != ".gitkeep" and not any(tag in f for tag in get_sd_rm_tag())]
File "C:\sd.webui\copy\extensions\sd-webui-animatediff\scripts\animatediff_ui.py", line 269, in
return [f for f in os.listdir(model_dir) if f != ".gitkeep" and not any(tag in f for tag in get_sd_rm_tag())]
File "C:\sd.webui\copy\extensions\sd-webui-animatediff\scripts\animatediff_ui.py", line 261, in get_sd_rm_tag
if shared.sd_model.is_sdxl:
AttributeError: 'NoneType' object has no attribute 'is_sdxl'

@AUTOMATIC1111
Owner Author

AUTOMATIC1111 commented Jun 16, 2024

protector131090: The problem in your stack trace seems to be that you're running out of VRAM.

I found out that there is a hump in VRAM usage when loading the model, going over 24GB. On my machine I have a setting that automatically offloads that to RAM, so it worked for me.

Anyway, the fix for that is very simple, and now VRAM use hovers around 15.3GB during model load and after it.
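
To see how little headroom was left, the figures can be read straight out of the CUDA out-of-memory message. The sketch below is a rough parser written against the wording of the message posted in this thread (including its "capacty" spelling); the function and pattern names are hypothetical, and other PyTorch versions may phrase the message differently.

```python
import re

# Conversion factors to MiB for the units that appear in the message.
_UNITS_IN_MIB = {"bytes": 1 / 2**20, "KiB": 1 / 1024, "MiB": 1.0, "GiB": 1024.0}

# One pattern per figure; "capac\w*" tolerates the "capacty" spelling.
_PATTERNS = {
    "tried": r"Tried to allocate ([\d.]+) (\w+)",
    "total": r"total capac\w* of ([\d.]+) (\w+)",
    "free": r"of which ([\d.]+) (\w+) is free",
    "allocated": r"([\d.]+) (\w+) is allocated by PyTorch",
    "reserved": r"([\d.]+) (\w+) is reserved by PyTorch",
}

# Excerpt of the message from the log in this thread.
MESSAGE = (
    "CUDA out of memory. Tried to allocate 160.00 MiB. GPU 0 has a total "
    "capacty of 23.99 GiB of which 0 bytes is free. Of the allocated memory "
    "22.79 GiB is allocated by PyTorch, and 184.33 MiB is reserved by "
    "PyTorch but unallocated."
)


def parse_oom_message(message):
    """Return the figures found in an OOM message, all converted to MiB."""
    result = {}
    for name, pattern in _PATTERNS.items():
        match = re.search(pattern, message)
        if match and match.group(2) in _UNITS_IN_MIB:
            result[name] = float(match.group(1)) * _UNITS_IN_MIB[match.group(2)]
    return result


if __name__ == "__main__":
    for name, mib in parse_oom_message(MESSAGE).items():
        print(f"{name}: {mib:.2f} MiB")
```

For the message above this shows a 160 MiB request against a card with nothing free, which fits a short-lived peak during weight loading rather than a steady-state shortage.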

@protector131090

protector131090: The problem in your stack trace seems to be that you're running out of VRAM.

I found out that there is a hump in VRAM usage when loading the model, going over 24GB. On my machine I have a setting that automatically offloads that to RAM, so it worked for me.

Anyway, the fix for that is very simple, and now VRAM use hovers around 15.3GB during model load and after it.

The model does load now, but I get this error:

venv "C:\sd.webui\SD3\webui\venv\Scripts\Python.exe"
Python 3.10.10 (tags/v3.10.10:aad5f6a, Feb 7 2023, 17:20:36) [MSC v.1929 64 bit (AMD64)]
Version: v1.9.4-174-gb443fdcf
Commit hash: b443fdc
Launching Web UI with arguments: --xformers --ckpt-dir S:\SD CHECKPOINTS --disable-safe-unpickle --api
Loading weights [cc236278d2] from C:\sd.webui\SD3\webui\models\Stable-diffusion\comfy\sd3_medium.safetensors
Creating model from config: C:\sd.webui\SD3\webui\configs\sd3-inference.yaml
mmdit initializing with: input_size=None, patch_size=2, in_channels=16, depth=24, mlp_ratio=4.0, learn_sigma=False, adm_in_channels=2048, context_embedder_config={'target': 'torch.nn.Linear', 'params': {'in_features': 4096, 'out_features': 1536}}, register_length=0, attn_mode='torch', rmsnorm=False, scale_mod_only=False, swiglu=False, out_channels=None, pos_embed_scaling_factor=None, pos_embed_offset=None, pos_embed_max_size=192, num_patches=36864, qk_norm=None, qkv_bias=True, dtype=torch.float16, device='cpu'
C:\sd.webui\SD3\webui\venv\lib\site-packages\huggingface_hub\file_download.py:1132: FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True.
warnings.warn(
Running on local URL: http://127.0.0.1:7860

To create a public link, set share=True in launch().
Startup time: 10.6s (prepare environment: 2.0s, import torch: 3.6s, import gradio: 0.8s, setup paths: 1.0s, initialize shared: 0.3s, other imports: 0.7s, load scripts: 1.0s, create ui: 0.4s, gradio launch: 0.4s, add APIs: 0.2s).
C:\sd.webui\SD3\webui\venv\lib\site-packages\huggingface_hub\file_download.py:1132: FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True.
warnings.warn(
Applying attention optimization: xformers... done.
Model loaded in 9.6s (load weights from disk: 0.1s, create model: 1.9s, apply weights to model: 2.5s, apply half(): 0.1s, load weights from state dict: 4.5s, move model to device: 0.1s, calculate empty prompt: 0.3s).
50%|█████████████████████████████████████████ | 10/20 [00:00<00:00, 14.69it/s]C:\sd.webui\SD3\webui\modules\sd_samplers_common.py:68: RuntimeWarning: invalid value encountered in cast:00, 15.97it/s]
x_sample = x_sample.astype(np.uint8)
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:01<00:00, 13.61it/s]
*** Error completing request███████████████████████████████████████████████████████████| 20/20 [00:01<00:00, 13.35it/s]
*** Arguments: ('task(vu43rbt2pr3n4pt)', <gradio.routes.Request object at 0x0000024206328190>, '', '', [], 1, 1, 7, 512, 512, False, 0.7, 2, 'Latent', 0, 0, 0, 'Use same checkpoint', 'Use same sampler', 'Use same scheduler', '', '', [], 0, 20, 'DPM++ 2M', 'Automatic', False, '', 0.8, -1, False, -1, 0, 0, 0, False, False, 'positive', 'comma', 0, False, False, 'start', '', 1, '', [], 0, '', [], 0, '', [], True, False, False, False, False, False, False, 0, False) {}
Traceback (most recent call last):
File "C:\sd.webui\SD3\webui\modules\call_queue.py", line 58, in f
res = list(func(*args, **kwargs))
File "C:\sd.webui\SD3\webui\modules\call_queue.py", line 37, in f
res = func(*args, **kwargs)
File "C:\sd.webui\SD3\webui\modules\txt2img.py", line 109, in txt2img
processed = processing.process_images(p)
File "C:\sd.webui\SD3\webui\modules\processing.py", line 847, in process_images
res = process_images_inner(p)
File "C:\sd.webui\SD3\webui\modules\processing.py", line 995, in process_images_inner
devices.test_for_nans(samples_ddim, "unet")
File "C:\sd.webui\SD3\webui\modules\devices.py", line 265, in test_for_nans
raise NansException(message)
modules.devices.NansException: A tensor with NaNs was produced in Unet. This could be either because there's not enough precision to represent the picture, or because your video card does not support half type. Try setting the "Upcast cross attention layer to float32" option in Settings > Stable Diffusion or using the --no-half commandline argument to fix this. Use --disable-nan-check commandline argument to disable this check.
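
For context on the message above, here is a minimal sketch of what a NaN check of this kind does, and of one way NaNs can show up in half precision. It is a plain-Python stand-in under stated assumptions, not the code in `modules/devices.py`: `NansError` and `check_for_nans` are hypothetical names, and the overflow example only illustrates the "not enough precision" case the message mentions, not a diagnosis of this run.

```python
import math


class NansError(Exception):
    """Hypothetical stand-in for the NansException seen in the log."""


def check_for_nans(values, where):
    """Raise NansError if any value in a flat iterable of floats is NaN."""
    bad = sum(1 for v in values if math.isnan(v))
    if bad:
        raise NansError(f"{bad} NaN value(s) produced in {where}")


# float16 tops out at 65504; larger magnitudes overflow to infinity,
# and inf - inf is NaN, which is one route to NaNs in half precision.
overflowed = float("inf")
sample = [0.5, overflowed - overflowed]

if __name__ == "__main__":
    try:
        check_for_nans(sample, "unet")
    except NansError as err:
        print("caught:", err)
```

Disabling the check, as the `--disable-nan-check` flag in the message does, would only hide the NaNs; running in higher precision, as `--no-half` does, is the option that targets the overflow case.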

@tritant

tritant commented Jun 16, 2024

Hello, I get an error when I try to load the model:

To create a public link, set share=True in launch().
Startup time: 23.8s (prepare environment: 7.8s, import torch: 3.0s, import gradio: 0.7s, setup paths: 0.8s, initialize shared: 0.2s, other imports: 0.5s, load scripts: 3.0s, create ui: 1.6s, gradio launch: 2.4s, add APIs: 3.0s, app_started_callback: 0.8s).
Applying attention optimization: xformers... done.
Model loaded in 8.3s (load weights from disk: 0.7s, create model: 0.4s, apply weights to model: 5.4s, apply half(): 1.2s, move model to device: 0.3s, calculate empty prompt: 0.3s).
Reusing loaded model CreaPrompt_Ultimate_Mix2.safetensors [fc814f22af] to load sd3_medium.safetensors [cc236278d2]
Loading weights [cc236278d2] from C:\sd.webui\webui\models\Stable-diffusion\sd3_medium.safetensors
Creating model from config: C:\sd.webui\webui\configs\sd3-inference.yaml
mmdit initializing with: input_size=None, patch_size=2, in_channels=16, depth=24, mlp_ratio=4.0, learn_sigma=False, adm_in_channels=2048, context_embedder_config={'target': 'torch.nn.Linear', 'params': {'in_features': 4096, 'out_features': 1536}}, register_length=0, attn_mode='torch', rmsnorm=False, scale_mod_only=False, swiglu=False, out_channels=None, pos_embed_scaling_factor=None, pos_embed_offset=None, pos_embed_max_size=192, num_patches=36864, qk_norm=None, qkv_bias=True, dtype=torch.float16, device='cpu'
creating model quickly: TypeError
Traceback (most recent call last):
File "C:\Users\jice\AppData\Local\Programs\Python\Python310\lib\threading.py", line 973, in _bootstrap
self._bootstrap_inner()
File "C:\Users\jice\AppData\Local\Programs\Python\Python310\lib\threading.py", line 1016, in _bootstrap_inner
self.run()
File "C:\sd.webui\webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
result = context.run(func, *args)
File "C:\sd.webui\webui\venv\lib\site-packages\gradio\utils.py", line 707, in wrapper
response = f(*args, **kwargs)
File "C:\sd.webui\webui\modules\ui_settings.py", line 316, in <lambda>
fn=lambda value, k=k: self.run_settings_single(value, key=k),
File "C:\sd.webui\webui\modules\ui_settings.py", line 95, in run_settings_single
if value is None or not opts.set(key, value):
File "C:\sd.webui\webui\modules\options.py", line 165, in set
option.onchange()
File "C:\sd.webui\webui\modules\call_queue.py", line 14, in f
res = func(*args, **kwargs)
File "C:\sd.webui\webui\modules\initialize_util.py", line 181, in <lambda>
shared.opts.onchange("sd_model_checkpoint", wrap_queued_call(lambda: sd_models.reload_model_weights()), call=False)
File "C:\sd.webui\webui\modules\sd_models.py", line 970, in reload_model_weights
load_model(checkpoint_info, already_loaded_state_dict=state_dict)
File "C:\sd.webui\webui\modules\sd_models.py", line 810, in load_model
sd_model = instantiate_from_config(sd_config.model, state_dict)
File "C:\sd.webui\webui\modules\sd_models.py", line 765, in instantiate_from_config
return constructor(**params)
File "C:\sd.webui\webui\modules\models\sd3\sd3_model.py", line 136, in __init__
self.cond_stage_model = SD3Cond()
File "C:\sd.webui\webui\modules\models\sd3\sd3_model.py", line 61, in __init__
self.tokenizer = SD3Tokenizer()
File "C:\sd.webui\webui\modules\models\sd3\other_impls.py", line 207, in __init__
self.t5xxl = T5XXLTokenizer()
File "C:\sd.webui\webui\modules\models\sd3\other_impls.py", line 305, in __init__
super().__init__(pad_with_end=False, tokenizer=T5TokenizerFast.from_pretrained("google/t5-v1_1-xxl"), has_start_token=False, pad_to_max_length=False, max_length=99999999, min_length=77)
File "C:\sd.webui\webui\venv\lib\site-packages\transformers\tokenization_utils_base.py", line 1825, in from_pretrained
return cls._from_pretrained(
File "C:\sd.webui\webui\venv\lib\site-packages\transformers\tokenization_utils_base.py", line 1988, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
File "C:\sd.webui\webui\venv\lib\site-packages\transformers\models\t5\tokenization_t5_fast.py", line 133, in __init__
super().__init__(
File "C:\sd.webui\webui\venv\lib\site-packages\transformers\tokenization_utils_fast.py", line 114, in __init__
fast_tokenizer = convert_slow_tokenizer(slow_tokenizer)
File "C:\sd.webui\webui\venv\lib\site-packages\transformers\convert_slow_tokenizer.py", line 1307, in convert_slow_tokenizer
return converter_class(transformer_tokenizer).converted()
File "C:\sd.webui\webui\venv\lib\site-packages\transformers\convert_slow_tokenizer.py", line 445, in __init__
from .utils import sentencepiece_model_pb2 as model_pb2
File "C:\sd.webui\webui\venv\lib\site-packages\transformers\utils\sentencepiece_model_pb2.py", line 91, in <module>
_descriptor.EnumValueDescriptor(
File "C:\sd.webui\webui\venv\lib\site-packages\google\protobuf\descriptor.py", line 789, in __new__
_message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors cannot be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:

  1. Downgrade the protobuf package to 3.20.x or lower.
  2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates

Failed to create model quickly; will retry using slow method.
mmdit initializing with: input_size=None, patch_size=2, in_channels=16, depth=24, mlp_ratio=4.0, learn_sigma=False, adm_in_channels=2048, context_embedder_config={'target': 'torch.nn.Linear', 'params': {'in_features': 4096, 'out_features': 1536}}, register_length=0, attn_mode='torch', rmsnorm=False, scale_mod_only=False, swiglu=False, out_channels=None, pos_embed_scaling_factor=None, pos_embed_offset=None, pos_embed_max_size=192, num_patches=36864, qk_norm=None, qkv_bias=True, dtype=torch.float16, device='cpu'
changing setting sd_model_checkpoint to sd3_medium.safetensors [cc236278d2]: TypeError
Traceback (most recent call last):
File "C:\sd.webui\webui\modules\options.py", line 165, in set
option.onchange()
File "C:\sd.webui\webui\modules\call_queue.py", line 14, in f
res = func(*args, **kwargs)
File "C:\sd.webui\webui\modules\initialize_util.py", line 181, in <lambda>
shared.opts.onchange("sd_model_checkpoint", wrap_queued_call(lambda: sd_models.reload_model_weights()), call=False)
File "C:\sd.webui\webui\modules\sd_models.py", line 970, in reload_model_weights
load_model(checkpoint_info, already_loaded_state_dict=state_dict)
File "C:\sd.webui\webui\modules\sd_models.py", line 819, in load_model
sd_model = instantiate_from_config(sd_config.model, state_dict)
File "C:\sd.webui\webui\modules\sd_models.py", line 765, in instantiate_from_config
return constructor(**params)
File "C:\sd.webui\webui\modules\models\sd3\sd3_model.py", line 136, in __init__
self.cond_stage_model = SD3Cond()
File "C:\sd.webui\webui\modules\models\sd3\sd3_model.py", line 61, in __init__
self.tokenizer = SD3Tokenizer()
File "C:\sd.webui\webui\modules\models\sd3\other_impls.py", line 207, in __init__
self.t5xxl = T5XXLTokenizer()
File "C:\sd.webui\webui\modules\models\sd3\other_impls.py", line 305, in __init__
super().__init__(pad_with_end=False, tokenizer=T5TokenizerFast.from_pretrained("google/t5-v1_1-xxl"), has_start_token=False, pad_to_max_length=False, max_length=99999999, min_length=77)
File "C:\sd.webui\webui\venv\lib\site-packages\transformers\tokenization_utils_base.py", line 1825, in from_pretrained
return cls._from_pretrained(
File "C:\sd.webui\webui\venv\lib\site-packages\transformers\tokenization_utils_base.py", line 1988, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
File "C:\sd.webui\webui\venv\lib\site-packages\transformers\models\t5\tokenization_t5_fast.py", line 133, in __init__
super().__init__(
File "C:\sd.webui\webui\venv\lib\site-packages\transformers\tokenization_utils_fast.py", line 114, in __init__
fast_tokenizer = convert_slow_tokenizer(slow_tokenizer)
File "C:\sd.webui\webui\venv\lib\site-packages\transformers\convert_slow_tokenizer.py", line 1307, in convert_slow_tokenizer
return converter_class(transformer_tokenizer).converted()
File "C:\sd.webui\webui\venv\lib\site-packages\transformers\convert_slow_tokenizer.py", line 445, in __init__
from .utils import sentencepiece_model_pb2 as model_pb2
File "C:\sd.webui\webui\venv\lib\site-packages\transformers\utils\sentencepiece_model_pb2.py", line 28, in <module>
DESCRIPTOR = _descriptor.FileDescriptor(
File "C:\sd.webui\webui\venv\lib\site-packages\google\protobuf\descriptor.py", line 1072, in __new__
return _message.default_pool.AddSerializedFile(serialized_pb)
TypeError: Couldn't build proto file into descriptor pool: duplicate file name sentencepiece_model.proto

@AUTOMATIC1111
Owner Author

I added protobuf==3.20.0 to requirements which hopefully will make the problem go away.
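For anyone who wants to check their own venv before pulling the updated requirements, here is a minimal sketch that compares the installed protobuf version against the "3.20.x or lower" bound quoted in the error message above. The helper names are made up for illustration and are not part of the webui; only `importlib.metadata` from the standard library is used.

```python
from importlib import metadata


def protobuf_is_compatible(version: str) -> bool:
    """Return True if `version` is 3.20.x or lower (the bound from the error text)."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) <= (3, 20)


def installed_protobuf_version():
    """Return the installed protobuf version string, or None if it is not installed."""
    try:
        return metadata.version("protobuf")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_protobuf_version()
    if found is None:
        print("protobuf is not installed in this environment")
    elif protobuf_is_compatible(found):
        print(f"protobuf {found} should be fine")
    else:
        print(f"protobuf {found} is newer than 3.20.x; the pin in requirements addresses this")
```

Run it with the Python interpreter from the webui's venv so that it inspects the same packages the webui loads.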

@protector131090

100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:01<00:00, 11.80it/s]
*** Error completing request
*** Arguments: ('task(5kbarr62htczn7w)', <gradio.routes.Request object at 0x0000012DF24243D0>, '', '', [], 1, 1, 7, 512, 512, False, 0.7, 2, 'Latent', 0, 0, 0, 'Use same checkpoint', 'Use same sampler', 'Use same scheduler', '', '', [], 0, 20, 'DPM++ 2M', 'Automatic', False, '', 0.8, -1, False, -1, 0, 0, 0, False, 'MultiDiffusion', False, True, 1024, 1024, 96, 96, 48, 4, 'None', 2, False, 10, 1, 1, 64, False, False, False, False, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 'DemoFusion', False, 128, 64, 4, 2, False, 10, 1, 1, 64, False, True, 3, 1, 1, True, 0.85, 0.6, 4, False, False, 3072, 192, True, True, True, False, False, False, 'positive', 'comma', 0, False, False, 'start', '', 1, '', [], 0, '', [], 0, '', [], True, False, False, False, False, False, False, 0, False) {}
Traceback (most recent call last):
File "C:\sd.webui\SD3\webui\modules\call_queue.py", line 58, in f
res = list(func(*args, **kwargs))
File "C:\sd.webui\SD3\webui\modules\call_queue.py", line 37, in f
res = func(*args, **kwargs)
File "C:\sd.webui\SD3\webui\modules\txt2img.py", line 109, in txt2img
processed = processing.process_images(p)
File "C:\sd.webui\SD3\webui\modules\processing.py", line 847, in process_images
res = process_images_inner(p)
File "C:\sd.webui\SD3\webui\modules\processing.py", line 995, in process_images_inner
devices.test_for_nans(samples_ddim, "unet")
File "C:\sd.webui\SD3\webui\modules\devices.py", line 265, in test_for_nans
raise NansException(message)
modules.devices.NansException: A tensor with NaNs was produced in Unet. This could be either because there's not enough precision to represent the picture, or because your video card does not support half type. Try setting the "Upcast cross attention layer to float32" option in Settings > Stable Diffusion or using the --no-half commandline argument to fix this. Use --disable-nan-check commandline argument to disable this check.

But with --no-half I get an OOM error.

@protector131090

--disable-nan-check --precision half Fixed it for me!

@tritant

tritant commented Jun 16, 2024

I added protobuf==3.20.0 to requirements which hopefully will make the problem go away.

Yes, it works for me now. Thank you very much.

@rltgjqmcpgjadyd

rltgjqmcpgjadyd commented Jun 16, 2024

I have the same NaNs issue, and the same solution works.
--disable-nan-check is not needed.

@ananosleep

ananosleep commented Jun 16, 2024

Downloading CLIP from SAI's original repo fails with a 401 HTTP error: Unauthorized.
So maybe we need a download source that doesn't require logging in.

By the way, would you consider using t5xxl_fp8_e4m3fn.safetensors instead of t5xxl_fp16.safetensors?
I renamed t5xxl_fp8_e4m3fn.safetensors to t5xxl_fp16.safetensors, put it into models/CLIP, and it works well.

@JasperG

JasperG commented Jun 16, 2024

--disable-nan-check --precision half Fixed it for me!

Note that this should be seen as a workaround: it makes SDXL output black images, because --precision half conflicts with --no-half-vae and so rules out using it.

@AUTOMATIC1111
Owner Author

Pushed an update to make it possible to work without --precision half.

ananosleep: well, shit... It worked for me.

@rltgjqmcpgjadyd

OK, it worked without --precision half.

@AUTOMATIC1111
Owner Author

ananosleep: Changed download links to a public huggingface repository.

@tritant

tritant commented Jun 16, 2024

It works for me, but on an RTX 4060 Ti 16GB it's very slow. For comparison: A1111 (T5 fp8) takes 7 minutes for 1024x1024, while ComfyUI takes 30 seconds for 1024x1024.

@light-and-ray
Contributor

I have noticed that it doesn't support arbitrary resolutions divisible by 8. For any of them, it gives an error.

@yo1nkers

yo1nkers commented Jun 18, 2024 via email

@greasebig

Is the requirements.txt different from 1.9.3?

@yo1nkers

yo1nkers commented Jun 18, 2024 via email

@protector131090

Is there any news on long prompts and inpainting?

@FurkanGozukara

This comment was marked as off-topic.

@tritant

tritant commented Jun 24, 2024

With the --medvram argument, I get this error when I try to load an SDXL checkpoint:

Reusing loaded model sd3_medium.safetensors [cc236278d2] to load CreaPrompt_Ultimate3.safetensors [c83f76acb6]
Loading weights [c83f76acb6] from C:\sd.webui\webui\models\Stable-diffusion\CreaPrompt_Ultimate3.safetensors
Creating model from config: C:\sd.webui\webui\repositories\generative-models\configs\inference\sd_xl_base.yaml
changing setting sd_model_checkpoint to CreaPrompt_Ultimate3.safetensors [c83f76acb6]: AttributeError
Traceback (most recent call last):
File "C:\sd.webui\webui\modules\options.py", line 165, in set
option.onchange()
File "C:\sd.webui\webui\modules\call_queue.py", line 14, in f
res = func(*args, **kwargs)
File "C:\sd.webui\webui\modules\initialize_util.py", line 181, in <lambda>
shared.opts.onchange("sd_model_checkpoint", wrap_queued_call(lambda: sd_models.reload_model_weights()), call=False)
File "C:\sd.webui\webui\modules\sd_models.py", line 970, in reload_model_weights
load_model(checkpoint_info, already_loaded_state_dict=state_dict)
File "C:\sd.webui\webui\modules\sd_models.py", line 842, in load_model
send_model_to_device(sd_model)
File "C:\sd.webui\webui\modules\sd_models.py", line 746, in send_model_to_device
lowvram.apply(m)
File "C:\sd.webui\webui\modules\lowvram.py", line 29, in apply
setup_for_low_vram(sd_model, not shared.cmd_opts.lowvram)
File "C:\sd.webui\webui\modules\lowvram.py", line 110, in setup_for_low_vram
if hasattr(sd_model.cond_stage_model, "medvram_modules"):
File "C:\sd.webui\webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1695, in __getattr__
raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'DiffusionEngine' object has no attribute 'cond_stage_model'. Did you mean: 'cond_stage_key'?
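In the traceback above, `hasattr(sd_model.cond_stage_model, "medvram_modules")` fails before `hasattr` ever runs, because `sd_model.cond_stage_model` is evaluated first and the SDXL model object has no such attribute. A minimal sketch of a more defensive lookup follows. The classes are hypothetical stand-ins, not the real webui or generative-models classes, and this is not necessarily how the PR resolves it.

```python
class FakeSD3Cond:
    """Hypothetical stand-in for a conditioner that advertises medvram modules."""
    medvram_modules = ["clip_l", "clip_g"]  # made-up contents for illustration


class FakeSD3Model:
    """Hypothetical model that has a `cond_stage_model` attribute."""
    def __init__(self):
        self.cond_stage_model = FakeSD3Cond()


class FakeSDXLModel:
    """Hypothetical stand-in for a model WITHOUT a `cond_stage_model` attribute."""


def get_medvram_modules(sd_model):
    """Return `medvram_modules` if present at both attribute levels, else None."""
    cond = getattr(sd_model, "cond_stage_model", None)
    return getattr(cond, "medvram_modules", None)


def naive_check(sd_model):
    """Mirrors the failing pattern: attribute access happens before hasattr()."""
    return hasattr(sd_model.cond_stage_model, "medvram_modules")


if __name__ == "__main__":
    print(get_medvram_modules(FakeSD3Model()))
    print(get_medvram_modules(FakeSDXLModel()))
    try:
        naive_check(FakeSDXLModel())
    except AttributeError as exc:
        print("naive check failed:", exc)
```

Using `getattr` with a default at each level keeps one code path valid for model types that expose different attributes.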

@light-and-ray
Contributor

light-and-ray commented Jun 24, 2024

SD3 with medvram started working for me, but SD1 and SDXL are broken in medvram mode.

@light-and-ray
Contributor

RTX 3060, T5 + medvram: the first generation with one prompt uses 11.4 GB; any further generations with the same prompt use 6.1 GB. So there is room for optimization.

By the way, can you add an option to run T5 only on the CPU?

@protector131090

THANK YOU for fixing img2img! That is amazing!

@protector131090

protector131090 commented Jun 28, 2024

Long prompts give me this error. How do I fix it? Thanks.
UPD: --precision half fixes it.


100%|███████████████████████████████████████████████████████████████████████████████████████████████████| 28/28 [00:10<00:00, 2.73it/s]
*** Error completing request
*** Arguments: ('task(vs0etxqaz3pd71e)', <gradio.routes.Request object at 0x0000026E43BCBD60>, '(documentary photography:1.4) photo of an elf woman, with deep, emerald green eyes, long, braided, dark brown hair, wearing a simple, brown leather tunic and pants, with a worn leather pouch on her hip, gazing intently into the distance, standing amidst a dense, ancient forest, under the soft, dappled light filtering through the leaves, shot at eye level on a Leica M3, with a 50mm lens, (film grain:1.1), in the style of Roberto Ferri.', '', [], 1, 1, 4, 1024, 1024, False, 0.7, 2, 'Latent', 0, 0, 0, 'Use same checkpoint', 'Use same sampler', 'Use same scheduler', '', '', [], 0, 28, 'DPM++ 2M', 'SGM Uniform', False, '', 0.8, -1, False, -1, 0, 0, 0, True, False, False, False, 'base', False, False, {'ad_model': 'face_yolov8n.pt', 'ad_model_classes': '', 'ad_tab_enable': True, 'ad_prompt': '', 'ad_negative_prompt': '', 'ad_confidence': 0.3, 'ad_mask_k_largest': 0, 'ad_mask_min_ratio': 0, 'ad_mask_max_ratio': 1, 'ad_x_offset': 0, 'ad_y_offset': 0, 'ad_dilate_erode': 4, 'ad_mask_merge_invert': 'None', 'ad_mask_blur': 4, 'ad_denoising_strength': 0.4, 'ad_inpaint_only_masked': True, 'ad_inpaint_only_masked_padding': 32, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512, 'ad_use_steps': False, 'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7, 'ad_use_checkpoint': False, 'ad_checkpoint': 'Use same checkpoint', 'ad_use_vae': False, 'ad_vae': 'Use same VAE', 'ad_use_sampler': False, 'ad_sampler': 'DPM++ 2M', 'ad_scheduler': 'Use same scheduler', 'ad_use_noise_multiplier': False, 'ad_noise_multiplier': 1, 'ad_use_clip_skip': False, 'ad_clip_skip': 1, 'ad_restore_face': False, 'ad_controlnet_model': 'None', 'ad_controlnet_module': 'None', 'ad_controlnet_weight': 1, 'ad_controlnet_guidance_start': 0, 'ad_controlnet_guidance_end': 1, 'is_api': ()}, {'ad_model': 'None', 'ad_model_classes': '', 'ad_tab_enable': True, 'ad_prompt': '', 
'ad_negative_prompt': '', 'ad_confidence': 0.3, 'ad_mask_k_largest': 0, 'ad_mask_min_ratio': 0, 'ad_mask_max_ratio': 1, 'ad_x_offset': 0, 'ad_y_offset': 0, 'ad_dilate_erode': 4, 'ad_mask_merge_invert': 'None', 'ad_mask_blur': 4, 'ad_denoising_strength': 0.4, 'ad_inpaint_only_masked': True, 'ad_inpaint_only_masked_padding': 32, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512, 'ad_use_steps': False, 'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7, 'ad_use_checkpoint': False, 'ad_checkpoint': 'Use same checkpoint', 'ad_use_vae': False, 'ad_vae': 'Use same VAE', 'ad_use_sampler': False, 'ad_sampler': 'DPM++ 2M', 'ad_scheduler': 'Use same scheduler', 'ad_use_noise_multiplier': False, 'ad_noise_multiplier': 1, 'ad_use_clip_skip': False, 'ad_clip_skip': 1, 'ad_restore_face': False, 'ad_controlnet_model': 'None', 'ad_controlnet_module': 'None', 'ad_controlnet_weight': 1, 'ad_controlnet_guidance_start': 0, 'ad_controlnet_guidance_end': 1, 'is_api': ()}, {'ad_model': 'None', 'ad_model_classes': '', 'ad_tab_enable': True, 'ad_prompt': '', 'ad_negative_prompt': '', 'ad_confidence': 0.3, 'ad_mask_k_largest': 0, 'ad_mask_min_ratio': 0, 'ad_mask_max_ratio': 1, 'ad_x_offset': 0, 'ad_y_offset': 0, 'ad_dilate_erode': 4, 'ad_mask_merge_invert': 'None', 'ad_mask_blur': 4, 'ad_denoising_strength': 0.4, 'ad_inpaint_only_masked': True, 'ad_inpaint_only_masked_padding': 32, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512, 'ad_use_steps': False, 'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7, 'ad_use_checkpoint': False, 'ad_checkpoint': 'Use same checkpoint', 'ad_use_vae': False, 'ad_vae': 'Use same VAE', 'ad_use_sampler': False, 'ad_sampler': 'DPM++ 2M', 'ad_scheduler': 'Use same scheduler', 'ad_use_noise_multiplier': False, 'ad_noise_multiplier': 1, 'ad_use_clip_skip': False, 'ad_clip_skip': 1, 'ad_restore_face': False, 'ad_controlnet_model': 'None', 'ad_controlnet_module': 'None', 
'ad_controlnet_weight': 1, 'ad_controlnet_guidance_start': 0, 'ad_controlnet_guidance_end': 1, 'is_api': ()}, {'ad_model': 'None', 'ad_model_classes': '', 'ad_tab_enable': True, 'ad_prompt': '', 'ad_negative_prompt': '', 'ad_confidence': 0.3, 'ad_mask_k_largest': 0, 'ad_mask_min_ratio': 0, 'ad_mask_max_ratio': 1, 'ad_x_offset': 0, 'ad_y_offset': 0, 'ad_dilate_erode': 4, 'ad_mask_merge_invert': 'None', 'ad_mask_blur': 4, 'ad_denoising_strength': 0.4, 'ad_inpaint_only_masked': True, 'ad_inpaint_only_masked_padding': 32, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512, 'ad_use_steps': False, 'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7, 'ad_use_checkpoint': False, 'ad_checkpoint': 'Use same checkpoint', 'ad_use_vae': False, 'ad_vae': 'Use same VAE', 'ad_use_sampler': False, 'ad_sampler': 'DPM++ 2M', 'ad_scheduler': 'Use same scheduler', 'ad_use_noise_multiplier': False, 'ad_noise_multiplier': 1, 'ad_use_clip_skip': False, 'ad_clip_skip': 1, 'ad_restore_face': False, 'ad_controlnet_model': 'None', 'ad_controlnet_module': 'None', 'ad_controlnet_weight': 1, 'ad_controlnet_guidance_start': 0, 'ad_controlnet_guidance_end': 1, 'is_api': ()}, {'ad_model': 'None', 'ad_model_classes': '', 'ad_tab_enable': True, 'ad_prompt': '', 'ad_negative_prompt': '', 'ad_confidence': 0.3, 'ad_mask_k_largest': 0, 'ad_mask_min_ratio': 0, 'ad_mask_max_ratio': 1, 'ad_x_offset': 0, 'ad_y_offset': 0, 'ad_dilate_erode': 4, 'ad_mask_merge_invert': 'None', 'ad_mask_blur': 4, 'ad_denoising_strength': 0.4, 'ad_inpaint_only_masked': True, 'ad_inpaint_only_masked_padding': 32, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512, 'ad_use_steps': False, 'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7, 'ad_use_checkpoint': False, 'ad_checkpoint': 'Use same checkpoint', 'ad_use_vae': False, 'ad_vae': 'Use same VAE', 'ad_use_sampler': False, 'ad_sampler': 'DPM++ 2M', 'ad_scheduler': 'Use same scheduler', 
'ad_use_noise_multiplier': False, 'ad_noise_multiplier': 1, 'ad_use_clip_skip': False, 'ad_clip_skip': 1, 'ad_restore_face': False, 'ad_controlnet_model': 'None', 'ad_controlnet_module': 'None', 'ad_controlnet_weight': 1, 'ad_controlnet_guidance_start': 0, 'ad_controlnet_guidance_end': 1, 'is_api': ()}, {'ad_model': 'None', 'ad_model_classes': '', 'ad_tab_enable': True, 'ad_prompt': '', 'ad_negative_prompt': '', 'ad_confidence': 0.3, 'ad_mask_k_largest': 0, 'ad_mask_min_ratio': 0, 'ad_mask_max_ratio': 1, 'ad_x_offset': 0, 'ad_y_offset': 0, 'ad_dilate_erode': 4, 'ad_mask_merge_invert': 'None', 'ad_mask_blur': 4, 'ad_denoising_strength': 0.4, 'ad_inpaint_only_masked': True, 'ad_inpaint_only_masked_padding': 32, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512, 'ad_use_steps': False, 'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7, 'ad_use_checkpoint': False, 'ad_checkpoint': 'Use same checkpoint', 'ad_use_vae': False, 'ad_vae': 'Use same VAE', 'ad_use_sampler': False, 'ad_sampler': 'DPM++ 2M', 'ad_scheduler': 'Use same scheduler', 'ad_use_noise_multiplier': False, 'ad_noise_multiplier': 1, 'ad_use_clip_skip': False, 'ad_clip_skip': 1, 'ad_restore_face': False, 'ad_controlnet_model': 'None', 'ad_controlnet_module': 'None', 'ad_controlnet_weight': 1, 'ad_controlnet_guidance_start': 0, 'ad_controlnet_guidance_end': 1, 'is_api': ()}, False, False, 'positive', 'comma', 0, False, False, 'start', '', 1, '', [], 0, '', [], 0, '', [], True, False, False, False, False, False, False, 0, False, 'Not set', True, True, '', '', '', '', '', 1.3, 'Not set', 'Not set', 1.3, 'Not set', 1.3, 'Not set', 1.3, 1.3, 'Not set', 1.3, 'Not set', 1.3, 'Not set', 1.3, 'Not set', 1.3, 'Not set', 1.3, 'Not set', False, 'None') {}
Traceback (most recent call last):
File "C:\sd.webui\SD3\webui\modules\call_queue.py", line 58, in f
res = list(func(*args, **kwargs))
File "C:\sd.webui\SD3\webui\modules\call_queue.py", line 37, in f
res = func(*args, **kwargs)
File "C:\sd.webui\SD3\webui\modules\txt2img.py", line 109, in txt2img
processed = processing.process_images(p)
File "C:\sd.webui\SD3\webui\modules\processing.py", line 847, in process_images
res = process_images_inner(p)
File "C:\sd.webui\SD3\webui\modules\processing.py", line 995, in process_images_inner
devices.test_for_nans(samples_ddim, "unet")
File "C:\sd.webui\SD3\webui\modules\devices.py", line 265, in test_for_nans
raise NansException(message)
modules.devices.NansException: A tensor with NaNs was produced in Unet. This could be either because there's not enough precision to represent the picture, or because your video card does not support half type. Try setting the "Upcast cross attention layer to float32" option in Settings > Stable Diffusion or using the --no-half commandline argument to fix this. Use --disable-nan-check commandline argument to disable this check.

@tritant

tritant commented Jun 28, 2024

Hires. fix is not working with SD3:
100%|██████████████████████████████████████████████████████████████████████████████████| 28/28 [00:21<00:00, 1.28it/s]
tiled upscale: 100%|███████████████████████████████████████████████████████████████████| 35/35 [00:06<00:00, 5.15it/s]
0%| | 0/5 [00:01<?, ?it/s]
*** Error completing request
RuntimeError: The size of tensor a (123) must match the size of tensor b (122) at non-singleton dimension 3
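One possible reading of the 123 vs 122 mismatch, which is not confirmed anywhere in this thread: the model log earlier shows `patch_size=2`, so a latent with an odd side length cannot be split evenly into 2x2 patches, and reassembling the patches would yield one column fewer. The sketch below works through that arithmetic. The 8x pixel-to-latent factor is an assumption, and all helper names are made up for illustration.

```python
VAE_DOWNSCALE = 8  # assumption: latent side length is 1/8 of the pixel side length
PATCH_SIZE = 2     # taken from the "mmdit initializing with: ... patch_size=2" log line


def latent_side(pixels: int) -> int:
    """Latent side length for a given pixel side length (under the assumption above)."""
    return pixels // VAE_DOWNSCALE


def patch_roundtrip_side(latent: int) -> int:
    """Side length left after cutting into whole patches and reassembling them."""
    return (latent // PATCH_SIZE) * PATCH_SIZE


def snap_down(pixels: int) -> int:
    """Largest size <= pixels that yields a latent side divisible by PATCH_SIZE."""
    step = VAE_DOWNSCALE * PATCH_SIZE
    return (pixels // step) * step


if __name__ == "__main__":
    for px in (976, 984, 1024):
        lat = latent_side(px)
        print(px, lat, patch_roundtrip_side(lat), snap_down(px))
```

Under these assumptions, a hires target whose side is divisible by 8 but not by 16 would produce exactly this kind of off-by-one tensor mismatch, and choosing a size divisible by 16 would avoid it.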

@tritant

tritant commented Jun 28, 2024

When an SD3 checkpoint is selected and the browser page is refreshed, the token counter is reset to zero; the prompt must be modified before the count is recalculated.

@tritant

tritant commented Jun 29, 2024

Since the last update, I can no longer load an SD3 model with --medvram:

Loading weights [cc236278d2] from C:\sd.webui\webui\models\Stable-diffusion\sd3_medium.safetensors
Creating model from config: C:\sd.webui\webui\configs\sd3-inference.yaml
changing setting sd_model_checkpoint to sd3_medium.safetensors [cc236278d2]: AttributeError
Traceback (most recent call last):
File "C:\sd.webui\webui\modules\options.py", line 165, in set
option.onchange()
File "C:\sd.webui\webui\modules\call_queue.py", line 14, in f
res = func(*args, **kwargs)
File "C:\sd.webui\webui\modules\initialize_util.py", line 181, in <lambda>
shared.opts.onchange("sd_model_checkpoint", wrap_queued_call(lambda: sd_models.reload_model_weights()), call=False)
File "C:\sd.webui\webui\modules\sd_models.py", line 977, in reload_model_weights
load_model(checkpoint_info, already_loaded_state_dict=state_dict)
File "C:\sd.webui\webui\modules\sd_models.py", line 849, in load_model
send_model_to_device(sd_model)
File "C:\sd.webui\webui\modules\sd_models.py", line 756, in send_model_to_device
lowvram.apply(m)
File "C:\sd.webui\webui\modules\lowvram.py", line 29, in apply
setup_for_low_vram(sd_model, not shared.cmd_opts.lowvram)
File "C:\sd.webui\webui\modules\lowvram.py", line 100, in setup_for_low_vram
setattr(obj, field, None)
File "C:\sd.webui\webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1754, in __setattr__
super().__setattr__(name, value)
AttributeError: can't set attribute 'cond_stage_model'
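The message "can't set attribute 'cond_stage_model'" is what Python 3.10 reports when code assigns to a property that has no setter, so this traceback is consistent with `cond_stage_model` being exposed as a read-only property while `setup_for_low_vram` tries to set it to `None`. That is an inference from the error text, not something confirmed in this thread. The sketch below reproduces the behavior with hypothetical classes that are not the real webui classes.

```python
class ModelWithProperty:
    """Hypothetical model exposing `cond_stage_model` as a read-only property."""

    def __init__(self):
        self._cond = "text-encoder"

    @property
    def cond_stage_model(self):
        return self._cond


class ModelWithPlainAttribute:
    """Hypothetical model storing `cond_stage_model` as a normal attribute."""

    def __init__(self):
        self.cond_stage_model = "text-encoder"


def try_clear(obj, field: str) -> bool:
    """Try to set `field` to None; report whether the assignment was allowed."""
    try:
        setattr(obj, field, None)
        return True
    except AttributeError:
        return False


if __name__ == "__main__":
    print(try_clear(ModelWithPlainAttribute(), "cond_stage_model"))
    print(try_clear(ModelWithProperty(), "cond_stage_model"))
```

If this is indeed the cause, either giving the property a setter or skipping read-only fields in the low-VRAM setup would avoid the exception; which of the two the PR should use is left open here.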

@kevkevinpal

investigate why DPM++ 2M Karras isn't working well

I'm trying to debug this, but I'm unsure what you mean by "isn't working well". I was able to get an image:

Screenshot 2024-06-30 at 9 52 03 AM

@drhead
Contributor

drhead commented Jul 2, 2024

investigate why DPM++ 2M Karras isn't working well

The Karras schedule doesn't really make much sense to use on a flow model; it would be best to use Uniform. DPM++ 2M itself should work fine, but as far as the Karras schedule goes, you can consider it to be as out of scope as SDE samplers (which will not work well, and which you can't make work well).
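To make the difference between the two schedules concrete, here is a stdlib-only sketch that computes both. The Karras spacing follows the usual formula from Karras et al. (2022) with the common default rho = 7. The sigma range is made up for illustration and is not taken from the webui code, and the function names are hypothetical.

```python
def karras_sigmas(n: int, sigma_min: float, sigma_max: float, rho: float = 7.0):
    """Karras et al. (2022) noise levels: interpolate linearly in sigma**(1/rho)."""
    min_inv = sigma_min ** (1.0 / rho)
    max_inv = sigma_max ** (1.0 / rho)
    return [(max_inv + i / (n - 1) * (min_inv - max_inv)) ** rho for i in range(n)]


def uniform_sigmas(n: int, sigma_min: float, sigma_max: float):
    """Evenly spaced noise levels from sigma_max down to sigma_min."""
    return [sigma_max + i / (n - 1) * (sigma_min - sigma_max) for i in range(n)]


if __name__ == "__main__":
    steps = 5                          # requires at least 2 steps
    sigma_min, sigma_max = 0.03, 1.0   # illustrative values only
    karras = karras_sigmas(steps, sigma_min, sigma_max)
    uniform = uniform_sigmas(steps, sigma_min, sigma_max)
    for k, u in zip(karras, uniform):
        print(f"karras={k:.4f}  uniform={u:.4f}")
```

Both schedules share the same endpoints, but the Karras spacing concentrates its steps at low noise levels, while the uniform one spreads them evenly across the range.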

Development

Successfully merging this pull request may close these issues.

[Feature Request]: Stable Diffusion 3 Medium support