stable diffusion demo, run error #2581

Closed
xiaohaipeng opened this issue Jan 3, 2023 · 8 comments
Labels
Demo: Diffusion (Issues regarding demoDiffusion) · triaged (Issue has been triaged by maintainers)

Comments

@xiaohaipeng

Description

When running demo-diffusion.py, I hit the following error.

Environment

Used the provided Docker image: nvcr.io/nvidia/tensorrt:22.10-py3

TensorRT Version: 8.5.0.12
NVIDIA GPU: V100
NVIDIA Driver Version: 515.43.04
CUDA Version: 11.8
CUDNN Version: None
Operating System: Ubuntu
Python Version (if applicable): 3.8.10
Tensorflow Version (if applicable):
PyTorch Version (if applicable): 1.12.0+cu116
Baremetal or Container (if so, version):

Relevant Files

[I] Total Nodes | Original: 1251, After Folding: 1078 | 173 Nodes Folded
[I] Folding Constants | Pass 3
[I] Total Nodes | Original: 1078, After Folding: 1078 | 0 Nodes Folded
CLIP: fold constants .. 1078 nodes, 1812 tensors, 1 inputs, 1 outputs
CLIP: shape inference .. 1078 nodes, 1812 tensors, 1 inputs, 1 outputs
CLIP: removed 12 casts .. 1054 nodes, 1788 tensors, 1 inputs, 1 outputs
CLIP: inserted 25 LayerNorm plugins .. 842 nodes, 1526 tensors, 1 inputs, 1 outputs
CLIP: final .. 842 nodes, 1526 tensors, 1 inputs, 1 outputs
Building TensorRT engine for onnx/clip.opt.onnx: engine/clip.plan
[W] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See CUDA_MODULE_LOADING in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
[W] parsers/onnx/onnx2trt_utils.cpp:375: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[E] parsers/onnx/ModelImporter.cpp:740: While parsing node number 7 [LayerNorm -> "LayerNormV-0"]:
[E] parsers/onnx/ModelImporter.cpp:741: --- Begin node ---
[E] parsers/onnx/ModelImporter.cpp:742: input: "input.7"
input: "LayerNormGamma-0"
input: "LayerNormBeta-0"
output: "LayerNormV-0"
name: "LayerNormN-0"
op_type: "LayerNorm"
attribute {
name: "epsilon"
f: 1e-05
type: FLOAT
}
[E] parsers/onnx/ModelImporter.cpp:743: --- End node ---
[E] parsers/onnx/ModelImporter.cpp:745: ERROR: parsers/onnx/builtin_op_importers.cpp:5365 In function importFallbackPluginImporter:
[8] Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"
[E] In node 7 (importFallbackPluginImporter): UNSUPPORTED_NODE: Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"
[!] Could not parse ONNX correctly
Traceback (most recent call last):
File "demo-diffusion.py", line 482, in
demo.loadEngines(args.engine_dir, args.onnx_dir, args.onnx_opset,
File "demo-diffusion.py", line 241, in loadEngines
engine.build(onnx_opt_path, fp16=True,
File "/workspace/demo/Diffusion/utilities.py", line 72, in build
engine = engine_from_network(network_from_onnx_path(onnx_path), config=CreateConfig(fp16=fp16, profiles=[p],
File "", line 3, in func_impl
File "/usr/local/lib/python3.8/dist-packages/polygraphy/backend/base/loader.py", line 42, in call
return self.call_impl(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/polygraphy/backend/trt/loader.py", line 183, in call_impl
trt_util.check_onnx_parser_errors(parser, success)
File "/usr/local/lib/python3.8/dist-packages/polygraphy/backend/trt/util.py", line 85, in check_onnx_parser_errors
G_LOGGER.critical("Could not parse ONNX correctly")
File "/usr/local/lib/python3.8/dist-packages/polygraphy/logger/logger.py", line 597, in critical
raise PolygraphyException(message) from None
polygraphy.exception.exception.PolygraphyException: Could not parse ONNX correctly

@nekorobov
Collaborator

Hi, Volta is not supported for the fast plugins in stable diffusion. See #2560
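The "Plugin not found" assertion in the log fires because no plugin creator named LayerNorm is registered at engine-build time. A quick way to check what is registered (a diagnostic sketch, assuming the TensorRT Python wheel shipped in the container; run it in the same environment you launch the demo from):

python3 -c 'import tensorrt as trt; trt.init_libnvinfer_plugins(None, ""); print(sorted(c.name for c in trt.get_plugin_registry().plugin_creator_list))'

If LayerNorm is absent from the printed list, the demo's custom plugin library was not built or not loaded.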

@nekorobov added the triaged (Issue has been triaged by maintainers) and Demo: Diffusion (Issues regarding demoDiffusion) labels on Jan 3, 2023
@xiaohaipeng
Author

xiaohaipeng commented Jan 3, 2023

Are the results the same as those released on Hugging Face?
Can I use the models released on Hugging Face to run inference on a V100?
@nekorobov

Hi, Volta is not supported for the fast plugins in stable diffusion. See #2560

@nekorobov
Collaborator

Yes, the results are the same. You can get it running in TRT on V100, just without the optimized plugins.
To do so, please specify --onnx-minimal-optimizations --force-onnx-optimize --force-engine-build
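Put together, the invocation would look roughly like this (the prompt and --hf-token arguments are the usual ones from the demo README and are placeholders here; only the three flags above are specific to this workaround):

python3 demo-diffusion.py "an example prompt" --hf-token=$HF_TOKEN \
  --onnx-minimal-optimizations --force-onnx-optimize --force-engine-build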

@BugFreeee

Yes, the results are the same. You can get it running in TRT on V100, just without the optimized plugins. To do so, please specify --onnx-minimal-optimizations --force-onnx-optimize --force-engine-build

I'm seeing the same error with A100-40GB.

TensorRT Version: v8500
NVIDIA GPU: A100
NVIDIA Driver Version: 525.85.12
CUDA Version: 12.0
CUDNN Version: None
Operating System: Ubuntu
Python Version (if applicable): 3.8.10
Tensorflow Version (if applicable):
PyTorch Version (if applicable): 1.12.0+cu116
Baremetal or Container (if so, version): container, nvcr.io/nvidia/tensorrt:22.10-py3

================================================================================
cmd line out:

[I] Initializing StableDiffusion demo with TensorRT Plugins
Building TensorRT engine for onnx/clip.opt.onnx: engine/clip.plan
[W] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See CUDA_MODULE_LOADING in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
[W] parsers/onnx/onnx2trt_utils.cpp:375: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[E] parsers/onnx/ModelImporter.cpp:740: While parsing node number 7 [LayerNorm -> "LayerNormV-0"]:
[E] parsers/onnx/ModelImporter.cpp:741: --- Begin node ---
[E] parsers/onnx/ModelImporter.cpp:742: input: "input.7"
input: "LayerNormGamma-0"
input: "LayerNormBeta-0"
output: "LayerNormV-0"
name: "LayerNormN-0"
op_type: "LayerNorm"
attribute {
name: "epsilon"
f: 1e-05
type: FLOAT
}
attribute {
name: "axis"
i: -1
type: INT
}
[E] parsers/onnx/ModelImporter.cpp:743: --- End node ---
[E] parsers/onnx/ModelImporter.cpp:745: ERROR: parsers/onnx/builtin_op_importers.cpp:5365 In function importFallbackPluginImporter:
[8] Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"
[E] In node 7 (importFallbackPluginImporter): UNSUPPORTED_NODE: Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"
[!] Could not parse ONNX correctly
Traceback (most recent call last):
File "demo-diffusion.py", line 487, in
demo.loadEngines(args.engine_dir, args.onnx_dir, args.onnx_opset,
File "demo-diffusion.py", line 246, in loadEngines
engine.build(onnx_opt_path, fp16=True,
File "/workspace/demo/Diffusion/utilities.py", line 72, in build
engine = engine_from_network(network_from_onnx_path(onnx_path), config=CreateConfig(fp16=fp16, profiles=[p],
File "", line 3, in func_impl
File "/usr/local/lib/python3.8/dist-packages/polygraphy/backend/base/loader.py", line 42, in call
return self.call_impl(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/polygraphy/backend/trt/loader.py", line 183, in call_impl
trt_util.check_onnx_parser_errors(parser, success)
File "/usr/local/lib/python3.8/dist-packages/polygraphy/backend/trt/util.py", line 85, in check_onnx_parser_errors
G_LOGGER.critical("Could not parse ONNX correctly")
File "/usr/local/lib/python3.8/dist-packages/polygraphy/logger/logger.py", line 597, in critical
raise PolygraphyException(message) from None
polygraphy.exception.exception.PolygraphyException: Could not parse ONNX correctly

@BugFreeee

BugFreeee commented Mar 2, 2023

Yes, the results are the same. You can get it running in TRT on V100, just without the optimized plugins. To do so, please specify --onnx-minimal-optimizations --force-onnx-optimize --force-engine-build

I'm able to run it using git clone git@github.com:NVIDIA/TensorRT.git -b release/8.5 --single-branch.
The above error was reported when I was using the latest commit, which is 8.5.3.1.
Maybe you tweaked something in recent commits that is causing the error?

@rajeevsrao
Collaborator

@BugFreeee @xiaohaipeng can you please try the new demoDiffusion code in release/8.6? We have removed the need for plugins, and the deployment experience should hopefully be smoother.
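For reference, that branch can be fetched the same way as the release/8.5 checkout mentioned above:

git clone https://github.com/NVIDIA/TensorRT.git -b release/8.6 --single-branch
cd TensorRT/demo/Diffusion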

@BugFreeee
Copy link

@BugFreeee @xiaohaipeng can you please try the new demoDiffusion code in release/8.6? We have removed the need for plugins, and the deployment experience should hopefully be smoother.

Hi, thanks for the reply. I have root-caused the issue; it is indeed plugin related. I relaunched the built container and forgot to set the plugin environment variable, which causes the aforementioned error.
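For context, the 8.5 demo loads its custom plugins through an environment variable along these lines (a sketch; the exact variable name and library path come from the demo's build steps and may differ in your setup):

export PLUGIN_LIBS="$TRT_OSSPATH/build/out/libnvinfer_plugin.so"
LD_PRELOAD=${PLUGIN_LIBS} python3 demo-diffusion.py "an example prompt" --hf-token=$HF_TOKEN

Without the LD_PRELOAD, the custom plugin creators never get registered, which is exactly what the "Plugin not found" assertion reports.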

Also, in the recent update you mentioned the removal of those plugins. Will that hurt performance? Are you going to add them back in the future?

@ttyio
Copy link
Collaborator

ttyio commented Jul 18, 2023

@BugFreeee, those plugins had some accuracy issues, and we fixed them in the native implementation. We have no plan to add them back. I will close this since it works on your side. Thanks!

@ttyio ttyio closed this as completed Jul 18, 2023