
StableDiffusion / Non-zero status code returned while running Add node. Name:'Add_221' #142

Open
imranypatel opened this issue May 30, 2024 · 5 comments


@imranypatel

While trying the Basic Stable Diffusion Example from https://www.nuget.org/packages/OnnxStack.StableDiffusion, at the following line in the code:

    // Run Pipeline
    var result = await pipeline.GenerateImageAsync(promptOptions);

the following exception is raised:

Microsoft.ML.OnnxRuntime.OnnxRuntimeException
  HResult=0x80131500
  Message=[ErrorCode:RuntimeException] Non-zero status code returned while running Add node. Name:'Add_221' Status Message: D:\a\_work\1\s\onnxruntime\core\providers\dml\DmlExecutionProvider\src\MLOperatorAuthorImpl.cpp(2482)\onnxruntime.DLL!00007FFC280E7AA5: (caller: 00007FFC280E712D) Exception(3) tid(1b6c) 80004005 Unspecified error

  Source=Microsoft.ML.OnnxRuntime
  StackTrace:
   at Microsoft.ML.OnnxRuntime.NativeApiStatus.VerifySuccess(IntPtr nativeStatus)
   at Microsoft.ML.OnnxRuntime.InferenceSession.<>c__DisplayClass75_0.<RunAsync>b__0(IReadOnlyCollection`1 outputs, IntPtr status)
--- End of stack trace from previous location ---
   at Microsoft.ML.OnnxRuntime.InferenceSession.<RunAsync>d__75.MoveNext()
   at OnnxStack.StableDiffusion.Pipelines.StableDiffusionPipeline.<EncodePromptTokensAsync>d__39.MoveNext()
   at OnnxStack.StableDiffusion.Pipelines.StableDiffusionPipeline.<GeneratePromptEmbedsAsync>d__40.MoveNext()
   at OnnxStack.StableDiffusion.Pipelines.StableDiffusionPipeline.<CreatePromptEmbedsAsync>d__37.MoveNext()
   at OnnxStack.StableDiffusion.Pipelines.StableDiffusionPipeline.<RunInternalAsync>d__31.MoveNext()
   at OnnxStack.StableDiffusion.Pipelines.StableDiffusionPipeline.<GenerateImageAsync>d__26.MoveNext()
   at TestOnnxStack.TestStableDiffusion.<Test01>d__1.MoveNext() in ...\TestOnnxStack\TestStableDiffusion.cs:line 42
   at Program.<<Main>$>d__0.MoveNext() in ...i\TestOnnxStack\Program.cs:line 6

To reproduce:

  1. Create a .NET 8 console project.
  2. Add the NuGet packages Microsoft.ML.OnnxRuntime.DirectML (1.17.3) and OnnxStack.StableDiffusion (0.31.0).
  3. Copy the code into Program.cs from the Basic Stable Diffusion Example in the https://www.nuget.org/packages/OnnxStack.StableDiffusion documentation.
  4. Create a d:\model folder and run git clone https://huggingface.co/runwayml/stable-diffusion-v1-5 -b onnx into it.
  5. Change the model path in the code, e.g. var pipeline = StableDiffusionPipeline.CreatePipeline(@"D:\model\stable-diffusion-v1-5");
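For reference, the reproduction program looks roughly like this. This is a sketch based on the package's Basic Stable Diffusion Example; exact type and method names (e.g. SaveAsync on the result) may differ between OnnxStack versions:

```csharp
using OnnxStack.StableDiffusion.Config;
using OnnxStack.StableDiffusion.Pipelines;

// Create the pipeline from the local ONNX model folder
var pipeline = StableDiffusionPipeline.CreatePipeline(@"D:\model\stable-diffusion-v1-5");

// Prompt options: the text to turn into an image
var promptOptions = new PromptOptions
{
    Prompt = "Photo of a cat"
};

// Run pipeline -- this is the line that throws on the affected machine
var result = await pipeline.GenerateImageAsync(promptOptions);

// Save the generated image to disk
await result.SaveAsync("Result.png");
```

The program needs the full ONNX model folder (tokenizer, text encoder, UNet, VAE) on disk, so it cannot run standalone.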

Platform

  • Windows 10
  • .NET 8.0.200
  • OnnxStack.StableDiffusion 0.31.0
  • Microsoft.ML.OnnxRuntime.DirectML Version="1.17.3"
  • Target CPU x64
@saddam213
Member

saddam213 commented May 30, 2024

That's an odd-looking error coming from deep within ONNX Runtime.

I'll download that model and see if it's a regression; it's been a while since I've used that version of the model.

I have this version on disk, and that seems to work ok following your steps
https://huggingface.co/TensorStack/stable-diffusion-v1-5-onnx

I will check the other model now and update you with what I find

EDIT:

Downloaded a fresh copy of the model you used and it seemed to work fine

[screenshot: generated image]

It must be another cause. A corrupt download, maybe?
What kind of GPU/Device are you using?

@imranypatel
Author

I downloaded the model from https://huggingface.co/TensorStack/stable-diffusion-v1-5-onnx.

Now I'm getting a slightly different error than the one above:

Microsoft.ML.OnnxRuntime.OnnxRuntimeException
  HResult=0x80131500
  Message=[ErrorCode:RuntimeException] Non-zero status code returned while running Mul node. Name:'' Status Message: D:\a\_work\1\s\onnxruntime\core\providers\dml\DmlExecutionProvider\src\MLOperatorAuthorImpl.cpp(2482)\onnxruntime.DLL!00007FFC9DD77AA5: (caller: 00007FFC9DD7712D) Exception(3) tid(6c74) 80004005 Unspecified error

  Source=Microsoft.ML.OnnxRuntime
  StackTrace:
   at Microsoft.ML.OnnxRuntime.NativeApiStatus.VerifySuccess(IntPtr nativeStatus)
   at Microsoft.ML.OnnxRuntime.InferenceSession.<>c__DisplayClass75_0.<RunAsync>b__0(IReadOnlyCollection`1 outputs, IntPtr status)
--- End of stack trace from previous location ---
   at Microsoft.ML.OnnxRuntime.InferenceSession.<RunAsync>d__75.MoveNext()
   at OnnxStack.StableDiffusion.Pipelines.StableDiffusionPipeline.<EncodePromptTokensAsync>d__39.MoveNext()
   at OnnxStack.StableDiffusion.Pipelines.StableDiffusionPipeline.<GeneratePromptEmbedsAsync>d__40.MoveNext()
   at OnnxStack.StableDiffusion.Pipelines.StableDiffusionPipeline.<CreatePromptEmbedsAsync>d__37.MoveNext()
   at OnnxStack.StableDiffusion.Pipelines.StableDiffusionPipeline.<RunInternalAsync>d__31.MoveNext()
   at OnnxStack.StableDiffusion.Pipelines.StableDiffusionPipeline.<GenerateImageAsync>d__26.MoveNext()
   at TestOnnxStack.TestStableDiffusion.<Test01>d__1.MoveNext() in ...\TestOnnxStack\TestStableDiffusion.cs:line 42
   at Program.<<Main>$>d__0.MoveNext() in ...i\TestOnnxStack\Program.cs:line 6

Device/CPU:

[screenshot: CPU information]

GPU:

[screenshot: GPU information]

I'm kind of stuck at the moment, as I'm just a beginner in the ML.NET/ONNX/DirectML space.

Thank you for your support.

@saddam213
Member

saddam213 commented May 30, 2024

Unfortunately, your GPU may not have enough VRAM for Stable Diffusion; at minimum you would need 3-4 GB for an FP16 model.

You can switch to CPU mode by using the CPU execution provider and see if that works:

var pipeline = StableDiffusionPipeline.CreatePipeline(@"D:\models\test\stable-diffusion-v1-5", executionProvider: ExecutionProvider.Cpu);

It's ok, I'm just learning ML too :)

@imranypatel
Author

I was expecting better error reporting, or at least some troubleshooting help, from the .NET framework and the ML APIs built on top of it. The situation seems not much different from the Python platforms.

I tried the CPU execution provider, only to see a similar problem. I'm not copying the details here because, based on your response, I think I'd better first get my hardware/software platform in order for ML on Windows/.NET.

Based on your learning so far, could you suggest or point me to platform requirements (laptop, CPU, GPU, RAM, Windows OS, etc.) for a developer looking to explore ML in general and the ML.NET space in particular?

Good luck on your learning ride!

Thank you.

@imranypatel
Author

I tried on another machine, with success.

Typical GPU state during image generation:

[screenshot: Task Manager GPU utilization]

dxdiag system:

[screenshot: dxdiag system information]

dxdiag AMD Radeon GPU:

[screenshot: dxdiag display information]

It takes about 8 minutes per image generation on average.

Now I'm looking into how to reduce that time, which is of course not acceptable at the moment.

I would welcome any suggestions in that direction!
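One common lever is the number of scheduler inference steps: each step is a full UNet pass, so fewer steps cut generation time roughly proportionally, at some cost in image quality. A sketch, assuming OnnxStack exposes a SchedulerOptions overload of GenerateImageAsync as shown in its package documentation (property names and defaults may differ between versions):

```csharp
using OnnxStack.StableDiffusion.Config;
using OnnxStack.StableDiffusion.Pipelines;

var pipeline = StableDiffusionPipeline.CreatePipeline(@"D:\model\stable-diffusion-v1-5");

var promptOptions = new PromptOptions { Prompt = "Photo of a cat" };

// Fewer steps => fewer UNet evaluations => faster generation (lower quality).
// 15 and 7.5f are illustrative values, not recommendations from this thread.
var schedulerOptions = new SchedulerOptions
{
    InferenceSteps = 15,
    GuidanceScale = 7.5f
};

var result = await pipeline.GenerateImageAsync(promptOptions, schedulerOptions);
await result.SaveAsync("Result.png");
```

Beyond step count, the usual options are a faster execution provider (DirectML/CUDA instead of CPU) or a distilled model such as LCM, which needs far fewer steps.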
