[Bug]: CUDA error at ~60% each time. 4070 - 8gb vram - 16gb DDR5 ram. #268
Comments
I have the literal exact same problem, 4070 and all. I have tried using xformers, not using xformers, disabling xformers, updating pip and torch, and so many other adjustments. I could never get a single GIF to generate; it would just stop at around 60% and throw that same error.
I have no idea what's going on on your side and I cannot reproduce your error. What's your PyTorch version? Try the following:
If 1 and 2 do not work, try a fresh re-install of A1111, ControlNet, and this extension, and see what's going on.
I have this problem too. I tried using xformers and sdp but it made no difference. Is there any other way to fix it besides deleting the whole A1111 install and all of its dependencies?
I have no idea why you are hitting this problem. Reverting to v1.10.0 is a good first step, to test whether the problem was introduced by the update from v1.10.0 to v1.11.0.
3060 12G, same problem here.
version: v1.6.0 • python: 3.10.11 • torch: 2.0.1+cu118 • xformers: 0.0.21 • gradio: 3.41.2 • checkpoint: 4a5eb827f4
Who can try reverting the AnimateDiff extension to v1.10.0? Please let me know whether or not v1.10.0 works for you. The problem actually popped up after the v1.11.0 update. Read my comments above for how to do it.
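(One common way to revert, assuming the extension was installed via git; this is an assumption, not the author's exact steps: open a terminal in `extensions/sd-webui-animatediff`, run `git checkout v1.10.0`, and restart the webui.)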
There is no way for me to reproduce your problem. Your experiments are my only means of addressing it.
Another thing you can do: screenshot your webui so I can see ALL of your configuration.
I don't believe this is a VRAM problem. There must be a bug somewhere.
Add print(_context) before this line, with the same indentation as L187. Post the terminal log here.
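A minimal sketch of the requested debug line (the surrounding extension code is not reproduced here; only the print is being added):

```python
# Sketch: inside the extension function the author pointed at, just before
# the referenced line and at the same indentation.
print(_context)  # dumps the frame-index context list to the terminal,
                 # e.g. [0, 1, 2, ..., 31] as seen in the logs below
```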
After some quick experiments, I believe channels last is one of the causes of the problem. Without channels last, the VRAM consumption is only A: 5.33 GB, R: 8.16 GB, Sys: 12.0/23.6445 GB (50.6%), but with channels last it becomes A: 11.25 GB, R: 13.29 GB, Sys: 17.1/23.6445 GB (72.2%). Please remove channels last and re-try.
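For context, `--opt-channelslast` switches tensors to PyTorch's channels-last (NHWC) memory format. A minimal illustration of the underlying PyTorch API (not the extension's code):

```python
import torch

x = torch.randn(1, 4, 64, 64)                   # default NCHW layout
x_cl = x.to(memory_format=torch.channels_last)  # NHWC-strided copy
print(x_cl.is_contiguous(memory_format=torch.channels_last))  # True
```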
Info:
What I tried:
Change torch version
Sometimes I succeeded in generating the images (not the GIF) by continuing to free the CUDA cache (as suggested here; a generic sketch is shown after this comment), but now I can't anymore and I don't know why. Free CUDA cache: I opened cmd again and used
When it worked I got:
and I didn't use a motion LoRA. Error log, with the context print added:
Update: How do I remove channels last?
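For reference, a generic PyTorch sketch of freeing the CUDA cache from a Python shell (not necessarily the exact command used above):

```python
import gc
import torch

gc.collect()              # drop dangling Python references first
torch.cuda.empty_cache()  # return cached, unused blocks to the driver
```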
@GiusTex I don't see any _context print information in your attached terminal log. Also, the terminal log does not look like v1.10.0; you did not successfully revert.
I changed it back to the latest version. About the context print: there were many numbers under the generation bar, but when I got the error they changed. Update: I'll revert to v1.10.0 and copy the console output from before and after the error.
No need to worry about extremely long output. I'm also renting a 3080. I have no idea what's going on, and I cannot reproduce this error in any way on a 4090. Do not add --opt-channelslast to your command line arguments.
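(Presumably this means deleting `--opt-channelslast` from the `set COMMANDLINE_ARGS=` line in `webui-user.bat`; that location is the usual A1111 setup and an assumption on my part.)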
I get it: the context gets stripped when I turn the text into a code block, damn. I'll post it as plain text then.
0%| | 0/22 [00:00<?, ?it/s][0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
I'm testing without
I see. So it doesn't seem to be an update problem. I'm still investigating; I have no idea what's going on.
Why are people only starting to report this issue now? I don't understand. Are all the people here new users?
webui-user.bat:
Webui options:
The message above has the error log with channels last; below is the log without it:
0%| | 0/22 [00:00<?, ?it/s][0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
And yes, I'm a new user... sorry. I did search everywhere yesterday before asking here, though.
I understand your situation, but I don't understand the problem. Please be patient while I work on it.
I posted the new log just for completeness, in case something changed before and after the
is making me crazy because I can't replicate it with cu117 (only with cu118, which doesn't work).
@GiusTex Do you mean that it works with cu117? I'm using cu117, so maybe it really is a PyTorch problem.
Yeah, for him it worked, but I can't find a way to download the dependencies with cu117.
If I use /cu117, it says it can't find a torchvision version (from versions: none), and online I can't find which torchvision version to use with cu117.
https://download.pytorch.org/whl/nightly/cu117 Ask the PyTorch website and/or GPT what you should do. Maybe you can find a way to re-install torchvision from this link. Activate your venv before doing that.
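For example, a hedged sketch (the pairing is standard, but not verified against this exact setup): with the venv activated, `pip install torch==2.0.1 torchvision==0.15.2 --index-url https://download.pytorch.org/whl/cu117`; torchvision 0.15.2 is the release that matches torch 2.0.1.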
This is an extremely unexpected problem, and I think it is most likely a PyTorch problem.
I think so too. That, or the CUDA version. I'm trying 2.0.1+cu117. Update:
Damn, I have to use ... Update: 2.0.1+cu117 install:
version: v1.6.0 • python: 3.11.4 • torch: 2.1.0+cu121 • xformers: N/A • gradio: 3.41.2 • checkpoint: a1535d0a42. This setup on a 3080 does not produce any problem, even with prompt travel. Please DO NOT USE torch 2.0.1+cu118. A: 5.32 GB, R: 6.25 GB, Sys: 6.5/9.77539 GB (66.9%)
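(For anyone wanting to match this setup, a hedged example: inside the venv, `pip install torch==2.1.0 torchvision==0.16.0 --index-url https://download.pytorch.org/whl/cu121`; torchvision 0.16.0 is the release paired with torch 2.1.0.)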
Damn. I forgot my charger at work, so I can't test this yet, but after looking at the responses, I'm confident that after setting those parameters to zero I'll be in business.
This does not solve the issue of seed drifting (AnimateDiff uses a different seed when generating images, not the one provided by the user). ISSUE NOT SOLVED.
@3dcinetv your issue is a different one, not related to this thread.
Where do I find these settings, or the section to change to 0?
It's in the Stable Diffusion webui, under the Settings tab -> Optimizations.
Thanks mate.
I checked my settings for optimisations, but those four sliders were already at zero.
The problem persists: the image generator stops functioning midway. Additionally, the specified width and height parameters are not respected and revert to 512. This causes a discrepancy between the GIF and the picture generated at 512 x 768; the GIF defaults to 512 x 512 instead.
Deleted venv
@pipa0979 This error is unrelated to this issue. It is an xformers bug. Go to
For people coming here: this issue is related to you ONLY IF you switch from xformers to
This is the exact same issue. I think you are talking about these settings, right?
My Cmd
I am currently on
No, you should not use the first option. You should NOT use xformers.
Yes, that worked @continue-revolution. For the error above, did deleting the venv dir from the SD home and re-running the
I have the same problem too. a. When you have --xformers in your command line args, you want AnimateDiff to > Optimize attention layers with sdp, or (but image generation is very slow!) zero out Optimizations > Cross attention optimization: Automatic. Version:
From my testing, it looks like only
Yes, I ran into the same problem yesterday. On my 3060 12G, the webui crashes at 31% while AnimateDiff is generating a video. The exact error is:
From my testing, rolling back to 1.5.2 and then upgrading to 1.6.0 makes AnimateDiff work fine again. But because I felt my device's generation speed was not fast enough, I changed the configuration
(Also, I noticed that even though I only changed the two settings above, over forty modified items were saved. Sorry, I don't have a screenshot here; if you can reproduce it, please report back.) Steps to reproduce: Solution:
Alternatively, could the webui provide a way to restore the default configuration, so that config.json can be reset to its unmodified values?
Deleting config.json restores the defaults.
Oh, my friend, that method might actually work. If it solves the problem, I hereby grant you the title of legend.
I found the cause last night. In the Qiuye (秋叶) launcher's advanced settings, under the PyTorch configuration, do not pick a config with the CUDA option; just pick plain PyTorch. But now I've hit another problem: generation is very slow and I don't know what to do about it. With the LCM sampler, enabling the optimization options can produce a 1024x1536 image within 10s, but generating 16 frames takes me ten minutes when it should take about three. I don't know why it became so slow, but it no longer runs out of VRAM, which is worth celebrating. I hope the generation speed under the LCM sampler can be optimized.
I agree.
I have this problem too, even though I only have a 4 GB 1650 Ti.
It works, thanks bro @FreeeFry
I ran into the same problem; the console reports a device error when the request is made. GPU info: [nvidia-smi table truncated]
Model: https://huggingface.co/SimianLuo/LCM_Dreamshaper_v7/tree/main
I'm experiencing this same error, but instantly when starting the generation. I'm on stable diffusion webui forge 29be1da7cf2b5dccfc70fbdd33eb35c56a31ffb7 (currently the latest version; I also experienced this issue on older versions).
Steps to reproduce:
Torch versions:
Sounds like you've selected the mm_sd15_v3_adapter (LoRA) as the Motion module?
@FreeeFry
Is there an existing issue for this?
Have you read the FAQ on the README?
What happened?
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
This error happens each time I try to run AnimateDiff. I have tried 256x256 and 512x512. I have tried xformers and opt-sdp. I have tried turning prompt travel off and on. I have tried all of the motion models, including the newest. I am not sure what the problem is.
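As an aside, the `CUDA_LAUNCH_BLOCKING=1` hint from the traceback can be applied like this (a generic sketch; the variable must be set before the webui initializes CUDA):

```python
import os

# Must be set before CUDA is initialized, i.e. before the first torch.cuda call.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch  # kernel launches now run synchronously, so the reported
              # stack trace points at the op that actually failed
```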
Steps to reproduce the problem
What should have happened?
Create a working gif or an mp4.
Commit where the problem happens
webui: A1111 - newest version
extension: animatediff
What browsers do you use to access the UI?
Google Chrome
Command Line Arguments
--opt-sdp-attention --medvram-sdxl --no-half-vae --xformers --force-enable-xformers --medvram-sdxl --no-half-vae --opt-channelslast. I've also tried simpler command line args, without no-half and channelslast.
Console logs
Additional information
No response