Replies: 9 comments 6 replies
-
just buy nvidia holy shit
-
AMD is currently a second-class citizen when it comes to AI, and sadly it's all on them; there's not much any dev can do after hitting AMD's wall. If you want an "it just works" experience, then Nvidia is a must. ZLUDA might eventually become the better alternative for AMD owners.
-
Blame AMD; the developers are doing their best, and we're using this for free. I sold my old GPU on eBay and bought an Nvidia card, and everything works fine. I've tested almost every webui. With an RTX 3060 12 GB it's not too fast, but it does the job.
-
AMD's drivers were once a nightmare in gaming |
-
I have a 6600; while it's not the best experience, it works at least as well as ComfyUI for me atm. Do these changes: #58 (comment). With that custom model, the initial gen starts at 30-40 secs, but after a few generations it drops to around 7-8 sec/it. With 4 steps, including tiled VAE, it takes around 40 to 70 seconds per image. It's a bit faster in ComfyUI, but Forge is slowly getting there.
-
OK, it works, but there's one big problem: it eats all of my VRAM and RAM (and freezes). How do I make lowvram mode work with my 4 GB of VRAM? In standard Automatic1111 it works OK, but I thought maybe Forge would be faster...
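For what it's worth, lowvram behavior in Forge is controlled by launch flags rather than in-app settings. Below is a hypothetical sketch of how that looks in `webui-user.sh`; `--always-low-vram` is the flag mentioned later in this thread, but flag names can change between Forge versions, so check your build's `--help` output first.

```shell
# Hypothetical launch configuration for a 4 GB card (webui-user.sh on Linux;
# on Windows the equivalent line in webui-user.bat uses `set` instead of
# `export`). --always-low-vram is the flag discussed in this thread; exact
# flag names may differ between Forge versions, so verify with --help.
export COMMANDLINE_ARGS="--always-low-vram"
```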
-
I also have an RX 6600 XT. Not great speed, but not bad either; I'm not complaining. Also, #58 is working fine after so many updates. My files are unchanged; I didn't edit anything and it works fine. As for live preview, it doesn't work for me... I don't know why.
-
But that's the problem! Neither --always-low-vram nor no-vram is working. Generation starts at first, but then it quickly fills the whole VRAM and freezes. In standard Automatic1111 I can even multi-task while generation is running (watching YouTube, etc.) in lowvram mode.
-
Just use ZLUDA with SD.Next.
-
So here goes a story.
Almost a year ago (maybe less), Automatic1111 came online and went public... but I couldn't run it. It kept saying no CUDA cores... tensor stuff. WHYY?
I have an RX 6600 XT GPU... that's an AMD card, not Nvidia... oh bummer, but it was a decent GPU. At first I was disappointed I couldn't run it at all. No CUDA cores on AMD... but alas, I was stubborn, so I went on an adventure to find a fix for my issue... or make it work... even if it was slow.
Searching the internet, I found out there was a fork of Automatic1111... a DirectML fork JUST FOR AMD.
I was as happy as a child... but I could not generate images at 512x512. It kept saying out of memory. But I had 8 GB of VRAM... how was I out?
I read that people were using older graphics cards that ran it fine... of course, those were Nvidia GPUs.
Alas, I was stubborn again... and decided, "If we can run it, we can fix it"...
Image generation worked on SD 1.5, but it would fail 100 times before it could make a single picture. Sometimes it would reach 100% but needed extra RAM for the VAE, so it threw an out-of-memory error. And that picture took about 15 minutes at 20 steps.
A single image took 15 minutes, and I had to restart my PC every time it made one... ouch. Looking at people running it on CPU... they were much slower than me, so I was happy.
Then I found the MiniSD model... it could generate images at 256x256 at lightning speed, like 2-3 seconds per generation.
Great... but the image quality was poor... so time passed and DirectML improved.
Attempts to generate at 512x512 worked about 70% of the time; I had to restart my PC every 100 runs because it ran out of memory and just refused to generate. And the time to generate the same image dropped from 15 minutes to about 4. So it got faster over time, and that made me happy.
Then came the LCM models/LoRAs that I currently use... they cut my generation time by 80%.
What used to take 20 steps and 4 minutes now takes about 9-15 seconds at 4-8 steps per picture, with a 40% chance of refusing to run at all.
Then ComfyUI came... it was fun, it worked fine for me, even a bit faster... but the UI was unbearable for me. It's more modular, and I like that idea, but I don't like the result. It feels like a visual representation of how things work, but the nodes and lines are just messy.
It cut my times by about 30%, but I just can't use it... I was just meh about it.
So we come to Forge. I saw on YouTube 40%-75% speedups for lower-grade graphics cards... on Nvidia. I figured it must be faster on AMD as well. And I was disappointed at first, because it didn't work out of the box as promised.
I downloaded it... the 1-click installer didn't work. Git clone the URL... run... nope, won't work.
I needed DirectML... installed it. New bug: "TypeError: 'NoneType' object is not iterable".
I looked around and found a fix for it... and it worked. Some people found you have to set the RNG source from GPU to CPU and edit some files. Did that, restarted my PC... and it worked. First of all, 512x512 worked 100% of the time... at a faster rate than the DirectML fork or ComfyUI (they both sometimes just refuse to generate at 0%, or they reach 100% but run out of memory at VAE decode).
Forge was like, "Here we go!... I'm out of memory, but I'll try something else"... and that "something else" later made me jump around my own room like a child.
I said to myself, "Something that actually works on my GPU with decent speed. YES!" So I loaded other models and generated images at a really nice speed. I was impressed, and I wanted to see whether it could load SDXL... I knew that definitely didn't work on the DirectML fork or ComfyUI, so why would it work here... I told myself this might not work, and that was fine. Only thing... it started generating the image... "WHAT?!" I said to myself, this is going to tell me out of memory when it reaches the VAE decode from latent to image. I was giggling.
And I was right, and I was sad... but a second later it said "trying with TILED VAE"?!... What? I was confused... and it took about 4-8 seconds to show me what was generated (preview didn't work for some reason). I tried again... and again, and it worked 100% of the time, with the same error followed by tiled VAE... and each time it would generate.
The picture was created 100% of the time... 1024x1024 took about 35-40 seconds (8 steps, LCM/Turbo)... I was happy and confused. WHY does this work only here? What kind of magic did you guys do? A few days later Stable Cascade came out and I tried it... it didn't work as expected. But hey, this works!
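The tiled-VAE fallback described above can be sketched roughly like this: instead of decoding the whole latent in one shot (which can OOM on an 8 GB card), the latent is split into tiles, each tile is decoded on its own, and the pieces are stitched back together. This is a minimal illustrative sketch, not Forge's actual implementation; `decode` here is a stand-in for the real VAE decoder, and the tile size is arbitrary.

```python
# Illustrative sketch of tiled VAE decoding (NOT Forge's real code).
# The idea: many small decodes each fit in VRAM even when one big
# decode would not, at the cost of a little extra time.
import numpy as np

def decode(tile: np.ndarray) -> np.ndarray:
    """Stand-in VAE decoder: upscale each latent tile 8x, like SD's VAE
    maps a 128x128 latent to a 1024x1024 image."""
    return tile.repeat(8, axis=0).repeat(8, axis=1)

def tiled_decode(latent: np.ndarray, tile: int = 64) -> np.ndarray:
    h, w = latent.shape
    out = np.zeros((h * 8, w * 8), dtype=latent.dtype)
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            patch = latent[y:y + tile, x:x + tile]
            # Decode one small tile at a time and paste it into place.
            out[y * 8:(y + patch.shape[0]) * 8,
                x * 8:(x + patch.shape[1]) * 8] = decode(patch)
    return out

latent = np.ones((128, 128))   # a 128x128 latent -> a 1024x1024 "image"
image = tiled_decode(latent)
print(image.shape)             # (1024, 1024)
```

Real VAEs aren't perfectly local like this stand-in, which is why production tiled decoders overlap the tiles and blend the seams; the memory-saving principle is the same.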
Why did it work... should I complain or should I celebrate? I still don't know...
And today I was worried... now nothing works, due to the latest commits.
I was playing with Forge, and since the latest commit the DirectML part broke. "TypeError: 'NoneType' object is not iterable" showed up again.
I tried reinstalling Forge: fresh install, edited files, unedited files, restarted the PC, changed the model... nothing works.
So I came to the conclusion that someone will need time to fix this... because 1-3 days ago it worked!
So, my question to the dev team: WHAT DID YOU BREAK!? hahahhaha