can't start new thread error after running for a while #170
This looks like it may be a memory leak rather than a thread leak:
There are only 3 active threads, per the logs, and VRAM usage is reasonable, but virtual memory has been pretty much maxed out for all of the threads. This happened after 86 requests, using this repro script:

```sh
test_images=0
while true;
do
  curl 'http://10.2.2.79:5000/api/txt2img?cfg=16.00&steps=35&scheduler=deis-multi&seed=-1&prompt=an+astronaut+eating+a+hamburger&negativePrompt=&model=diffusion-snow-globe-v1&platform=any&upscaling=upscaling-real-esrgan-x2-plus&correction=correction-codeformer&lpw=false&width=512&height=512&upscaleOrder=correction-both' \
    -X 'POST' \
    --compressed \
    --insecure || break;
  ((test_images++));
  echo "waiting after $test_images";
  sleep 35;
done
```
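As a rough way to tell the two leaks apart, a periodic check like this will show the thread count staying flat while virtual memory climbs (a minimal sketch, assuming psutil is available; this is not code from the server):

```python
import threading
import time

import psutil

proc = psutil.Process()

while True:
    mem = proc.memory_info()
    # a thread leak shows up as a climbing thread count; a memory leak
    # shows up as climbing vms (virtual memory) with a flat thread count
    print(
        f"threads={threading.active_count()} (os={proc.num_threads()}) "
        f"rss={mem.rss // 2**20}MiB vms={mem.vms // 2**20}MiB"
    )
    time.sleep(30)
```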
Tested again with a fresh server; the error happened after 94 images. Virtual memory was very high, as before.
https://stackoverflow.com/a/66130494 suggests this could be related to, and possibly resolved by, #159.
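For context, the general failure mode looks like this (my paraphrase as a minimal sketch, not the linked answer's code): threads that are started but never finish accumulate until thread creation itself fails.

```python
import threading
import time

def handle_request():
    # stand-in for a request handler that never finishes
    time.sleep(3600)

# every started-but-never-finished thread reserves virtual memory for
# its stack; once process limits are hit, Thread.start() raises
# "RuntimeError: can't start new thread"
threads = []
while True:
    t = threading.Thread(target=handle_request)
    t.start()
    threads.append(t)
    print(f"started {len(threads)} threads")
```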
Running under waitress lasted for 93 images and gave a more useful error.
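To reproduce that setup, this is roughly how the app runs under waitress (a minimal sketch; the onnx_web.serve import path is an assumption, substitute the project's real Flask app object):

```python
# run the Flask app under waitress instead of the dev server; the
# import path below is hypothetical
from waitress import serve

from onnx_web.serve import app

# waitress uses a fixed worker thread pool, so thread exhaustion
# surfaces as an explicit error instead of a silent hang
serve(app, host="0.0.0.0", port=5000, threads=4)
```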
This is not related to the requests or the HTTP server: if many jobs are queued up, the memory leak keeps growing as they run, even after the requests have stopped.
This should be fixed by #205, which switches to a process worker pool. I ran the memory leak test script for 600 images with no problems.
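The idea behind the fix, as a minimal sketch (not the actual #205 implementation): run jobs in worker processes that are recycled periodically, so anything a job leaks is reclaimed by the OS when its worker exits.

```python
from multiprocessing import Pool

def run_job(params: dict) -> str:
    # placeholder for the actual diffusion pipeline; anything it leaks
    # dies with the worker process when the worker is recycled
    return f"rendered {params['prompt']}"

if __name__ == "__main__":
    jobs = [{"prompt": f"image {i}"} for i in range(100)]
    # maxtasksperchild restarts each worker after 10 jobs, so a slow
    # per-job leak cannot accumulate past that window
    with Pool(processes=1, maxtasksperchild=10) as pool:
        for result in pool.imap(run_job, jobs):
            print(result)
```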
I've tested the solution further, up through 1051 images, made a few improvements, and found a few issues.
With the v0.8.1 fixes, I've tested this up through 4242 images in a single series. There is still a bug when sending multiple requests in quick succession: if a worker is recycled in between, it may attempt to start a second job on the same device (given enough memory), and one of the jobs is likely to fail in some way.
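One way to avoid that race would be to claim devices atomically in the scheduler before starting a job; a minimal sketch of the idea (my suggestion, not the project's code; all names are hypothetical):

```python
import threading

# devices currently running a job, guarded by a lock in the scheduler
# process so a freshly recycled worker cannot claim a busy device
busy_devices: set[str] = set()
busy_lock = threading.Lock()

def try_claim_device(device: str) -> bool:
    """Atomically claim a device; returns False if it is already busy."""
    with busy_lock:
        if device in busy_devices:
            return False
        busy_devices.add(device)
        return True

def release_device(device: str) -> None:
    with busy_lock:
        busy_devices.discard(device)
```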
Depending on how many requests are made, and perhaps how many are cancelled, there is a thread leak that eventually exhausts the worker pool and then prevents Flask from responding to new requests.