Decrease memory usage #30
Comments
https://stackoverflow.com/a/40922528 may be a way to propagate backpressure in the thread pool
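The linked answer bounds the executor's otherwise unbounded work queue with a semaphore, so the producer blocks instead of queueing everything in memory. A minimal sketch of that technique (class and names here are illustrative, not from the project):

```python
# Bound the number of in-flight tasks: submit() blocks once queue_cap
# tasks have been submitted but not yet completed.
from concurrent.futures import ThreadPoolExecutor
from threading import BoundedSemaphore

class BoundedExecutor:
    def __init__(self, max_workers, queue_cap):
        self._executor = ThreadPoolExecutor(max_workers=max_workers)
        self._semaphore = BoundedSemaphore(queue_cap)

    def submit(self, fn, *args, **kwargs):
        self._semaphore.acquire()  # blocks when queue_cap tasks are in flight
        try:
            future = self._executor.submit(fn, *args, **kwargs)
        except Exception:
            self._semaphore.release()
            raise
        # Release the slot when the task finishes, successfully or not.
        future.add_done_callback(lambda _: self._semaphore.release())
        return future

    def shutdown(self, wait=True):
        self._executor.shutdown(wait=wait)
```

With this wrapper, at most `queue_cap` downloaded items can sit in memory waiting for a worker, which directly caps the queue's memory footprint.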
One thing to try is consuming tuples out of pandas rather than dicts. Less convenient, but Python does no extra bookkeeping for tuples the way it does for dicts.
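An illustrative comparison (not project code): iterating a DataFrame as plain tuples avoids allocating one dict, with its hash-table overhead, per row.

```python
# Compare per-row memory of dict rows vs plain-tuple rows.
import sys
import pandas as pd

df = pd.DataFrame({"url": ["a", "b"], "caption": ["x", "y"]})

# Dict per row: convenient (row["url"]) but each dict carries a hash table.
dict_rows = df.to_dict(orient="records")

# Plain tuples: access by position, far smaller per-row footprint.
# name=None makes itertuples yield ordinary tuples, not namedtuples.
tuple_rows = list(df.itertuples(index=False, name=None))

assert sys.getsizeof(tuple_rows[0]) < sys.getsizeof(dict_rows[0])
```

The saving is per row, so it compounds across millions of queued items.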
Need to close the tarfile.
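A minimal sketch of the fix (function and file names hypothetical): open the tar through a context manager so it is always closed, even if a write fails midway.

```python
# Write (name, bytes) pairs into a tar shard, guaranteeing close on exit.
import io
import tarfile

def write_shard(path, images):
    """images: iterable of (name, data_bytes) pairs."""
    with tarfile.open(path, "w") as tar:  # closed automatically on exit
        for name, data in images:
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
```

Closing promptly releases the file handle and any internal buffers instead of waiting for garbage collection.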
The pandas tuples helped; closing the tarfile not so much, but it's good practice. Will need more analysis to decrease memory further.
https://stackoverflow.com/a/47058399 — using a semaphore seems like a good idea to apply backpressure on the queue. Making it so only up to N images (probably N = number of threads × 1.5) are waiting will cap the memory usage per process at average original image size × N. That would be 192 MB per process for original images of 1 MB, which is much better than the current uncapped size.

Identifying and fixing the memory leak that grows until the process is killed wouldn't hurt (maybe it's the same cause: items previously in the queue didn't get cleared from memory?). Worth investigating.

This memory problem is particularly visible when the original images are large.
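A quick check of the arithmetic above. The thread count is not stated in the comment, so it is inferred from the 192 MB figure (192 = threads × 1.5 with 1 MB images implies 128 download threads):

```python
# Back-of-the-envelope memory cap with semaphore backpressure.
# threads=128 is an assumption derived from the 192 MB figure, not a
# value taken from the project configuration.
threads = 128
n = int(threads * 1.5)       # max images allowed to wait in the queue
avg_image_mb = 1             # average original image size in MB
cap_mb = n * avg_image_mb    # worst-case queue memory per process
assert cap_mb == 192
```

The cap scales linearly with image size, which matches the observation that the problem is worst on large originals.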
Things to try (to fix the memory leak in the process pool):
Closing the tarfile didn't help a lot.
This is mostly solved.
Currently the memory usage is about 1.5 GB per core. That's way too much; it must be possible to decrease it.
Figure out what's using all that RAM (is it because the resize queue is full? Should there be some backpressure on the downloader, etc.?) and solve it.