Decrease memory usage #30

Closed
rom1504 opened this issue Aug 24, 2021 · 8 comments


rom1504 commented Aug 24, 2021

Currently the memory usage is about 1.5GB per core. That's way too much; it must be possible to decrease it.
Figure out what's using all that RAM (is it because the resize queue is full? should there be some backpressure on the downloader? etc.) and solve it.


rom1504 commented Aug 24, 2021

https://stackoverflow.com/a/40922528 may be a way to propagate backpressure in the thread pool
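
A minimal sketch of that idea (illustrative only, not this repo's actual code): wrap ThreadPoolExecutor.submit with a BoundedSemaphore so that at most `bound` tasks can be queued or running at once; producers then block instead of piling pending downloads up in memory. The class name and the `bound` parameter are made up here.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import BoundedSemaphore


class BoundedThreadPoolExecutor(ThreadPoolExecutor):
    """Thread pool whose submit() blocks once `bound` tasks are queued or running."""

    def __init__(self, max_workers, bound):
        super().__init__(max_workers=max_workers)
        self._slots = BoundedSemaphore(bound)

    def submit(self, fn, *args, **kwargs):
        self._slots.acquire()  # blocks the caller when the bound is reached
        try:
            future = super().submit(fn, *args, **kwargs)
        except Exception:
            self._slots.release()
            raise
        # free the slot when the task finishes, successfully or not
        future.add_done_callback(lambda _: self._slots.release())
        return future
```

For example, `BoundedThreadPoolExecutor(max_workers=64, bound=96)` would keep at most 96 downloads in flight at any time.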


rom1504 commented Aug 24, 2021

One thing is to use tuples out of pandas rather than dicts. Less convenient, but Python does no magic with dicts: every row ends up carrying its own keys and hash table, which adds up.
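
For illustration only (the column names here are hypothetical, not the actual shard schema): `itertuples` yields plain tuples with no per-row key storage, whereas `to_dict(orient="records")` builds one dict per row.

```python
import pandas as pd

df = pd.DataFrame({"url": ["https://example.com/a.jpg"], "caption": ["a photo"]})

# Heavier: one dict per row, each row repeating the column names as keys.
rows_as_dicts = df.to_dict(orient="records")

# Lighter: plain tuples, column names not stored per row at all with name=None.
rows_as_tuples = list(df.itertuples(index=False, name=None))
```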


rom1504 commented Aug 24, 2021

Need to close the tarfile.
May be related to #34
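
A minimal sketch of the fix (the shard and file names are hypothetical): using the tarfile as a context manager guarantees it is closed, so its buffers are flushed and released even if writing a sample raises.

```python
import tarfile

# Equivalent to calling tar.close() in a finally block.
with tarfile.open("shard_00000.tar", "w") as tar:
    tar.add("image.jpg", arcname="000000000.jpg")
```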

rom1504 added a commit that referenced this issue Aug 25, 2021

rom1504 commented Aug 25, 2021

Pandas tuples helped; closing the tarfile not so much, but it's good practice.

Will need more analysis to decrease it further.


rom1504 commented Sep 19, 2021

https://stackoverflow.com/a/47058399 using a semaphore seems like a good idea to apply backpressure on the queue: allowing only up to N images to wait (probably N = number of threads × 1.5) would cap the memory usage per process at average original image size × N (that would be 192MB per process for 1MB original images, which is much better than the current uncapped usage).

Identifying and fixing the memory leak that grows until the process is killed wouldn't hurt either (maybe it has the same cause: items previously in the queue not getting cleared from memory?).

Worth investigating.

This memory problem is particularly visible when the original images are large.
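
A rough sketch of the semaphore idea under assumed numbers (128 download threads, ~1MB originals; the producer/consumer split and all names are illustrative, not the repo's actual pipeline): the producer acquires a slot before putting an image on the queue, and the consumer releases it only after the image has been processed, so at most N originals ever sit in memory.

```python
import multiprocessing

nb_threads = 128                       # assumed thread count, for sizing only
max_in_flight = int(nb_threads * 1.5)  # N = 192 waiting images at most
# with ~1MB average originals this caps the queue at roughly 192MB per process

work_queue = multiprocessing.Queue()
slots = multiprocessing.BoundedSemaphore(max_in_flight)


def producer(downloaded_images):
    """Runs on the download side: blocks once N images are already waiting."""
    for img_bytes in downloaded_images:
        slots.acquire()
        work_queue.put(img_bytes)
    work_queue.put(None)               # sentinel: no more work


def consumer():
    """Runs on the resize side: frees a slot only once an item left the queue."""
    while True:
        img_bytes = work_queue.get()
        if img_bytes is None:
            break
        try:
            pass                       # resize / write the image here
        finally:
            slots.release()
```

A `multiprocessing.Queue(maxsize=max_in_flight)` gives similar backpressure on its own; the semaphore version also works when submitting to a ProcessPoolExecutor, where there is no explicit queue to bound.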


rom1504 commented Sep 19, 2021

Things to try (to fix the memory leak in the process pool):

  • close the BytesIO (img_stream)
  • del the img
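
A hedged sketch of both points together in a resize worker (assuming Pillow; the function and variable names are illustrative, not the repo's actual code): decode while the stream is open, close the BytesIO as soon as the pixels are out, and drop the reference to the decoded image before returning.

```python
import io

from PIL import Image


def resize_image(raw_bytes, size=(256, 256)):
    img_stream = io.BytesIO(raw_bytes)
    try:
        img = Image.open(img_stream)
        img.load()                 # force the decode while the stream is still open
        resized = img.convert("RGB").resize(size)
        out = io.BytesIO()
        resized.save(out, format="JPEG")
        del img, resized           # drop references so the pixel buffers can be freed
        return out.getvalue()
    finally:
        img_stream.close()         # release the original bytes promptly
```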


rom1504 commented Sep 20, 2021

Closing the streams didn't help much.
Semaphores did (a 1.5x memory reduction).


rom1504 commented Sep 20, 2021

This is mostly solved.

rom1504 closed this as completed Sep 20, 2021