Reduce memory consumption when digesting a huge file. #104

Closed

@keedi
Contributor
keedi commented Jun 30, 2014

I usually use the digest() method to check my downloaded files, such as huge zip archives and ISO images, and I realized too late that it consumes memory in proportion to the file size. So out-of-memory errors occurred quite often when I tried to digest a big file on a low-end machine like a virtual machine.

The current version of the digest() method uses slurp_raw() to add the content to the Digest object (sorry, that was my fault). If you try to digest a huge file, memory usage grows with the file size. This patch solves the problem by reading the file in 4KB chunks.

I've used a constant of 4096, so if that isn't an appropriate value, please adjust it. :)
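
A minimal sketch of the chunked-digest approach described above, assuming Digest::SHA and a hypothetical `digest_file_chunked` helper (not the actual Path::Tiny patch):

```perl
# Digest a file in fixed-size chunks instead of slurping it whole.
# The 4096-byte default mirrors the chunk size used in the patch.
use strict;
use warnings;
use Digest::SHA;

sub digest_file_chunked {
    my ( $file, $chunk_size ) = @_;
    $chunk_size ||= 4096;

    my $digest = Digest::SHA->new(256);
    open my $fh, '<:raw', $file or die "Can't open $file: $!";
    while ( read( $fh, my $buffer, $chunk_size ) ) {
        $digest->add($buffer);    # memory use stays bounded by $chunk_size
    }
    close $fh;
    return $digest->hexdigest;
}

print digest_file_chunked('huge.iso'), "\n";
```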

@keedi keedi changed the title from Reduce memory consumption when diesting a huge file. to Reduce memory consumption when digesting a huge file. Jun 30, 2014
@keedi keedi Reduce memory consumption when digesting a huge file.
The current version of the digest() method uses `slurp_raw()` to add
the content to the `Digest` object (sorry, that was my fault). If you
try to digest a huge file, memory usage grows with the file size.
This patch solves the problem by reading the file in 4KB chunks.
b7fc561
@dagolden

I'm inclined to either put in a size check or read in much
larger chunks, like a megabyte.

@karenetheridge
Contributor

I like the idea of a slurp_chunked interface.
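
For illustration only, a slurp_chunked-style helper could hand each chunk to a callback so callers can stream data without slurping; this is a hypothetical sketch of the proposed interface, not an API that Path::Tiny provides:

```perl
# Hypothetical slurp_chunked-style helper (not part of Path::Tiny):
# read a file in chunks and pass each chunk to a callback.
use strict;
use warnings;

sub slurp_chunked {
    my ( $file, $callback, $chunk_size ) = @_;
    $chunk_size ||= 4096;

    open my $fh, '<:raw', $file or die "Can't open $file: $!";
    while ( read( $fh, my $buffer, $chunk_size ) ) {
        $callback->($buffer);
    }
    close $fh;
    return;
}

# Example: feed chunks into a Digest object.
use Digest::SHA;
my $sha = Digest::SHA->new(256);
slurp_chunked( 'huge.iso', sub { $sha->add( $_[0] ) } );
print $sha->hexdigest, "\n";
```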

@dagolden
dagolden commented Aug 7, 2014

I've put in a chunk_size option for digest, which lets the end-user change the chunking to suit the amount of memory they wish to devote to it.

It's all been cherry-picked to master, so I'm closing this PR.
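
Usage of the chunk_size option looks roughly like the sketch below; the option-hash placement follows the Path::Tiny documentation as I understand it, so check your installed version:

```perl
use Path::Tiny;

# Default: digest the whole file (SHA-256).
my $digest = path('huge.iso')->digest;

# Digest in 1 MB chunks to bound memory use on large files.
my $chunked = path('huge.iso')->digest( { chunk_size => 1024 * 1024 }, 'SHA-256' );
```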

@dagolden dagolden closed this Aug 7, 2014
@keedi keedi deleted the keedi:feature/digest-reduce-memory branch Mar 5, 2015