Adjust CRC32 buffer size to 64 KB and use a buffering wrapper for IO #28
This is great work. I was thinking of doing a proper benchmark myself, but haven't gotten around to it yet. The sweet spot for CPU-intensive operations like this is often related to the L1 cache size, which is 32 KB for data on the CPU I was testing on (Intel Core i7-5557U) and on most modern desktop and server CPUs.
Btw, there's a typo, "Single-bute" vs "Single-byte", in the pasted benchmark results of
There is. There is also a thing where we don't need to pre-generate this blob :-P That will be rectified.
Also: numbers get suspicious again
@io = io
@uncompressed_size = 0
@compressed_size = 0
@io = ZipTricks::WriteAndTell.new(io)
@started_at = @io.tell
You have removed @started_at from #finish, so it can be removed here as well, since it isn't referenced anywhere else in the class.
👍
This value is now maintained by WriteAndTell
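To illustrate why a separate @started_at is no longer needed, here is a minimal sketch of an offset-tracking write wrapper. It is not the ZipTricks::WriteAndTell source; the class name OffsetTrackingWriter and its methods are hypothetical, and it assumes the destination only needs to respond to `<<`.

```ruby
# Hypothetical illustration only, not the ZipTricks::WriteAndTell implementation.
# Wraps any destination that responds to << and counts the bytes written,
# so callers can ask for the current offset without the destination supporting #tell.
class OffsetTrackingWriter
  def initialize(io)
    @io = io
    @pos = 0
  end

  # Forward the bytes to the destination and advance the offset.
  def <<(bytes)
    @io << bytes
    @pos += bytes.bytesize
    self
  end

  # Number of bytes written through this wrapper so far.
  def tell
    @pos
  end
end

sink = String.new
writer = OffsetTrackingWriter.new(sink)
writer << "abc" << "de"
puts writer.tell
```

Because the wrapper owns the offset, the code using it can compute sizes from tell alone instead of remembering a starting position.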
Even though the size of the buffer is specific to the CRC32 implementation in zlib, the pattern of buffering writes is actually pretty common - and in the StreamCRC32 objects it is not very declarative. We reimplement it as a write proxy instead, which decouples the buffering from the checksumming and makes it possible to use it in other scenarios as well.
This also adds a benchmark that proves, as correctly stated by @felixbuenemann, that the 64 KB buffer size is indeed the sweet spot as far as CRC32 is concerned (we intentionally perform 1-byte writes to get the slowest possible throughput and the smallest chunks). This will be beneficial to libraries like XLSX writers, which are likely to be writing lots of small chunks in succession (as opposed to archive-from-file situations where IO.copy_stream can choose the right chunk size for us).
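The buffering idea described above can be sketched roughly as follows. This is an assumption-laden illustration, not the code in this PR: the class name BufferedCRC32 is hypothetical, and the only library call relied on is Zlib.crc32(string, crc) from Ruby's standard zlib binding. Small writes are collected in a binary buffer and handed to zlib in chunks of about 64 KB.

```ruby
require "zlib"

# Hypothetical sketch of buffering small writes before checksumming them.
# Not the implementation from this PR.
class BufferedCRC32
  BUFFER_SIZE = 64 * 1024 # flush threshold in bytes

  def initialize
    @crc = Zlib.crc32 # initial CRC value
    @buf = String.new # empty binary (ASCII-8BIT) buffer
  end

  # Append bytes to the buffer; update the CRC only once enough has accumulated.
  def <<(bytes)
    @buf << bytes.b
    flush if @buf.bytesize >= BUFFER_SIZE
    self
  end

  # Flush whatever is still buffered and return the checksum as an Integer.
  def to_i
    flush
    @crc
  end

  private

  # Fold the buffered bytes into the running CRC and empty the buffer.
  def flush
    return if @buf.empty?
    @crc = Zlib.crc32(@buf, @crc)
    @buf.clear
  end
end

# Worst case from the description: many 1-byte writes.
data = "zip tricks " * 10_000
crc = BufferedCRC32.new
data.each_char { |c| crc << c }
puts crc.to_i == Zlib.crc32(data)
```

The checksum matches a single Zlib.crc32 call over the whole input, while zlib is invoked only a couple of times instead of once per byte.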