lzbuffer: add memlimit option #50

Merged: 2 commits into gendx:master, Jul 14, 2020

Conversation

cccs-sadugas (Contributor)

Pull Request Overview

Adds a memlimit configuration option for decompression. If the dict buffer's memory limit is exceeded, decompression fails with an LZMAError. Additional functions were added to minimize breaking changes to the library.

Implements #49.

Testing Strategy

This pull request was tested by:

  • Adding relevant unit tests.
  • Adding relevant end-to-end tests (covering .lzma, .lzma2, and .xz files).

Supporting Documentation and References

  • This idea comes from a port of the LZMA SDK integrated in libhtp and used by Suricata.
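
A rough usage sketch of the new option (hypothetical, not code from this PR; it assumes the limit ends up exposed as a memlimit field on lzma_rs::decompress::Options, whereas this PR itself touches the internal lzbuffer module):

// Hypothetical usage sketch: cap the dictionary buffer at 16 MiB and
// surface an exceeded limit as a decompression error.
fn decompress_capped(compressed: &[u8]) -> Result<Vec<u8>, lzma_rs::error::Error> {
    let mut input = std::io::Cursor::new(compressed);
    let mut output = Vec::new();
    let options = lzma_rs::decompress::Options {
        memlimit: Some(16 * 1024 * 1024), // assumed field name from this work
        ..Default::default()
    };
    lzma_rs::lzma_decompress_with_options(&mut input, &mut output, &options)?;
    Ok(output)
}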

@gendx (Owner) left a comment

This idea sounds good. My main concern is how it affects performance, especially when the limit is never reached (or never set). Could you provide some numbers from running cargo bench?

If performance is affected too much, one possibility would be to make this opt-in via a Cargo feature, so that users that don't set a limit are not affected at all in terms of performance. A similar pattern could be used to address #27 by the way.
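
For illustration, a minimal sketch of the feature-gated variant described above (the memlimit Cargo feature and the helper are hypothetical, not something this PR adds); with the feature disabled, the check compiles away entirely, so callers who never set a limit pay no runtime cost:

// Hypothetical sketch of the opt-in alternative; assumes a `memlimit = []`
// entry under [features] in Cargo.toml.
#[inline]
fn check_memlimit(_requested: usize, _memlimit: usize) -> Result<(), String> {
    #[cfg(feature = "memlimit")]
    {
        if _memlimit != 0 && _requested > _memlimit {
            return Err(format!(
                "exceeded memory limit of {} bytes ({} requested)",
                _memlimit, _requested
            ));
        }
    }
    // Without the feature, the branch above is compiled out and this is a no-op.
    Ok(())
}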

Review thread on src/decode/lzbuffer.rs (outdated, resolved)
Commit: Move the memlimit check so that it only occurs when we are resizing the buffer.
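
The idea behind this commit, as a sketch (type and method names are illustrative and do not mirror src/decode/lzbuffer.rs): the limit is only consulted on the cold path where the dictionary buffer actually has to grow, leaving the per-byte hot path untouched.

// Illustrative resize-time memory-limit check (hypothetical names).
struct DictBuffer {
    buf: Vec<u8>,
    memlimit: usize, // 0 means "no limit"
}

impl DictBuffer {
    fn push_byte(&mut self, byte: u8) -> Result<(), String> {
        if self.buf.len() == self.buf.capacity() {
            // Cold path: the buffer is about to grow, so this is the only
            // place where the limit needs to be checked.
            let new_capacity = (self.buf.capacity() * 2).max(4096);
            if self.memlimit != 0 && new_capacity > self.memlimit {
                return Err(format!("exceeded memory limit of {} bytes", self.memlimit));
            }
            self.buf.reserve_exact(new_capacity - self.buf.len());
        }
        self.buf.push(byte);
        Ok(())
    }
}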
@cccs-sadugas (Contributor, Author)

> This idea sounds good. My main concern is how it affects performance, especially when the limit is never reached (or never set). Could you provide some numbers from running cargo bench?
>
> If performance is affected too much, one possibility would be to make this opt-in via a Cargo feature, so that users that don't set a limit are not affected at all in terms of performance. A similar pattern could be used to address #27 by the way.

Here are some benches after moving the memlimit check to the inner block. I'm not sure how accurate they are.

CPU: Intel(R) Core(TM) i7-1065G7 @ 1.30 GHz
RAM: 16 GB @ 4267 MHz
OS: Ubuntu 20.04 LTS (Docker on Windows)

master

test compress_65536                  ... bench:   2,689,440 ns/iter (+/- 667,102)
test compress_empty                  ... bench:       1,629 ns/iter (+/- 346)
test compress_hello                  ... bench:       2,278 ns/iter (+/- 430)
test decompress_after_compress_65536 ... bench:   3,655,315 ns/iter (+/- 6,067,710)
test decompress_after_compress_empty ... bench:       3,977 ns/iter (+/- 1,797)
test decompress_after_compress_hello ... bench:       4,633 ns/iter (+/- 1,811)
test decompress_big_file             ... bench:   7,079,015 ns/iter (+/- 2,228,350)
test decompress_huge_dict            ... bench:       5,359 ns/iter (+/- 4,065)

memlimit

test compress_65536                  ... bench:   2,665,187 ns/iter (+/- 226,483)
test compress_empty                  ... bench:       1,585 ns/iter (+/- 479)
test compress_hello                  ... bench:       2,206 ns/iter (+/- 689)
test decompress_after_compress_65536 ... bench:   2,845,747 ns/iter (+/- 2,467,223)
test decompress_after_compress_empty ... bench:       3,406 ns/iter (+/- 548)
test decompress_after_compress_hello ... bench:       4,174 ns/iter (+/- 472)
test decompress_big_file             ... bench:   6,802,520 ns/iter (+/- 1,939,560)
test decompress_huge_dict            ... bench:       4,799 ns/iter (+/- 1,594)

cccs-sadugas mentioned this pull request on Jul 13, 2020.
@gendx (Owner) left a comment

Looks good! The errors could be improved when #6 is implemented.

@gendx (Owner) commented on Jul 14, 2020

My own benchmarks don't show any statistically significant difference either.

  • master
running 8 tests
test compress_65536                  ... bench:   2,617,572 ns/iter (+/- 212,392)
test compress_empty                  ... bench:       1,944 ns/iter (+/- 467)
test compress_hello                  ... bench:       2,538 ns/iter (+/- 76)
test decompress_after_compress_65536 ... bench:   3,737,242 ns/iter (+/- 144,283)
test decompress_after_compress_empty ... bench:       4,982 ns/iter (+/- 130)
test decompress_after_compress_hello ... bench:       5,862 ns/iter (+/- 134)
test decompress_big_file             ... bench:   6,648,801 ns/iter (+/- 265,644)
test decompress_huge_dict            ... bench:       5,938 ns/iter (+/- 341)
  • memlimit
running 8 tests
test compress_65536                  ... bench:   2,594,645 ns/iter (+/- 74,316)
test compress_empty                  ... bench:       1,913 ns/iter (+/- 41)
test compress_hello                  ... bench:       2,599 ns/iter (+/- 78)
test decompress_after_compress_65536 ... bench:   3,716,955 ns/iter (+/- 78,016)
test decompress_after_compress_empty ... bench:       5,120 ns/iter (+/- 194)
test decompress_after_compress_hello ... bench:       6,058 ns/iter (+/- 390)
test decompress_big_file             ... bench:   6,574,553 ns/iter (+/- 163,081)
test decompress_huge_dict            ... bench:       6,261 ns/iter (+/- 298)

gendx merged commit 795bc5b into gendx:master on Jul 14, 2020.
gendx mentioned this pull request on May 30, 2022.