lzbuffer: add memlimit option #50

Merged: 2 commits into gendx:master, Jul 14, 2020

Conversation

cccs-sadugas (Contributor)

Pull Request Overview

Adds a memlimit configuration option for decompression. If the dict buffer's memory limit is exceeded, decompression fails with an LZMAError. Additional functions were added to minimize breaking changes to the library.

Implements #49.

Testing Strategy

This pull request was tested by:

  • Adding relevant unit tests.
  • Adding relevant end-to-end tests (covering .lzma, .lzma2, and .xz files).

Supporting Documentation and References

  • This idea comes from a port of the LZMA SDK integrated in libhtp and used by Suricata.
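
A rough usage sketch of the new option (hypothetical, not code from this PR; it assumes the limit ends up exposed as a memlimit field on lzma_rs::decompress::Options, whereas this PR itself touches the internal lzbuffer module):

// Hypothetical usage sketch: cap the dictionary buffer at 16 MiB and
// surface an exceeded limit as a decompression error.
fn decompress_capped(compressed: &[u8]) -> Result<Vec<u8>, lzma_rs::error::Error> {
    let mut input = std::io::Cursor::new(compressed);
    let mut output = Vec::new();
    let options = lzma_rs::decompress::Options {
        memlimit: Some(16 * 1024 * 1024), // assumed field name from this work
        ..Default::default()
    };
    lzma_rs::lzma_decompress_with_options(&mut input, &mut output, &options)?;
    Ok(output)
}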

@gendx (Owner) left a comment

This idea sounds good. My main concern is how it affects performance, especially when the limit is never reached (or never set). Could you provide some numbers from running cargo bench?

If performance is affected too much, one possibility would be to make this opt-in via a Cargo feature, so that users that don't set a limit are not affected at all in terms of performance. A similar pattern could be used to address #27 by the way.
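
For illustration, a minimal sketch of the feature-gated variant described above (the memlimit Cargo feature and the helper are hypothetical, not something this PR adds); with the feature disabled, the check compiles away entirely, so callers who never set a limit pay no runtime cost:

// Hypothetical sketch of the opt-in alternative; assumes a `memlimit = []`
// entry under [features] in Cargo.toml.
#[inline]
fn check_memlimit(_requested: usize, _memlimit: usize) -> Result<(), String> {
    #[cfg(feature = "memlimit")]
    {
        if _memlimit != 0 && _requested > _memlimit {
            return Err(format!(
                "exceeded memory limit of {} bytes ({} requested)",
                _memlimit, _requested
            ));
        }
    }
    // Without the feature, the branch above is compiled out and this is a no-op.
    Ok(())
}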

Review thread on src/decode/lzbuffer.rs (outdated, resolved)
Commit: Move the memlimit check so that it only occurs when we are resizing the buffer.
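
The idea behind this commit, as a sketch (type and method names are illustrative and do not mirror src/decode/lzbuffer.rs): the limit is only consulted on the cold path where the dictionary buffer actually has to grow, leaving the per-byte hot path untouched.

// Illustrative resize-time memory-limit check (hypothetical names).
struct DictBuffer {
    buf: Vec<u8>,
    memlimit: usize, // 0 means "no limit"
}

impl DictBuffer {
    fn push_byte(&mut self, byte: u8) -> Result<(), String> {
        if self.buf.len() == self.buf.capacity() {
            // Cold path: the buffer is about to grow, so this is the only
            // place where the limit needs to be checked.
            let new_capacity = (self.buf.capacity() * 2).max(4096);
            if self.memlimit != 0 && new_capacity > self.memlimit {
                return Err(format!("exceeded memory limit of {} bytes", self.memlimit));
            }
            self.buf.reserve_exact(new_capacity - self.buf.len());
        }
        self.buf.push(byte);
        Ok(())
    }
}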
@cccs-sadugas (Contributor, Author)

> This idea sounds good. My main concern is how it affects performance, especially when the limit is never reached (or never set). Could you provide some numbers from running cargo bench?
>
> If performance is affected too much, one possibility would be to make this opt-in via a Cargo feature, so that users that don't set a limit are not affected at all in terms of performance. A similar pattern could be used to address #27 by the way.

Here are some benches after moving the memlimit check to the inner block. I'm not sure how accurate they are.

CPU: Intel(R) Core(TM) i7-1065G7 @ 1.30 GHz
RAM: 16 GB @ 4267 MHz
OS: Ubuntu 20.04 LTS (Docker on Windows)

master

test compress_65536                  ... bench:   2,689,440 ns/iter (+/- 667,102)
test compress_empty                  ... bench:       1,629 ns/iter (+/- 346)
test compress_hello                  ... bench:       2,278 ns/iter (+/- 430)
test decompress_after_compress_65536 ... bench:   3,655,315 ns/iter (+/- 6,067,710)
test decompress_after_compress_empty ... bench:       3,977 ns/iter (+/- 1,797)
test decompress_after_compress_hello ... bench:       4,633 ns/iter (+/- 1,811)
test decompress_big_file             ... bench:   7,079,015 ns/iter (+/- 2,228,350)
test decompress_huge_dict            ... bench:       5,359 ns/iter (+/- 4,065)

memlimit

test compress_65536                  ... bench:   2,665,187 ns/iter (+/- 226,483)
test compress_empty                  ... bench:       1,585 ns/iter (+/- 479)
test compress_hello                  ... bench:       2,206 ns/iter (+/- 689)
test decompress_after_compress_65536 ... bench:   2,845,747 ns/iter (+/- 2,467,223)
test decompress_after_compress_empty ... bench:       3,406 ns/iter (+/- 548)
test decompress_after_compress_hello ... bench:       4,174 ns/iter (+/- 472)
test decompress_big_file             ... bench:   6,802,520 ns/iter (+/- 1,939,560)
test decompress_huge_dict            ... bench:       4,799 ns/iter (+/- 1,594)

cccs-sadugas mentioned this pull request on Jul 13, 2020.
@gendx (Owner) left a comment

Looks good! The errors could be improved when #6 is implemented.

@gendx (Owner) commented on Jul 14, 2020

My own benchmarks don't show any statistically significant difference either.

  • master
running 8 tests
test compress_65536                  ... bench:   2,617,572 ns/iter (+/- 212,392)
test compress_empty                  ... bench:       1,944 ns/iter (+/- 467)
test compress_hello                  ... bench:       2,538 ns/iter (+/- 76)
test decompress_after_compress_65536 ... bench:   3,737,242 ns/iter (+/- 144,283)
test decompress_after_compress_empty ... bench:       4,982 ns/iter (+/- 130)
test decompress_after_compress_hello ... bench:       5,862 ns/iter (+/- 134)
test decompress_big_file             ... bench:   6,648,801 ns/iter (+/- 265,644)
test decompress_huge_dict            ... bench:       5,938 ns/iter (+/- 341)
  • memlimit
running 8 tests
test compress_65536                  ... bench:   2,594,645 ns/iter (+/- 74,316)
test compress_empty                  ... bench:       1,913 ns/iter (+/- 41)
test compress_hello                  ... bench:       2,599 ns/iter (+/- 78)
test decompress_after_compress_65536 ... bench:   3,716,955 ns/iter (+/- 78,016)
test decompress_after_compress_empty ... bench:       5,120 ns/iter (+/- 194)
test decompress_after_compress_hello ... bench:       6,058 ns/iter (+/- 390)
test decompress_big_file             ... bench:   6,574,553 ns/iter (+/- 163,081)
test decompress_huge_dict            ... bench:       6,261 ns/iter (+/- 298)

gendx merged commit 795bc5b into gendx:master on Jul 14, 2020.
gendx mentioned this pull request on May 30, 2022.