Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix potential context leak, add exceptions, add lz4-style program for testing #6

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

nh2
Copy link

@nh2 nh2 commented Feb 14, 2020

See #5 and the commit messages.

The lz4 docs do not specify whether freeing multiple times
is a no-op or legal, so we better check the safer invariant.
@codecov-io
Copy link

codecov-io commented Feb 14, 2020

Codecov Report

Merging #6 into master will decrease coverage by 0.11%.
The diff coverage is 81.48%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master       #6      +/-   ##
==========================================
- Coverage   90.62%   90.51%   -0.12%     
==========================================
  Files           2        2              
  Lines         128      137       +9     
==========================================
+ Hits          116      124       +8     
- Misses         12       13       +1
Impacted Files Coverage Δ
include/lz4_stream.h 87.25% <81.48%> (+0.15%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6b015cb...adf65aa. Read the comment docs.

/**
* @brief RAII version of `LZ4F_compressionContext_t`.
*/
struct CompressionContext
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor thing. The other classes use snake_case naming. I don't have any strong feelings one way or the other, but I really like to be consistent.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

size_t ret = LZ4F_createCompressionContext(&ctx, LZ4F_VERSION);
if (LZ4F_isError(ret) != 0) {
throw std::runtime_error(std::string("Failed to create LZ4 compression context: ")
+ LZ4F_getErrorName(ret));
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to this change, but reminds me that I have thought or creating a lz4_category that inherits from std::error_category and the throw std::system_error instead.

That way a more specific exception could be thrown as well as not having to add the LZ4 error string every time an exception object is constructed. Makes sense?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think specific exceptions are good so that they can be caught explicitly.

I'll leave that to you though :)

Copy link
Owner

@laudrup laudrup left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very, very nice work. Thanks a lot.

It's also nice to see some very useful commit messages. I really appreciate that.

@@ -62,3 +62,6 @@ target_link_libraries(lz4_compress ${LZ4_STREAM_LIBRARY_NAME})

add_executable(lz4_decompress lz4_decompress.cpp)
target_link_libraries(lz4_decompress ${LZ4_STREAM_LIBRARY_NAME})

add_executable(lz4_dropin lz4_dropin.cpp)
target_link_libraries(lz4_dropin ${LZ4_STREAM_LIBRARY_NAME})
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to remove the existing example executables once this compiles.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Example case:
The `output_buffer()` constructor can throw via `write_header()`.
Until now, the malloc'ed context would then not be freed
(because if the constructor throws, no destructor is called).

This commit fixes it by wrapping contexts in
RAII classes that always ensure destruction.
Motivating example:

Until now, the example `lz4_compress` application would silently
ignore errors like `no space left on device` (ENOSP) when compressing
continuing to churn through the input data and writing to the
output file even though every write fails (as evidenced by `strace`)
and the resulting .lz4 file is naturally corrupted.
To reproduce:

    mkdir ramdisk
    sudo mount -t tmpfs -o size=1m tmpfs ramdisk/
    ./lz4_compress /some/big/file ramdisk/out.lz4

Discussion:

C++ iostreams were invented before exceptions which is why they are
disabled by default and have to be opted in for each stream.

If exceptions are not enabled, the badbit should be checked before/after
every stream operation. If that is not done, errors go silently
and the user marvels over odd syscall behaviour and corrupt data.

This library does not check the badbit.

This commit fixes the situation it by enabling exceptions using

    stream.exceptions(std::ostream::failbit);

thus enforcing what the library's current assumption that streams
are not in an invalid state.

Further work:

An alternative approach might be to change the library to get rid
of that assumption and switch to the badbit-check everywhere,
but this would be more difficult. It might present good follow-up work
to this fix.

As a reminder:
An application might want to handle ENOSP by looping until more
space is available on the file system, and re-use the open LZ4
stream as soon as that is the case.

This commit still does not make that use case possible, because
if the user catches the exception and calls the library again,
the `input_buffer`/`output_buffer` would be advanced (e.g. by more
LZ4 operations), so the stream would not be valid.

A fix for that would be to observe any stream errors in the library,
and ensure that when the user calls it again, the existing buffers
are not invalidated.
@nh2 nh2 force-pushed the fix-context-leak-add-exceptions branch from c1210cb to 62a5ec0 Compare February 15, 2020 14:39
@nh2
Copy link
Author

nh2 commented Feb 15, 2020

I've force-pushed to address the feedback.

@nh2 nh2 force-pushed the fix-context-leak-add-exceptions branch from 62a5ec0 to adf65aa Compare February 15, 2020 16:37
@nh2
Copy link
Author

nh2 commented Feb 15, 2020

I've force-pushed a rework of the lz4_dropin command.

It can now accept - for using stdin/stdout, and otherwise named files.

Examples:

lz4_dropin -c myfile myfile.lz4
lz4_dropin -d myfile.lz4 myfile
lz4_dropin -c myfile - > myfile.lz4
echo hi | ./lz4_dropin -c - - | ./lz4_dropin -d - -

I did this because it makes it easier to run the example application through gdb when one does not have to use > and < for passing input files.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants