Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issues streamHC branch using new lz4hc streaming API #30

Closed
bcronje opened this issue Oct 22, 2014 · 8 comments
Closed

Issues streamHC branch using new lz4hc streaming API #30

bcronje opened this issue Oct 22, 2014 · 8 comments

Comments

@bcronje
Copy link

@bcronje bcronje commented Oct 22, 2014

@Cyan4973 First off, great work on this library. We are getting good results using it.

I'm testing the latest HC streaming API on the streamHC branch, but decompression results are invalid. Couple of questions:

  1. I understand the streamHC branch is still under development, that said is the new HC streaming API as it is implemented with latest source supposed to provide correct compression/decompression results at this stage? I'm not getting any exceptions or anything, the decompressed data just does not match the original data.
  2. There is no LZ4_decompressHC_safe_continue or any other HC decompress API functions. I'm assumption that we should use the standard LZ4_decompress_safe_continue API for decompression?
  3. My standard lz4 streaming code is working fine. Is my understanding correct that from the compression code, the only difference between the standard lz4 and lz4hc streaming API usage is changing to LZ4_streamHC_t and LZ4_compressHC_continue ? Or is there something else I need to implement to use lz4hc ?
@Cyan4973
Copy link
Member

@Cyan4973 Cyan4973 commented Oct 22, 2014

Hi Beyers

  1. is the new HC streaming API as it is implemented with latest source supposed to provide correct compression/decompression results at this stage?

It's still early days for this branch, so I can't promise good results yet. The API passes a lot of simple tests, but that does not mean it cover all situations. For example, I'm struggling currently with a round buffer scenario (tests & debugging ongoing).

So if you find a use case generating bad results, please don't hesitate to describe it, or even post a sample code. This is the right time to integrate those use cases within the automated test tool.

  1. There is no LZ4_decompressHC_safe_continue or any other HC decompress API functions. I'm assumption that we should use the standard LZ4_decompress_safe_continue API for decompression?

Yes

  1. Is my understanding correct that from the compression code, the only difference between the standard lz4 and lz4hc streaming API usage is changing to LZ4_streamHC_t and LZ4_compressHC_continue ?

Yes, it's the final objective. The streamHC branch will be merged into dev when it becomes fully compliant with this scenario.

Regards

@bcronje
Copy link
Author

@bcronje bcronje commented Oct 22, 2014

Hi Yann,

My current use case is for network packet compression. For testing purposes I read packets from a PCAP dump, compress the data from the transport layer and up of each packet, decompress the compressed data and create a new pcap dump file. To confirm everything is working as it should I diff the original pcap file with the new pcap file. I'm using a ring buffer similar to your example blockStreaming_ringBuffer.c implementation. Testing with standard lz4, this test works 100%. Changing to lz4hc streaming the new pcap file is not the same size as the original pcap file.

Here is a portion of the lz4hc streaming code I'm using, maybe you can spot something obvious I'm doing wrong:

// Following called once during initialization
_lz4hc_stream = new LZ4_streamHC_t();
LZ4_resetStreamHC(_lz4hc_stream, 0);
_lz4_buffer = new char[RING_BUFFER_BYTES];
_lz4_buffer_offset = 0;

// Following called per packet
lz4hc_compress(const char* data, int data_len)
{
    char compress_buffer[LZ4_COMPRESSBOUND(MESSAGE_MAX_BYTES)];
    char *stream_buffer = &_lz4_buffer[_lz4_buffer_offset];
    memcpy(stream_buffer, data, data_len);

    int compressed_len = LZ4_compressHC_continue(_lz4hc_stream, stream_buffer, compress_buffer, data_len);

    _lz4_buffer_offset += data_len;

    // Wraparound the ringbuffer offset
    if (_lz4_buffer_offset >= RING_BUFFER_BYTES - MESSAGE_MAX_BYTES)
        _lz4_buffer_offset = 0;

    // .... Omitted non-compression related stuff
}

The decompression code I'm assuming is implemented correctly seeing that standard lz4 works perfectly.

The HC compression code is also pretty much the same as the lz4 compression code apart from the hc API change.

Any idea?

@Cyan4973
Copy link
Member

@Cyan4973 Cyan4973 commented Oct 22, 2014

Yes,
ring buffer is currently a known weakness of lz4hc streaming. It won't work yet.

I've a first fix ongoing, but it won't solve the entire situation.
As a strange consequence, lz4hc is now simply "too much advanced" compared to the decompressor code for ring buffer scenario. So I'll have to upgrade the decompression code to get the final fix.
It happens the "fast lz4" and "lz4 decompressor" share the same weakness regarding ring_buffers, hence the reason why they are currently compatible. I may update the "fast lz4" later, although it will require some significant refactoring, and it's not really a "problem" : it just misses some opportunity to get an even higher compression ratio.

If you need a quick solution, please consider a "double buffer" methodology, which should work correctly today.

@bcronje
Copy link
Author

@bcronje bcronje commented Oct 22, 2014

Awesome, I can certainly refactor my code to use double buffer instead. Will report back if that workaround fixes things for me.

@bcronje
Copy link
Author

@bcronje bcronje commented Oct 24, 2014

@Cyan4973 I can confirm that the double buffer does indeed work with lz4hc. The compression ratio dropped significantly though, probably due to our small data packet size. I ended up with a slightly modified double buffer strategy, I process 256 packets per buffer, then copy the dictionary to the other buffer and vice versa. This seems to yield similar compression ratios than the circular buffer strategy and works with lz4hc.

Thank you for the assistance!

@Cyan4973
Copy link
Member

@Cyan4973 Cyan4973 commented Oct 24, 2014

You're welcomed.
This is the best work around for the time being

@bcronje bcronje closed this Oct 24, 2014
@Cyan4973
Copy link
Member

@Cyan4973 Cyan4973 commented Nov 4, 2014

OK, with latest round of tests, I believe the new streaming implementation available within LZ4 "Dev" branch is now fully compatible with ring buffers scenario, including ring buffers of different sizes on the compression and decompression side.
(provided decompression ring buffer is at least 64 KB or equal or larger than compression buffer)

@bcronje
Copy link
Author

@bcronje bcronje commented Nov 4, 2014

Awesome, will test on my side as well! Thank you @Cyan4973

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants