Added ZLib decompression support to DeflateStream #2906

Merged
merged 1 commit into dotnet:master from ianhays:compression on Aug 24, 2015

Conversation

@ianhays
Member

ianhays commented Aug 20, 2015

The DeflateStream class currently uses ZLib for compression (if available) but doesn't use ZLib for decompression. This commit adds support for using either ZLib or the Managed implementation for decompression (inflation).

  • Resolves #2024; added ZLib support in InflaterZLib.cs
  • Inflater.cs (the old managed implementation) is renamed to InflaterManaged.cs
  • Extracted the Inflater methods that DeflateStream needs into an IInflater interface, which both InflaterManaged and InflaterZLib implement (a rough sketch of this interface follows the list).
  • Roughly a 3X speedup for decompression when using the ZLib library instead of the Managed implementation!
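
A minimal sketch of what that extracted interface could look like is below. Only NeedsInput() (in the review excerpt that follows) and SetInput (in the commit message) are visible on this PR page, so the remaining members and the IDisposable base are illustrative assumptions:

using System;

// Sketch of the extracted interface; member names beyond NeedsInput()
// and SetInput are assumptions chosen to show the shape of the split.
internal interface IInflater : IDisposable
{
    bool NeedsInput();                                    // can more compressed input be accepted?
    void SetInput(byte[] inputBuffer, int startIndex, int count);
    int Inflate(byte[] bytes, int offset, int length);    // returns the number of bytes decompressed
    bool Finished();                                      // true once the end of the deflate stream is reached
}

// Both implementations then plug into DeflateStream interchangeably:
//   internal sealed class InflaterManaged : IInflater { ... }
//   internal sealed class InflaterZLib   : IInflater { ... }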
@@ -256,8 +279,6 @@ public override int Read(byte[] array, int offset, int count)
break;
}
Debug.Assert(_inflater.NeedsInput(), "We can only run into this case if we are short of input");

@ianhays

ianhays Aug 20, 2015

Member

This Debug.Assert was causing exceptions because NeedsInput() was calling into the handle, which was still uninitialized at that point. I considered adding a check around the handle, but decided the slowdown wasn't worth it just for the sake of this assert.

/// The stream will keep attributes that may have been set by inflateInit2.
/// </summary>
[SecurityCritical]
public ErrorCode InflateReset(int windowBits)

@ianhays

ianhays Aug 21, 2015

Member

I'm not 100% sure about the try/finally usage within this method - it could definitely use another look.

@stephentoub

stephentoub Aug 21, 2015

Member

This pattern of putting code in a finally block is done to deal with thread aborts, in particular because thread aborts are delayed across finally blocks, so you don't have to worry about an abort interrupting the block. However, thread aborts aren't a concern for .NET Core, and even if this library ends up being used on the desktop, it's very unlikely the library is entirely hardened against them anyway. I wouldn't worry about using try/finally in this way.
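
For illustration only, the pattern being discussed looks roughly like the following; the class, method, and names here are made up, not code from this PR:

using System;

internal static class FinallySketch
{
    // The work goes in the finally block because a ThreadAbortException is
    // delayed until the finally block finishes, so the handle swap below can
    // never be observed half-done. As noted above, thread aborts don't exist
    // on .NET Core, so the pattern buys nothing there.
    public static IntPtr ReplaceHandle(ref IntPtr handle, IntPtr newHandle)
    {
        IntPtr oldHandle;
        try { }
        finally
        {
            oldHandle = handle;
            handle = newHandle;
        }
        return oldHandle;
    }
}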

@ianhays

Member

ianhays commented Aug 21, 2015

This commit is a work in progress on getting ZLib running as the default method of decompression on Windows and Unix systems that support it. The potential benefits are substantial; my initial perf benchmarks that measure the speed of the DeflateStream.ReadAsync (awaited) method when reading the test file "GZTestData/GZTestDocument.txt.gz" give the following measurements:

• DeflateStream.ReadAsync Windows 1000 iterations:
    ○ Windows Zlib: 42 +- 3 ms
    ○ Windows Managed: 128 +- 12 ms
• DeflateStream.ReadAsync Unix 50 iterations:
    ○ Unix Zlib: 132 +- 47 ms
    ○ Unix Managed: 360 +- 52 ms

To clarify: each test run performed the ReadAsync measurement 1000 times (50 on Unix), and I ran that whole test n times; the average runtime over all iterations, taken across those n runs, is what's reported above. I apologize for the convoluted setup, but building my perf tests on top of xunit made things a bit messy :)

Regardless, a 3X DeflateStream.ReadAsync speedup is exciting!
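
The measurement loop described above, reconstructed as a rough standalone sketch. The real tests were built on xunit; the harness below, and the use of GZipStream (which delegates its decompression to DeflateStream) to read the .gz file directly, are assumptions made only to keep the example self-contained:

using System;
using System.Diagnostics;
using System.IO;
using System.IO.Compression;
using System.Threading.Tasks;

internal static class ReadAsyncBenchmarkSketch
{
    public static async Task Main()
    {
        const int iterations = 1000;   // 50 was used on the slower Unix VM
        const string path = "GZTestData/GZTestDocument.txt.gz";
        byte[] buffer = new byte[4096];

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            using (FileStream file = File.OpenRead(path))
            // GZipStream decompresses via an internal DeflateStream,
            // so this exercises the same ReadAsync path measured above.
            using (var gzip = new GZipStream(file, CompressionMode.Decompress))
            {
                while (await gzip.ReadAsync(buffer, 0, buffer.Length) > 0)
                {
                    // drain the stream; only the elapsed time matters here
                }
            }
        }
        sw.Stop();
        Console.WriteLine($"{iterations} iterations took {sw.ElapsedMilliseconds} ms total");
    }
}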

@ianhays

Member

ianhays commented Aug 21, 2015

One more thing: I ran the Unix tests with only 50 iterations (vs. 1000 on Windows) because my Unix VM is significantly slower and I didn't have the patience to run the tests that many times.

cc: @stephentoub

@stephentoub

Member

stephentoub commented Aug 21, 2015

Regardless, a 3X DeflateStream.ReadAsync speedup is exciting!

Excellent!

@stephentoub

Member

stephentoub commented Aug 22, 2015

A few more comments, but otherwise LGTM. Nice job, Ian.

Added ZLib decompression support to DeflateStream
The DeflateStream class currently uses ZLib for compression (if available) but doesn't use ZLib for decompression. This commit adds support for using either ZLib or the Managed implementation for decompression (inflation).

- Roughly a 3X speedup for decompression when using the ZLib library instead of the Managed implementation!
- Resolves #2024; added ZLib support in InflaterZLib.cs
- Inflater.cs (the old managed implementation) is renamed to InflaterManaged.cs
- Extracted the Inflater methods that DeflateStream needs into an IInflater interface, which both InflaterManaged and InflaterZLib implement.
- Removed the SetFileFormatReader functions and changed the code to set the format reader at initialization time instead, avoiding the need to create->destroy->create.
- Changed inputBuffer in SetInput for DeflateZLib and InflaterZLib to use the fixed keyword instead of an extra GCHandle allocation (a rough sketch of the difference follows this list).
- Added a test for reading sequentially from a DeflateStream created with an underlying MemoryStream.
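
For context on the fixed-vs-GCHandle change above, here is a simplified, hypothetical before/after sketch; the method shapes and names are assumptions, not the PR's actual code:

using System;
using System.Runtime.InteropServices;

internal static class PinningSketch
{
    // Before (assumed shape): pin the caller's buffer with an explicit GCHandle,
    // which allocates a handle and needs try/finally bookkeeping to free it.
    public static void SetInputWithGCHandle(byte[] inputBuffer, int startIndex)
    {
        GCHandle handle = GCHandle.Alloc(inputBuffer, GCHandleType.Pinned);
        try
        {
            IntPtr ptr = handle.AddrOfPinnedObject() + startIndex;
            // ... hand ptr to the native zlib stream ...
        }
        finally
        {
            handle.Free();
        }
    }

    // After (assumed shape): the fixed statement pins the buffer only for the
    // duration of the block, with no separate handle allocation.
    public static unsafe void SetInputWithFixed(byte[] inputBuffer, int startIndex)
    {
        fixed (byte* ptr = &inputBuffer[startIndex])
        {
            // ... hand (IntPtr)ptr to the native zlib stream ...
        }
    }
}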

ianhays added a commit that referenced this pull request Aug 24, 2015

Merge pull request #2906 from ianhays/compression
Added ZLib decompression support to DeflateStream

@ianhays ianhays merged commit 6941118 into dotnet:master Aug 24, 2015

1 check passed

default Build finished. No test results found.

@ianhays ianhays deleted the ianhays:compression branch Oct 2, 2015
