New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build clrcompression.dll with optimized Zlib-Intel #5674

Merged
merged 1 commit into from Jan 30, 2016

Conversation

Projects
None yet
7 participants
@bjjones
Contributor

bjjones commented Jan 25, 2016

#3986 Zlib-Intel is an improved version of Zlib that contains optimizations for processors with the SSE4.2 instruction set. In the most common case of CompressionLevel Optimal, we see a +25-30% performance improvement.

Currently, the version of clrcompression.dll built in the repo is not copied into any of the tests, so to test I manually copied the binary into System.IO.Compression.Tests and ran the test manually resulting in no errors.

Please note that due to a gzip incompatibility I have modified the library to no longer use the agressive deflate_quick strategy and instead use the traditional deflate_fast strategy for CompressionLevel Fastest. This causes performance for CompressionLevel Fastest to be similar to Zlib-Adler.

For ARM platforms, clrcompression.dll is still built using Zlib-Adler.

Performance numbers gathered on an Intel Core i5-4670 CPU for the Canterbury Corpus Benchmark follow:

innerIterations filename compressLevel Adler Intel Intel / Adler
25 alice29.txt Optimal 11.12274 7.505396 67.48%
25 asyoulik.txt Optimal 10.22775 6.969144 68.14%
25 cp.html Optimal 1.28132 1.4261 110.30%
25 fields.c Optimal 0.81646 0.8408881 102.99%
25 grammar.lsp Optimal 0.7443761 0.670172 90.03%
25 kennedy.xls Optimal 48.62418 18.24115 37.51%
25 lcet10.txt Optimal 26.82975 17.67865 65.89%
25 plrabn12.txt Optimal 39.65364 25.09914 63.3%
25 ptt5 Optimal 14.31153 9.271148 64.78%
25 sum Optimal 10.1662 3.163316 31.12%
25 xargs.1 Optimal 0.831952 0.7740601 93.04%
innerIterations CompressLevel Average Intel / Adler
1 Optimal 75.69%
10 Optimal 70.62%
25 Optimal 70.51%
50 Optimal 71.75%
1 Fastest 101.53%
10 Fastest 100.38%
25 Fastest 95.73%
50 Fastest 99.54%
1 NoCompression 100.07%
10 NoCompression 100.78%
25 NoCompression 97.40%
50 NoCompression 99.26%

@ianhays @stephentoub @joshfree

@stephentoub

This comment has been minimized.

Show comment
Hide comment
@stephentoub

stephentoub Jan 25, 2016

Member

Thanks, @bjjones. Does this / how does this affect compression levels / output file sizes?

Member

stephentoub commented Jan 25, 2016

Thanks, @bjjones. Does this / how does this affect compression levels / output file sizes?

@bjjones

This comment has been minimized.

Show comment
Hide comment
@bjjones

bjjones Jan 25, 2016

Contributor

For the Canterbury Corpus at CompressionLevel Optimal using deflateStream, I saw:

Library Initial Size Compressed Size Ratio (Compressed/Initial)
Zlib-Adler 2670KB 713KB .26704
Zlib-Intel 2670KB 733KB .27453

Also noted in the Zlib-Intel Whitepaper, there are very small increases in compression ratio: http://www.intel.com/content/dam/www/public/us/en/documents/white-papers/zlib-compression-whitepaper-copy.pdf

Contributor

bjjones commented Jan 25, 2016

For the Canterbury Corpus at CompressionLevel Optimal using deflateStream, I saw:

Library Initial Size Compressed Size Ratio (Compressed/Initial)
Zlib-Adler 2670KB 713KB .26704
Zlib-Intel 2670KB 733KB .27453

Also noted in the Zlib-Intel Whitepaper, there are very small increases in compression ratio: http://www.intel.com/content/dam/www/public/us/en/documents/white-papers/zlib-compression-whitepaper-copy.pdf

@stephentoub

This comment has been minimized.

Show comment
Hide comment
@stephentoub

stephentoub Jan 25, 2016

Member

For the Canterbury Corpus at CompressionLevel Optimal using deflateStream

Thanks for the data. What about at Fastest?

Member

stephentoub commented Jan 25, 2016

For the Canterbury Corpus at CompressionLevel Optimal using deflateStream

Thanks for the data. What about at Fastest?

@bjjones

This comment has been minimized.

Show comment
Hide comment
@bjjones

bjjones Jan 25, 2016

Contributor

In the same scenario, I saw a slightly improved compression ratio using CompressionLevel Fastest.

Library Initial Size Compressed Size Ratio (Compressed/Initial)
Zlib-Adler 2670KB 842KB .31535
Zlib-Intel 2670KB 812KB .30412
Contributor

bjjones commented Jan 25, 2016

In the same scenario, I saw a slightly improved compression ratio using CompressionLevel Fastest.

Library Initial Size Compressed Size Ratio (Compressed/Initial)
Zlib-Adler 2670KB 842KB .31535
Zlib-Intel 2670KB 812KB .30412
@ianhays

This comment has been minimized.

Show comment
Hide comment
@ianhays

ianhays Jan 25, 2016

Member

How many of the zlib-intel files are different than their zlib-adler equivalents? For those that are the same, could we just modify the CMakeLists for zlib-intel to reference the ones in zlib-adler? So we don't have the duplication of identical files and it's also easier to see what zlib-intel changes.

It would mess up the MakeFile build, but we don't use that anyways.

Member

ianhays commented Jan 25, 2016

How many of the zlib-intel files are different than their zlib-adler equivalents? For those that are the same, could we just modify the CMakeLists for zlib-intel to reference the ones in zlib-adler? So we don't have the duplication of identical files and it's also easier to see what zlib-intel changes.

It would mess up the MakeFile build, but we don't use that anyways.

@@ -9,7 +9,7 @@ set __CMakeBinDir=""
set __IntermediatesDir=""
set __BuildArch=AnyCPU
set __VCBuildArch=x86_amd64
set CMAKE_BUILD_TYPE=Debug
set CMAKE_BUILD_TYPE=Release

This comment has been minimized.

@ianhays

ianhays Jan 25, 2016

Member

good call

@ianhays

ianhays Jan 25, 2016

Member

good call

This comment has been minimized.

@stephentoub

stephentoub Jan 29, 2016

Member

Shouldn't this be configured based on the type of build being done?

@stephentoub

stephentoub Jan 29, 2016

Member

Shouldn't this be configured based on the type of build being done?

This comment has been minimized.

@ianhays

ianhays Jan 29, 2016

Member

It is. This is the default setting. It is set on lines 19/20

@ianhays

ianhays Jan 29, 2016

Member

It is. This is the default setting. It is set on lines 19/20

This comment has been minimized.

@stephentoub

stephentoub Jan 29, 2016

Member

I see. Why change the default? Don't we normally default to debug unless release is explicitly chosen?

@stephentoub

stephentoub Jan 29, 2016

Member

I see. Why change the default? Don't we normally default to debug unless release is explicitly chosen?

if($ENV{__BuildArch} STREQUAL x86 OR $ENV{__BuildArch} STREQUAL x64)
set(NATIVECOMPRESSION_SOURCES
clrcompression.def
zlib-intel/adler32.c

This comment has been minimized.

@ianhays

ianhays Jan 25, 2016

Member

e.g. if zlib-intel/adler32.c == zlib/adler32.c then modify these sources to use zlib/adler32.c

@ianhays

ianhays Jan 25, 2016

Member

e.g. if zlib-intel/adler32.c == zlib/adler32.c then modify these sources to use zlib/adler32.c

This comment has been minimized.

@ianhays

ianhays Jan 25, 2016

Member

Also, are all of these files used? I know for zlib-adler there were a bunch of files that weren't getting used in the build so we commented out their inclusion. Can we do that here?

@ianhays

ianhays Jan 25, 2016

Member

Also, are all of these files used? I know for zlib-adler there were a bunch of files that weren't getting used in the build so we commented out their inclusion. Can we do that here?

@ianhays

This comment has been minimized.

Show comment
Hide comment
@ianhays

ianhays Jan 25, 2016

Member

I ran the perf&unit tests again on the zlibs produced by the cmake build and replicated your results.

Looks good other than a few minor comments about duplication :)

Member

ianhays commented Jan 25, 2016

I ran the perf&unit tests again on the zlibs produced by the cmake build and replicated your results.

Looks good other than a few minor comments about duplication :)

@ianhays

This comment has been minimized.

Show comment
Hide comment
@ianhays

ianhays Jan 25, 2016

Member

@mellinoe OSX_Release failure #4833 presumably
13:30:18 System.Numerics.Tests.Matrix4x4Tests.Matrix4x4CreateFromYawPitchRollTest2 [FAIL]
13:30:18 Yaw:155 Pitch:-20 Roll:50
13:30:18 Expected: True
13:30:18 Actual: False
13:30:18 Stack Trace:
13:30:18 at System.Numerics.Tests.Matrix4x4Tests.Matrix4x4CreateFromYawPitchRollTest2()

Member

ianhays commented Jan 25, 2016

@mellinoe OSX_Release failure #4833 presumably
13:30:18 System.Numerics.Tests.Matrix4x4Tests.Matrix4x4CreateFromYawPitchRollTest2 [FAIL]
13:30:18 Yaw:155 Pitch:-20 Roll:50
13:30:18 Expected: True
13:30:18 Actual: False
13:30:18 Stack Trace:
13:30:18 at System.Numerics.Tests.Matrix4x4Tests.Matrix4x4CreateFromYawPitchRollTest2()

Build clrcompression.dll with optimized Zlib-Intel
On x86/x64 platforms, use Zlib-Intel for native compression instead of
Zlib-Adler. Performance tests show a +30% improvement when compressing
files by using SSE4.2 features.

ARM platforms still use Zlib-Adler for clrcompression.dll

Fix #3986
@bjjones

This comment has been minimized.

Show comment
Hide comment
@bjjones

bjjones Jan 27, 2016

Contributor

Unnecessary/duplicate files have been removed and I added a README.txt to the Zlib-Intel folder.

Contributor

bjjones commented Jan 27, 2016

Unnecessary/duplicate files have been removed and I added a README.txt to the Zlib-Intel folder.

@ianhays

This comment has been minimized.

Show comment
Hide comment
@ianhays
Member

ianhays commented Jan 27, 2016

LGTM. @stephentoub ?

)
endif()
if($ENV{__BuildArch} STREQUAL arm)

This comment has been minimized.

@stephentoub

stephentoub Jan 29, 2016

Member

I don't know what other potential future values __BuildArch could/should be, but should this just be an else?

@stephentoub

stephentoub Jan 29, 2016

Member

I don't know what other potential future values __BuildArch could/should be, but should this just be an else?

@@ -0,0 +1,11 @@
Zlib-Intel is an optimized version of Zlib developed by Intel to take advantage of modern CPU features.

This comment has been minimized.

@stephentoub

stephentoub Jan 29, 2016

Member

Do we need to make any changes to the 3rd party notices file?
https://github.com/dotnet/corefx/blob/master/THIRD-PARTY-NOTICES#L65

cc: @richlander, @martinwoodward

@stephentoub

This comment has been minimized.

@martinwoodward

martinwoodward Jan 29, 2016

Member

Just to confirm - the notices file already includes the appropriate ZLib reference due to the code on which this PR is based so we're good as is. Thanks for checking.

@martinwoodward

martinwoodward Jan 29, 2016

Member

Just to confirm - the notices file already includes the appropriate ZLib reference due to the code on which this PR is based so we're good as is. Thanks for checking.

* For conditions of distribution and use, see copyright notice in zlib.h
*/
/* @(#) $Id$ */

This comment has been minimized.

@stephentoub

stephentoub Jan 29, 2016

Member

What is this @(#) $Id$ comment showing up in the files?

@stephentoub

stephentoub Jan 29, 2016

Member

What is this @(#) $Id$ comment showing up in the files?

This comment has been minimized.

@ellismg

ellismg Jan 29, 2016

Contributor

Other source control systems expand that on checkout to something meaningful (e.g. the filename, last modified date, etc).

@ellismg

ellismg Jan 29, 2016

Contributor

Other source control systems expand that on checkout to something meaningful (e.g. the filename, last modified date, etc).

@stephentoub

This comment has been minimized.

Show comment
Hide comment
@stephentoub

stephentoub Jan 29, 2016

Member

A few nits, but generally looks good to me.

Member

stephentoub commented Jan 29, 2016

A few nits, but generally looks good to me.

@stephentoub

This comment has been minimized.

Show comment
Hide comment
@stephentoub
Member

stephentoub commented Jan 30, 2016

Thanks, @bjjones.

stephentoub added a commit that referenced this pull request Jan 30, 2016

Merge pull request #5674 from bjjones/zlib-intel
Build clrcompression.dll with optimized Zlib-Intel

@stephentoub stephentoub merged commit 1295b0a into dotnet:master Jan 30, 2016

10 checks passed

Innerloop CentOS7.1 Debug Build and Test Build finished. No test results found.
Details
Innerloop CentOS7.1 Release Build and Test Build finished. No test results found.
Details
Innerloop OSX Debug Build and Test Build finished. No test results found.
Details
Innerloop OSX Release Build and Test Build finished. No test results found.
Details
Innerloop OpenSUSE13.2 Debug Build and Test Build finished. No test results found.
Details
Innerloop OpenSUSE13.2 Release Build and Test Build finished. No test results found.
Details
Innerloop Ubuntu Debug Build and Test Build finished. No test results found.
Details
Innerloop Ubuntu Release Build and Test Build finished. No test results found.
Details
Innerloop Windows_NT Debug Build and Test Build finished. 111178 tests run, 34 skipped, 0 failed.
Details
Innerloop Windows_NT Release Build and Test Build finished. 111175 tests run, 34 skipped, 0 failed.
Details

@karelz karelz modified the milestone: 1.0.0-rtm Dec 3, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment