Home

Konstantin Nosov edited this page Jul 10, 2018 · 22 revisions
Test results

For test results, look here.

Compared versions

For comparison, we used a Win32 assembly-based optimized version (Asm), a Win32 C-based optimized version (C), a Win32 DLL-based version (DLL), and a Win32 C-based version built from the original sources (Orig). The DLL-based version used prebuilt zlib 1.2.3 binaries obtained from the Winimage zlib page. These DLL files were built from zlib with the official assembly patches.

A performance-oriented zlib fork, zlib-ng, was also tested.

Tests were built with Visual Studio 2013 with speed optimization enabled. The test system runs Windows 8.1 64-bit on an Intel Core i7-4702MQ @ 2.2 GHz.

Used data sets

We used three data sets for testing.

  1. Unreal Engine 4 source code (Engine/Source)
  2. Unreal Engine 4 binaries (Engine/Binaries/Win64)
  3. Graphics files containing geometry and textures in TIF and PNG formats.

Each data set contains up to 256 Mb of data (the Unreal Engine 4 source code set is smaller, at 137 Mb).

Win64 notes

For a 64-bit target, the original C implementation (Orig) is 6-10% slower than the 32-bit build of Orig. The same holds for the DLL build. However, the optimized (fast_zlib) C version (C) is faster in 64 bits than in 32 bits. In other words, the original code runs slower when built for 64 bits, while the optimized code runs faster. The optimized code therefore gains an additional performance boost over Orig simply from being built for 64 bits.

For a 32-bit target, the optimized assembly version is ~5% faster than the optimized C version, so the performance of the 64-bit C version falls somewhere between the 32-bit Asm and 32-bit C versions. This is why I didn't provide a 64-bit assembly version of the algorithm.

Linux notes

I've tested the library on 32-bit Ubuntu, compiled with GCC 5.4.0. The optimized code performs 1-4% slower than the Win32 version. The original zlib implementation performs 7-8% slower.

Testing

The test mode is the name of the compared version, as listed in the Compared versions section. The number that follows is the compression level. Each table cell contains data in the following format: <elapsed time> / <compressed size> / <compression speed>.
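To make the cell format concrete, here is a minimal Python sketch that splits one cell into its three fields. The names `Cell`, `parse_cell` and `implied_input_mb` are hypothetical helpers, not part of the test application. The last helper assumes that the speed column is the uncompressed input size divided by the elapsed time, which is consistent with the 137 Mb source code set.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    seconds: float          # elapsed time
    compressed_bytes: int   # compressed size
    speed_mb_s: float       # compression speed


def parse_cell(cell: str) -> Cell:
    """Parse '<elapsed time> / <compressed size> / <compression speed>'."""
    # Split on ' / ' so the slash inside 'Mb/s' is left alone.
    time_part, size_part, speed_part = cell.split(" / ")
    return Cell(
        seconds=float(time_part.rstrip("s")),
        compressed_bytes=int(size_part.split()[0]),
        speed_mb_s=float(speed_part.split()[0]),
    )


def implied_input_mb(cell: Cell) -> float:
    """Input size implied by the cell, ASSUMING speed = input size / time."""
    return cell.seconds * cell.speed_mb_s


cell = parse_cell("5.2s / 28858611 b / 26.49 Mb/s")
print(cell)
print(f"{implied_input_mb(cell):.1f} Mb")  # prints "137.7 Mb"
```

The sample string is the Asm -9 source code cell from the Release 2 results.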

As you can see, the Asm version is only slightly faster than the C version. The optimized version performs roughly 2.5-10 times faster than the original C version. The slower the compression, the greater the performance improvement.

The test application was designed to exclude file access times as much as possible. Both compression and decompression are performed in memory, and file reading is excluded from the measurements. The tables below therefore show "clean" compression results. The test application source code can be found here.
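The idea of timing only the in-memory work can be sketched as follows. This is not the test application itself: it uses Python's standard `zlib` module rather than the fast_zlib build, and `time_in_memory` is a hypothetical helper name. The data is already in memory before the timer starts, so no file access is measured.

```python
import time
import zlib


def time_in_memory(data: bytes, level: int = 9) -> dict:
    """Time compression and decompression of a buffer that is already in memory."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    compress_seconds = time.perf_counter() - start

    start = time.perf_counter()
    restored = zlib.decompress(compressed)
    decompress_seconds = time.perf_counter() - start

    # Sanity check: the round trip must reproduce the input exactly.
    if restored != data:
        raise ValueError("round trip mismatch")

    megabytes = len(data) / (1024 * 1024)
    return {
        "input_bytes": len(data),
        "compressed_bytes": len(compressed),
        "compress_seconds": compress_seconds,
        "decompress_seconds": decompress_seconds,
        # Guard against a zero interval on very small inputs.
        "speed_mb_s": megabytes / max(compress_seconds, 1e-9),
    }


# Sample buffer generated in memory; a real run would load the data set first
# and only then start timing.
sample = b"zlib benchmark sample line\n" * 50000
result = time_in_memory(sample, level=9)
print(result)
```

The timings depend on the machine and on the zlib build behind the Python module, so they are not comparable to the figures in the tables.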

Current tests are for zlib version 1.2.11.

Test results for Release 2

This release performs 15-35% faster than Release 1. The slower the compression, the greater the performance boost.

Zlib-ng compiled for Win64 performs nearly 10-15% faster than zlib-ng for Win32. This zlib version has some GCC-specific optimizations, so it would probably run faster if built with GCC. The Visual C build didn't show any significant performance improvement, and sometimes even runs slower than the original zlib code.

| Test mode | Source code | Binaries | Geometry data |
|---|---|---|---|
| Asm -9 | 5.2s / 28858611 b / 26.49 Mb/s | 12.3s / 54473779 b / 20.74 Mb/s | 17.2s / 114915875 b / 14.85 Mb/s |
| C -9 | 5.1s / 28858611 b / 27.08 Mb/s | 12.6s / 54473779 b / 20.29 Mb/s | 18.2s / 114915875 b / 14.07 Mb/s |
| zlib-ng -9 | 9.5s / 28859066 b / 14.53 Mb/s | 45.3s / 54485028 b / 5.65 Mb/s | 128.1s / 114934697 b / 2.00 Mb/s |
| Dll -9 | 9.6s / 28859103 b / 14.37 Mb/s | 41.7s / 54485836 b / 6.14 Mb/s | 145.9s / 114942962 b / 1.75 Mb/s |
| Orig -9 | 10.7s / 28859103 b / 12.84 Mb/s | 49.1s / 54485836 b / 5.21 Mb/s | 180.0s / 114942962 b / 1.42 Mb/s |
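As a rough check of how the speedup grows with slower data sets, the elapsed times of the Orig -9 and C -9 rows above can be divided directly. The snippet below is only an illustrative calculation on the Release 2 figures at level 9; `speedup` is a hypothetical helper name.

```python
# Elapsed times in seconds, copied from the Orig -9 and C -9 rows (Release 2).
orig_seconds = {"Source code": 10.7, "Binaries": 49.1, "Geometry data": 180.0}
c_seconds = {"Source code": 5.1, "Binaries": 12.6, "Geometry data": 18.2}


def speedup(baseline: float, optimized: float) -> float:
    """How many times faster the optimized run is than the baseline run."""
    return baseline / optimized


for name in orig_seconds:
    factor = speedup(orig_seconds[name], c_seconds[name])
    print(f"{name}: {factor:.1f}x")
# Source code: 2.1x
# Binaries: 3.9x
# Geometry data: 9.9x
```

The geometry data set, which the original code compresses most slowly, shows the largest gain.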

Test results for Release 1

| Test mode | Source code | Binaries | Geometry data |
|---|---|---|---|
| Asm -9 | 11.8s / 51136013 b / 21.65 Mb/s | 15.4s / 54456154 b / 16.64 Mb/s | 25.4s / 114917505 b / 10.07 Mb/s |
| C -9 | 12.3s / 51136013 b / 20.81 Mb/s | 16.1s / 54456154 b / 15.88 Mb/s | 26.6s / 114917505 b / 9.63 Mb/s |
| Dll -9 | 22.6s / 51145811 b / 11.35 Mb/s | 42.0s / 54485836 b / 6.10 Mb/s | 150.0s / 114942962 b / 1.71 Mb/s |
| Orig -9 | 26.3s / 51145811 b / 9.75 Mb/s | 52.6s / 54485836 b / 4.87 Mb/s | 184.4s / 114942962 b / 1.39 Mb/s |