igzip is 3x faster at decompressing than zlib-ng (6x faster than zlib) #986

Closed
ghuls opened this issue Jun 4, 2021 · 6 comments

Comments


ghuls commented Jun 4, 2021

igzip is 3x faster at decompressing than zlib-ng (6x faster than zlib).
https://github.com/intel/isa-l/tree/master/igzip

test.gz ==> 5034242503 bytes (compressed)
test ==> 16396511939 bytes (decompressed)
# Gzip decompression speed.
$ module load gzip
$ timeit gzip -cd test.gz > /dev/null
Time output:
------------
  * Command: gzip -cd test.gz
  * Elapsed wall time: 1:57.76 = 117.76 seconds
  * Elapsed CPU time:
     - User: 115.85
     - Sys: 1.45
  * CPU usage: 99%
  * Context switching:
     - Voluntarily (e.g.: waiting for I/O operation): 9
     - Involuntarily (time slice expired): 14919
  * Maximum resident set size (RSS: memory) (kiB): 736
  * Number of times the process was swapped out of main memory: 0
  * Filesystem:
     - # of inputs: 192
     - # of outputs: 0
  * Exit status: 0

                                                                                                                                                                                                                          
# pigz decompression speed with zlib-ng.
$ module load zlib-ng
$ module load pigz

$ timeit pigz -cd test.gz > /dev/null
Time output:
------------
  * Command: pigz -cd test.gz
  * Elapsed wall time: 1:02.13 = 62.13 seconds
  * Elapsed CPU time:
     - User: 77.85
     - Sys: 7.01
  * CPU usage: 136%
  * Context switching:
     - Voluntarily (e.g.: waiting for I/O operation): 1155349
     - Involuntarily (time slice expired): 539
  * Maximum resident set size (RSS: memory) (kiB): 1008
  * Number of times the process was swapped out of main memory: 0
  * Filesystem:
     - # of inputs: 0
     - # of outputs: 0
  * Exit status: 0


# igzip decompression speed.
# igzip: https://github.com/intel/isa-l

$ module load ISA-L

$ timeit igzip -cd test.gz > /dev/null
Time output:
------------
  * Command: igzip -cd test.gz
  * Elapsed wall time: 0:20.53 = 20.53 seconds
  * Elapsed CPU time:
     - User: 19.30
     - Sys: 1.17
  * CPU usage: 99%
  * Context switching:
     - Voluntarily (e.g.: waiting for I/O operation): 6
     - Involuntarily (time slice expired): 48
  * Maximum resident set size (RSS: memory) (kiB): 2840
  * Number of times the process was swapped out of main memory: 0
  * Filesystem:
     - # of inputs: 0
     - # of outputs: 0
  * Exit status: 0
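
As a rough cross-check of the headline numbers, the wall times from the three runs above can be turned into speedup ratios and throughput. This is only a sketch: the dictionary keys are labels chosen here for illustration, and throughput is computed on the decompressed size with 1 MB taken as 10^6 bytes.

```python
# Wall-clock times (seconds) and output size copied from the runs above.
DECOMPRESSED_BYTES = 16396511939
wall_seconds = {"gzip": 117.76, "pigz+zlib-ng": 62.13, "igzip": 20.53}


def speedup(baseline: str, candidate: str) -> float:
    """How many times faster `candidate` finished than `baseline`."""
    return wall_seconds[baseline] / wall_seconds[candidate]


def throughput_mb_s(tool: str) -> float:
    """Decompressed output rate in MB/s (1 MB = 10**6 bytes)."""
    return DECOMPRESSED_BYTES / wall_seconds[tool] / 1e6


for tool in wall_seconds:
    print(f"{tool:>13}: {throughput_mb_s(tool):7.1f} MB/s, "
          f"igzip is {speedup(tool, 'igzip'):.2f}x faster")
```

With these figures the ratio against pigz with zlib-ng comes out at roughly 3x and the ratio against gzip at a little under 6x, which matches the claim in the title.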

mtl1979 commented Jun 4, 2021

zlib-ng doesn't try to be the fastest... It does, however, include a lot of intrinsic functions to speed things up, but it doesn't contain hand-written assembly.

Some C compilers can't handle inline assembly, so a separate assembler is required to support hand-written assembly.


neurolabusc commented Jun 11, 2021

As @sebpop has noted, less efficient compression may produce a file that is faster and easier to decompress. To compare decompression between tools fairly, all tools should contribute to a corpus of compressed files, and each tool is then timed while decompressing this common set of files. I used this script to compare igzip on an AMD 3900X. igzip's compression is fast but generates large files (see figure). When decompression is compared in a fair test, igzip is faster than zlib-ng and somewhat faster than libdeflate (which tends to have higher memory demands). For completeness, I also tested AArch64 binaries on an ARM-based Apple M1, where igzip and zlib-ng performed similarly (~284 mb/s) while libdeflate was faster (341 mb/s).

igzip only supports compression levels 0..3, so it is not a drop-in replacement for scripts that expect gzip to support levels 1..9 (specifying a level greater than 3 results in an invalid compression level error). As an aside, here I test zlib-ng single-threaded. pigz is much faster when compressing with multiple threads, but only a bit faster at decompressing than a single-threaded tool (I think the decompression uses just one thread, but pigz uses a separate thread for the CRC calculation). Therefore, for x86-64, igzip compression fills a niche, and its decompression is class-leading (~2x zlib-ng).

DecompressMethod	Min	Mean	Max	mb/s
igzip	9040	9157	9233	492.32
libdeflate	9922	9974	10017	448.59
zlibNGclang	17295	17345	17390	257.34

To try this yourself, you just need to install igzip on your computer and append one line of code around line 369:

exes.append({'exe': 'igzip', 'uncompress': ' -q -f -q -k -d ', 'compress': ' -q -f -k -', 'max_level': 3, 'ext': '.gz' })

[Figure: silesia_speed_sizeX]


mtl1979 commented Jun 11, 2021

It's not fair to compare the zlib-ng+pigz combination when the others are executed directly... zlib-ng is only fast when it is given a large enough buffer in memory and any disk I/O is avoided... Until today there was a known issue with pigz and zlib-ng which resulted in skewed results and possible data corruption.

neurolabusc commented

@nmoinvaz you are correct that igzip can use multiple threads (e.g. -T 4) for compression. However, I tested it in the default single-threaded mode against other single-threaded tools.

@mtl1979 I did test on a RAM disk to minimize disk I/O. I notice that minigzip.c defines a 16kb/48kb read/write buffer size. Perhaps we could allow the user to specify this from the command line, which might improve performance and remove this source of variability when comparing to other tools.
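
To illustrate what a user-selectable read buffer could look like, here is a minimal sketch using Python's `zlib` binding rather than minigzip.c itself. The function name `stream_gunzip` and the `read_size` parameter are hypothetical, only the read-side buffer is modelled, and the sketch handles a single-member gzip stream.

```python
import gzip
import io
import zlib


def stream_gunzip(src, dst, read_size: int = 16 * 1024) -> int:
    """Decompress a gzip stream from `src` to `dst`, reading `read_size` bytes at a time.

    Returns the number of decompressed bytes written.
    """
    decoder = zlib.decompressobj(wbits=31)  # 31 = 16 + 15: expect a gzip header
    written = 0
    while True:
        chunk = src.read(read_size)
        if not chunk:
            break
        data = decoder.decompress(chunk)
        dst.write(data)
        written += len(data)
    tail = decoder.flush()  # emit anything still buffered in the decoder
    dst.write(tail)
    written += len(tail)
    return written


original = bytes(range(256)) * 4096
compressed = gzip.compress(original)
for size in (16 * 1024, 256 * 1024):
    out = io.BytesIO()
    n = stream_gunzip(io.BytesIO(compressed), out, read_size=size)
    print(size, n, out.getvalue() == original)
```

Timing the same input with several `read_size` values would show how much of a measured difference between tools comes from buffering rather than from the decoder.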


ghuls commented Jun 17, 2021

For me the most appealing part of igzip is the decompression speed (compression speed is fast too, but the compression ratio suffers, of course). I probably compiled igzip wrong, but specifying -T 4 when compressing a file with igzip never resulted in CPU usage higher than 100%, so it seems it couldn't find the threading library.
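
For reference, the CPU usage figure that time-style tools print is usually (user + sys) / wall, so a value above 100% means more than one core was busy on average. A small sketch using the pigz and igzip decompression timings from the first comment; the function name is hypothetical, and the reporting tool may truncate or round differently.

```python
def cpu_usage_percent(user_s: float, sys_s: float, wall_s: float) -> float:
    """CPU usage as (user + sys) / wall, expressed in percent."""
    return 100.0 * (user_s + sys_s) / wall_s


# User, sys and wall times from the pigz and igzip decompression runs in this thread.
pigz = cpu_usage_percent(77.85, 7.01, 62.13)
igzip = cpu_usage_percent(19.30, 1.17, 20.53)
print(f"pigz: {pigz:.1f}%  igzip: {igzip:.1f}%")
```

If -T 4 were taking effect during compression, the same calculation on that run would be expected to give a value well above 100%.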
