gtoubassi edited this page Jul 26, 2021 · 22 revisions

Performance

Note (as of 7/2021): if you are optimizing for performance, consider Zstd, which is optimized for speed and supports dictionary compression. It does not appear to be as effective as FemtoZip at dictionary compression of small payloads (see Tutorial).

As of 4/5/2011, performance of femtozip vs gzip/deflate and gzip/deflate+dictionary:

Algorithm                    Compression time (millis)    Decompression time (millis)    Compression ratio
FemtoZip                     439                          57                             31.58%
FemtoZip Level 3 (faster)    240                          75                             32.92%
FemtoZip No Dict*            189                          116                            94.65%
GZip                         340                          98                             92.92%
GZip+Dict                    2998                         382                            53.08%
Pure Java FemtoZip           812                          263                            31.38%
JNI FemtoZip                 659                          232                            31.58%
* FemtoZip No Dict is simply FemtoZip with no dictionary. Although this should not occur in practice, it gives an idea of how the core algorithm performs compared with vanilla gzip. (Comparing FemtoZip with GZip+Dict isn't a great apples-to-apples comparison, because GZip+Dict has such poor performance on the compression side.)
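The GZip and GZip+Dict rows compare deflate with and without a preset dictionary. As a rough illustration of that comparison only (not the FemtoZip benchmark itself), here is a minimal sketch using Python's standard `zlib` module. The sample documents and the hand-written dictionary are hypothetical stand-ins; a real dictionary would be built from training data, and the numbers this prints are not comparable to the table.

```python
import zlib

# Hypothetical small payloads that share structure (stand-ins for real documents).
docs = [
    ('{"user_id": %d, "status": "active", "country": "US", "plan": "free"}' % i).encode()
    for i in range(1000)
]

# Hypothetical hand-written dictionary holding the substrings the documents share.
dictionary = b'{"user_id": , "status": "active", "country": "US", "plan": "free"}'


def deflate(data, zdict=None):
    """Compress one document, optionally priming deflate with a preset dictionary."""
    if zdict is None:
        comp = zlib.compressobj(level=9)
    else:
        comp = zlib.compressobj(level=9, zdict=zdict)
    return comp.compress(data) + comp.flush()


def inflate(blob, zdict=None):
    """Decompress one document; the same dictionary must be supplied if one was used."""
    decomp = zlib.decompressobj() if zdict is None else zlib.decompressobj(zdict=zdict)
    return decomp.decompress(blob) + decomp.flush()


def ratio(documents, zdict=None):
    """Total compressed size divided by total original size (smaller is better)."""
    original = sum(len(d) for d in documents)
    compressed = sum(len(deflate(d, zdict)) for d in documents)
    return compressed / original


print("deflate, no dictionary:   {:.2%}".format(ratio(docs)))
print("deflate, with dictionary: {:.2%}".format(ratio(docs, dictionary)))
```

Each document is compressed independently, which is the small-payload case where a shared dictionary matters most: without one, every document has to spell out the common structure again.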

Conclusions

  • FemtoZip is faster on decompression (57 vs. 98 millis for GZip). This is attributed to the fact that windowing complexity is eliminated and, more importantly, that a Huffman tree does not have to be computed on the fly. (In fact, gzip computes Huffman trees two ways, custom and default, in order to compare storage tradeoffs, since the cost of a custom tree impacts compression rate.)
  • Default FemtoZip is faster than GZip+Dict on compression, but slower than GZip. The existence of a dictionary slows down compression because more matches need to be pursued. This is to be expected, but it gives an idea of practical compression performance vs. vanilla GZip. FemtoZip No Dict shows what the core compression algorithm does without a dictionary, for a more apples-to-apples comparison. In this case FemtoZip is faster (189 vs. 340 millis).
  • FemtoZip with a compressionLevel of 3 trades off very little compression ratio (32.92% vs. 31.58%) but outperforms GZip on compression time (240 vs. 340 millis).
  • The Java JNI interface needs a serious hosedown. Right now lots of data copies are happening in between Java and the native code. The performance should be much closer to raw FemtoZip than to pure Java.
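The relative speeds behind these conclusions can be worked out directly from the table. The sketch below only does arithmetic on the published timings; the `speedup` helper and the choice of comparisons are illustrative, not part of FemtoZip.

```python
# (compression millis, decompression millis, compression ratio %) copied from the table above.
RESULTS = {
    "FemtoZip": (439, 57, 31.58),
    "FemtoZip Level 3": (240, 75, 32.92),
    "FemtoZip No Dict": (189, 116, 94.65),
    "GZip": (340, 98, 92.92),
    "GZip+Dict": (2998, 382, 53.08),
    "Pure Java FemtoZip": (812, 263, 31.38),
    "JNI FemtoZip": (659, 232, 31.58),
}
COMPRESS, DECOMPRESS = 0, 1


def speedup(slower, faster, column):
    """How many times faster `faster` is than `slower` for the given timing column."""
    return RESULTS[slower][column] / RESULTS[faster][column]


comparisons = [
    ("GZip+Dict", "FemtoZip", COMPRESS, "compression"),
    ("GZip", "FemtoZip Level 3", COMPRESS, "compression"),
    ("GZip", "FemtoZip No Dict", COMPRESS, "compression"),
    ("GZip", "FemtoZip", DECOMPRESS, "decompression"),
    ("JNI FemtoZip", "FemtoZip", COMPRESS, "compression"),
]

for slower, faster, column, label in comparisons:
    factor = speedup(slower, faster, column)
    print("{} vs. {} on {}: {:.2f}x".format(faster, slower, label, factor))
```

The last comparison measures the JNI overhead relative to raw FemtoZip, which is the gap the final conclusion says should shrink.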

Methodology

Assuming your femtozip source clone is in ~/femtozip:

% cd ~/femtozip/scripts
% rm -rf data
% ./perfGenData data
% ./perfBuildModels data
% ./perfRun data
% rm -rf data