Issues when using density inside the Blosc meta-compressor #47

Closed
FrancescAlted opened this issue Jun 20, 2015 · 14 comments
@FrancescAlted

Hi, I am trying to add support for DENSITY to the Blosc meta-compressor. Right now my attempt lives here: https://github.com/FrancescAlted/c-blosc/tree/density, and in particular you can see how DENSITY is called here: https://github.com/FrancescAlted/c-blosc/blob/density/blosc/blosc.c#L504

However, I am running into issues when selecting the DENSITY codec:

$ bench/bench density
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
List of supported compressors in this build: blosclz,lz4,lz4hc,snappy,zlib,density
Supported compression libraries:
  BloscLZ: 1.0.5
  LZ4: 1.7.0
  Snappy: 1.1.1
  Zlib: 1.2.8
  DENSITY: 0.12.5
Using compressor: density
Running suite: single
--> 4, 2097152, 8, 19, density
********************** Run info ******************************
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
Using synthetic data with 19 significant bits (out of 32)
Dataset size: 2097152 bytes     Type size: 8 bytes
Working set: 256.0 MB           Number of threads: 4
********************** Running benchmarks *********************
memcpy(write):            515.7 us, 3878.2 MB/s
memcpy(read):             247.1 us, 8095.3 MB/s
Compression level: 0
comp(write):      339.5 us, 5890.4 MB/s   Final bytes: 2097168  Ratio: 1.00
decomp(read):     252.4 us, 7925.2 MB/s   OK
Compression level: 1
comp(write):     13871.5 us, 144.2 MB/s   Final bytes: 1204240  Ratio: 1.74
decomp(read):     143.1 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 2
comp(write):     10653.2 us, 187.7 MB/s   Final bytes: 1204240  Ratio: 1.74
decomp(read):     230.5 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 3
comp(write):     10549.5 us, 189.6 MB/s   Final bytes: 1204240  Ratio: 1.74
decomp(read):     149.4 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 4
comp(write):     7510.8 us, 266.3 MB/s    Final bytes: 1159184  Ratio: 1.81
decomp(read):     143.1 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 5
comp(write):     5459.7 us, 366.3 MB/s    Final bytes: 1159184  Ratio: 1.81
decomp(read):     149.7 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 6
comp(write):     3324.9 us, 601.5 MB/s    Final bytes: 1136656  Ratio: 1.85
decomp(read):     148.9 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 7
comp(write):     2294.0 us, 871.8 MB/s    Final bytes: 1125520  Ratio: 1.86
decomp(read):     152.4 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 8
comp(write):     2570.4 us, 778.1 MB/s    Final bytes: 1125520  Ratio: 1.86
decomp(read):     174.4 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 9
comp(write):     1798.7 us, 1111.9 MB/s   Final bytes: 1119824  Ratio: 1.87
decomp(read):     252.0 us, -0.0 MB/s     FAILED.  Error code: -1
OK

Round-trip compr/decompr on 7.5 GB
Elapsed time:      23.4 s, 721.2 MB/s

The above is with the 'master' branch of DENSITY (refreshed a few minutes ago). With DENSITY's 'dev' branch, I get somewhat better results:

$ bench/bench density
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
List of supported compressors in this build: blosclz,lz4,lz4hc,snappy,zlib,density
Supported compression libraries:
  BloscLZ: 1.0.5
  LZ4: 1.7.0
  Snappy: 1.1.1
  Zlib: 1.2.8
  DENSITY: 0.12.5
Using compressor: density
Running suite: single
--> 4, 2097152, 8, 19, density
********************** Run info ******************************
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
Using synthetic data with 19 significant bits (out of 32)
Dataset size: 2097152 bytes     Type size: 8 bytes
Working set: 256.0 MB           Number of threads: 4
********************** Running benchmarks *********************
memcpy(write):            511.3 us, 3911.4 MB/s
memcpy(read):             246.3 us, 8121.1 MB/s
Compression level: 0
comp(write):      282.7 us, 7074.0 MB/s   Final bytes: 2097168  Ratio: 1.00
decomp(read):     213.6 us, 9365.2 MB/s   OK
Compression level: 1
comp(write):     10286.4 us, 194.4 MB/s   Final bytes: 1206288  Ratio: 1.74
decomp(read):    10408.2 us, 192.2 MB/s   OK
Compression level: 2
comp(write):     10418.7 us, 192.0 MB/s   Final bytes: 1206288  Ratio: 1.74
decomp(read):    11595.9 us, 172.5 MB/s   OK
Compression level: 3
comp(write):     10521.3 us, 190.1 MB/s   Final bytes: 1206288  Ratio: 1.74
decomp(read):    10770.5 us, 185.7 MB/s   OK
Compression level: 4
comp(write):     5685.5 us, 351.8 MB/s    Final bytes: 1160208  Ratio: 1.81
decomp(read):    6010.8 us, 332.7 MB/s    OK
Compression level: 5
comp(write):     5879.1 us, 340.2 MB/s    Final bytes: 1160208  Ratio: 1.81
decomp(read):    6056.7 us, 330.2 MB/s    OK
Compression level: 6
comp(write):     3476.0 us, 575.4 MB/s    Final bytes: 1137168  Ratio: 1.84
decomp(read):    3381.0 us, 591.5 MB/s    OK
Compression level: 7
comp(write):     2396.0 us, 834.7 MB/s    Final bytes: 1125776  Ratio: 1.86
decomp(read):     194.8 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 8
comp(write):     2266.4 us, 882.5 MB/s    Final bytes: 1125776  Ratio: 1.86
decomp(read):     164.7 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 9
comp(write):     2056.3 us, 972.6 MB/s    Final bytes: 1119952  Ratio: 1.87
decomp(read):    1559.9 us, 1282.1 MB/s   OK

Round-trip compr/decompr on 7.5 GB
Elapsed time:      40.1 s, 421.2 MB/s

So I suppose DENSITY is still in beta, but please consider c-blosc as another test bench. Second, I wonder why the speed is so low. For example, using the LZ4 codec I am getting this:

$ bench/bench lz4
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
List of supported compressors in this build: blosclz,lz4,lz4hc,snappy,zlib,density
Supported compression libraries:
  BloscLZ: 1.0.5
  LZ4: 1.7.0
  Snappy: 1.1.1
  Zlib: 1.2.8
  DENSITY: 0.12.5
Using compressor: lz4
Running suite: single
--> 4, 2097152, 8, 19, lz4
********************** Run info ******************************
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
Using synthetic data with 19 significant bits (out of 32)
Dataset size: 2097152 bytes     Type size: 8 bytes
Working set: 256.0 MB           Number of threads: 4
********************** Running benchmarks *********************
memcpy(write):            536.8 us, 3725.7 MB/s
memcpy(read):             250.7 us, 7978.5 MB/s
Compression level: 0
comp(write):      328.2 us, 6093.7 MB/s   Final bytes: 2097168  Ratio: 1.00
decomp(read):     244.8 us, 8170.1 MB/s   OK
Compression level: 1
comp(write):      499.4 us, 4005.2 MB/s   Final bytes: 554512  Ratio: 3.78
decomp(read):     268.6 us, 7445.3 MB/s   OK
Compression level: 2
comp(write):      472.6 us, 4231.5 MB/s   Final bytes: 498960  Ratio: 4.20
decomp(read):     278.4 us, 7184.0 MB/s   OK
Compression level: 3
comp(write):      449.4 us, 4450.4 MB/s   Final bytes: 520824  Ratio: 4.03
decomp(read):     323.2 us, 6188.7 MB/s   OK
Compression level: 4
comp(write):      440.6 us, 4539.6 MB/s   Final bytes: 332112  Ratio: 6.31
decomp(read):     321.1 us, 6227.8 MB/s   OK
Compression level: 5
comp(write):      421.8 us, 4741.8 MB/s   Final bytes: 327112  Ratio: 6.41
decomp(read):     309.7 us, 6458.5 MB/s   OK
Compression level: 6
comp(write):      465.4 us, 4297.5 MB/s   Final bytes: 226308  Ratio: 9.27
decomp(read):     395.4 us, 5058.2 MB/s   OK
Compression level: 7
comp(write):      631.9 us, 3165.3 MB/s   Final bytes: 211880  Ratio: 9.90
decomp(read):     564.4 us, 3543.5 MB/s   OK
Compression level: 8
comp(write):      602.6 us, 3318.9 MB/s   Final bytes: 220464  Ratio: 9.51
decomp(read):     568.9 us, 3515.8 MB/s   OK
Compression level: 9
comp(write):      645.2 us, 3099.9 MB/s   Final bytes: 132154  Ratio: 15.87
decomp(read):     694.6 us, 2879.3 MB/s   OK

Round-trip compr/decompr on 7.5 GB
Elapsed time:       3.8 s, 4497.0 MB/s

which is roughly 10x faster.

In case you want to experiment yourself, DENSITY support in c-blosc goes through a shared library for now (c-blosc cannot require C99, because it also has to build other codecs whose code is not C99-compliant). So, assuming the DENSITY shared libraries are installed in the system (say /usr/local/lib, with headers in /usr/local/include), here is how to compile c-blosc:

$ mkdir build
$ cd build
$ CC="clang-3.5" CXX="clang++-3.5" CFLAGS="-O3" cmake ..
$ CC="clang-3.5" CXX="clang++-3.5" CFLAGS="-O3" make
$ bench/bench density    # bench executable ready to be used

Thanks!

@g1mv (Owner) commented Jun 20, 2015

Hey @FrancescAlted

Thanks for filing this issue!
A lot of things look strange to me at first sight in these results:

  • The compression levels you are referring to are blosc compression levels, right? Otherwise you only use density's Chameleon?
  • The higher the compression ratio, the faster the speed... that's really odd.
  • Even the lz4 results look odd, because lz4 is heavily asymmetric and usually about 10x faster at decompressing than compressing.

Do you have any idea about these?
Otherwise, thanks for the links; I'll give c-blosc a try with static libraries to check whether anything's wrong.
BTW, I just released the final 0.12.5 beta seconds ago (it's the current master branch); it might already fix some problems.

@g1mv (Owner) commented Jun 20, 2015

This is what I get on OS/X with the latest dev version:

Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
List of supported compressors in this build: blosclz,lz4,lz4hc,snappy,zlib,density
Supported compression libraries:
  BloscLZ: 1.0.5
  LZ4: 1.7.0
  Snappy: unknown
  Zlib: 1.2.5
  DENSITY: 0.12.6
Using compressor: density
Running suite: single
--> 4, 2097152, 8, 19, density
********************** Run info ******************************
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
Using synthetic data with 19 significant bits (out of 32)
Dataset size: 2097152 bytes Type size: 8 bytes
Working set: 256.0 MB       Number of threads: 4
********************** Running benchmarks *********************
memcpy(write):        595.1 us, 3360.7 MB/s
memcpy(read):         218.6 us, 9149.1 MB/s
Compression level: 0
comp(write):      331.4 us, 6034.9 MB/s   Final bytes: 2097168  Ratio: 1.00
decomp(read):     214.7 us, 9313.6 MB/s   OK
Compression level: 1
comp(write):     2216.0 us, 902.5 MB/s    Final bytes: 1204240  Ratio: 1.74
decomp(read):     537.3 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 2
comp(write):     2206.0 us, 906.6 MB/s    Final bytes: 1204240  Ratio: 1.74
decomp(read):     699.4 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 3
comp(write):     2218.4 us, 901.5 MB/s    Final bytes: 1204240  Ratio: 1.74
decomp(read):     737.3 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 4
comp(write):     1621.4 us, 1233.5 MB/s   Final bytes: 1159184  Ratio: 1.81
decomp(read):    1165.2 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 5
comp(write):     1390.6 us, 1438.2 MB/s   Final bytes: 1159184  Ratio: 1.81
decomp(read):    1189.5 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 6
comp(write):      949.2 us, 2106.9 MB/s   Final bytes: 1136656  Ratio: 1.85
decomp(read):    1355.1 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 7
comp(write):      743.6 us, 2689.6 MB/s   Final bytes: 1125520  Ratio: 1.86
decomp(read):    1497.2 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 8
comp(write):      761.6 us, 2626.1 MB/s   Final bytes: 1125520  Ratio: 1.86
decomp(read):    1562.7 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 9
comp(write):      785.8 us, 2545.2 MB/s   Final bytes: 1119824  Ratio: 1.87
decomp(read):    1980.3 us, -0.0 MB/s     FAILED.  Error code: -1
OK

Round-trip compr/decompr on 7.5 GB
Elapsed time:       9.6 s, 1751.7 MB/s

So it's very similar to your results. I'll need to check what's going on.
Can you give me a quick heads-up on how blosc operates (the logic)? That would save me digging through your source code and help me verify there's nothing incompatible with how density functions.

@FrancescAlted (Author)

Thanks for the speedy response. Yes, what blosc does is basically split the data to be compressed into small blocks (to use the L1 cache as efficiently as possible, but also to leverage multi-threading). It then applies a shuffle filter (which does not compress as such, but helps compressors achieve better compression ratios in many binary-data scenarios) and passes the shuffled data to the compressor. There is more info about how it works in the first 10 minutes of this presentation: https://www.youtube.com/watch?v=E9q33wbPCGU
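
To make the shuffle idea concrete, here is a minimal reference sketch of the byte transposition involved (the real Blosc implementation is SSE2-vectorized; this sketch only shows the plain data movement):

#include <stddef.h>
#include <stdint.h>

/* Reference shuffle: regroup a buffer of `nitems` elements of `typesize`
   bytes so that byte 0 of every element comes first, then byte 1 of
   every element, and so on. */
static void shuffle_ref(const uint8_t *src, uint8_t *dst,
                        size_t typesize, size_t nitems)
{
    for (size_t j = 0; j < typesize; j++)      /* byte position in element */
        for (size_t i = 0; i < nitems; i++)    /* element index */
            dst[j * nitems + i] = src[i * typesize + j];
}

/* The inverse (unshuffle) swaps the roles of source and destination. */
static void unshuffle_ref(const uint8_t *src, uint8_t *dst,
                          size_t typesize, size_t nitems)
{
    for (size_t j = 0; j < typesize; j++)
        for (size_t i = 0; i < nitems; i++)
            dst[i * typesize + j] = src[j * nitems + i];
}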

Regarding the size of the blocks (I suppose this is important for density), they typically range from 8 KB up to around 1 MB, depending on the compression level, the data type size and the compressor that is going to be used. See the algorithm that computes block sizes here: https://github.com/FrancescAlted/c-blosc/blob/density/blosc/blosc.c#L918
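
Just to give a flavor of the kind of heuristic involved, a simplified sketch (this is not the actual algorithm linked above, only its general shape):

/* Simplified sketch in the spirit of compute_blocksize() in blosc.c --
   not the actual algorithm.  Grow the block with the compression level
   and clamp it to a cache-friendly range. */
static size_t pick_blocksize(int clevel, size_t typesize, size_t nbytes)
{
    size_t blocksize = 8 * 1024;              /* 8 KB floor */
    if (clevel > 1)
        blocksize <<= (clevel - 1);           /* grow with level */
    if (blocksize > 1024 * 1024)
        blocksize = 1024 * 1024;              /* ~1 MB cap */
    if (blocksize > nbytes)
        blocksize = nbytes;                   /* never exceed the input */
    blocksize -= blocksize % typesize;        /* whole items per block */
    return blocksize;
}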

Please tell me if you need more clarification. I am eager to use DENSITY inside Blosc because I think it is a good fit, but I am trying to understand it first (then I will need to figure out how to mix C89 and C99 code in the same project ;)

@FrancescAlted (Author)

Oh, and regarding the question of using just Chameleon: that is because I am still experimenting. If everything goes well, the idea is to use Chameleon for low compression levels and Cheetah for higher ones. Then, depending on how slow compression turns out to be, I might decide to use Lion for the highest compression level. I suppose I can use density_buffer_decompress() for decompressing any of these, right?
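
Something along these lines is what I have in mind (the thresholds are just a guess for now, and I am assuming the DENSITY_COMPRESSION_MODE_* constants exposed by density's public header):

/* Tentative blosc-level to density-algorithm mapping -- the thresholds
   here are a guess, nothing is settled yet. */
static DENSITY_COMPRESSION_MODE mode_for_clevel(int clevel)
{
    if (clevel <= 3)
        return DENSITY_COMPRESSION_MODE_CHAMELEON_ALGORITHM;
    else if (clevel <= 8)
        return DENSITY_COMPRESSION_MODE_CHEETAH_ALGORITHM;
    else
        return DENSITY_COMPRESSION_MODE_LION_ALGORITHM;  /* level 9 */
}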

@g1mv (Owner) commented Jun 20, 2015

OK, I got everything to work properly using the following patch applied to your density tree: https://gist.github.com/gpnuma/e159fb6b505ef9b11e00.

Here is a test run:

Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
List of supported compressors in this build: blosclz,lz4,lz4hc,snappy,zlib,density
Supported compression libraries:
  BloscLZ: 1.0.5
  LZ4: 1.7.0
  Snappy: unknown
  Zlib: 1.2.5
  DENSITY: 0.12.6
Using compressor: density
Running suite: single
--> 4, 8388608, 8, 32, density
********************** Run info ******************************
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
Using synthetic data with 32 significant bits (out of 32)
Dataset size: 8388608 bytes Type size: 8 bytes
Working set: 256.0 MB       Number of threads: 4
********************** Running benchmarks *********************
memcpy(write):       2526.0 us, 3167.0 MB/s
memcpy(read):        1291.0 us, 6196.7 MB/s
Compression level: 0
comp(write):     1101.3 us, 7264.3 MB/s   Final bytes: 8388624  Ratio: 1.00
decomp(read):    1313.1 us, 6092.6 MB/s   OK
Compression level: 1
comp(write):     2871.6 us, 2785.9 MB/s   Final bytes: 4566672  Ratio: 1.84
decomp(read):    2388.5 us, 3349.3 MB/s   OK
Compression level: 2
comp(write):     2750.1 us, 2909.0 MB/s   Final bytes: 4566672  Ratio: 1.84
decomp(read):    2395.7 us, 3339.3 MB/s   OK
Compression level: 3
comp(write):     2749.2 us, 2910.0 MB/s   Final bytes: 4566672  Ratio: 1.84
decomp(read):    2407.5 us, 3323.0 MB/s   OK
Compression level: 4
comp(write):     2977.3 us, 2687.0 MB/s   Final bytes: 4511568  Ratio: 1.86
decomp(read):    2269.7 us, 3524.7 MB/s   OK
Compression level: 5
comp(write):     3043.9 us, 2628.2 MB/s   Final bytes: 4511568  Ratio: 1.86
decomp(read):    2270.0 us, 3524.2 MB/s   OK
Compression level: 6
comp(write):     4438.5 us, 1802.4 MB/s   Final bytes: 3622608  Ratio: 2.32
decomp(read):    4439.0 us, 1802.2 MB/s   OK
Compression level: 7
comp(write):     4256.3 us, 1879.6 MB/s   Final bytes: 3601120  Ratio: 2.33
decomp(read):    4279.2 us, 1869.5 MB/s   OK
Compression level: 8
comp(write):     4248.0 us, 1883.2 MB/s   Final bytes: 3601120  Ratio: 2.33
decomp(read):    4408.4 us, 1814.7 MB/s   OK
Compression level: 9
comp(write):     11095.0 us, 721.0 MB/s   Final bytes: 1887328  Ratio: 4.44
decomp(read):    12044.7 us, 664.2 MB/s   OK

Round-trip compr/decompr on 7.5 GB
Elapsed time:       7.9 s, 2141.1 MB/s

I set the significant bits to 32, otherwise the data to compress isn't very interesting (it's like processing a file full of zeroes).
Compression ratios are more contained than in the lz4 run (they never go below 1.84); I saw you're using the accel parameter for lz4_fast, which can lead to almost no compression but much greater speed.

Here is a sample run with snappy, which exhibits a similar, although lower (1.60), containment in compression ratio:

Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
List of supported compressors in this build: blosclz,lz4,lz4hc,snappy,zlib,density
Supported compression libraries:
  BloscLZ: 1.0.5
  LZ4: 1.7.0
  Snappy: unknown
  Zlib: 1.2.5
  DENSITY: 0.12.6
Using compressor: snappy
Running suite: single
--> 4, 8388608, 8, 32, snappy
********************** Run info ******************************
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
Using synthetic data with 32 significant bits (out of 32)
Dataset size: 8388608 bytes Type size: 8 bytes
Working set: 256.0 MB       Number of threads: 4
********************** Running benchmarks *********************
memcpy(write):       2402.9 us, 3329.3 MB/s
memcpy(read):        1203.4 us, 6648.0 MB/s
Compression level: 0
comp(write):     1345.3 us, 5946.4 MB/s   Final bytes: 8388624  Ratio: 1.00
decomp(read):    1285.3 us, 6224.3 MB/s   OK
Compression level: 1
comp(write):     6389.5 us, 1252.1 MB/s   Final bytes: 5232684  Ratio: 1.60
decomp(read):    2433.4 us, 3287.5 MB/s   OK
Compression level: 2
comp(write):     4867.7 us, 1643.5 MB/s   Final bytes: 5232684  Ratio: 1.60
decomp(read):    2394.4 us, 3341.1 MB/s   OK
Compression level: 3
comp(write):     4901.1 us, 1632.3 MB/s   Final bytes: 5232684  Ratio: 1.60
decomp(read):    2389.7 us, 3347.6 MB/s   OK
Compression level: 4
comp(write):     5716.6 us, 1399.4 MB/s   Final bytes: 3990010  Ratio: 2.10
decomp(read):    2806.1 us, 2850.9 MB/s   OK
Compression level: 5
comp(write):     5746.6 us, 1392.1 MB/s   Final bytes: 3990010  Ratio: 2.10
decomp(read):    2786.3 us, 2871.2 MB/s   OK
Compression level: 6
comp(write):     6050.9 us, 1322.1 MB/s   Final bytes: 3339270  Ratio: 2.51
decomp(read):    2944.6 us, 2716.8 MB/s   OK
Compression level: 7
comp(write):     6181.5 us, 1294.2 MB/s   Final bytes: 3012514  Ratio: 2.78
decomp(read):    3119.4 us, 2564.6 MB/s   OK
Compression level: 8
comp(write):     6235.0 us, 1283.1 MB/s   Final bytes: 3012514  Ratio: 2.78
decomp(read):    3143.5 us, 2544.9 MB/s   OK
Compression level: 9
comp(write):     5757.8 us, 1389.4 MB/s   Final bytes: 2558737  Ratio: 3.28
decomp(read):    3115.5 us, 2567.8 MB/s   OK

Round-trip compr/decompr on 7.5 GB
Elapsed time:       8.1 s, 2097.7 MB/s

The output buffer size workaround I used in the aforementioned patch will become unnecessary in 0.12.6, which will add a set of functions that precisely define the minimum output buffer size for compression/decompression.
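
Usage will then look something like this (the function name here is purely illustrative; the final 0.12.6 API may name things differently):

#include <stdint.h>
#include <stdlib.h>

/* Illustrative only -- the actual 0.12.6 function names may differ. */
uint_fast64_t safe_size = density_compress_safe_size(input_size);
uint8_t *out = (uint8_t *) malloc((size_t) safe_size);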

@g1mv (Owner) commented Jun 20, 2015

Oh yeah, I forgot to mention: this was compiled and run against the latest dev branch version.

Overall, if I may add, I think you should test blosc against a real file instead of synthetic data. Your current method has the advantage of creating very precise entropy levels, but its drawback is that it does not represent anything real.

@FrancescAlted (Author)

Hmm, something is going wrong on my machine (Ubuntu 14.10 / clang 3.5):

$ bench/bench density single 4 8388608 8 32
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
List of supported compressors in this build: blosclz,lz4,lz4hc,snappy,zlib,density
Supported compression libraries:
  BloscLZ: 1.0.5
  LZ4: 1.7.0
  Snappy: 1.1.1
  Zlib: 1.2.8
  DENSITY: 0.12.6
Using compressor: density
Running suite: single
--> 4, 8388608, 8, 32, density
********************** Run info ******************************
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
Using synthetic data with 32 significant bits (out of 32)
Dataset size: 8388608 bytes     Type size: 8 bytes
Working set: 256.0 MB           Number of threads: 4
********************** Running benchmarks *********************
memcpy(write):           1875.6 us, 4265.2 MB/s
memcpy(read):            1351.2 us, 5920.8 MB/s
Compression level: 0
comp(write):     1312.5 us, 6095.1 MB/s   Final bytes: 8388624  Ratio: 1.00
decomp(read):    1438.6 us, 5561.0 MB/s   OK
Compression level: 1
comp(write):     55510.6 us, 144.1 MB/s   Final bytes: 5334032  Ratio: 1.57
decomp(read):     177.3 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 2
comp(write):     40168.2 us, 199.2 MB/s   Final bytes: 5334032  Ratio: 1.57
decomp(read):     170.1 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 3
comp(write):     39445.4 us, 202.8 MB/s   Final bytes: 5334032  Ratio: 1.57
decomp(read):     167.2 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 4
comp(write):     28557.8 us, 280.1 MB/s   Final bytes: 4895248  Ratio: 1.71
decomp(read):     157.2 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 5
comp(write):     21233.2 us, 376.8 MB/s   Final bytes: 4895248  Ratio: 1.71
decomp(read):     173.6 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 6
comp(write):     12465.3 us, 641.8 MB/s   Final bytes: 4675856  Ratio: 1.79
decomp(read):     177.4 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 7
comp(write):     8179.7 us, 978.0 MB/s    Final bytes: 4566672  Ratio: 1.84
decomp(read):     191.6 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 8
comp(write):     8064.7 us, 992.0 MB/s    Final bytes: 4566672  Ratio: 1.84
decomp(read):     166.2 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 9
comp(write):     6451.6 us, 1240.0 MB/s   Final bytes: 4511568  Ratio: 1.86
decomp(read):     205.7 us, -0.0 MB/s     FAILED.  Error code: -1
OK

Round-trip compr/decompr on 7.5 GB
Elapsed time:      21.9 s, 772.1 MB/s
faltet@francesc-Latitude-E6430:~/blosc/c-blosc-francesc/build$ ldd bench/bench 
        linux-vdso.so.1 =>  (0x00007ffe2ea55000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f0dfefe6000)
        libblosc.so.1 => /home/faltet/blosc/c-blosc-francesc/build/blosc/libblosc.so.1 (0x00007f0dfedc2000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f0dfeba4000)
        libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f0dfe98b000)
        libdensity.so => /usr/local/lib/libdensity.so (0x00007f0dfe46b000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f0dfe0a7000)
        libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f0dfdd98000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f0dfda91000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f0dfd87a000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f0dff21b000)
        libspookyhash.so => /usr/local/lib/libspookyhash.so (0x00007f0dfd676000)

The above is with the dev branch. With master:

$ bench/bench density single 4 8388608 8 32
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
List of supported compressors in this build: blosclz,lz4,lz4hc,snappy,zlib,density
Supported compression libraries:
  BloscLZ: 1.0.5
  LZ4: 1.7.0
  Snappy: 1.1.1
  Zlib: 1.2.8
  DENSITY: 0.12.5
Using compressor: density
Running suite: single
--> 4, 8388608, 8, 32, density
********************** Run info ******************************
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
Using synthetic data with 32 significant bits (out of 32)
Dataset size: 8388608 bytes     Type size: 8 bytes
Working set: 256.0 MB           Number of threads: 4
********************** Running benchmarks *********************
memcpy(write):           1786.6 us, 4477.7 MB/s
memcpy(read):            1331.7 us, 6007.2 MB/s
Compression level: 0
comp(write):     1306.5 us, 6123.3 MB/s   Final bytes: 8388624  Ratio: 1.00
decomp(read):    1400.4 us, 5712.8 MB/s   OK
Compression level: 1
comp(write):     54855.8 us, 145.8 MB/s   Final bytes: 5334032  Ratio: 1.57
decomp(read):     180.6 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 2
comp(write):     39616.5 us, 201.9 MB/s   Final bytes: 5334032  Ratio: 1.57
decomp(read):     150.2 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 3
comp(write):     41280.2 us, 193.8 MB/s   Final bytes: 5334032  Ratio: 1.57
decomp(read):     146.5 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 4
comp(write):     28674.1 us, 279.0 MB/s   Final bytes: 4895248  Ratio: 1.71
decomp(read):     160.3 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 5
comp(write):     21312.7 us, 375.4 MB/s   Final bytes: 4895248  Ratio: 1.71
decomp(read):     163.8 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 6
comp(write):     12716.5 us, 629.1 MB/s   Final bytes: 4675856  Ratio: 1.79
decomp(read):     183.4 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 7
comp(write):     8138.4 us, 983.0 MB/s    Final bytes: 4566672  Ratio: 1.84
decomp(read):     187.6 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 8
comp(write):     8028.2 us, 996.5 MB/s    Final bytes: 4566672  Ratio: 1.84
decomp(read):     188.3 us, -0.0 MB/s     FAILED.  Error code: -1
OK
Compression level: 9
comp(write):     6376.3 us, 1254.7 MB/s   Final bytes: 4511568  Ratio: 1.86
decomp(read):     183.5 us, -0.0 MB/s     FAILED.  Error code: -1
OK

Round-trip compr/decompr on 7.5 GB
Elapsed time:      22.0 s, 769.6 MB/s

So that's not any better.

@g1mv (Owner) commented Jun 21, 2015

Did you try to apply the patch I provided for c-blosc?
I multiplied the blocksize by 8, added a generous ("bogus") size for the output buffer, and added a switch/case to select the various algorithms.
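
The output-buffer part was essentially over-allocation; in spirit it looks like this (my reconstruction from the description, not the literal gist):

#include <stdint.h>
#include <stdlib.h>

/* Reconstruction of the over-allocation workaround, not the literal
   patch from the gist: until density exposes a safe-size helper, give
   the compressor generous headroom so it cannot overrun the output. */
static uint8_t *alloc_density_out(size_t input_size, size_t *out_capacity)
{
    /* "bogus" margin: 1.5x the input plus fixed slack for headers */
    *out_capacity = input_size + (input_size >> 1) + 512;
    return (uint8_t *) malloc(*out_capacity);
}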

@FrancescAlted (Author)

Ah, nope. I applied (part of) it here: FrancescAlted/c-blosc@f505fd8. With this, I am not getting segfaults anymore:

$ bench/bench density single 4
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
List of supported compressors in this build: blosclz,lz4,lz4hc,snappy,zlib,density
Supported compression libraries:
  BloscLZ: 1.0.5
  LZ4: 1.7.0
  Snappy: 1.1.1
  Zlib: 1.2.8
  DENSITY: 0.12.5
Using compressor: density
Running suite: single
--> 4, 2097152, 8, 19, density
********************** Run info ******************************
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
Using synthetic data with 19 significant bits (out of 32)
Dataset size: 2097152 bytes     Type size: 8 bytes
Working set: 256.0 MB           Number of threads: 4
********************** Running benchmarks *********************
memcpy(write):            504.7 us, 3962.4 MB/s
memcpy(read):             242.4 us, 8250.7 MB/s
Compression level: 0
comp(write):      277.7 us, 7203.2 MB/s   Final bytes: 2097168  Ratio: 1.00
decomp(read):     217.5 us, 9194.7 MB/s   OK
Compression level: 1
comp(write):     2857.1 us, 700.0 MB/s    Final bytes: 1125520  Ratio: 1.86
decomp(read):    2587.3 us, 773.0 MB/s    OK
Compression level: 2
comp(write):     2846.0 us, 702.7 MB/s    Final bytes: 1125520  Ratio: 1.86
decomp(read):    2657.4 us, 752.6 MB/s    OK
Compression level: 3
comp(write):     2844.4 us, 703.1 MB/s    Final bytes: 1125520  Ratio: 1.86
decomp(read):    2668.1 us, 749.6 MB/s    OK
Compression level: 4
comp(write):     2073.3 us, 964.6 MB/s    Final bytes: 1119824  Ratio: 1.87
decomp(read):    1901.5 us, 1051.8 MB/s   OK
Compression level: 5
comp(write):     2081.0 us, 961.1 MB/s    Final bytes: 1119824  Ratio: 1.87
decomp(read):    1905.1 us, 1049.8 MB/s   OK
Compression level: 6
comp(write):     3007.8 us, 664.9 MB/s    Final bytes: 508336  Ratio: 4.13
decomp(read):    3583.5 us, 558.1 MB/s    OK
Compression level: 7
comp(write):     2442.5 us, 818.8 MB/s    Final bytes: 506016  Ratio: 4.14
decomp(read):    2812.5 us, 711.1 MB/s    OK
Compression level: 8
comp(write):     2366.7 us, 845.0 MB/s    Final bytes: 506016  Ratio: 4.14
decomp(read):    2819.0 us, 709.5 MB/s    OK
Compression level: 9
comp(write):     4928.5 us, 405.8 MB/s    Final bytes: 207086  Ratio: 10.13
decomp(read):    5828.6 us, 343.1 MB/s    OK

Round-trip compr/decompr on 7.5 GB
Elapsed time:      20.5 s, 822.2 MB/s

BTW, I am not changing the block size in the benchmark because the current one (2 MB) is already a bit large for chunked datasets (for a hint on why small data chunks are important to us, see http://bcolz.blosc.org/).

Curiously enough, density works best without threading:

$ bench/bench density single 1   # use a single thread
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
List of supported compressors in this build: blosclz,lz4,lz4hc,snappy,zlib,density
Supported compression libraries:
  BloscLZ: 1.0.5
  LZ4: 1.7.0
  Snappy: 1.1.1
  Zlib: 1.2.8
  DENSITY: 0.12.5
Using compressor: density
Running suite: single
--> 1, 2097152, 8, 19, density
********************** Run info ******************************
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
Using synthetic data with 19 significant bits (out of 32)
Dataset size: 2097152 bytes     Type size: 8 bytes
Working set: 256.0 MB           Number of threads: 1
********************** Running benchmarks *********************
memcpy(write):            513.8 us, 3892.3 MB/s
memcpy(read):             251.5 us, 7953.0 MB/s
Compression level: 0
comp(write):      292.3 us, 6841.5 MB/s   Final bytes: 2097168  Ratio: 1.00
decomp(read):     267.0 us, 7491.4 MB/s   OK
Compression level: 1
comp(write):     1974.5 us, 1012.9 MB/s   Final bytes: 1125520  Ratio: 1.86
decomp(read):    1492.9 us, 1339.7 MB/s   OK
Compression level: 2
comp(write):     1902.8 us, 1051.1 MB/s   Final bytes: 1125520  Ratio: 1.86
decomp(read):    1507.6 us, 1326.6 MB/s   OK
Compression level: 3
comp(write):     1918.5 us, 1042.5 MB/s   Final bytes: 1125520  Ratio: 1.86
decomp(read):    1483.9 us, 1347.8 MB/s   OK
Compression level: 4
comp(write):     1709.0 us, 1170.2 MB/s   Final bytes: 1119824  Ratio: 1.87
decomp(read):    1265.1 us, 1580.9 MB/s   OK
Compression level: 5
comp(write):     1706.0 us, 1172.3 MB/s   Final bytes: 1119824  Ratio: 1.87
decomp(read):    1271.0 us, 1573.6 MB/s   OK
Compression level: 6
comp(write):     2314.7 us, 864.0 MB/s    Final bytes: 508336  Ratio: 4.13
decomp(read):    2700.3 us, 740.7 MB/s    OK
Compression level: 7
comp(write):     2402.9 us, 832.3 MB/s    Final bytes: 506016  Ratio: 4.14
decomp(read):    2859.0 us, 699.5 MB/s    OK
Compression level: 8
comp(write):     2443.8 us, 818.4 MB/s    Final bytes: 506016  Ratio: 4.14
decomp(read):    2844.4 us, 703.1 MB/s    OK
Compression level: 9
comp(write):     4945.3 us, 404.4 MB/s    Final bytes: 207086  Ratio: 10.13
decomp(read):    5818.2 us, 343.8 MB/s    OK

Round-trip compr/decompr on 7.5 GB
Elapsed time:      16.9 s, 1001.4 MB/s

Not sure exactly why.

@FrancescAlted (Author)

Regarding your suggestion of testing Blosc on actual data: the gist of it is to work as a compressor for binary data, where zero bytes are by far the most common. Also, the whole point of using the shuffle filter is to increase the probability of finding runs of zeroed bytes in the buffers.
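
For a concrete picture, take four little-endian int32 values 1, 2, 3, 4 (a made-up example):

before shuffle: 01 00 00 00  02 00 00 00  03 00 00 00  04 00 00 00
after shuffle:  01 02 03 04  00 00 00 00  00 00 00 00  00 00 00 00

The shuffled form contains a 12-byte run of zeros that any fast codec compresses trivially.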

The fact is that Blosc works pretty well in practice, as you can see for example in: https://www.youtube.com/watch?v=TZdqeEd7iTM or https://www.youtube.com/watch?v=kLP83HZvbfQ

@g1mv (Owner) commented Jun 21, 2015

That is very strange with regard to threading. On my test platform (Core i7, OS/X), here is what I get:

1 thread

$ bench/bench density single 1
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
List of supported compressors in this build: blosclz,lz4,lz4hc,snappy,zlib,density
Supported compression libraries:
  BloscLZ: 1.0.5
  LZ4: 1.7.0
  Snappy: unknown
  Zlib: 1.2.5
  DENSITY: 0.12.6
Using compressor: density
Running suite: single
--> 1, 8388608, 8, 32, density
********************** Run info ******************************
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
Using synthetic data with 32 significant bits (out of 32)
Dataset size: 8388608 bytes Type size: 8 bytes
Working set: 256.0 MB       Number of threads: 1
********************** Running benchmarks *********************
memcpy(write):       2366.5 us, 3380.5 MB/s
memcpy(read):        1228.9 us, 6509.6 MB/s
Compression level: 0
comp(write):     1268.7 us, 6305.8 MB/s   Final bytes: 8388624  Ratio: 1.00
decomp(read):    1374.2 us, 5821.7 MB/s   OK
Compression level: 1
comp(write):     8289.4 us, 965.1 MB/s    Final bytes: 4566672  Ratio: 1.84
decomp(read):    6334.8 us, 1262.9 MB/s   OK
Compression level: 2
comp(write):     8155.4 us, 980.9 MB/s    Final bytes: 4566672  Ratio: 1.84
decomp(read):    6509.8 us, 1228.9 MB/s   OK
Compression level: 3
comp(write):     8433.1 us, 948.6 MB/s    Final bytes: 4566672  Ratio: 1.84
decomp(read):    6459.7 us, 1238.4 MB/s   OK
Compression level: 4
comp(write):     6900.0 us, 1159.4 MB/s   Final bytes: 4511568  Ratio: 1.86
decomp(read):    4903.2 us, 1631.6 MB/s   OK
Compression level: 5
comp(write):     6945.7 us, 1151.8 MB/s   Final bytes: 4511568  Ratio: 1.86
decomp(read):    4941.9 us, 1618.8 MB/s   OK
Compression level: 6
comp(write):     8646.8 us, 925.2 MB/s    Final bytes: 3622608  Ratio: 2.32
decomp(read):    9722.9 us, 822.8 MB/s    OK
Compression level: 7
comp(write):     7820.2 us, 1023.0 MB/s   Final bytes: 3601120  Ratio: 2.33
decomp(read):    8835.1 us, 905.5 MB/s    OK
Compression level: 8
comp(write):     7845.3 us, 1019.7 MB/s   Final bytes: 3601120  Ratio: 2.33
decomp(read):    8817.7 us, 907.3 MB/s    OK
Compression level: 9
comp(write):     21697.2 us, 368.7 MB/s   Final bytes: 1887328  Ratio: 4.44
decomp(read):    23950.2 us, 334.0 MB/s   OK

Round-trip compr/decompr on 7.5 GB
Elapsed time:      16.5 s, 1022.6 MB/s

2 threads

$ bench/bench density single 2
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
List of supported compressors in this build: blosclz,lz4,lz4hc,snappy,zlib,density
Supported compression libraries:
  BloscLZ: 1.0.5
  LZ4: 1.7.0
  Snappy: unknown
  Zlib: 1.2.5
  DENSITY: 0.12.6
Using compressor: density
Running suite: single
--> 2, 8388608, 8, 32, density
********************** Run info ******************************
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
Using synthetic data with 32 significant bits (out of 32)
Dataset size: 8388608 bytes Type size: 8 bytes
Working set: 256.0 MB       Number of threads: 2
********************** Running benchmarks *********************
memcpy(write):       2292.8 us, 3489.3 MB/s
memcpy(read):        1232.9 us, 6488.8 MB/s
Compression level: 0
comp(write):     1088.8 us, 7347.3 MB/s   Final bytes: 8388624  Ratio: 1.00
decomp(read):    1307.0 us, 6120.7 MB/s   OK
Compression level: 1
comp(write):     4619.7 us, 1731.7 MB/s   Final bytes: 4566672  Ratio: 1.84
decomp(read):    3784.3 us, 2114.0 MB/s   OK
Compression level: 2
comp(write):     4642.2 us, 1723.3 MB/s   Final bytes: 4566672  Ratio: 1.84
decomp(read):    3688.3 us, 2169.0 MB/s   OK
Compression level: 3
comp(write):     4585.2 us, 1744.7 MB/s   Final bytes: 4566672  Ratio: 1.84
decomp(read):    3743.4 us, 2137.1 MB/s   OK
Compression level: 4
comp(write):     3968.9 us, 2015.7 MB/s   Final bytes: 4511568  Ratio: 1.86
decomp(read):    2929.8 us, 2730.5 MB/s   OK
Compression level: 5
comp(write):     3946.0 us, 2027.4 MB/s   Final bytes: 4511568  Ratio: 1.86
decomp(read):    2964.6 us, 2698.5 MB/s   OK
Compression level: 6
comp(write):     5236.9 us, 1527.6 MB/s   Final bytes: 3622608  Ratio: 2.32
decomp(read):    5659.9 us, 1413.5 MB/s   OK
Compression level: 7
comp(write):     6199.0 us, 1290.5 MB/s   Final bytes: 3601120  Ratio: 2.33
decomp(read):    6393.8 us, 1251.2 MB/s   OK
Compression level: 8
comp(write):     6170.7 us, 1296.4 MB/s   Final bytes: 3601120  Ratio: 2.33
decomp(read):    6286.6 us, 1272.5 MB/s   OK
Compression level: 9
comp(write):     10581.0 us, 756.1 MB/s   Final bytes: 1887328  Ratio: 4.44
decomp(read):    11585.6 us, 690.5 MB/s   OK

Round-trip compr/decompr on 7.5 GB
Elapsed time:       9.9 s, 1699.6 MB/s

4 threads

$ bench/bench density single 4
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
List of supported compressors in this build: blosclz,lz4,lz4hc,snappy,zlib,density
Supported compression libraries:
  BloscLZ: 1.0.5
  LZ4: 1.7.0
  Snappy: unknown
  Zlib: 1.2.5
  DENSITY: 0.12.6
Using compressor: density
Running suite: single
--> 4, 8388608, 8, 32, density
********************** Run info ******************************
Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
Using synthetic data with 32 significant bits (out of 32)
Dataset size: 8388608 bytes Type size: 8 bytes
Working set: 256.0 MB       Number of threads: 4
********************** Running benchmarks *********************
memcpy(write):       2379.6 us, 3362.0 MB/s
memcpy(read):        1199.0 us, 6672.4 MB/s
Compression level: 0
comp(write):     1090.6 us, 7335.2 MB/s   Final bytes: 8388624  Ratio: 1.00
decomp(read):    1305.6 us, 6127.5 MB/s   OK
Compression level: 1
comp(write):     2906.1 us, 2752.9 MB/s   Final bytes: 4566672  Ratio: 1.84
decomp(read):    2453.8 us, 3260.3 MB/s   OK
Compression level: 2
comp(write):     2772.4 us, 2885.6 MB/s   Final bytes: 4566672  Ratio: 1.84
decomp(read):    2427.6 us, 3295.4 MB/s   OK
Compression level: 3
comp(write):     2786.6 us, 2870.9 MB/s   Final bytes: 4566672  Ratio: 1.84
decomp(read):    2404.4 us, 3327.3 MB/s   OK
Compression level: 4
comp(write):     2714.1 us, 2947.5 MB/s   Final bytes: 4511568  Ratio: 1.86
decomp(read):    2168.6 us, 3689.0 MB/s   OK
Compression level: 5
comp(write):     2717.3 us, 2944.1 MB/s   Final bytes: 4511568  Ratio: 1.86
decomp(read):    2152.0 us, 3717.5 MB/s   OK
Compression level: 6
comp(write):     4490.2 us, 1781.7 MB/s   Final bytes: 3622608  Ratio: 2.32
decomp(read):    4443.0 us, 1800.6 MB/s   OK
Compression level: 7
comp(write):     4247.7 us, 1883.4 MB/s   Final bytes: 3601120  Ratio: 2.33
decomp(read):    4253.4 us, 1880.9 MB/s   OK
Compression level: 8
comp(write):     4250.4 us, 1882.2 MB/s   Final bytes: 3601120  Ratio: 2.33
decomp(read):    4271.5 us, 1872.9 MB/s   OK
Compression level: 9
comp(write):     11015.6 us, 726.2 MB/s   Final bytes: 1887328  Ratio: 4.44
decomp(read):    12085.9 us, 661.9 MB/s   OK

Round-trip compr/decompr on 7.5 GB
Elapsed time:       7.8 s, 2166.7 MB/s

So threading visibly improves things, except maybe for the 4-thread Lion run versus the 2-thread one.

@g1mv (Owner) commented Jun 21, 2015

But after further comparisons, yes, you're right: it seems that snappy, for example, scales better with multithreading (it goes from 25.5 s with 1 thread to 8.2 s with 4 threads, which is about 3x faster).

BTW, there is a slight overhead to setting up a buffer in density, as buffer initialization involves some mallocs; that's why I had increased the blocksize, and maybe that's the reason heavy multithreading is not helping much with small block sizes (the small overhead in setting up compression is probably what actually limits scalability).
I'll look into this further when I get more time; there might be a way to avoid all that overhead by slightly modifying the API, and I think it could be worth it in use cases like yours, so thanks for pointing it out 😄
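
The usual remedy would be to hoist the allocation out of the per-block path, e.g. with a per-thread scratch buffer that is set up once and reused (a generic sketch, not the density API):

#include <stdlib.h>

/* Generic sketch of amortizing per-block setup cost -- not the density
   API.  Each worker thread owns one scratch buffer, allocated once and
   reused for every block it processes. */
typedef struct {
    unsigned char *scratch;
    size_t capacity;
} worker_ctx;

static int worker_ctx_init(worker_ctx *ctx, size_t max_block)
{
    ctx->scratch = (unsigned char *) malloc(max_block);
    ctx->capacity = ctx->scratch ? max_block : 0;
    return ctx->scratch ? 0 : -1;
}

static void worker_ctx_destroy(worker_ctx *ctx)
{
    free(ctx->scratch);
    ctx->scratch = NULL;
    ctx->capacity = 0;
}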

In regards to blosc and binary data, yes, I understand what you are trying to do! The only problem with random data is that you deny any obvious "patterns" in the non-zero data, patterns which inevitably appear when manipulating "human" data.
Since the function you're using is perfectly random, on one side you'll get a predictable number of zeroes in unpredictable order, but on the other side all non-zero data will be essentially pattern-free, which is not very realistic.
What I mean is that blosc could be very good with this synthetic data - I'm sure that's the case - but perform worse with real data, as it might "break" some non-zero patterns by "splitting" them, which could lead to a compression ratio downgrade.
For example, let's say you want to compress:
ABCDEABCDEABCDEABCDEABCD (24 symbols)
A good compression algorithm will spot a pattern and go:
ABCDEABCDEABCDEABCDEABCD => that's 4 x ABCDE and 1 x ABCD => easy and efficient compression.
However, if you split it into 3 blocks of 8 (blosc processing) you get: ABCDEABC DEABCDEA BCDEABCD
Now each individual block doesn't exhibit any obvious pattern, and the same compression algorithm will generate very poor results.

@FrancescAlted (Author)

Yes, the malloc calls inside density could be the root of the poor threading scalability. Thanks for being willing to tackle this.

Blosc does not shuffle using 8-byte blocks by default, but rather uses the size of the datatype being compressed (2 for short int, 4 for int and float32, 8 for long int and float64, and other sizes for structs too). Using the datatype size is critical, for the reasons explained in the talks above.

Regarding real data, you may want to have a look at this notebook:

http://nbviewer.ipython.org/github/Blosc/movielens-bench/blob/master/querying-ep14.ipynb

where real data is being used and where you can see that the compression ratio can reach 20x in this case. Also, it can be seen that some operations take less time (on a decent modern computer) on compressed datasets than on uncompressed ones.

@g1mv g1mv self-assigned this Jun 22, 2015
@g1mv g1mv modified the milestone: 0.13.0 beta Jun 29, 2015
@g1mv (Owner) commented Jan 16, 2018

Needs retesting with 0.14.0

@g1mv g1mv closed this as completed Jan 16, 2018