unique process of compression #20

mounamouna · 2015-05-08T22:30:45Z

Hi,
I succeeded in building the FastPFor package on my machine and compiling example.cpp. Then I changed the integers in the data vector:
std::vector<uint32_t> mydata(N);
mydata[0] = 4294967295;
mydata[1] = 4294967295;
I display the compressed data and the decompressed data:
std::cout<<"Compressed data " << compressed_output.data()<<std::endl;
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
codec.decodeArray(compressed_output.data(),
compressed_output.size(), mydataback.data(), recoveredsize);
std::cout<<"Decompressed data 1 " <<mydataback.data()[0]<<std::endl;
std::cout<<"Decompressed data 2 " <<mydataback.data()[1]<<std::endl;
The result of the first execution is:
Compressed data 0x1b99a80
You are using 0.109 bits per integer.
Decompressed data 1 4294967295
Decompressed data 2 4294967295
////////////////////////////////////////////////////////////////////////////////
In the second execution:
Compressed data 0xd0da80
You are using 0.109 bits per integer.
Decompressed data 1 4294967295
Decompressed data 2 4294967295
---> I obtain different compressed data. Perhaps this is the address of the compressed data, so I added
std::cout<<"Compressed data " << compressed_output.data()[0]<<std::endl;
but the compressed data stays the same when I change mydata[0].
How can I distinguish between two compression runs? Is the compressed data unique for each input? What information makes a compression result unique and unchangeable?

Thanks in advance.
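
A note on the output above: streaming `compressed_output.data()` hands a `uint32_t*` to `operator<<`, which prints the pointer value (an address) rather than the stored words, and that address can change from run to run. A minimal sketch of printing the words themselves follows; `formatWords` is a hypothetical helper written for this illustration and is not part of FastPFor.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of FastPFor): formats up to `count` words
// of a buffer as space-separated decimal numbers.
std::string formatWords(const std::vector<std::uint32_t>& words, std::size_t count) {
    std::ostringstream out;
    for (std::size_t i = 0; i < count && i < words.size(); ++i) {
        if (i > 0) {
            out << ' ';
        }
        out << words[i];  // prints the value, not the address
    }
    return out.str();
}
```

It could be used as `std::cout << formatWords(compressed_output, compressed_output.size())`, assuming the vector has already been resized to the compressed length, as the `compressed_output.size()` argument passed to `decodeArray` above suggests.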

lemire · 2015-05-08T23:53:42Z

Can you provide a test case? (source code)

I do not understand the issue you are reporting.

mounamouna · 2015-05-09T00:27:00Z

I want to compress two IPv4 addresses, so I define a vector containing two
32-bit integers. After that, I need to know whether the compressed data fits
in at most 32 bits, so I need to display the compressed data in decimal
form. The latter would be treated as another address (written in at most
32 bits). Is that possible? Can this compression process turn two addresses
into one? I can represent the original addresses differently (a vector
containing eight 8-bit integers); my goal is to compress the information
into a word of at most 32 bits.


lemire · 2015-05-09T00:42:25Z

@mounamouna

This library is ill-suited for the purpose you describe. Though you can certainly encode an array containing two 32-bit integers, it is unlikely that the result will be a single 32-bit integer in general.
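
The counting argument behind this can be made concrete. There are 2^64 distinct pairs of 32-bit integers but only 2^32 distinct 32-bit values, so no lossless scheme can map every pair to a single 32-bit word. Two arbitrary 32-bit values do always fit in one 64-bit word, as in the sketch below. The names `packPair` and `unpackPair` are hypothetical, chosen for this illustration, and have nothing to do with the FastPFor API.

```cpp
#include <cstdint>
#include <utility>

// Packs two 32-bit values (for example two IPv4 addresses) into one
// 64-bit word: `a` goes in the high half, `b` in the low half.
std::uint64_t packPair(std::uint32_t a, std::uint32_t b) {
    return (static_cast<std::uint64_t>(a) << 32) | b;
}

// Recovers the original pair from the packed 64-bit word.
std::pair<std::uint32_t, std::uint32_t> unpackPair(std::uint64_t packed) {
    return {static_cast<std::uint32_t>(packed >> 32),
            static_cast<std::uint32_t>(packed & 0xFFFFFFFFu)};
}
```

Every pair round-trips through `packPair` and `unpackPair`, and distinct pairs give distinct 64-bit words. Getting below 64 bits for all inputs would require some pairs to share an output.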

This library is meant for computing arrays containing many integers. Please see the example:

https://github.com/lemire/FastPFor/blob/master/example.cpp

I am closing this issue as invalid.

If you do find a bug, please provide a reproducible test case.

mounamouna · 2015-05-09T01:17:41Z

Please, Sir, just a final question: does "computing arrays containing many
integers" mean that we are able to compute each integer in the initial vector
from the compressed data? So the compressed data is a vector containing
integers smaller than the integers in the initial vector? Is that right?

Thanks in advance Sir.


lemire · 2015-05-09T01:33:06Z

that means we are able to compute each integer in the initial vector
through the compressed data?

Of course.

so the compressed data is a vector containing integers smaller than the integers in the initial vector?

The goal of the library is to have fewer integers in the compressed vector. Yes.
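
To illustrate what lossless recovery means, here is a minimal sketch of a simple variable-byte scheme. It is chosen only because it is short. It is not the codec used in example.cpp, it works on bytes rather than 32-bit words, and the function names are made up for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Variable-byte encoding: each byte carries 7 payload bits, least
// significant group first; a set high bit means "more bytes follow".
std::vector<std::uint8_t> vbyteEncode(const std::vector<std::uint32_t>& values) {
    std::vector<std::uint8_t> out;
    for (std::uint32_t v : values) {
        while (v >= 0x80u) {
            out.push_back(static_cast<std::uint8_t>((v & 0x7Fu) | 0x80u));
            v >>= 7;
        }
        out.push_back(static_cast<std::uint8_t>(v));  // last byte, high bit clear
    }
    return out;
}

// Inverse of vbyteEncode: rebuilds every original integer exactly.
std::vector<std::uint32_t> vbyteDecode(const std::vector<std::uint8_t>& bytes) {
    std::vector<std::uint32_t> values;
    std::uint32_t current = 0;
    int shift = 0;
    for (std::uint8_t b : bytes) {
        current |= static_cast<std::uint32_t>(b & 0x7Fu) << shift;
        if (b & 0x80u) {
            shift += 7;  // continuation byte: keep accumulating
        } else {
            values.push_back(current);
            current = 0;
            shift = 0;
        }
    }
    return values;
}
```

With this toy scheme, four small values such as 1, 2, 3, 100 take 4 bytes instead of 16, while two values equal to 4294967295 take 10 bytes instead of 8. The output only shrinks when the input has structure the scheme can exploit.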

mounamouna · 2015-05-09T01:48:02Z

It isn't logical to have the same compressed vector for two different
initial vectors, is it? I tested with two different initial vectors
(a,b) and (a1,b), but the compressed vector compressed_output.data()[] is
the same. Is it a bug?


lemire · 2015-05-09T01:51:45Z

Yes it is a bug. It is most likely a bug in your code.
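
A lossless codec cannot map two different inputs to the same output, so two compressed buffers should be compared in full rather than by their first word or two. One possibility, which this thread does not confirm, is that the leading words hold metadata that does not depend on individual values. The sketch below uses a made-up layout (a count, an arbitrary tag, then the raw values) purely to show the effect. It is not FastPFor's format, and `toyEncode` and `firstDifference` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy "codec" for illustration only: word 0 is the element count,
// word 1 is an arbitrary format tag, and the values follow unchanged.
std::vector<std::uint32_t> toyEncode(const std::vector<std::uint32_t>& values) {
    std::vector<std::uint32_t> out;
    out.push_back(static_cast<std::uint32_t>(values.size()));
    out.push_back(0xC0DEu);  // arbitrary tag, identical for every input
    out.insert(out.end(), values.begin(), values.end());
    return out;
}

// Returns the index of the first word where the buffers differ,
// or -1 if they are identical in both length and content.
std::ptrdiff_t firstDifference(const std::vector<std::uint32_t>& a,
                               const std::vector<std::uint32_t>& b) {
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (a[i] != b[i]) {
            return static_cast<std::ptrdiff_t>(i);
        }
    }
    if (a.size() != b.size()) {
        return static_cast<std::ptrdiff_t>(n);
    }
    return -1;
}
```

In this toy layout, inputs of the same length always agree on words 0 and 1, and any difference shows up only from word 2 onward. Running a comparison like `firstDifference` over the two real compressed buffers would show whether they truly match.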

mounamouna · 2015-05-09T02:14:08Z

mouna@ubuntu:~/newtmp/FastPFor$ ./example
Compressed data 19984
Compressed data 23
You are using 0.109 bits per integer.
Decompressed data 1 4294967295
Decompressed data 2 4294967295
mouna@ubuntu:~/newtmp/FastPFor$ make example
[ 85%] Built target FastPFor
Scanning dependencies of target example
[100%] Building CXX object CMakeFiles/example.dir/example.cpp.o
Linking CXX executable example
[100%] Built target example
mouna@ubuntu:~/newtmp/FastPFor$ ./example
Compressed data 19984
Compressed data 23
You are using 0.109 bits per integer.
Decompressed data 1 4967295
Decompressed data 2 4294967295

What is the problem? I used 32-bit integers and changed the first integer,
but the compressed vector is still the same.


mounamouna · 2015-05-09T02:15:53Z

mouna@ubuntu:~/newtmp/FastPFor$ ./example
Compressed data 19984
Compressed data 23
You are using 0.109 bits per integer.
Decompressed data 1 4294967295
Decompressed data 2 4294967295
mouna@ubuntu:~/newtmp/FastPFor$ make example
[ 85%] Built target FastPFor
Scanning dependencies of target example
[100%] Building CXX object CMakeFiles/example.dir/example.cpp.o
Linking CXX executable example
[100%] Built target example
mouna@ubuntu:~/newtmp/FastPFor$ ./example
Compressed data 19984
Compressed data 23
You are using 0.109 bits per integer.
Decompressed data 1 4967295
Decompressed data 2 4294967295

What is the problem? I used 32-bit integers and changed the first one, but
the compressed vector is still (19984, 23).


lemire · 2015-05-09T02:30:30Z

If you think you have found a bug, please submit a test case.

mounamouna · 2015-05-09T02:37:18Z

test case
first:
mydata[0] = 4294967295;
mydata[1] = 4294967295;
second:
mydata[0] = 4967295;
mydata[1] = 4294967295;


lemire · 2015-05-09T03:09:53Z

The size of the compressed vector is most certainly more than two words. Probably four words. That is, the "compressed" vector is probably larger than the input vector.

These arrays you provide are not compressible using this library. They are too short.
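
The arithmetic can be sketched as follows, assuming the "bits per integer" figure is defined in the usual way: 32 times the number of compressed 32-bit words, divided by the number of input integers. The function names are hypothetical, and the four-word size is only the estimate given above, not a measured value.

```cpp
#include <cassert>
#include <cstddef>

// Average number of bits spent per input integer, given the compressed
// size in 32-bit words. Values above 32 mean the output grew.
double bitsPerInteger(std::size_t compressedWords, std::size_t inputCount) {
    assert(inputCount > 0);
    return 32.0 * static_cast<double>(compressedWords) /
           static_cast<double>(inputCount);
}

// Compression only pays off when the output has fewer words than the input.
bool isWorthCompressing(std::size_t compressedWords, std::size_t inputCount) {
    return compressedWords < inputCount;
}
```

For a two-integer array stored in four words, `bitsPerInteger(4, 2)` is 64, twice the 32 bits per integer of the uncompressed data. Any fixed per-array overhead dominates when the array is that short.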

Please read the papers, study carefully the code and the examples.

Daniel Lemire and Leonid Boytsov, Decoding billions of integers per second through vectorization, Software Practice & Experience 45 (1), 2015. http://arxiv.org/abs/1209.2137 http://onlinelibrary.wiley.com/doi/10.1002/spe.2203/abstract
Daniel Lemire, Leonid Boytsov, Nathan Kurz, SIMD Compression and the Intersection of Sorted Integers, Software Practice & Experience (to appear) http://arxiv.org/abs/1401.6399
Jeff Plaisance, Nathan Kurz, Daniel Lemire, Vectorized VByte Decoding, International Symposium on Web Algorithms 2015, 2015. http://arxiv.org/abs/1503.07387
Wayne Xin Zhao, Xudong Zhang, Daniel Lemire, Dongdong Shan, Jian-Yun Nie, Hongfei Yan, Ji-Rong Wen, A General SIMD-based Approach to Accelerating Compression Algorithms, ACM Transactions on Information Systems 33 (3), 2015. http://arxiv.org/abs/1502.01916

I am not going to be able to help you further.

mounamouna · 2015-05-09T09:18:16Z

Thank you Sir.


lemire closed this as completed May 9, 2015
