Preparing new release
lemire committed Feb 15, 2014
1 parent 60351c5 commit 629577a
Showing 15 changed files with 193 additions and 27 deletions.
4 changes: 4 additions & 0 deletions CHANGELOG
@@ -1,3 +1,7 @@
0.0.11 (Feb. 14th 2014)
- Fix rare bug in FastPFOR (reported by Stefan Ackermann (https://github.com/Stivo))
- Improved API documentation

0.0.10 (Jan. 25th 2014)
- cleaning the code and improving the documentation


21 changes: 21 additions & 0 deletions src/main/java/me/lemire/integercompression/BinaryPacking.java
@@ -8,9 +8,30 @@

/**
* Scheme based on a commonly used idea: can be extremely fast.
* It encodes integers in blocks of 128 integers. For arrays containing
* an arbitrary number of integers, you should use it in conjunction
* with another CODEC:
*
* <pre>IntegerCODEC ic =
* new Composition(new BinaryPacking(), new VariableByte()).</pre>
*
* Note that this does not use differential coding: if you are working on sorted
* lists, use IntegratedBinaryPacking instead.
*
* <p>
* For details, please see
* </p>
* <p>
* Daniel Lemire and Leonid Boytsov, Decoding billions of integers per second
* through vectorization Software: Practice &amp; Experience
* http://onlinelibrary.wiley.com/doi/10.1002/spe.2203/abstract
* http://arxiv.org/abs/1209.2137
* </p>
* <p>
* Daniel Lemire, Leonid Boytsov, Nathan Kurz,
* SIMD Compression and the Intersection of Sorted Integers
* http://arxiv.org/abs/1401.6399
* </p>
*
* @author Daniel Lemire
*/
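The Javadoc above recommends pairing a 128-integer block codec with a second codec for arrays of arbitrary length. A minimal sketch of how such a pairing could divide the input, assuming the block codec takes the largest leading multiple of 128 and the fallback takes whatever is left. `BlockSplit` and its methods are hypothetical names for illustration, not part of the library:

```java
final class BlockSplit {
    // Block size stated in the Javadoc above.
    static final int BLOCK_SIZE = 128;

    /** Number of leading values a 128-integer block codec could consume. */
    static int blockPart(int length) {
        return length - (length % BLOCK_SIZE);
    }

    /** Number of trailing values left over for the fallback codec. */
    static int remainderPart(int length) {
        return length % BLOCK_SIZE;
    }

    public static void main(String[] args) {
        int n = 1000;
        System.out.println(blockPart(n) + " values for the block codec");
        System.out.println(remainderPart(n) + " values for the fallback codec");
    }
}
```

With 1000 input values, 896 (seven full blocks) would go to the block codec and the remaining 104 to the fallback.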
@@ -8,6 +8,13 @@
/**
* BinaryPacking with Delta+Zigzag Encoding.
*
* It encodes integers in blocks of 128 integers. For arrays containing
* an arbitrary number of integers, you should use it in conjunction
* with another CODEC:
*
* <pre>IntegerCODEC ic = new Composition(new DeltaZigzagBinaryPacking(),
* new DeltaZigzagVariableByte()).</pre>
*
* @author MURAOKA Taro http://github.com/koron
*/
public final class DeltaZigzagBinaryPacking implements IntegerCODEC {
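The class comment above names delta plus zigzag encoding without showing it. A minimal sketch of the general technique, assuming deltas are taken against the previous value (starting from 0) and signed deltas are mapped to small non-negative integers by zigzag. `DeltaZigzagSketch` is a hypothetical name and this is not the library's implementation:

```java
final class DeltaZigzagSketch {
    /** Maps a signed delta to a non-negative code: 0, -1, 1, -2, ... become 0, 1, 2, 3, ... */
    static int zigzagEncode(int d) {
        return (d << 1) ^ (d >> 31);
    }

    /** Inverse of zigzagEncode. */
    static int zigzagDecode(int z) {
        return (z >>> 1) ^ -(z & 1);
    }

    /** Replaces each value by the zigzag code of its difference from the previous value. */
    static int[] encode(int[] in) {
        int[] out = new int[in.length];
        int prev = 0; // assumed initial context
        for (int i = 0; i < in.length; i++) {
            out[i] = zigzagEncode(in[i] - prev);
            prev = in[i];
        }
        return out;
    }

    /** Rebuilds the original values by undoing zigzag and accumulating the deltas. */
    static int[] decode(int[] in) {
        int[] out = new int[in.length];
        int prev = 0;
        for (int i = 0; i < in.length; i++) {
            prev += zigzagDecode(in[i]);
            out[i] = prev;
        }
        return out;
    }
}
```

For the input `{10, 7, 12}` the deltas are 10, -3 and 5, which zigzag maps to 20, 5 and 10. Unlike plain delta coding, this works on unsorted input because negative differences still produce small codes.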
9 changes: 7 additions & 2 deletions src/main/java/me/lemire/integercompression/FastPFOR.java
@@ -11,6 +11,11 @@

/**
* This is a patching scheme designed for speed.
* It encodes integers in blocks of 128 integers. For arrays containing
* an arbitrary number of integers, you should use it in conjunction
* with another CODEC:
*
* IntegerCODEC ic = new Composition(new FastPFOR(), new VariableByte()).
* <p/>
* For details, please see
* <p/>
@@ -189,7 +194,7 @@ private void encodePage(int[] in, IntWrapper inpos, int thissize,
bitmap |= (1 << (k - 1));
}
out[tmpoutpos++] = bitmap;
- for (int k = 1; k <= 31; ++k) {
+ for (int k = 1; k <= 32; ++k) {
if (dataPointers[k] != 0) {
out[tmpoutpos++] = dataPointers[k];// size
for (int j = 0; j < dataPointers[k]; j += 32) {
@@ -242,7 +247,7 @@ private void decodePage(int[] in, IntWrapper inpos, int[] out,
inexcept += bytesize / 4;

final int bitmap = in[inexcept++];
- for (int k = 1; k <= 31; ++k) {
+ for (int k = 1; k <= 32; ++k) {
if ((bitmap & (1 << (k - 1))) != 0) {
int size = in[inexcept++];
if (dataTobePacked[k].length < size)
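Both FastPFOR hunks widen the loop bound from `k <= 31` to `k <= 32`. Since bit `k - 1` of the bitmap flags width `k`, width 32 is flagged by bit 31, which the old bound never reached. A minimal sketch of that effect, assuming only the bit layout visible in the diff. `BitmapLoopSketch` and `widthsVisited` are hypothetical names:

```java
import java.util.ArrayList;
import java.util.List;

final class BitmapLoopSketch {
    /** Returns every width k in [1, maxK] whose flag bit (k - 1) is set in bitmap. */
    static List<Integer> widthsVisited(int bitmap, int maxK) {
        List<Integer> widths = new ArrayList<>();
        for (int k = 1; k <= maxK; ++k) {
            // For k == 32 the mask is 1 << 31, the sign bit, which is still a valid flag.
            if ((bitmap & (1 << (k - 1))) != 0) {
                widths.add(k);
            }
        }
        return widths;
    }

    public static void main(String[] args) {
        int bitmap = (1 << 2) | (1 << 31); // widths 3 and 32 are flagged
        System.out.println(widthsVisited(bitmap, 31)); // old bound
        System.out.println(widthsVisited(bitmap, 32)); // new bound
    }
}
```

With the old bound the loop reports only width 3. With the new bound it reports widths 3 and 32, so data stored under width 32 is no longer skipped.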
@@ -13,6 +13,29 @@
* You should only use this scheme on sorted arrays. Use BinaryPacking if you
* have unsorted arrays.
*
* It encodes integers in blocks of 128 integers. For arrays containing
* an arbitrary number of integers, you should use it in conjunction
* with another CODEC:
*
* <pre>IntegratedIntegerCODEC is =
* new IntegratedComposition(new IntegratedBinaryPacking(),
* new IntegratedVariableByte())</pre>
*
* <p>
* For details, please see
* </p>
* <p>
* Daniel Lemire and Leonid Boytsov, Decoding billions of integers per second
* through vectorization Software: Practice &amp; Experience
* http://onlinelibrary.wiley.com/doi/10.1002/spe.2203/abstract
* http://arxiv.org/abs/1209.2137
* </p>
* <p>
* Daniel Lemire, Leonid Boytsov, Nathan Kurz,
* SIMD Compression and the Intersection of Sorted Integers
* http://arxiv.org/abs/1401.6399
* </p>
*
* @author Daniel Lemire
*
*/
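The comment above restricts this scheme to sorted arrays because it folds differential coding into the compression. A minimal sketch of why sorted input helps, assuming plain successive differences and a bit width taken from the largest value. `DifferentialSketch` is a hypothetical name and the library's integrated codecs may differ in detail:

```java
final class DifferentialSketch {
    /** out[0] = in[0]; out[i] = in[i] - in[i - 1]. Non-negative when the input is sorted. */
    static int[] deltas(int[] sorted) {
        int[] out = new int[sorted.length];
        int prev = 0;
        for (int i = 0; i < sorted.length; i++) {
            out[i] = sorted[i] - prev;
            prev = sorted[i];
        }
        return out;
    }

    /** Inverse of deltas: a running (prefix) sum. */
    static int[] prefixSum(int[] d) {
        int[] out = new int[d.length];
        int acc = 0;
        for (int i = 0; i < d.length; i++) {
            acc += d[i];
            out[i] = acc;
        }
        return out;
    }

    /** Bits needed to store the largest value, treating values as unsigned. */
    static int maxBits(int[] values) {
        int or = 0;
        for (int v : values) {
            or |= v;
        }
        return 32 - Integer.numberOfLeadingZeros(or);
    }
}
```

For `{1000, 1003, 1010, 1012}` the raw values need 10 bits each, while the gaps after the first value (3, 7, 2) need only 3 bits. On unsorted input the differences can be negative, which sets the high bits and removes the benefit.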
16 changes: 14 additions & 2 deletions src/main/java/me/lemire/integercompression/IntegratedFastPFOR.java
@@ -15,6 +15,13 @@
* differential coding as part of the compression.
* </p>
*
* It encodes integers in blocks of 128 integers. For arrays containing
* an arbitrary number of integers, you should use it in conjunction
* with another CODEC:
* <pre>IntegratedIntegerCODEC is =
* new IntegratedComposition(new IntegratedFastPFOR(),
* new IntegratedVariableByte())</pre>
*
* <p>
* For details, please see
* </p>
@@ -25,6 +32,11 @@
* http://arxiv.org/abs/1209.2137
* </p>
* <p>
* Daniel Lemire, Leonid Boytsov, Nathan Kurz,
* SIMD Compression and the Intersection of Sorted Integers
* http://arxiv.org/abs/1401.6399
* </p>
* <p>
* For multi-threaded applications, each thread should use its own
* IntegratedFastPFOR object.
* </p>
@@ -199,7 +211,7 @@ private void encodePage(int[] constin, IntWrapper constinpos,
bitmap |= (1 << (k - 1));
}
out[tmpoutpos++] = bitmap;
- for (int k = 1; k <= 31; ++k) {
+ for (int k = 1; k <= 32; ++k) {
if (dataPointers[k] != 0) {
out[tmpoutpos++] = dataPointers[k];// size
for (int j = 0; j < dataPointers[k]; j += 32) {
@@ -253,7 +265,7 @@ private void decodePage(int[] in, IntWrapper inpos, int[] out,
inexcept += bytesize / 4;

final int bitmap = in[inexcept++];
- for (int k = 1; k <= 31; ++k) {
+ for (int k = 1; k <= 32; ++k) {
if ((bitmap & (1 << (k - 1))) != 0) {
int size = in[inexcept++];
if (dataTobePacked[k].length < size)
@@ -16,7 +16,6 @@
* You should only use this scheme on sorted arrays. Use VariableByte if you
* have unsorted arrays.
*
- *
* @author Daniel Lemire
*/
public class IntegratedVariableByte implements IntegratedIntegerCODEC,
7 changes: 7 additions & 0 deletions src/main/java/me/lemire/integercompression/NewPFD.java
@@ -17,6 +17,13 @@
* <p/>
* using Simple16 as the secondary coder.
*
* It encodes integers in blocks of 128 integers. For arrays containing
* an arbitrary number of integers, you should use it in conjunction
* with another CODEC:
*
* <pre>IntegerCODEC ic =
* new Composition(new NewPFD(), new VariableByte()).</pre>
*
* Note that this does not use differential coding: if you are working on sorted
* lists, you must compute the deltas separately. (Yes, this is true even though
* the "D" at the end of the name probably stands for delta.)
7 changes: 7 additions & 0 deletions src/main/java/me/lemire/integercompression/NewPFDS16.java
@@ -17,6 +17,13 @@
* <p/>
* using Simple16 as the secondary coder.
*
* It encodes integers in blocks of 128 integers. For arrays containing
* an arbitrary number of integers, you should use it in conjunction
* with another CODEC:
*
* <pre>IntegerCODEC ic =
* new Composition(new NewPFDS16(), new VariableByte()).</pre>
*
* Note that this does not use differential coding: if you are working on sorted
* lists, you must compute the deltas separately.
*
6 changes: 6 additions & 0 deletions src/main/java/me/lemire/integercompression/NewPFDS9.java
@@ -17,6 +17,12 @@
* <p/>
* using Simple9 as the secondary coder.
*
* It encodes integers in blocks of 128 integers. For arrays containing
* an arbitrary number of integers, you should use it in conjunction
* with another CODEC:
*
* <pre>IntegerCODEC ic = new Composition(new NewPFDS9(), new VariableByte()).</pre>
*
* Note that this does not use differential coding: if you are working on sorted
* lists, you must compute the deltas separately.
*
6 changes: 6 additions & 0 deletions src/main/java/me/lemire/integercompression/OptPFD.java
@@ -16,6 +16,12 @@
* <p/>
* using Simple16 as the secondary coder.
*
* It encodes integers in blocks of 128 integers. For arrays containing
* an arbitrary number of integers, you should use it in conjunction
* with another CODEC:
*
* <pre>IntegerCODEC ic = new Composition(new OptPFD(), new VariableByte()).</pre>
*
* Note that this does not use differential coding: if you are working on sorted
* lists, you must compute the deltas separately. (Yes, this is true even though
* the "D" at the end of the name probably stands for delta.)
6 changes: 6 additions & 0 deletions src/main/java/me/lemire/integercompression/OptPFDS16.java
@@ -17,6 +17,12 @@
* <p/>
* using Simple16 as the secondary coder.
*
* It encodes integers in blocks of 128 integers. For arrays containing
* an arbitrary number of integers, you should use it in conjunction
* with another CODEC:
*
* <pre>IntegerCODEC ic = new Composition(new OptPFDS16(), new VariableByte()).</pre>
*
* Note that this does not use differential coding: if you are working on sorted
* lists, you must compute the deltas separately.
*
5 changes: 5 additions & 0 deletions src/main/java/me/lemire/integercompression/OptPFDS9.java
@@ -17,6 +17,11 @@
* <p/>
* using Simple9 as the secondary coder.
*
* It encodes integers in blocks of 128 integers. For arrays containing
* an arbitrary number of integers, you should use it in conjunction
* with another CODEC:
*
* <pre> IntegerCODEC ic = new Composition(new OptPFDS9(), new VariableByte()).</pre>
*
* Note that this does not use differential coding: if you are working on sorted
* lists, you must compute the deltas separately.
@@ -5,7 +5,10 @@
package me.lemire.integercompression;

/**
- * XOR + BinaryPacking.
+ * BinaryPacking over XOR differential.
*
* <pre>IntegerCODEC ic =
* new Composition(new XorBinaryPacking(), new VariableByte())</pre>
*
* @author MURAOKA Taro http://github.com/koron
*/
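The comment above describes this codec as binary packing over an XOR differential. A minimal sketch of the XOR differential step alone, assuming each value is XORed with its predecessor and the first value is XORed with 0. `XorDiffSketch` is a hypothetical name and the packing stage is omitted:

```java
final class XorDiffSketch {
    /** out[i] = in[i] XOR in[i - 1], with an assumed initial predecessor of 0. */
    static int[] encode(int[] in) {
        int[] out = new int[in.length];
        int prev = 0;
        for (int i = 0; i < in.length; i++) {
            out[i] = in[i] ^ prev;
            prev = in[i];
        }
        return out;
    }

    /** Inverse of encode: XOR is its own inverse, so a running XOR restores the input. */
    static int[] decode(int[] in) {
        int[] out = new int[in.length];
        int prev = 0;
        for (int i = 0; i < in.length; i++) {
            prev ^= in[i];
            out[i] = prev;
        }
        return out;
    }
}
```

For `{5, 7, 6}` the encoded values are 5, 2 and 1. Neighbouring values that share their high bits produce small XOR results whether or not the input is sorted, which then pack into fewer bits.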