Merge #3

Merged 26 commits on Feb 17, 2020
Commits
7b1a908
[COLLECTIONS-746] Add
garydgregory Feb 11, 2020
f416d45
Update org.easymock:easymock 4.1 -> 4.2.
garydgregory Feb 13, 2020
433229f
Sort members.
garydgregory Feb 13, 2020
c534839
Format tweaks. Consistently use 'this.' in ctors.
garydgregory Feb 16, 2020
f95c852
Use final.
garydgregory Feb 16, 2020
38c36b5
Sort methods in AB order.
garydgregory Feb 16, 2020
4d6946c
Cast to long to workaround a bug in animal-sniffer.
garydgregory Feb 16, 2020
c665cdb
Fix formatting.
garydgregory Feb 16, 2020
9fd0804
Javadoc.
garydgregory Feb 16, 2020
7d06bd7
[COLLECTIONS-747] MultiKey.getKeys class cast exception.
garydgregory Feb 16, 2020
a1ce1c2
Formatting.
garydgregory Feb 16, 2020
87497d0
Use the stock JRE Objects.requireNonNull() for parameter validation.
garydgregory Feb 16, 2020
d5bf768
[COLLECTIONS-748] Let
garydgregory Feb 17, 2020
a3e2ea2
Remove methods from the javadoc that are not implemented.
aherbert Feb 17, 2020
28b3810
Eliminate extra lines.
aherbert Feb 17, 2020
2a0e867
Remove javadoc from override method.
aherbert Feb 17, 2020
82273e9
Added orCardinality to BitSetBloomFilter.
aherbert Feb 17, 2020
5f70948
Remove whitespace around parentheses.
aherbert Feb 17, 2020
b377f59
Remove extra lines.
aherbert Feb 17, 2020
fa02826
Remove unthrown exception from test setup().
aherbert Feb 17, 2020
373a241
Removed invalid javadoc.
aherbert Feb 17, 2020
4033ff6
Test code clean-up.
aherbert Feb 17, 2020
1f17189
Remove unthrown exception.
aherbert Feb 17, 2020
7aaf396
Correct test javadoc headers.
aherbert Feb 17, 2020
d31ebdd
Javadoc clean-up.
aherbert Feb 17, 2020
39f0955
Removed spurious javadoc tag.
aherbert Feb 17, 2020
2 changes: 1 addition & 1 deletion pom.xml
@@ -453,7 +453,7 @@
<dependency>
<groupId>org.easymock</groupId>
<artifactId>easymock</artifactId>
-<version>4.1</version>
+<version>4.2</version>
<scope>test</scope>
</dependency>
<dependency>
9 changes: 9 additions & 0 deletions src/changes/changes.xml
@@ -126,6 +126,15 @@
<action dev="ggregory" type="update" due-to="Gary Gregory">
[build] Update Jacoco from 0.8.4 to 0.8.5.
</action>
<action dev="ggregory" type="update" due-to="Gary Gregory">
[test] Update org.easymock:easymock 4.1 -> 4.2.
</action>
<action issue="COLLECTIONS-747" dev="ggregory" type="fix" due-to="Gary Gregory, Walter Laan">
MultiKey.getKeys class cast exception.
</action>
<action issue="COLLECTIONS-748" dev="ggregory" type="update" due-to="Gary Gregory">
Let org.apache.commons.collections4.properties.[Sorted]PropertiesFactory accept XML input.
</action>
</release>
<release version="4.4" date="2019-07-05" description="Maintenance release.">
<action issue="COLLECTIONS-710" dev="ggregory" type="fix" due-to="Yu Shi, Gary Gregory">
@@ -33,15 +33,7 @@
* This abstract class provides additional functionality not declared in the interface.
* Specifically:
* <ul>
* <li>orCardinality</li>
* <li>jaccardSimilarity</li>
* <li>jaccardDistance</li>
* <li>cosineSimilarity</li>
* <li>cosineDistance</li>
* <li>estimateSize</li>
* <li>estimateUnionSize</li>
* <li>estimateIntersectionSize</li>
* <li>isFull</li>
* <li>{@link #isFull()}</li>
* </ul>
*
* @since 4.5
@@ -54,67 +46,101 @@ public abstract class AbstractBloomFilter implements BloomFilter {
private final Shape shape;

/**
* Gets an array of little-endian long values representing the on bits of this filter.
* bits 0-63 are in the first long.
* Construct a Bloom filter with the specified shape.
*
* @return the LongBuffer representation of this filter.
* @param shape The shape.
*/
@Override
public abstract long[] getBits();
protected AbstractBloomFilter(final Shape shape) {
this.shape = shape;
}

/**
* Creates a StaticHasher that contains the indexes of the bits that are on in this
* filter.
* Performs a logical "AND" with the other Bloom filter and returns the cardinality of
* the result.
*
* @return a StaticHasher for that produces this Bloom filter.
* @param other the other Bloom filter.
* @return the cardinality of the result of {@code ( this AND other )}.
*/
@Override
public abstract StaticHasher getHasher();
public int andCardinality(final BloomFilter other) {
verifyShape(other);
final long[] mine = getBits();
final long[] theirs = other.getBits();
final int limit = Integer.min(mine.length, theirs.length);
final long[] result = new long[limit];
for (int i = 0; i < limit; i++) {
result[i] = mine[i] & theirs[i];
}
return BitSet.valueOf(result).cardinality();
}

/**
* Construct a Bloom filter with the specified shape.
* Gets the cardinality of this Bloom filter.
*
* @param shape The shape.
* @return the cardinality (number of enabled bits) in this filter.
*/
protected AbstractBloomFilter(final Shape shape) {
this.shape = shape;
@Override
public int cardinality() {
return BitSet.valueOf(getBits()).cardinality();
}

/**
* Verify the other Bloom filter has the same shape as this Bloom filter.
* Performs a contains check. Effectively this AND other == other.
*
* @param other the other filter to check.
* @throws IllegalArgumentException if the shapes are not the same.
* @param other the Other Bloom filter.
* @return true if this filter matches the other.
*/
protected void verifyShape(final BloomFilter other) {
verifyShape(other.getShape());
@Override
public boolean contains(final BloomFilter other) {
verifyShape(other);
return other.cardinality() == andCardinality(other);
}

/**
* Verify the specified shape has the same shape as this Bloom filter.
* Performs a contains check against a decomposed Bloom filter. The shape must match
* the shape of this filter. The hasher provides bit indexes to check for. Effectively
* decomposed AND this == decomposed.
*
* @param shape the other shape to check.
* @throws IllegalArgumentException if the shapes are not the same.
* @param hasher The hasher containing the bits to check.
* @return true if this filter contains the other.
* @throws IllegalArgumentException if the shape argument does not match the shape of
* this filter, or if the hasher is not the specified one
*/
protected void verifyShape(final Shape shape) {
if (!this.shape.equals(shape)) {
throw new IllegalArgumentException(String.format("Shape %s is not the same as %s", shape, this.shape));
@Override
public boolean contains(final Hasher hasher) {
verifyHasher(hasher);
final long[] buff = getBits();

final OfInt iter = hasher.getBits(shape);
while (iter.hasNext()) {
final int idx = iter.nextInt();
final int buffIdx = idx / Long.SIZE;
final int pwr = Math.floorMod(idx, Long.SIZE);
final long buffOffset = 1L << pwr;
if ((buff[buffIdx] & buffOffset) == 0) {
return false;
}
}
return true;
}

/**
* Verifies that the hasher has the same name as the shape.
* Gets an array of little-endian long values representing the on bits of this filter.
* bits 0-63 are in the first long.
*
* @param hasher the Hasher to check
* @return the LongBuffer representation of this filter.
*/
protected void verifyHasher(final Hasher hasher) {
if (shape.getHashFunctionIdentity().getSignature() != hasher.getHashFunctionIdentity().getSignature()) {
throw new IllegalArgumentException(
String.format("Hasher (%s) is not the hasher for shape (%s)",
HashFunctionIdentity.asCommonString(hasher.getHashFunctionIdentity()),
shape.toString()));
}
}
@Override
public abstract long[] getBits();

/**
* Creates a StaticHasher that contains the indexes of the bits that are on in this
* filter.
*
* @return a StaticHasher for that produces this Bloom filter.
*/
@Override
public abstract StaticHasher getHasher();

/**
* Gets the shape of this filter.
@@ -126,6 +152,16 @@ public final Shape getShape() {
return shape;
}

/**
* Determines if the bloom filter is "full". Full is defined as having no unset
* bits.
*
* @return true if the filter is full.
*/
public final boolean isFull() {
return cardinality() == getShape().getNumberOfBits();
}

/**
* Merge the other Bloom filter into this one.
*
@@ -145,36 +181,6 @@ public final Shape getShape() {
@Override
abstract public void merge(Hasher hasher);

/**
* Gets the cardinality of this Bloom filter.
*
* @return the cardinality (number of enabled bits) in this filter.
*/
@Override
public int cardinality() {
return BitSet.valueOf(getBits()).cardinality();
}

/**
* Performs a logical "AND" with the other Bloom filter and returns the cardinality of
* the result.
*
* @param other the other Bloom filter.
* @return the cardinality of the result of {@code ( this AND other )}.
*/
@Override
public int andCardinality(final BloomFilter other) {
verifyShape(other);
final long[] mine = getBits();
final long[] theirs = other.getBits();
final int limit = Integer.min(mine.length, theirs.length);
final long[] result = new long[limit];
for (int i = 0; i < limit; i++) {
result[i] = mine[i] & theirs[i];
}
return BitSet.valueOf(result).cardinality();
}

@Override
public int orCardinality(final BloomFilter other) {
verifyShape(other);
@@ -194,13 +200,48 @@ public int orCardinality(final BloomFilter other) {
for (int i = 0; i < limit; i++) {
result[i] = mine[i] | theirs[i];
}
if (limit<result.length)
{
System.arraycopy(remainder, limit, result, limit, result.length-limit);
if (limit < result.length) {
System.arraycopy(remainder, limit, result, limit, result.length - limit);
}
return BitSet.valueOf(result).cardinality();
}

/**
* Verifies that the hasher has the same name as the shape.
*
* @param hasher the Hasher to check
*/
protected void verifyHasher(final Hasher hasher) {
if (shape.getHashFunctionIdentity().getSignature() != hasher.getHashFunctionIdentity().getSignature()) {
throw new IllegalArgumentException(
String.format("Hasher (%s) is not the hasher for shape (%s)",
HashFunctionIdentity.asCommonString(hasher.getHashFunctionIdentity()),
shape.toString()));
}
}

/**
* Verify the other Bloom filter has the same shape as this Bloom filter.
*
* @param other the other filter to check.
* @throws IllegalArgumentException if the shapes are not the same.
*/
protected void verifyShape(final BloomFilter other) {
verifyShape(other.getShape());
}

/**
* Verify the specified shape has the same shape as this Bloom filter.
*
* @param shape the other shape to check.
* @throws IllegalArgumentException if the shapes are not the same.
*/
protected void verifyShape(final Shape shape) {
if (!this.shape.equals(shape)) {
throw new IllegalArgumentException(String.format("Shape %s is not the same as %s", shape, this.shape));
}
}

/**
* Performs a logical "XOR" with the other Bloom filter and returns the cardinality of
* the result.
@@ -227,61 +268,9 @@ public int xorCardinality(final BloomFilter other) {
for (int i = 0; i < limit; i++) {
result[i] = mine[i] ^ theirs[i];
}
if (limit<result.length)
{
System.arraycopy(remainder, limit, result, limit, result.length-limit);
if (limit < result.length) {
System.arraycopy(remainder, limit, result, limit, result.length - limit);
}
return BitSet.valueOf(result).cardinality();
}

/**
* Performs a contains check. Effectively this AND other == other.
*
* @param other the Other Bloom filter.
* @return true if this filter matches the other.
*/
@Override
public boolean contains(final BloomFilter other) {
verifyShape(other);
return other.cardinality() == andCardinality(other);
}

/**
* Performs a contains check against a decomposed Bloom filter. The shape must match
* the shape of this filter. The hasher provides bit indexes to check for. Effectively
* decomposed AND this == decomposed.
*
* @param hasher The hasher containing the bits to check.
* @return true if this filter contains the other.
* @throws IllegalArgumentException if the shape argument does not match the shape of
* this filter, or if the hasher is not the specified one
*/
@Override
public boolean contains(final Hasher hasher) {
verifyHasher( hasher );
final long[] buff = getBits();

final OfInt iter = hasher.getBits(shape);
while (iter.hasNext()) {
final int idx = iter.nextInt();
final int buffIdx = idx / Long.SIZE;
final int pwr = Math.floorMod(idx, Long.SIZE);
final long buffOffset = 1L << pwr;
if ((buff[buffIdx] & buffOffset) == 0) {
return false;
}
}
return true;
}

/**
* Determines if the bloom filter is "full". Full is defined as having no unset
* bits.
*
* @return true if the filter is full.
*/
public final boolean isFull() {
return cardinality() == getShape().getNumberOfBits();
}

}
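
For readers following the reordered methods above, here is a minimal standalone sketch of the word-level bit arithmetic they rely on: AND over the shorter of two little-endian `long[]` arrays, OR with the tail of the longer array carried over, and the index-to-word/mask lookup used by `contains(Hasher)`. The class `BitWordsSketch` and its method names are hypothetical and are not part of Commons Collections. It works on raw arrays rather than `BloomFilter` instances, and the bounds check in `isSet` is an assumption added for the sketch, not something the diff does.

```java
import java.util.BitSet;

/** Hypothetical illustration only; not part of Commons Collections. */
final class BitWordsSketch {

    private BitWordsSketch() {
    }

    /** Cardinality of (a AND b). Words past the shorter array AND to zero, so they are skipped. */
    static int andCardinality(final long[] a, final long[] b) {
        final int limit = Math.min(a.length, b.length);
        final long[] result = new long[limit];
        for (int i = 0; i < limit; i++) {
            result[i] = a[i] & b[i];
        }
        return BitSet.valueOf(result).cardinality();
    }

    /** Cardinality of (a OR b). The tail of the longer array is kept unchanged. */
    static int orCardinality(final long[] a, final long[] b) {
        final long[] longer = a.length >= b.length ? a : b;
        final long[] shorter = a.length >= b.length ? b : a;
        final long[] result = longer.clone();
        for (int i = 0; i < shorter.length; i++) {
            result[i] |= shorter[i];
        }
        return BitSet.valueOf(result).cardinality();
    }

    /** Tests one non-negative bit index: bits 0-63 live in words[0], 64-127 in words[1], and so on. */
    static boolean isSet(final long[] words, final int idx) {
        final int wordIdx = idx / Long.SIZE;
        final long mask = 1L << (idx % Long.SIZE);
        // Assumption for this sketch: an index past the end of the array counts as unset.
        return wordIdx < words.length && (words[wordIdx] & mask) != 0;
    }
}
```

With `a = {0b1011L, 1L}` and `b = {0b0110L}`, the AND leaves only bit 1 set, while the OR sets bits 0 to 3 of the first word and keeps bit 64 from the second word of `a`.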