-
Notifications
You must be signed in to change notification settings - Fork 509
[Collections-834][bloom filters] bit count producer operation is not clearly defined #335
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Collections-834][bloom filters] bit count producer operation is not clearly defined #335
Conversation
Codecov Report
@@ Coverage Diff @@
## master #335 +/- ##
============================================
- Coverage 85.98% 81.27% -4.72%
+ Complexity 4671 4601 -70
============================================
Files 289 289
Lines 13447 13445 -2
Branches 1977 1973 -4
============================================
- Hits 11563 10927 -636
- Misses 1323 1920 +597
- Partials 561 598 +37
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
|
@aherbert Described the BitCountProducer contract in the BitCountProducer javadoc, and added tests to show that they work as expected. I hope I have not made too much of a hash of it. |
aherbert
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @Claudenw,
Thanks for all the new test data. I had failed to spot previously that the expected indices from an IndexProducer are not actually checked. I think these changes address the issue with the disabled test in BitCountProducerFromIndexProducerTest and strengthen the test suite.
You test the expected indices using an array that mandates the order. However we already test the order behaviour separately. I suggested a change to only use the expected indices to detect correct/incorrect actual indices. The remaining behaviour tests will detect ordering and duplication problems.
I provided an implementation for AbstractBitCountProducerTest that matches the AbstractIndexProducerTest. This checks consistency between the indices output from the two interface forEach methods. Currently it includes new behaviour flags for forEachCount. However the test suite passes on the assumption the forEachIndex and forEachCount have the same behaviour. So maybe we can drop them. They could be added in the future if an implementation is created with different behaviour for the two methods.
I would change some of the tests from using IncrementingHasher(0, 1) to something else. This avoids a sequence from 0 which is simplistic test data. The data is easy to create with a stream:
// For: IncrementingHasher(3, 2) and Shape.fromKM(17, 72)
IntStream.iterate(3, i -> i + 2).limit(17).toArray();For multiple hashers you can collate them:
// For: IncrementingHasher(3, 2), IncrementingHasher(0, 1), and Shape.fromKM(17, 72)
IntStream.concat(
IntStream.iterate(3, i -> i + 2).limit(17),
IntStream.iterate(0, i -> i + 1).limit(17)
).distinct().toArray();
src/main/java/org/apache/commons/collections4/bloomfilter/BitCountProducer.java
Outdated
Show resolved
Hide resolved
src/main/java/org/apache/commons/collections4/bloomfilter/BitCountProducer.java
Outdated
Show resolved
Hide resolved
src/main/java/org/apache/commons/collections4/bloomfilter/BitCountProducer.java
Outdated
Show resolved
Hide resolved
src/main/java/org/apache/commons/collections4/bloomfilter/BitCountProducer.java
Show resolved
Hide resolved
src/test/java/org/apache/commons/collections4/bloomfilter/AbstractBitCountProducerTest.java
Outdated
Show resolved
Hide resolved
src/test/java/org/apache/commons/collections4/bloomfilter/DefaultIndexProducerTest.java
Outdated
Show resolved
Hide resolved
...rg/apache/commons/collections4/bloomfilter/BitCountProducerFromDefaultIndexProducerTest.java
Outdated
Show resolved
Hide resolved
src/test/java/org/apache/commons/collections4/bloomfilter/BitCountProducerFromIntArrayTest.java
Outdated
Show resolved
Hide resolved
src/test/java/org/apache/commons/collections4/bloomfilter/BitCountProducerFromHasherTest.java
Outdated
Show resolved
Hide resolved
...a/org/apache/commons/collections4/bloomfilter/BitCountProducerFromSparseBloomFilterTest.java
Outdated
Show resolved
Hide resolved
348b6a9 to
c52ebfb
Compare
src/test/java/org/apache/commons/collections4/bloomfilter/AbstractIndexProducerTest.java
Outdated
Show resolved
Hide resolved
aherbert
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggested some changes to the AbstractBitCountProducerTest to assert on the behaviour flags separately from the bit count totals. Can you add these? Please see this note.
src/test/java/org/apache/commons/collections4/bloomfilter/AbstractIndexProducerTest.java
Show resolved
Hide resolved
src/test/java/org/apache/commons/collections4/bloomfilter/AbstractBitCountProducerTest.java
Show resolved
Hide resolved
src/test/java/org/apache/commons/collections4/bloomfilter/AbstractIndexProducerTest.java
Show resolved
Hide resolved
src/test/java/org/apache/commons/collections4/bloomfilter/AbstractIndexProducerTest.java
Outdated
Show resolved
Hide resolved
|
|
||
| @Test | ||
| public final void testAsIndexArrayValues() { | ||
| List<Integer> lst = new ArrayList<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is asserting the distinctness of indices using lst.contains (which is inefficient). If you only wish to test distinctness then use a BitSet not a List. Use of a list (over a set) would be to test the ordering.
Given we have a test for distinctness and ordering then I think this test should be:
@Test
public final void testAsIndexArrayValues() {
BitSet bs = new BitSet();
Arrays.stream(createProducer().asIndexArray()).forEach(bs::set);
for (int i : getExpectedIndices()) {
assertTrue(bs.get(i), () -> "Missing " + i);
}
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have not removed the redundant part of this test. The distinctness is tested in testBehaviourAsIndexArray.
src/test/java/org/apache/commons/collections4/bloomfilter/AbstractIndexProducerTest.java
Show resolved
Hide resolved
src/test/java/org/apache/commons/collections4/bloomfilter/AbstractBitCountProducerTest.java
Outdated
Show resolved
Hide resolved
src/test/java/org/apache/commons/collections4/bloomfilter/AbstractIndexProducerTest.java
Show resolved
Hide resolved
src/test/java/org/apache/commons/collections4/bloomfilter/IndexProducerFromIntArrayTest.java
Outdated
Show resolved
Hide resolved
src/test/java/org/apache/commons/collections4/bloomfilter/BitCountProducerFromIntArrayTest.java
Show resolved
Hide resolved
aherbert
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @Claudenw
I've not looked at everything here. I found an issue where the update to the documentation of BitCountProducer is not correct (given the current implementations). See my test that is checking the count of indices output from the IndexProducer. It fails for ArrayCountingBloomFilter.
I don't think we want to change the implementations. We should change the documentation to be less strict. You can then remove some of the updates to the tests as we do not need int[] getExpectedForEach. It is only overridden in BitCountProducerFromHasherTest anyway.
Since you have not changed any of src/main besides javadoc I can only assume this was to pass the test that I think is not required if we are less strict with the interface definition.
src/test/java/org/apache/commons/collections4/bloomfilter/AbstractIndexProducerTest.java
Outdated
Show resolved
Hide resolved
src/test/java/org/apache/commons/collections4/bloomfilter/AbstractIndexProducerTest.java
Show resolved
Hide resolved
| @Test | ||
| public final void testBehaviourForEachCount() { | ||
| int flags = getBehaviour(); | ||
| IntList list = new IntList(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could do with:
assumeTrue((flags & (FOR_EACH_COUNT_ORDERED | FOR_EACH_COUNT_DISTINCT)) != 0);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As before, assumeTrue should be in the line after you get the flags.
src/test/java/org/apache/commons/collections4/bloomfilter/AbstractBitCountProducerTest.java
Outdated
Show resolved
Hide resolved
aherbert
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @Claudenw
This is looking good. I suggested a bit more text for the BitCountProducer interface. There is also a simplification we can make in the AbstractBitCountProducerTest to lower maintenance by hiding the new flag constants (for now).
| @Test | ||
| public final void testBehaviourForEachCount() { | ||
| int flags = getBehaviour(); | ||
| IntList list = new IntList(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As before, assumeTrue should be in the line after you get the flags.
| /** | ||
| * Creates a BitCountProducer from an IndexProducer. The resulting | ||
| * producer will count each enabled bit once. | ||
| * producer will return every index from the IndexProducer with a count of 1. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was considering why do we need to do this: convert an IndexProducer to a BitCountProducer. It would be to add it to a CountingBloomFilter. In this case the conversion does not remove duplicate indices. This means that the BitCountProducer can increment the count of an index more than 1 when adding a single IndexProducer.
This has the following consequences:
- The count at each index of a CountingBloomFilter does not represent the number of individual hashes added to the filter that had that index enabled. The count is bounded by the number of hashes as a lower limit, i.e. it may be higher.
- Removal of an IndexProducer from a counting bloom filter must use the same method to construct the BitCountProducer as the addition in order to use the exact same index->count mapping.
I am fine with this behaviour. Filtering an IndexProducer to unique indices is extra work. The IndexFilter class helps with this but requires a Shape. So it cannot be done in the BitCountProducer interface which has no Shape; other mechanisms to filter the IndexProducer without a bound on the index would be inefficient.
So perhaps we require a bit of extra documentation here to serve as a user-beware warning:
* <p>Note that the BitCountProducer does not remove duplicates. Any use of the
* BitCountProducer to create an aggregate mapping of index to counts, such as a
* CountingBloomFilter, should use the same BitCountProducer in both add and
* subtract operations to maintain consistency.
I believe this is consistent with the new wording in the javadoc header for the BitCountProducer interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
| public abstract class AbstractBitCountProducerTest extends AbstractIndexProducerTest { | ||
|
|
||
| /** Flag to indicate the {@link BitCountProducer#forEachCount(BitCountConsumer)} is ordered. */ | ||
| protected static final int FOR_EACH_COUNT_ORDERED = 0x10; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the interest of simplicity, can these be dropped?
I previously stated that we should only have them if the behaviour of a BitCountProducer will be different from an IndexProducer. From what I can see in the tests these are used in 3 tests, and only when FOR_EACH_ORDERED and FOR_EACH_DISTINCT from the AbstractIndexProducerTest are also used. Dropping these constants will reduce maintenance and ensure the flags are correctly set for BitCount and Index producers.
You can add a note in the test to state that the same flags are reused.
I think the simplest way to do this is to update the flags to be private:
/** Flag to indicate the {@link BitCountProducer#forEachCount(BitCountConsumer)} is ordered.
* This flag currently reuses the value from IndexProducer. At present no implementations
* exhibit different behaviour for the two interfaces. This may change in the future and
* an explicit flag value will be required. */
private static final int FOR_EACH_COUNT_ORDERED = FOR_EACH_ORDERED;
/** Flag to indicate the {@link BitCountProducer#forEachCount(BitCountConsumer)} is distinct.
* This flag currently reuses the value from IndexProducer. At present no implementations
* exhibit different behaviour for the two interfaces. This may change in the future and
* an explicit flag value will be required. */
private static final int FOR_EACH_COUNT_DISTINCT = FOR_EACH_DISTINCT;An alternative would be to have a flag for unspecified behaviour. This must be explicitly set by implementing tests:
FOR_EACH_COUNT_UNSPECIFIED = 0x10;
FOR_EACH_COUNT_ORDERED = 0x20;
FOR_EACH_COUNT_DISTINCT = 0x40;
The abstract test can then check if the behaviour is unspecified. If true then the other flags must be unset. If false then at least one of the other flags must be set. I think this is more work and not required, especially since we are testing the behaviour that is not a part of the contract given in the javadoc of the main implementations. This is more of a behaviour sense check of the implementations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I like this, but I can't think of a good counter argument. I could do the hand waving and say IndexProducer and BitCountProducer produce different orders but I can't see a reasonable example in any possible implementation I can come up with. So, changes made.
| assertTrue(createEmptyProducer().forEachCount(FALSE_CONSUMER), "empty should be true"); | ||
| assertTrue(createEmptyProducer().forEachCount(TRUE_CONSUMER), "empty should be true"); | ||
| @Test | ||
| public final void testBehaviourForEachCount() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a javadoc note:
/**
* Test the behaviour of {@link BitCountProducer#forEachCount(BitCountConsumer)} with respect
* to ordered and distinct indices. Currently the behaviour is assumed to be the same as
* {@link IndexProducer#forEachIndex(java.util.function.IntPredicate)}.
*/There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
|
I discovered issues with the HasherCollection and Hasher generated IndexProducer and BitCountProducers created from those IndexProducers. Additional tests and corrections included in this latest push. |
|
Last commit was on Oct 21st. Is this correct or do you need to push again? |
|
Let me see what happened. In pushed this AM. (or so I thought)
…On Mon, Oct 24, 2022, 11:57 Alex Herbert ***@***.***> wrote:
Last commit was on Oct 21st. Is this correct or do you need to push again?
—
Reply to this email directly, view it on GitHub
<#335 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AASTVHQUXZFZCYJQ2RTT523WEZTPXANCNFSM6AAAAAAQJQDJ7M>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
…er_operation_is_not_clearly_defined' into Collections-384_BitCountProducer_operation_is_not_clearly_defined
Found the issue on my system. Should be updated now. |
| } | ||
|
|
||
| @Override | ||
| public int[] asIndexArray() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I presume this change is required otherwise the test suite fails.
aherbert
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good. However there is a bug in the new HasherCollection.absoluteUniqueIndices method due to how the IndexFilter is created. I've suggested two quick fixes to get around this.
src/main/java/org/apache/commons/collections4/bloomfilter/Hasher.java
Outdated
Show resolved
Hide resolved
src/main/java/org/apache/commons/collections4/bloomfilter/HasherCollection.java
Outdated
Show resolved
Hide resolved
src/main/java/org/apache/commons/collections4/bloomfilter/HasherCollection.java
Outdated
Show resolved
Hide resolved
src/main/java/org/apache/commons/collections4/bloomfilter/HasherCollection.java
Outdated
Show resolved
Hide resolved
| public IndexProducer absoluteUniqueIndices(final Shape shape) { | ||
| return consumer -> { | ||
| Objects.requireNonNull(consumer, "consumer"); | ||
| return uniqueIndices(shape).forEachIndex(IndexFilter.create(shape, consumer)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC this can exceed the capacity of the IndexFilter. The uniqueIndices(shape) will output up to k indices for each hasher in the collection (eliminating duplicates). Thus if we have two hashers with shape(k=5, n=1000) we could output 10 indices from the combined unique indices. Then we create an IndexFilter for shape which could be optimised only to check up to k=5 indices.
This throws an index out of bounds exception from IndexFilter.ArrayTracker.
@Test
public void testAbsoluteUniqueIndices() {
int[] actual = new HasherCollection(
new IncrementingHasher(1, 1),
new IncrementingHasher(10, 1)
).absoluteUniqueIndices(Shape.fromKM(5, 1000)).asIndexArray();
int[] expected = IntStream.concat(
IntStream.range(1, 1 + 5),
IntStream.range(10, 10 + 5)
).toArray();
Assertions.assertArrayEquals(expected, actual);
}When I update HasherCollection absoluteUniqueIndices to use a fake shape that forces the IndexTracker.BitMapTracker implementation then the above tests passes:
public IndexProducer absoluteUniqueIndices(final Shape shape) {
return consumer -> {
Objects.requireNonNull(consumer, "consumer");
return uniqueIndices(shape).forEachIndex(IndexFilter.create(
Shape.fromKM(shape.getNumberOfBits(), shape.getNumberOfBits()), consumer));
};
}I think that this can be easily solved by having a new package-private factory constructor in IndexFilter to choose the implementation that can handle exceeding the number of hash functions in the shape:
static IntPredicate create(int numberOfBits, IntPredicate consumer) {
// Choose the BitMapTracker implementation to handle any number of hash functionsOr you could create a new Shape as:
public IndexProducer absoluteUniqueIndices(final Shape shape) {
return consumer -> {
Objects.requireNonNull(consumer, "consumer");
return uniqueIndices(shape).forEachIndex(IndexFilter.create(
Shape.fromKM(shape.getNumberOfHashFunctions() * hashers.size(),
shape.getNumberOfBits()), consumer));
};
}Perhaps the later is simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
implemented later version with a check for hashers.size() == 0
...t/java/org/apache/commons/collections4/bloomfilter/BitCountProducerFromUniqueHasherTest.java
Outdated
Show resolved
Hide resolved
.../apache/commons/collections4/bloomfilter/BitCountProducerFromUniqueHasherCollectionTest.java
Outdated
Show resolved
Hide resolved
src/test/java/org/apache/commons/collections4/bloomfilter/EnhancedDoubleHasherTest.java
Outdated
Show resolved
Hide resolved
...e/commons/collections4/bloomfilter/IndexProducerFromAbsoluteUniqueHasherCollectionTest2.java
Outdated
Show resolved
Hide resolved
...org/apache/commons/collections4/bloomfilter/UniqueIndexProducerFromHasherCollectionTest.java
Outdated
Show resolved
Hide resolved
|
In all cases where In all cases where I propose collapsing this into 2 statics: By "replaces" I mean remove all instances of the replaced constants (not make them equivalent). Also, since the @aherbert unless you have an objection, I'll proceed with these changes. |
|
Hi @Claudenw, consolidating all the behaviour flags is a good idea. It will avoid future tests failing to set them correctly. I'll wait until you have done that for a review. Note: I have reservations about the HasherCollection and the requirement for an
|
|
@aherbert The changes are in. Have a look please |
aherbert
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @Claudenw
I think the consolidation of the ordered and distinct flags has reduced the flexibility of the test suite. It also led to you duplicating the flags in all test classes that override getBehaviour().
However the previous flexibility can be reinstated with my suggestion to have a method to get the behaviour for each of: asIndexArray; forEachIndex; forEachCount. By default this would return the same behaviour for all of them. But it can be configured to change and allow testing different behaviour for the methods such as that done in the default implementation of IndexProducer.
src/main/java/org/apache/commons/collections4/bloomfilter/HasherCollection.java
Outdated
Show resolved
Hide resolved
src/main/java/org/apache/commons/collections4/bloomfilter/IndexProducer.java
Outdated
Show resolved
Hide resolved
src/test/java/org/apache/commons/collections4/bloomfilter/AbstractBitCountProducerTest.java
Show resolved
Hide resolved
src/test/java/org/apache/commons/collections4/bloomfilter/AbstractBitCountProducerTest.java
Outdated
Show resolved
Hide resolved
src/test/java/org/apache/commons/collections4/bloomfilter/AbstractBitCountProducerTest.java
Outdated
Show resolved
Hide resolved
src/test/java/org/apache/commons/collections4/bloomfilter/BitCountProducerFromIntArrayTest.java
Outdated
Show resolved
Hide resolved
...a/org/apache/commons/collections4/bloomfilter/BitCountProducerFromSimpleBloomFilterTest.java
Outdated
Show resolved
Hide resolved
...a/org/apache/commons/collections4/bloomfilter/BitCountProducerFromSparseBloomFilterTest.java
Outdated
Show resolved
Hide resolved
|
|
||
| @Override | ||
| protected int getBehaviour() { | ||
| // The default method streams a BitSet so is distinct and ordered. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: This test has removed some testing of the default methods in IndexProducer.
It should be named IndexProducerFromIndexArrayTest. It is specifically targeting the default implementation of the IndexProducer when created from an index array. In that case the asIndexArray() method returns a clone of the original indices used to create it.
A DefaultIndexProducerTest would test the default implementation of asIndexArray() and look like this.
public class DefaultIndexProducerTest extends AbstractIndexProducerTest {
/** Make forEachIndex unordered and contain duplicates. */
private int[] values = {10, 1, 10, 1};
@Override
protected int[] getExpectedIndices() {
return values;
}
@Override
protected IndexProducer createProducer() {
return new IndexProducer() {
@Override
public boolean forEachIndex(IntPredicate predicate) {
Objects.requireNonNull(predicate);
for (int i : values) {
if (!predicate.test(i)) {
return false;
}
}
return true;
}
};
}
@Override
protected IndexProducer createEmptyProducer() {
return new IndexProducer() {
@Override
public boolean forEachIndex(IntPredicate predicate) {
Objects.requireNonNull(predicate);
return true;
}
};
}
@Override
protected int getBehaviour() {
// The default method streams a BitSet so is distinct and ordered.
// However the forEachIndex may not be distinct and ordered and
// the test cannot distinguish the two cases.
return 0;
}
}Note that this highlights a problem with consolidating all the behaviour flags. Currently the test suite cannot be used to test the asIndexArray behaviour is different from the forEachIndex behaviour.
I think we should revert the consolidation of the flags. However to avoid the duplication of flag setting through the various test cases we can change the AbstractIndexProducerTest test to do this:
public abstract class AbstractIndexProducerTest {
protected static final int ORDERED = 0x1;
protected static final int DISTINCT = 0x2;
protected abstract int getAsIndexArrayBehaviour();
protected int getForEachIndexBehaviour() {
return getAsIndexArrayBehaviour();
}
}And AbtractBitCountProducerTest:
public abstract class AbstractBitCountProducerTest extends AbstractIndexProducerTest {
protected int getForEachCountBehaviour() {
return getAsIndexArrayBehaviour();
}
}Then get the correct flag in the relevant test. It provides the ability to override each test behaviour individually but will default to the same behaviour if not explicitly changed.
Doing this change (DefaultIndexProducerTest and IndexProducerFromIndexArrayTest) should remove the requirement for the additional test testFromIndexArray. The method testFromBitMapProducer is already covered in IndexProducerFromBitMapProducerTest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Corrected. I also added DefaultBitCountProducerTest and removed duplicate test UniqueIndexProducerFromHashCollectionTest which was effectively a duplicate of BitCountProducerFromUniqueHasherCollectionTest
...st/java/org/apache/commons/collections4/bloomfilter/IndexProducerFromBitmapProducerTest.java
Outdated
Show resolved
Hide resolved
Added DefaultBitCountProducerTest removed duplicate test UniqueIndexProducerFromHashCollectionTest was effectivly a duplicate of BitCountProducerFromUniqueHasherCollectionTest
aherbert
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @Claudenw
Thanks for the changes. They look good. I requested a few comment changes to indicate what the tests are intentionally trying to test.
src/test/java/org/apache/commons/collections4/bloomfilter/DefaultBitCountProducerTest.java
Outdated
Show resolved
Hide resolved
src/test/java/org/apache/commons/collections4/bloomfilter/DefaultBitCountProducerTest.java
Outdated
Show resolved
Hide resolved
src/test/java/org/apache/commons/collections4/bloomfilter/DefaultIndexProducerTest.java
Outdated
Show resolved
Hide resolved
…che#335) Document the expected behaviour of the BitCountProducer's mapping of indices to counts. Updated IndexProducer and BitCountProducer tests to verify the expected indices and counts; and optionally verify the encounter order is sorted and indices are distinct.
fix for COLLECTIONS-384
Document bit count producer operation in the BitCountProducer.java file.
Add unit tests for all BitCountProducers.