New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ISPN-4544 testCombinerForDistributedReduction issue #2750
Conversation
task.combinedWith(new DivideByZeroReducer()); | ||
//In distributed execution mode combiner will not get invoked | ||
//unless we have at least MapReduceTask#MAX_COLLECTOR_SIZE | ||
//entries (by default 1000) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vblagoje This sounds wasteful, why insert 999 intermediary values into the intermediary cache if we could insert just one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No no, the default collector size is 1000 so combine is never invoked for small tasks on distributed map/combine/reduce setup! If combine is never invoked we can not test this issue properly. Recall we use queue of collectors only for distributed map/reduce (large tasks) otherwise we use simpler algorithm that invokes combine always. Hence this test is a bit convoluted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My point is that it shouldn't be convoluted, we should always be invoking the combiner before sending the intermediate values to the originator (non-distributed reduce) or inserting them in the intermediate cache (distributed reduce).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We always do invoke but in distributed reduce we need 1000 entries to trigger combine. I'll see how to adjust the algorithm to invoke the combiner at least once even if we have less than 1000 entries.
@danberindei ok I have in fact realized that we never invoke combine on the final collector and we should. See the diff. This resolves the issue as well |
@@ -326,21 +327,24 @@ public Integer collate(Map<String, Integer> reducedResults) { | |||
assertPartialWordCount(totalWords); | |||
} | |||
|
|||
@Test(expectedExceptions = CacheException.class) | |||
@Test(expectedExceptions=CacheException.class) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if having it more like what you had before would be better where you checked the exception in the test. I think verifying the nested exception is present might be best. Unfortunately I don't trust expectedException for CacheException when we wrap everything in it :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-100 to removing the spaces
The @Test
annotation also has a expectedExceptionsMessageRegExp
parameter that you could use, but I would also rather check the exception cause in the test itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
haha ok, will add spaces. I will go back to test the cause. Thanks guys.
@vblagoje The combiner not being called on the final collector should be a separate issue + commit. |
@danberindei ok for final collector call. Makes sense. I'll open another PR/issue and adjust this one |
8465ee1
to
8c3c08c
Compare
…ionWithException doesn't actually test reduce exception
Rebased and updated. |
Integrated, thanks Vladimir! |
@vblagoje Don't forget to link the pull request to the issue in JIRA! |
Master only. Will should review. See https://issues.jboss.org/browse/ISPN-4544 for more details