Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ISPN-4544 testCombinerForDistributedReduction issue #2750

Closed
wants to merge 1 commit into from

Conversation

vblagoje
Copy link

Master only. Will should review. See https://issues.jboss.org/browse/ISPN-4544 for more details

@vblagoje vblagoje changed the title ISPN-4544 BaseWordCountMapReduceTest#testCombinerForDistributedReduction... ISPN-4544 testCombinerForDistributedReduction issue Jul 26, 2014
task.combinedWith(new DivideByZeroReducer());
//In distributed execution mode combiner will not get invoked
//unless we have at least MapReduceTask#MAX_COLLECTOR_SIZE
//entries (by default 1000)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vblagoje This sounds wasteful, why insert 999 intermediary values into the intermediary cache if we could insert just one?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No no, the default collector size is 1000 so combine is never invoked for small tasks on distributed map/combine/reduce setup! If combine is never invoked we can not test this issue properly. Recall we use queue of collectors only for distributed map/reduce (large tasks) otherwise we use simpler algorithm that invokes combine always. Hence this test is a bit convoluted.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point is that it shouldn't be convoluted, we should always be invoking the combiner before sending the intermediate values to the originator (non-distributed reduce) or inserting them in the intermediate cache (distributed reduce).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We always do invoke but in distributed reduce we need 1000 entries to trigger combine. I'll see how to adjust the algorithm to invoke the combiner at least once even if we have less than 1000 entries.

@vblagoje
Copy link
Author

@danberindei ok I have in fact realized that we never invoke combine on the final collector and we should. See the diff. This resolves the issue as well

@@ -326,21 +327,24 @@ public Integer collate(Map<String, Integer> reducedResults) {
assertPartialWordCount(totalWords);
}

@Test(expectedExceptions = CacheException.class)
@Test(expectedExceptions=CacheException.class)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if having it more like what you had before would be better where you checked the exception in the test. I think verifying the nested exception is present might be best. Unfortunately I don't trust expectedException for CacheException when we wrap everything in it :(

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-100 to removing the spaces

The @Test annotation also has a expectedExceptionsMessageRegExp parameter that you could use, but I would also rather check the exception cause in the test itself.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

haha ok, will add spaces. I will go back to test the cause. Thanks guys.

@danberindei
Copy link
Member

@vblagoje The combiner not being called on the final collector should be a separate issue + commit.

@vblagoje
Copy link
Author

@danberindei ok for final collector call. Makes sense. I'll open another PR/issue and adjust this one

…ionWithException doesn't actually test reduce exception
@vblagoje
Copy link
Author

vblagoje commented Oct 7, 2014

Rebased and updated.

@danberindei
Copy link
Member

Integrated, thanks Vladimir!

@danberindei danberindei closed this Oct 8, 2014
@danberindei
Copy link
Member

@vblagoje Don't forget to link the pull request to the issue in JIRA!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants