
FLINK-3179 Combiner is not injected if Reduce or GroupReduce input is explicitly partitioned (Ram) #1553

Closed
wants to merge 4 commits

Conversation

ramkrish86
Contributor

Followed the guidance given in the description in order to fix this. Is the approach correct here? I am also using this to learn the code.
Once we see that a Partition node is the input of a Reduce or GroupReduce node, we try to inject the combiner after the source node (the data source node), and the reducer node takes the actual Partition node as its input.
So now the structure would be DataSource->Combine->Partition->Reduce.
Suggestions and feedback are welcome, as I am not sure if I have covered all the cases here.

@ramkrish86
Contributor Author

Also ensured that the related test cases pass and that the WordCount program output with and without partitioning remains the same.

@fhueske
Contributor

fhueske commented Jan 28, 2016

Thanks for the PR!
I'll have a look at it and give feedback hopefully today or tomorrow.

@@ -102,36 +107,72 @@ public SingleInputPlanNode instantiate(Channel in, SingleInputNode node) {
DriverStrategy.SORTED_GROUP_REDUCE, this.keyList);
} else {
// non forward case. all local properties are killed anyways, so we can safely plug in a combiner

The else branch will not be entered if the GroupReduce's predecessor is a Partition operator.
You need to add an if else branch to the condition.

@fhueske
Contributor

fhueske commented Jan 29, 2016

You identified the right classes and methods for the fix, but the place within the method is not correct. Let me explain the issue.

In the common case as for example in a WordCount program, the operator order looks like this:

[Map] --hash-partition--> [Reduce]

in this case, a combiner will be appended to the Map to reduce the data before it is partitioned over the network. This looks like:

[Map] --local-fwd--> [Combine] --hash-partition--> [Reduce]

In some cases, Flink knows that the data is already appropriately partitioned (e.g. after a join):

[Join] --local-fwd--> [Reduce]

in this case, the data is already local and no combiner needs to be injected. The check is based on the shipping strategy of the input channel (this is the if case in instantiate()).

In case of an explicit partition operator, the operators look as follows:

[Map] --partition--> [Partition] --local-fwd--> [Reduce]

hence, the code enters the if case, because the input shipping strategy is FORWARD.
Instead we would like to inject a combiner between Map and Partition as follows:

[Map] --local-fwd--> [Combine] --partition--> [Partition] --local-fwd--> [Reduce]

Hence, we should adapt the condition to inject a combiner if the input strategy of Reduce is FORWARD and the input operator is a PartitionNode.

We should add appropriate tests for this feature. I suggest:

  • a unit test case in GroupReduceCompilationTest
  • a unit test case in ReduceCompilationTest
  • an end-to-end integration test in javaApiOperators.GroupReduceITCase
  • an end-to-end integration test in javaApiOperators.ReduceITCase
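Why injecting a combiner in front of the partitioning is safe for a key-based reduce can be illustrated with a tiny plain-Java simulation (illustrative names, not Flink API): summing counts per key gives the same result whether or not a local pre-aggregation runs first, because the aggregation is associative and keyed.

```java
import java.util.*;

public class CombinerSketch {

    // Sum per key -- this is what both the combiner and the final reducer do
    // in a WordCount-style job (hypothetical helper, not Flink API).
    static Map<String, Integer> reduceByKey(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> out = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            out.merge(p.getKey(), p.getValue(), Integer::sum);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> mapOutput = List.of(
                Map.entry("a", 1), Map.entry("b", 1), Map.entry("a", 1), Map.entry("a", 1));

        // Plan without a combiner: ship all pairs, reduce once.
        Map<String, Integer> direct = reduceByKey(mapOutput);

        // Plan with a combiner: pre-aggregate locally before shipping,
        // then reduce the (smaller) combined stream again.
        List<Map.Entry<String, Integer>> combined = new ArrayList<>(reduceByKey(mapOutput).entrySet());
        Map<String, Integer> twoPhase = reduceByKey(combined);

        // Same result either way; the combined plan just ships fewer records.
        System.out.println(direct.equals(twoPhase));
    }
}
```

This is only a sketch of the semantics; the actual injection happens in the optimizer's `instantiate()` method as discussed above.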

@ramkrish86
Contributor Author

Thank you very much for the feedback. Let me try to understand this better and update the PR soon. I will reach out here in case of any questions or doubts. Thanks a lot.

@ramkrish86
Contributor Author

I went through the code. In both cases of the WordCount program, with and without explicit partitioning:
[Map] --hash-partition--> [Reduce]
[Map] --partition--> [Partition] --local-fwd--> [Reduce]
I see that it goes to the 'else' part of the GroupReduceWithCombineProperties#instantiate() method. I debugged once again for both cases.

> hence, the code enters the if case, because the input shipping strategy is FORWARD.

So with an explicit partitioner I don't see that happening. The input shipping strategy of the Partition node seems to be PARTITION_HASH.
So what am I missing here?

@fhueske
Contributor

fhueske commented Feb 1, 2016

The GroupReduceWithCombineProperties.instantiate() method checks the shipping strategy of the input channel. In the WordCount example without an explicit partition operator, the shipping strategy is PARTITION_HASH and the else branch injects a combiner. If you add an explicit partition operator, the input shipping strategy of the Reduce operator is FORWARD, so the if branch is executed and does not add a combiner.

Hence, the logic has to go into the if branch and not into the else branch. Or, even better, add an additional condition to the if case, !(in.getSource().getOptimizerNode() instanceof PartitionNode), and add an if else branch to handle the special case of the explicit partition operator.

@tillrohrmann
Contributor

Might be a stupid question, but what if the partitioner depends on the number of elements? E.g., if you use partitionCustom with a Partitioner that internally counts the elements and assigns the output channel number based on this count. In such a case, a combiner would change the result.
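Till's concern can be made concrete with a small plain-Java sketch (hypothetical names, not Flink API): a stateful, count-based partitioner routes elements by how many it has seen, not by key, so pre-aggregating with a combiner changes which channel a key lands on.

```java
import java.util.*;

public class CountingPartitionerSketch {

    // Hypothetical partitioner in the spirit of Till's example: it ignores the
    // key and assigns channels round-robin based on an internal element count.
    static class CountingPartitioner {
        private int count = 0;
        int partition(String key, int numPartitions) {
            return (count++) % numPartitions;
        }
    }

    // Route each element to a channel using the counting partitioner.
    static List<List<String>> route(List<String> elements, int numPartitions) {
        CountingPartitioner p = new CountingPartitioner();
        List<List<String>> channels = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) channels.add(new ArrayList<>());
        for (String e : elements) channels.get(p.partition(e, numPartitions)).add(e);
        return channels;
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("a", "a", "b");  // before combining
        List<String> combined = Arrays.asList("a", "b");  // after a combiner merges the two "a"s

        // The same key ends up on different channels depending on whether a
        // combiner ran first, so injecting one would change the result.
        System.out.println(route(raw, 2));       // [[a, b], [a]]
        System.out.println(route(combined, 2));  // [[a], [b]]
    }
}
```

As Fabian notes below, such a partitioner does not partition by the key attribute and therefore cannot feed a Reduce or GroupReduce anyway.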

@fhueske
Contributor

fhueske commented Feb 11, 2016

If a Partitioner is implemented such that it does not partition based on the key attribute, it cannot be used for a Reduce or GroupReduce transformation anyway. Also, users should expect that a combiner is applied if a ReduceFunction or a GroupReduceFunction that implements a combine interface is used.

@ramkrish86
Contributor Author

@fhueske
I found the mistake I was making. My bad. I was not applying the partition function on the key, i.e. the String part returned from the flatMap, and hence the flow was always going to the 'else' case. Now that I understand the issue, I think I can update this PR shortly. Thanks for the patient review.

@fhueske
Contributor

fhueske commented Feb 12, 2016

@ramkrish86, no worries :-)
I guess the issue description lacked a bit of detail. Flink's optimizer checks if the partitioning produced by the explicit partitioning operator (hash, range, custom) can be reused for the Reduce. If not, the data is partitioned again, and this time the combiner can be applied, since this is the regular case.

Thanks for working on this.

@ramkrish86
Contributor Author

A new push has been submitted, FYI @fhueske.

@@ -66,7 +66,7 @@ public static void main(String[] args) throws Exception {

DataSet<Tuple2<String, Integer>> counts =
// split up the lines in pairs (2-tuples) containing: (word,1)
-text.flatMap(new Tokenizer())
+(text.flatMap(new Tokenizer()))

Unnecessary change

@fhueske
Contributor

fhueske commented Feb 17, 2016

Hi @ramkrish86, thanks for the update.
In addition to my inline comments, we also need to extend the ReduceITCase.

Also we must take care of the case where the result of the partition operator goes into more than one function. Consider the following case:

                                     /--fwd--> [Reduce]
[Input] --shuffle--> [Partitioner] -<
                                     \--fwd--> [Map]

which should be translated to:

           /--fwd--> [Combine] --shuffle--> [Partitioner] --fwd--> [Reduce]
[Input] --<
           \--shuffle--> [Partitioner] --fwd--> [Map]

Both translation tests need to be extended to cover this case.

Thanks, Fabian

@ramkrish86
Contributor Author

@fhueske
Could you give some examples for the above use case where the partition is an input to more than one function?

@fhueske
Contributor

fhueske commented Feb 29, 2016

I do not have a concrete use case in mind, but it is certainly possible to implement such a job in the DataSet API. Hence, it should be correctly translated.
You can do it, for example, like this:

DataSet<Tuple2<String, Long>> data = ...
DataSet<Tuple2<String, Long>> pData = data.partitionByHash(0);
pData.map(new SomeMapFunc())
     .output(new DiscardingOutputFormat<Tuple2<String, Integer>>());
pData.groupReduce(new SomeCombinableGReduceFunc())
     .output(new DiscardingOutputFormat<Tuple2<String, Integer>>());

@ramkrish86
Contributor Author

@fhueske
So for the above example, where the partitioned input goes to both the map and the reducer as input,
should the class AllGroupWithPartialPreGroupProperties be changed like GroupReduceWithCombineProperties? I find that both have similar code and the code flow goes there as well.

@fhueske
Contributor

fhueske commented Mar 3, 2016

Sorry, I forgot a groupBy() in my example.
It should be

DataSet<Tuple2<String, Long>> data = ...
DataSet<Tuple2<String, Long>> pData = data.partitionByHash(0);
pData.map(new SomeMapFunc())
     .output(new DiscardingOutputFormat<Tuple2<String, Integer>>());
pData.groupBy(0)
     .groupReduce(new SomeCombinableGReduceFunc())
     .output(new DiscardingOutputFormat<Tuple2<String, Integer>>());

@ramkrish86
Contributor Author

New PR submitted, @fhueske. Thanks for helping me through this code review. I was more of a beginner here, and there is a lot to learn on my side.

return new SingleInputPlanNode(node, "Reduce ("+node.getOperator().getName()+")",
toReducer, DriverStrategy.SORTED_GROUP_REDUCE, this.keyList);
}
}

private SingleInputPlanNode injectCombinerBeforPartitioner(Channel in, SingleInputNode node) {

Typo in method name injectCombinerBeforePartitioner


Please use meaningful parameter name: in -> toReducer, node -> reduceNode

@fhueske
Contributor

fhueske commented Mar 7, 2016

Hi @ramkrish86,
I just realized that the approach taken here might not work. We are modifying the plan while it is being enumerated. There might be cases where this leads to compiler errors or wrong plans. I have to check which side effects the plan modification might have.

I would suggest we put this PR on ice for a few days while I check whether it is possible to continue or if we have to find another approach.

@ramkrish86
Contributor Author

@fhueske
I was looking into the comments; I can address the refactoring by creating a new patch altogether. But seeing the last comment, I think I can hold this off for some time. I felt sad, as I wanted to bring this to closure, but you are the expert, so I will adhere to your words here. No problem. I will check if I can take up some other JIRA, or do you have any suggestions until then?

@fhueske
Contributor

fhueske commented Mar 8, 2016

Hi @ramkrish86, I totally understand that you are disappointed. I'm very sorry to raise these concerns this late after you put a lot of effort into this PR. I should have noticed this issue much earlier :-(

Touching the optimizer is always a little bit like open heart surgery and must be done very carefully with the whole picture in mind. I have not completely investigated the possible side effects yet, but will definitely let you know once I have.

Would you like to work on a different issue in the meantime?

@ramkrish86
Contributor Author

@fhueske
Everything is a learning experience; it is good that I got to know some of these flows through this issue. Yes, I am interested in taking up some other JIRA in the meantime.

@fhueske
Contributor

fhueske commented Mar 18, 2016

Hi @ramkrish86, I thought about this PR and came to the conclusion that we should not continue. The optimizer's design does not allow modifying operators in, or injecting operators into, enumerated subplans. This might cause invalid execution plans and, in the worst case, wrong results without anybody noticing.

I would simply log a WARN message that a combiner was not added if the optimizer identifies a Partition operator in front of a Reduce or combinable GroupReduce operator, and give a hint that an explicit combine function can be added with groupCombine in front of the partition operator.
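The workaround Fabian suggests could look roughly like this in the DataSet API (a sketch reusing the hypothetical function names from the examples above; untested, and the exact method is reduceGroup on a grouping):

```java
DataSet<Tuple2<String, Long>> data = ...
data.groupBy(0)
    // explicit pre-partition combine via groupCombine
    .combineGroup(new SomeCombinableGReduceFunc())
    .partitionByHash(0)
    .groupBy(0)
    .reduceGroup(new SomeCombinableGReduceFunc())
    .output(new DiscardingOutputFormat<Tuple2<String, Long>>());
```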

Sorry again @ramkrish86 that I led you into a dead end with this PR.

fhueske pushed a commit to fhueske/flink that referenced this pull request Mar 22, 2016
…ot added in front of PartitionOperator

This closes apache#1822
This closes apache#1553
@asfgit asfgit closed this in 76da442 Mar 22, 2016
@fhueske
Contributor

fhueske commented Mar 22, 2016

Closed this PR in favor of PR #1822

fijolekProjects pushed a commit to fijolekProjects/flink that referenced this pull request May 1, 2016
…ot added in front of PartitionOperator

This closes apache#1822
This closes apache#1553