Commit
[FLINK-5890] [gelly] GatherSumApply broken when object reuse enabled
The initial fix for this ticket is not working on larger data sets. Reduce supports returning the left input, right input, a new object, or a locally reused object. The trouble with the initial fix was that the returned local object was reusing fields from the input tuples.

The problem is with ReduceDriver#run managing two values (reuse1 and reuse2) and with a third, local value returned by GatherSumApplyIteration.SumUDF. After the first grouping, value.f1 == reuse1.f1. Following UDF calls may swap value.f1 and reuse2.f1, which causes reuse1.f1 == reuse2.f1. With an odd number of swaps, the next grouping will reduce with reuse1 and reuse2 sharing a field, and deserialization will overwrite stored values.

The simple fix is to only use and return the provided inputs.

This closes apache#3515
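The idea of "only use and return the provided inputs" can be sketched as below. This is a minimal illustration and not the code from the commit: `ReuseSafeSum`, `Pair` (a stand-in for Flink's mutable `Tuple2`), and the `BinaryOperator` parameter (a stand-in for the user's sum function) are all hypothetical names, and no Flink classes are used.

```java
import java.util.function.BinaryOperator;

// Hypothetical sketch; not the actual GatherSumApplyIteration.SumUDF.
final class ReuseSafeSum {

    /** Stand-in for a mutable two-field tuple such as Flink's Tuple2. */
    static final class Pair<K, M> {
        K f0;
        M f1;

        Pair(K f0, M f1) {
            this.f0 = f0;
            this.f1 = f1;
        }
    }

    /**
     * Combines two inputs without a third, local tuple. The result is
     * stored in and returned through the left input, so the caller only
     * ever sees the two objects it passed in.
     */
    static <K, M> Pair<K, M> reduce(Pair<K, M> left, Pair<K, M> right,
                                    BinaryOperator<M> sum) {
        M result = sum.apply(left.f1, right.f1);

        if (result == right.f1) {
            // The sum function handed back the right value object.
            // Swap the fields so the two tuples never share one.
            right.f1 = left.f1;
            left.f1 = result;
        } else {
            // The result is the left value or a fresh object.
            left.f1 = result;
        }
        return left;
    }
}
```

In this sketch, whichever object the sum function returns, the two tuples end up holding distinct field objects, so a later deserialization into one tuple cannot overwrite a value held by the other.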