Time sort with offset/fetch without retraction #4380

rtudoran · 2017-07-20T15:44:11Z

Thanks for contributing to Apache Flink. Before you open your pull request, please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your pull request. For more information and/or questions please refer to the How To Contribute guide.
In addition to going through the list, please provide a meaningful description of your changes.

[x] General
- The pull request references the related JIRA issue ("[FLINK-XXX] Jira title text")
- The pull request addresses only one issue
- Each commit in the PR has a meaningful commit message (including the JIRA id)
[x] Documentation
- Documentation has been added for new functionality
- Old documentation affected by the pull request has been updated
- JavaDoc for public methods has been added
[x] Tests & Build
- Functionality added by the pull request is covered by tests
- mvn clean verify has been executed successfully locally or a Travis build has passed

rtudoran · 2017-07-20T15:45:59Z

@fhueske, @wuchong
As mentioned in PR #4263, I re-implemented the offset/fetch support for *time sort without retraction. The changes are in this PR. They should be easy to follow, as they are similar to what we already have.
Please let me know what you think.

fhueske · 2017-07-20T16:28:44Z

Hi @rtudoran, thanks for updating the PR.

I had a brief look at it and as I said before, I don't think we need additional ProcessFunctions for any of the ORDER BY *time ASC OFFSET x FETCH y cases. All cases (proctime and rowtime) can be implemented by extending the current functions for ORDER BY.

All we have to do is:

- add a counter for OFFSET and not emit the first x rows (counter should be stored as state)
- add a counter for FETCH and not emit more than y rows (counter should be stored as state).

If OFFSET or FETCH are not required, we can set x and/or y to -1 and ignore the counters.
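
A minimal sketch of the two counters, in plain Java and outside of any Flink API. The class `OffsetFetchLimiter` and its method names are hypothetical and only illustrate the idea. In an actual ProcessFunction the two counters would be kept in managed state rather than in plain fields.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Flink): applies OFFSET x / FETCH y to rows that arrive in sorted order. */
final class OffsetFetchLimiter {
    private final long offset; // -1 means "no OFFSET"
    private final long fetch;  // -1 means "no FETCH"

    // Assumption: in a real operator these counters would live in managed state.
    private long skipped = 0;
    private long emitted = 0;

    OffsetFetchLimiter(long offset, long fetch) {
        this.offset = offset;
        this.fetch = fetch;
    }

    /** Decides whether the next row is emitted and updates the counters. */
    boolean shouldEmit() {
        if (offset > 0 && skipped < offset) {
            skipped++;          // still inside the OFFSET prefix
            return false;
        }
        if (fetch >= 0 && emitted >= fetch) {
            return false;       // FETCH limit already reached
        }
        emitted++;
        return true;
    }

    /** Applies the limiter to one batch of already sorted rows. */
    <T> List<T> filter(List<T> sortedRows) {
        List<T> out = new ArrayList<>();
        for (T row : sortedRows) {
            if (shouldEmit()) {
                out.add(row);
            }
        }
        return out;
    }
}
```

With offset 2 and fetch 2, a first batch (1, 2, 3, 4, 5) yields (3, 4), and any later batch yields nothing because the counters carry over. With both values set to -1 every row passes through.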

Once we have extended the current ProcessFunctions, we can also remove the additional methods in DataStreamSort and SortUtil because we always translate to the same ProcessFunction just with different values for offset and fetch. So we only have to extend the current methods in DataStreamSort and SortUtil to set the correct values for offset and fetch (either the values from the query or -1 if not used) to the updated ProcessFunctions.

That should simplify the PR and require far fewer changes and less code.

Let me know what you think,
Fabian

rtudoran · 2017-07-21T11:48:44Z

@fhueske
I did not initially understand that this was your suggestion. What you propose has the advantage of being easy to maintain (as we consolidate the whole functionality) and the slight disadvantage of a couple of useless checks in some scenarios (e.g., you would still check the fetch condition even if you had only an offset, or just the plain sort). If this tiny performance price is OK, then clearly we can consolidate the implementation.

Nevertheless, even in that case I would still propose that we keep the ProcTimeIdentitySortProcessFunction.
This is for the scenario where the ORDER BY is only on the time attribute (and no other field). For the simple sort we only had an identity map function to pass the events through. For offset/fetch we can either extend that or keep a separate implementation that adds the state counters for offset/fetch. I propose we keep the latter implementation.

rtudoran · 2017-07-21T11:58:16Z

@fhueske
Also, I do not understand why you would need to keep the counters for offset/fetch as state.

Assume the buffer state holds the events for proctime T, with values (1, 2, 3, 4, 5).
You want to emit them with offset 2 and fetch 2 (hence values 3 and 4).

So the onTimer function fires once proctime has moved on, and you can trigger the computation for time T (i.e., at T+1).

The basic logic after sorting is to go through the 5 elements and count the offset and then the fetch:

    for (int i = 0; i < inputs.size(); i++) {
        offsetCounter++;
        // emit only after the first `offset` rows, and at most `fetch` rows
        if (offsetCounter > offset && fetchCounter < fetch) {
            out.collect(inputs.get(i));
            fetchCounter++;
        }
    }

...you would then update the state at the end.
What is the point of persisting fetchCounter and offsetCounter here?
If a failure happens in the meantime, you would restore the whole list of 5 elements anyway and restart the logic (i.e., from the beginning of the function).
It is not as if you update the state at every iteration so that, after a failure, you could resume from, say, line 5 with the counter at a certain value.

rtudoran · 2017-07-21T15:09:07Z

@fhueske, @wuchong
I updated the PR. Please have a look.

fhueske

Hi @rtudoran,

thanks for updating the PR. I had a look at the sort functions and noticed that the OFFSET and FETCH semantics are not correct. In SQL both are global limits for the complete query result. If a query specifies ORDER BY ... FETCH x ROWS ONLY, then the query must emit exactly x rows (given that the result has at least x rows) and not x rows for the same sort key. We want to have the same SQL semantics as for batch execution, so the operators and tests need to be adjusted to the correct semantics.
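
To illustrate the difference, here is a small sketch in plain Java (no Flink API involved). The class `FetchSemantics` and both method names are made up for this example. Each inner list stands for the rows that share one timestamp, already in sort order.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of global vs. per-timestamp FETCH (not Flink code). */
final class FetchSemantics {

    /** Global semantics: at most `fetch` rows over the complete result. */
    static List<Integer> globalFetch(List<List<Integer>> groups, int fetch) {
        List<Integer> out = new ArrayList<>();
        int emitted = 0; // one counter that survives across all groups
        for (List<Integer> group : groups) {
            for (Integer row : group) {
                if (emitted < fetch) {
                    out.add(row);
                    emitted++;
                }
            }
        }
        return out;
    }

    /** Per-group semantics: the counter restarts for every timestamp. */
    static List<Integer> perGroupFetch(List<List<Integer>> groups, int fetch) {
        List<Integer> out = new ArrayList<>();
        for (List<Integer> group : groups) {
            int emitted = 0; // counter is reset for each group
            for (Integer row : group) {
                if (emitted < fetch) {
                    out.add(row);
                    emitted++;
                }
            }
        }
        return out;
    }
}
```

For the groups (1, 2, 3) and (4, 5, 6) with a fetch of 2, `globalFetch` returns (1, 2) while `perGroupFetch` returns (1, 2, 4, 5). Only the first matches ORDER BY ... FETCH 2 ROWS ONLY, and it requires the counter to outlive a single group.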

Also there is a lot of code duplication that can be avoided with some refactoring.

Best, Fabian

fhueske · 2017-07-27T11:56:55Z

...flink-table/src/main/scala/org/apache/flink/table/plan/nodes/datastream/DataStreamSort.scala

@@ -108,28 +108,25 @@ class DataStreamSort(
        case _ if FlinkTypeFactory.isProctimeIndicatorType(timeType)  =>
            (sortOffset, sortFetch) match {


Change the sortOffset and sortFetch member fields to Option[RexNode] to avoid null.

fhueske · 2017-07-27T12:28:38Z

...libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/aggregate/SortUtil.scala

+   * @param execCfg table environment execution configuration
+   * @return org.apache.flink.streaming.api.functions.ProcessFunction
+   */
+  private[flink] def createRowTimeSortFunctionOffset(


I think we can consolidate all sort-related methods in SortUtil into three methods:

createProcTimeNoSortFunction(..., sortOffset: Option[RexNode], sortFetch: Option[RexNode])

createProcTimeSortFunction(..., sortOffset: Option[RexNode], sortFetch: Option[RexNode])

createRowTimeSortFunction(..., sortOffset: Option[RexNode], sortFetch: Option[RexNode])

Each method handles all combinations of offset and fetch with two simple conditions to set the parameter to -1, 0, or the actual value.
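
A rough sketch of those two conditions, written in plain Java with `Optional<Long>` standing in for `Option[RexNode]` (extracting the literal from the RexNode is left out). The class `SortParams`, its methods, and the choice of 0 for a missing offset and -1 for a missing fetch are assumptions made for illustration only.

```java
import java.util.Optional;

/** Hypothetical helper (not part of SortUtil): maps optional OFFSET/FETCH values to plain longs. */
final class SortParams {
    /** Assumed marker for "no FETCH limit". */
    static final long UNBOUNDED = -1L;

    /** Assumption: a missing OFFSET is treated as 0, i.e. skip nothing. */
    static long resolveOffset(Optional<Long> offset) {
        return offset.orElse(0L);
    }

    /** Assumption: a missing FETCH is treated as -1, i.e. emit without limit. */
    static long resolveFetch(Optional<Long> fetch) {
        return fetch.orElse(UNBOUNDED);
    }
}
```

With this, `resolveOffset(Optional.empty())` gives 0, `resolveFetch(Optional.empty())` gives -1, and a present value is passed through unchanged, so a single function per time mode can cover every combination.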

fhueske · 2017-07-27T12:51:50Z