-
Notifications
You must be signed in to change notification settings - Fork 13.8k
[FLINK-26477][python] Support WindowedStream.aggregate in Python DataStream API #19176
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
dianfu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Vancior Thanks a lot for the PR. Have left a few comments.
| .aggregate(new AverageAggregate) | ||
| ``` | ||
| {{< /tab >}} | ||
| {{< tab "Python" >}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should also update the chinese doc: docs.zh/content/docs/dev/datastream/operators/windows.md
| ```python | ||
| class ProcessWindowFunction(Function, Generic[IN, OUT, KEY, W]): | ||
|
|
||
| @abstractmethod |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method clear is missing. Should also update the Java/Scala example.
| input \ | ||
| .key_by(<key selector>) \ | ||
| .window(<window assigner>) \ | ||
| .reduce(lambda v1, v2: (v1[0], v1[1] + v2[1]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation is incorrect. It doesn't match the Java/Scala example, also doesn't match the description of this section: "The following example shows how an incremental ReduceFunction can be combined with a ProcessWindowFunction to return the smallest event in a window along with the start time of the window."
| input \ | ||
| .key_by(<key selector>) \ | ||
| .window(<window assigner>) \ | ||
| .apply(new MyWindowFunction()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| .apply(new MyWindowFunction()) | |
| .apply(MyWindowFunction()) |
| aggregation function. | ||
| :param result_type: Type information for the result type of the window function. | ||
| :return: The data stream that is the result of applying the window function to the window. | ||
| """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.. versionadded:: 1.16.0
| aggregate_function: AggregateFunction, | ||
| window_function: Union[WindowFunction, ProcessWindowFunction] = None, | ||
| accumulator_type: TypeInformation = None, | ||
| result_type: TypeInformation = None) -> DataStream: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| result_type: TypeInformation = None) -> DataStream: | |
| output_type: TypeInformation = None) -> DataStream: |
Keep it consistent with the other methods.
| Arriving data is incrementally aggregated using the given aggregate function. This means | ||
| that the window function typically has only a single value to process when called. | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about showing a simple example about this API?
68fab99 to
17733c3
Compare
|
@flinkbot run azure |
dianfu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
What is the purpose of the change
Support
WindowedStream.aggregatein Python DataStream API, which is an alignment to the Java API, with doc updated.Verifying this change
This change added tests and can be verified as follows:
WindowedStream.aggregateAPIDoes this pull request potentially affect one of the following parts:
@Public(Evolving): (yes)Documentation