Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-4017] [py] Add Aggregation support to Python API #2115

Closed
wants to merge 5 commits into from

Conversation

GEOFBOT
Copy link
Contributor

@GEOFBOT GEOFBOT commented Jun 16, 2016

Adds Aggregation support in the Python API accessible through .aggregate() and .agg_and(). (I was unable to use .and() in Python because 'and' is a keyword.)

Assembles and applies a GroupReduceFunction using pre-defined
AggregationOperations in Python. References to aggregations in
PythonOperationInfo and other Java classes in the Python API
removed since aggregations are now handled by Python.
@zentol
Copy link
Contributor

zentol commented Jun 20, 2016

Thank you for your contribution, I will review this tomorrow :)

:param field: The index of the Tuple field on which to perform the function.
:return: A GroupReduceOperator that represents the aggregated DataSet.
"""
child_set = self.reduce_group(aggregation(field), combinable=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.reduce_group(AggregationFunction(aggregation, field), combinable=True)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there should also be a test case for a non-grouped aggregation.

@zentol
Copy link
Contributor

zentol commented Jun 21, 2016

Looks good overall. I found 1 big issue, the rest are minor things.

@zentol
Copy link
Contributor

zentol commented Jul 1, 2016

Please write a comment when you update the PR, we don't get any notifications for pushed commits :)

@zentol
Copy link
Contributor

zentol commented Jul 1, 2016

I'll have to try it out to be sure, but i can't find a problem looking through the code.

from flink.functions.Aggregation import Sum, Min

input = # [...]
output = input.aggregate(Sum, 0).agg_and(Min, 2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

documentation wasn't updated for the rename to and_agg. The same applies to the python.md

@zentol
Copy link
Contributor

zentol commented Jul 5, 2016

Found a small issue in the documentation, otherwise +1 to merge.

@GEOFBOT
Copy link
Contributor Author

GEOFBOT commented Jul 5, 2016

I've addressed the documentation issue, thanks.

@zentol
Copy link
Contributor

zentol commented Jul 15, 2016

merging

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants