Skip to content

Conversation

@wangpeibin713
Copy link
Contributor

What is the purpose of the change

Brief change log

  • add the class:
    • MultiFieldSumAggregator
  • add function in KeyedStream:
    • public SingleOutputStreamOperator sum(int[] positionToSums )
    • public SingleOutputStreamOperator sum(String[] fields)
  • add unit test in follow ut
    • DataStreamTest
    • AggregationFunctionTest

Verifying this change

This change added tests and can be verified as follows:

  • Added integration tests for end-to-end deployment with small data collection

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no

  • The public API, i.e., is any changed class annotated with@Public(Evolving): yes

    • org.apache.flink.streaming.api.datastream.KeyedStream
  • The serializers: no

  • The runtime per-record code paths (performance sensitive): no

  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no

  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? yes
  • If yes, how is the feature documented? not documented

wangpeibin added 2 commits February 1, 2019 16:48
… KeyedStream.sum(int[] positionToSums )

- add the MultiFieldSumAggregator  and modify KeyedStream
- add unit test
… KeyedStream.sum(int[] positionToSums )

- fix the check style problems
@wangpeibin713 wangpeibin713 changed the title [FLINK-11510] [DataStream] Add the MultiFieldSumAggregator to support KeyedStream.sum(int[] positionToSums )[FLINK-11510] [DataStream] Add the MultiFieldSumAggregator to support KeyedStream.sum(int[] positionToSums ) [FLINK-11510] [DataStream] Add the MultiFieldSumAggregator to support KeyedStream.sum(int[] positionToSums ) Feb 1, 2019
@flinkbot
Copy link
Collaborator

flinkbot commented Feb 1, 2019

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Review Progress

  • ✅ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into to Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.

Details
The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@rmetzger
Copy link
Contributor

rmetzger commented Feb 7, 2019

@flinkbot please re-render :)

[FLINK-11510] [DataStream] Add the MultiFieldSumAggregator to support KeyedStream.sum(int[] positionToSums )
@wangpeibin713
Copy link
Contributor Author

@metzger Is there anything need to update, please let me know. Looking forward to your review. Thanks very much. :)

@rmetzger
Copy link
Contributor

rmetzger commented Feb 13, 2019

Thank you for your contribution!
I've commented on the PR because the tracking comment by flinkbot was outdated.
Before I can review the pull request, I would like to get approval from somebody more familiar with the Flink API whether we have consensus to actually add this feature (see also the contribution guide: https://flink.apache.org/contribute-code.html#before-you-start-coding)

I've pinged Aljoscha in the JIRA ticket: https://issues.apache.org/jira/browse/FLINK-11510
Let's hope he soon confirms that the feature is a good fit.

@rmetzger
Copy link
Contributor

@flinkbot approve description

@aljoscha
Copy link
Contributor

@flinkbot disapprove consensus

I think this addition doesn't fit well into the general roadmap that the community has for Flink (see https://flink.apache.org/roadmap.html#analytics-applications-an-the-roles-of-datastream-dataset-and-table-api). The Table API can be used for cases where you need to do filtering, projection, and aggregation, so adding this limited functionality in the DataStream API would not make sense right now.

@wangpeibin713 I'm sorry that we only saw this only now and that you already put work into this. 😓

@wangpeibin713
Copy link
Contributor Author

limited

@flinkbot disapprove consensus

I think this addition doesn't fit well into the general roadmap that the community has for Flink (see https://flink.apache.org/roadmap.html#analytics-applications-an-the-roles-of-datastream-dataset-and-table-api). The Table API can be used for cases where you need to do filtering, projection, and aggregation, so adding this limited functionality in the DataStream API would not make sense right now.

@wangpeibin713 I'm sorry that we only saw this only now and that you already put work into this. 😓

thanks for review. I will close this issue 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants