Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BEAM-7746] Add typing for try_split #10593

Merged
merged 1 commit into from Feb 7, 2020

Conversation

chadrik
Copy link
Contributor

@chadrik chadrik commented Jan 15, 2020

This is a followup to #9056.

R: @robertwb
R: @udim


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Choose reviewer(s) and mention them in a comment (R: @username).
  • Format the pull request title like [BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replace BEAM-XXX with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make review process smoother.

Post-Commit Tests Status (on master branch)

Lang SDK Apex Dataflow Flink Gearpump Samza Spark
Go Build Status --- --- Build Status --- --- Build Status
Java Build Status Build Status Build Status Build Status
Build Status
Build Status
Build Status Build Status Build Status
Build Status
Build Status
Python Build Status
Build Status
Build Status
Build Status
--- Build Status
Build Status
Build Status
Build Status
--- --- Build Status
XLang --- --- --- Build Status --- --- ---

Pre-Commit Tests Status (on master branch)

--- Java Python Go Website
Non-portable Build Status Build Status
Build Status
Build Status Build Status
Portable --- Build Status --- ---

See .test-infra/jenkins/README for trigger phrase, status and link of all Jenkins jobs.

@chadrik chadrik force-pushed the python-typing-try-split branch 3 times, most recently from cafbbf5 to 6abebd9 Compare January 15, 2020 17:49
@chadrik chadrik requested review from robertwb and udim January 15, 2020 23:17
@@ -61,6 +61,9 @@
from apache_beam.transforms import sideinputs
from apache_beam.transforms.core import TimerSpec

SplitResultType = Tuple[
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels like it's time to make a named tuple here...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it would help a bit. I searched around for places where we access individual elements of this tuple, and there aren't many cases. Here's the one I found:

  def bundle_application(self,
                         op,  # type: operations.DoOperation
                         primary  # type: common.SplitResultType
                        ):
    ((element_and_restriction, output_watermark),
     _) = primary
    return self.construct_bundle_application(
        op, output_watermark, element_and_restriction)

What name would you give to each item?

SplitResult = NamedTuple(
    'SplitResult', [
        ('???', Tuple[WindowedValue, Optional[Timestamp]]), 
        ('???', Optional[Timestamp])
    ])

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe WatermarkedElement and CallbackTimestamp? Talking to @boyuanzz this code is actively being developed, so it might make sense to hold off a bit. In particular, the splits are not entirely symmetric (e.g. a primary only ever has the WindowedValue, while the Residual is more properly a Tuple[WindowedValue, Optiona[Timestamp], Optional[Timestamp]].

@@ -205,16 +205,17 @@ def process_encoded(self, encoded_windowed_values):
self.output(decoded_value)

def try_split(self, fraction_of_remainder, total_buffer_size):
# type: (...) -> Optional[Tuple[int, Optional[Tuple[operations.DoOperation, common.SplitResultType]], Optional[Tuple[operations.DoOperation, common.SplitResultType]], int]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some named tuples, or even a named type, could help for Optional[Tuple[operations.DoOperation, common.SplitResultType]]

@boyuanzz
Copy link
Contributor

Hey, is there any plan to merge this PR?

@chadrik
Copy link
Contributor Author

chadrik commented Jan 30, 2020

I've been meaning to check in on this. @robertwb had some concerns:

Talking to @boyuanzz this code is actively being developed, so it might make sense to hold off a bit. In particular, the splits are not entirely symmetric (e.g. a primary only ever has the WindowedValue, while the Residual is more properly a Tuple[WindowedValue, Optiona[Timestamp], Optional[Timestamp]].

@boyuanzz I'd love to get this merged and have you incorporate typing into your changes, but it may take a bit of effort on your part to properly annotate your changes, depending on how familiar you are with mypy and typing.

@robertwb
Copy link
Contributor

robertwb commented Feb 6, 2020

OK, I'm going to merge this and let @boyuanzz build on top of it.

@chadrik
Copy link
Contributor Author

chadrik commented Feb 6, 2020

Fantastic. That's what I was hoping for. Once some of the outstanding typing PRs get merged the noise in the mypy output should be greatly reduced, so that should make things easier.

@kamilwu
Copy link
Contributor

kamilwu commented Feb 6, 2020

Just to let you know that we've just introduced Python autoformatter. Your merge conflict might be a result of this.
Here you can find an instruction on how to run autoformatter: https://cwiki.apache.org/confluence/display/BEAM/Python+Tips, section Formatting.
Sorry for inconvenience.

@boyuanzz
Copy link
Contributor

boyuanzz commented Feb 6, 2020

Hi Chad, do you happen to have a chance to resolve these file conflicts? Once these conflicts are resolved, I can help you to merge it.

@chadrik
Copy link
Contributor Author

chadrik commented Feb 7, 2020

done!

@boyuanzz
Copy link
Contributor

boyuanzz commented Feb 7, 2020

Thank you, Chad! I'm going to merge it.

@boyuanzz boyuanzz merged commit a9e6d21 into apache:master Feb 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants