Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BEAM-7995: python PGBKCVOperation using wrong timestamp #9364

Merged
merged 3 commits into from Sep 10, 2019

Conversation

lhaiesp
Copy link
Contributor

@lhaiesp lhaiesp commented Aug 16, 2019

python PGBKCVOperation is using closed interval end which is not acceptable for window calculation. Make a change to use window.max_timestamp() which uses open interval end for the window


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Choose reviewer(s) and mention them in a comment (R: @username).
  • Format the pull request title like [BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replace BEAM-XXX with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

Post-Commit Tests Status (on master branch)

Lang SDK Apex Dataflow Flink Gearpump Samza Spark
Go Build Status --- --- Build Status --- --- Build Status
Java Build Status Build Status Build Status Build Status
Build Status
Build Status
Build Status Build Status Build Status
Build Status
Python Build Status
Build Status
Build Status
Build Status
--- Build Status
Build Status
Build Status --- --- Build Status
XLang --- --- --- Build Status --- --- ---

Pre-Commit Tests Status (on master branch)

--- Java Python Go Website
Non-portable Build Status Build Status Build Status Build Status
Portable --- Build Status --- ---

See .test-infra/jenkins/README for trigger phrase, status and link of all Jenkins jobs.

python PGBKCVOperation is using closed interval end which is not acceptable for window calculation
@lhaiesp
Copy link
Contributor Author

lhaiesp commented Aug 16, 2019

@robertwb

@@ -886,7 +886,7 @@ def output_key(self, wkey, accumulator):
if windows is 0:
self.output(_globally_windowed_value.with_value((key, value)))
else:
self.output(WindowedValue((key, value), windows[0].end, windows))
self.output(WindowedValue((key, value), windows[0].max_timestamp(), windows))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would not this make the timestamp for the element incorrect?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it any more incorrect than "end"? max_timestamp = end - 0.001

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. I read this as just 'max_timestamp()', did not realize it is the max timestamp for the window. Your change looks correct.

Referencing your email (https://lists.apache.org/thread.html/ad42c55ed0212cb18b2b29bfc3dddfb47b8cb9f3358583775d0da37b@%3Cdev.beam.apache.org%3E), this needs to be a shared definition between Python and Java. @lukecwik could you confirm that an element in a window should have a timestamp < window.end or could it be <= window.end ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change certainly fixes a bug (though I agree the docs could and should be cleaned up).

We should file a JIRA that we're not supporting more general timestamp combining fns here.

@aaltay aaltay requested a review from robertwb August 16, 2019 21:49
@aaltay
Copy link
Member

aaltay commented Aug 24, 2019

Run Python PreCommit

@lukecwik
Copy link
Member

lukecwik commented Aug 24, 2019 via email

@lhaiesp
Copy link
Contributor Author

lhaiesp commented Sep 3, 2019

Run Python PreCommit

@lhaiesp
Copy link
Contributor Author

lhaiesp commented Sep 3, 2019

Run Portable_Python PreCommit

@lhaiesp
Copy link
Contributor Author

lhaiesp commented Sep 10, 2019

@aaltay thank you so much for the lint fix. Do you happen to know whether anyone is looking into the python pre-commit failure on master branch? It seems to have been broken for a while. I plan to send an email to ask about this but just in case that you have some info.

@aaltay
Copy link
Member

aaltay commented Sep 10, 2019

Test was failing with another lint error. I fixed it in github's UI and did not run linter on it. You can check the test logs to see there are other issues.

@aaltay
Copy link
Member

aaltay commented Sep 10, 2019

Run Portable_Python PreCommit

@lhaiesp
Copy link
Contributor Author

lhaiesp commented Sep 10, 2019

@aaltay ah, thank you so much. Since now all checks have passed. Can you help merge this PR?

@aaltay aaltay merged commit 822c6b8 into apache:master Sep 10, 2019
soyrice pushed a commit to soyrice/beam that referenced this pull request Sep 19, 2019
BEAM-7995: python PGBKCVOperation using wrong timestamp (apache#9364)

python PGBKCVOperation is using closed interval end which is not acceptable for window calculation
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants