New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Beam-3208] Add Distinct PTransform to mirror Java SDK #7918
Conversation
Run Portable_Python PreCommit |
Run RAT PreCommit |
Run Portable_Python PreCommit |
42369f6
to
6ae5f5d
Compare
@ttanay is this ready for review? (There is still WIP in the title.) |
Thanks for checking this out @aaltay. |
93162ee
to
227878e
Compare
Run Python_pytest PreCommit |
Python_pytest is failing with message:
Did something change? |
Hi Tanay. The pytest thing showed up only today. I don't know where it came from, but it should be safe to ignore for now. |
@udim might have an idea about the pytest. |
Sorry, that is a test job of mine. It shouldn't be running automatically. 🤷♂️ |
Hello Tanay. Happy to work on this PR together. I like how you used the appropriate Beam deprecation marker for the old transform. That's great. I believe that this new utility that you're adding is complex, and will be easy to misuse. Furthermore, the transform is quite simple. Would you consider using a different method to do this? (e.g. subclassing distinct, or having removeduplicates expand into distinct?) - we'd love to add the least complexity possible. If you're in the ASF slack, feel free to ping me directly to chat. |
Hey @pabloem |
Hi @ttanay - this is great. Would you please squash the commits so we can merge? - This LGTM |
@pabloem Should I change the deprecation version? It is currently "2.11" here: |
Since 2.11 is released already, yeah, I'd think you want to set it to 2.12 |
Cool. I'll make the change. |
The Java SDK introduced a Distinct PTransform to replace the RemoveDuplicates PTransform. But, the Python SDK still had the RemoveDuplicates PTransform. A new Distinct PTransform was added which does the same thing as RemoveDuplicates. RemoveDuplicates is now an alias to the Distinct PTransform and is deprecated.
@pabloem Squashed commits, changed deprecation version to "2.12": |
Thanks Tanay! |
The Java SDK introduced a
Distinct
PTransform to replace theRemoveDuplicates
PTransform.But, the Python SDK still had the
RemoveDuplicates
PTransform.A new
Distinct
PTransform was added which does the same thing asRemoveDuplicates
.RemoveDuplicates
is now an alias to theDistinct
PTransform and is deprecated.JIRA Issue: https://issues.apache.org/jira/browse/BEAM-3208
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
R: @username
).[BEAM-XXX] Fixes bug in ApproximateQuantiles
, where you replaceBEAM-XXX
with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.Post-Commit Tests Status (on master branch)
See .test-infra/jenkins/README for trigger phrase, status and link of all Jenkins jobs.