[BEAM-5321] Port transforms package to Python 3 #7104

Merged: 3 commits, Dec 7, 2018

RobbeSneyders
Contributor

This is part of a series of PRs with the goal of making Apache Beam PY3 compatible. The proposal with the outlined approach has been documented here: https://s.apache.org/beam-python-3.

This PR ports the transforms package.

The test suite currently still fails due to an error in the dill package. With this error fixed in dill itself, the test suite completes successfully. I have submitted a PR to the dill project, but the project does not seem to be very actively maintained.

Is there another way to add this fix to beam?

R: @tvalentyn @markflyhigh @robertwb

Contributor

@markflyhigh markflyhigh left a comment

How many tests are affected by the dill bug? If it is only a small number, I think we can skip them in order to enable the rest early.

if sys.version_info[0] >= 3:
  expected_msg = \
      "Type hint violation for 'CombinePerKey(MeanCombineFn)': " \
      "requires Tuple[TypeVariable[K], Union[float, int]] " \
Contributor

Why is long missing here in Python 3? A comment would help people understand the branching here.

Contributor Author

@RobbeSneyders RobbeSneyders Nov 21, 2018

In Python 3, short integers are gone and long has become int.
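
The branching under review can be illustrated with a minimal sketch. The helper name `integer_types` is hypothetical and not part of Beam; it only shows why a Python 3 expectation lists `int` alone, while a Python 2 one would also need `long`.

```python
import sys


def integer_types():
  """Return the built-in integer types of the running interpreter.

  Hypothetical helper for illustration only; not part of Beam.
  """
  if sys.version_info[0] >= 3:
    # Python 3 has a single arbitrary-precision integer type.
    return (int,)
  # Python 2 distinguishes machine-sized int from arbitrary-precision long.
  return (int, long)  # noqa: F821 (this name exists on Python 2 only)


# A value too large for a machine word is still covered by the listed types.
assert isinstance(2 ** 100, integer_types())
```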

@RobbeSneyders
Contributor Author

The tests fail while importing window_test.py, on this line.
So we can't just skip individual tests, but we can add all the other test modules from the transforms package to tox.ini separately.

@markflyhigh
Contributor

sgtm

@tvalentyn
Contributor

Let's try to find out when dill could make a release with the fix. It's possible to monkey-patch dill, but it may be brittle.

@RobbeSneyders
Contributor Author

The PR on dill has already been merged. I've asked when we can expect this to be released.
We could also use this without a release, by pip installing the commit directly.

@tvalentyn
Contributor

pip-installing the commit sounds good to me as long as we version-guard that for Python3 only, and track the cleanup in Jira.
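
A minimal sketch of such a version guard, assuming the requirement lists are plain Python lists in setup.py. The function name `select_requirements` and the `example-package` entry are hypothetical; the dill URL is the one quoted in this PR, and how pip resolves that URL is outside the scope of the sketch.

```python
import sys


def select_requirements(common, py3_only, version_info=None):
  """Return `common`, plus `py3_only` when running on Python 3.

  Hypothetical helper for illustration; not the code in this PR.
  """
  if version_info is None:
    version_info = sys.version_info
  requirements = list(common)
  if version_info[0] >= 3:
    requirements.extend(py3_only)
  return requirements


# TODO: link the tracking Jira issue here and drop the git URL once dill
# publishes a release containing the fix.
PY3_ONLY = [
    'avro-python3>=1.8.1,<2.0.0',
    'git+git://github.com/uqfoundation/dill.git',
]

# 'example-package' is a placeholder for the shared requirements.
print(select_requirements(['example-package'], PY3_ONLY))
```

Passing `version_info` explicitly keeps the guard testable on either interpreter without touching `sys.version_info`.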

 ]

 REQUIRED_PACKAGES_PY3_ONLY = [
-    'avro-python3>=1.8.1,<2.0.0'
+    'avro-python3>=1.8.1,<2.0.0',
+    'git+git://github.com/uqfoundation/dill.git'
Member

Could you add a comment here with a TODO that links to the JIRA issue?

@RobbeSneyders force-pushed the transforms branch 2 times, most recently from 88d2dbe to 403a723 (November 27, 2018 18:14)

 ]

 REQUIRED_PACKAGES_PY3_ONLY = [
-    'avro-python3>=1.8.1,<2.0.0'
+    'avro-python3>=1.8.1,<2.0.0',
Contributor

We can simplify the requirements spec with a python_version qualifier, for example:

avro-python3==1.8.2;python_version>="3.4"
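
As a sketch of how such a qualifier fits into a requirements list: the text after the semicolon is an environment marker, which pip evaluates at install time to decide whether the requirement applies. The helper `with_marker` and the marker value below are illustrative assumptions, not the spec used in Beam.

```python
def with_marker(requirement, marker):
  """Append an environment marker to a requirement string.

  Hypothetical helper for illustration only.
  """
  return '%s; %s' % (requirement, marker)


# Illustrative list: this entry would be installed on Python 3 only.
REQUIREMENTS = [
    with_marker('avro-python3>=1.8.1,<2.0.0', 'python_version >= "3.0"'),
]

print(REQUIREMENTS[0])
```

With markers, a single list can serve both interpreters, so no branching on `sys.version_info` is needed for ordinary package requirements.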

Contributor Author

This doesn't seem to work for the dill requirement. We can clean this up when we clean up the dill dependency.

@tvalentyn
Contributor

Hi, @RobbeSneyders, what's the status of this PR?

@RobbeSneyders
Contributor Author

Hi @tvalentyn, sorry for the wait.
The solution implemented in this PR, installing dill from a GitHub commit, works on my local machine but fails on Jenkins. It seems the dependencies in setup.py are not processed correctly there.
I just tried running this on a VM, where it also worked.
Any idea why this might fail on Jenkins?

@RobbeSneyders
Contributor Author

It seems that adding --process-dependency-links to the tox install_command worked, although I don't know why this is only necessary on Jenkins.

install_command = {envbindir}/python {envbindir}/pip install --process-dependency-links {opts} {packages}

@RobbeSneyders
Contributor Author

PTAL

@tvalentyn
Contributor

LGTM
