New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BEAM-5317] Finish Python3 porting for options module #6397
Conversation
Retest this please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR!
I left some comments on the changes.
sdks/python/py3-lint/pyvenv.cfg
Outdated
@@ -0,0 +1,3 @@ | |||
home = /Users/doriadong/miniconda3/bin | |||
include-system-site-packages = false | |||
version = 3.6.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like this file was pushed accidentally.
We are aiming for Python 3.5 compatibility, which is also running on Jenkins. There are some changes in 3.6 related to the bytes type, which might cause incompatibilities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops
@@ -329,7 +330,11 @@ def get_validator(matcher): | |||
] | |||
|
|||
for case in test_case: | |||
errors = get_validator(case['on_success_matcher']).validate() | |||
matcher = case['on_success_matcher'] | |||
if matcher and sys.version_info[0] >= 3 and type(matcher) is bytes: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need the version check here?
I think we can only check for bytes and decode on both Python 2 and 3.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was not sure but let me try
R: @tvalentyn |
LGTM! |
@@ -329,7 +329,11 @@ def get_validator(matcher): | |||
] | |||
|
|||
for case in test_case: | |||
errors = get_validator(case['on_success_matcher']).validate() | |||
matcher = case['on_success_matcher'] | |||
if matcher and type(matcher) is bytes: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason for not using isinstance
here?
errors = get_validator(case['on_success_matcher']).validate() | ||
matcher = case['on_success_matcher'] | ||
if matcher and type(matcher) is bytes: | ||
errors = get_validator(matcher.decode('utf-8')).validate() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure you want to decode the bytes here? Looking at test_case
, if the matcher type is bytes, then the content is almost surely not decodeable using UTF-8:
>>> import pickle
>>> matcher = pickle.dumps(object)
>>> matcher
b'\x80\x03cbuiltins\nobject\nq\x00.'
>>> matcher.decode("utf-8")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we also need to fix the code that is tested by this test. It looks like we are using a pipeline option
'--on_success_matcher', |
options.append('--%s=%s' % (k, pickler.dumps(v))) |
This test is running into the same problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@superbobry notice that we are are using a wrapper around pickle:
we base64-encode the results so that we can put them in a JSON objects. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I would instead change line 315 to:
options.append('%s=%s' % ('--on_success_matcher=' , matcher.decode()))
line 155 in test_pipeline.py to:
options.append('--%s=%s' % (k, pickler.dumps(v).decode()))
and change abc
to b'abc'
in one of the test cases.
What do you all think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, that looks better and more consistent
80afc02
to
b0497e2
Compare
@@ -311,7 +311,7 @@ def get_validator(matcher): | |||
'--staging_location=gs://foo/bar', | |||
'--temp_location=gs://foo/bar',] | |||
if matcher: | |||
options.append('--on_success_matcher=' + matcher) | |||
options.append('%s=%s' % ('--on_success_matcher=', matcher.decode())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extra =
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After --on_success_matcher
. Also please squash the previous commits.
b0497e2
to
a0a98d1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks a lot for helping with Python 3, @manuzhang !
This is is part of a series of PRs with goal to make Apache Beam PY3 compatible as per the proposal
Follow this checklist to help us incorporate your contribution quickly and easily:
[BEAM-XXX] Fixes bug in ApproximateQuantiles
, where you replaceBEAM-XXX
with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.It will help us expedite review of your Pull Request if you tag someone (e.g.
@username
) to look at it.Post-Commit Tests Status (on master branch)