Convert returned binary data from call() to string #21656
This breaks non-ASCII output on Python 2, and also introduces inconsistent behaviour between 2 and 3. decode() uses the default decoding in Python 2, which is ASCII; but in Python 3, the encoding parameter defaults to utf-8. I think it's better to explicitly set encoding="utf-8" so that it behaves the same on 2 and 3 and does not break non-ASCII output.
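To make the difference concrete, here is a minimal sketch (not code from this PR; the sample string is made up for illustration) showing that UTF-8 bytes containing a non-ASCII character decode cleanly with an explicit "utf-8" codec but fail under ASCII, which is what a bare decode() would use on Python 2:

```python
# Hypothetical sample data: "café" encoded as UTF-8 bytes.
data = u"caf\u00e9".encode("utf-8")

# Explicit UTF-8 decoding behaves the same on Python 2 and Python 3.
text = data.decode("utf-8")

# ASCII decoding (the Python 2 default for a bare decode()) rejects
# the two bytes that encode the accented character.
try:
    data.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
```

Passing the codec explicitly removes the dependence on the interpreter's default.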
tools/wpt/browser.py (outdated)
@@ -289,7 +289,7 @@ def find_webdriver(self, channel=None):
         return find_executable("geckodriver")

     def get_version_and_channel(self, binary):
-        version_string = call(binary, "--version").strip()
+        version_string = call(binary, "--version").decode('utf8').strip()
I'm nervous about having to do this at every callsite, vs just making call always return text. That's a bigger change, but is there any reason to prefer this approach?
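A rough sketch of what "making call always return text" could look like, assuming call is a thin wrapper around subprocess.check_output(). The body below is a hypothetical illustration, not the project's actual helper, and it runs the current Python interpreter only so that the example is self-contained:

```python
import subprocess
import sys


def call(*args):
    """Run a command and return its output as text (hypothetical sketch)."""
    # check_output() returns bytes on Python 3; decode once here so that
    # no callsite has to remember to do it.
    return subprocess.check_output(args).decode("utf-8")


# A caller can now use ordinary string methods on the result.
output = call(sys.executable, "-c", "print('ok')").strip()
```

With the decoding in one place, the UTF-8 assumption is also stated in one place.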
No, I don't have any preference. I guess I was a bit over-cautious.
Just updated the code. Please review. Thank you!
This change has the same cause as commit 6ea4b6f.

subprocess.check_output() returns binary data. In Python 2, bytes is an alias of str, so the output can be used directly as a normal string. In Python 3, bytes and str are distinct types: inserting binary data into a string produces strings like '/some/path/with/b'binary'/data/inserted', which are not valid file paths. We must decode() the returned data before using it as a string.

This change also explicitly sets encoding="utf-8". In Python 2 the default decoding is ASCII, while in Python 3 it is UTF-8. Leaving the encoding parameter at its default would break non-ASCII output on Python 2.
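The path problem described above can be reproduced on Python 3 with a short sketch. The path template and the value of raw are illustrative placeholders taken from the example string in the commit message, not real data from the tool:

```python
# Stand-in for what subprocess.check_output() returns on Python 3.
raw = b"binary"

# Formatting bytes into a str embeds the bytes repr, b'' wrapper included.
broken = "/some/path/with/%s/data/inserted" % raw

# Decoding first yields the intended path component.
fixed = "/some/path/with/%s/data/inserted" % raw.decode("utf-8")
```

On Python 3, broken contains the literal b'binary' segment, while fixed contains plain binary.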
Assuming we only deal with UTF-8 output (which is probably a safe assumption), this LGTM. I think embedding that assumption in a single place is better than putting it in lots of places…