New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Always return unicode with hexdigest() #6313

Merged
merged 8 commits into from Aug 7, 2018

Conversation

Projects
None yet
3 participants
@Eric-Arellano
Copy link
Contributor

Eric-Arellano commented Aug 7, 2018

Problem

sha1.hexdigest() returns bytes in Py2, but unicode in Py3.

This inconsistency increases the complexity of trying to support Py2 and Py3. We should always return the same type regardless of interpreter.

Solution

Adopt the Python 3 semantics, i.e. always return unicode. We are trying to write Python 3 first code, so this is the preferred solution. It also requires less boilerplate with things like string interpolation.

If we really want to be returning bytes, the docs explain we should then be using digest() instead of hexdigest(), as hexdigest is designed to get a unicode string based off of digest.

--

This does not modify local variables when it does not matter.

@illicitonion
Copy link
Contributor

illicitonion left a comment

Thanks!

@stuhood

stuhood approved these changes Aug 7, 2018

@stuhood stuhood merged commit 3be90c2 into pantsbuild:master Aug 7, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@Eric-Arellano Eric-Arellano deleted the Eric-Arellano:py3-fixes_hasher branch Aug 8, 2018

CMLivingston pushed a commit to CMLivingston/pants that referenced this pull request Aug 27, 2018

Always return unicode with hexdigest() (pantsbuild#6313)
### Problem
`sha1.hexdigest()` returns bytes in Py2, but unicode in Py3. 

This inconsistency increases the complexity of trying to support Py2 and Py3. We should always return the same type regardless of interpreter.

### Solution
Adopt the Python 3 semantics, i.e. always return unicode. We are trying to write Python 3 first code, so this is the preferred solution. It also requires less boilerplate with things like string interpolation.

If we really want to be returning bytes, the [docs](https://docs.python.org/3/library/hashlib.html#hashlib.hash.hexdigest) explain we should then be using `digest()` instead of `hexdigest()`, as hexdigest is designed to get a unicode string based off of `digest`.

--

This does not modify local variables when it does not matter.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment