Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A few py2 unicode fixes (master-specific) #3602

Merged
merged 3 commits into from Aug 16, 2019

Conversation

@kyleam
Copy link
Member

commented Aug 14, 2019

This series has a few py2 unicode fixes spurred by gh-3597. It relies on a fix from that PR (temporarily applied here as the tip commit), so I'll mark this PR has a draft until that gh-3597 lands in master.

kyleam added 3 commits Aug 12, 2019
BF(py2): interface: Don't assume that results are unicode
We call text_type() on the result's `path` because it may come in as a
pathlib object.  Ideally we'd work with unicode internally within IO
boundaries, but when default_result_renderer() gets the results,
_process_results() has already encoded unicode strings.  This means
our handling fails with non-ascii paths on Python 2:

    $ datalad create Δ0
    [ERROR ] 'ascii' codec can't decode byte 0xce in position 16:
    ordinal not in range(128)
    [utils.py:default_result_renderer:506] (UnicodeDecodeError)

Avoid the above error by ensuring we use bytes for `path` on Python 2.

Note that the added test is oddly parametrized with one case because
the next commit will add another case.

Re: #3597
BF(py2): status/save: Fix handling of non-ascii paths
Avoid a unicode error when passing non-ascii paths on Python 2:

    $ datalad save β0
    [ERROR ] 'ascii' codec can't decode byte 0xce in position 0:
    ordinal not in range(128)
    [status.py:__call__:292] (UnicodeDecodeError)
BF(py2): create: Handle non-ascii dataset argument
The previous commit fixed a similar failure when the non-ascii dataset
comes in through PATH, but we still fail if it is given to --dataset:

    $ datalad create -d Δ1
    [ERROR ] 'ascii' codec can't decode byte 0xce in position 16:
    ordinal not in range(128)
    [create.py:__call__:209] (UnicodeDecodeError)

Fix this by converting the incoming dataset to unicode.

@kyleam kyleam force-pushed the kyleam:more-py2-unicode-fixes branch from b4499fc to 452976b Aug 15, 2019

@codecov

This comment has been minimized.

Copy link

commented Aug 15, 2019

Codecov Report

Merging #3602 into master will decrease coverage by 0.67%.
The diff coverage is 90.9%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3602      +/-   ##
==========================================
- Coverage   76.94%   76.27%   -0.68%     
==========================================
  Files         269      269              
  Lines       35413    35419       +6     
==========================================
- Hits        27248    27015     -233     
- Misses       8165     8404     +239
Impacted Files Coverage Δ
datalad/core/local/create.py 78.94% <100%> (-0.45%) ⬇️
datalad/core/local/tests/test_save.py 97.34% <100%> (-0.49%) ⬇️
datalad/core/local/tests/test_create.py 100% <100%> (ø) ⬆️
datalad/core/local/status.py 73.26% <100%> (ø) ⬆️
datalad/distribution/dataset.py 82.8% <66.66%> (ø) ⬆️
datalad/interface/utils.py 85.07% <66.66%> (-0.48%) ⬇️
datalad/support/status.py 33.33% <0%> (-57.58%) ⬇️
datalad/support/archives.py 61.62% <0%> (-22.49%) ⬇️
datalad/log.py 47.11% <0%> (-19.72%) ⬇️
datalad/customremotes/base.py 67.31% <0%> (-11.01%) ⬇️
... and 32 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update dc875d8...452976b. Read the comment docs.

@kyleam kyleam marked this pull request as ready for review Aug 15, 2019

@kyleam

This comment has been minimized.

Copy link
Member Author

commented Aug 15, 2019

gh-3597 has made its way to master. I've updated this PR to drop the temporary commit.

range-diff
1:  77a0a44df = 1:  b403c1401 BF(py2): interface: Don't assume that results are unicode
2:  a1ff37e76 = 2:  f613ef146 BF(py2): create: Handle non-ascii dataset argument
3:  7dd1b9a8b = 3:  452976b62 BF(py2): status/save: Fix handling of non-ascii paths
4:  b4499fc18 < -:  --------- [drop] require_dataset() fix from gh-3597
@yarikoptic

This comment has been minimized.

Copy link
Member

commented Aug 16, 2019

oh green Travis, I thought we lost you forever! ;-) Thanks @kyleam

@yarikoptic yarikoptic merged commit d6b34f5 into datalad:master Aug 16, 2019

2 of 3 checks passed

continuous-integration/appveyor/pr AppVeyor build failed
Details
WIP Ready for review
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@kyleam kyleam deleted the kyleam:more-py2-unicode-fixes branch Aug 16, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.