BF: Make load_stream() not stumble over U2028 and friends #3524

Merged: 3 commits merged on Jul 12, 2019

@mih mih commented Jul 11, 2019

Fixes gh-3523

Sadly, we still butcher Unicode somewhere between loading it as metadata and emitting it as a result.

Here is a dataset that has run records for obtaining the metadata, plus aggregated metadata that demonstrates the butchered result: https://github.com/datalad-datasets/longnow-podcasts

@mih mih commented Jul 12, 2019

Test failure is due to singularity-hub being gone.

Until the previous commit, load_stream() tolerated content whose last
line did not end in a newline.  Restore that behavior.
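
For readers skimming this thread, here is a rough, self-contained sketch of the class of problem, not the actual datalad code. It assumes the records are JSON lines and that the breakage comes from line splitting that treats U+2028 as a line boundary, as `str.splitlines()` does. `load_lines` is a hypothetical stand-in for `load_stream()`, and the sample records are made up.

```python
import json

# Made-up sample records; the first contains U+2028 (LINE SEPARATOR).
record = {"title": "before\u2028after"}
other = {"n": 1}

# ensure_ascii=False keeps U+2028 as a raw character in the output.
# The last line deliberately has no trailing newline.
payload = (
    json.dumps(record, ensure_ascii=False)
    + "\n"
    + json.dumps(other, ensure_ascii=False)
)

# str.splitlines() also breaks on U+2028, so the first record is cut
# in two and neither fragment is valid JSON on its own.
naive = payload.splitlines()


def load_lines(text):
    """Hypothetical stand-in for load_stream(): split on "\n" only."""
    for line in text.split("\n"):
        if line.strip():
            # Skip empty fragments, so a missing or present trailing
            # newline makes no difference.
            yield json.loads(line)


result = list(load_lines(payload))
```

Splitting on `"\n"` alone keeps each record whole, and skipping empty fragments means the final record loads whether or not the content ends in a newline.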

@codecov codecov bot commented Jul 12, 2019

Codecov Report

Merging #3524 into master will decrease coverage by <.01%.
The diff coverage is 66.66%.

@@            Coverage Diff             @@
##           master    #3524      +/-   ##
==========================================
- Coverage   82.83%   82.83%   -0.01%     
==========================================
  Files         269      269              
  Lines       35032    35040       +8     
==========================================
+ Hits        29020    29025       +5     
- Misses       6012     6015       +3

Impacted Files                Coverage Δ
datalad/support/json_py.py    95.34% <66.66%> (-3.37%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 99b8189...49826a1.

@codecov codecov bot commented Jul 12, 2019

Codecov Report

Merging #3524 into master will decrease coverage by 2.06%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #3524      +/-   ##
==========================================
- Coverage   82.83%   80.77%   -2.07%     
==========================================
  Files         269      269              
  Lines       35032    35045      +13     
==========================================
- Hits        29020    28307     -713     
- Misses       6012     6738     +726

Impacted Files                           Coverage Δ
datalad/support/json_py.py               97.72% <100%> (-1%) ⬇️
datalad/support/tests/test_json_py.py    100% <100%> (ø) ⬆️
datalad/metadata/aggregate.py            14.49% <0%> (-43.79%) ⬇️
datalad/metadata/metadata.py             46.23% <0%> (-42.07%) ⬇️
datalad/metadata/search.py               35.46% <0%> (-41.14%) ⬇️
datalad/interface/save.py                25.49% <0%> (-39.87%) ⬇️
datalad/distribution/add.py              28.19% <0%> (-34.05%) ⬇️
datalad/api.py                           75.86% <0%> (-17.25%) ⬇️
datalad/interface/annotate_paths.py      65.64% <0%> (-12.98%) ⬇️
datalad/distribution/remove.py           61.41% <0%> (-10.24%) ⬇️
... and 13 more

@mih mih commented Jul 12, 2019

Thx a lot @kyleam !! My mind wanted me to do this, but the flesh was weak....

def test_load_unicode_line_separator(fname):
    # See gh-3523.
    result = list(load_stream(fname))
    print(result)

@kyleam kyleam Jul 12, 2019

Oops, stray print. Tests look good, so I'll force push to remove this and then merge.

https://ci.appveyor.com/project/mih/datalad/builds/25936222
https://travis-ci.org/datalad/datalad/builds/557837997

@kyleam kyleam merged commit c6d31ee into datalad:master Jul 12, 2019
1 of 3 checks passed
@yarikoptic yarikoptic added this to the Release 0.12.0 milestone Aug 1, 2019
@mih mih deleted the bf-linebreaks branch Sep 26, 2019