ENH: Add transparent compression to json reading/writing #17798

Merged
TomAugspurger merged 6 commits into pandas-dev:master on Oct 6, 2017

4 participants
@simongibbons
Contributor

simongibbons commented Oct 5, 2017

This works in the same way as the compression argument to read_csv and to_csv.

I've added tests confirming that it works with both file paths and S3 URLs. There will be edge cases I've missed, so please let me know if there are important ones that I should add coverage for.

The implementation is mostly plumbing, using the logic that was in place for the same functionality in read_csv.

  • closes #15644
  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry
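
As a rough illustration of the behaviour described above, the sketch below round-trips a small frame through a gzip-compressed JSON file. It assumes a pandas version that includes this change (0.21.0 or later, going by the milestone on this PR). The frame contents, the file name, and the helper name roundtrip_gzip_json are made up for the example and are not part of pandas.

```python
import os
import tempfile

import pandas as pd


def roundtrip_gzip_json(frame):
    """Write ``frame`` as gzip-compressed JSON, then read it back.

    Hypothetical helper for illustration only; not part of pandas.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "frame.json.gz")
        # Same keyword as to_csv: compress while writing.
        frame.to_json(path, compression="gzip")
        # A gzip stream starts with the two magic bytes 1f 8b.
        with open(path, "rb") as fh:
            magic = fh.read(2)
        # Same keyword as read_csv: decompress while reading.
        result = pd.read_json(path, compression="gzip")
    return magic, result


df = pd.DataFrame({"a": [1, 2, 3], "b": [0.5, 1.5, 2.5]})
magic, result = roundtrip_gzip_json(df)
```

Checking the magic bytes confirms that the file on disk really is compressed rather than plain JSON with a .gz suffix.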
@pep8speaks

pep8speaks Oct 5, 2017

Hello @simongibbons! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on October 06, 2017 at 08:18 Hours UTC

simongibbons added some commits Oct 5, 2017

ENH: Add transparent compression to json reading/writing
This works in the same way as the argument to ``read_csv``
and ``to_csv``.

I've added tests confirming that it works with file
paths, as well as file URLs and S3 URLs.
@codecov

codecov bot Oct 5, 2017

Codecov Report

Merging #17798 into master will decrease coverage by <.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #17798      +/-   ##
==========================================
- Coverage   91.24%   91.24%   -0.01%     
==========================================
  Files         163      163              
  Lines       49967    49967              
==========================================
- Hits        45593    45590       -3     
- Misses       4374     4377       +3
Flag Coverage Δ
#multiple 89.04% <ø> (+0.01%) ⬆️
#single 40.24% <ø> (-0.07%) ⬇️
Impacted Files Coverage Δ
pandas/io/json/json.py 100% <ø> (ø) ⬆️
pandas/core/generic.py 92.03% <ø> (ø) ⬆️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/frame.py 97.74% <0%> (-0.1%) ⬇️
pandas/core/indexes/datetimes.py 95.48% <0%> (-0.1%) ⬇️
pandas/io/common.py 71.61% <0%> (+2.96%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 22515f5...3ed830c. Read the comment docs.

@codecov

codecov bot Oct 5, 2017

Codecov Report

Merging #17798 into master will decrease coverage by 0.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #17798      +/-   ##
==========================================
- Coverage   91.24%   91.23%   -0.02%     
==========================================
  Files         163      163              
  Lines       49967    49971       +4     
==========================================
- Hits        45593    45590       -3     
- Misses       4374     4381       +7
Flag Coverage Δ
#multiple 89.03% <ø> (ø) ⬆️
#single 40.24% <ø> (-0.06%) ⬇️
Impacted Files Coverage Δ
pandas/core/generic.py 92.03% <ø> (ø) ⬆️
pandas/io/json/json.py 100% <ø> (ø) ⬆️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/frame.py 97.74% <0%> (-0.1%) ⬇️
pandas/core/indexes/timedeltas.py 91.19% <0%> (ø) ⬆️
pandas/core/indexes/range.py 92.83% <0%> (ø) ⬆️
pandas/core/indexes/numeric.py 97.18% <0%> (ø) ⬆️
pandas/core/indexes/period.py 92.78% <0%> (ø) ⬆️
pandas/core/indexes/datetimes.py 95.58% <0%> (ø) ⬆️
pandas/core/indexes/multi.py 96.39% <0%> (ø) ⬆️
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 22515f5...402fa11. Read the comment docs.

pandas/tests/io/json/test_compression.py Outdated
doc/source/whatsnew/v0.21.0.txt Outdated

@jreback jreback added this to the 0.21.0 milestone Oct 6, 2017

@jreback

jreback approved these changes Oct 6, 2017

@jreback

jreback Oct 6, 2017

Contributor

lgtm, thanks for the quick response!

@TomAugspurger ?

@simongibbons

simongibbons Oct 6, 2017

Contributor

Let me know if you want me to squash this when it's ready to merge.

@jreback

jreback Oct 6, 2017

Contributor

@simongibbons no need to squash, it's done automatically on merging.

@TomAugspurger

+1.

Does the ZIP file need to be added to MANIFEST.in?

@pytest.mark.parametrize('compression', COMPRESSION_TYPES)
def test_with_s3_url(compression):

@TomAugspurger

TomAugspurger Oct 6, 2017

Contributor

This shares some code with the (to be merged) #17201

I think it's fine for now, but we'll want to clean it up whenever the latter is merged. Since this is clean at the moment, I think we'll merge it, and then refactor this test in #17201.

@jreback

jreback Oct 6, 2017

Contributor

Does the ZIP file need to be added to MANIFEST.in?

hmm, I think it might need to be added to setup.py

https://travis-ci.org/pandas-dev/pandas/jobs/284093329 is our build test, which IS picking this up.

@TomAugspurger

TomAugspurger Oct 6, 2017

Contributor

Probably covered by pandas.tests.io: ['json/data/*.json'] in the setup.py.

@TomAugspurger TomAugspurger merged commit 3b4121b into pandas-dev:master Oct 6, 2017

3 checks passed

ci/circleci Your tests passed on CircleCI!
continuous-integration/appveyor/pr AppVeyor build succeeded
continuous-integration/travis-ci/pr The Travis CI build passed
@TomAugspurger

Contributor

TomAugspurger commented Oct 6, 2017

Thanks @simongibbons!

@jreback

jreback Oct 6, 2017

Contributor

@TomAugspurger NO, it's NOT covered by that. See the failing tests. This NEEDS to be in setup.py

@jreback

jreback Oct 6, 2017

Contributor

you can change it to '/json/data/*.json*' and it will work I think
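
The difference between the two patterns can be checked with the standard library. This is only a sketch: it uses fnmatch rather than setup.py's own package_data handling, it drops the leading slash so the pattern lines up with the one quoted from setup.py earlier in the thread, and the file names are invented for the example.

```python
from fnmatch import fnmatch

# Hypothetical data files, for illustration only.
files = ["json/data/frame.json", "json/data/frame.json.zip"]

old_pattern = "json/data/*.json"    # pattern quoted from setup.py
new_pattern = "json/data/*.json*"   # suggested replacement, leading slash dropped

# The old pattern requires the name to end in ".json", so the archive is skipped.
old_matches = [name for name in files if fnmatch(name, old_pattern)]
# The trailing "*" also accepts suffixes such as ".json.zip".
new_matches = [name for name in files if fnmatch(name, new_pattern)]
```

Under these assumptions only the widened pattern picks up the compressed file, which is consistent with the failing build mentioned above.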

jreback added a commit to jreback/pandas that referenced this pull request Oct 6, 2017

jreback added a commit that referenced this pull request Oct 6, 2017

kchomski-reef added a commit to reef-technologies/pandas that referenced this pull request Oct 16, 2017

ENH: Add transparent compression to json reading/writing (pandas-dev#17798)

* ENH: Add transparent compression to json reading/writing

This works in the same way as the argument to ``read_csv``
and ``to_csv``.

I've added tests confirming that it works with file
paths, as well as file URLs and S3 URLs.

* Fix PEP8 violations

* Add PR number to whatsnew entry

* Remove problematic Windows test (The S3 test hits the same edge case)

* Extract decompress file function so that pytest.mark.parametrize can be used cleanly

* Fix typo in whatsnew entry
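
One of the commits listed above pulls the decompression step into its own function so that the tests can be parametrized over compression types. The PR's actual helper is not shown in this conversation, so the following is a stdlib-only sketch of what such a function might look like. The name open_decompressed, its signature, and the set of supported types are assumptions, and zip is left out for brevity.

```python
import bz2
import gzip
import lzma
import os
import tempfile
from contextlib import contextmanager

# Map each compression name to the stdlib function that opens such a file.
_OPENERS = {None: open, "gzip": gzip.open, "bz2": bz2.open, "xz": lzma.open}


@contextmanager
def open_decompressed(path, compression):
    """Yield a binary file handle that reads decompressed bytes.

    Hypothetical helper; not the function added in the PR.
    """
    if compression not in _OPENERS:
        raise ValueError("Unrecognized compression type: {!r}".format(compression))
    handle = _OPENERS[compression](path, "rb")
    try:
        yield handle
    finally:
        handle.close()


def demo():
    """Write the same payload with each compression type and read it back."""
    payload = b'{"a": [1, 2, 3]}'
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.json")
        for compression, opener in _OPENERS.items():
            with opener(path, "wb") as fh:
                fh.write(payload)
            with open_decompressed(path, compression) as fh:
                results[compression] = fh.read()
    return payload, results


payload, results = demo()
```

Keeping the dispatch in one place means a parametrized test only has to pass the compression name through, rather than branching on it in the test body.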

alanbato added a commit to alanbato/pandas that referenced this pull request Nov 10, 2017

No-Stream added a commit to No-Stream/pandas that referenced this pull request Nov 28, 2017
