Reading bz2, zip, and xz-compressed files from URL fails #14570

Closed
dhimmel opened this issue Nov 2, 2016 · 3 comments
Labels
Duplicate Report, IO Data, Testing
Comments

@dhimmel
Contributor

dhimmel commented Nov 2, 2016

Previously, we've discussed reading tables from gzip-compressed URLs (see #8685 and #10649). This is an essential feature as data storage increasingly migrates to the cloud.

I now realize that the previous work only added support for gzipped URLs and not bz2, zip, or xz compression. However, the documentation doesn't reflect this. See this notebook to see the errors that are generated.

So 👍 for allowing URL locations for all supported compression formats.
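Until URL support covers every format, one possible workaround is to download the bytes yourself and decompress them in memory before handing the buffer to pandas. The following is only a minimal sketch using the Python standard library; the helper names (`infer_compression`, `decompress`, `open_url`) and the extension mapping are hypothetical and not part of pandas.

```python
import bz2
import gzip
import io
import lzma
import zipfile
from urllib.request import urlopen

# Extension-to-format mapping. This is an assumption covering the four
# formats named in this issue, not pandas' internal table.
_EXTENSIONS = {".gz": "gzip", ".bz2": "bz2", ".zip": "zip", ".xz": "xz"}


def infer_compression(url):
    """Guess the compression format from the URL suffix (None if unknown)."""
    path = url.split("?", 1)[0].lower()  # ignore any query string
    for ext, name in _EXTENSIONS.items():
        if path.endswith(ext):
            return name
    return None


def decompress(raw, compression):
    """Return the decompressed payload of `raw` as an in-memory binary buffer."""
    if compression is None:
        data = raw
    elif compression == "gzip":
        data = gzip.decompress(raw)
    elif compression == "bz2":
        data = bz2.decompress(raw)
    elif compression == "xz":
        data = lzma.decompress(raw)
    elif compression == "zip":
        # Only single-file archives are handled in this sketch.
        with zipfile.ZipFile(io.BytesIO(raw)) as archive:
            names = archive.namelist()
            if len(names) != 1:
                raise ValueError("expected exactly one file in the zip archive")
            data = archive.read(names[0])
    else:
        raise ValueError("unsupported compression: {!r}".format(compression))
    return io.BytesIO(data)


def open_url(url):
    """Download `url` completely, then decompress it based on its extension."""
    with urlopen(url) as response:
        raw = response.read()
    return decompress(raw, infer_compression(url))
```

The returned buffer is an ordinary file-like object, so something like `pd.read_csv(open_url(url), sep="\t")` should work for a tab-separated file. Note that this reads the whole download into memory, which may not suit very large files.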

@jreback
Contributor

jreback commented Nov 2, 2016

#12688 is the base issue which needs fixing (there is a PR open on it, which is almost there). Once that is fixed almost everything else is just testing.

@jreback added the Testing, IO Data, Duplicate Report, and Difficulty Intermediate labels Nov 2, 2016
@jreback added this to the Next Major Release milestone Nov 2, 2016
@dhimmel
Contributor Author

dhimmel commented Nov 3, 2016

Thanks @jreback for the update. Excited for the pull request (#13340)!

@jreback
Contributor

jreback commented Nov 3, 2016

that PR is a bit stalled - if u want to fix up and submit would be great

dhimmel added a commit to dhimmel/pandas that referenced this issue Dec 13, 2016
dhimmel added a commit to dhimmel/pandas that referenced this issue Dec 13, 2016
@jreback modified the milestones: 0.20.0, Next Major Release Dec 13, 2016

2 participants