
read_excel does not accept a URL #6809

Closed
jseabold opened this issue Apr 5, 2014 · 3 comments · Fixed by #7531
Labels
Enhancement, IO Data, IO Excel
Milestone

Comments

@jseabold
Contributor

jseabold commented Apr 5, 2014

http://nbviewer.ipython.org/urls/umich.box.com/shared/static/oh717lkxczhseep71lao.ipynb

@jreback jreback added this to the 0.15.0 milestone Apr 6, 2014
@asobrien
Contributor

What you want is to open an xls file directly by specifying its URL in pd.read_excel:

pd.read_excel("http://www.eia.gov/dnav/pet/xls/PET_PRI_ALLMG_A_EPM0_PTC_DPGAL_M.xls", "Data 1", skiprows=2)

but currently this raises an IOError:

IOError: [Errno 2] No such file or directory: 'http://www.eia.gov/dnav/pet/xls/PET_PRI_ALLMG_A_EPM0_PTC_DPGAL_M.xls'

Working around this currently requires explicit calls to urllib2 and StringIO to download the file and wrap it in a buffer, which can then be passed to pd.read_excel:

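import urllib2  # Python 2 module
import StringIO  # Python 2 module
import pandas as pd

data_url = "http://www.eia.gov/dnav/pet/xls/PET_PRI_ALLMG_A_EPM0_PTC_DPGAL_M.xls"
xld = urllib2.urlopen(data_url).read()  # download the raw .xls bytes
xlds = StringIO.StringIO(xld)  # wrap the bytes in a file-like buffer
data = pd.read_excel(xlds, "Data 1", skiprows=2)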
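
On Python 3, where urllib2 and StringIO no longer exist, the same workaround might look roughly like the sketch below. The helper names url_to_buffer and read_excel_from_url are hypothetical and not part of pandas. The sketch assumes pandas is installed with an Excel engine able to read the file, and that the whole download fits comfortably in memory.

```python
import io
from urllib.request import urlopen


def url_to_buffer(url, timeout=30):
    """Download `url` and return its contents as a seekable in-memory buffer.

    Hypothetical helper for illustration only; not part of pandas.
    """
    with urlopen(url, timeout=timeout) as response:
        # Excel files are binary, so use BytesIO rather than StringIO.
        return io.BytesIO(response.read())


def read_excel_from_url(url, **kwargs):
    """Fetch a workbook over the network and hand the buffer to pd.read_excel.

    Hypothetical wrapper; keyword arguments are forwarded unchanged.
    """
    import pandas as pd  # imported here so url_to_buffer works without pandas

    return pd.read_excel(url_to_buffer(url), **kwargs)
```

A call mirroring the example above would be read_excel_from_url(data_url, sheet_name="Data 1", skiprows=2), assuming the URL is still reachable.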

@asobrien
Contributor

I'm interested in implementing a way to read directly from a URL in pd.read_excel. My initial thought is to add an is_url boolean keyword argument that explicitly indicates that the io string is a URL. The method could then be implemented with urllib2 and StringIO, as above.

Any thoughts?
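
To make the is_url idea concrete, here is a minimal sketch of how such a flag could route the input. It uses the Python 3 standard library rather than urllib2, and it is not how pandas implements URL support (see #7531 for the actual change). The function name resolve_excel_source is hypothetical.

```python
import io
from urllib.request import urlopen


def resolve_excel_source(source, is_url=False):
    """Return an object suitable for passing on to an Excel reader.

    Hypothetical helper illustrating the proposed `is_url` flag:
    - is_url=True:  `source` is fetched and returned as an in-memory buffer.
    - is_url=False: `source` (a local path or open file) is returned unchanged.
    """
    if is_url:
        with urlopen(source) as response:
            return io.BytesIO(response.read())
    return source
```

The result could then be passed as the first argument to pd.read_excel. One drawback of an explicit flag is that the caller has to state something the string already reveals through its scheme prefix.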

@jreback
Contributor

jreback commented Jun 22, 2014

see #7531

@jreback jreback modified the milestones: 0.14.1, 0.15.0 Jun 22, 2014
3 participants