Handle content-length of 0 #163

Merged: 4 commits merged on Oct 31, 2019

@TomAugspurger (Collaborator) commented Oct 23, 2019

cc @martindurant.

This test is currently failing. Some layer is truncating the response when the Content-Length is 0, so the assertion assert f.read() == data fails: f.read() returns b''. Any suggestions for an alternative test?

It does work for the original issue, though:

In [1]: import dask.dataframe as dd
In [2]: df2 = dd.read_csv('https://fred.stlouisfed.org/graph/fredgraph.csv?id=SP500')

In [3]: df2.compute()
Out[3]:
            DATE    SP500
0     2009-10-23  1079.60
1     2009-10-26  1066.95
2     2009-10-27  1063.41
3     2009-10-28  1042.63
4     2009-10-29  1066.11
...          ...      ...
2603  2019-10-16  2989.69
2604  2019-10-17  2997.95
2605  2019-10-18   2986.2
2606  2019-10-21  3006.72
2607  2019-10-22  2995.99

[2608 rows x 2 columns]

Closes dask/dask#5517

@TomAugspurger
Collaborator Author

ping @martindurant if you have suggestions for an alternative test.

@martindurant
Member

I am puzzled that the pandas load works but the simple read() doesn't. It would be good to figure out why the two take different paths.

@martindurant
Member

So this fixes the original case, in which HEAD was giving a content length of 0 but GET was giving the right length. If the content length were 0 in the GET too, I think requests itself wouldn't fetch the data.
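
For illustration only, the HEAD/GET mismatch described above can be reproduced with a small local server. This is a minimal sketch using only the Python standard library; the handler class, the helper names (reported_size, read_all), and the sample data are hypothetical and are not taken from this pull request or from fsspec. It assumes that a Content-Length of 0 from HEAD should be treated as "size unknown" rather than as an empty file.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical sample payload standing in for the remote CSV.
DATA = b"DATE,SP500\n2019-10-22,2995.99\n"

# Bypass any system proxy so requests reach the local test server directly.
OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class ZeroLengthHeadHandler(BaseHTTPRequestHandler):
    """Hypothetical server: HEAD claims Content-Length 0, GET is correct."""

    def do_HEAD(self):
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", str(len(DATA)))
        self.end_headers()
        self.wfile.write(DATA)

    def log_message(self, *args):
        # Silence per-request logging.
        pass


def reported_size(url):
    """Return the size reported by HEAD, or None if it is missing or 0."""
    req = urllib.request.Request(url, method="HEAD")
    with OPENER.open(req) as resp:
        length = resp.headers.get("Content-Length")
    size = int(length) if length is not None else 0
    # Assumption: a reported size of 0 is not trusted, so it maps to None.
    return size or None


def read_all(url):
    """Read the whole body, ignoring a HEAD size that is 0 or missing."""
    size = reported_size(url)
    with OPENER.open(url) as resp:
        return resp.read() if size is None else resp.read(size)


# Start the server on an ephemeral port and exercise both helpers.
server = HTTPServer(("127.0.0.1", 0), ZeroLengthHeadHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_address[1]}/data.csv"

size = reported_size(url)
body = read_all(url)

server.shutdown()
server.server_close()
```

A reader that trusted the HEAD value literally would request zero bytes and get b'' back, which matches the failure described in the first comment. Mapping 0 to None makes the reader fall back to an unbounded read of the GET response.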

@martindurant martindurant merged commit 56fd518 into fsspec:master Oct 31, 2019
@TomAugspurger TomAugspurger deleted the zero-length branch July 14, 2021 16:10
Linked issue: Dask Dataframe not able to read remote file, but local (and pandas) work