Table.read_table() could be smarter about auto-detecting the file format #66
read_table uses the pandas tool for this. In fact, it is the only thing we
David E. Culler
On Mon, Sep 14, 2015 at 12:04 PM, davidwagner notifications@github.com
Cool, thank you! I wonder if this line in datascience/tables.py is causing the problem:
Note to self: investigate when I get a chance. Anyway, this is absolutely not a big deal, just a super-minor annoyance I thought I'd document.
The table reader doesn't inspect the file, just the path. I think that behavior is here to stay. Instead, you'll have to specify the separator manually.
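For anyone landing here, a minimal sketch of what specifying the separator manually looks like. It calls pandas directly (which, per the comment above, is what read_table relies on) and reads from an in-memory buffer rather than a real file. The sample data is made up, and it is an assumption here that Table.read_table() forwards a `sep` keyword to pandas in the same way.

```python
import io
import pandas as pd

# Made-up comma-separated sample standing in for a file whose name lacks ".csv".
raw = "x,y,z\n1,2,3\n4,5,6\n"

# pandas.read_table defaults to a tab separator, so each line lands in one column.
one_col = pd.read_table(io.StringIO(raw))

# Naming the separator explicitly recovers the three columns.
three_col = pd.read_table(io.StringIO(raw), sep=",")

print(one_col.shape[1], three_col.shape[1])
```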
Slightly improved in new release (handles the http query string case)
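A rough sketch of how the query string case can be handled with the standard library, in case it helps others reading this thread. This is not the code from the release; the helper name `extension_from_location` and the example.com URLs are hypothetical.

```python
import posixpath
from urllib.parse import urlparse


def extension_from_location(location):
    """Return the lowercased extension of a path or URL, ignoring ?query and #fragment."""
    # urlparse separates the path from the query string and the fragment,
    # so "data.csv?dl=1#top" yields a path that still ends in "data.csv".
    path = urlparse(location).path
    return posixpath.splitext(path)[1].lower()


print(extension_from_location("http://example.com/data.csv?dl=1#top"))
print(extension_from_location("data/local.tsv"))
```

Note that a URL such as `http://example.com/download?file=data.csv` still has no extension in its path, so a path-only check cannot classify it.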
Try this:
Table.read_table() fails to recognize the columns; it stuffs everything into one column.
Compare to
which does recognize that there are three columns.
Perhaps it is looking at the URL and trying to parse out the filename extension, and then using that to decide how to decode the data. If so, maybe it should be smarter about how to parse URLs (to remove fragments and parameters), or maybe it should ignore the URL/filename and have smarter format detection (e.g., auto-detect it as CSV based on the contents of the data rather than the filename).
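To make the second idea concrete, here is a minimal sketch of content-based detection using `csv.Sniffer` from the Python standard library. It is only an illustration of the suggestion, not how Table.read_table() works; the helper name `guess_delimiter`, the candidate delimiter set, and the sample text are all made up.

```python
import csv


def guess_delimiter(sample, candidates=",\t;|"):
    """Guess the delimiter from a text sample; return None if sniffing fails."""
    try:
        # Sniffer inspects the sample and picks the candidate character
        # that appears a consistent number of times on each line.
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        # Raised when no delimiter can be determined from the sample.
        return None


sample = "a,b,c\n1,2,3\n4,5,6\n"
print(repr(guess_delimiter(sample)))
```

A reader could sniff the first few kilobytes of the data and use the result only when no separator was given, falling back to the filename extension when the guess is None.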