Handling of no trailing commas in CSV files #2333

Closed
wesm opened this issue Nov 23, 2012 · 5 comments
Labels: Enhancement, IO Data

Comments

@wesm
Member

wesm commented Nov 23, 2012

I don't know if there's an easy strategy to handle cases like this gracefully:

http://stackoverflow.com/questions/13454909/missing-data-in-pandas-read-csv

@michaelaye
Contributor

Just to let you know:

csv.DictReader handles this data fairly well:

In [52]: reader = csv.DictReader(f, restval='')

In [53]: reader.next()
Out[53]: {'a': '1.5', 'b': '4.8', 'c': '', 'd': '6.3', 'e': '', 'f': ''}

In [54]: reader.next()
Out[54]: {'a': '1.60', 'b': '5.2', 'c': '6.5', 'd': '7.2', 'e': '', 'f': ''}

In [55]: reader.next()
Out[55]: {'a': '1.70', 'b': '5.5', 'c': '6.6', 'd': '8.3', 'e': '5.7', 'f': ''}

In [56]: reader.next()
Out[56]: {'a': '1.80', 'b': '6.1', 'c': '6.7', 'd': '9.7', 'e': '6.2', 'f': ''}

In [57]: reader.next()
Out[57]: {'a': '1.90', 'b': '7.1', 'c': '6.8', 'd': '11.1', 'e': '6.7', 'f': ''}

In [58]: reader.next()
Out[58]: {'a': '2', 'b': '', 'c': '6.8', 'd': '12.5', 'e': '7.3', 'f': ''}

In [59]: reader.next()
Out[59]: {'a': '2.08', 'b': '', 'c': '', 'd': '', 'e': '7.8', 'f': ''}
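For readers on Python 3, the session above can be approximated with a self-contained sketch. The sample rows below are an assumption: they are made up to resemble the output shown, not copied from the linked Stack Overflow question, and `reader.next()` is replaced by iterating over the reader.

```python
import csv
import io

# Hypothetical sample data: each row is shorter than the header because
# the trailing commas for the empty fields were left out.
data = io.StringIO(
    "a,b,c,d,e,f\n"
    "1.5,4.8,,6.3\n"
    "1.60,5.2,6.5,7.2\n"
    "1.70,5.5,6.6,8.3,5.7\n"
)

# restval is the value DictReader assigns to header keys for which
# a short row has no field.
reader = csv.DictReader(data, restval="")
rows = list(reader)

for row in rows:
    print(row)
```

Without `restval`, the missing keys would be filled with `None` rather than an empty string.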

@wesm
Member Author

wesm commented Nov 25, 2012

Does anyone think '' is a reasonable default value? I don't see why not: they would go through as NA in pandas-land, I guess.

@michaelaye
Contributor

At least they should. Not all libraries take as consistent a stand on missing data as yours, Wes. ;) I was just showing this to indicate that it's possible. Of course, I don't know about the performance of the parsing required for this.

@wesm
Member Author

wesm commented Nov 28, 2012

Tabling til future release

@wesm
Member Author

wesm commented Dec 9, 2012

This is done-- missing fields become NA
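A minimal sketch of what this looks like from the user side, assuming a pandas version that includes this change. The sample data is made up for illustration and is not the file from the linked question.

```python
import io

import pandas as pd

# Hypothetical CSV in which rows have fewer fields than the header
# (no trailing commas for the missing values).
data = io.StringIO(
    "a,b,c,d,e,f\n"
    "1.5,4.8,,6.3\n"
    "1.60,5.2,6.5,7.2\n"
    "1.70,5.5,6.6,8.3,5.7\n"
)

# Short rows are padded, so the missing trailing fields come back as NaN.
df = pd.read_csv(data)
print(df)
print(df.isna().sum())
```

Column `f` never receives a value in this sample, so it is entirely NA.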

@wesm wesm closed this as completed Dec 9, 2012