Thousands separator for to_csv #30045

Open
ghisvail opened this issue Dec 4, 2019 · 7 comments
Labels
Enhancement IO CSV read_csv, to_csv

Comments

@ghisvail

ghisvail commented Dec 4, 2019

Pandas exposes an optional thousands parameter on read_csv to specify a custom thousands separator, so that 1,000 or 1_000 can be successfully parsed as a number in the resulting DataFrame.
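For reference, a minimal sketch of the existing read side. The sample data and the `;` column separator are assumptions chosen for illustration; only `read_csv` and its `sep` and `thousands` parameters come from pandas itself.

```python
import io

import pandas as pd

# Illustrative data: ';' separates columns so ',' is free to act as the
# thousands separator inside the amount field.
raw_comma = "item;amount\nwidget;1,000\ngadget;2,500,000\n"
raw_underscore = "item;amount\nwidget;1_000\ngadget;2_500_000\n"

# thousands= tells the parser which character to strip from numeric fields.
df_comma = pd.read_csv(io.StringIO(raw_comma), sep=";", thousands=",")
df_underscore = pd.read_csv(io.StringIO(raw_underscore), sep=";", thousands="_")

print(df_comma["amount"].tolist())
print(df_underscore["amount"].tolist())
```

Both frames should end up with a numeric amount column rather than strings.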

Unfortunately, Pandas is missing the very same parameter for to_csv, so that a DataFrame containing 1000 ends up serialized to csv as 1,000 or 1_000.

I understand the general issue of custom float formatting in Pandas remains open (see #4668) and may not even find a solution. However, this particular use case sounds a bit more accessible, since it has been done successfully for read_csv.

@jbrockmendel jbrockmendel added the IO CSV read_csv, to_csv label Dec 6, 2019
@TomAugspurger
Contributor

so that a DataFrame containing 1000 ends up serialized to csv as 1,000 or 1_000.

It's formatted as 1000 right, not 1,000 or 1_000?

This seems a bit complex, since you would end up needing to quote the number, right?

@ghisvail
Author

ghisvail commented Dec 7, 2019

It's formatted as 1000 right, not 1,000 or 1_000?

Assuming a given df contains 1000, passing thousands="_" should serialize it to 1_000 in the output CSV file.
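Until such a parameter exists, one possible workaround is to render the numbers as grouped strings before calling to_csv. The sketch below assumes integer columns only; `format_thousands` is a hypothetical helper, not part of pandas, and it relies on Python's built-in `,` format specifier rather than any pandas option.

```python
import pandas as pd


def format_thousands(df, thousands="_"):
    """Hypothetical helper (not a pandas API): return a copy of df whose
    integer columns are rendered as strings with a thousands separator."""
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_integer_dtype(out[col]):
            # Group digits with ',' first, then swap in the requested separator.
            out[col] = out[col].map(
                lambda v: f"{int(v):,}".replace(",", thousands)
            )
    return out


df = pd.DataFrame({"item": ["widget", "gadget"], "amount": [1000, 2500000]})
csv_text = format_thousands(df, thousands="_").to_csv(index=False)
print(csv_text)
```

The formatted columns become strings, so this is only suitable as a last step right before writing the file.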

you would end up needing to quote the number

If the column separator is set to ; or tabs and the thousands separator to , I see no need for quoting. IMO, quoting should only be required if the column and thousands separators are set to the same character.
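A small sketch of that quoting behaviour, using pre-formatted strings as a stand-in for the proposed parameter (the string formatting step is an assumption for illustration, not an existing to_csv option). With the default minimal quoting, to_csv only quotes a field when it contains the column separator.

```python
import io

import pandas as pd

df = pd.DataFrame({"item": ["widget"], "amount": [1000]})
# Stand-in for the requested feature: render the number as "1,000" by hand.
df["amount"] = df["amount"].map(lambda v: f"{int(v):,}")

# Different column and thousands separators: no quoting is needed.
semicolon_csv = df.to_csv(index=False, sep=";")
# Same character for both: the writer quotes the field to keep it unambiguous.
comma_csv = df.to_csv(index=False, sep=",")

print(semicolon_csv)
print(comma_csv)

# The unquoted variant reads back as a number with the existing read_csv option.
roundtrip = pd.read_csv(io.StringIO(semicolon_csv), sep=";", thousands=",")
print(roundtrip["amount"].tolist())
```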

@ruijpbastos

Seems like a relatively niche need, and no activity on this in the past year. Maybe we can close this until further notice?

@ghisvail
Author

ghisvail commented Dec 18, 2020 via email

@ruijpbastos

Sorry if my comment was rude in any way. Still learning the ropes of contributing and was just trying to be helpful. I can't add tags, but I see how that would be more helpful than simply closing the issue.

In the interest of continued discussion, could you maybe expand how this feature would be useful?

@ghisvail
Author

In the interest of continued discussion, could you maybe expand how this feature would be useful?

When manipulating CSV files containing very large numbers (money, quantities, counts), using a thousands separator, usually _, can increase readability.

@MotStr

MotStr commented Jan 31, 2024

Right now, the thousands separator seems to be affected by the general Windows settings, but I have not found a way to change this setting for programs run as the System user. The ability to set a thousands separator in the to_csv method would therefore be very useful in my case!

6 participants