read csv thousands separator #4322

hayd · 2013-07-22T21:36:55Z

From this SO question, with the input modified to include a thousands separator:


In [1]: s = '06.02.2013;13:00;1.000,215;0,215;0,185;0,205;0,00'

In [2]: pd.read_csv(StringIO(s), sep=';', header=None, parse_dates={'Dates': [0, 1]}, index_col=0, decimal=',')
Out[2]:
                              2      3      4      5  6
Dates
2013-06-02 13:00:00  1.000,215  0.215  0.185  0.205  0

In [3]: pd.read_csv(StringIO(s), sep=';', header=None, parse_dates={'Dates': [0, 1]}, index_col=0, decimal=',', thousands='.')
Out[3]:
                        2      3      4      5  6
Dates
6022013 13:00   1.000,215  0.215  0.185  0.205  0

Note: the Dates column is no longer parsed as a date (and the thousands are not being converted either).


hayd · 2013-07-31T14:49:35Z

Possibly this is a dupe of #2594

jreback · 2013-07-31T14:52:51Z

this work? http://pandas.pydata.org/pandas-docs/dev/io.html#thousand-separators

hayd · 2013-07-31T15:28:43Z

@jreback not sure what you're asking?

jreback · 2013-07-31T15:30:31Z

is this a bug? e.g. the dtypes not working with thousands or a bug in thousands sep? or a feature request (which is there already)?

hayd · 2013-07-31T15:38:27Z

@jreback this is a bug, thousand separator doesn't seem to work

in the docs you link to it says

For large integers that have been written with a thousands separator, you can set the thousands keyword to True so that integers will be parsed correctly:

in the docstring for read_csv it asks for:

thousands : str, default None
Thousands separator

Doesn't seem to be working with '.', as well as screwing up the dates.
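
For reference, here is a minimal sketch of the conversion one would expect from `thousands='.'` combined with `decimal=','`. It uses plain Python rather than pandas, and `parse_european_number` is a hypothetical helper made up for illustration, not anything in the pandas parser.

```python
def parse_european_number(text, thousands=".", decimal=","):
    """Convert a string such as '1.000,215' to a float (hypothetical helper)."""
    # Drop the thousands separator first, then normalise the decimal mark.
    cleaned = text.replace(thousands, "").replace(decimal, ".")
    return float(cleaned)


# The numeric fields from the example row in the report above.
row = "1.000,215;0,215;0,185;0,205;0,00".split(";")
values = [parse_european_number(field) for field in row]
print(values)
```

Under these assumptions the first field should come out as 1000.215 rather than staying the string '1.000,215'.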

jreback · 2013-07-31T15:43:37Z

ahh ok...i c now.....


jreback · 2013-08-23T17:51:59Z

closed by #4598

hayd · 2013-08-26T19:03:15Z

@jreback The date aspect of this pr is still not fixed. For some reason the thousands separator attacks the date and makes it a string.

hayd · 2013-08-26T19:21:35Z

Wow, so this is a very edge case... it's cos the date column is just 06.02.2013 which, with the . stripped as a thousands separator, is read as the number 6022013... it's possible dates are sometimes written this way on the continent (along with . thousands): http://en.wikipedia.org/wiki/Date_and_time_notation_in_Europe

Not sure what solution is.
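
A small plain-Python sketch of what seems to be happening to the date field. This only mimics the effect of stripping the separator; it is not the actual parser code.

```python
date_field = "06.02.2013"

# Treating '.' as a thousands separator removes every dot from the field...
stripped = date_field.replace(".", "")

# ...and what is left is all digits, so it converts cleanly to an integer.
as_number = int(stripped)
print(stripped, as_number)
```

Once the field looks like an integer, there is nothing left for the date parser to recognise.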

jreback · 2013-08-26T19:27:28Z

but it should ignore dates columns entirely (for thousands parsing...).....hmmm...why don't you open a separate issue and can cross-link it
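
A rough sketch of that idea using only the standard library: join the date columns untouched and apply the thousands/decimal handling to the remaining columns only. `parse_row` and its parameters are hypothetical names for illustration, the day-first date format is an assumption, and none of this reflects how pandas implements it.

```python
import csv
from datetime import datetime
from io import StringIO


def parse_row(fields, date_cols=(0, 1), thousands=".", decimal=","):
    """Parse one row, keeping date columns away from thousands handling."""
    # Join the date columns as-is, so the thousands rule never sees them.
    stamp = " ".join(fields[i] for i in date_cols)
    # Assumption: day-first dates (dd.mm.yyyy) followed by HH:MM.
    when = datetime.strptime(stamp, "%d.%m.%Y %H:%M")
    # Only the non-date columns get the separator and decimal treatment.
    numbers = [
        float(field.replace(thousands, "").replace(decimal, "."))
        for i, field in enumerate(fields)
        if i not in date_cols
    ]
    return when, numbers


s = "06.02.2013;13:00;1.000,215;0,215;0,185;0,205;0,00"
fields = next(csv.reader(StringIO(s), delimiter=";"))
when, numbers = parse_row(fields)
print(when, numbers)
```

Note that with the day-first assumption the date comes out as 6 February 2013, whereas the output in the original report shows it parsed month-first.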

guyrt mentioned this issue Aug 18, 2013

csv_import: Thousands separator works in floating point numbers #4598

Merged

guyrt added a commit to guyrt/pandas that referenced this issue Aug 23, 2013

BUG: fixes issue pandas-dev#4322

0922599

Adds support for the thousands character in csv parser for floats. Updated docs to reflect bug fix.

jreback closed this as completed Aug 23, 2013

hayd mentioned this issue Aug 26, 2013

Dates are parsed with read_csv thousand seperator #4678

Closed

guyrt mentioned this issue Sep 24, 2013

BUG: Conflict between thousands sep and date parser. #4945

Merged
