ENH: read_csv dtype with datetime w/tz & integrations with category #24542

Open
jreback opened this issue Jan 2, 2019 · 3 comments
Labels
Categorical Categorical Data Type Datetime Datetime data dtype Enhancement IO CSV read_csv, to_csv

Comments

@jreback
Contributor

jreback commented Jan 2, 2019

xref #23228

Datetime w/tz

I think this should work

In [16]: df = pd.DataFrame({'Int': pd.Series([1, 2, 3], dtype='Int64'), 'Date': pd.date_range('20180101', periods=3, tz='US/Eastern')})
    ...: 

In [17]: df.dtypes
Out[17]: 
Int                          Int64
Date    datetime64[ns, US/Eastern]
dtype: object

In [18]: df.to_csv('foo.csv')

In [19]: pd.read_csv('foo.csv',index_col=0,dtype={'Int':'Int64','Date':pd.DatetimeTZDtype('ns', 'US/Eastern')})
TypeError: the dtype datetime64[ns, US/Eastern] is not supported for parsing

This probably requires #24024 first, since we need _from_sequence_of_strings, which is basically a call to to_datetime() followed by a dance to convert to the requested dtype (the values read may or may not already be localized).
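
A rough sketch of that "parse, then convert" dance, for illustration only. The helper name strings_to_tz_aware is hypothetical and is not a pandas API; it also assumes the strings are either all naive or all carry the same UTC offset, so that to_datetime() returns a proper datetime column rather than object dtype.

```python
import pandas as pd


def strings_to_tz_aware(strings, dtype):
    """Hypothetical helper: parse strings, then coerce to a tz-aware dtype."""
    parsed = pd.to_datetime(pd.Series(strings))
    if parsed.dt.tz is None:
        # Naive values: treat them as wall-clock times in the target zone.
        parsed = parsed.dt.tz_localize(dtype.tz)
    else:
        # Already localized values: shift them into the target zone.
        parsed = parsed.dt.tz_convert(dtype.tz)
    return parsed.astype(dtype)


dtype = pd.DatetimeTZDtype("ns", "US/Eastern")
naive = strings_to_tz_aware(["2018-01-01", "2018-01-02"], dtype)
aware = strings_to_tz_aware(["2018-01-01 00:00:00-05:00"], dtype)
```

Both calls return a Series with the requested dtype. Whether naive strings should be localized (as here) or rejected is a design choice the real implementation would need to settle.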

Categoricals

This should be unified with our current workaround for 'category'.

In [25]: pd.read_csv('foo.csv',index_col=0,dtype={'Int':'Int64','Date':pd.CategoricalDtype})
NotImplementedError: Extension Array: <class 'pandas.core.arrays.categorical.Categorical'> must implement _from_sequence_of_strings in order to be used in parser methods

In [27]: pd.read_csv('foo.csv',index_col=0,dtype={'Int':'Int64','Date':'category'})
Out[27]: 
   Int                       Date
0    1  2018-01-01 00:00:00-05:00
1    2  2018-01-02 00:00:00-05:00
2    3  2018-01-03 00:00:00-05:00
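
Note that with dtype 'category' the categories come back as plain strings, not timestamps. Until the parser supports this directly, one possible workaround (a sketch, not an official recipe) is to parse the categories once after reading. The inline CSV text below is made-up sample data standing in for foo.csv.

```python
import io

import pandas as pd

# Made-up sample data standing in for the 'Date' column of foo.csv.
csv_text = (
    "Date\n"
    "2018-01-01 00:00:00-05:00\n"
    "2018-01-02 00:00:00-05:00\n"
    "2018-01-01 00:00:00-05:00\n"
)

df = pd.read_csv(io.StringIO(csv_text), dtype={"Date": "category"})

# Parse each distinct category once, then move it into the target zone.
new_categories = pd.to_datetime(
    df["Date"].cat.categories, utc=True
).tz_convert("US/Eastern")
df["Date"] = df["Date"].cat.rename_categories(new_categories)
```

Parsing only the categories, rather than every row, keeps the conversion cheap when the column has many repeated values.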
@jreback jreback added Datetime Datetime data dtype IO CSV read_csv, to_csv Categorical Categorical Data Type Difficulty Intermediate labels Jan 2, 2019
@jreback jreback added this to the Contributions Welcome milestone Jan 2, 2019
@jreback
Contributor Author

jreback commented Jan 2, 2019

cc @kprestel

@kprestel
Contributor

kprestel commented Jan 2, 2019

I can take a look at this.

@teto

teto commented Mar 30, 2020

@kprestel that would be really cool. I do a bunch of read_csv / concatenation / merge of dataframes, and not being able to set up the initial dataframe with the correct dtype makes the program a lot more complex.

@mroeschke mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022