Inconsistency between read_csv(... parse_dates) and read_csv followed by astype('datetime') #3498

@ghost

Write time-zones out in csv file

Datetimes are written to csv as readable UTC, with no time zone marker. Reading them back with pd.read_csv('file.csv', parse_dates=['date']) interprets them as UTC, whereas astype('datetime64') treats readable dates with no time zone specified as local time (which also makes sense on its own). Together, the two conventions lead to the following:

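    # Python 2 syntax (print statement, StringIO module)
    import pandas as pd
    from StringIO import StringIO

    # One timestamp at the epoch, 0 seconds since 1970-01-01 UTC
    df = pd.DataFrame({'date' : [0]})
    df['date'] = df.date.astype('datetime64[s]')
    print df.date[0]

    # Write the frame to an in-memory csv and show the text
    s = StringIO()
    df.to_csv(s)
    s = s.getvalue()
    print 'csv:'
    print s

    # 1. Let read_csv parse the date column
    with_parse_dates = pd.read_csv(StringIO(s), parse_dates=['date'])
    print 'parse_date:', with_parse_dates.date[0]

    # 2. Read the column as strings, then convert with astype
    with_astype = pd.read_csv(StringIO(s))
    with_astype.date = with_astype.date.astype('datetime64')
    print 'astype    :', with_astype.date[0]

    # 3. Same as 2, but append 'Z' to mark the string as UTC first
    with_astypeZ = pd.read_csv(StringIO(s))
    with_astypeZ.date[0] += 'Z'
    with_astypeZ.date = with_astypeZ.date.astype('datetime64')
    print 'astype w/Z:', with_astypeZ.date[0]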

Output (for me):

1970-01-01 00:00:00
csv:
,date
0,1970-01-01 00:00:00

parse_date: 1970-01-01 00:00:00
astype    : 1970-01-01 08:00:00
astype w/Z: 1970-01-01 00:00:00
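
The 8-hour gap above comes from the same unmarked string being read once as UTC and once as local time. Below is a minimal sketch of making that interpretation explicit on the reading side. It uses Python 3 with a current pandas, not the versions from the report. The fixed UTC-8 offset is a hypothetical stand-in for a local zone, chosen only because it reproduces an 8-hour shift.

```python
import io
from datetime import timedelta, timezone

import pandas as pd

# The csv text exactly as printed above: no time zone marker on the timestamp.
csv_text = ",date\n0,1970-01-01 00:00:00\n"
raw = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Parse to naive timestamps first; nothing is assumed about the zone yet.
naive = pd.to_datetime(raw["date"])

# Interpretation 1: the wall-clock values are UTC.
as_utc = naive.dt.tz_localize("UTC")

# Interpretation 2: the wall-clock values are local time.
# Hypothetical local zone: a fixed UTC-8 offset, used for illustration only.
local_tz = timezone(timedelta(hours=-8))
as_local = naive.dt.tz_localize(local_tz)

print("as UTC          :", as_utc[0])
print("as local, in UTC:", as_local[0].tz_convert("UTC"))
```

Under these assumptions the two results differ by eight hours, which is the same size of discrepancy as between the parse_date and astype lines above.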

This would be easily fixed by indicating UTC in the csv output, possibly with just a 'Z' suffix.
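
As a user-side approximation of that suggestion, the 'Z' suffix can be written today through the date_format argument of to_csv. This is a sketch in Python 3 with a current pandas, not a change to pandas' default output. The format string and variable names are illustrative choices.

```python
import io

import pandas as pd

# One naive timestamp at the epoch, meant as UTC.
df = pd.DataFrame({"date": pd.to_datetime([0], unit="s")})

# Write datetimes in ISO 8601 form with a literal 'Z' so the csv states UTC.
buf = io.StringIO()
df.to_csv(buf, index=False, date_format="%Y-%m-%dT%H:%M:%SZ")
csv_text = buf.getvalue()
print(csv_text)

# Read the column back as strings, then parse with utc=True so the
# result is time-zone-aware rather than naive.
raw = pd.read_csv(io.StringIO(csv_text))
parsed = pd.to_datetime(raw["date"], utc=True)
print(parsed[0])
```

With the marker in the file, the parsed value no longer depends on the reader's local time zone.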

Labels: Dtype Conversions (unexpected or buggy dtype conversions), IO Data (IO issues that don't fit into a more specific label)
