Skip to content

read_sql_query does not convert invalid dates to NaT #9261

@jorisvandenbossche

Description

@jorisvandenbossche

From: http://stackoverflow.com/questions/27908071/pandas-interprets-timestamp-without-timezones-columns-as-different-types/27960779#27960779

Using:

df = pd.DataFrame({'col1':[0,1], 'col2':pd.date_range('2012-01-01', periods=2, freq='1h')})
df.to_sql('test_invalid_date', engine_postgres, if_exists='replace', index=False)
engine_postgres.execute("INSERT INTO test_invalid_date VALUES (2, '3012-01-01 00:00:00');")

it gives a different handling of the invalid date depending on using read_sql_table or read_sql_query:

In [34]: df1 = pd.read_sql_table('test_invalid_date', engine_postgres)

In [35]: df1
Out[35]:
   col1                col2
0     0 2012-01-01 00:00:00
1     1 2012-01-01 01:00:00
2     2                 NaT

In [36]: df1.dtypes
Out[36]:
col1             int64
col2    datetime64[ns]
dtype: object

In [37]: df2 = pd.read_sql_query('SELECT * FROM test_invalid_date', engine_postgres)

In [38]: df2
Out[38]:
   col1                 col2
0     0  2012-01-01 00:00:00
1     1  2012-01-01 01:00:00
2     2  3012-01-01 00:00:00

In [39]: df2.dtypes
Out[39]:
col1     int64
col2    object
dtype: object

In [40]: df2.col2.values
Out[40]:
array([datetime.datetime(2012, 1, 1, 0, 0),
       datetime.datetime(2012, 1, 1, 1, 0),
       datetime.datetime(3012, 1, 1, 0, 0)], dtype=object)

Metadata

Metadata

Assignees

No one assigned

    Labels

    BugIO SQLto_sql, read_sql, read_sql_queryMissing-datanp.nan, pd.NaT, pd.NA, dropna, isnull, interpolate

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions