# 10_Convert a String Datetime Column to Proper Datetime

In [1]:
import pandas as pd

### Load the example *timeseries_daily.csv* dataset:

In [2]:
df = pd.read_csv("./data_etc/timeseries_daily.csv")
df.head()

,Date,feature_1,feature_2,feature_3,feature_4,categorical_feature,weekday
0,01/02/2017,0,0,37,0,foo,Wednesday
1,02/02/2017,0,0,168,0,foo,Thursday
2,03/02/2017,0,0,157,0,other,Friday
3,04/02/2017,0,0,720,0,other,Saturday
4,05/02/2017,0,0,721,0,bar,Sunday


Note that the "Date" column has been loaded as strings (dtype <code>object</code>), not as datetimes:

In [3]:
df.dtypes

Date                   object
feature_1               int64
feature_2               int64
feature_3               int64
feature_4               int64
categorical_feature    object
weekday                object
dtype: object

### Create a proper datetime column:
Note the <code>dayfirst=True</code> flag. This tells pandas to parse ambiguous dates with the day before the month (so 01/02/2017 is read as 1 February 2017), instead of the American month-first order that pandas assumes by default (where 01/02/2017 would be read as 2 January 2017).

In [4]:
# Create a proper datetime column
df["datetime"] = pd.to_datetime(df["Date"], dayfirst=True)

df.head()

,Date,feature_1,feature_2,feature_3,feature_4,categorical_feature,weekday,datetime
0,01/02/2017,0,0,37,0,foo,Wednesday,2017-02-01
1,02/02/2017,0,0,168,0,foo,Thursday,2017-02-02
2,03/02/2017,0,0,157,0,other,Friday,2017-02-03
3,04/02/2017,0,0,720,0,other,Saturday,2017-02-04
4,05/02/2017,0,0,721,0,bar,Sunday,2017-02-05
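
To make the effect of the flag concrete, here is a minimal sketch that parses the same two strings three ways. The `ambiguous` Series is made up for illustration and is not part of the dataset. Passing an explicit `format` string is an alternative to `dayfirst=True` when every value is known to follow the same layout.

```python
import pandas as pd

# Hypothetical sample: both strings are valid day-first and month-first
ambiguous = pd.Series(["01/02/2017", "02/02/2017"])

# Day-first interpretation: 1 February and 2 February 2017
day_first = pd.to_datetime(ambiguous, dayfirst=True)

# pandas default (dayfirst=False): 2 January and 2 February 2017
month_first = pd.to_datetime(ambiguous)

# An explicit format removes the ambiguity entirely
explicit = pd.to_datetime(ambiguous, format="%d/%m/%Y")

print(day_first.iloc[0], month_first.iloc[0], explicit.iloc[0])
```

With an explicit format, a value that does not match the layout raises an error instead of being silently reinterpreted.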


The new column has a proper datetime dtype (<code>datetime64[ns]</code>):

In [5]:
df.dtypes

Date                           object
feature_1                       int64
feature_2                       int64
feature_3                       int64
feature_4                       int64
categorical_feature            object
weekday                        object
datetime               datetime64[ns]
dtype: object
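
One way to sanity-check the conversion is to derive the day name from the new column and compare it with the existing "weekday" column. The sketch below re-types the first five rows shown above into a small frame (called `sample` here, a name chosen for illustration) so that it runs without the CSV file. On the full dataset the same check could be run on `df` directly.

```python
import pandas as pd

# First five rows of the example dataset, re-typed from the output above
sample = pd.DataFrame({
    "Date": ["01/02/2017", "02/02/2017", "03/02/2017", "04/02/2017", "05/02/2017"],
    "weekday": ["Wednesday", "Thursday", "Friday", "Saturday", "Sunday"],
})
sample["datetime"] = pd.to_datetime(sample["Date"], dayfirst=True)

# The .dt accessor is only available on datetime-typed columns
derived = sample["datetime"].dt.day_name()

# True if every parsed date falls on the weekday recorded in the data
matches = (derived == sample["weekday"]).all()
print(matches)
```

If the dates had been parsed month-first, the derived day names would disagree with the "weekday" column, which makes this a cheap test for a wrong <code>dayfirst</code> setting.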