# How to convert/parse strings into correct datetime objects and split them into date and time columns

---

In this notebook, I want to show you some tricks for conveniently converting or parsing date strings into Pandas datetime objects and handling two-digit years correctly. Additionally, I'll show you how to split a datetime column into separate date and time columns.

In [1]:
import pandas as pd

Let's create some artificial data as a dictionary and then convert it to a Pandas DataFrame object.

In [2]:
data = {"Name":["Tom", "Kate", "Mark", "Ken"], "Birth_Date":["16/11/89", "8/03/65", "14/05/64", "06/01/68"]}

In [3]:
# Convert dict into DataFrame
df = pd.DataFrame(data)

In [4]:
df

Unnamed: 0,Name,Birth_Date
0,Tom,16/11/89
1,Kate,8/03/65
2,Mark,14/05/64
3,Ken,06/01/68


Let's check the type of the ```Birth_Date``` column to make sure that it is ```object``` and not ```datetime```.

In [5]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 2 columns):
Name          4 non-null object
Birth_Date    4 non-null object
dtypes: object(2)
memory usage: 192.0+ bytes


We clearly see that ```Birth_Date``` has dtype ```object```, meaning the values are plain strings, so the usual Pandas datetime methods will not work on it. Let's convert it to a datetime column with the ```pd.to_datetime()``` method. Before that, note that in the ```Birth_Date``` column the first entry is the day, the second is the month, and the third is the year.

In [6]:
df['date_of_bith'] = pd.to_datetime(df['Birth_Date'])

In [7]:
df

Unnamed: 0,Name,Birth_Date,date_of_bith
0,Tom,16/11/89,1989-11-16
1,Kate,8/03/65,2065-08-03
2,Mark,14/05/64,2064-05-14
3,Ken,06/01/68,2068-06-01


Hmm, who was born in the future? What's wrong here? [The answer comes from the docs](https://docs.python.org/3/library/time.html): Python depends on the platform’s C library, which generally doesn’t have year 2000 issues, since all dates and times are represented internally as seconds since the epoch. The function ```strptime()``` can parse 2-digit years when given the ```%y``` format code. When 2-digit years are parsed, they are converted according to the **POSIX** and **ISO C** standards: **values 69–99 are mapped to 1969–1999, and values 0–68 are mapped to 2000–2068**.
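The pivot rule quoted above can be checked directly with the standard library. This is a minimal sketch; the variable name ```years``` and the sample values are only illustrative and are not part of the original notebook.

```python
from datetime import datetime

# Parse a few 2-digit years with %y to see which century each one lands in.
years = {y: datetime.strptime(y, "%y").year for y in ("65", "68", "69", "89")}

for two_digit, full in years.items():
    print(two_digit, "->", full)
```

Values up to 68 land in the 2000s and values from 69 onward land in the 1900s, which matches the three "future" birth dates in the table.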

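Now everything is clear, and we need a workaround. I solved this problem with the Pandas Series ```.str``` accessor: slice off the two-digit year and insert ```19``` in front of it, so that every year is passed to ```pd.to_datetime()``` with four digits.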

In [8]:
df['correct_birth_date'] = pd.to_datetime(df['Birth_Date'].str[:-2] + '19' + df['Birth_Date'].str[-2:])

In [9]:
df

Unnamed: 0,Name,Birth_Date,date_of_bith,correct_birth_date
0,Tom,16/11/89,1989-11-16,1989-11-16
1,Kate,8/03/65,2065-08-03,1965-08-03
2,Mark,14/05/64,2064-05-14,1964-05-14
3,Ken,06/01/68,2068-06-01,1968-06-01
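One caveat about the outputs above: the data is day-first, yet ```8/03/65``` and ```06/01/68``` come out as August 3 and June 1, because in the run shown ```pd.to_datetime()``` read ambiguous strings month-first. A minimal sketch, assuming the same four strings and assuming everyone was born in the 1900s, that passes an explicit ```format``` so day and month cannot be swapped. The names ```birth```, ```with_century```, and ```parsed``` are illustrative only.

```python
import pandas as pd

birth = pd.Series(["16/11/89", "8/03/65", "14/05/64", "06/01/68"])

# Same string trick as above: insert the century before the 2-digit year.
# Assumption: every person in this data was born in the 1900s.
with_century = birth.str[:-2] + "19" + birth.str[-2:]

# An explicit day/month/year format removes the day-vs-month guesswork.
parsed = pd.to_datetime(with_century, format="%d/%m/%Y")

print(parsed.dt.strftime("%Y-%m-%d").tolist())
```

With the format spelled out, Kate's date is read as 8 March 1965 and Ken's as 6 January 1968. Passing ```dayfirst=True``` to ```pd.to_datetime()``` is another option when the exact format is not known in advance.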


In [10]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 4 columns):
Name                  4 non-null object
Birth_Date            4 non-null object
date_of_bith          4 non-null datetime64[ns]
correct_birth_date    4 non-null datetime64[ns]
dtypes: datetime64[ns](2), object(2)
memory usage: 256.0+ bytes


Now you can use datetime methods and attributes on this column. Let's extract the year, month, and day separately.

In [12]:
df['year'] = df['correct_birth_date'].dt.year

df['month'] = df['correct_birth_date'].dt.month

df['day'] = df['correct_birth_date'].dt.day

In [13]:
df

Unnamed: 0,Name,Birth_Date,date_of_bith,correct_birth_date,year,month,day
0,Tom,16/11/89,1989-11-16,1989-11-16,1989,11,16
1,Kate,8/03/65,2065-08-03,1965-08-03,1965,8,3
2,Mark,14/05/64,2064-05-14,1964-05-14,1964,5,14
3,Ken,06/01/68,2068-06-01,1968-06-01,1968,6,1
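Other members of the ```.dt``` accessor work the same way as ```year```, ```month```, and ```day```. A small sketch, assuming a standalone Series built from two of the dates in the table above; ```dates```, ```weekday```, and ```formatted``` are illustrative names.

```python
import pandas as pd

# ISO-formatted strings, so parsing is unambiguous.
dates = pd.Series(pd.to_datetime(["1989-11-16", "1965-08-03"]))

# Name of the weekday for each date.
weekday = dates.dt.day_name()

# Render each date back to a string in a chosen layout.
formatted = dates.dt.strftime("%d.%m.%Y")

print(weekday.tolist())
print(formatted.tolist())
```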


## Split datetime object into date and time


---

What if we want to split a datetime object into date and time? That's not a hard task. We can use the Pandas ```to_datetime()``` method together with the ```.dt``` accessor attributes, as above. For this, let's rewrite our data and then create a new DataFrame.

In [14]:
data = {"Name":["Tom", "Kate", "Mark", "Ken"], "Birth_Date":["2016-02-22 14:59:44.561776", "2017-03-23 15:59:44.561776",
                                                            "2018-04-24 16:59:44.561776", "2019-05-25 17:59:44.561776"]}

In [15]:
df = pd.DataFrame(data)

df['birth_date'] = pd.to_datetime(df['Birth_Date'])

In [16]:
df

Unnamed: 0,Name,Birth_Date,birth_date
0,Tom,2016-02-22 14:59:44.561776,2016-02-22 14:59:44.561776
1,Kate,2017-03-23 15:59:44.561776,2017-03-23 15:59:44.561776
2,Mark,2018-04-24 16:59:44.561776,2018-04-24 16:59:44.561776
3,Ken,2019-05-25 17:59:44.561776,2019-05-25 17:59:44.561776


In [17]:
df.dtypes

Name                  object
Birth_Date            object
birth_date    datetime64[ns]
dtype: object

Let's split the ```birth_date``` column into date and time columns.

In [18]:
# birth_date is already datetime64, so the .dt accessor can be used directly
df['Date'] = df['birth_date'].dt.date

df['Time'] = df['birth_date'].dt.time

In [19]:
df

Unnamed: 0,Name,Birth_Date,birth_date,Date,Time
0,Tom,2016-02-22 14:59:44.561776,2016-02-22 14:59:44.561776,2016-02-22,14:59:44.561776
1,Kate,2017-03-23 15:59:44.561776,2017-03-23 15:59:44.561776,2017-03-23,15:59:44.561776
2,Mark,2018-04-24 16:59:44.561776,2018-04-24 16:59:44.561776,2018-04-24,16:59:44.561776
3,Ken,2019-05-25 17:59:44.561776,2019-05-25 17:59:44.561776,2019-05-25,17:59:44.561776
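One thing to keep in mind: ```.dt.date``` and ```.dt.time``` return plain Python ```date``` and ```time``` objects, so the new columns have dtype ```object``` and the ```.dt``` accessor no longer works on them. A minimal sketch, assuming a standalone Series with two of the timestamps above, comparing ```.dt.date``` with ```.dt.normalize()```, which keeps the datetime dtype and only resets the time to midnight. The names ```stamps```, ```as_objects```, and ```midnight``` are illustrative.

```python
import pandas as pd

stamps = pd.Series(pd.to_datetime(["2016-02-22 14:59:44.561776",
                                   "2017-03-23 15:59:44.561776"]))

# Python datetime.date objects, stored with dtype object.
as_objects = stamps.dt.date

# Still a datetime column; the time part is set to 00:00:00.
midnight = stamps.dt.normalize()

print(as_objects.dtype)
print(midnight.dtype)
```

Which one fits better depends on what comes next: ```.dt.date``` is convenient for display or export, while ```.dt.normalize()``` is handy if you still want to filter or group with datetime methods.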
