In [1]:
import pandas as pd

### Pandas To Datetime

pd.to_datetime() is a beautiful function that converts your strings (and integers) into datetimes. This is extremely useful when working with time series data.

Let's convert strings and integers to datetimes:
1. Basic conversion with scalar string
2. Convert Pandas Series to datetime
3. Convert Pandas Series to datetime w/ custom format
4. Convert Unix integer (days) to datetime
5. Convert integer (seconds) to datetime
6. Bonus: Change your origin or reference point

The hardest part about this Jupyter notebook will be creating the messy strings to convert. Forgive the plumbing you'll see.

### 1. Basic conversion with scalar string

To convert any string to a datetime, start with pd.to_datetime(). It is a top-level function, called directly from the pandas library rather than on a Series or DataFrame.

For this first one, I'll show the types of the variables to demonstrate going from a string to a datetime.

In [2]:
string_to_convert = '2020-02-01'
print(f'Your string: {string_to_convert}')
print(f'Your string_to_convert type: {type(string_to_convert)}')
print()

# Convert your string
new_date = pd.to_datetime(string_to_convert)

print(f'Your new date is: {new_date}')
print(f'Your new type is: {type(new_date)}')

Your string: 2020-02-01
Your string_to_convert type: <class 'str'>

Your new date is: 2020-02-01 00:00:00
Your new type is: <class 'pandas._libs.tslibs.timestamps.Timestamp'>
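
Once the string is a Timestamp, its parts are available as attributes, which a plain string can't offer. A minimal sketch reusing the date above (the variable name ts is only for illustration):

```python
import pandas as pd

ts = pd.to_datetime('2020-02-01')

# A Timestamp exposes its components as attributes
print(ts.year, ts.month, ts.day)  # 2020 2 1

# ...and knows calendar facts about itself
print(ts.day_name())              # Saturday
```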


### 2. Convert Pandas Series to datetime

Instead of passing a single string, I usually pass a Series of strings that need converting.

The converted result then replaces the original strings, whether that's a standalone Series or a DataFrame column.

First, I'll make my Series:

In [3]:
s = pd.Series(['2020-02-01',
               '2020-02-02',
               '2020-02-03',
               '2020-02-04'])
s

0    2020-02-01
1    2020-02-02
2    2020-02-03
3    2020-02-04
dtype: object

In [4]:
s = pd.to_datetime(s)
s

0   2020-02-01
1   2020-02-02
2   2020-02-03
3   2020-02-04
dtype: datetime64[ns]
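
The same call works on a DataFrame column. Here's a small sketch of swapping a string column for its datetime version; the DataFrame and its column names (date, value) are made up for this example:

```python
import pandas as pd

# Hypothetical DataFrame; the column names are invented for illustration
df = pd.DataFrame({
    'date': ['2020-02-01', '2020-02-02', '2020-02-03', '2020-02-04'],
    'value': [10, 20, 30, 40],
})

# Overwrite the string column with its datetime version
df['date'] = pd.to_datetime(df['date'])

# The 'date' column is now a datetime dtype instead of object
print(df.dtypes)
```

Assigning back to the same column name keeps the rest of the DataFrame untouched.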

### 3. Convert Pandas Series to datetime w/ custom format
Let's get into the awesome power of datetime conversion with format codes. Say you have a messy string with a date inside and you need to convert it to a date. You need to tell pandas how to read it, and this is done via **format codes.**

In the example below, %m is the two-digit month, %Y is the four-digit year, and %d is the two-digit day. Everything else in the format is literal text that has to match the string exactly. Pass any string along with a matching format and pandas will parse the dates. Pretty cool!

In [5]:
s = pd.Series(['My 3date is 01199002',
               'My 3date is 02199015',
               'My 3date is 03199020',
               'My 3date is 09199204'])
s

0    My 3date is 01199002
1    My 3date is 02199015
2    My 3date is 03199020
3    My 3date is 09199204
dtype: object

In [6]:
s = pd.to_datetime(s, format="My 3date is %m%Y%d")
s

0   1990-01-02
1   1990-02-15
2   1990-03-20
3   1992-09-04
dtype: datetime64[ns]
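
Messy data often has rows that don't fit the format at all. By default pd.to_datetime() raises a ValueError on those rows; passing errors='coerce' turns them into NaT instead. A quick sketch, where the third entry is a made-up bad row:

```python
import pandas as pd

# The last entry is an invented bad row with no date in it
messy = pd.Series(['My 3date is 01199002',
                   'My 3date is 02199015',
                   'No date here'])

# errors='coerce' replaces anything that doesn't match the format with NaT
parsed = pd.to_datetime(messy, format='My 3date is %m%Y%d', errors='coerce')
print(parsed)

# Count how many rows failed to parse
print(parsed.isna().sum())  # 1
```

Whether to coerce or let the error surface depends on the dataset; coercing is convenient, but it can hide bad rows if you never count the NaT values.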

### 4. Convert Unix integer (days) to datetime
You can also convert integers into datetimes. You'll need to keep two things in mind:
1. What is your reference point?
2. What is the unit of your integer?

**Reference point** = What time do you want to start 'counting' your units from?

**Unit** = Is your integer in terms of # of days, seconds, years, etc.?

In [7]:
pd.to_datetime(14554, unit='D', origin='unix')

Timestamp('2009-11-06 00:00:00')
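
To see what the reference point and unit are doing, it helps to do the same arithmetic by hand. This sketch assumes only that origin='unix' means 1970-01-01, and the variable names are just for illustration:

```python
import pandas as pd

# Let pandas count 14554 days from the Unix epoch
from_int = pd.to_datetime(14554, unit='D', origin='unix')

# Do the same thing manually: reference point + (integer * unit)
by_hand = pd.Timestamp('1970-01-01') + pd.Timedelta(days=14554)

print(from_int == by_hand)  # True
```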

### 5. Convert integer (seconds) to datetime
More often, you'll have a Unix timestamp that is expressed in seconds. As in seconds away from the default origin of 1970-01-01.

For example, at the time of this post, we are 1,600,355,888 seconds away from 1970-01-01. That's a lot of seconds!

In [8]:
pd.to_datetime(1600355888, unit='s', origin='unix')

Timestamp('2020-09-17 15:18:08')
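
In practice these timestamps usually arrive as a whole column, and sometimes in milliseconds rather than seconds. A short sketch with made-up values spaced one minute and one hour after the timestamp above:

```python
import pandas as pd

# Hypothetical column of Unix timestamps in seconds
seconds = pd.Series([1600355888, 1600355948, 1600359488])
as_dt = pd.to_datetime(seconds, unit='s')
print(as_dt)

# The same first instant expressed in milliseconds needs unit='ms'
from_ms = pd.to_datetime(1600355888000, unit='ms')
print(from_ms)  # 2020-09-17 15:18:08
```

Getting the unit wrong doesn't always raise an error, so it's worth checking that the resulting dates land in a plausible range.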

### Bonus: 6. Change your origin or reference point
Say your dataset only has the # of days after a certain time, but no datetimes. You could either *add* all of those days to a start date via pd.Timedelta(), or you could convert them to datetimes with a different origin.

Let's check out the second option, counting from 2020-02-01.

Below, we convert 160 into +160 days *after* 2020-02-01.

In [9]:
pd.to_datetime(160, unit='D', origin='2020-02-01')

Timestamp('2020-07-10 00:00:00')
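
The two approaches mentioned above (adding a timedelta versus changing the origin) should land on the same dates. Here's a sketch comparing them on a small made-up Series of day offsets:

```python
import pandas as pd

# Hypothetical day offsets counted from 2020-02-01
days = pd.Series([0, 1, 160])
start = '2020-02-01'

# Option 1: let to_datetime count from a custom origin
via_origin = pd.to_datetime(days, unit='D', origin=start)

# Option 2: turn the integers into timedeltas and add them to the start date
via_timedelta = pd.to_timedelta(days, unit='D') + pd.Timestamp(start)

print(via_origin)
print((via_origin == via_timedelta).all())
```

The origin version is a single call, while the timedelta version makes the arithmetic explicit; pick whichever reads better in your notebook.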