# Interview Question - Python 10,000 Days

When is the 10,000-day anniversary of Python's birthday? On what day of the week did Python first appear?

First, let's explore dates and datetimes in Python.

In [1]:
import sys
from io import StringIO
import numpy as np
import pandas as pd

In [2]:
date_data = StringIO("""
month_day_year	day_month_year	date_time	year_month_day
4/22/1996	22-Apr-96	Mon Apr 22 09:50:35 1996	1996-04-22
4/23/1996	23-Apr-96	Tue Apr 23 19:50:35 1996	1996-04-23
5/14/1996	14-May-96	Tue May 14 09:50:35 1996	1996-05-14
5/15/1996	15-May-96	Wed May 15 09:50:35 1996	1996-05-15
5/16/2001	16-May-01	Wed May 16 07:30:36 2001	2001-05-16
5/17/2002	17-May-02	Fri May 17 09:50:35 2002	2002-05-17
5/18/2003	18-May-03	Sun May 18 09:50:35 2003	2003-05-18
5/19/2004	19-May-04	Wed May 19 09:50:35 2004	2004-05-19
5/20/2005	20-May-05	Fri May 20 19:40:25 2005	2005-05-20
""")

date_df = pd.read_table(date_data)

for col in date_df:
    print(type(date_df[col][0]))

date_df.head()

<class 'str'>
<class 'str'>
<class 'str'>
<class 'str'>


|   | month_day_year | day_month_year | date_time | year_month_day |
|---|---|---|---|---|
| 0 | 4/22/1996 | 22-Apr-96 | Mon Apr 22 09:50:35 1996 | 1996-04-22 |
| 1 | 4/23/1996 | 23-Apr-96 | Tue Apr 23 19:50:35 1996 | 1996-04-23 |
| 2 | 5/14/1996 | 14-May-96 | Tue May 14 09:50:35 1996 | 1996-05-14 |
| 3 | 5/15/1996 | 15-May-96 | Wed May 15 09:50:35 1996 | 1996-05-15 |
| 4 | 5/16/2001 | 16-May-01 | Wed May 16 07:30:36 2001 | 2001-05-16 |


We can convert columns to datetimes:

In [3]:
date_df['month_day_year'] = pd.to_datetime(date_df['month_day_year'], errors='coerce')
date_df['day_month_year'] = pd.to_datetime(date_df['day_month_year'], errors='coerce')
date_df['date_time'] = pd.to_datetime(date_df['date_time'], errors='coerce')
date_df['year_month_day'] = pd.to_datetime(date_df['year_month_day'], errors='coerce')

for col in date_df:
    print(type(date_df[col][0]))
date_df.head()

<class 'pandas._libs.tslibs.timestamps.Timestamp'>
<class 'pandas._libs.tslibs.timestamps.Timestamp'>
<class 'pandas._libs.tslibs.timestamps.Timestamp'>
<class 'pandas._libs.tslibs.timestamps.Timestamp'>


|   | month_day_year | day_month_year | date_time | year_month_day |
|---|---|---|---|---|
| 0 | 1996-04-22 | 1996-04-22 | 1996-04-22 09:50:35 | 1996-04-22 |
| 1 | 1996-04-23 | 1996-04-23 | 1996-04-23 19:50:35 | 1996-04-23 |
| 2 | 1996-05-14 | 1996-05-14 | 1996-05-14 09:50:35 | 1996-05-14 |
| 3 | 1996-05-15 | 1996-05-15 | 1996-05-15 09:50:35 | 1996-05-15 |
| 4 | 2001-05-16 | 2001-05-16 | 2001-05-16 07:30:36 | 2001-05-16 |
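
The `errors='coerce'` argument used in the conversion deserves a closer look: instead of raising an exception on a value it cannot parse, `pd.to_datetime` replaces that value with `NaT` ("not a time"). A minimal sketch, using a made-up two-element series that is not part of the data above:

```python
import pandas as pd

# Hypothetical input: one valid date and one junk value.
raw = pd.Series(["1996-04-22", "not a date"])

# errors="coerce" turns unparseable entries into NaT instead of raising.
parsed = pd.to_datetime(raw, errors="coerce")

# True marks the entries that could not be parsed.
print(parsed.isna().tolist())
```

Checking `isna()` after a coerced conversion is a cheap way to spot rows that silently failed to parse.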


Alternatively, we can ask pandas to parse the date columns while it reads the data:

In [4]:
date_data = StringIO("""
month_day_year	day_month_year	date_time	year_month_day
4/22/1996	22-Apr-96	Mon Apr 22 09:50:35 1996	1996-04-22
4/23/1996	23-Apr-96	Tue Apr 23 19:50:35 1996	1996-04-23
5/14/1996	14-May-96	Tue May 14 09:50:35 1996	1996-05-14
5/15/1996	15-May-96	Wed May 15 09:50:35 1996	1996-05-15
5/16/2001	16-May-01	Wed May 16 07:30:36 2001	2001-05-16
5/17/2002	17-May-02	Fri May 17 09:50:35 2002	2002-05-17
5/18/2003	18-May-03	Sun May 18 09:50:35 2003	2003-05-18
5/19/2004	19-May-04	Wed May 19 09:50:35 2004	2004-05-19
5/20/2005	20-May-05	Fri May 20 19:40:25 2005	2005-05-20
""")
date_df2 = pd.read_table(date_data, parse_dates=[0,1,2,3])

for col in date_df2:
    print(type(date_df2[col][0]))

date_df2.head()

<class 'pandas._libs.tslibs.timestamps.Timestamp'>
<class 'pandas._libs.tslibs.timestamps.Timestamp'>
<class 'pandas._libs.tslibs.timestamps.Timestamp'>
<class 'pandas._libs.tslibs.timestamps.Timestamp'>


|   | month_day_year | day_month_year | date_time | year_month_day |
|---|---|---|---|---|
| 0 | 1996-04-22 | 1996-04-22 | 1996-04-22 09:50:35 | 1996-04-22 |
| 1 | 1996-04-23 | 1996-04-23 | 1996-04-23 19:50:35 | 1996-04-23 |
| 2 | 1996-05-14 | 1996-05-14 | 1996-05-14 09:50:35 | 1996-05-14 |
| 3 | 1996-05-15 | 1996-05-15 | 1996-05-15 09:50:35 | 1996-05-15 |
| 4 | 2001-05-16 | 2001-05-16 | 2001-05-16 07:30:36 | 2001-05-16 |


What if we have an odd format that pandas can't guess?

In [5]:
strange_date = "12:05:15 2018-02-02"
pd.to_datetime(strange_date, format="%H:%M:%S %Y-%d-%m")

Timestamp('2018-02-02 12:05:15')

For the full list of format codes, see:

https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior
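
Because both the day and the month are `02` in the string above, that example cannot show whether `%d` and `%m` are in the right order. As a small sketch with a hypothetical, unambiguous timestamp (13 February 2018), the standard library's `datetime.strptime` accepts the same format codes:

```python
from datetime import datetime

fmt = "%H:%M:%S %Y-%d-%m"      # time, then year-day-month
stamp = "12:05:15 2018-13-02"  # hypothetical value: 13 February 2018

parsed = datetime.strptime(stamp, fmt)
print(parsed)  # 2018-02-13 12:05:15

# Swapping %d and %m would try to read 13 as a month, which is invalid.
try:
    datetime.strptime(stamp, "%H:%M:%S %Y-%m-%d")
    swapped_ok = True
except ValueError:
    swapped_ok = False
print(swapped_ok)  # False
```

Testing a format string against a date whose day is greater than 12 is a quick way to catch a swapped day and month.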


--------------

Now that pandas knows the column holds dates, we can extract specific properties of each date through the `.dt` accessor:

In [6]:
column1 = date_df.iloc[:,0]

pd.DataFrame({"year": column1.dt.year,
              "month": column1.dt.month,
              "day": column1.dt.day,
              "hour": column1.dt.hour,
              "dayofyear": column1.dt.dayofyear,
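              # .dt.week and .dt.weekofyear were removed in pandas 2.0;
              # .dt.isocalendar().week returns the same ISO week number.
              "week": column1.dt.isocalendar().week,
              "weekofyear": column1.dt.isocalendar().week,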
              "dayofweek": column1.dt.dayofweek,
              "weekday": column1.dt.weekday,
              "quarter": column1.dt.quarter,
             })

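|   | year | month | day | hour | dayofyear | week | weekofyear | dayofweek | weekday | quarter |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1996 | 4 | 22 | 0 | 113 | 17 | 17 | 0 | 0 | 2 |
| 1 | 1996 | 4 | 23 | 0 | 114 | 17 | 17 | 1 | 1 | 2 |
| 2 | 1996 | 5 | 14 | 0 | 135 | 20 | 20 | 1 | 1 | 2 |
| 3 | 1996 | 5 | 15 | 0 | 136 | 20 | 20 | 2 | 2 | 2 |
| 4 | 2001 | 5 | 16 | 0 | 136 | 20 | 20 | 2 | 2 | 2 |
| 5 | 2002 | 5 | 17 | 0 | 137 | 20 | 20 | 4 | 4 | 2 |
| 6 | 2003 | 5 | 18 | 0 | 138 | 20 | 20 | 6 | 6 | 2 |
| 7 | 2004 | 5 | 19 | 0 | 140 | 21 | 21 | 2 | 2 | 2 |
| 8 | 2005 | 5 | 20 | 0 | 140 | 20 | 20 | 4 | 4 | 2 |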
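
The same properties can be cross-checked without pandas. The sketch below uses only the standard library on the first row's date; the quarter formula is our own arithmetic (assuming calendar quarters), not a built-in attribute.

```python
from datetime import date

d = date(1996, 4, 22)  # first row of the table above

day_of_year = d.timetuple().tm_yday                # 1-based day within the year
iso_year, iso_week, iso_weekday = d.isocalendar()  # ISO 8601 week numbering
day_of_week = d.weekday()                          # Monday == 0, Sunday == 6
quarter = (d.month - 1) // 3 + 1                   # assumption: calendar quarters

print(day_of_year, iso_week, day_of_week, quarter)  # 113 17 0 2
```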


We can also compute date differences natively. Subtracting two timestamps gives a `Timedelta`, which is expressed in days by default. Units larger than a day have no fixed length (a month is 28 to 31 days, a year is 365 or 366 days), so we convert the difference to days or seconds:

In [7]:
print(date_df.iloc[1,0])
print(date_df.iloc[3,0])
diff = date_df.iloc[3,0]-date_df.iloc[1,0]
print(diff)
print(type(diff))

diff_day_value = diff / np.timedelta64(1, 'D')
print('# of days: {}'.format(diff_day_value))
print('# of seconds: {}'.format(diff / np.timedelta64(1, 's')))

1996-04-23 00:00:00
1996-05-15 00:00:00
22 days 00:00:00
<class 'pandas._libs.tslibs.timedeltas.Timedelta'>
# of days: 22.0
# of seconds: 1900800.0
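
Dividing by `np.timedelta64` is one way to get a plain number. As an alternative sketch, the standard library's `timedelta` exposes the same quantities through `.days` and `.total_seconds()`, shown here on the same two dates:

```python
from datetime import date

start = date(1996, 4, 23)
end = date(1996, 5, 15)

diff = end - start           # a datetime.timedelta
print(diff.days)             # 22
print(diff.total_seconds())  # 1900800.0
```

pandas' `Timedelta` is a subclass of `datetime.timedelta`, so these two accessors are available on it as well.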


## Today's date and past/future dates

In [8]:
from datetime import datetime

In [9]:
today = datetime.now()
print(today)

2019-02-06 09:35:55.569494
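
Past and future dates follow from adding or subtracting a `timedelta`. The sketch below pins the reference date to the day printed above so the results are reproducible; the 30-day and one-week offsets are arbitrary choices for illustration.

```python
from datetime import datetime, timedelta

reference = datetime(2019, 2, 6)  # fixed stand-in for datetime.now()

thirty_days_later = reference + timedelta(days=30)
one_week_earlier = reference - timedelta(weeks=1)

print(thirty_days_later.date())  # 2019-03-08
print(one_week_earlier.date())   # 2019-01-30
```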


## Python 10,000 days later
From <a href="https://en.wikipedia.org/wiki/Python_(programming_language)">Wikipedia</a>, we learn that Python first appeared on February 20, 1991.

In [10]:
pybday = '1991-02-20'

In [11]:
birthdate = np.datetime64(pybday)

timedelta_10k = np.timedelta64(10000, 'D')

print('10,000 day birthday is: {}'.format(birthdate + timedelta_10k))

10,000 day birthday is: 2018-07-08
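
As a sanity check, the same answer can be reproduced with the standard library alone. `days_after` is a hypothetical helper name introduced only for this sketch.

```python
from datetime import date, timedelta


def days_after(start: date, n_days: int) -> date:
    """Return the date that falls n_days after start."""
    return start + timedelta(days=n_days)


python_birthday = date(1991, 2, 20)
milestone = days_after(python_birthday, 10_000)

print(milestone)            # 2018-07-08
print(milestone.weekday())  # 6, i.e. Sunday (Monday == 0)
```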


What day of the week was '1991-02-20'?

In [12]:
first_date = pd.to_datetime(np.datetime64(pybday))
type(first_date)

pandas._libs.tslibs.timestamps.Timestamp

In [13]:
first_date.strftime('%A')

'Wednesday'
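
The name returned by `strftime('%A')` depends on the active locale. If a locale-independent number is preferable, a sketch using the standard library's `weekday()` and `isoweekday()` looks like this:

```python
from datetime import date

python_birthday = date(1991, 2, 20)

print(python_birthday.weekday())     # 2 (Monday == 0, so 2 is Wednesday)
print(python_birthday.isoweekday())  # 3 (Monday == 1, so 3 is Wednesday)
```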