# **How to extract year, month, day, hour, minute, second, and week from a datetime variable**

**Learning objectives:**


In this chapter you will learn the following topics:

- Extracting date and time parts from a datetime variable

- Deriving representations of the year, month, and day

- Extracting time parts from a time variable (hours, minutes, and seconds)
- Creating representations of day and week


**Technical Requirements**

In this chapter, you will use the following Python libraries:
* pandas
* datetime

Install Jupyter Notebook along with these libraries. To follow this chapter, you should be familiar with both of them.

# __Introduction__

Date and time variables contain information about a date, a time, or both. Examples of
datetime variables are date of birth and date of last transaction.

We do not use datetime variables in their
raw format when building ML models. Instead, we derive multiple features from these variables.

**Example:**

_To demonstrate datetime extraction, let's first create a toy dataframe
and then sample 10 rows from it:_

_freq='s' generates one timestamp per second._

In [0]:
import pandas as pd
from datetime import datetime
date_rng = pd.date_range(start='1/1/2018', end='2/1/2018', freq='s') # strings are read as month/day/year: one timestamp per second from Jan 1 to Feb 1, 2018
df = pd.DataFrame(date_rng)
df.columns = ["Datetime"]

In [0]:
df.head(5)

Unnamed: 0,Datetime
0,2018-01-01 00:00:00
1,2018-01-01 00:00:01
2,2018-01-01 00:00:02
3,2018-01-01 00:00:03
4,2018-01-01 00:00:04
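As a rough check on how large this toy dataframe is, the sketch below counts the timestamps that `pd.date_range` produces at one-second frequency. The variable name `n_rows` is introduced here for illustration only, and the dates are written in ISO format to avoid any month/day ambiguity.

```python
import pandas as pd

# One timestamp per second from 2018-01-01 00:00:00 to 2018-02-01 00:00:00.
# Both endpoints are included by default.
date_rng = pd.date_range(start='2018-01-01', end='2018-02-01', freq='s')

# 31 days * 86,400 seconds per day, plus the closing timestamp
n_rows = len(date_rng)
print(n_rows)
print(date_rng[0], date_rng[-1])
```

Because the range is this large, sampling a handful of rows keeps the displayed output manageable.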


_Take ten random samples from the dataframe._

In [0]:
df = df.sample(n=10).reset_index(drop=True) # take random 10 samples from dataframe df

In [0]:
df.head()

Unnamed: 0,Datetime
0,2018-01-18 16:11:53
1,2018-01-30 09:24:53
2,2018-01-04 09:05:26
3,2018-01-29 09:27:09
4,2018-01-06 17:48:21
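The sampled rows will differ on every run, so your output will likely not match the rows shown above. If reproducible output matters, one option is to pass `random_state` to `sample()`. The sketch below assumes an arbitrary seed of 42; the names `sample_a` and `sample_b` are illustrative.

```python
import pandas as pd

date_rng = pd.date_range(start='2018-01-01', end='2018-02-01', freq='s')
df = pd.DataFrame({'Datetime': date_rng})

# A fixed random_state makes the draw repeatable: both calls pick the same 10 rows
sample_a = df.sample(n=10, random_state=42).reset_index(drop=True)
sample_b = df.sample(n=10, random_state=42).reset_index(drop=True)

print(sample_a.equals(sample_b))
```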


_Check the data type of the column._

In [0]:
print(df['Datetime'].dtype) # data type of the Datetime column

datetime64[ns]
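Printing the dtype works well interactively. Inside a script, a programmatic check may be more convenient. The following is a minimal sketch using `is_datetime64_any_dtype` from `pandas.api.types`; the column `as_text` is a hypothetical text column added only for contrast.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

stamps = ['2018-01-18 16:11:53', '2018-01-30 09:24:53']
df = pd.DataFrame({
    'Datetime': pd.to_datetime(stamps),  # parsed into a datetime64 column
    'as_text': stamps,                   # same values left as plain strings
})

is_dt = is_datetime64_any_dtype(df['Datetime'])      # True for the parsed column
is_txt_dt = is_datetime64_any_dtype(df['as_text'])   # False for the text column
print(is_dt, is_txt_dt)
```

The `.dt` accessor used in the rest of this chapter is only available on columns for which this check passes.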


_Extract the year, month, and day from the Datetime column._



In [0]:
df['Year'] = df['Datetime'].dt.year # extract Year from Datetime column

df['Month'] = df['Datetime'].dt.month # extract Month from Datetime column

df['Day'] = df['Datetime'].dt.day # extract Day from Datetime column
df.head(3)

Unnamed: 0,Datetime,Year,Month,Day
0,2018-01-18 16:11:53,2018,1,18
1,2018-01-30 09:24:53,2018,1,30
2,2018-01-04 09:05:26,2018,1,4


_Extract the hour, minute, and second from the Datetime column._

In [0]:
df['Hour'] = df['Datetime'].dt.hour  # extract Hour from Datetime column
df['Minute'] = df['Datetime'].dt.minute  # extract Minute from Datetime column
df['Second'] = df['Datetime'].dt.second  # extract Second from Datetime column
df.head()

Unnamed: 0,Datetime,Year,Month,Day,Hour,Minute,Second
0,2018-01-18 16:11:53,2018,1,18,16,11,53
1,2018-01-30 09:24:53,2018,1,30,9,24,53
2,2018-01-04 09:05:26,2018,1,4,9,5,26
3,2018-01-29 09:27:09,2018,1,29,9,27,9
4,2018-01-06 17:48:21,2018,1,6,17,48,21


_To extract each part, you used the attributes of pandas' .dt accessor (dt.year, dt.month, dt.day, dt.hour, dt.minute, and dt.second) on the Datetime column._

_In this way, you can successfully extract date and time parts from the data._
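When the same extraction is needed for several dataframes, the individual steps can be wrapped in a small helper. The function `add_datetime_parts` below is a hypothetical name used for this sketch, not part of pandas; it assumes the column already has a datetime dtype.

```python
import pandas as pd

def add_datetime_parts(frame, column):
    """Return a copy of `frame` with one new column per datetime component."""
    out = frame.copy()
    for part in ['year', 'month', 'day', 'hour', 'minute', 'second']:
        # getattr(out[column].dt, 'year') is the same as out[column].dt.year
        out[part.capitalize()] = getattr(out[column].dt, part)
    return out

df = pd.DataFrame({
    'Datetime': pd.to_datetime(['2018-01-18 16:11:53', '2018-01-04 09:05:26'])
})
parts = add_datetime_parts(df, 'Datetime')
print(parts)
```

Working on a copy leaves the original dataframe untouched, which makes the helper safe to call repeatedly.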

**What if the datetime variable stores months as text?**

_Let's first create a new dataframe where the datetime variable is stored as an
object (text) and display the output:_

In [0]:
df = pd.DataFrame({'date_var':['Apr-2020', 'Jan-2018', 'Jun-2017','Nov-2019']})
df


Unnamed: 0,date_var
0,Apr-2020
1,Jan-2018
2,Jun-2017
3,Nov-2019


_Let's check the data type of
the date_var variable._

In [0]:
df.dtypes

date_var    object
dtype: object

_It is cast as an object variable, so let's change its data type to datetime and display the
dataframe:_

In [0]:
df['datetime_var'] = pd.to_datetime(df['date_var'])
df

Unnamed: 0,date_var,datetime_var
0,Apr-2020,2020-04-01
1,Jan-2018,2018-01-01
2,Jun-2017,2017-06-01
3,Nov-2019,2019-11-01


In [0]:
df.dtypes

date_var                object
datetime_var    datetime64[ns]
dtype: object

_We can see that the original and newly created
variables are cast as object and datetime, respectively._
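Letting `to_datetime()` infer the layout is convenient, but stating the format explicitly can make the conversion more predictable. The sketch below assumes English month abbreviations and adds a deliberately invalid value, `'not a date'`, to show how `errors='coerce'` behaves; the names `raw` and `parsed` are illustrative.

```python
import pandas as pd

raw = pd.Series(['Apr-2020', 'Jan-2018', 'Jun-2017', 'not a date'])

# %b = abbreviated month name, %Y = four-digit year.
# errors='coerce' turns values that do not match the format into NaT
# instead of raising an exception.
parsed = pd.to_datetime(raw, format='%b-%Y', errors='coerce')
print(parsed)
```

Counting the missing values afterwards, for example with `parsed.isna().sum()`, gives a quick indication of how many entries failed to parse.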


_Let's extract the date and time parts of the variable that was recast into
datetime:_

In [0]:
df['date'] = df['datetime_var'].dt.date
df['time'] = df['datetime_var'].dt.time
df

Unnamed: 0,date_var,datetime_var,date,time
0,Apr-2020,2020-04-01,2020-04-01,00:00:00
1,Jan-2018,2018-01-01,2018-01-01,00:00:00
2,Jun-2017,2017-06-01,2017-06-01,00:00:00
3,Nov-2019,2019-11-01,2019-11-01,00:00:00


In [0]:
df.dtypes

date_var                object
datetime_var    datetime64[ns]
date                    object
time                    object
dtype: object
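Note that the `date` and `time` columns have the object dtype, because `dt.date` and `dt.time` return plain Python objects. If you want to drop the time of day but keep a datetime dtype, so that the `.dt` accessor remains usable, `dt.normalize()` is one alternative. A minimal sketch, with illustrative variable names:

```python
import pandas as pd

ts = pd.Series(pd.to_datetime(['2018-01-18 16:11:53', '2018-01-30 09:24:53']))

as_objects = ts.dt.date          # Python datetime.date objects, object dtype
as_midnight = ts.dt.normalize()  # time reset to 00:00:00, dtype stays datetime64

print(as_objects.dtype, as_midnight.dtype)
print(as_midnight)
```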

__What if we want to derive the day of the month, the day of the week, and the week of the year?__

_Let's create 25 datetime observations, beginning from 2020-01-04 at midnight,
in increments of 1 day._

_freq='D' sets the frequency to one day._

In [0]:
range_d = pd.date_range('2020-01-04', periods=25, freq='D')
df = pd.DataFrame({'date': range_d})
df.head()

Unnamed: 0,date
0,2020-01-04
1,2020-01-05
2,2020-01-06
3,2020-01-07
4,2020-01-08


_Let's extract the day of the month, which can take values between 1 and 31._

In [0]:
df['day_of_mo'] = df['date'].dt.day
df.head()

Unnamed: 0,date,day_of_mo
0,2020-01-04,4
1,2020-01-05,5
2,2020-01-06,6
3,2020-01-07,7
4,2020-01-08,8


_Let's extract the day of the week into a new column, with values between 0 (Monday) and 6 (Sunday)._

In [0]:
df['day_of_week'] = df['date'].dt.dayofweek
df.head()

Unnamed: 0,date,day_of_mo,day_of_week
0,2020-01-04,4,5
1,2020-01-05,5,6
2,2020-01-06,6,0
3,2020-01-07,7,1
4,2020-01-08,8,2


_Let's extract the name of the day of the week, that is, Sunday, Monday, and
so on, into a new column._

In [0]:
df['day_week_name'] = df['date'].dt.day_name()
df.head()

Unnamed: 0,date,day_of_mo,day_of_week,day_week_name
0,2020-01-04,4,5,Saturday
1,2020-01-05,5,6,Sunday
2,2020-01-06,6,0,Monday
3,2020-01-07,7,1,Tuesday
4,2020-01-08,8,2,Wednesday
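A common feature derived from the day of the week is a weekend indicator. The sketch below relies on the pandas convention that Monday is 0, so Saturday and Sunday are 5 and 6; the column name `is_weekend` is an assumption made for this example.

```python
import pandas as pd

range_d = pd.date_range('2020-01-04', periods=5, freq='D')
df = pd.DataFrame({'date': range_d})

# dayofweek: Monday=0 ... Sunday=6, so values of 5 or more fall on the weekend
df['is_weekend'] = df['date'].dt.dayofweek >= 5
print(df)
```

The first two dates (a Saturday and a Sunday) are flagged, and the following weekdays are not.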


_Let's capture the corresponding ISO week of the year. Its value ranges from 1 to 53._

In [0]:
df['week'] = df['date'].dt.isocalendar().week  # dt.week was deprecated in pandas 1.1 and removed in 2.0
df.head()

Unnamed: 0,date,day_of_mo,day_of_week,day_week_name,week
0,2020-01-04,4,5,Saturday,1
1,2020-01-05,5,6,Sunday,1
2,2020-01-06,6,0,Monday,2
3,2020-01-07,7,1,Tuesday,2
4,2020-01-08,8,2,Wednesday,2
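Week numbers follow the ISO 8601 calendar, which can give surprising results around the turn of the year: the last days of December may belong to week 1 of the next year, and some years have a week 53. The sketch below assumes pandas 1.1 or later, where `Series.dt.isocalendar()` is available; the dates are chosen only to illustrate these edge cases.

```python
import pandas as pd

dates = pd.Series(pd.to_datetime([
    '2019-12-30',  # Monday that already belongs to ISO week 1 of 2020
    '2020-01-04',
    '2020-01-06',
    '2020-12-31',  # falls in ISO week 53 of 2020
]))

# isocalendar() returns a dataframe with the columns year, week, and day
iso = dates.dt.isocalendar()
print(iso)
```

For this reason, it may be safer to pair the week number with the ISO `year` column rather than with the calendar year from `dt.year`.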


**Topics you learned here about datetime variables:**

- How to extract the year, month, and day

- How to extract the hour, minute, and second

- How to extract the day of the week and the week of the year

# **Key takeaways**

- To create a range of
values starting from an arbitrary date, use pandas' date_range() function.

- To extract the date part, use
pandas' dt.date, and, to extract the time part, use pandas' dt.time on the
datetime variable.

- To extract date and time parts from a text variable, first recast it into the datetime
format using pandas' to_datetime().