# 1. Classes of the ```datetime``` module

In [53]:
import pandas as pd
import datetime as dt

df = pd.read_csv("./data/employees_satisfaction.csv", index_col=0)
df

Unnamed: 0,emp_id,age,Dept,education,recruitment_type,job_level,rating,awards,certifications,salary,gender,entry_date,last_raise,satisfied
0,HR8270,28,HR,PG,Referral,5,2.0,1,0,86750,m,2019-02-01,,1
1,TECH1860,50,Technology,PG,Recruitment Agency,3,5.0,2,1,42419,Male,2017-01-17,,0
2,TECH6390,43,Technology,UG,Referral,4,1.0,2,0,65715,f,2012-08-27,,1
3,SAL6191,44,Sales,PG,On-Campus,2,3.0,0,0,29805,f,2017-07-25,,1
4,HR6734,33,HR,UG,Recruitment Agency,2,1.0,5,0,29805,m,2019-05-17,,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
495,HR5330,49,HR,PG,On-Campus,2,5.0,6,0,29805,m,2014-03-21,,1
496,TECH9010,24,Technology,UG,Referral,2,4.0,2,0,29805,f,2018-02-20,,0
497,MKT7801,34,Marketing,PG,On-Campus,1,,2,0,24076,m,2020-10-20,,1
498,TECH5846,26,Technology,UG,Walk-in,2,1.0,1,1,29805,Male,2012-05-18,,0


### 1.1. Class ```date``` of the ```datetime``` module
The ```date``` class in Python's ```datetime``` module represents dates in the Gregorian calendar. Its constructor takes three integer arguments: year, month, and day. Let's create a ```date``` object:

In [54]:
d = dt.date(2020,4,23)
print(d)
print(type(d))

2020-04-23
<class 'datetime.date'>


We can extract the day, month, and year from a ```date``` object through its ```day```, ```month```, and ```year``` attributes. Let's try this on the current local date, which we create with the ```today()``` class method:

In [55]:
d1 = dt.date.today()

print('Day: ', d1.day)

print('Month: ', d1.month)

print('Year: ', d1.year)

### 1.2. Class ```time``` of the ```datetime``` module
```time``` is another class of the ```datetime``` module. It accepts integer arguments for the hour, minute, second, and microsecond, and returns a ```time``` object:

In [56]:
t = dt.time(13,20,13,40)

print(t)

print(type(t))

# hour
print('Hour :',t.hour)
# minute
print('Minute :',t.minute)
# second
print('Second :',t.second)
# microsecond
print('Microsecond :',t.microsecond)

13:20:13.000040
<class 'datetime.time'>
Hour : 13
Minute : 20
Second : 13
Microsecond : 40


### 1.3. Class ```datetime``` of the ```datetime``` module

```datetime``` is both the name of the module and the name of a class inside it. The class combines the attributes of ```date``` and ```time```: its arguments start with the year and end with the microsecond. Let's create a ```datetime``` object:

In [57]:
d1 = dt.datetime(2020,4,23,11,20,30,40)
print(d1)
print(type(d1))

2020-04-23 11:20:30.000040
<class 'datetime.datetime'>


You can also create an object holding the current local date and time using the ```now()``` method:

In [58]:
dt.datetime.now()

datetime.datetime(2023, 8, 18, 20, 10, 49, 482128)

You can extract any component from a ```datetime``` object using the same attributes we used with the ```date``` and ```time``` objects individually.
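As a minimal sketch of this attribute access, the snippet below rebuilds the same value used earlier under the illustrative name `moment` (a name chosen here, not one from the notebook) and reads its date and time components:

```python
import datetime as dt

# Same value as the datetime object created above
moment = dt.datetime(2020, 4, 23, 11, 20, 30, 40)

# Date components
print(moment.year, moment.month, moment.day)
# Time components
print(moment.hour, moment.minute, moment.second, moment.microsecond)
```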

# 2. Methods of the ```datetime``` class


### 2.1. Updating old Dates

There are two useful tools for this:
1. Replace individual fields of a ```datetime``` object without rebuilding the entire date using the ```replace()``` method. It returns a new object and leaves the original unchanged:

In [59]:
print('New datetime :',d1.replace(day=24, hour=14))

New datetime : 2020-04-24 14:20:30.000040


2. Separate the date and the time from a ```datetime``` object using the ```date()``` and ```time()``` methods.

In [60]:
print('Datetime :',d1)
# date
print('Date :',d1.date())
# time
print('Time :',d1.time())

Datetime : 2020-04-23 11:20:30.000040
Date : 2020-04-23
Time : 11:20:30.000040
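
Because ```datetime``` objects are immutable, ```replace()``` builds a new object rather than changing the existing one. A small sketch of that behaviour, with the names `original` and `updated` chosen only for illustration:

```python
import datetime as dt

original = dt.datetime(2020, 4, 23, 11, 20, 30, 40)

# replace() returns a new object; `original` is not modified
updated = original.replace(day=24, hour=14)

print(original)  # 2020-04-23 11:20:30.000040
print(updated)   # 2020-04-24 14:20:30.000040
```

To keep the change, assign the result back to a variable.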


### 2.2. What day of the week is it?
A really useful thing you can do with the ```datetime``` class is extract the day of the week. This is especially helpful in feature engineering because the target variable can depend on the day of the week: sales of a product are generally higher on a weekend, for example, while traffic on StackOverflow could be higher on weekdays when people are working.

The ```weekday()``` method returns an integer for the day of the week, where Monday is 0 and Sunday is 6. If you want values between 1 and 7 instead, as in the ISO 8601 convention (Monday is 1, Sunday is 7), use ```isoweekday()```:

In [61]:
print(d1.weekday()) # output 3 for Thursday
# isoweekday() counts from 1 (Monday) to 7 (Sunday)
print(d1.isoweekday())

3
4
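
One common way to turn the weekday into a feature is a weekend flag. The sketch below assumes that "weekend" means Saturday and Sunday; `is_weekend` is a hypothetical helper, not part of the ```datetime``` module:

```python
import datetime as dt

def is_weekend(day: dt.date) -> bool:
    """Return True for Saturday or Sunday (weekday() values 5 and 6)."""
    return day.weekday() >= 5

print(is_weekend(dt.date(2020, 4, 23)))  # Thursday -> False
print(is_weekend(dt.date(2020, 4, 25)))  # Saturday -> True
```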


### 2.3. What week is it?

Knowing the week of the year is useful when the target variable might be higher during certain times of the year. For example, sales on e-commerce websites are generally higher during vacations. You can get the week of the year by indexing the value returned by the ```isocalendar()``` method:

In [62]:
print(d1.isocalendar())
print('Week :',d1.isocalendar()[1])

datetime.IsoCalendarDate(year=2020, week=17, weekday=4)
Week : 17
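
One caveat is worth a quick illustration: the ISO year returned by ```isocalendar()``` can differ from the calendar year around New Year. The sketch below uses 1 January 2021 as an example date, which belongs to the last ISO week of 2020:

```python
import datetime as dt

new_year = dt.date(2021, 1, 1)
iso_year, iso_week, iso_weekday = new_year.isocalendar()

print(new_year.year)                    # calendar year: 2021
print(iso_year, iso_week, iso_weekday)  # ISO year, week, weekday: 2020 53 5
```

If you group data by week, it is therefore safer to pair the week number with the ISO year rather than with the calendar year.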


### 2.4. The ```calendar``` Module: Leap Year or Not?

To check this, use the ```isleap()``` function from the ```calendar``` module and pass the year as an argument:

In [63]:
import calendar
# leap year or not
calendar.isleap(d1.year)

True
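
The same module can also report how many days a month has, which reflects leap years directly. A short sketch using ```calendar.monthrange()```, with variable names chosen only for illustration:

```python
import calendar

# monthrange() returns (weekday of the 1st, number of days in the month)
first_weekday, days_in_feb_2020 = calendar.monthrange(2020, 2)
_, days_in_feb_2021 = calendar.monthrange(2021, 2)

print(days_in_feb_2020)  # 29, because 2020 is a leap year
print(days_in_feb_2021)  # 28
```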

You can print a calendar as follows:

In [64]:
print(calendar.month(2020,4))

     April 2020
Mo Tu We Th Fr Sa Su
       1  2  3  4  5
 6  7  8  9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30



In [65]:
print(calendar.calendar(2020))


                                  2020

      January                   February                   March
Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su
       1  2  3  4  5                      1  2                         1
 6  7  8  9 10 11 12       3  4  5  6  7  8  9       2  3  4  5  6  7  8
13 14 15 16 17 18 19      10 11 12 13 14 15 16       9 10 11 12 13 14 15
20 21 22 23 24 25 26      17 18 19 20 21 22 23      16 17 18 19 20 21 22
27 28 29 30 31            24 25 26 27 28 29         23 24 25 26 27 28 29
                                                    30 31

       April                      May                       June
Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su
       1  2  3  4  5                   1  2  3       1  2  3  4  5  6  7
 6  7  8  9 10 11 12       4  5  6  7  8  9 10       8  9 10 11 12 13 14
13 14 15 16 17 18 19      11 12 13 14 15 16 17      15 16 17 18 19 20 21
20 21 22 23 24 25 26      18 19 20 21 22 

# 3. Datetime Formats
The ```datetime``` module lets you convert dates and times between a few formats.

### 3.1. From and into ISO format:
If you want to create a ```date``` or ```datetime``` object from a string in ISO format, use the ```fromisoformat()``` method. To do the reverse, use the ```isoformat()``` method:

In [66]:
d_datetime = dt.date.fromisoformat('2020-04-23')
print("From ISO to normal datetime: "+str(d_datetime))
print(type(d_datetime))
d_ISO = dt.date(2020,4,23).isoformat()
print("From normal datetime to ISO: "+str(d_ISO))
print(type(d_ISO))

From ISO to normal datetime: 2020-04-23
<class 'datetime.date'>
From normal datetime to ISO: 2020-04-23
<class 'str'>


### 3.2. From and into String format:
If you want to convert a ```datetime``` object into a readable string, you can use the ```ctime()``` method. To extract just the date part from that string, you have to use slicing:

In [67]:
# current local date and time
d = dt.datetime.now()
# string format for date
print(d.ctime())
# slicing to extract date
print(d.ctime()[:10])

Fri Aug 18 20:10:50 2023
Fri Aug 18


### 3.3. User defined format:
And if none of these methods suit your needs, you can use the ```strftime()``` method, which lets you define your own format with format codes such as ```%Y```, ```%m```, and ```%d```:

In [69]:
dt.date(2020,4,23).strftime('%Y/%m/%d')

'2020/04/23'
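
Custom formats work in both directions: ```strftime()``` writes a string and ```datetime.strptime()``` parses one back using the same format codes. A minimal round-trip sketch, where the format string and variable names are just examples:

```python
import datetime as dt

fmt = '%Y/%m/%d'

# datetime -> string
text = dt.datetime(2020, 4, 23).strftime(fmt)
# string -> datetime, using the same format codes
parsed = dt.datetime.strptime(text, fmt)

print(text)    # 2020/04/23
print(parsed)  # 2020-04-23 00:00:00
```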

# 4. Time difference with ```timedelta```
So far, we have seen how to create a ```datetime``` object and how to format it. But sometimes you need the duration between two dates, which can be another very useful feature to derive from a dataset. Subtracting one ```datetime``` from another returns this duration as a ```timedelta``` object.

### 4.1. Time difference between two dates:

In [70]:
d1 = dt.datetime(2020,4,23,11,13,10)
d2 = dt.datetime(2021,4,23,12,13,10)
duration = d2-d1

print(duration)
print(type(duration))

365 days, 1:00:00
<class 'datetime.timedelta'>


As you can see, the duration is stored as a number of whole days plus the remaining seconds beyond those days. Note that ```seconds``` is only that remainder (here one hour), not the total length of the duration. You can retrieve both values for your features.

In [71]:
print("difference in days is "+str(duration.days))
print("difference in seconds is "+str(duration.seconds))

difference in days is 365
difference in seconds is 3600
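
When the full length of the duration is needed as a single number, ```total_seconds()``` is the method to reach for. The sketch below recomputes the same duration as above and contrasts the two values:

```python
import datetime as dt

duration = dt.datetime(2021, 4, 23, 12, 13, 10) - dt.datetime(2020, 4, 23, 11, 13, 10)

print(duration.seconds)          # 3600: only the part beyond whole days
print(duration.total_seconds())  # 31539600.0: the whole duration
```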


But what if you want the whole duration in hours or minutes? There is a simple solution for that.

```timedelta``` is also a class in the ```datetime``` module, and dividing one ```timedelta``` by another yields a plain number. So you can convert your duration into hours or minutes by dividing it by a ```timedelta``` of one hour or one minute:

In [72]:
# duration in hours
print('Duration in hours :',duration/dt.timedelta(hours=1))
# duration in minutes
print('Duration in minutes :',duration/dt.timedelta(minutes=1))
# duration in seconds
print('Duration in seconds :',duration/dt.timedelta(seconds=1))

Duration in hours : 8761.0
Duration in minutes : 525660.0
Duration in seconds : 31539600.0


### 4.2. Add time to current date

#### Example: What's the date 2 days from today?
```timedelta``` also makes it possible to add a duration to a ```datetime``` object or subtract one from it.

In [73]:
d1 = dt.datetime.now()
print("Today's date :",d1)

d2 = d1+dt.timedelta(days=2)
print("Date 2 days from today :",d2)

d3 = d1-dt.timedelta(weeks=2)
print("Date 2 weeks ago from today :",d3)

Today's date : 2023-08-18 20:11:22.212524
Date 2 days from today : 2023-08-20 20:11:22.212524
Date 2 weeks ago from today : 2023-08-04 20:11:22.212524


# 5. Datetime in Pandas

Pandas has some great methods for handling dates and times, such as:
### 5.1. ```to_datetime()```: converts from string to Datetime objects



In [74]:
date = pd.to_datetime('24th of April, 2020')
print(date)
print(type(date))

2020-04-24 00:00:00
<class 'pandas._libs.tslibs.timestamps.Timestamp'>


The type of the object returned by ```to_datetime()``` is not ```datetime``` but ```Timestamp```. This is the Pandas equivalent of Python's ```datetime```.
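
The relationship between the two types can be checked directly. This short sketch, with `ts` as an illustrative name, shows that a ```Timestamp``` passes an ```isinstance``` check against ```datetime``` and can be converted to a plain ```datetime``` when a library expects one:

```python
import datetime as dt
import pandas as pd

ts = pd.to_datetime('2020-04-24')

# Timestamp is a subclass of datetime.datetime
print(isinstance(ts, dt.datetime))
# Convert to a plain datetime object when needed
print(ts.to_pydatetime())
```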

For example, the `entry_date` column in the `df` above contains each employee's entry date, but the values are stored as strings rather than as datetime values. This makes date calculations awkward, so we convert the column's data type:

In [75]:
# Your solution
# convert the values in the `entry_date` column to *actual* dates
df['entry_date'] = pd.to_datetime(df['entry_date'], format='%Y-%m-%d')

# print out the `min()` and `max()` values in the `entry_date` column
display(df['entry_date'])
print(df['entry_date'].min())
print(df['entry_date'].max())

0     2019-02-01
1     2017-01-17
2     2012-08-27
3     2017-07-25
4     2019-05-17
         ...    
495   2014-03-21
496   2018-02-20
497   2020-10-20
498   2012-05-18
499   2018-12-28
Name: entry_date, Length: 500, dtype: datetime64[ns]

2004-01-05 00:00:00
2020-12-17 00:00:00


To convert multiple columns, use one of the following methods (here `data` stands for any DataFrame whose selected columns hold date strings):

In [None]:
#data.iloc[:, 7:12] = data.iloc[:, 7:12].apply(pd.to_datetime, errors='coerce')

In [None]:
#cols_2_extract = data.columns[2:15]

# note: %m is the month; %M would mean minutes
#data[cols_2_extract] = data[cols_2_extract].apply(pd.to_datetime, format='%d %m %Y')
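
To make the multi-column conversion concrete, here is a small runnable sketch on made-up data. The DataFrame `frame`, its column names, and the invalid value `'unknown'` are all hypothetical and only serve to show what ```errors='coerce'``` does:

```python
import pandas as pd

# Hypothetical data: two date columns stored as strings, one value is invalid
frame = pd.DataFrame({
    'hired': ['2019-02-01', '2017-01-17'],
    'raised': ['2020-03-15', 'unknown'],
})

date_cols = ['hired', 'raised']
# errors='coerce' turns unparseable values into NaT instead of raising
frame[date_cols] = frame[date_cols].apply(pd.to_datetime, errors='coerce')

print(frame.dtypes)
print(frame)
```

Coercing is convenient for messy data, but it hides bad values, so it is worth counting the resulting ```NaT``` entries afterwards.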

### 5.2. ```to_timedelta()```: converts values to ```Timedelta``` objects

In [None]:
import numpy as np
date = dt.datetime.now()
# present date
print(date)
# date after 1 day
print(date+pd.to_timedelta(1,unit='D'))
# date after 30 days (roughly 1 month)
print(date+pd.to_timedelta(1*30,unit='D'))
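
Thirty days is only an approximation of a month. If a calendar month is what you need, ```pd.DateOffset``` is one option. The sketch below compares both approaches on an example start date at the end of January:

```python
import pandas as pd

start = pd.Timestamp('2020-01-31')

# Fixed duration of 30 days
print(start + pd.to_timedelta(30, unit='D'))  # 2020-03-01 00:00:00
# Calendar-aware offset of one month
print(start + pd.DateOffset(months=1))        # 2020-02-29 00:00:00
```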

### 5.3. ```date_range()```: creation of date sequences
To make the creation of date sequences convenient, Pandas provides the ```date_range()``` function. It accepts a start date, an end date, and an optional frequency code (you can find the list of frequency codes [here](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases)):

In [None]:
# ISO-formatted strings avoid any day/month ambiguity when parsing
pd.date_range(start='2020-04-24', end='2020-05-24', freq='D')

Instead of defining the end date, you can define the number of periods you want to generate:

In [None]:
start_date = dt.datetime.today()
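# 'min' is the frequency code for minutes (the older alias 'T' is deprecated in recent pandas versions)
dates_start = pd.date_range(start=start_date, periods=10, freq='min')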
print(dates_start[:5])
len(dates_start)
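
Frequency codes can also carry a multiplier. As a small sketch with a fixed start date (chosen here so the result is reproducible), the following generates four dates that are seven days apart:

```python
import pandas as pd

# Four dates, seven days apart, from a fixed start
weekly = pd.date_range(start='2020-04-24', periods=4, freq='7D')

print(weekly)
print(len(weekly))  # 4
```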

# 6. Let's make an example

Let’s also create a series of end dates and make a dummy dataset from which we can derive some new features and bring our learning about DateTime to fruition.

In [None]:
dates_end = pd.date_range(start=start_date, periods=10, freq='D')
dates_end[:5]

Initialise a dataset containing start date, end date, and a target variable:

In [None]:
import random
# ten random 0/1 labels
randomList = [random.randint(0, 1) for _ in range(10)]

# dataframe
df = pd.DataFrame()
df['Start_date'] = dates_start
df['End_date'] = dates_end
df['Target'] = randomList

df.head()

We can create multiple new features (columns) from the date column, like the day, month, year, hour, minute, etc., using the ```.dt``` accessor as shown below:

In [None]:
# day
df['Day'] = df['Start_date'].dt.day
# month
df['Month'] = df['Start_date'].dt.month
# year
df['Year'] = df['Start_date'].dt.year
# hour
df['Start_hour'] = df['Start_date'].dt.hour
# minute
df['Start_minute'] = df['Start_date'].dt.minute
# second
df['Start_second'] = df['Start_date'].dt.second
# Monday is 0 and Sunday is 6
df['Start_weekday'] = df['Start_date'].dt.weekday
# week of the year, computed for every row
df['Start_week_of_year'] = df['Start_date'].dt.isocalendar().week
# duration
df['Duration'] = df['End_date']-df['Start_date']


In [None]:
df.head()

In [None]:
df['Duration_days'] = df['Duration']/dt.timedelta(days=1)
df['Duration_minutes'] = df['Duration']/dt.timedelta(minutes=1)
df['Duration_seconds'] = df['Duration']/dt.timedelta(seconds=1)
df.head()

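Now, let's make the start date the index of the DataFrame. This makes the dataset easier to analyze because we can select rows by date with label-based slicing. The start dates were generated from today's date, so we slice on that day:

In [None]:
df.index=df['Start_date']
# all rows whose start date falls on the day the data was generated
day = str(start_date.date())
df.loc[day:day].head()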
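
Date-based slicing is easier to see on fixed data. The sketch below uses a hypothetical daily series named `sales` with made-up values and shows slicing between two dates as well as selecting a whole month with a partial date string:

```python
import pandas as pd

# Hypothetical daily series indexed by date
idx = pd.date_range(start='2020-04-22', periods=5, freq='D')
sales = pd.Series([10, 12, 9, 15, 11], index=idx)

# Label-based slicing includes both endpoints
print(sales.loc['2020-04-23':'2020-04-24'])
# A partial string selects every row in that month
print(sales.loc['2020-04'])
```

Unlike positional slicing, label-based slicing on a date index includes the end label, which is easy to overlook when counting rows.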