## Working with Dates and Time in Python 

Python has built-in modules, such as datetime and calendar, for handling dates, times, and combined date-times.

## Table of Contents
- The Importance of the Date-Time Component
- Working with Dates in Python
- Working with Time in Python
- DateTime in Python
    - Updating old dates
    - Extracting Weekday from DateTime
    - What week is it?
    - Leap year or not? Use the calendar!
    - The Different Datetime formats
    - Advanced DateTime formatting with Strptime & Strftime
    - Timedelta
- DateTime with Pandas
    - DateTime and Timedelta objects in Pandas
    - Date range in Pandas
    - Making DateTime features in Pandas

## The Importance of the Date-Time Component
It’s worth reiterating: dates and times are a treasure trove of information, which is why data scientists love them so much.

Before we dive into the crux of the article, I want you to experience this yourself. Take a look at the date and time right now. Try to imagine all the kinds of information that you can extract from it to understand your reading habits. The year, month, day, hour, and minute are the usual suspects.

But if you dig a little further, you can determine whether you prefer reading on weekdays or weekends, whether you are a morning person or a night owl (we are in the same boat here!), or whether you accumulate all the interesting articles to read at the end of the month!

Clearly, the list will go on and you will gradually learn a lot about your reading habits if you repeat this exercise after collecting the data over a period of time, say a month. Now imagine how useful this feature would be in a real-world scenario where information is collected over a long period of time.

Date and time features find importance in data science problems spanning industries from sales, marketing, and finance to HR, e-commerce, retail, and many more. Predicting how the stock markets will behave tomorrow, how many products will be sold in the upcoming week, when is the best time to launch a new product, how long before a position at the company gets filled, etc. are some of the problems that we can find answers to using date and time data.

This incredible amount of insight that you can unravel from the data is what makes date and time components so fun to work with! So let’s get down to the business of mastering date-time manipulation in Python.

## Working with Dates in Python
The date class in Python’s datetime module deals with dates in the Gregorian calendar. It accepts three integer arguments: year, month, and day. Let’s have a look at how it’s done:

In [2]:
from datetime import date

d1 = date(2020,4,23)

print(d1)

print(type(d1))

2020-04-23
<class 'datetime.date'>



You can see how easy it was to create a date object from the date class. And it’s even easier to extract features like day, month, and year from the date, using the day, month, and year attributes. We will see how to do that on a date object for the current local date, which we will create using the today() method:

In [3]:
# present day date
d1 = date.today()
print(d1)
# day
print('Day :',d1.day)
# month
print('Month :',d1.month)
# year
print('Year :',d1.year)

2021-03-26
Day : 26
Month : 3
Year : 2021
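
One detail worth knowing when building dates from raw integers: the date constructor rejects combinations that are not real calendar days by raising a ValueError. The sketch below wraps that in a small helper; make_date is a hypothetical name used only for illustration, not part of the datetime module.

```python
from datetime import date

def make_date(year, month, day):
    """Return a date, or None when the values do not form a real calendar day."""
    try:
        return date(year, month, day)
    except ValueError:
        # date() raises ValueError for things like 29 February in a non-leap year
        return None

leap_day = make_date(2020, 2, 29)      # 2020 is a leap year, so this is valid
missing_day = make_date(2021, 2, 29)   # 2021 is not, so the helper returns None
print(leap_day, missing_day)
```

Returning None is only one possible design; in a data pipeline you might prefer to log the bad row or re-raise the error.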


## Working with Time in Python
time is another class of the datetime module. It accepts integer arguments for the hour, minute, second, and microsecond, and returns a time object:

In [5]:
from datetime import time

t1 = time(13,20,13,40)

print(t1)

print(type(t1))

13:20:13.000040
<class 'datetime.time'>


You can extract features like hour, minute, second, and microsecond from the time object using the respective attributes. Here is an example:

In [8]:
# hour
print('Hour :',t1.hour)
# minute
print('Minute :',t1.minute)
# second
print('Second :',t1.second)
# microsecond
print('Microsecond :',t1.microsecond)

Hour : 13
Minute : 20
Second : 13
Microsecond : 40
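
These attributes can be combined into a single numeric feature. As a minimal sketch, the hypothetical helper seconds_since_midnight below (my own name, not a datetime function) turns a time object into the number of seconds elapsed since 00:00:

```python
from datetime import time

def seconds_since_midnight(t):
    """Convert a time object into seconds elapsed since midnight (as a float)."""
    return t.hour * 3600 + t.minute * 60 + t.second + t.microsecond / 1_000_000

t1 = time(13, 20, 13, 40)
elapsed = seconds_since_midnight(t1)
print(elapsed)
```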


## DateTime in Python
So far, we have seen how to create a date and a time object using the datetime module. But the beauty of the datetime module is that it lets you dovetail both into a single object, datetime!

datetime is a class in Python’s datetime module, just like date and time. Its arguments combine the date and time attributes, starting with the year and ending with microseconds; the year, month, and day are required, and the time fields default to zero.

So, let’s see how you can create a DateTime object:

In [10]:
from datetime import datetime
d1 = datetime(2020,4,23,11,20,30,40)
print(d1)
print(type(d1))

2020-04-23 11:20:30.000040
<class 'datetime.datetime'>


In [11]:
# local date-time
d1 = datetime.now()
d1

datetime.datetime(2021, 3, 26, 23, 34, 36, 405300)

You can go on and extract whichever value you want to from the DateTime object using the same attributes we used with the date and time objects individually.
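
If you already hold a separate date and time, you do not need to pass every field again. A short sketch using datetime.combine(), with illustrative variable names of my own choosing:

```python
from datetime import date, time, datetime

day_part = date(2020, 4, 23)
time_part = time(11, 20, 30, 40)

# combine() merges an existing date and time into one datetime object
combined = datetime.combine(day_part, time_part)
print(combined)
print(combined.year, combined.hour, combined.microsecond)
```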

Next, let’s look at some of the methods in the DateTime class.

## Updating old Dates
First, we’ll see how to separate date and time from the DateTime object using the date() and time() methods. But you could also replace a value in the DateTime objects without having to change the entire date using the replace() method:

In [13]:
print('Datetime :',d1)
# date
print('Date :',d1.date())
# time
print('Time :',d1.time())
# new datetime
print('New datetime :',d1.replace(day=24, hour=14))

Datetime : 2021-03-26 23:34:36.405300
Date : 2021-03-26
Time : 23:34:36.405300
New datetime : 2021-03-24 14:34:36.405300
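
A common use of replace() is truncating a timestamp, for example to the start of its month. The sketch below also shows that replace() validates the result and raises a ValueError for an impossible date; the variable names are illustrative only.

```python
from datetime import datetime

stamp = datetime(2021, 3, 26, 23, 34, 36, 405300)

# truncate to the first instant of the month
month_start = stamp.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
print(month_start)

# replace() returns a new object and checks that the result is a real date
try:
    stamp.replace(month=2, day=30)   # 30 February does not exist
    replaced_ok = True
except ValueError:
    replaced_ok = False
print(replaced_ok)
```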


## Weekday from DateTime
One really cool thing that you can do with the datetime class is to extract the day of the week! This is especially helpful in feature engineering because the value of the target variable can depend on the day of the week. For example, sales of a product are generally higher on weekends, while traffic on Stack Overflow could be higher on weekdays when people are working.

The *weekday()* method returns an integer value for the day of the week, where Monday is 0 and Sunday is 6. But if you want the weekday as a value between 1 and 7, where Monday is 1 and Sunday is 7, you should use *isoweekday()*:

In [15]:
d1 = datetime.now()
# week starts from 0
print(d1.weekday()) # output 4 for Friday
# week starts with 1
print(d1.isoweekday()) # output 5 in ISO format

4
5
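
For feature engineering, the weekday number is often reduced to a weekend flag. A minimal sketch, assuming Saturday and Sunday count as the weekend (is_weekend is a hypothetical helper name):

```python
from datetime import date

def is_weekend(d):
    """True for Saturday (5) and Sunday (6) under weekday() numbering."""
    return d.weekday() >= 5

friday = date(2021, 3, 26)
saturday = date(2021, 3, 27)
print(is_weekend(friday), is_weekend(saturday))
```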


## What Week is it?
Alright, you know the day of the week, but do you know what week of the year it is? This is another very important feature that you can generate from the given date in a dataset.

Sometimes the value of the target variable might be higher during certain times of the year. For example, the sales of products on e-commerce websites are generally higher during vacations.

You can get the week of the year by indexing into the value returned by the isocalendar() method:

In [17]:
d1 = datetime.now()
# returns ISO year, ISO week number, and ISO weekday
print(d1.isocalendar())
print('Week :',d1.isocalendar()[1])

datetime.IsoCalendarDate(year=2021, week=12, weekday=5)
Week : 12
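
One caveat with ISO weeks: the ISO year can differ from the calendar year around New Year, because an ISO week always belongs entirely to one year. The sketch below illustrates this with 1 January 2021, which falls in the last ISO week of 2020:

```python
from datetime import date

new_years_day = date(2021, 1, 1)

# unpacking works whether isocalendar() returns a plain or a named tuple
iso_year, iso_week, iso_weekday = new_years_day.isocalendar()
print(new_years_day.year, iso_year, iso_week, iso_weekday)
```

If you group data by week number, grouping by the ISO year together with the ISO week avoids mixing these boundary days into the wrong year.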


## Leap Year or Not? Use Calendar!
Want to check whether it is a leap year or not? You will need to use the isleap() function from the calendar module and pass the year as an argument:

In [20]:
import calendar
d1 = datetime.now()
# leap year or not
calendar.isleap(d1.year) # Output True or False

False
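
The calendar module can also report how long a month is, which is where leap years actually matter. A small sketch using calendar.monthrange(); days_in_february is a hypothetical helper name:

```python
import calendar

def days_in_february(year):
    """monthrange() returns (weekday of the 1st, number of days in the month)."""
    return calendar.monthrange(year, 2)[1]

print(days_in_february(2020), days_in_february(2021))
# century years are leap years only when divisible by 400
print(calendar.isleap(1900), calendar.isleap(2000))
```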

In [21]:
# view calendar
# It was March when I worked on this
print(calendar.month(2020,3))

     March 2020
Mo Tu We Th Fr Sa Su
                   1
 2  3  4  5  6  7  8
 9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28 29
30 31



Not free this month? You can have a look at the entire calendar for the year:

In [23]:
print(calendar.calendar(2021))

                                  2021

      January                   February                   March
Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su
             1  2  3       1  2  3  4  5  6  7       1  2  3  4  5  6  7
 4  5  6  7  8  9 10       8  9 10 11 12 13 14       8  9 10 11 12 13 14
11 12 13 14 15 16 17      15 16 17 18 19 20 21      15 16 17 18 19 20 21
18 19 20 21 22 23 24      22 23 24 25 26 27 28      22 23 24 25 26 27 28
25 26 27 28 29 30 31                                29 30 31

       April                      May                       June
Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su
          1  2  3  4                      1  2          1  2  3  4  5  6
 5  6  7  8  9 10 11       3  4  5  6  7  8  9       7  8  9 10 11 12 13
12 13 14 15 16 17 18      10 11 12 13 14 15 16      14 15 16 17 18 19 20
19 20 21 22 23 24 25      17 18 19 20 21 22 23      21 22 23 24 25 26 27
26 27 28 29 30            24 25 26 27 

## DateTime Formats
The datetime module lets you convert dates and times between a few formats.

First up is the ISO format. If you want to create a date object from a date string in ISO format, use the fromisoformat() method. And if you intend to do the reverse, use the isoformat() method:

In [25]:
# ISO format
d1_datetime = date.fromisoformat('2020-04-23')
print(d1_datetime)
print(type(d1_datetime))
d1_ISO = date(2020,4,23).isoformat()
print(d1_ISO)
print(type(d1_ISO))

2020-04-23
<class 'datetime.date'>
2020-04-23
<class 'str'>


If you wanted to convert DateTime into a string format, you could use the ctime() method. This returns the date in a string format. And if you wanted to extract just the date from that, well, you would have to use slicing:

In [26]:
# date in string format
d1 = datetime.now()
# string format for date
print(d1.ctime())
# slicing to extract date
print(d1.ctime()[:10])

Fri Mar 26 23:46:09 2021
Fri Mar 26


And if none of these methods strike your fancy, you could use the \_\_format\_\_() method, which the built-in format() function and f-strings call behind the scenes and which lets you define your own format:

In [27]:
date(2020,4,23).__format__('%Y/%m/%d')

'2020/04/23'

Wait – what are these arguments I passed to the method? These are called format codes and we will look at them in detail in the next section.
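
In everyday code you would rarely call the double-underscore method directly. As a brief sketch, the same format string works with strftime(), the built-in format() function, and an f-string, and all three produce the same text:

```python
from datetime import date

d = date(2020, 4, 23)

via_method = d.strftime('%Y/%m/%d')    # explicit method call
via_builtin = format(d, '%Y/%m/%d')    # built-in format() delegates to the object
via_fstring = f'{d:%Y/%m/%d}'          # f-strings accept the same codes after the colon

print(via_method, via_builtin, via_fstring)
```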

## Advanced DateTime Formatting with Strptime & Strftime
These functions are very important as they let you define the format of the DateTime object explicitly. This can give you a lot of flexibility with handling DateTime features.

strptime() creates a DateTime object from a string representing date and time. It takes two arguments: the date and the format in which your date is present. Have a look below:

In [29]:
# strptime
date_string = '22 April, 2020 13:20:13'  # named to avoid shadowing the imported date class
d1 = datetime.strptime(date_string,'%d %B, %Y %H:%M:%S')
print(d1)
print(type(d1))

2020-04-22 13:20:13
<class 'datetime.datetime'>


You define the format using the formatting codes as I did above. There are a number of formatting codes and you can have a look at them in the [documentation](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes).
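
Real datasets often mix several date layouts, and strptime() raises a ValueError when the string does not match the format. The sketch below tries a list of candidate formats in turn; parse_datetime and CANDIDATE_FORMATS are hypothetical names, and the formats listed are assumptions you would adapt to your own data. It also assumes English month names, since %B depends on the locale.

```python
from datetime import datetime

# formats to try, in order; adjust to whatever your data actually contains
CANDIDATE_FORMATS = ('%d %B, %Y %H:%M:%S', '%Y-%m-%d %H:%M:%S', '%d/%m/%Y')

def parse_datetime(text, formats=CANDIDATE_FORMATS):
    """Return the first successful parse, or None if no format matches."""
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None

print(parse_datetime('22 April, 2020 13:20:13'))
print(parse_datetime('23/04/2020'))
print(parse_datetime('not a date'))
```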

The strftime() method, on the other hand, can be used to convert the DateTime object into a string representing date and time:

In [30]:
# strftime
d1 = datetime.now()
print('Datetime object :',d1)
new_date = d1.strftime('%d/%m/%Y %H:%M')
print('Formatted date :',new_date)
print(type(new_date))

Datetime object : 2021-03-26 23:51:12.001227
Formatted date : 26/03/2021 23:51
<class 'str'>


But you can also extract some important information from the DateTime object like weekday name, month name, week number, etc. which can turn out to be very useful in terms of features as we saw in previous sections.

In [31]:
d1 = datetime.now()
print('Weekday :',d1.strftime('%A'))
print('Month :',d1.strftime('%B'))
print('Week number :',d1.strftime('%W'))
print("Locale's date and time representation :",d1.strftime('%c'))

Weekday : Friday
Month : March
Week number : 12
Locale's date and time representation : Fri Mar 26 23:51:59 2021
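
Since strftime() and strptime() share the same format codes, a string produced by one can be read back by the other. The sketch below does such a round trip and shows that any field left out of the format (here seconds and microseconds) is lost along the way; the variable names are illustrative.

```python
from datetime import datetime

FMT = '%d/%m/%Y %H:%M'
original = datetime(2021, 3, 26, 23, 51, 12, 1227)

text = original.strftime(FMT)            # datetime -> string
restored = datetime.strptime(text, FMT)  # string -> datetime

print(text)
print(restored)
print(restored == original)  # seconds and microseconds were not in FMT
```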


## Timedelta
So far, we have seen how to create a DateTime object and how to format it. 

But sometimes, you might have to find the duration between two dates, which can be another very useful feature that you can derive from a dataset. 
This duration is, however, returned as a timedelta object.

In [33]:
# timedelta : duration between dates
d1 = datetime(2020,4,23,11,13,10)
d2 = datetime(2021,4,23,12,13,10)
duration = d2-d1
print(type(duration))
duration

<class 'datetime.timedelta'>


datetime.timedelta(days=365, seconds=3600)

As you can see, the duration is stored as a number of whole days plus the seconds left over within the final day. So you can retrieve these values for your features, keeping in mind that seconds holds only that remainder, not the full duration:

In [34]:
print(duration.days) # 365
print(duration.seconds) # 3600

365
3600
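
Because the seconds attribute covers only the part of the duration beyond the whole days, the total_seconds() method is the safer choice when you need the full length. A short sketch with the same two dates:

```python
from datetime import datetime

duration = datetime(2021, 4, 23, 12, 13, 10) - datetime(2020, 4, 23, 11, 13, 10)

remainder = duration.seconds        # only the leftover part beyond whole days
total = duration.total_seconds()    # the complete duration expressed in seconds

print(duration.days, remainder, total)
```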


But what if you actually wanted the duration in hours or minutes? Well, there is a simple solution for that.

timedelta is also a class in the DateTime module. So, you could use it to convert your duration into hours and minutes as I’ve done below:

In [35]:
from datetime import timedelta
# duration in hours
print('Duration in hours :',duration/timedelta(hours=1))
# duration in minutes
print('Duration in minutes :',duration/timedelta(minutes=1))
# duration in seconds
print('Duration in seconds :',duration/timedelta(seconds=1))

Duration in hours : 8761.0
Duration in minutes : 525660.0
Duration in seconds : 31539600.0


In [36]:
## Now, what if you wanted to get the date 5 days from today? Do you simply add 5 to the present date?
d1 = datetime.now()
d1+5

TypeError: unsupported operand type(s) for +: 'datetime.datetime' and 'int'

Not quite. So how do you go about it then? You use timedelta of course!

timedelta makes it possible to add a duration to, or subtract one from, a DateTime object.

In [37]:
d1 = datetime.now()
print("Today's date :",d1)

d2 = d1+timedelta(days=2)
print("Date 2 days from today :",d2)

d3 = d1+timedelta(weeks=2)
print("Date 2 weeks from today :",d3)

Today's date : 2021-03-27 00:01:18.696139
Date 2 days from today : 2021-03-29 00:01:18.696139
Date 2 weeks from today : 2021-04-10 00:01:18.696139
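
Note that timedelta accepts weeks, days, hours, and smaller units, but has no months argument, since months vary in length. One possible workaround is sketched below with the standard library only; add_months is a hypothetical helper, and clamping to the last day of the target month is my own design choice rather than the only reasonable behaviour.

```python
import calendar
from datetime import date

def add_months(d, months):
    """Shift a date by whole months, clamping the day to the target month's length."""
    month_index = d.month - 1 + months          # zero-based month count
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return d.replace(year=year, month=month, day=min(d.day, last_day))

print(add_months(date(2021, 1, 31), 1))    # clamped to the end of February
print(add_months(date(2021, 3, 15), -3))   # negative values go backwards
```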


# Part 2

## DateTime in Pandas
We already know that Pandas is a great library for doing data analysis tasks. And so it goes without saying that Pandas also supports Python DateTime objects. It has some great methods for handling dates and times, such as to_datetime() and to_timedelta().

## DateTime and Timedelta objects in Pandas
The to_datetime() method converts the date and time in string format to a DateTime object:

In [42]:
import pandas as pd
# to_datetime
parsed_date = pd.to_datetime('24th of April, 2020')
print(parsed_date)
print(type(parsed_date))

2020-04-24 00:00:00
<class 'pandas._libs.tslibs.timestamps.Timestamp'>


You might have noticed something strange here. The type of the object returned by to_datetime() is not DateTime but Timestamp. Well, don’t worry, it is just the Pandas equivalent of Python’s DateTime.
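
to_datetime() also works on a whole column at once. The sketch below, which assumes Pandas is installed and uses made-up sample strings, passes an explicit format and errors='coerce' so that unparseable entries become NaT (Pandas' missing-value marker for dates) instead of raising an error:

```python
import pandas as pd

# made-up sample values; the last one is deliberately invalid
raw = pd.Series(['24/04/2020', '25/04/2020', 'not a date'])

# an explicit format avoids day/month ambiguity; coerce turns failures into NaT
parsed = pd.to_datetime(raw, format='%d/%m/%Y', errors='coerce')

print(parsed)
print(parsed.isna().sum(), 'value(s) could not be parsed')
```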

We already know that timedelta gives differences in times. The Pandas to_timedelta() method does just this:

In [51]:
# timedelta
now = datetime.now()
# present date
print(now)
# date after 1 day
print(now+pd.to_timedelta(1,unit='D'))
# date after 4 weeks (roughly 1 month)
print(now+pd.to_timedelta(4,unit='W'))

2021-03-27 00:18:29.647666
2021-03-28 00:18:29.647666
2021-04-24 00:18:29.647666


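In the above, the unit argument determines the unit of the number passed in, whether that’s days, weeks, hours, minutes, or seconds. Months and years are not supported as units in recent Pandas versions because their length varies, which is why one month is approximated here as 4 weeks.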
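
When you need a true calendar month rather than a fixed number of days, Pandas offers pd.DateOffset. The sketch below, assuming Pandas is installed, contrasts it with a 4-week timedelta on a fixed example date:

```python
import pandas as pd

start = pd.Timestamp('2021-03-27')

# a timedelta is a fixed length: 4 weeks is always 28 days
four_weeks_later = start + pd.to_timedelta(4, unit='W')

# a DateOffset follows the calendar: same day number in the next month
one_month_later = start + pd.DateOffset(months=1)

print(four_weeks_later.date(), one_month_later.date())
```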

## Date Range in Pandas
To make the creation of date sequences a convenient task, Pandas provides the date_range() function. It accepts a start date, an end date, and an optional frequency code:

In [53]:
pd.date_range(start='24/4/2020', end='24/5/2020', freq='D')

DatetimeIndex(['2020-04-24', '2020-04-25', '2020-04-26', '2020-04-27',
               '2020-04-28', '2020-04-29', '2020-04-30', '2020-05-01',
               '2020-05-02', '2020-05-03', '2020-05-04', '2020-05-05',
               '2020-05-06', '2020-05-07', '2020-05-08', '2020-05-09',
               '2020-05-10', '2020-05-11', '2020-05-12', '2020-05-13',
               '2020-05-14', '2020-05-15', '2020-05-16', '2020-05-17',
               '2020-05-18', '2020-05-19', '2020-05-20', '2020-05-21',
               '2020-05-22', '2020-05-23', '2020-05-24'],
              dtype='datetime64[ns]', freq='D')

Instead of defining the end date, you could define the period or number of time periods you want to generate:

In [60]:
from datetime import datetime
start_date = datetime.today()
dates_start = pd.date_range(start=start_date, periods=10, freq='T')
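# freq accepts aliases such as T (minute), D (day), W (week), M (month end), Y (year end);
# newer Pandas releases prefer min, ME, and YE over T, M, and Y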
dates_start[:5]

DatetimeIndex(['2021-03-27 00:27:01.906996', '2021-03-27 00:28:01.906996',
               '2021-03-27 00:29:01.906996', '2021-03-27 00:30:01.906996',
               '2021-03-27 00:31:01.906996'],
              dtype='datetime64[ns]', freq='T')
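
Frequency codes are not limited to plain calendar units. As one example, assuming Pandas is installed, the B alias generates business days only, skipping Saturdays and Sundays:

```python
import pandas as pd

# 26 March 2021 is a Friday, so the next business days are Monday and Tuesday
business_days = pd.date_range(start='2021-03-26', periods=3, freq='B')
print(business_days)
```

Public holidays are not excluded by B; it only skips weekends.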

## Making DateTime Features in Pandas
Let’s also create a series of end dates and make a dummy dataset from which we can derive some new features and bring our learning about DateTime to fruition.

In [62]:
dates_end = pd.date_range(start=start_date, periods=10, freq='D')
dates_end[:5]

DatetimeIndex(['2021-03-27 00:27:01.906996', '2021-03-28 00:27:01.906996',
               '2021-03-29 00:27:01.906996', '2021-03-30 00:27:01.906996',
               '2021-03-31 00:27:01.906996'],
              dtype='datetime64[ns]', freq='D')

In [63]:
import random
# ten random 0/1 labels to serve as a dummy target
random_list = [random.randint(0, 1) for _ in range(10)]

# dataframe
df = pd.DataFrame({
    'Start_date': dates_start,
    'End_date': dates_end,
    'Target': random_list,
})

df.head()

| | Start_date | End_date | Target |
|---|---|---|---|
| 0 | 2021-03-27 00:27:01.906996 | 2021-03-27 00:27:01.906996 | 1 |
| 1 | 2021-03-27 00:28:01.906996 | 2021-03-28 00:27:01.906996 | 1 |
| 2 | 2021-03-27 00:29:01.906996 | 2021-03-29 00:27:01.906996 | 1 |
| 3 | 2021-03-27 00:30:01.906996 | 2021-03-30 00:27:01.906996 | 0 |
| 4 | 2021-03-27 00:31:01.906996 | 2021-03-31 00:27:01.906996 | 1 |


We can create multiple new features from the date column, like the day, month, year, hour, minute, etc. using the dt attribute as shown below:

In [65]:
# day
df['Day'] = df['Start_date'].dt.day
# month
df['Month'] = df['Start_date'].dt.month
# year
df['Year'] = df['Start_date'].dt.year
# hour
df['Start_hour'] = df['Start_date'].dt.hour
# minute
df['Start_minute'] = df['Start_date'].dt.minute
# second
df['Start_second'] = df['Start_date'].dt.second
# Monday is 0 and Sunday is 6
df['Start_weekday'] = df['Start_date'].dt.weekday
# week of the year
df['Start_week_of_year'] = df['Start_date'].dt.isocalendar().week
# duration
df['Duration'] = df['End_date']-df['Start_date']

In [66]:
df

| | Start_date | End_date | Target | Day | Month | Year | Start_hour | Start_minute | Start_second | Start_weekday | Start_week_of_year | Duration |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2021-03-27 00:27:01.906996 | 2021-03-27 00:27:01.906996 | 1 | 27 | 3 | 2021 | 0 | 27 | 1 | 5 | 12 | 0 days 00:00:00 |
| 1 | 2021-03-27 00:28:01.906996 | 2021-03-28 00:27:01.906996 | 1 | 27 | 3 | 2021 | 0 | 28 | 1 | 5 | 12 | 0 days 23:59:00 |
| 2 | 2021-03-27 00:29:01.906996 | 2021-03-29 00:27:01.906996 | 1 | 27 | 3 | 2021 | 0 | 29 | 1 | 5 | 12 | 1 days 23:58:00 |
| 3 | 2021-03-27 00:30:01.906996 | 2021-03-30 00:27:01.906996 | 0 | 27 | 3 | 2021 | 0 | 30 | 1 | 5 | 12 | 2 days 23:57:00 |
| 4 | 2021-03-27 00:31:01.906996 | 2021-03-31 00:27:01.906996 | 1 | 27 | 3 | 2021 | 0 | 31 | 1 | 5 | 12 | 3 days 23:56:00 |
| 5 | 2021-03-27 00:32:01.906996 | 2021-04-01 00:27:01.906996 | 0 | 27 | 3 | 2021 | 0 | 32 | 1 | 5 | 12 | 4 days 23:55:00 |
| 6 | 2021-03-27 00:33:01.906996 | 2021-04-02 00:27:01.906996 | 0 | 27 | 3 | 2021 | 0 | 33 | 1 | 5 | 12 | 5 days 23:54:00 |
| 7 | 2021-03-27 00:34:01.906996 | 2021-04-03 00:27:01.906996 | 0 | 27 | 3 | 2021 | 0 | 34 | 1 | 5 | 12 | 6 days 23:53:00 |
| 8 | 2021-03-27 00:35:01.906996 | 2021-04-04 00:27:01.906996 | 1 | 27 | 3 | 2021 | 0 | 35 | 1 | 5 | 12 | 7 days 23:52:00 |
| 9 | 2021-03-27 00:36:01.906996 | 2021-04-05 00:27:01.906996 | 0 | 27 | 3 | 2021 | 0 | 36 | 1 | 5 | 12 | 8 days 23:51:00 |


Our duration feature is great, but what if we would like to have the duration in minutes or seconds? Remember how in the timedelta section we converted the duration to hours, minutes, and seconds? We could do the same here!

In [67]:
df['Duration_days'] = df['Duration']/timedelta(days=1)
df['Duration_minutes'] = df['Duration']/timedelta(minutes=1)
df['Duration_seconds'] = df['Duration']/timedelta(seconds=1)
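
An alternative to dividing by a timedelta is the dt.total_seconds() accessor method on a timedelta column. The sketch below, assuming Pandas is installed, uses a small stand-alone Series with two of the durations from the table rather than the df built in this article:

```python
import pandas as pd

# two sample durations, written the way Pandas prints them
durations = pd.Series(pd.to_timedelta(['0 days 23:59:00', '1 days 23:58:00']))

seconds = durations.dt.total_seconds()   # full duration in seconds, as floats
minutes = seconds / 60

print(seconds.tolist())
print(minutes.tolist())
```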

In [68]:
df

| | Start_date | End_date | Target | Day | Month | Year | Start_hour | Start_minute | Start_second | Start_weekday | Start_week_of_year | Duration | Duration_days | Duration_minutes | Duration_seconds |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2021-03-27 00:27:01.906996 | 2021-03-27 00:27:01.906996 | 1 | 27 | 3 | 2021 | 0 | 27 | 1 | 5 | 12 | 0 days 00:00:00 | 0.0 | 0.0 | 0.0 |
| 1 | 2021-03-27 00:28:01.906996 | 2021-03-28 00:27:01.906996 | 1 | 27 | 3 | 2021 | 0 | 28 | 1 | 5 | 12 | 0 days 23:59:00 | 0.999306 | 1439.0 | 86340.0 |
| 2 | 2021-03-27 00:29:01.906996 | 2021-03-29 00:27:01.906996 | 1 | 27 | 3 | 2021 | 0 | 29 | 1 | 5 | 12 | 1 days 23:58:00 | 1.998611 | 2878.0 | 172680.0 |
| 3 | 2021-03-27 00:30:01.906996 | 2021-03-30 00:27:01.906996 | 0 | 27 | 3 | 2021 | 0 | 30 | 1 | 5 | 12 | 2 days 23:57:00 | 2.997917 | 4317.0 | 259020.0 |
| 4 | 2021-03-27 00:31:01.906996 | 2021-03-31 00:27:01.906996 | 1 | 27 | 3 | 2021 | 0 | 31 | 1 | 5 | 12 | 3 days 23:56:00 | 3.997222 | 5756.0 | 345360.0 |
| 5 | 2021-03-27 00:32:01.906996 | 2021-04-01 00:27:01.906996 | 0 | 27 | 3 | 2021 | 0 | 32 | 1 | 5 | 12 | 4 days 23:55:00 | 4.996528 | 7195.0 | 431700.0 |
| 6 | 2021-03-27 00:33:01.906996 | 2021-04-02 00:27:01.906996 | 0 | 27 | 3 | 2021 | 0 | 33 | 1 | 5 | 12 | 5 days 23:54:00 | 5.995833 | 8634.0 | 518040.0 |
| 7 | 2021-03-27 00:34:01.906996 | 2021-04-03 00:27:01.906996 | 0 | 27 | 3 | 2021 | 0 | 34 | 1 | 5 | 12 | 6 days 23:53:00 | 6.995139 | 10073.0 | 604380.0 |
| 8 | 2021-03-27 00:35:01.906996 | 2021-04-04 00:27:01.906996 | 1 | 27 | 3 | 2021 | 0 | 35 | 1 | 5 | 12 | 7 days 23:52:00 | 7.994444 | 11512.0 | 690720.0 |
| 9 | 2021-03-27 00:36:01.906996 | 2021-04-05 00:27:01.906996 | 0 | 27 | 3 | 2021 | 0 | 36 | 1 | 5 | 12 | 8 days 23:51:00 | 8.99375 | 12951.0 | 777060.0 |


Great! Can you see how many new features we created from just the dates?
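
The dt accessor offers more than numeric fields. As a small sketch, assuming Pandas is installed and using a separate toy DataFrame called events (a hypothetical name) instead of the df above, you can add the weekday name and a weekend flag:

```python
import pandas as pd

# toy data: Friday 26 March 2021 and the two days after it
events = pd.DataFrame({'Start_date': pd.date_range('2021-03-26', periods=3, freq='D')})

events['Weekday_name'] = events['Start_date'].dt.day_name()
# weekday numbering: Monday is 0, so 5 and 6 are Saturday and Sunday
events['Is_weekend'] = events['Start_date'].dt.weekday >= 5

print(events)
```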

Now, let’s make the start date the index of the DataFrame. This will help us easily analyze our dataset because we can use slicing to find data representing our desired dates:

In [72]:
df.index=df['Start_date']
df['2021-03-27':'2021-03-27'].head()

| Start_date (index) | Start_date | End_date | Target | Day | Month | Year | Start_hour | Start_minute | Start_second | Start_weekday | Start_week_of_year | Duration | Duration_days | Duration_minutes | Duration_seconds |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2021-03-27 00:27:01.906996 | 2021-03-27 00:27:01.906996 | 2021-03-27 00:27:01.906996 | 1 | 27 | 3 | 2021 | 0 | 27 | 1 | 5 | 12 | 0 days 00:00:00 | 0.0 | 0.0 | 0.0 |
| 2021-03-27 00:28:01.906996 | 2021-03-27 00:28:01.906996 | 2021-03-28 00:27:01.906996 | 1 | 27 | 3 | 2021 | 0 | 28 | 1 | 5 | 12 | 0 days 23:59:00 | 0.999306 | 1439.0 | 86340.0 |
| 2021-03-27 00:29:01.906996 | 2021-03-27 00:29:01.906996 | 2021-03-29 00:27:01.906996 | 1 | 27 | 3 | 2021 | 0 | 29 | 1 | 5 | 12 | 1 days 23:58:00 | 1.998611 | 2878.0 | 172680.0 |
| 2021-03-27 00:30:01.906996 | 2021-03-27 00:30:01.906996 | 2021-03-30 00:27:01.906996 | 0 | 27 | 3 | 2021 | 0 | 30 | 1 | 5 | 12 | 2 days 23:57:00 | 2.997917 | 4317.0 | 259020.0 |
| 2021-03-27 00:31:01.906996 | 2021-03-27 00:31:01.906996 | 2021-03-31 00:27:01.906996 | 1 | 27 | 3 | 2021 | 0 | 31 | 1 | 5 | 12 | 3 days 23:56:00 | 3.997222 | 5756.0 | 345360.0 |


Awesome! This is super useful when you want to do visualizations or any data analysis.
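
With a DatetimeIndex in place, .loc also accepts partial date strings, so you can select a whole month or a single day without spelling out full timestamps. A minimal sketch, assuming Pandas is installed and using a small stand-alone DataFrame named frame (a hypothetical name) that spans a month boundary:

```python
import pandas as pd

# four daily rows: 30 and 31 March, then 1 and 2 April 2021
frame = pd.DataFrame(
    {'Target': [1, 0, 1, 0]},
    index=pd.date_range('2021-03-30', periods=4, freq='D'),
)

march_rows = frame.loc['2021-03']                  # every row in March 2021
one_day = frame.loc['2021-04-01':'2021-04-01']     # a single day, both ends inclusive

print(march_rows)
print(one_day)
```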