# Working with Dates and Times

-----

In this notebook, we introduce how to work efficiently with dates and times in Python.

-----


## Table of Contents

[Time](#Time)

[Datetime](#Datetime)

[Timezone](#Timezone)

[Timedelta](#Timedelta)

[Datetime Format](#Datetime-Format)

[Pandas DataFrame Date and Time Manipulation](#Pandas-DataFrame-Date-and-Time-Manipulation)


-----

[[Back to TOC]](#Table-of-Contents)

## Time

While working with time in a program might seem simple, handling time data is complicated by several challenges. For a discussion of these issues, see the official Python [time documentation][td]. First, computers are designed to work with numerical data, so what is the numerical representation of a time? The simple solution is to define an epoch, or starting point for time data, which in Python (and on Unix systems in general) is midnight, or 00:00:00, on January 1, 1970 (UTC). A time is then stored as the number of seconds elapsed since the epoch.

Second, time is measured relative to a time zone. For example, in Champaign-Urbana, we are in the US Central time zone. Programmatically, time zones are usually identified by a region and a major city within the zone. Thus, for the University of Illinois we refer to our time zone as 'America/Chicago'. The epoch is defined in the UTC (Coordinated Universal Time) time zone.

These issues, and the way we deal with them in Python, are demonstrated in the following Code cells. First, we import the `time` module and call the `time` function, which returns the current time as the number of seconds since the start of the epoch. We then use the `ctime` function to convert a number of seconds since the epoch into a readable string in the local time zone. These times can, of course, extend into the future.


-----

[td]: https://docs.python.org/3/library/time.html




In [1]:
import time as tm

# Current time, expressed as seconds since the start of the epoch.
tm.time()

1578502808.544706

In [2]:
# Time for one billion seconds
print(tm.ctime(1E9))

Sat Sep  8 20:46:40 2001


In [3]:
# Time for ten billion seconds
print(tm.ctime(1E10))

Sat Nov 20 11:46:40 2286
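
The conversion between seconds and a readable time can also be done in both directions. The following is a minimal sketch that assumes we want to stay in UTC rather than the local time zone used by `ctime`; the functions `time.gmtime` and `calendar.timegm` do not appear in the cells above and are used here only for illustration.

```python
import calendar
import time as tm

seconds = 1_000_000_000  # one billion seconds after the epoch

# Seconds since the epoch -> structured time in UTC
utc_struct = tm.gmtime(seconds)
formatted = tm.strftime('%Y-%m-%d %H:%M:%S', utc_struct)
print(formatted)  # 2001-09-09 01:46:40

# Structured UTC time -> seconds since the epoch
round_trip = calendar.timegm(utc_struct)
print(round_trip == seconds)  # True
```

The UTC result differs from the `ctime` output above by the local offset from UTC, which illustrates why the time zone matters when interpreting a timestamp.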


-----

[[Back to TOC]](#Table-of-Contents)


## Datetime

To deal effectively with dates and times from around the world, Python provides the [`datetime`][dt] module. By default, the objects it creates are "naive": they carry no time zone information. The `datetime.now` method reports the current local time, while `datetime.utcnow` reports the current time in Coordinated Universal Time (UTC). The module also provides a `time` class for times of day and a `date` class for calendar dates. Several basic functions are demonstrated for each of these classes in the following Code cells.

-----

[dt]: https://docs.python.org/3.5/library/datetime.html

In [4]:
# Work with time and dates
from datetime import datetime
from datetime import time

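print(datetime.now())         # current local time
print(datetime.utcnow())      # current UTC time (utcnow is deprecated since Python 3.12)
print(time(14, 2, 12, 2105))  # hour, minute, second, microsecond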

2020-01-08 11:00:08.610489
2020-01-08 17:00:08.613225
14:02:12.002105


In [5]:
# Display dates
from datetime import date 
today = date.today()
print(today)

2020-01-08
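
As a small illustration of how the `date`, `time`, and `datetime` classes relate, the sketch below combines a date and a time into one object and inspects its fields. The variable names are hypothetical, and the use of `datetime.combine` and `timezone.utc` is an addition to the cells above.

```python
from datetime import date, datetime, time, timezone

d = date(2019, 9, 30)
t = time(14, 2, 12, 2105)

# Combine a date and a time of day into one datetime object
combined = datetime.combine(d, t)
print(combined)  # 2019-09-30 14:02:12.002105

# Individual fields are available as attributes
print(combined.year, combined.month, combined.day, combined.hour)  # 2019 9 30 14

# A datetime built this way is "naive": it carries no time zone
print(combined.tzinfo)  # None

# Attaching UTC yields an "aware" datetime
aware = combined.replace(tzinfo=timezone.utc)
print(aware)  # 2019-09-30 14:02:12.002105+00:00
```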


-----

[[Back to TOC]](#Table-of-Contents)


## Timezone

To be effective, these Python time and date functions need to be able to deal with time zones. The interface for this is declared in the [`tzinfo` abstract class][tz], which allows different implementations that all conform to the same standard. The challenge for developers is to ensure that a fast and reliable implementation is available for our analyses. In this notebook, we use the `dateutil` module, which provides time zone support. This is demonstrated in the following Code cells, where we obtain the time zone for 'America/Chicago' by using the `gettz` function, display the current date and time in that time zone, and then display the current time in several other time zones around the world.

-----

[tz]: https://docs.python.org/3.5/library/datetime.html#datetime.tzinfo

In [6]:
from dateutil import tz

lcl = tz.gettz('America/Chicago')

In [7]:
print(datetime.now(lcl))

2020-01-08 11:00:08.641119-06:00


In [8]:
print('Tokyo: ', datetime.now(tz.gettz("Asia/Tokyo")))
print('Dubai: ', datetime.now(tz.gettz("Asia/Dubai")))
print('London: ', datetime.now(tz.gettz("Europe/London")))
print('Madrid: ', datetime.now(tz.gettz('Europe/Madrid')))
print('Chicago: ', datetime.now(tz.gettz("America/Chicago")))

Tokyo:  2020-01-09 02:00:08.663687+09:00
Dubai:  2020-01-08 21:00:08.667415+04:00
London:  2020-01-08 17:00:08.670846+00:00
Madrid:  2020-01-08 18:00:08.674346+01:00
Chicago:  2020-01-08 11:00:08.675048-06:00
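
The cells above ask each time zone for the current time. A related task is converting one fixed moment between time zones. The sketch below does this with `astimezone`, and it assumes Python 3.9 or later so that the standard library `zoneinfo` module can stand in for `dateutil`; with `dateutil`, `tz.gettz(...)` could be passed to `astimezone` in the same way.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library in Python 3.9+, used here instead of dateutil

# A fixed, time-zone-aware moment expressed in UTC
instant = datetime(2020, 1, 8, 17, 0, 0, tzinfo=timezone.utc)

# Express the same moment in two other time zones
chicago = instant.astimezone(ZoneInfo('America/Chicago'))
tokyo = instant.astimezone(ZoneInfo('Asia/Tokyo'))

print(chicago)  # 2020-01-08 11:00:00-06:00
print(tokyo)    # 2020-01-09 02:00:00+09:00

# Both objects describe the same moment, so they compare equal
print(chicago == tokyo)  # True
```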


-----

[[Back to TOC]](#Table-of-Contents)

## Timedelta

To simplify processing time and date information, Python provides the [`timedelta`][tdm] class, which represents a duration. This allows arithmetic on times and dates in natural units, such as a number of hours or a number of days. The following Code cells demonstrate several ways to perform date computations by using this class.


-----

[tdm]: https://docs.python.org/3.5/library/datetime.html#timedelta-objects

In [9]:
# Days until the next Olympic Games, scheduled to start on July 24, 2020
(datetime(2020,7,24) - datetime.now()).days

197

In [10]:
from datetime import timedelta

today = datetime.today()
print(today)

2020-01-08 11:00:08.699571


In [11]:
#24 hours ago
print(today - timedelta(hours=24))

2020-01-07 11:00:08.699571


In [12]:
#52 weeks ago
print(today - timedelta(weeks=52))

2019-01-09 11:00:08.699571


In [13]:
#2 days later
print(today + timedelta(days=2))

2020-01-10 11:00:08.699571
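
Subtracting one `datetime` from another produces a `timedelta`, which is what the Olympics cell above relies on. The sketch below uses fixed, illustrative dates (so the results do not depend on when the code runs) to show the attributes of the result.

```python
from datetime import datetime, timedelta

start = datetime(2020, 1, 8, 11, 0, 0)
opening = datetime(2020, 7, 24)

# Subtracting two datetimes produces a timedelta
delta = opening - start
print(delta)                  # 197 days, 13:00:00
print(delta.days)             # 197 (whole days only)
print(delta.total_seconds())  # 17067600.0

# timedelta objects can be compared and divided
print(timedelta(weeks=52) == timedelta(days=364))  # True
print(timedelta(hours=36) / timedelta(days=1))     # 1.5
```

Note that `days` drops the remaining 13 hours, so `total_seconds` is the safer choice when the full duration matters.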


-----

<font color='red' size = '5'> Student Exercise </font>


In the preceding cells, we introduced working with times and dates in Python. Now that you have run the notebook, go back and make the following changes to see how the results change.

1. Change the time zone to a new location.
2. Compute the date and time corresponding to 2.5 billion seconds since the start of the epoch.
3. Compute the date and time that corresponds to 5,000 days from today.


-----

[[Back to TOC]](#Table-of-Contents)

## Datetime Format

Dates and times are very common in datasets, and different regions around the world represent them in different ways. For example, September 30, 2019 can be recorded in a data file as `2019-09-30`, `09/30/2019`, `Sep 30, 2019`, or in many other formats. To analyze data containing dates and times, we need to convert datetime strings in these various formats to datetime objects. It is therefore important to understand datetime formatting and how to convert between datetime strings and datetime objects.

The Python `datetime` class provides methods to convert between datetime strings and objects. The `strftime` method converts a datetime object to a readable date and time string, and the `strptime` method converts a date and time string to a datetime object. We demonstrate these two methods in the following Code cells.

In the Code cell below, we first define a datetime object for September 30, 2019, then convert it to various datetime string representations.

In [14]:
from datetime import datetime
#define a demo date of September 30, 2019
demo_date = datetime(2019, 9, 30)
#09/30/2019
print(demo_date.strftime('%m/%d/%Y'))
#Sep 30, 2019
print(demo_date.strftime('%b %d, %Y'))
#2019-09-30 00:00:00
print(demo_date.strftime('%Y-%m-%d %H:%M:%S'))

09/30/2019
Sep 30, 2019
2019-09-30 00:00:00


---

In the Code cell above, we use the `strftime` method to convert `demo_date` to date strings in different formats. `strftime` takes one argument, a string that defines the format of the output date and time string.

We used the following format codes to format `demo_date`:

%m: Month as a zero-padded decimal number. In our example, it returned "09".   
%d: Day of the month as a zero-padded decimal number. In our example, it returned "30".  
%Y: Year with century as a decimal number. In our example, it returned "2019".  
%b: Month as locale’s abbreviated name. In our example, it returned "Sep".  
%H: Hour (24-hour clock) as a zero-padded decimal number. In our example, it returned "00".  
%M: Minute as a zero-padded decimal number. In our example, it returned "00".  
%S: Second as a zero-padded decimal number. In our example, it returned "00".  

For the complete list of date and time format codes, see the Python [`datetime` module documentation][dtf].

Any character in the format string that is not a format code is printed as is, like the "/", ",", "-" and spaces in our examples.
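
To illustrate a few more format codes and the pass-through of ordinary characters, here is a short sketch. The variable names and the file-name pattern are made up for illustration, and the weekday and month names assume the default English (C) locale.

```python
from datetime import datetime

meeting = datetime(2019, 9, 30, 8, 30)

# %A: full weekday name, %B: full month name,
# %I: hour on a 12-hour clock, %p: AM or PM
label = meeting.strftime('%A, %B %d, %Y at %I:%M %p')
print(label)  # Monday, September 30, 2019 at 08:30 AM

# Characters that are not format codes are copied through unchanged
stamp = meeting.strftime('report_%Y%m%d_%H%M.csv')
print(stamp)  # report_20190930_0830.csv
```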

`strptime` converts a date and time string to a datetime object. The method takes two arguments: the first is the date and time string, and the second is the format string. In the next Code cell, we convert a date and time string to a datetime object with the `strptime` method.

---
[dtf]:https://docs.python.org/2/library/datetime.html#strftime-strptime-behavior

In [15]:
datetime.strptime('09/30/2019', '%m/%d/%Y')

datetime.datetime(2019, 9, 30, 0, 0)

---

When calling `strptime`, the format string must match the datetime string. If there is a mismatch, Python raises a `ValueError` exception, as shown below.

<img src='images/mismatch_date_format.png' width=1000>
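
The same mismatch can be reproduced and handled in code. The following is a minimal sketch with hypothetical variable names: it catches the `ValueError` from a mismatched format and then retries with the matching format.

```python
from datetime import datetime

date_str = '09/30/2019'
error_message = None

try:
    # The string uses slashes, but this format expects dashes
    parsed = datetime.strptime(date_str, '%Y-%m-%d')
except ValueError as err:
    parsed = None
    error_message = str(err)
    print('Could not parse:', error_message)

# With the matching format the conversion succeeds
parsed = datetime.strptime(date_str, '%m/%d/%Y')
print(parsed)  # 2019-09-30 00:00:00
```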

-----

<font color='red' size = '5'> Student Exercise </font>


In the preceding cells, we introduced `strftime` and `strptime`. In the following Code cell, try to complete the following tasks:

1. Convert `demo_date` to the string `'Sep 30, 19'`. (You will need to find the format code for a two-digit year.)
2. Convert `'2019-09-30 08:30'` to a datetime object.

-----

-----

[[Back to TOC]](#Table-of-Contents)


## Pandas DataFrame Date and Time Manipulation

We have already learned how to convert a datetime string to a datetime object. In this section, we learn how to convert a column in a Pandas DataFrame to datetime objects. First, we load weekly Dow Jones average data. The `Date` column has type `object`, which means its values were loaded as strings.

In [16]:
import pandas as pd
df_dow = pd.read_csv('data/dow_weekly_all.csv')
df_dow.head()

| | Date | Open | High | Low | Close | Adj Close | Volume |
|---|---|---|---|---|---|---|---|
| 0 | 1985-01-28 | 1277.719971 | 1305.099976 | 1266.890015 | 1277.719971 | 1277.719971 | 55430000 |
| 1 | 1985-02-04 | 1272.079956 | 1301.130005 | 1268.98999 | 1289.969971 | 1289.969971 | 59480000 |
| 2 | 1985-02-11 | 1287.98999 | 1307.530029 | 1266.339966 | 1282.02002 | 1282.02002 | 61270000 |
| 3 | 1985-02-18 | 1279.810059 | 1292.51001 | 1269.98999 | 1275.839966 | 1275.839966 | 34550000 |
| 4 | 1985-02-25 | 1269.98999 | 1309.959961 | 1263.910034 | 1299.359985 | 1299.359985 | 55750000 |


In [17]:
df_dow.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1764 entries, 0 to 1763
Data columns (total 7 columns):
Date         1764 non-null object
Open         1764 non-null float64
High         1764 non-null float64
Low          1764 non-null float64
Close        1764 non-null float64
Adj Close    1764 non-null float64
Volume       1764 non-null int64
dtypes: float64(5), int64(1), object(1)
memory usage: 96.6+ KB


---

Pandas provides the `to_datetime` function, which converts datetime strings to datetime objects. We demonstrate this function in the next Code cell.

In [18]:
df_dow['Date'] = pd.to_datetime(df_dow.Date)
df_dow.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1764 entries, 0 to 1763
Data columns (total 7 columns):
Date         1764 non-null datetime64[ns]
Open         1764 non-null float64
High         1764 non-null float64
Low          1764 non-null float64
Close        1764 non-null float64
Adj Close    1764 non-null float64
Volume       1764 non-null int64
dtypes: datetime64[ns](1), float64(5), int64(1)
memory usage: 96.6 KB
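
Converting a column to `datetime64` is worthwhile because date-aware operations then become available. The sketch below is an illustration only: it builds a small stand-in frame from the first three rows shown earlier (with rounded closing values) instead of reading the data file, and the new column names are hypothetical.

```python
import pandas as pd

# A small stand-in for the Dow data, so the example runs without the CSV file
demo = pd.DataFrame({
    'Date': ['1985-01-28', '1985-02-04', '1985-02-11'],
    'Close': [1277.72, 1289.97, 1282.02],
})
demo['Date'] = pd.to_datetime(demo['Date'])

# The .dt accessor exposes datetime fields for the whole column
demo['Year'] = demo['Date'].dt.year
demo['Month'] = demo['Date'].dt.month
demo['Weekday'] = demo['Date'].dt.day_name()

# Datetime columns can be compared against a date string
february = demo[demo['Date'] >= '1985-02-01']

print(demo)
print(len(february))  # 2
```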


---
Now the `Date` column in the DataFrame has type `datetime64`. In the code above, we did not pass a format to the function; Pandas inferred the datetime format from the string values in the column. This approach works when Pandas can recognize the datetime strings. While Pandas recognizes many datetime formats, it cannot cover every case. In the next Code cell, we define a DataFrame with a `Date` column that contains several datetime formats. Pandas recognizes the first two datetime strings but not the last one. If we use `to_datetime` to convert the `Date` column, a `ValueError` exception is raised, as shown in the second Code cell below. We put the code in a try-except block so that the exception is handled (printed) gracefully.

In [19]:
demo_df = pd.DataFrame({'Date':['2019-09-30', '09/30/2019', '2019.09.30 00.00.00']})
demo_df

| | Date |
|---|---|
| 0 | 2019-09-30 |
| 1 | 09/30/2019 |
| 2 | 2019.09.30 00.00.00 |


In [20]:
try:
    demo_df['Date_obj'] = pd.to_datetime(demo_df.Date)
except Exception as ex:
    print(ex)

('Unknown string format:', '2019.09.30 00.00.00')


---

When `to_datetime` does not recognize a datetime string format, we can pass a user-defined format through the function's `format` argument. In the case above, however, the `Date` column contains multiple datetime formats, so no single format string covers all the values. Instead, we can write our own conversion function and apply it to each value in the column.
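
For the simpler case in which every value shares one unusual format, the `format` argument is enough. The sketch below uses made-up sample values; the `errors='coerce'` option is an addition not used elsewhere in this notebook, shown as one possible way to turn unparseable values into `NaT` instead of raising an exception.

```python
import pandas as pd

dotted = pd.Series(['2019.09.30 00.00.00', '2019.10.07 12.30.00'])

# An explicit format tells pandas how to read every value
parsed = pd.to_datetime(dotted, format='%Y.%m.%d %H.%M.%S')
print(parsed)

# With errors='coerce', values that do not match the format become NaT
mixed = pd.Series(['2019.09.30 00.00.00', 'not a date'])
coerced = pd.to_datetime(mixed, format='%Y.%m.%d %H.%M.%S', errors='coerce')
print(coerced.isna().tolist())  # [False, True]
```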

In the next Code cell, we first define a function `convert_date`. In the function, we use `strptime` to convert a date string to a datetime object. Since there are multiple formats, we use multiple try-except blocks, one per format. If the first format matches the date string, the function returns the datetime object; if not, the `except` clause catches the exception. The `pass` statement in the `except` clause tells the code to do nothing and move on to the next try-except block, which tries the next date format. We define try-except blocks for all date formats appearing in the dataset. When a match is found, the function immediately returns the datetime object. The last line is a print statement that prints the input argument. When this line is reached, none of the formats in the function can handle this `date_str`, and the function returns `None`. We then need to add a new format, in another try-except block, to handle the case.

We then apply this function to the `Date` column through a lambda function. This time we successfully create the `Date_obj` column, which has type `datetime64`.

Please note that it is good practice to create a new date column when converting date string columns. This ensures that we always keep the original data. If anything goes wrong in the conversion, we can fix the problem and retry on the original date column.

In [21]:
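def convert_date(date_str):
    # Try each known format in turn; strptime raises ValueError on a mismatch
    try:
        return datetime.strptime(date_str, '%Y-%m-%d')
    except ValueError:
        pass
    try:
        return datetime.strptime(date_str, '%m/%d/%Y')
    except ValueError:
        pass
    try:
        return datetime.strptime(date_str, '%Y.%m.%d %H.%M.%S')
    except ValueError:
        pass
    # No format matched: report the value (the function then returns None)
    print("Can't convert", date_str)

demo_df['Date_obj'] = demo_df.Date.apply(lambda x: convert_date(x))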
demo_df.head()

| | Date | Date_obj |
|---|---|---|
| 0 | 2019-09-30 | 2019-09-30 |
| 1 | 09/30/2019 | 2019-09-30 |
| 2 | 2019.09.30 00.00.00 | 2019-09-30 |
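
As the number of formats grows, repeating try-except blocks becomes tedious. One possible variation, sketched below, loops over a list of formats instead. The names `KNOWN_FORMATS` and `convert_date_any` are hypothetical and are not part of the notebook's code.

```python
from datetime import datetime

# Formats to try, in order; extend this list as new layouts appear
KNOWN_FORMATS = ['%Y-%m-%d', '%m/%d/%Y', '%Y.%m.%d %H.%M.%S']

def convert_date_any(date_str, formats=KNOWN_FORMATS):
    """Return a datetime for the first matching format, or None."""
    for fmt in formats:
        try:
            return datetime.strptime(date_str, fmt)
        except ValueError:
            continue  # this format did not match; try the next one
    print("Can't convert", date_str)
    return None

samples = ['2019-09-30', '09/30/2019', '2019.09.30 00.00.00']
results = [convert_date_any(s) for s in samples]
print(results)
```

Supporting a new layout then only requires adding one string to the list, and the function can be passed to `apply` in the same way as `convert_date`.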


-----

<font color='red' size = '5'> Student Exercise </font>


In the following Code cell, a DataFrame is defined with a `Date` column. Please convert the date strings in the `Date` column to datetime objects and create a new `Date_obj` column to hold the datetime objects.

Try both using `to_datetime` directly and using a lambda function. Do both approaches work?

-----

In [22]:
exe_df = pd.DataFrame({'Date':['2019-09-30', '09/30/2019', '2019.09.30']})
exe_df

| | Date |
|---|---|
| 0 | 2019-09-30 |
| 1 | 09/30/2019 |
| 2 | 2019.09.30 |


## Ancillary Information

The following links are to additional documentation that you might find helpful in learning this material. Reading these web-accessible documents is completely optional.

1. Python [`time` module][ptm] and [`datetime` module][pdtm]
2. Informative tutorial introducing Python [time and datetime][tad] modules
3. Summary sheet of [string time formatting codes][so]
4. A Python [date and time][dt1] tutorial
5. A tutorial on using Python [string format methods for working with times and dates][tsft]
6. Python tutorial on [working with times and dates][ptt] including the use of Pandas.
7. Tutorial on using pandas for times and dates, [part 1][ptd1] and [part 2][ptd2]
8. Review of the Python [DateTime][pmdt] module

-----

[wtzd]: https://en.wikipedia.org/wiki/Tz_database

[ptm]: https://docs.python.org/3/library/time.html
[pdtm]: https://docs.python.org/3/library/datetime.html

[tad]: http://o7planning.org/en/11443/python-date-time-tutorial
[so]: http://strftime.org

[dt1]: https://intellipaat.com/tutorial/python-tutorial/python-date-and-time/
[dt2]: https://www.webcodegeeks.com/python/python-datetime-tutorial/

[ptct]: https://www.saltycrane.com/blog/2008/06/how-to-get-current-date-and-time-in/
[dt3]: http://www.pythonforbeginners.com/basics/python-datetime-timedelta/

[tsft]: https://www.tutorialspoint.com/python/time_strptime.htm
[dt4]: https://www.tutorialspoint.com/python/python_date_time.htm
[dt5]: https://opensource.com/article/17/5/understanding-datetime-python-primer
[ptt]: http://www.marcelscharth.com/python/time.html
[ptd1]: http://earthpy.org/pandas-basics.html
[ptd2]: http://earthpy.org/time_series_analysis_with_pandas_part_2.html

[dud]: https://dateutil.readthedocs.io/en/stable/
[pmdt]: https://pymotw.com/2/datetime/

**&copy; 2019: Gies College of Business at the University of Illinois.**

This notebook is released under the [Creative Commons license CC BY-NC-SA 4.0][ll]. Any reproduction, adaptation, distribution, dissemination or making available of this notebook for commercial use is not allowed unless authorized in writing by the copyright holder.

[ll]: https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode