## Pandas Datetime Index

When working with time series in pandas DataFrames, we'll usually store the timestamps in a datetime index. Fortunately, pandas has many functions and methods for working with time series!<br>
For more on the pandas DatetimeIndex visit https://pandas.pydata.org/pandas-docs/stable/timeseries.html

In [1]:
import pandas as pd

The simplest way to build a DatetimeIndex is with the <tt><strong>pd.date_range()</strong></tt> function:

In [2]:
# THE WEEK OF JULY 8TH, 2018
idx = pd.date_range('7/8/2018', periods=7, freq='D')
idx

DatetimeIndex(['2018-07-08', '2018-07-09', '2018-07-10', '2018-07-11',
               '2018-07-12', '2018-07-13', '2018-07-14'],
              dtype='datetime64[ns]', freq='D')
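
Besides <tt>periods</tt>, <tt>pd.date_range()</tt> also accepts an explicit <tt>end</tt> date and other frequency codes. The following is a minimal sketch; the variable names <tt>week</tt> and <tt>weekdays</tt> are our own and not part of the original notebook:

```python
import pandas as pd

# Same week as above, specified by start and end dates instead of periods
week = pd.date_range(start='2018-07-08', end='2018-07-14', freq='D')

# Business days only (Monday through Friday) within the same span
weekdays = pd.date_range(start='2018-07-08', end='2018-07-14', freq='B')

print(len(week))      # number of calendar days in the range
print(len(weekdays))  # number of business days in the range
```

Since July 8th, 2018 was a Sunday, the business-day range starts on Monday the 9th.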

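Another way is to convert incoming text with the <tt><strong>pd.to_datetime()</strong></tt> function. It accepts a variety of date formats, and missing values such as <tt>None</tt> become <tt>NaT</tt> (Not a Time):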

In [3]:
idx = pd.to_datetime(['Jan 01, 2018','1/2/18','03-Jan-2018',None])
idx

DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03', 'NaT'], dtype='datetime64[ns]', freq=None)
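
As far as we know, pandas 2.0 and later infer a single format from the first string and may raise a <tt>ValueError</tt> when a list mixes formats like this. One version-independent sketch is to parse each string on its own and then build the index; the names <tt>raw</tt> and <tt>idx_mixed</tt> are our own:

```python
import pandas as pd

# Three different spellings of dates in early January 2018
raw = ['Jan 01, 2018', '1/2/18', '03-Jan-2018']

# Parse each string separately so each one gets its own format inference
idx_mixed = pd.DatetimeIndex([pd.to_datetime(s) for s in raw])

print(idx_mixed)
```

Parsing element by element is slower than a single vectorized call, so this is best kept for small or messy inputs.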

In [23]:
# Specify the format explicitly: these strings are day/month/year
idx1 = pd.to_datetime(['01/1/2018','02/1/2018','03/1/2018'],format='%d/%m/%Y')
idx1

DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03'], dtype='datetime64[ns]', freq=None)
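
The <tt>format</tt> argument matters because a string like <tt>02/01/2018</tt> is ambiguous. As a small illustration (the variable names here are our own), the same text parses to two different dates depending on the format we pass:

```python
import pandas as pd

text = '02/01/2018'

# Read as day/month/year: the 2nd of January
day_first = pd.to_datetime(text, format='%d/%m/%Y')

# Read as month/day/year: the 1st of February
month_first = pd.to_datetime(text, format='%m/%d/%Y')

print(day_first)
print(month_first)
```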

In [25]:
# Create a NumPy datetime array
import numpy as np
some_dates = np.array(['2016-03-15', '2017-05-24', '2018-08-09'], dtype='datetime64[D]')
some_dates

array(['2016-03-15', '2017-05-24', '2018-08-09'], dtype='datetime64[D]')

In [26]:
# Convert to an index
id1 = pd.DatetimeIndex(some_dates)
id1

DatetimeIndex(['2016-03-15', '2017-05-24', '2018-08-09'], dtype='datetime64[ns]', freq=None)

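Notice that even though the dates came into pandas with day-level precision, the pandas version used here stores them with nanosecond-level precision (<tt>datetime64[ns]</tt>), with the expectation that we might want this later on. Newer pandas releases (2.0 and later) may choose a different unit, such as seconds, for the same input.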
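
To check which precision is actually in use, we can compare the dtypes before and after the conversion. This is a small sketch (the names <tt>dates_np</tt>, <tt>dates_pd</tt>, and <tt>same_dates</tt> are our own); the unit pandas reports may depend on the installed version:

```python
import numpy as np
import pandas as pd

dates_np = np.array(['2016-03-15', '2017-05-24', '2018-08-09'], dtype='datetime64[D]')
dates_pd = pd.DatetimeIndex(dates_np)

print(dates_np.dtype)  # day-level unit on the NumPy side
print(dates_pd.dtype)  # unit chosen by pandas; depends on the installed version

# The calendar dates themselves are unchanged by the conversion
same_dates = list(dates_pd.strftime('%Y-%m-%d')) == [str(d) for d in dates_np]
print(same_dates)
```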

To set an existing column as the index, use <tt>.set_index()</tt><br>
><tt>df.set_index('Date',inplace=True)</tt>
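
As a sketch of that workflow, the example below builds a small DataFrame with a <tt>Date</tt> column of strings, converts it to datetimes, and then sets it as the index. The data and the name <tt>df_sales</tt> are hypothetical and only for illustration:

```python
import pandas as pd

# Hypothetical data: a 'Date' column of strings plus one value column
df_sales = pd.DataFrame({'Date': ['2018-01-01', '2018-01-02', '2018-01-03'],
                         'Sales': [10, 20, 15]})

# Convert the strings to datetimes first, then move the column into the index
df_sales['Date'] = pd.to_datetime(df_sales['Date'])
df_sales = df_sales.set_index('Date')

print(df_sales.index)
```

Converting before calling <tt>.set_index()</tt> matters: without it, the index would hold plain strings rather than a DatetimeIndex.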

## Pandas Datetime Analysis

In [27]:
# Create some random data
data = np.random.randn(3,2)
cols = ['A','B']
print(data)

[[ 0.47917341 -0.50493205]
 [ 1.33046785  1.9779913 ]
 [-0.00575753  2.38239225]]


In [31]:
# Create a DataFrame with our random data and our columns (the date index is set in the next step)
df = pd.DataFrame(data,columns=cols)
df

| | A | B |
|---|---|---|
| 0 | 0.479173 | -0.504932 |
| 1 | 1.330468 | 1.977991 |
| 2 | -0.005758 | 2.382392 |


In [32]:
df.set_index(id1,inplace=True)

In [33]:
df

| | A | B |
|---|---|---|
| 2016-03-15 | 0.479173 | -0.504932 |
| 2017-05-24 | 1.330468 | 1.977991 |
| 2018-08-09 | -0.005758 | 2.382392 |
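
The same result can be reached in one step by passing the index to the DataFrame constructor. A minimal sketch, using our own names <tt>dates</tt>, <tt>values</tt>, and <tt>df_one_step</tt>:

```python
import numpy as np
import pandas as pd

dates = pd.DatetimeIndex(['2016-03-15', '2017-05-24', '2018-08-09'])
values = np.random.randn(3, 2)  # random values, so they will differ from those above

# Supply data, index, and column names together
df_one_step = pd.DataFrame(values, index=dates, columns=['A', 'B'])

print(df_one_step)
```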


In [34]:
df.index

DatetimeIndex(['2016-03-15', '2017-05-24', '2018-08-09'], dtype='datetime64[ns]', freq=None)

In [35]:
# Latest Date Value
df.index.max()

Timestamp('2018-08-09 00:00:00')

In [36]:
# Latest Date Index Location
df.index.argmax()

2

In [37]:
# Earliest Date Value
df.index.min()

Timestamp('2016-03-15 00:00:00')

In [38]:
# Earliest Date Index Location
df.index.argmin()

0
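
The value and the location lead to the same row: a position from <tt>.argmax()</tt> works with <tt>.iloc</tt>, while a value from <tt>.max()</tt> works with <tt>.loc</tt>. A small sketch with made-up data (the name <tt>df_demo</tt> and its values are our own):

```python
import pandas as pd

dates = pd.DatetimeIndex(['2016-03-15', '2017-05-24', '2018-08-09'])
df_demo = pd.DataFrame({'A': [1.0, 2.0, 3.0]}, index=dates)

latest_pos = df_demo.index.argmax()   # integer position of the latest date
latest_label = df_demo.index.max()    # the latest date itself

row_by_pos = df_demo.iloc[latest_pos]      # position-based lookup
row_by_label = df_demo.loc[latest_label]   # label-based lookup

print(row_by_pos.equals(row_by_label))
```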

<div class="alert alert-info"><strong>NOTE:</strong> To find the index <em>label</em> of a column's smallest or largest value, we would normally run <tt>.idxmin()</tt> or <tt>.idxmax()</tt> on <tt>df['column']</tt>. Older pandas releases deprecated <tt>.argmin()</tt> and <tt>.argmax()</tt> on a Series in favor of these methods. On the index itself, however, we still use <tt>.argmin()</tt> and <tt>.argmax()</tt>, which return integer positions.</div>
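
To make the label versus position distinction concrete, here is a short sketch on a Series with a datetime index. The data and variable names are our own, and the position is taken from the underlying NumPy array so that the example does not depend on how a given pandas version treats <tt>Series.argmax()</tt>:

```python
import pandas as pd

dates = pd.DatetimeIndex(['2016-03-15', '2017-05-24', '2018-08-09'])
s = pd.Series([0.5, 2.0, -1.0], index=dates)

label_of_max = s.idxmax()                 # index label (a Timestamp) of the largest value
position_of_max = s.to_numpy().argmax()   # integer position of the largest value

print(label_of_max)
print(position_of_max)
```

Looking up <tt>s.index[position_of_max]</tt> gives back the same label that <tt>.idxmax()</tt> returned.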