## Pandas Tutorial 19: Time Series Analysis: Period and PeriodIndex
In this tutorial, we continue exploring Pandas time series analysis by introducing `Period` and `PeriodIndex`. A `Period` represents a span of time (a year, a quarter, a month, a day, an hour) rather than a single instant, and periods are widely used in financial analysis. Pandas supports period arithmetic, so you can create quarterly, yearly, and other periods and add, subtract, or convert them. You can use `period_range` to generate a `PeriodIndex` between a defined start and end.

#### Topics covered:
* **What is a timestamp?**
* **What is a timespan?**
* **Using the `Period()` constructor**
* **Arithmetic operations on period objects**
* **Creating quarterly time periods**
* **Using the `asfreq()` function**
* **List of all frequencies**
* **Creating a `PeriodIndex`**
* **Converting a `PeriodIndex` to a `DatetimeIndex`**
* **Converting a `DatetimeIndex` to a `PeriodIndex`**
* **Using `set_index()` with periods**
* **Converting an index to `PeriodIndex` using `PeriodIndex()`**
* **Creating new columns in a DataFrame**

This tutorial will guide you through working with time periods and period indexing in Pandas for advanced time series analysis.

In [38]:
import pandas as pd
import numpy as np

## Yearly Period

In [2]:
y = pd.Period('2016')
y

Period('2016', 'A-DEC')

In [3]:
y.start_time

Timestamp('2016-01-01 00:00:00')

In [4]:
y.end_time

Timestamp('2016-12-31 23:59:59.999999999')

In [5]:
y.is_leap_year

True
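The difference between a timestamp and a timespan can be checked directly. The following is a minimal sketch; the names `ts`, `year`, `inside`, and `month` are illustrative and not part of the original notebook. It relies on the inferred yearly frequency instead of an explicit alias, because alias spellings differ between pandas versions.

```python
import pandas as pd

# A Timestamp is a single instant; a Period is the whole span around it.
ts = pd.Timestamp('2016-03-15 10:30')
year = pd.Period('2016')            # frequency inferred as yearly

# The instant lies between the period's first and last moments.
inside = year.start_time <= ts <= year.end_time

# Converting the instant to a monthly period keeps only year and month.
month = ts.to_period('M')

print(inside)                           # True
print(month)                            # 2016-03
print(pd.Period('2017').is_leap_year)   # False
```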

## Monthly Period

In [6]:
m = pd.Period('2017-12')
m

Period('2017-12', 'M')

In [7]:
m.start_time

Timestamp('2017-12-01 00:00:00')

In [8]:
m.end_time

Timestamp('2017-12-31 23:59:59.999999999')

In [9]:
m+1

Period('2018-01', 'M')

## Daily Period

In [10]:
d = pd.Period('2016-02-28', freq='D')
d

Period('2016-02-28', 'D')

In [11]:
d.start_time

Timestamp('2016-02-28 00:00:00')

In [12]:
d.end_time

Timestamp('2016-02-28 23:59:59.999999999')

In [13]:
d+1

Period('2016-02-29', 'D')
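Period arithmetic is calendar-aware, which is why adding one day to 2016-02-28 lands on February 29. As a small sketch (variable names are illustrative), the same operation in a non-leap year rolls over to March, and subtracting two daily periods gives the number of days between them:

```python
import pandas as pd

# 2016 is a leap year, 2017 is not.
leap = pd.Period('2016-02-28', freq='D') + 1
non_leap = pd.Period('2017-02-28', freq='D') + 1

print(leap)       # 2016-02-29
print(non_leap)   # 2017-03-01

# Subtracting periods of the same frequency yields an offset;
# its .n attribute is the count of periods between them.
days_in_2016 = (pd.Period('2017-01-01', freq='D') - pd.Period('2016-01-01', freq='D')).n
print(days_in_2016)   # 366
```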

## Hourly Period

In [14]:
h = pd.Period('2017-08-15 23:00:00', freq='H')
h

Period('2017-08-15 23:00', 'H')

In [15]:
h+1

Period('2017-08-16 00:00', 'H')

**Achieve the same result using a pandas `Hour` offset**

In [16]:
h+pd.offsets.Hour(1)

Period('2017-08-16 00:00', 'H')
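The sketch below compares three ways of moving an hourly period forward and shows that an offset finer than the period's frequency is rejected. It is an illustration, not part of the original notebook. It uses the lowercase alias `'h'` because recent pandas releases deprecate the uppercase `'H'`, and it catches both `ValueError` and `TypeError` as a precaution, since the exact exception class may vary by version.

```python
import pandas as pd

h = pd.Period('2017-08-15 23:00', freq='h')

# Three equivalent ways to add one hour.
by_int = h + 1
by_offset = h + pd.offsets.Hour(1)
by_timedelta = h + pd.Timedelta(hours=1)
print(by_int, by_offset, by_timedelta)

# Half an hour is not a whole number of hourly periods, so pandas refuses.
try:
    h + pd.offsets.Minute(30)
    mismatch_rejected = False
except (ValueError, TypeError):
    mismatch_rejected = True
print(mismatch_rejected)
```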

## Quarterly Period

In [17]:
q1 = pd.Period('2017Q1', freq='Q-JAN')
q1

Period('2017Q1', 'Q-JAN')

In [18]:
q1.start_time

Timestamp('2016-02-01 00:00:00')

In [19]:
q1.end_time

Timestamp('2016-04-30 23:59:59.999999999')

**Use `asfreq()` to convert a period to a different frequency**

In [20]:
q1.asfreq('M', how='start')

Period('2016-02', 'M')

In [21]:
q1.asfreq('M', how='end')

Period('2016-04', 'M')
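With `freq='Q-JAN'` the fiscal year ends in January, so fiscal 2017Q1 covers February through April 2016. A short sketch (names such as `months` and `q4` are illustrative) lists the months inside the quarter and checks where the fiscal year closes:

```python
import pandas as pd

q1 = pd.Period('2017Q1', freq='Q-JAN')

# Enumerate the months between the quarter's first and last month.
months = pd.period_range(q1.asfreq('M', how='start'),
                         q1.asfreq('M', how='end'), freq='M')
print([str(m) for m in months])   # ['2016-02', '2016-03', '2016-04']

# qyear and quarter expose the fiscal label rather than the calendar one.
print(q1.qyear, q1.quarter)       # 2017 1

# Three quarters later is fiscal Q4, which ends in January 2017.
q4 = q1 + 3
print(q4.asfreq('M', how='end'))  # 2017-01
```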

## Weekly Period

In [22]:
w = pd.Period('2017-07-05', freq='W')
w

Period('2017-07-03/2017-07-09', 'W-SUN')

In [23]:
w-1

Period('2017-06-26/2017-07-02', 'W-SUN')

In [24]:
w2 = pd.Period('2017-08-15', freq='W')
w2

Period('2017-08-14/2017-08-20', 'W-SUN')

In [25]:
w2-w

<6 * Weeks: weekday=6>
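Subtracting two weekly periods returns an offset object, not a plain number. Here is a sketch of how to pull out the integer and how the anchor day changes the week boundaries; the `W-SAT` example is an added illustration that does not appear in the original notebook.

```python
import pandas as pd

w = pd.Period('2017-07-05', freq='W')     # default weekly anchor: weeks end on Sunday
w2 = pd.Period('2017-08-15', freq='W')

# The offset's .n attribute holds the number of weeks between the two periods.
weeks_apart = (w2 - w).n
print(weeks_apart)                         # 6

# The default week runs Monday through Sunday.
print(w.start_time.day_name(), w.end_time.day_name())   # Monday Sunday

# Anchoring on Saturday shifts the boundaries by one day.
w_sat = pd.Period('2017-07-05', freq='W-SAT')
print(w_sat)                               # 2017-07-02/2017-07-08
```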

## PeriodIndex and period_range

In [26]:
r = pd.period_range('2011', '2017', freq='q')
r

PeriodIndex(['2011Q1', '2011Q2', '2011Q3', '2011Q4', '2012Q1', '2012Q2',
             '2012Q3', '2012Q4', '2013Q1', '2013Q2', '2013Q3', '2013Q4',
             '2014Q1', '2014Q2', '2014Q3', '2014Q4', '2015Q1', '2015Q2',
             '2015Q3', '2015Q4', '2016Q1', '2016Q2', '2016Q3', '2016Q4',
             '2017Q1'],
            dtype='period[Q-DEC]')

In [27]:
r[0].start_time

Timestamp('2011-01-01 00:00:00')

In [28]:
r[0].end_time

Timestamp('2011-03-31 23:59:59.999999999')

**Walmart's fiscal year ends in January. Below is how you generate Walmart's fiscal quarters between 2011 and 2017.**

In [29]:
r = pd.period_range('2011', '2017', freq='q-jan')
r

PeriodIndex(['2011Q4', '2012Q1', '2012Q2', '2012Q3', '2012Q4', '2013Q1',
             '2013Q2', '2013Q3', '2013Q4', '2014Q1', '2014Q2', '2014Q3',
             '2014Q4', '2015Q1', '2015Q2', '2015Q3', '2015Q4', '2016Q1',
             '2016Q2', '2016Q3', '2016Q4', '2017Q1', '2017Q2', '2017Q3',
             '2017Q4'],
            dtype='period[Q-JAN]')

In [30]:
r[0].start_time

Timestamp('2010-11-01 00:00:00')

In [31]:
r[0].end_time

Timestamp('2011-01-31 23:59:59.999999999')
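The range above starts at `2011Q4` because the string `'2011'` is read as a calendar date (January 2011), which falls in the last quarter of a fiscal year ending in January. The sketch below illustrates this and shows one way to start at fiscal Q1 by passing quarter labels explicitly; the variable names are illustrative.

```python
import pandas as pd

# '2011' resolves to January 2011, which belongs to fiscal 2011Q4 under Q-JAN.
first = pd.Period('2011', freq='Q-JAN')
print(first, first.start_time.date())   # 2011Q4 2010-11-01

# Passing fiscal quarter labels gives full fiscal years 2011 through 2017.
fiscal = pd.period_range('2011Q1', '2017Q4', freq='Q-JAN')
print(fiscal[0], fiscal[-1], len(fiscal))   # 2011Q1 2017Q4 28
```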

In [34]:
r = pd.period_range(start='2016-01', freq='3M', periods=10)
r

PeriodIndex(['2016-01', '2016-04', '2016-07', '2016-10', '2017-01', '2017-04',
             '2017-07', '2017-10', '2018-01', '2018-04'],
            dtype='period[3M]')

In [43]:
idx = pd.period_range('2011', periods=10, freq='Q-JAN')
idx

PeriodIndex(['2011Q4', '2012Q1', '2012Q2', '2012Q3', '2012Q4', '2013Q1',
             '2013Q2', '2013Q3', '2013Q4', '2014Q1'],
            dtype='period[Q-JAN]')

In [44]:
ps = pd.Series(np.random.randn(len(idx)), idx)
ps

2011Q4   -1.681687
2012Q1   -0.141117
2012Q2   -0.460517
2012Q3   -0.175963
2012Q4   -1.118079
2013Q1   -1.291482
2013Q2    0.397757
2013Q3    1.608209
2013Q4    0.118321
2014Q1   -2.054462
Freq: Q-JAN, dtype: float64

In [46]:
ps.index

PeriodIndex(['2011Q4', '2012Q1', '2012Q2', '2012Q3', '2012Q4', '2013Q1',
             '2013Q2', '2013Q3', '2013Q4', '2014Q1'],
            dtype='period[Q-JAN]')

## Partial Indexing

In [48]:
ps['2012']

2012Q4   -1.118079
2013Q1   -1.291482
2013Q2    0.397757
2013Q3    1.608209
2013Q4    0.118321
Freq: Q-JAN, dtype: float64

In [47]:
ps['2011':'2013']

2011Q4   -1.681687
2012Q1   -0.141117
2012Q2   -0.460517
2012Q3   -0.175963
2012Q4   -1.118079
2013Q1   -1.291482
2013Q2    0.397757
2013Q3    1.608209
2013Q4    0.118321
2014Q1   -2.054462
Freq: Q-JAN, dtype: float64
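The results above may look surprising: `ps['2012']` returns quarters labelled 2012Q4 through 2013Q4. The output is consistent with pandas interpreting `'2012'` as calendar year 2012 and selecting every fiscal quarter from the one containing January 2012 to the one containing December 2012. The sketch below reproduces this with deterministic values standing in for the random data, and uses `.loc` for the lookup.

```python
import pandas as pd

idx = pd.period_range('2011Q4', periods=10, freq='Q-JAN')
ps = pd.Series(range(10), index=idx)   # stand-in for the random values

sel = ps.loc['2012']
labels = [str(p) for p in sel.index]
print(labels)   # ['2012Q4', '2013Q1', '2013Q2', '2013Q3', '2013Q4']

# Show the calendar months each selected fiscal quarter actually spans.
spans = [(p.start_time.strftime('%Y-%m'), p.end_time.strftime('%Y-%m'))
         for p in sel.index]
print(spans[0], spans[-1])   # ('2011-11', '2012-01') ('2012-11', '2013-01')
```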

### Converting between representations

In [49]:
pst = ps.to_timestamp()
pst

2010-11-01   -1.681687
2011-02-01   -0.141117
2011-05-01   -0.460517
2011-08-01   -0.175963
2011-11-01   -1.118079
2012-02-01   -1.291482
2012-05-01    0.397757
2012-08-01    1.608209
2012-11-01    0.118321
2013-02-01   -2.054462
Freq: QS-NOV, dtype: float64

In [50]:
pst.index

DatetimeIndex(['2010-11-01', '2011-02-01', '2011-05-01', '2011-08-01',
               '2011-11-01', '2012-02-01', '2012-05-01', '2012-08-01',
               '2012-11-01', '2013-02-01'],
              dtype='datetime64[ns]', freq='QS-NOV')

In [51]:
ps = pst.to_period()
ps

2010Q4   -1.681687
2011Q1   -0.141117
2011Q2   -0.460517
2011Q3   -0.175963
2011Q4   -1.118079
2012Q1   -1.291482
2012Q2    0.397757
2012Q3    1.608209
2012Q4    0.118321
2013Q1   -2.054462
Freq: Q-DEC, dtype: float64

In [52]:
ps.index

PeriodIndex(['2010Q4', '2011Q1', '2011Q2', '2011Q3', '2011Q4', '2012Q1',
             '2012Q2', '2012Q3', '2012Q4', '2013Q1'],
            dtype='period[Q-DEC]')
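Note that the round trip above does not restore the original fiscal labels: the index comes back as `Q-DEC` instead of `Q-JAN`. One way to keep the fiscal calendar, sketched here with deterministic stand-in data, is to pass the frequency explicitly to `to_period()`.

```python
import pandas as pd

idx = pd.period_range('2011Q4', periods=10, freq='Q-JAN')
ps = pd.Series(range(10), index=idx)   # stand-in for the random values

# Periods -> timestamps (start of each quarter by default).
pst = ps.to_timestamp()
print(pst.index[0].date())             # 2010-11-01

# Timestamps -> periods, stating the fiscal frequency explicitly.
restored = pst.to_period('Q-JAN')
print(restored.index[0], restored.index[-1])   # 2011Q4 2014Q1
```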

## Processing Walmart's Financials

In [53]:
df = pd.read_csv("wmt.csv")
df

Unnamed: 0,Line Item,2017Q1,2017Q2,2017Q3,2017Q4,2018Q1
0,Revenue,115904,120854,118179,130936,117542
1,Expenses,86544,89485,87484,97743,87688
2,Profit,29360,31369,30695,33193,29854


In [54]:
df.set_index("Line Item", inplace=True)
df = df.T
df

Line Item,Revenue,Expenses,Profit
2017Q1,115904,86544,29360
2017Q2,120854,89485,31369
2017Q3,118179,87484,30695
2017Q4,130936,97743,33193
2018Q1,117542,87688,29854


In [55]:
df.index = pd.PeriodIndex(df.index, freq="Q-JAN")
df

Line Item,Revenue,Expenses,Profit
2017Q1,115904,86544,29360
2017Q2,120854,89485,31369
2017Q3,118179,87484,30695
2017Q4,130936,97743,33193
2018Q1,117542,87688,29854


In [56]:
df.index

PeriodIndex(['2017Q1', '2017Q2', '2017Q3', '2017Q4', '2018Q1'], dtype='period[Q-JAN]')
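The steps above depend on `wmt.csv`, which is not included here. The following sketch rebuilds the same frame from inline text so it can be run on its own. The CSV content is reconstructed from the table displayed above, not taken from the real file, and omits the unnamed index column.

```python
import io
import pandas as pd

# Inline stand-in for wmt.csv, reconstructed from the displayed table.
csv_text = """Line Item,2017Q1,2017Q2,2017Q3,2017Q4,2018Q1
Revenue,115904,120854,118179,130936,117542
Expenses,86544,89485,87484,97743,87688
Profit,29360,31369,30695,33193,29854
"""

# Line items become columns, quarters become rows.
df = pd.read_csv(io.StringIO(csv_text)).set_index('Line Item').T

# Turn the string labels into fiscal quarters ending in January.
df.index = pd.PeriodIndex(df.index, freq='Q-JAN')

print(df.index[0].start_time.date())                          # 2016-02-01
print((df['Revenue'] - df['Expenses'] == df['Profit']).all())   # True
```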

In [57]:
df.index[0].start_time

Timestamp('2016-02-01 00:00:00')

### Add start and end date columns to the DataFrame

In [58]:
df["Start Date"]=df.index.map(lambda x: x.start_time)
df

Line Item,Revenue,Expenses,Profit,Start Date
2017Q1,115904,86544,29360,2016-02-01
2017Q2,120854,89485,31369,2016-05-01
2017Q3,118179,87484,30695,2016-08-01
2017Q4,130936,97743,33193,2016-11-01
2018Q1,117542,87688,29854,2017-02-01


In [59]:
df['End Date']=df.index.map(lambda x: x.end_time)
df

Line Item,Revenue,Expenses,Profit,Start Date,End Date
2017Q1,115904,86544,29360,2016-02-01,2016-04-30 23:59:59.999999999
2017Q2,120854,89485,31369,2016-05-01,2016-07-31 23:59:59.999999999
2017Q3,118179,87484,30695,2016-08-01,2016-10-31 23:59:59.999999999
2017Q4,130936,97743,33193,2016-11-01,2017-01-31 23:59:59.999999999
2018Q1,117542,87688,29854,2017-02-01,2017-04-30 23:59:59.999999999
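As an alternative to mapping a lambda over the index, `PeriodIndex` exposes `start_time` and `end_time` as vectorized properties. The sketch below uses a cut-down frame with only the Revenue column from the table above, and additionally calls `normalize()` to drop the `23:59:59.999999999` time part from the end dates; that last step is an optional extra, not something the original notebook does.

```python
import pandas as pd

idx = pd.PeriodIndex(['2017Q1', '2017Q2', '2017Q3', '2017Q4', '2018Q1'],
                     freq='Q-JAN')
df = pd.DataFrame({'Revenue': [115904, 120854, 118179, 130936, 117542]},
                  index=idx)

# Vectorized equivalents of index.map(lambda x: x.start_time / x.end_time).
df['Start Date'] = df.index.start_time
df['End Date'] = df.index.end_time.normalize()   # keep the date, reset time to midnight

print(df)
```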
