
# BMIS-2542: Data Programming Essentials with Python 
##### Katz Graduate School of Business, Spring 2021


## Session 12: Time Series/Date Functionality


### Dates and Times in Python

In [1]:
from datetime import datetime

In [2]:
now = datetime.now()
now

datetime.datetime(2021, 4, 13, 16, 22, 55, 738141)

In [3]:
type(now)

datetime.datetime

In [4]:
now.year, now.month, now.day, now.hour, now.minute, now.second, now.microsecond

(2021, 4, 13, 16, 22, 55, 738141)

In [5]:
# you can create a new datetime object by setting the datetime parameters
dtObj = datetime(2021, 10, 29, 20, 15, 45)
print('Year:', dtObj.year)
print('Month:', dtObj.month)
print('Day:', dtObj.day)
print('Hour:', dtObj.hour)
print('Minute:', dtObj.minute)
print('Second:', dtObj.second)

Year: 2021
Month: 10
Day: 29
Hour: 20
Minute: 15
Second: 45


A `datetime` object stores both the date and the time, down to the microsecond. A `timedelta` represents the difference between two `datetime` objects.<br>
Its constructor accepts only `days`, `seconds`, `microseconds`, `milliseconds`, `minutes`, `hours`, and `weeks`; there are no month or year parameters because their lengths vary.
Read the documentation [here](https://docs.python.org/3/library/datetime.html#timedelta-objects).
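As a small illustration of those parameters, the sketch below builds a `timedelta` from weeks and hours and shows that the object normalizes whatever it is given into days, seconds, and microseconds. The variable name `span` is only illustrative.

```python
from datetime import timedelta

# weeks and hours are accepted as arguments, but the object stores
# only days, seconds, and microseconds internally
span = timedelta(weeks=1, hours=36)

print(span)                  # 8 days, 12:00:00
print(span.days)             # 8
print(span.seconds)          # 43200 (the 12 leftover hours)
print(span.total_seconds())  # 734400.0
```

Note that `seconds` holds only the part of the duration that does not fill a whole day, which is why `total_seconds()` is usually the safer choice for arithmetic.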

In [6]:
delta = datetime(2021, 10, 29) - datetime(2020, 10, 29, 8, 30)
delta

datetime.timedelta(days=364, seconds=55800)

In [7]:
delta.days, delta.seconds

(364, 55800)

In [8]:
delta.total_seconds()

31505400.0

You can add a `timedelta` to, or subtract one from, a `datetime` object to obtain a new, shifted object.

In [9]:
from datetime import timedelta

today = datetime.today()
print('Today:', today)

one_day = timedelta(days=1)
print('One Day:', one_day)

yesterday = today - one_day
print('Yesterday:', yesterday)

tomorrow = today + one_day
print('Tomorrow :', tomorrow)

print('Tomorrow - Yesterday:', tomorrow - yesterday)
print('Yesterday - Tomorrow:', yesterday - tomorrow)

Today: 2021-04-13 16:22:55.851256
One Day: 1 day, 0:00:00
Yesterday: 2021-04-12 16:22:55.851256
Tomorrow : 2021-04-14 16:22:55.851256
Tomorrow - Yesterday: 2 days, 0:00:00
Yesterday - Tomorrow: -2 days, 0:00:00


In [10]:
type(tomorrow-yesterday)

datetime.timedelta

### Converting Datetime Objects to Strings
You can format `datetime` objects (as well as Pandas `Timestamp` objects) with `str`, or with `strftime` by passing a format specification.

In [11]:
stamp = datetime(2021, 10, 15) # create a datetime object
str(stamp)

'2021-10-15 00:00:00'

In [12]:
strStamp = stamp.strftime('%Y/%m/%d')
strStamp

'2021/10/15'

In [13]:
datetime.now().strftime('%c')

'Tue Apr 13 16:22:55 2021'

#### <u><center>Datetime Format Specification</center></u>

|<p left>Type</p>|<p left>Description</p>| 
| :- | :- |
|<p center>%Y</p>|<p left>Four-digit Year</p>|
|<p center>%y</p>|<p left>Two-digit Year</p>|
|<p center>%m</p>|<p left>Two-digit Month(01, 12)</p>|
|<p center>%b</p>|<p left>Month as locale’s abbreviated name - Jan, Feb..., Dec (en_US)</p>|
|<p center>%B</p>|<p left>Month as locale’s full name - January, February,..., December (en_US)</p>|
|<p center>%d</p>|<p left>Two-digit Day (01, 31)</p>|
|<p center>%H</p>|<p left>Hour(24-Hour Clock- (00, 23))</p>|
|<p center>%I</p>|<p left>Hour(12-Hour Clock- (01, 12))</p>|
|<p center>%M</p>|<p left>Two-digit Minute(00, 59)</p>|
|<p center>%S</p>|<p left>Two-digit Second (00, 59); unlike the `time` module, `datetime` does not support leap seconds</p>|
|<p center>%p</p>|<p left>Locale’s equivalent of either AM or PM</p>|
|<p center>%w</p>|<p left>Weekday as Integer(0-Sunday, 6)</p>|
|<p center>%a</p>|<p left>Weekday as locale’s abbreviated name (Sun, Mon,..., Sat (en_US))</p>|
|<p center>%A</p>|<p left>Weekday as locale’s full name (Sunday, Monday,..., Saturday (en_US))</p>|
|<p center>%j</p>|<p left>Day of the year as a zero-padded decimal number (001, 002,..., 365)</p>|
|<p center>%U</p>|<p left>Week number of the year (00, 53). Sunday is considered the first day of the week, and days before the first Sunday are in week 0</p>|
|<p center>%W</p>|<p left>Week number of the year (00, 53). Monday is considered the first day of the week, and days before the first Monday are in week 0</p>|
|<p center>%z</p>|<p left>UTC time zone offset as +HHMM or -HHMM</p>|
|<p center>%F</p>|<p left>Shortcut for %Y-%m-%d (e.g., 2018-10-29); support depends on the platform's C library</p>|
|<p center>%D</p>|<p left>Shortcut for %m/%d/%y (e.g., 10/29/18); support depends on the platform's C library</p>|
|<p center>%c</p>|<p left>Full Date and Time</p>|

#### <mark>Exercise</mark>
- Format the current date and time to the following format: **10/29/18 14:15:23**
- Format the current date and time to the following format: **Monday 29-OCT-2018 2:15:23 PM**

In [14]:
datetime.now().strftime('%D %H:%M:%S')

'04/13/21 16:22:55'

In [15]:
datetime.now().strftime('%A %d-%b-%Y %I:%M:%S %p')

'Tuesday 13-Apr-2021 04:22:55 PM'
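The second target format uses an upper-case month and an hour without a leading zero, which the portable format codes alone do not produce. One possible way to get there, sketched here with a fixed example moment and assuming an English locale, is to combine `strftime` pieces with ordinary string operations. The names `hour12` and `month_abbr` are illustrative only.

```python
from datetime import datetime

# Fixed example moment taken from the exercise text
dt = datetime(2018, 10, 29, 14, 15, 23)

hour12 = dt.hour % 12 or 12             # 12-hour clock without zero padding
month_abbr = dt.strftime('%b').upper()  # 'Oct' -> 'OCT' (assumes an English locale)

# f-strings pass everything after the first colon to strftime
formatted = f"{dt:%A %d}-{month_abbr}-{dt:%Y} {hour12}:{dt:%M:%S %p}"
print(formatted)  # Monday 29-OCT-2018 2:15:23 PM
```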

### Converting Strings to Datetime Objects

You can use the same format specifications above to convert strings to `datetime` objects using `datetime.strptime`.

In [16]:
dateStr = '2021/10/29'
dateObj = datetime.strptime(dateStr, '%Y/%m/%d')
dateObj

datetime.datetime(2021, 10, 29, 0, 0)

In [17]:
dateStr2 = '7/6/2021'
datetime.strptime(dateStr2, '%m/%d/%Y')

datetime.datetime(2021, 7, 6, 0, 0)

### Using Pandas to Handle Datetime

The `to_datetime` function in Pandas can parse many different kinds of date representations.<br>
The basic representation of a datetime in Pandas is a `Timestamp`. Read more about Pandas date functionality [here](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html).

In [18]:
import pandas as pd
import numpy as np

In [19]:
ts = pd.Timestamp('3/2/2021 10:15')
ts

Timestamp('2021-03-02 10:15:00')

In [20]:
ddf = pd.to_datetime('3/2/2021 8:15:32', dayfirst=True) # February 3, 2021
ddf

Timestamp('2021-02-03 08:15:32')

In [21]:
print('year:',ddf.year, 'month:', ddf.month, 'day:', ddf.day)

year: 2021 month: 2 day: 3


In [22]:
d = pd.to_datetime('21/3/2 8:15:32 AM', yearfirst= True)
print('year:',d.year, 'month:', d.month, 'day:', d.day)

year: 2021 month: 3 day: 2


In [23]:
d = pd.to_datetime('21/3/2 8:15:32 AM', yearfirst= True, dayfirst=True)
print('year:',d.year, 'month:', d.month, 'day:', d.day)

year: 2021 month: 2 day: 3
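When the layout of the input is known, passing an explicit `format` avoids the day/month guessing that `dayfirst` and `yearfirst` try to steer. The sketch below is a minimal example with made-up input strings; `errors='coerce'` is used so that an unparseable entry becomes `NaT` rather than raising.

```python
import pandas as pd

# Hypothetical input: day/month/year strings plus one bad entry
raw = ['03/02/2021', '15/10/2021', 'not a date']

# An explicit format removes the day/month ambiguity;
# errors='coerce' turns unparseable entries into NaT instead of raising
parsed = pd.to_datetime(raw, format='%d/%m/%Y', errors='coerce')

print(parsed[0])           # 2021-02-03 00:00:00
print(parsed[1].month)     # 10
print(pd.isna(parsed[2]))  # True
```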


In [24]:
# date offsets in Pandas (import only the offset classes used below)
from pandas.tseries.offsets import DateOffset, Day

myDate = datetime(2021, 7, 15, 9, 15)
shifted = myDate + DateOffset(months=1, days=5, hours=2) 
print('Shifted Date:', shifted)

Shifted Date: 2021-08-20 11:15:00


In [25]:
datetime.now() + 3 * Day()

Timestamp('2021-04-16 16:22:56.676370')
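A `DateOffset` is calendar-aware, whereas a `timedelta` is a fixed duration. The following sketch, using an end-of-month date chosen purely for illustration, shows how the two can give different answers.

```python
from datetime import timedelta
import pandas as pd

start = pd.Timestamp('2021-01-31')

# Calendar-aware: one month later, clipped to the last valid day of February
by_offset = start + pd.DateOffset(months=1)

# Fixed duration: exactly 30 days later
by_delta = start + timedelta(days=30)

print(by_offset)  # 2021-02-28 00:00:00
print(by_delta)   # 2021-03-02 00:00:00
```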

The basic time series object in Pandas is a `Series` indexed by timestamps.

In [26]:
dates = [datetime(2021,10,1), datetime(2021,10,2), datetime(2021,10,3), datetime(2021,10,4), datetime(2021,10,5)]
timeSeries = pd.Series(np.random.randn(5), index = dates)
timeSeries

2021-10-01    1.035687
2021-10-02   -0.326204
2021-10-03   -0.806830
2021-10-04   -0.834590
2021-10-05    0.344713
dtype: float64

In [27]:
timeSeries.index

DatetimeIndex(['2021-10-01', '2021-10-02', '2021-10-03', '2021-10-04',
               '2021-10-05'],
              dtype='datetime64[ns]', freq=None)

In [28]:
timeSeries[1:4]

2021-10-02   -0.326204
2021-10-03   -0.806830
2021-10-04   -0.834590
dtype: float64

In [29]:
timeStampAtZero = timeSeries.index[0]
timeStampAtZero

Timestamp('2021-10-01 00:00:00')

`Timestamp` objects can be used anywhere `datetime` objects are used.

In [30]:
diff = timeStampAtZero - datetime(2021,9,30)
diff

Timedelta('1 days 00:00:00')
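The interchangeability follows from `Timestamp` being a subclass of `datetime`. A short check, with `ts` as an illustrative name:

```python
from datetime import datetime
import pandas as pd

ts = pd.Timestamp('2021-10-01')

print(isinstance(ts, datetime))     # True
print(ts == datetime(2021, 10, 1))  # True
print(ts.strftime('%Y/%m/%d'))      # 2021/10/01

# Convert explicitly when a plain datetime is required
print(ts.to_pydatetime())           # 2021-10-01 00:00:00
```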

### Creating Date Ranges
Read the documentation [here](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.date_range.html).

In [31]:
dtRange = pd.date_range('10/1/2021', '10/31/2021')
dtRange

DatetimeIndex(['2021-10-01', '2021-10-02', '2021-10-03', '2021-10-04',
               '2021-10-05', '2021-10-06', '2021-10-07', '2021-10-08',
               '2021-10-09', '2021-10-10', '2021-10-11', '2021-10-12',
               '2021-10-13', '2021-10-14', '2021-10-15', '2021-10-16',
               '2021-10-17', '2021-10-18', '2021-10-19', '2021-10-20',
               '2021-10-21', '2021-10-22', '2021-10-23', '2021-10-24',
               '2021-10-25', '2021-10-26', '2021-10-27', '2021-10-28',
               '2021-10-29', '2021-10-30', '2021-10-31'],
              dtype='datetime64[ns]', freq='D')

By default, `date_range` generates daily timestamps. If you pass only a start date or an end date, you need to pass a number of periods to generate.<br>
Find the time series frequency aliases [here](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#timeseries-offset-aliases).

In [32]:
dateRange = pd.date_range(start='2021/10/1', periods = 31, freq = 'D')
dateRange

DatetimeIndex(['2021-10-01', '2021-10-02', '2021-10-03', '2021-10-04',
               '2021-10-05', '2021-10-06', '2021-10-07', '2021-10-08',
               '2021-10-09', '2021-10-10', '2021-10-11', '2021-10-12',
               '2021-10-13', '2021-10-14', '2021-10-15', '2021-10-16',
               '2021-10-17', '2021-10-18', '2021-10-19', '2021-10-20',
               '2021-10-21', '2021-10-22', '2021-10-23', '2021-10-24',
               '2021-10-25', '2021-10-26', '2021-10-27', '2021-10-28',
               '2021-10-29', '2021-10-30', '2021-10-31'],
              dtype='datetime64[ns]', freq='D')

In [33]:
dateRange = pd.date_range(end='2021/10/31', periods = 31, freq = 'D')
dateRange

DatetimeIndex(['2021-10-01', '2021-10-02', '2021-10-03', '2021-10-04',
               '2021-10-05', '2021-10-06', '2021-10-07', '2021-10-08',
               '2021-10-09', '2021-10-10', '2021-10-11', '2021-10-12',
               '2021-10-13', '2021-10-14', '2021-10-15', '2021-10-16',
               '2021-10-17', '2021-10-18', '2021-10-19', '2021-10-20',
               '2021-10-21', '2021-10-22', '2021-10-23', '2021-10-24',
               '2021-10-25', '2021-10-26', '2021-10-27', '2021-10-28',
               '2021-10-29', '2021-10-30', '2021-10-31'],
              dtype='datetime64[ns]', freq='D')

In [34]:
hourOffset = pd.date_range('2021/10/1','2021/10/3 23:59', freq = '4H') # offset 4 hours
hourOffset

DatetimeIndex(['2021-10-01 00:00:00', '2021-10-01 04:00:00',
               '2021-10-01 08:00:00', '2021-10-01 12:00:00',
               '2021-10-01 16:00:00', '2021-10-01 20:00:00',
               '2021-10-02 00:00:00', '2021-10-02 04:00:00',
               '2021-10-02 08:00:00', '2021-10-02 12:00:00',
               '2021-10-02 16:00:00', '2021-10-02 20:00:00',
               '2021-10-03 00:00:00', '2021-10-03 04:00:00',
               '2021-10-03 08:00:00', '2021-10-03 12:00:00',
               '2021-10-03 16:00:00', '2021-10-03 20:00:00'],
              dtype='datetime64[ns]', freq='4H')

In [35]:
timeOffset = pd.date_range('2021/10/1', periods=10, freq = '1h30min') # offset 1 hour 30 minutes
timeOffset

DatetimeIndex(['2021-10-01 00:00:00', '2021-10-01 01:30:00',
               '2021-10-01 03:00:00', '2021-10-01 04:30:00',
               '2021-10-01 06:00:00', '2021-10-01 07:30:00',
               '2021-10-01 09:00:00', '2021-10-01 10:30:00',
               '2021-10-01 12:00:00', '2021-10-01 13:30:00'],
              dtype='datetime64[ns]', freq='90T')

In [36]:
wom = pd.date_range('2021/10/1', periods=10, freq = 'WOM-3FRI') # Week of Month (WOM) freq class: third Friday of each month
list(wom)

[Timestamp('2021-10-15 00:00:00', freq='WOM-3FRI'),
 Timestamp('2021-11-19 00:00:00', freq='WOM-3FRI'),
 Timestamp('2021-12-17 00:00:00', freq='WOM-3FRI'),
 Timestamp('2022-01-21 00:00:00', freq='WOM-3FRI'),
 Timestamp('2022-02-18 00:00:00', freq='WOM-3FRI'),
 Timestamp('2022-03-18 00:00:00', freq='WOM-3FRI'),
 Timestamp('2022-04-15 00:00:00', freq='WOM-3FRI'),
 Timestamp('2022-05-20 00:00:00', freq='WOM-3FRI'),
 Timestamp('2022-06-17 00:00:00', freq='WOM-3FRI'),
 Timestamp('2022-07-15 00:00:00', freq='WOM-3FRI')]

#### <mark>Exercise</mark>
Create a pandas date range that lists the first five business days, starting with 10/11/2021.

### Indexing, Selection, and Subsetting

In [37]:
timeSeries

2021-10-01    1.035687
2021-10-02   -0.326204
2021-10-03   -0.806830
2021-10-04   -0.834590
2021-10-05    0.344713
dtype: float64

In [38]:
stamp = timeSeries.index[2]
timeSeries[stamp]

-0.8068295569112776

In [39]:
# strings that can be interpreted as a date can also be passed 
timeSeries['10/1/2021']

1.0356873603132044

In [40]:
# For longer time series, a year or only a year and month can be passed to easily select slices of data
longTimeSeries = pd.Series(np.random.randn(1000), index = pd.date_range('1/1/2000', periods=1000))
longTimeSeries

2000-01-01    0.364828
2000-01-02   -0.649125
2000-01-03   -0.967711
2000-01-04    0.339074
2000-01-05   -0.999000
                ...   
2002-09-22    0.786321
2002-09-23   -2.206360
2002-09-24    0.754466
2002-09-25   -0.039906
2002-09-26    1.284039
Freq: D, Length: 1000, dtype: float64

In [41]:
longTimeSeries['2001']

2001-01-01   -0.597690
2001-01-02    0.647959
2001-01-03   -0.210529
2001-01-04    0.080588
2001-01-05   -0.084076
                ...   
2001-12-27    0.097605
2001-12-28    0.649055
2001-12-29    0.007797
2001-12-30    2.334066
2001-12-31   -0.237285
Freq: D, Length: 365, dtype: float64

In [42]:
longTimeSeries['2001/05']

2001-05-01   -0.550355
2001-05-02    1.264741
2001-05-03   -0.822532
2001-05-04   -0.025806
2001-05-05   -1.353125
2001-05-06    0.525954
2001-05-07   -0.403505
2001-05-08   -0.468145
2001-05-09   -0.917277
2001-05-10    0.486866
2001-05-11   -1.304256
2001-05-12    0.693405
2001-05-13    0.991960
2001-05-14    1.244270
2001-05-15   -0.177700
2001-05-16    0.849836
2001-05-17    0.811466
2001-05-18    0.976004
2001-05-19   -0.349553
2001-05-20    0.438851
2001-05-21    0.062159
2001-05-22   -0.897194
2001-05-23    0.589350
2001-05-24   -0.892175
2001-05-25   -0.192985
2001-05-26    1.691870
2001-05-27    0.612002
2001-05-28   -1.408061
2001-05-29    0.585652
2001-05-30    0.419790
2001-05-31   -0.958926
Freq: D, dtype: float64

In [43]:
longTimeSeries[datetime(2002,9,15):]

2002-09-15   -0.489834
2002-09-16    0.859476
2002-09-17   -0.117183
2002-09-18   -0.525697
2002-09-19    0.267902
2002-09-20   -0.002168
2002-09-21    0.142848
2002-09-22    0.786321
2002-09-23   -2.206360
2002-09-24    0.754466
2002-09-25   -0.039906
2002-09-26    1.284039
Freq: D, dtype: float64

In [44]:
longTimeSeries['9/1/2002': '4/25/2005'] # can slice with timestamps not contained in a time series

2002-09-01    0.918337
2002-09-02    1.276270
2002-09-03    0.459975
2002-09-04    1.472905
2002-09-05    0.188459
2002-09-06    0.188044
2002-09-07   -0.137928
2002-09-08    0.623167
2002-09-09   -0.504311
2002-09-10   -0.381207
2002-09-11   -1.838875
2002-09-12   -1.156874
2002-09-13    0.483582
2002-09-14   -0.166875
2002-09-15   -0.489834
2002-09-16    0.859476
2002-09-17   -0.117183
2002-09-18   -0.525697
2002-09-19    0.267902
2002-09-20   -0.002168
2002-09-21    0.142848
2002-09-22    0.786321
2002-09-23   -2.206360
2002-09-24    0.754466
2002-09-25   -0.039906
2002-09-26    1.284039
Freq: D, dtype: float64

In [45]:
# Alternative Slicing Method: Truncate all rows after '9/20/2000'
longTimeSeries.truncate(after = '9/20/2000')

2000-01-01    0.364828
2000-01-02   -0.649125
2000-01-03   -0.967711
2000-01-04    0.339074
2000-01-05   -0.999000
                ...   
2000-09-16    0.642074
2000-09-17   -0.093663
2000-09-18    0.329965
2000-09-19   -0.457523
2000-09-20    1.432539
Freq: D, Length: 264, dtype: float64

In [46]:
# Alternative Slicing Method: Truncate all rows before '9/20/2002'
longTimeSeries.truncate(before = '9/20/2002')

2002-09-20   -0.002168
2002-09-21    0.142848
2002-09-22    0.786321
2002-09-23   -2.206360
2002-09-24    0.754466
2002-09-25   -0.039906
2002-09-26    1.284039
Freq: D, dtype: float64

In [47]:
# Slicing, selection, or truncation does not modify the original series
len(longTimeSeries)

1000

<mark>All the above holds true for **`DataFrames`** as well.</mark>

In [48]:
dateRange = pd.date_range('2021/10/1', periods = 10, freq = 'D')
dateRange

DatetimeIndex(['2021-10-01', '2021-10-02', '2021-10-03', '2021-10-04',
               '2021-10-05', '2021-10-06', '2021-10-07', '2021-10-08',
               '2021-10-09', '2021-10-10'],
              dtype='datetime64[ns]', freq='D')

In [49]:
np.random.randn(2, 4) # 2 by 4 numpy array

array([[ 1.34561224,  0.01652205, -0.53725014, -0.72341627],
       [ 0.47672303, -0.8449601 ,  0.89353612,  0.58671335]])

In [50]:
dates = pd.date_range('10/1/2021', periods=10, freq='D') 
dfWeather = pd.DataFrame(np.random.randn(10,4), index = dates, columns = ['Temperature', 'Pressure', 'Humidity', 'Wind'])
dfWeather

Unnamed: 0,Temperature,Pressure,Humidity,Wind
2021-10-01,-0.486667,-0.235083,-0.702337,0.479306
2021-10-02,0.713496,0.796711,-0.209316,-0.400832
2021-10-03,0.380524,0.707934,-0.002814,0.005761
2021-10-04,0.044418,-0.768945,0.405334,-1.657299
2021-10-05,-0.416929,0.768559,-0.029983,-0.997698
2021-10-06,-0.299442,0.744559,0.330766,0.264393
2021-10-07,-1.1038,-0.200867,-0.898696,0.907302
2021-10-08,0.12299,0.11745,0.239709,2.172569
2021-10-09,0.80802,0.349285,-0.115654,-0.265259
2021-10-10,-0.534579,1.414922,0.586525,1.323948


#### <mark>Exercise</mark>
From `dfWeather`, select the Temperature and Humidity data from '10/3/2021' to '10/8/2021' (inclusive), 
 - using general slicing 
 - using `dfWeather.truncate`
 - using `dfWeather.loc`

In [54]:
dfWeather.truncate(before = '10/3/2021', after = '10/8/2021')

Unnamed: 0,Temperature,Pressure,Humidity,Wind
2021-10-03,0.380524,0.707934,-0.002814,0.005761
2021-10-04,0.044418,-0.768945,0.405334,-1.657299
2021-10-05,-0.416929,0.768559,-0.029983,-0.997698
2021-10-06,-0.299442,0.744559,0.330766,0.264393
2021-10-07,-1.1038,-0.200867,-0.898696,0.907302
2021-10-08,0.12299,0.11745,0.239709,2.172569


In [55]:
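# .loc is indexed with square brackets, not called with parentheses
dfWeather.loc['10/3/2021':'10/8/2021', ['Temperature', 'Humidity']]

Unnamed: 0,Temperature,Humidity
2021-10-03,0.380524,-0.002814
2021-10-04,0.044418,0.405334
2021-10-05,-0.416929,-0.029983
2021-10-06,-0.299442,0.330766
2021-10-07,-1.1038,-0.898696
2021-10-08,0.12299,0.239709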

### Shifting/Lagging
Sometimes we need to shift or lag the values in a time series back and forward in time.<br>
Both `Series` and `DataFrame` have a `shift` method that can perform shifts forward or backward, without modifying the index.

In [None]:
tsShiftLag = pd.Series(np.random.randn(4), index = pd.date_range('1/1/2021', periods=4, freq='M'))
tsShiftLag

In [None]:
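tsShiftLag.shift(1) # lag: each timestamp receives the value from the previous timestamp

In [None]:
tsShiftLag.shift(-1) # lead: each timestamp receives the value from the next timestamp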

A plain shift pushes values past the end of the index and discards them. If you pass `freq`, the timestamps themselves are advanced instead: the index changes and no data is lost.

In [None]:
tsShiftLag

In [None]:
tsShiftLag.shift(2, freq='M')

In [None]:
tsShiftLag.shift(2, freq='D')
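Because the cells above use random values, the effect of each call can be hard to see. The sketch below repeats the idea with small fixed numbers and a daily index; the series `s` and its values are made up for illustration.

```python
import pandas as pd

idx = pd.date_range('2021-01-01', periods=4, freq='D')
s = pd.Series([10.0, 20.0, 30.0, 40.0], index=idx)

lagged = s.shift(1)           # values move one step later; first slot becomes NaN
led = s.shift(-1)             # values move one step earlier; last slot becomes NaN
moved = s.shift(2, freq='D')  # index moves two days later; values are untouched

print(lagged.tolist())  # [nan, 10.0, 20.0, 30.0]
print(led.tolist())     # [20.0, 30.0, 40.0, nan]
print(moved.index[0])   # 2021-01-03 00:00:00
print(moved.tolist())   # [10.0, 20.0, 30.0, 40.0]
```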

#### <mark>Exercise</mark>
`dfPrice` is a `DataFrame` that contains the Closing Price for a particular stock for the month of January, 2021.

- Create the new column **Previous_Close** that displays the closing price of the stock for the previous day
- Create the new column **% Change** that displays the percent change in the close price for each day
- Display `dfPrice`

In [None]:
# 31 random closing prices, one per day of January 2021
dfPrice = pd.DataFrame(np.random.uniform(low=100, high=300, size=31),
                       index = pd.date_range('1/1/2021', periods = 31, freq = 'D'),
                       columns = ['Close_Price'])
dfPrice

### Periods and Period Arithmetic
If you want both the start time and the end time of a particular time span in a single object, you can use a `Period`. Periods represent timespans such as days, months, quarters, or years.

Find the time series **frequency aliases** [here](https://pandas.pydata.org/pandas-docs/stable/timeseries.html#timeseries-offset-aliases).

In [None]:
timeSpan = pd.Period('10/2021', freq = 'M')
timeSpan

In [None]:
timeSpan.start_time, timeSpan.end_time

In [None]:
periodJanToDec2021 = pd.Period(2021, freq='A-DEC') # full timespan from January 1, 2021, to December 31, 2021
periodJanToDec2021

In [None]:
periodJanToDec2021.start_time, periodJanToDec2021.end_time

In [None]:
testTimestamp = pd.Timestamp('10/15/2021')
testTimestamp

<mark>**Exercise**</mark><br>
Write an expression that determines whether the `testTimestamp` falls within the period `timeSpan`.

Adding an integer to a period, or subtracting one from it, shifts the period by that many units of its own frequency. Arithmetic is not allowed between periods with different frequencies.

In [None]:
periodJanToDec2021

In [None]:
periodJanToDec2021 + 1

In [None]:
periodJanToDec2021 - 2

Taking the difference of Period instances with the same frequency will return the number of frequency units between them.

In [None]:
pd.Period(2021, freq='A-DEC') - pd.Period(2017, freq='A-DEC')
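The same rules can be checked with monthly periods. This is a minimal sketch: the names are illustrative, and it assumes a reasonably recent pandas, where the difference of two periods is an offset object whose `n` attribute holds the count.

```python
import pandas as pd

oct21 = pd.Period('2021-10', freq='M')
jul21 = pd.Period('2021-07', freq='M')

print(oct21 + 1)  # 2021-11
print(oct21 - 2)  # 2021-08

# Assumption: recent pandas versions return an offset object here;
# its `n` attribute is the number of monthly steps between the periods
diff = oct21 - jul21
print(diff.n)     # 3

# Mixing frequencies is rejected; the error raised is a ValueError subclass
try:
    oct21 - pd.Period('2021-10-15', freq='D')
except ValueError as exc:
    print('Rejected:', type(exc).__name__)
```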

In [None]:
periodRange = pd.period_range('1/1/2021', '6/30/2021', freq = 'M')
periodRange

In [None]:
pd.Series(np.random.randn(6), index = periodRange)

In [None]:
values = ['2019Q3', '2020Q2', '2021Q1']
periodIndex = pd.PeriodIndex(values, freq = 'Q-DEC')
periodIndex

In [None]:
pd.Series(np.random.randn(3), index = periodIndex)

#### Period Frequency Conversion
Periods and `PeriodIndex` objects can be converted to another frequency using the `asfreq` method.

In [None]:
# convert an annual period into a monthly period at the start or end of the year.
p = pd.Period('2021', freq='A-DEC')
print(p.start_time, p.end_time)
p

In [None]:
convertedP = p.asfreq('M', how = 'end')
print(convertedP.start_time, convertedP.end_time)
convertedP

In [None]:
#for a fiscal year ending on a month other than December, the corresponding monthly subperiods are different.
p = pd.Period('2021', freq='A-JUN')
print(p.start_time, p.end_time)
p

In [None]:
convertedP = p.asfreq('M', how = 'start')
print(convertedP.start_time, convertedP.end_time)
convertedP

#### Quarterly Period Frequencies
Pandas supports 12 quarterly frequencies as `Q-JAN` through `Q-DEC`.

In [None]:
p = pd.Period('2021Q4', freq='Q-JAN')
p.start_time, p.end_time

In [None]:
pDaily = p.asfreq('D', how='start')
pDaily.start_time, pDaily.end_time
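To see how the quarter anchor changes what "Q4" means, the sketch below converts the same quarter label to monthly periods under two anchors. The variable names are illustrative.

```python
import pandas as pd

# Fiscal year 2021 ending in January 2021: Q4 covers Nov 2020 through Jan 2021
fiscal_q4 = pd.Period('2021Q4', freq='Q-JAN')
# Calendar-year quarters: Q4 covers Oct through Dec 2021
calendar_q4 = pd.Period('2021Q4', freq='Q-DEC')

print(fiscal_q4.asfreq('M', how='start'))    # 2020-11
print(fiscal_q4.asfreq('M', how='end'))      # 2021-01
print(calendar_q4.asfreq('M', how='start'))  # 2021-10
print(calendar_q4.asfreq('M', how='end'))    # 2021-12
```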

#### Converting Timestamps to Periods

In [None]:
rng = pd.date_range('2021/1/1', periods=3, freq='M')
ts = pd.Series(np.random.randn(3), index=rng)
ts

In [None]:
periodTs = ts.to_period()
periodTs

In [None]:
rng = pd.date_range('1/29/2021', periods=6, freq='D')
ts2 = pd.Series(np.random.randn(6), index = rng)
ts2

In [None]:
ts2.to_period('M')

In [None]:
# converting back to timestamps
pts = ts2.to_period()
pts.to_timestamp()
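One consequence worth noticing is that converting daily timestamps to monthly periods produces duplicate index entries, and converting back does not restore the original days. A small sketch with fixed values (the names `daily`, `monthly`, and `back` are illustrative):

```python
import pandas as pd

rng = pd.date_range('2021-01-29', periods=6, freq='D')
daily = pd.Series(range(6), index=rng)

# Each day is mapped to the month that contains it
monthly = daily.to_period('M')
print(monthly.index.astype(str).tolist())
# ['2021-01', '2021-01', '2021-01', '2021-02', '2021-02', '2021-02']

# Converting back defaults to the start of each period,
# so the original day of the month is not recovered
back = monthly.to_timestamp()
print(back.index.unique().strftime('%Y-%m-%d').tolist())
# ['2021-01-01', '2021-02-01']
```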

### Resampling and Frequency Conversion
Resampling refers to the process of converting a time series from one frequency to another.<br>
`resample` is the workhorse function in Pandas for all frequency conversion. `resample()` is a time-based groupby operation.

In [None]:
rng = pd.date_range('1/1/2021', periods = 100, freq='D')
ts = pd.Series(np.random.randn(len(rng)), index=rng)
ts

In [None]:
ts.resample('M').mean()

In [None]:
ts.resample('M', kind = 'period').mean()
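To make the "time-based groupby" description concrete, the sketch below computes monthly means twice: once with `resample` and once with an ordinary `groupby` on the month of each timestamp. It uses fixed values and the `'MS'` (month start) alias, which is an illustrative choice rather than the alias used in the cells above.

```python
import numpy as np
import pandas as pd

rng = pd.date_range('2021-01-01', periods=90, freq='D')
s = pd.Series(np.arange(90, dtype=float), index=rng)

# 'MS' labels each monthly bin by its first day
by_resample = s.resample('MS').mean()

# The same grouping expressed as a groupby on the month of each timestamp
by_groupby = s.groupby(s.index.to_period('M')).mean()

print(by_resample.tolist())                                # [15.0, 44.5, 74.0]
print(np.allclose(by_resample.values, by_groupby.values))  # True
```

The two results differ only in their index: `resample` returns timestamps, while the `groupby` version is indexed by periods.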

#### Downsampling
Aggregating data to a regular, lower frequency is a common time series task. The desired frequency defines bin edges that are used to slice the time series into pieces to aggregate.

In [None]:
rng = pd.date_range('1/1/2021', periods=12, freq = 'T')
ts = pd.Series(np.arange(12), index=rng)
ts

In [None]:
# let's group the data into five-minute chunks and take the sum of each group
ts.resample('5min').sum()

In [None]:
ts.resample('5min', closed = 'right').sum()
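The `closed` argument decides which edge of each bin is inclusive. The sketch below mirrors the twelve one-minute observations above and spells out the sums under each setting; the `'min'` alias and the variable names are illustrative choices.

```python
import pandas as pd

rng = pd.date_range('2021-01-01', periods=12, freq='min')
s = pd.Series(range(12), index=rng)

# Default for '5min': left edge inclusive, bins [00:00, 00:05), [00:05, 00:10), ...
left = s.resample('5min').sum()
# Right edge inclusive: bins (23:55, 00:00], (00:00, 00:05], ...
right = s.resample('5min', closed='right').sum()

print(left.tolist())   # [10, 35, 21]
print(right.tolist())  # [0, 15, 40, 11]
print(right.index[0])  # 2020-12-31 23:55:00
```

With `closed='right'` the observation at 00:00 falls into a bin that starts on the previous day, which is why an extra row labeled 23:55 appears.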