# Workshop Lecture 8, Exercise 3
In this exercise, you'll work with selected macroeconomic variables for the United States, reported at monthly frequency and obtained from FRED. The data set starts in 1948 and contains observations for a total of 864 months.
1. Load the data from the file FRED_monthly.csv located in the data/ folder. Print the first 10 observations to get an idea of what the data looks like.
2. Keep only the columns Year, Month, CPI, and UNRATE. Moreover, perform this analysis only on observations prior to 1970 and drop the rest.
3. Since pandas has great support for time series data, we want to create an index based on observation dates.
- To this end, use to_datetime() to convert the Year and Month columns into a date.

Hint: to_datetime() requires information on Year/Month/Day, so you need to create a Day column first and assign it a value of 1. You can then call to_datetime() with the argument df[['Year', 'Month', 'Day']] to create the corresponding date.
- Store the date information in the column Date. Delete the columns Year, Month and Day once you are done as these are no longer needed.
- Set the Date column as the index for the DataFrame using set_index().
4. The column CPI stores the consumer price index for the US. You may be more familiar with the concept of inflation, which is the percent change of the CPI relative to the previous period. Create a new column Inflation which contains the annual inflation in percent relative to the same month in the previous year by applying pct_change() to the column CPI.

Hints:

Since this is monthly data, you need to pass the arguments periods=12 to pct_change() to get annual percent changes.

You need to multiply the values returned by pct_change() by 100 to get percent values.
5. Compute the average unemployment rate (column UNRATE) over the whole sample period. Create a new column UNRATE_HIGH that contains an indicator whenever the unemployment rate is above its average value (“high unemployment period”).
- How many observations fall into the high- and the low-unemployment periods?
- What is the average unemployment rate in the high- and low-unemployment periods?
6. Compute the average inflation rate for high- and low-unemployment periods. Is there any difference?
7. Use resample() to aggregate the inflation data to annual frequency and compute the average inflation within each calendar year. Which are the three years with the highest inflation rates in the sample?

Hint: Use the resampling rule 'YE' when calling resample().

In [74]:
import pandas as pd

# Load the FRED monthly data set
DATA_PATH = '/Users/lilapfageraas/Downloads/nhh/tech2/TECH2-H24/data'
# read_csv() already returns a DataFrame, so no extra conversion is needed
df = pd.read_csv(f'{DATA_PATH}/FRED_monthly.csv')
df.head(10)

Unnamed: 0,Year,Month,CPI,UNRATE,FEDFUNDS,REALRATE,LFPART
0,1948,1,23.7,3.4,,,58.6
1,1948,2,23.7,3.8,,,58.9
2,1948,3,23.5,4.0,,,58.5
3,1948,4,23.8,3.9,,,59.0
4,1948,5,24.0,3.5,,,58.3
5,1948,6,24.2,3.6,,,59.2
6,1948,7,24.4,3.6,,,59.3
7,1948,8,24.4,3.9,,,58.9
8,1948,9,24.4,3.8,,,58.9
9,1948,10,24.3,3.7,,,58.7


In [75]:
df = df[['Year', 'Month', 'CPI', 'UNRATE']]
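# Keep observations before 1970; copy() avoids a SettingWithCopyWarning when adding columns later
df = df.loc[df['Year'] < 1970].copy()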
df.tail()

Unnamed: 0,Year,Month,CPI,UNRATE
259,1969,8,36.9,3.5
260,1969,9,37.1,3.7
261,1969,10,37.3,3.7
262,1969,11,37.5,3.5
263,1969,12,37.7,3.5


In [76]:
df['Day'] = 1

In [77]:
df['Date'] = pd.to_datetime(df[['Year', 'Month', 'Day']])
del df['Year']
del df['Month']
del df['Day']
df.head()

Unnamed: 0,CPI,UNRATE,Date
0,23.7,3.4,1948-01-01
1,23.7,3.8,1948-02-01
2,23.5,4.0,1948-03-01
3,23.8,3.9,1948-04-01
4,24.0,3.5,1948-05-01


In [78]:
df = df.set_index('Date')
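
The date index can also be built without permanently adding a Day column. The following is a minimal sketch on made-up rows (the `toy` DataFrame and the name `dates` are illustrative assumptions, not part of the exercise data); it relies on `to_datetime()` accepting a DataFrame with year, month and day columns, as described in the hint for task 3.

```python
import pandas as pd

# Toy rows standing in for the Year/Month columns of FRED_monthly.csv
toy = pd.DataFrame({
    'Year': [1948, 1948, 1949],
    'Month': [1, 2, 1],
    'CPI': [23.7, 23.7, 24.0],
})

# assign() returns a copy with a temporary Day column; toy itself is unchanged
dates = pd.to_datetime(toy[['Year', 'Month']].assign(Day=1))

# Drop the columns that are no longer needed and use the dates as the index
toy = toy.drop(columns=['Year', 'Month']).set_index(dates.rename('Date'))
print(toy)
```

This reaches the same end state as the cells above (a DataFrame indexed by Date with Year, Month and Day removed), just in fewer steps.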

In [79]:
df['Inflation'] = df['CPI'].pct_change(periods=12)*100
df.head(15)

Date,CPI,UNRATE,Inflation
1948-01-01,23.7,3.4,
1948-02-01,23.7,3.8,
1948-03-01,23.5,4.0,
1948-04-01,23.8,3.9,
1948-05-01,24.0,3.5,
1948-06-01,24.2,3.6,
1948-07-01,24.4,3.6,
1948-08-01,24.4,3.9,
1948-09-01,24.4,3.8,
1948-10-01,24.3,3.7,
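
The missing values in the Inflation column are expected: with periods=12, each observation is compared to the value 12 rows earlier, which does not exist for the first 12 months. A small sketch on a synthetic series (the values and the names `cpi`, `infl` and `manual` are assumptions for illustration, not FRED data) shows this and checks that `pct_change(periods=12)` matches the formula written out with `shift()`.

```python
import pandas as pd

# Synthetic monthly index: 24 months, rising by 1 each month starting at 100
idx = pd.date_range('2000-01-01', periods=24, freq='MS')
cpi = pd.Series(range(100, 124), index=idx, dtype=float)

# Annual percent change relative to the same month one year earlier
infl = cpi.pct_change(periods=12) * 100

# The same calculation written out explicitly
manual = (cpi / cpi.shift(12) - 1) * 100

print(infl.isna().sum())        # number of missing values at the start
print(infl.loc['2001-01-01'])   # (112 / 100 - 1) * 100
```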


In [80]:
avg_unrate = df['UNRATE'].mean()
avg_unrate

4.668560606060607

In [81]:
df['UNRATE_HIGH'] = df['UNRATE']>avg_unrate
df.head(15)

Date,CPI,UNRATE,Inflation,UNRATE_HIGH
1948-01-01,23.7,3.4,,False
1948-02-01,23.7,3.8,,False
1948-03-01,23.5,4.0,,False
1948-04-01,23.8,3.9,,False
1948-05-01,24.0,3.5,,False
1948-06-01,24.2,3.6,,False
1948-07-01,24.4,3.6,,False
1948-08-01,24.4,3.9,,False
1948-09-01,24.4,3.8,,False
1948-10-01,24.3,3.7,,False


In [82]:
df['UNRATE_HIGH'].value_counts()

UNRATE_HIGH
False    141
True     123
Name: count, dtype: int64

In [83]:
df.groupby('UNRATE_HIGH')['UNRATE'].mean()

UNRATE_HIGH
False    3.697872
True     5.781301
Name: UNRATE, dtype: float64
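
Both questions in task 5 (number of observations and average unemployment rate per group) can also be answered in one call by passing several aggregations to `agg()`. This is a sketch on made-up numbers; the `toy` DataFrame and the name `summary` are illustrative assumptions.

```python
import pandas as pd

# Made-up unemployment rates, only to illustrate the pattern
toy = pd.DataFrame({'UNRATE': [3.0, 4.0, 5.0, 8.0]})

# Indicator for above-average unemployment (the mean here is 5.0)
toy['UNRATE_HIGH'] = toy['UNRATE'] > toy['UNRATE'].mean()

# One row per group, one column per aggregation
summary = toy.groupby('UNRATE_HIGH')['UNRATE'].agg(['count', 'mean'])
print(summary)
```

Note that the comparison is strict, so an observation exactly equal to the average is counted as a low-unemployment period.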

In [84]:
df.groupby('UNRATE_HIGH')['Inflation'].mean()

UNRATE_HIGH
False    3.110456
True     0.942056
Name: Inflation, dtype: float64

In [88]:
df_year = df.resample('YE').mean()
df_year['Inflation'].sort_values(ascending=False).head(3)

Date
1951-12-31    7.987456
1969-12-31    5.432647
1968-12-31    4.241319
Name: Inflation, dtype: float64
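
The resampling rule 'YE' is only available in pandas 2.2 and later. As an alternative sketch that does not depend on this alias, the annual averages can be computed by grouping on the year of the date index, and `nlargest()` picks the top values directly. The series below is synthetic and the names `infl`, `annual` and `top2` are assumptions for illustration.

```python
import pandas as pd

# Synthetic monthly series covering three calendar years (not FRED data)
idx = pd.date_range('2000-01-01', periods=36, freq='MS')
infl = pd.Series([1.0] * 12 + [3.0] * 12 + [2.0] * 12, index=idx, name='Inflation')

# Average within each calendar year; the resulting index holds the year as an integer
annual = infl.groupby(infl.index.year).mean()

# Years with the highest average values, largest first
top2 = annual.nlargest(2)
print(top2)
```

Unlike resample('YE'), which labels each year by its last day (for example 1951-12-31), this variant labels the result by the plain year.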