## U.S. Bureau of Labor Statistics - CPI Analysis
#### Eric Bottinelli

### 1. Retrieve data via BLS API v2

**Documentation**

- https://www.bls.gov/developers/api_python.htm
- https://data.bls.gov/cgi-bin/surveymost?cu

**Packages to install**

- Prettytable ('pip install prettytable')

**API Series ID**

Consumer Price Index for All Urban Consumers (CPI-U)
- *All items in U.S. city average, all urban consumers, not seasonally adjusted*: CUUR0000SA0
- *All items less food and energy in U.S. city average, all urban consumers, not seasonally adjusted*: CUUR0000SA0L1E
- *Food and beverages in U.S. city average, all urban consumers, not seasonally adjusted*: CUUR0000SAF
- *Food at home in U.S. city average, all urban consumers, not seasonally adjusted*: CUUR0000SAF11
- *Food away from home in U.S. city average, all urban consumers, not seasonally adjusted*: CUUR0000SEFV
- *Energy in U.S. city average, all urban consumers, not seasonally adjusted*: CUUR0000SA0E
- *Housing in U.S. city average, all urban consumers, not seasonally adjusted*: CUUR0000SAH
- *Shelter in U.S. city average, all urban consumers, not seasonally adjusted*: CUUR0000SAH1
((https://www.bls.gov/cpi/factsheets/owners-equivalent-rent-and-rent.htm))

**Calculate special CPI**

Occasionally, a user wishes to estimate a price change that is not published by BLS. For instance, suppose a user would like a CPI series for ‘services less energy services and shelter’. This can be done by estimating a special index, in this case, ‘services less energy services and shelter’.
[BLS Doc](https://www.bls.gov/cpi/factsheets/constructing-special-cpis.htm)

If SEEB01 -> CUUR0000SEEB01

Cost weight is just a sum of all the items

If I add all the values to calculate the services less energy services and shelter, it becomes a lot of data. Explore different solution (e.g. remove goods from core CPI)

**Supercore CPI**

"Fed Chair Jerome Powell cited a specific category of inflation—inflation in core services other than housing—as being perhaps “the most important category for understanding the future evolution of core inflation.” The financial press has termed this category “supercore” inflation" ([FED of St. Louis](https://www.stlouisfed.org/on-the-economy/2024/may/measuring-inflation-headline-core-supercore-services))

In [3]:
import os
import requests
import json
import prettytable
import pandas as pd
from datetime import datetime

folder_name = 'CPI_Data'

In [7]:
if not os.path.exists(folder_name):
    os.makedirs(folder_name)

current_date = datetime.now()
current_year = current_date.year
last_year = current_year - 1

headers = {'Content-type': 'application/json'}
series_ids = ['CUUR0000SA0', 'CUUR0000SA0L1E', 'CUUR0000SAF', 'CUUR0000SAF11', 'CUUR0000SEFV', 'CUUR0000SA0E', 'CUUR0000SAH1']
data = json.dumps({"seriesid": series_ids, "startyear": str(last_year), "endyear": str(current_year)})
response = requests.post('https://api.bls.gov/publicAPI/v2/timeseries/data/', data=data, headers=headers)
json_data = json.loads(response.text)

series_names = {
    'CUUR0000SA0': 'All_Items',
    'CUUR0000SA0L1E': 'All_Items_Less_Food_Energy',
    'CUUR0000SAF': 'Food_Beverages',
    'CUUR0000SAF11': 'Food_At_Home',
    'CUUR0000SEFV': 'Food_Away_From_Home',
    'CUUR0000SA0E': 'Energy',
    'CUUR0000SAH1': 'Shelter'
}

all_data = []
for series in json_data['Results']['series']:
    rows = []
    for item in series['data']:
        footnotes = "".join([footnote['text'] + ',' for footnote in item['footnotes'] if footnote]).rstrip(',')
        if 'M01' <= item['period'] <= 'M12':
            rows.append([series_names[series['seriesID']], item['year'], item['period'], item['value'], footnotes])

    # Create dataframe for current series
    df = pd.DataFrame(rows, columns=["series id", "year", "period", "value", "footnotes"])
    all_data.append(df)

complete_data = pd.concat(all_data)

csv_path = os.path.join(folder_name, 'CPI_data.csv')
complete_data.to_csv(csv_path, index=False)

In [4]:
complete_data = pd.read_csv("CPI_Data/CPI_data.csv")

In [15]:
df = complete_data.copy()
df['date'] = pd.to_datetime(df['year'].astype(str) + df['period'].str.replace('M', ''), format='%Y%m')
df['series id'] = df['series id'].astype(str)  # Convert series id to string
df['value'] = pd.to_numeric(df['value'], errors='coerce')  # Ensure value is numeric
df['footnotes'] = df['footnotes'].astype(str)  # Convert footnotes to string
df.drop(['year', 'period', 'footnotes'], axis=1, inplace=True)
df.rename(columns={'series id': 'id'}, inplace=True)
df = df[['id', 'date', 'value']]

In [16]:
df['MoM_change'] = df.groupby('id')['value'].pct_change()
df['YoY_change'] = df.groupby('id')['value'].pct_change(periods=12)

In [73]:
df.head()

Unnamed: 0,id,date,value,MoM_change,YoY_change
0,All_Items,2024-07-01,314.54,,
1,All_Items,2024-06-01,314.175,-0.12,
2,All_Items,2024-05-01,314.069,-0.03,
3,All_Items,2024-04-01,313.548,-0.17,
4,All_Items,2024-03-01,312.332,-0.39,


In [18]:
df2 = df.copy()
df2['Month-Year'] = df2['date'].dt.strftime('%b-%y')

# Define mappings for IDs to Categories and Weights
category_map = {
    'All_Items': 'Headline',
    'All_Items_Less_Food_Energy': 'Food + Energy'
}
weight_map = {
    'All_Items': '100%',
    'All_Items_Less_Food_Energy': '~80%'
}

# Map the categories
df2['Category'] = df2['id'].map(category_map)
df2['Weight'] = df2['id'].map(weight_map)

# Pivot the DataFrame
pivot_df = df2.pivot_table(index=['Category', 'Weight'], columns='Month-Year', values='MoM_change', aggfunc='first')

# Sort columns by converting them back to datetime and sorting in descending order
pivot_df = pivot_df[sorted(pivot_df.columns, key=lambda x: pd.to_datetime(x, format='%b-%y'), reverse=True)]

# Flatten the headers by removing the MultiIndex after pivot
pivot_df.columns.name = None  # Remove the aggregation name
pivot_df.reset_index(inplace=True)  # Make 'Category' and 'Weight' as regular columns



In [19]:
pivot_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 20 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   Category  2 non-null      object 
 1   Weight    2 non-null      object 
 2   Jun-24    2 non-null      float64
 3   May-24    2 non-null      float64
 4   Apr-24    2 non-null      float64
 5   Mar-24    2 non-null      float64
 6   Feb-24    2 non-null      float64
 7   Jan-24    2 non-null      float64
 8   Dec-23    2 non-null      float64
 9   Nov-23    2 non-null      float64
 10  Oct-23    2 non-null      float64
 11  Sep-23    2 non-null      float64
 12  Aug-23    2 non-null      float64
 13  Jul-23    2 non-null      float64
 14  Jun-23    2 non-null      float64
 15  May-23    2 non-null      float64
 16  Apr-23    2 non-null      float64
 17  Mar-23    2 non-null      float64
 18  Feb-23    2 non-null      float64
 19  Jan-23    2 non-null      float64
dtypes: float64(18), object(2)
memory usa

In [20]:
csv_path = os.path.join(folder_name, 'cleaned_CPI_data.csv')
pivot_df.to_csv(csv_path, index=False)