## U.S. Bureau of Labor Statistics - CPI Analysis
#### Eric Bottinelli

### 1. Retrieve data via BLS API v2

**Documentation**

- https://www.bls.gov/developers/api_python.htm
- https://data.bls.gov/cgi-bin/surveymost?cu

**Packages to install**

- prettytable (`pip install prettytable`)
- requests and pandas, if not already installed (`pip install requests pandas`)

**API Series ID**

Consumer Price Index for All Urban Consumers (CPI-U)
- *All items in U.S. city average, all urban consumers, not seasonally adjusted*: CUUR0000SA0
- *All items less food and energy in U.S. city average, all urban consumers, not seasonally adjusted*: CUUR0000SA0L1E
- *Food and beverages in U.S. city average, all urban consumers, not seasonally adjusted*: CUUR0000SAF
- *Food at home in U.S. city average, all urban consumers, not seasonally adjusted*: CUUR0000SAF11
- *Food away from home in U.S. city average, all urban consumers, not seasonally adjusted*: CUUR0000SEFV
- *Energy in U.S. city average, all urban consumers, not seasonally adjusted*: CUUR0000SA0E
- *Housing in U.S. city average, all urban consumers, not seasonally adjusted*: CUUR0000SAH
- *Shelter in U.S. city average, all urban consumers, not seasonally adjusted*: CUUR0000SAH1
  ([BLS fact sheet on owners' equivalent rent and rent](https://www.bls.gov/cpi/factsheets/owners-equivalent-rent-and-rent.htm))

**Calculate special CPI**

Occasionally, a user wishes to estimate a price change that is not published by BLS. For instance, suppose a user would like a CPI series for ‘services less energy services and shelter’. This can be done by estimating a special index, in this case, ‘services less energy services and shelter’.
[BLS Doc](https://www.bls.gov/cpi/factsheets/constructing-special-cpis.htm)

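To turn an item code into a series ID, add the `CUUR0000` prefix (CPI-U, not seasonally adjusted, monthly, U.S. city average): SEEB01 -> CUUR0000SEEB01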
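
The mapping above can be sketched as a small helper. The function name `item_to_series_id` and its parameters are hypothetical, and the sketch assumes every item of interest follows the same `CU` + seasonal code + `R` + area code + item code layout as the series listed earlier.

```python
def item_to_series_id(item_code, area="0000", seasonal="U"):
    """Build a CPI-U series ID such as CUUR0000SEEB01 from an item code."""
    # "CU" = CPI-U survey, seasonal "U" = not seasonally adjusted,
    # "R" = monthly periodicity, area "0000" = U.S. city average.
    return f"CU{seasonal}R{area}{item_code.strip().upper()}"


print(item_to_series_id("SEEB01"))   # CUUR0000SEEB01
print(item_to_series_id("SA0L1E"))   # CUUR0000SA0L1E
```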

The cost weight of an aggregate is just the sum of the cost weights of its component items.

If I add up every component to calculate services less energy services and shelter, it requires a lot of data. Explore a different solution (e.g. removing goods from core CPI).
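
A minimal sketch of that aggregation, as I read the BLS fact sheet: each component's cost weight is taken to be its December relative importance scaled by its index movement since December, the aggregate's cost weight is the signed sum of the component cost weights, and the percent change of the special index is the change in that sum. The function names and all numbers below are hypothetical placeholders, not BLS data. A sign of -1 removes a component from a published aggregate, which is the "subtract instead of add" shortcut mentioned above.

```python
def cost_weight(rel_importance_dec, index_dec, index_t):
    """December relative importance, updated by the index movement since December."""
    return rel_importance_dec * index_t / index_dec


def special_index_pct_change(components, start, end):
    """Percent change of a special aggregate between periods `start` and `end`.

    components: list of dicts with keys
      sign (+1 to add, -1 to subtract), ri_dec, index_dec,
      index (dict mapping period -> index value)
    """
    def aggregate(period):
        # Aggregate cost weight = signed sum of the component cost weights
        return sum(
            c["sign"] * cost_weight(c["ri_dec"], c["index_dec"], c["index"][period])
            for c in components
        )

    return (aggregate(end) / aggregate(start) - 1) * 100


# Hypothetical inputs: a published aggregate minus one of its components
components = [
    {"sign": +1, "ri_dec": 60.0, "index_dec": 100.0, "index": {"M01": 100.0, "M02": 102.0}},
    {"sign": -1, "ri_dec": 20.0, "index_dec": 100.0, "index": {"M01": 100.0, "M02": 101.0}},
]
change = special_index_pct_change(components, "M01", "M02")
print(round(change, 2))
```

With these placeholder inputs the aggregate cost weight moves from 40 to 41, so the sketch reports a change of about 2.5 percent.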

**Supercore CPI**

"Fed Chair Jerome Powell cited a specific category of inflation—inflation in core services other than housing—as being perhaps “the most important category for understanding the future evolution of core inflation.” The financial press has termed this category “supercore” inflation" ([FED of St. Louis](https://www.stlouisfed.org/on-the-economy/2024/may/measuring-inflation-headline-core-supercore-services))

In [11]:
import os
import requests
import json
import prettytable
import pandas as pd
from datetime import datetime

folder_name = 'CPI_Data'
# Create the output folder if it does not exist yet
os.makedirs(folder_name, exist_ok=True)

current_date = datetime.now()
current_year = current_date.year
last_year = current_year - 1

headers = {'Content-type': 'application/json'}
series_ids = ['CUUR0000SA0', 'CUUR0000SA0L1E', 'CUUR0000SAF', 'CUUR0000SAF11', 'CUUR0000SEFV', 'CUUR0000SA0E', 'CUUR0000SAH1']
data = json.dumps({"seriesid": series_ids, "startyear": str(last_year), "endyear": str(current_year)})
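response = requests.post('https://api.bls.gov/publicAPI/v2/timeseries/data/', data=data, headers=headers)
response.raise_for_status()  # stop early on HTTP errors
json_data = response.json()
# The API can answer HTTP 200 and still report a failed request in the body
if json_data.get('status') != 'REQUEST_SUCCEEDED':
    raise RuntimeError(f"BLS API request failed: {json_data.get('message')}")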

series_names = {
    'CUUR0000SA0': 'All_Items',
    'CUUR0000SA0L1E': 'All_Items_Less_Food_Energy',
    'CUUR0000SAF': 'Food_Beverages',
    'CUUR0000SAF11': 'Food_At_Home',
    'CUUR0000SEFV': 'Food_Away_From_Home',
    'CUUR0000SA0E': 'Energy',
    'CUUR0000SAH1': 'Shelter'
}

dataframes = {}
for series in json_data['Results']['series']:
    x = prettytable.PrettyTable(["series id", "year", "period", "value", "footnotes"])
    seriesId = series['seriesID']
    descriptive_name = series_names.get(seriesId, seriesId)
    rows = []
    
    for item in series['data']:
        year = item['year']
        period = item['period']
        value = item['value']
        # Footnote entries can be empty dicts; keep only those with content
        footnotes = ','.join(footnote['text'] for footnote in item['footnotes'] if footnote)
        # Keep monthly observations only (M13 is the annual average)
        if 'M01' <= period <= 'M12':
            row = [descriptive_name, year, period, value, footnotes]
            x.add_row(row)
            rows.append(row)

    df = pd.DataFrame(rows, columns=["series id", "year", "period", "value", "footnotes"])
    dataframes[seriesId] = df

    file_path = os.path.join(folder_name, descriptive_name + '.txt')
    with open(file_path, 'w') as output:
        output.write(x.get_string())
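
The loop above keeps rows in the order the API returns them. As a sketch of the clean-up step, the hypothetical helper `series_to_frame` below turns one series entry into a DataFrame, builds a `date` column from `year` and `period`, and sorts oldest first so that `pct_change` compares each month with the previous one. The sample payload is hand-written to mimic the response shape used above; its two values are taken from the output shown later in this notebook.

```python
import pandas as pd


def series_to_frame(series):
    """Convert one entry of json_data['Results']['series'] to a tidy DataFrame."""
    rows = [
        {"series id": series["seriesID"], "year": item["year"],
         "period": item["period"], "value": item["value"]}
        for item in series["data"]
        if "M01" <= item["period"] <= "M12"  # drop annual averages such as M13
    ]
    df = pd.DataFrame(rows)
    df["value"] = pd.to_numeric(df["value"], errors="coerce")
    # "2023" + "-" + "02" -> "2023-02", parsed as the first day of that month
    df["date"] = pd.to_datetime(df["year"] + "-" + df["period"].str[1:], format="%Y-%m")
    return df.sort_values("date").reset_index(drop=True)


sample = {
    "seriesID": "CUUR0000SA0",
    "data": [  # newest observation first
        {"year": "2023", "period": "M02", "value": "300.840", "footnotes": [{}]},
        {"year": "2023", "period": "M01", "value": "299.170", "footnotes": [{}]},
    ],
}
df = series_to_frame(sample)
df["MoM_change"] = df["value"].pct_change() * 100
print(df[["date", "value", "MoM_change"]])
```

Sorting on a real datetime column rather than on the raw strings also makes later resampling or plotting by date straightforward.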

In [None]:
# Fix datasets


In [8]:
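# Calculate MoM and YoY changes
for seriesId, df in dataframes.items():
    df['value'] = pd.to_numeric(df['value'], errors='coerce')
    # The API returns the newest observation first; pct_change needs oldest first
    df = df.sort_values(['year', 'period'])
    df['MoM_change'] = df['value'].pct_change() * 100
    # periods=12 assumes one row per month with no gaps
    df['YoY_change'] = df['value'].pct_change(periods=12) * 100
    dataframes[seriesId] = df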

dataframes


{'CUUR0000SA0':       series id  year period    value footnotes       date  MoM_change  \
 18  CUUR0000SA0  2023    M01  299.170           2023-01-01         NaN   
 17  CUUR0000SA0  2023    M02  300.840           2023-02-01    0.558211   
 16  CUUR0000SA0  2023    M03  301.836           2023-03-01    0.331073   
 15  CUUR0000SA0  2023    M04  303.363           2023-04-01    0.505904   
 14  CUUR0000SA0  2023    M05  304.127           2023-05-01    0.251844   
 13  CUUR0000SA0  2023    M06  305.109           2023-06-01    0.322891   
 12  CUUR0000SA0  2023    M07  305.691           2023-07-01    0.190752   
 11  CUUR0000SA0  2023    M08  307.026           2023-08-01    0.436716   
 10  CUUR0000SA0  2023    M09  307.789           2023-09-01    0.248513   
 9   CUUR0000SA0  2023    M10  307.671           2023-10-01   -0.038338   
 8   CUUR0000SA0  2023    M11  307.051           2023-11-01   -0.201514   
 7   CUUR0000SA0  2023    M12  306.746           2023-12-01   -0.099332   
 6   CUUR0