# Accessing BLS API

**Part 3**

In this notebook we will see how to access the BLS API to retrieve multiple series.

## Step 1 - load packages and keys

In [1]:
library(httr)
library(jsonlite)
source('APIkeys.R')
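
Later cells read the key with `Sys.getenv('BLS_API_key')`, so `APIkeys.R` presumably sets that environment variable. If it is missing, `Sys.getenv()` quietly returns an empty string and the request goes out without a key. A minimal sketch of a guard against that is below; `get_key` and `BLS_DEMO_KEY` are hypothetical names used only for illustration.

```R
# Hypothetical helper (not part of the original notebook): fetch a key or fail loudly
get_key <- function(name = "BLS_API_key") {
  key <- Sys.getenv(name)  # Sys.getenv() returns "" when the variable is unset
  if (!nzchar(key)) {
    stop("Environment variable ", name, " is not set")
  }
  key
}

# Demo with a throwaway variable so a real key is never touched
Sys.setenv(BLS_DEMO_KEY = "abc123")
demo_key <- get_key("BLS_DEMO_KEY")
Sys.unsetenv("BLS_DEMO_KEY")
print(demo_key)
```

In the notebook itself, `get_key()` with its default argument could stand in for `Sys.getenv('BLS_API_key')`.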

## Step 2 - setting up

Downloading multiple series requires a _POST_ request. This is how the BLS designed its API.

We need to define two objects, the R counterparts of dictionaries: a named vector of headers and a named list of parameters.

In [2]:
base_url = 'https://api.bls.gov/publicAPI/v2/timeseries/data/'  #this will not change
headers = c('Content-type' = 'application/json')  #This will not change !

# For the key seriesid enter a list of series names you wish to download
# For the key startyear enter the start year inside ""
# For the key endyear enter the end year inside ""

parameters = list(
    "seriesid" = c("CUUR0000SA0","CUUR0000SA0E"),
    "startyear" = "2011", 
    "endyear" = "2021",
    "catalog" = TRUE, 
    "calculations" = FALSE, 
    "annualaverage" = FALSE,
    "aspects" = FALSE,
    "registrationkey" = Sys.getenv('BLS_API_key')
)
data = toJSON(parameters, pretty=TRUE, auto_unbox = TRUE) #this converts the R list into JSON format

# Note: we don't need to do toJSON for the variable headers because it will be handled
# inside the POST function

Caution: There's a subtle difference between ```list()``` and ```data.frame()```. If we use
```R
parameters = data.frame(
    "seriesid" = c("CUUR0000SA0","CUUR0000SA0E"),
    "startyear" = "2011", 
    "endyear" = "2021",
    "catalog" = TRUE, 
    "calculations" = FALSE, 
    "annualaverage" = FALSE,
    "aspects" = FALSE,
    "registrationkey" = Sys.getenv('BLS_API_key')
)
data = toJSON(parameters, pretty=TRUE, auto_unbox = TRUE)
```
instead of ```list()```, there will be extra escapes and brackets in ```data```, which will cause an error later when we make the ```POST``` request.
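
To see the difference without calling the API, we can serialize a trimmed-down payload both ways. This is only a sketch with two of the fields. The point is that `toJSON` writes a list as a single JSON object, while a data frame is written row by row as an array of records.

```R
library(jsonlite)

# Trimmed-down payload with just two fields, for illustration only
as_list <- list(seriesid = c("CUUR0000SA0", "CUUR0000SA0E"), startyear = "2011")
as_df   <- data.frame(seriesid = c("CUUR0000SA0", "CUUR0000SA0E"), startyear = "2011")

json_list <- toJSON(as_list, auto_unbox = TRUE)
json_df   <- toJSON(as_df, auto_unbox = TRUE)

cat(json_list, "\n")  # one object whose seriesid is an array of two IDs
cat(json_df, "\n")    # an array with one object per data frame row
```

The list version has the single-object shape built in step 2. The data frame version repeats `startyear` for every series and wraps everything in an outer array.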

## Step 3 - POST request

In [3]:
p = POST(url = base_url, 
         body = data, 
         add_headers(headers))

In [4]:
json_data = fromJSON(rawToChar(p$content))
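
Before digging into the result it can help to confirm that the request succeeded. At the HTTP level, `status_code(p)` from httr reports the response code. The sketch below checks the parsed body instead, assuming the response carries top-level `status` and `message` fields and that `"REQUEST_SUCCEEDED"` marks success. Both the field names and the status strings are assumptions to verify against a real response, and `check_bls_status` is a hypothetical helper. Mock lists keep the example runnable offline.

```R
# Hypothetical helper: stop with the API's messages unless the status is the assumed success value
check_bls_status <- function(parsed) {
  status <- parsed$status
  if (is.null(status) || status != "REQUEST_SUCCEEDED") {
    stop("BLS request failed: ", paste(unlist(parsed$message), collapse = "; "))
  }
  invisible(parsed)
}

# Mock responses (made-up content) standing in for parsed API output
ok  <- list(status = "REQUEST_SUCCEEDED", message = list(), Results = list(series = list()))
bad <- list(status = "REQUEST_NOT_PROCESSED", message = "hypothetical error text")

check_bls_status(ok)  # passes silently
result <- tryCatch(check_bls_status(bad), error = function(e) conditionMessage(e))
print(result)
```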

In [5]:
json_data

“row names were found from a short variable and have been discarded”
“row names were found from a short variable and have been discarded”
“row names were found from a short variable and have been discarded”


Unnamed: 0_level_0,seriesID,catalog.series_title,catalog.series_id,catalog.seasonality,catalog.survey_name,catalog.survey_abbreviation,catalog.measure_data_type,catalog.area,catalog.item,data.year,data.period,data.periodName,data.value,data.footnotes,data.year,data.period,data.periodName,data.value,data.footnotes
Unnamed: 0_level_1,<chr>,"<df[,8]>",<list>,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1
1,CUUR0000SA0,"All items in U.S. city average, all urban consumers, not seasonally adjusted",CUUR0000SA0,Not Seasonally Adjusted,CPI for All Urban Consumers (CPI-U),CU,All items,U.S. city average,All items,2021,M12,December,278.802,,2021,M12,December,256.207,
2,CUUR0000SA0E,"Energy in U.S. city average, all urban consumers, not seasonally adjusted",CUUR0000SA0E,Not Seasonally Adjusted,CPI for All Urban Consumers (CPI-U),CU,Energy,U.S. city average,Energy,2021,M11,November,277.948,,2021,M11,November,259.100,
3,CUUR0000SA0,"All items in U.S. city average, all urban consumers, not seasonally adjusted",CUUR0000SA0,Not Seasonally Adjusted,CPI for All Urban Consumers (CPI-U),CU,All items,U.S. city average,All items,2021,M10,October,276.589,,2021,M10,October,255.338,
4,CUUR0000SA0E,"Energy in U.S. city average, all urban consumers, not seasonally adjusted",CUUR0000SA0E,Not Seasonally Adjusted,CPI for All Urban Consumers (CPI-U),CU,Energy,U.S. city average,Energy,2021,M09,September,274.310,,2021,M09,September,248.228,
5,CUUR0000SA0,"All items in U.S. city average, all urban consumers, not seasonally adjusted",CUUR0000SA0,Not Seasonally Adjusted,CPI for All Urban Consumers (CPI-U),CU,All items,U.S. city average,All items,2021,M08,August,273.567,,2021,M08,August,246.639,
6,CUUR0000SA0E,"Energy in U.S. city average, all urban consumers, not seasonally adjusted",CUUR0000SA0E,Not Seasonally Adjusted,CPI for All Urban Consumers (CPI-U),CU,Energy,U.S. city average,Energy,2021,M07,July,273.003,,2021,M07,July,244.800,
7,CUUR0000SA0,"All items in U.S. city average, all urban consumers, not seasonally adjusted",CUUR0000SA0,Not Seasonally Adjusted,CPI for All Urban Consumers (CPI-U),CU,All items,U.S. city average,All items,2021,M06,June,271.696,,2021,M06,June,240.720,
8,CUUR0000SA0E,"Energy in U.S. city average, all urban consumers, not seasonally adjusted",CUUR0000SA0E,Not Seasonally Adjusted,CPI for All Urban Consumers (CPI-U),CU,Energy,U.S. city average,Energy,2021,M05,May,269.195,,2021,M05,May,235.339,
9,CUUR0000SA0,"All items in U.S. city average, all urban consumers, not seasonally adjusted",CUUR0000SA0,Not Seasonally Adjusted,CPI for All Urban Consumers (CPI-U),CU,All items,U.S. city average,All items,2021,M04,April,267.054,,2021,M04,April,229.116,
10,CUUR0000SA0E,"Energy in U.S. city average, all urban consumers, not seasonally adjusted",CUUR0000SA0E,Not Seasonally Adjusted,CPI for All Urban Consumers (CPI-U),CU,Energy,U.S. city average,Energy,2021,M03,March,264.877,,2021,M03,March,225.861,


In [6]:
# simplifyDataFrame = FALSE keeps JSON arrays of records (JSON objects) as nested lists
# instead of coercing them into a data frame
json_data = fromJSON(rawToChar(p$content), simplifyDataFrame = FALSE)

In [7]:
json_data
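
The effect of `simplifyDataFrame` is easier to see on a tiny JSON string than on the full response. The string below is a made-up miniature that only imitates the nesting of the real output (a `series` array whose elements each hold a `data` array).

```R
library(jsonlite)

# Made-up miniature response: two series, the second one with no observations
txt <- '{"series":[{"seriesID":"A","data":[{"year":"2021","value":"1.5"}]},
                   {"seriesID":"B","data":[]}]}'

simplified <- fromJSON(txt)                             # default: records become a data frame
nested     <- fromJSON(txt, simplifyDataFrame = FALSE)  # records stay as nested lists

print(is.data.frame(simplified$series))
print(is.data.frame(nested$series))
print(nested$series[[1]]$data[[1]]$value)
```

With nested lists, every level is reached with `[[ ]]` and `$`, which is the access pattern used in the next step.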

## Step 4 - exploring the data

We need to dig in and find the numbers we want.

In [8]:
typeof(json_data$Results)

In [9]:
names(json_data$Results)

In [10]:
json_data$Results$series

In [11]:
typeof(json_data$Results$series)

In [12]:
nrow(json_data$Results$series)

NULL

In [13]:
is.vector(json_data$Results$series)

In [14]:
length(json_data$Results$series)

In [15]:
json_data$Results$series[[1]]

In [16]:
typeof(json_data$Results$series[[1]])

In [17]:
names(json_data$Results$series[[1]])

In [18]:
json_data$Results$series[[1]]$catalog

In [19]:
json_data$Results$series[[1]]$data

In [20]:
typeof(json_data$Results$series[[1]]$data)

In [21]:
nrow(json_data$Results$series[[1]]$data)

NULL

In [22]:
is.vector(json_data$Results$series[[1]]$data)

In [23]:
length(json_data$Results$series[[1]]$data)

In [24]:
json_data$Results$series[[1]]$data[132]

In [25]:
json_data$Results$series[[1]]$data[132][[1]]

In [26]:
as.numeric(json_data$Results$series[[1]]$data[132][[1]]$value)

How about the other series? How can we access it?

In [27]:
as.numeric(json_data$Results$series[[2]]$data[132][[1]]$value)
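
Two notes on the indexing above. First, `data[132]` returns a list of length one, which is why the extra `[[1]]` is needed; `data[[132]]` reaches the same element directly. Second, all values can be pulled out at once instead of one index at a time. The sketch below uses a small mock series with illustrative numbers, assumed to have the same shape as `json_data$Results$series[[1]]`.

```R
# Mock of one series (illustrative numbers), shaped like json_data$Results$series[[1]]
series <- list(
  seriesID = "CUUR0000SA0",
  data = list(
    list(year = "2021", period = "M12", periodName = "December", value = "278.802"),
    list(year = "2021", period = "M11", periodName = "November", value = "277.948")
  )
)

# Single-bracket indexing plus [[1]] is the same as double-bracket indexing
same <- identical(series$data[1][[1]], series$data[[1]])

# Extract every value as a numeric vector
values <- vapply(series$data, function(obs) as.numeric(obs$value), numeric(1))
print(same)
print(values)
```

`vapply` is used rather than `sapply` so that the result is guaranteed to be a numeric vector, even when `data` is empty.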

## Step 5 - creating a function

We want to make a function that accepts a vector of series IDs and an API key and returns the parsed response as a nested list (the R counterpart of a dictionary).

In [28]:
multiSeries = function(varList,myKey){
    base_url = 'https://api.bls.gov/publicAPI/v2/timeseries/data/'  #this will not change
    headers = c('Content-type' = 'application/json')  #This will not change !

    parameters = list(
        "seriesid" = varList,
        "startyear" = "2011", 
        "endyear" = "2021",
        "catalog" = TRUE, 
        "calculations" = FALSE, 
        "annualaverage" = FALSE,
        "aspects" = FALSE,
        "registrationkey" = myKey
    )
    
    data = toJSON(parameters, pretty=TRUE, auto_unbox = TRUE) #this converts the R list into JSON format

    p = POST(url = base_url, 
         body = data, 
         add_headers(headers))
    
    json_data = fromJSON(rawToChar(p$content), simplifyDataFrame = FALSE)
    
    return(json_data)
}
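
One detail to watch: with `auto_unbox = TRUE`, a `varList` of length one is written as a plain string rather than a one-element array, which may not be what the API expects. jsonlite leaves vectors wrapped in `I()` un-unboxed, so a payload builder could look like the sketch below. `build_payload` and its `startyear`/`endyear` arguments are hypothetical additions, not part of the function above.

```R
library(jsonlite)

# Hypothetical payload builder: same fields as multiSeries, with the years as arguments
build_payload <- function(varList, myKey, startyear = "2011", endyear = "2021") {
  parameters <- list(
    seriesid = I(varList),  # I() keeps a single ID as a JSON array
    startyear = as.character(startyear),
    endyear = as.character(endyear),
    catalog = TRUE,
    calculations = FALSE,
    annualaverage = FALSE,
    aspects = FALSE,
    registrationkey = myKey
  )
  toJSON(parameters, auto_unbox = TRUE)
}

payload <- build_payload("CUUR0000SA0", "dummy-key", 2015, 2020)
cat(payload, "\n")
```

The returned string could then be passed as `body` to `POST`, exactly as `data` is in `multiSeries`.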

Let's test it:

In [29]:
res = multiSeries(c("CUUR0000SAM1","CUUR0400SEMC"),Sys.getenv('BLS_API_key'))

In [30]:
res$Results$series

In [31]:
length(res$Results$series)

We still need to write a function that will parse the data and tease out the values. 

But first, which variables exist? 
<p style="font-size:24px;color:red;">
Not all combinations of area and item code exist.
</p>

> Price indexes are available for the U.S., the four Census regions, nine Census divisions, two size of city classes, eight cross-classifications of regions and size-classes, and for 23 local areas. Indexes are available for major groups of consumer expenditures (food and beverages, housing, apparel, transportation, medical care, recreation, education and communications, and other goods and services), for items within each group, and for special categories, such as services.

[Census divisions and regions](https://www2.census.gov/geo/pdfs/maps-data/maps/reference/us_regdiv.pdf) and  [FIPS codes explanation](https://www.census.gov/library/reference/code-lists/ansi.html)

> Monthly indexes are available for the U.S., the four Census regions, and some local areas. More detailed item indexes are available for the U.S. than for regions and local areas.

What if we ask for a variable that doesn't exist?

In [32]:
# the first variable in the list doesn't exist

res = multiSeries(c("CUUR0000S","CUUR0400SEMC"),Sys.getenv('BLS_API_key'))

In [33]:
res

In [34]:
length(res$Results$series)

Still length 2... but what is in the first position?

In [35]:
res$Results$series[1]

If a series ID doesn't exist, the 'data' key of its entry holds an empty list.
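
That empty list gives us a simple way to flag series that came back without data. A minimal sketch follows, using a mock shaped like `res` with a made-up observation value.

```R
# Mock shaped like res: the first series has no observations
res_mock <- list(Results = list(series = list(
  list(seriesID = "CUUR0000S", data = list()),
  list(seriesID = "CUUR0400SEMC",
       data = list(list(year = "2021", period = "M12", value = "100.0")))  # made-up value
)))

# TRUE for every series whose data element is an empty list
is_empty <- vapply(res_mock$Results$series, function(s) length(s$data) == 0, logical(1))

# IDs of the series that returned nothing
missing_ids <- vapply(res_mock$Results$series[is_empty], function(s) s$seriesID, character(1))
print(missing_ids)
```

The same two lines should work on the real `res` object, assuming it was parsed with `simplifyDataFrame = FALSE` as in `multiSeries`.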

To be continued...