<a href="https://colab.research.google.com/github/RDeconomist/RDeconomist.github.io/blob/main/data/DSEP_2_1_LoopsFREDdownloader.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Richard Davies**
*Data Science for Economics and Policy - 2023*

**Tutorial:** Using loops to batch download data from an API

**Motivation:** You are asked by your Minister to build a dashboard for the US economy. It must take in 10 important series, each plotted as a line chart. The data need to be re-downloaded each month, so you would be manually downloading 120 series per year to keep the dashboard up to date. How can we batch process this, so that all downloads are done with one click?

In [None]:
# Preliminaries 1 - the format() method:
# See: https://www.w3schools.com/python/ref_string_format.asp

# Take a sentence, and put a placeholder {} where we want to insert something:
sentence = "The best rugby team in the world is {}"
# Now we can use .format() to insert something into this place:
sentence.format('Wales')

In [None]:
# Note: format() is a built-in string method, so its name is fixed;
# but 'sentence' is just a variable name, and it can be anything:
x = "The best football team in the world is {}"
x.format('Manchester United')

In [None]:
# Next note that we can put a variable within the format():
sentence = "The best team this year is {}"
team = 'Manchester City'
sentence.format(team)
# This allows us to change the sentence
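
A short sketch of why format() suits what comes later: the template string is defined once and is left unchanged, so it can be filled in again and again with different values. The variable names below are illustrative only.

```python
# Define the template once, with a placeholder {}:
template = "The best team this year is {}"

# Fill it in twice with different values. Each call returns a NEW string:
first = template.format('Manchester City')
second = template.format('Barcelona')

print(first)
print(second)

# The template itself still contains the placeholder, ready to be reused:
print(template)
```

This reuse is what makes the method a natural fit for a loop, where the same template is filled in on every iteration.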

In [None]:
# // PRELIMINARIES 2 - Using the format method in a loop:

sentence = "The best team is {}"
teams = ['Manchester United', 'AC Milan', 'Barcelona', 'PSG', 'Bayern Munich', 'River Plate']

# // Begin a loop, dealing with the teams one by one:
for i in teams:

   # // Everything inside the for loop is indented. (On my machine, three spaces)
   # // Build the sentence for this iteration of the loop, and check what we are getting:
   topTeam = sentence.format(i)
   print(topTeam)

In [None]:
# // ASIDE - the loop in other languages 1 - Stata

# // In Stata (shown for comparison; this will not run in a Python cell):
foreach i in "Manchester United" "AC Milan" "Barcelona" "PSG" "Bayern Munich" "River Plate"{
  display("The best team is `i'")
  }

# // Note: the opening backtick and closing apostrophe around i are important, `i'.

In [None]:
%%javascript
// ASIDE - the loop in other languages 2 - JavaScript - preliminary 1

// Note that you can use JS in Colab by using "cell magic".
// "Magics" are a set of commands that help you do various things.
// See, e.g., this reference: https://colab.research.google.com/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/01.03-Magic-Commands.ipynb#scrollTo=IP7qwmbq60-g
// One of them allows you to turn the cell into a JavaScript cell.
// The magic must be the first line of the cell, so these notes are written as JS comments.

let message = "I am now using JS in Colab to do Data Science for Economists";
console.log(message);
document.querySelector("#output-area").appendChild(document.createTextNode(message));


In [None]:
%%javascript
// ASIDE - the loop in other languages 2 - JavaScript - preliminary 2
// (The line above declares JavaScript; it must be the first line of the cell.)

// # Use "Template Literals" to put placeholders in a string:
// # This is akin to the format() method used in Python above.
var x = "Data Science project";
var y = "coming along nicely";
var message = `My ${x} is ${y}`;

// # Now display what we have:
// # First in the console:
console.log(message);
// # Next in the Colab output area:
document.querySelector("#output-area").appendChild(document.createTextNode(message));



In [None]:
%%javascript
// ASIDE - the loop in other languages 2 - JavaScript.
// This example uses the for...of loop.
// The for...of loop is a simple loop that is very close to the Python loop in syntax.

let teams = ['Manchester United', 'AC Milan', 'Barcelona', 'PSG', 'Bayern Munich', 'River Plate']

// Declare the loop variable with const so it is not created as an implicit global:
for (const i of teams) {
  console.log(`The best team according to Javascript is ${i}`)
}

// Note: the backticks around the template string are important.
// Compare this to the Python and Stata loops above.

In [None]:
%%javascript
// ASIDE - the loop in other languages 2 - JavaScript - football team loop
// (The line above declares JS; it must be the first line of the cell.)

// # Initialise a variable, as an array of team names:
let teams = ['Manchester United', 'AC Milan', 'Barcelona', 'PSG', 'Bayern Munich', 'River Plate']

// # JS loop syntax (the three parts are separated by semicolons):
// for(initialise value; stopping condition; updating condition){
//    code to execute 1
//    code to execute 2
//    ...
//    code to execute N
//    }

// # Loop:
for(let i=0; i<teams.length; i++){
  let z = teams[i];
  let message = `The best team is ${z}`;
  document.querySelector("#output-area").appendChild(document.createTextNode(message));
  document.querySelector("#output-area").appendChild(document.createTextNode(', '));
}

In [None]:
# // PRELIMINARIES 3 - How does the API we are looking at work?
# // Everything above is general; let's look at a specific API.

# // Federal Reserve Economic Data, aka FRED.
# // The API docs are here: https://fred.stlouisfed.org/docs/api/fred/

# // The general form of the API is as follows:
"https://api.stlouisfed.org/fred/series/observations?series_id={SeriesID}&api_key={APIkey}&file_type={fileType}"
# // Note the convention: when filling in a placeholder such as "{name here}" we write "Richard", not "{Richard}". That is, we drop the curly brackets.

# // Some examples of the FRED API in action:
# // The examples below use my API key. It should work for you, but please sign up for your own.

# // 1. INFLATION
"https://api.stlouisfed.org/fred/series/observations?series_id=PCEPI&api_key=22ee7a76e736e32f54f5df0a7171538d&file_type=json"

# // 2. 10-YEAR GOVERNMENT BOND:
"https://api.stlouisfed.org/fred/series/observations?series_id=DGS10&api_key=22ee7a76e736e32f54f5df0a7171538d&file_type=json"

# // 3. UNEMPLOYMENT RATE:
"https://api.stlouisfed.org/fred/series/observations?series_id=UNRATE&api_key=22ee7a76e736e32f54f5df0a7171538d&file_type=json"

# // These are clearly very similar. The only thing that changes is the series ID.

####################################
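
The general form above uses named placeholders, and format() can fill these in by name. The following is a minimal sketch of that idea; "YOUR_API_KEY" is a stand-in rather than a real key, and the variable names are illustrative.

```python
# The general form of the API, with named placeholders:
url_template = (
    "https://api.stlouisfed.org/fred/series/observations"
    "?series_id={SeriesID}&api_key={APIkey}&file_type={fileType}"
)

# Fill each placeholder by name. Note that the curly brackets disappear:
# "YOUR_API_KEY" is a placeholder, swap in your own key.
url = url_template.format(SeriesID="UNRATE", APIkey="YOUR_API_KEY", fileType="json")
print(url)
```

Named placeholders make it harder to put a value in the wrong slot when a template has several parts that change.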

In [None]:
# // PUTTING THINGS TOGETHER 1 - A loop of all our variables:

# // Set a base URL.
# // This includes everything that does not change in our loop.
# // And a placeholder "{}" for the part that does change.
url_base = "https://api.stlouisfed.org/fred/series/observations?series_id={}&api_key=22ee7a76e736e32f54f5df0a7171538d&file_type=json"

# NOW PICK ALL THE SERIES THAT WE ARE INTERESTED IN:
fredSeries = ['PCEPI', 'CPIAUCSL', 'PAYEMS', 'DGS10', 'INDPRO', 'UNRATE', 'LES1252881600Q']

# // Begin a loop, dealing with series one by one:
for i in fredSeries:
   # // Build the URL for this iteration of the loop, and check what we are getting:
   URL = url_base.format(i)
   print(URL)
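
As an alternative sketch, the query string can be assembled from a dictionary with the standard library's urlencode, which also escapes any special characters. The helper name build_fred_url and the "YOUR_API_KEY" stand-in are assumptions made for illustration; they are not part of the FRED API.

```python
from urllib.parse import urlencode

# The part of the URL that never changes:
ENDPOINT = "https://api.stlouisfed.org/fred/series/observations"

def build_fred_url(series_id, api_key, file_type="json"):
    """Hypothetical helper: build one observations URL for one series."""
    query = urlencode({
        "series_id": series_id,
        "api_key": api_key,
        "file_type": file_type,
    })
    return ENDPOINT + "?" + query

fredSeries = ['PCEPI', 'CPIAUCSL', 'PAYEMS', 'DGS10', 'INDPRO', 'UNRATE', 'LES1252881600Q']

# A dictionary linking each series ID to its URL:
urls = {s: build_fred_url(s, "YOUR_API_KEY") for s in fredSeries}

for s in fredSeries:
    print(s, urls[s])
```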

In [None]:
# // PUTTING THINGS TOGETHER 2 - Importing some tools that we will need:

# // Opening web sites and web scraping:
import requests

# // JSON. This helps us make JSON look prettier and easier to read
import json

# // Files. This is part of Colab - allows you to upload and download files
from google.colab import files

# // OS. Sometimes need this for finding working directory:
import os

In [None]:
## // An aside: checking which versions of things are running
print(requests.__version__)
print(json.__version__)

In [None]:
## // Getting data from a single API call:

url = "https://api.stlouisfed.org/fred/series/observations?series_id=DGS10&api_key=22ee7a76e736e32f54f5df0a7171538d&file_type=json"

# We use 'requests', which we imported above:
data = requests.get(url).json()

# Print what we got
data
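
Once the JSON has arrived, the next step for a chart is usually to pull out the dates and values. The sketch below assumes the response contains an "observations" list whose items carry "date" and "value" strings, and that missing values are not valid numbers; check this against what the cell above printed for you. The sample dictionary holds made-up numbers purely to illustrate the shape, and extract_observations is a hypothetical helper name.

```python
def extract_observations(data):
    """Return a list of (date, value) pairs; non-numeric values become None."""
    rows = []
    for obs in data.get("observations", []):
        try:
            value = float(obs["value"])
        except ValueError:
            # Anything that cannot be read as a number is treated as missing.
            value = None
        rows.append((obs["date"], value))
    return rows

# Made-up stand-in with the assumed shape (NOT real FRED data):
sample = {
    "observations": [
        {"date": "2023-01-03", "value": "3.79"},
        {"date": "2023-01-04", "value": "."},
    ]
}

rows = extract_observations(sample)
print(rows)
```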

In [None]:
# // Downloading the data from a single API call:

# // Based on the steps above, we have a variable "data" which has data on the US Government 10 year yield.

# // Set the filename, and check what we are getting:
fileName = "data_FRED-DGS10.json"
print(fileName)
# // Note: again the file name can be anything.

# /// Save the file:
with open(fileName, 'w', encoding='utf-8') as f:
  json.dump(data, f, ensure_ascii=False, indent=4)

# /// Download the file to local machine:
files.download('data_FRED-DGS10.json')
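
A quick way to confirm that the save step worked is to read the file back and compare it with what was written. This is a minimal sketch using a temporary folder and a made-up stand-in for "data", so that it runs without an API call.

```python
import json
import os
import tempfile

# Made-up stand-in for the "data" variable (NOT real FRED data):
data = {"observations": [{"date": "2023-01-03", "value": "3.79"}]}

# Write into a temporary folder so nothing in the working directory is touched:
folder = tempfile.mkdtemp()
fileName = os.path.join(folder, "data_FRED-DGS10.json")

# Save the file, exactly as in the cell above:
with open(fileName, 'w', encoding='utf-8') as f:
    json.dump(data, f, ensure_ascii=False, indent=4)

# Read it back and check that nothing was lost on the way:
with open(fileName, 'r', encoding='utf-8') as f:
    check = json.load(f)

print(check == data)
print(os.path.getsize(fileName), "bytes written")
```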

In [None]:
# // PUTTING IT ALL TOGETHER:

# // Set the base url:
url_base = "https://api.stlouisfed.org/fred/series/observations?series_id={}&api_key=22ee7a76e736e32f54f5df0a7171538d&file_type=json"

# // Set the base fileName:
file_base = "data_FRED-{}.json"

# // Pick the series that I want:
fredSeries = ['PCEPI', 'CPIAUCSL', 'PAYEMS', 'DGS10', 'INDPRO', 'UNRATE', 'LES1252881600Q']

# // Begin a loop, dealing with each series, one by one:
for i in fredSeries:

   # // In what follows below I print the iteration of the loop we are on:
   # // This is not necessary but can be helpful, esp with long loops:
   print("------Iteration Starts--------")
   print(i)

   # // Build the URL for this iteration of the loop, and check what we are getting:
   URL = url_base.format(i)
   print(URL)

   # // Request the JSON data from the URL:
   data = requests.get(URL).json()
   print(data)

   # // Set the filename, and check what we are getting:
   fileName = file_base.format(i)
   print(fileName)

   # // Mark the end of this iteration's printed output. (This is purely so we can see what is happening below clearly)
   print("------Iteration Ends--------")

   # /// Save the file:
   with open(fileName, 'w', encoding='utf-8') as f:
     json.dump(data, f, ensure_ascii=False, indent=4)

   # /// Download the file to local machine:
   files.download(fileName)
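
With a monthly one-click job, a single bad series ID or a network hiccup should not stop the other downloads. The sketch below is one possible way to guard the loop, under stated assumptions: download_all, fetch_with_requests and fake_fetch are hypothetical helper names, "BADSERIES" is an invented ID used to trigger a failure, and "YOUR_API_KEY" is a stand-in. The fake fetcher lets the logic be checked without calling the API.

```python
def download_all(series_ids, fetch, url_base):
    """Fetch every series; collect failures instead of stopping the loop."""
    results, failures = {}, {}
    for series_id in series_ids:
        url = url_base.format(series_id)
        try:
            results[series_id] = fetch(url)
        except Exception as err:
            # Record what went wrong and carry on with the next series.
            failures[series_id] = str(err)
    return results, failures

def fetch_with_requests(url):
    """Real fetcher: a timeout plus an explicit check of the HTTP status."""
    import requests
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.json()

def fake_fetch(url):
    """Offline stand-in for testing: fails for one invented series ID."""
    if "BADSERIES" in url:
        raise ValueError("series not found")
    return {"observations": []}

url_base = "https://api.stlouisfed.org/fred/series/observations?series_id={}&api_key=YOUR_API_KEY&file_type=json"

results, failures = download_all(['UNRATE', 'BADSERIES', 'DGS10'], fake_fetch, url_base)
print(sorted(results))
print(failures)
```

To run this against the live API, pass fetch_with_requests in place of fake_fetch and use your own key in url_base; the saving and downloading steps from the loop above can then be applied to each item in results.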
