# Data sources

This notebook explains how to work with some of the data sources used in the class.

<ol>
    <li><a href="#FRED">FRED - Federal Reserve Economic Data</a></li>
    <li><a href="#PWT">PWT - Penn World Tables</a></li>
</ol>

In [1]:
import datetime
today = datetime.date.today().strftime('%d %B %Y')
print('Last update:', today)  # e.g. '01 September 2022'

Last update: 01 September 2022


<a id="FRED"></a>
## FRED - Federal Reserve Economic Data

website: <a href="https://fred.stlouisfed.org/">FRED</a>

This database is hosted by the Federal Reserve Bank of St. Louis, and contains a large set of macroeconomic time series, primarily for the United States. You can either browse it by categories, or directly enter the codes for the economic time series that you are interested in.

When you open the main page, you can type in the particular code of the series directly in the search box. For example, <tt>GDPC1</tt> stands for real gross domestic product.

![FRED main page - GDPC1](attachment:data-sources-FRED-1.png)

Then confirm the selection on the next page to obtain the following graph. The top line shows the name of the variable together with its code: <b>Real Gross Domestic Product (GDPC1)</b>. The next line shows the most recent observation (<b>Q2 2021</b>), the units, and the frequency of the variable.

Further to the right, you can select the date range for the graph (here <b>1947-01-01 to 2021-04-01</b>).

On the right, you can click <b>Download</b> and either download the data to Excel, or download the image of the graph.

Underneath, the <b>Edit graph</b> button allows you to transform the time series. You can change the units, e.g., from billions of chained 2012 dollars to an annual growth rate, a period-to-period growth rate, or an index value relative to a particular year. You can also apply more sophisticated mathematical operations, or add more time series to the graph by typing the code of the additional series into the corresponding box and clicking <b>Add</b>. All of these transformations can also be carried out in your preferred statistical tool (Excel, Stata, etc.).
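
The same unit changes can be reproduced outside of FRED. The following is a minimal sketch in pandas, assuming a quarterly series; the numbers are made up for illustration and are not actual FRED data, and the variable names are hypothetical.

```python
import pandas as pd

# Made-up quarterly values standing in for a real GDP series (not actual FRED data)
dates = pd.date_range("2019-01-01", periods=8, freq="QS")
gdp = pd.Series([100.0, 101.0, 102.0, 103.0, 104.0, 105.0, 106.0, 107.0],
                index=dates, name="gdp")

# Period-to-period (here quarter-on-quarter) growth rate, in percent
qoq = gdp.pct_change() * 100

# Growth rate relative to the same quarter one year earlier, in percent
yoy = gdp.pct_change(periods=4) * 100

# Index normalized so that the first quarter of 2019 equals 100
index_2019q1 = gdp / gdp.loc[pd.Timestamp("2019-01-01")] * 100

print(pd.DataFrame({"qoq": qoq, "yoy": yoy, "index": index_2019q1}))
```

The first entry of `qoq` and the first four entries of `yoy` are missing because no earlier observation is available for the comparison.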

![FRED - GDPC1](attachment:data-sources-FRED-2.png)

If you do not know the code for the time series, you can either browse data by categories or sources (bottom right in the screenshot below), or enter a search term into the search box. For example, here we want to find data for the unemployment rate.

![FRED main page - unemployment rate](attachment:data-sources-FRED-3.png)

The search returns a list of time series relevant to the search term. Pay attention to the different versions in which a series is offered. For example, the unemployment rate in the list below is available both as Seasonally Adjusted and as Not Seasonally Adjusted.

![FRED - unemployment rate](attachment:data-sources-FRED-4.png)

If you want to download the data in <b>Python</b>, you can import the series directly into a pandas DataFrame using the following commands. More detail on how to process the data from FRED can be found in the Jupyter Notebook on <a href="01-some-facts-about-economic-growth.ipynb">Topic 1: Some facts about economic growth</a>.

In [2]:
# download the quarterly time series for real GDP (GDPC1) and real gross private domestic investment (GPDIC1)
import datetime
import pandas as pd
import pandas_datareader as pdr

# the deliberately wide date range requests all available observations
data = pdr.data.DataReader(
    ['GDPC1', 'GPDIC1'],
    'fred',
    start=datetime.datetime(1800, 1, 1),
    end=datetime.datetime(2050, 1, 1),
)

<a id="PWT"></a>
## PWT - Penn World Tables

website: <a href="https://www.rug.nl/ggdc/productivity/pwt/">Penn World Tables</a>

This database is developed and hosted by the University of Groningen and UC Davis, and it is the continuation of a project initiated at the University of Pennsylvania by Robert Summers, Irving Kravis, and Alan Heston. It contains a consistent cross-country panel on GDP, capital, labor and other variables adjusted for purchasing power parity (PPP).

![PWT - main page](attachment:data-sources-PWT-0.png)

Constructing historical data on quantities of capital and PPP adjustments is a complex procedure that requires a substantial number of modeling and measurement assumptions. The database therefore provides alternative measurements; their construction is described in the data manual provided on the website.

The database is provided in the form of a single file, either in Excel or Stata format.

In the <b>Excel file</b>, you can filter or sort the <tt>Data</tt> sheet by year or by country of interest, and then use the data from the relevant columns. A description of the column abbreviations is provided in the <tt>Legend</tt> sheet.
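
The same filtering and sorting can be sketched in pandas. The example below uses a tiny stand-in table with made-up numbers rather than the actual file; the column names (`countrycode`, `year`, `rgdpe`, `pop`) are assumptions about the <tt>Data</tt> sheet and should be checked against the <tt>Legend</tt> sheet, and the file name in the comment is hypothetical.

```python
import pandas as pd

# Tiny stand-in for the Data sheet; all values are made up for illustration.
# With the downloaded file, one would instead load the sheet, for example:
# pwt = pd.read_excel("pwt.xlsx", sheet_name="Data")  # hypothetical file name
pwt = pd.DataFrame({
    "countrycode": ["USA", "USA", "DEU", "DEU"],
    "year": [2018, 2019, 2018, 2019],
    "rgdpe": [200.0, 210.0, 40.0, 42.0],  # placeholder GDP numbers
    "pop": [2.0, 2.0, 1.0, 1.0],          # placeholder population numbers
})

# Filter by country of interest
usa = pwt[pwt["countrycode"] == "USA"]

# Filter by year, then sort countries by GDP in descending order
y2019 = pwt[pwt["year"] == 2019].sort_values("rgdpe", ascending=False)

# Combine columns, e.g. GDP per capita
usa = usa.assign(gdp_per_capita=usa["rgdpe"] / usa["pop"])

print(usa)
print(y2019)
```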

![PWT spreadsheet 1](attachment:data-sources-PWT-1.png)

![PWT spreadsheet 2](attachment:data-sources-PWT-2.png)