# ECON 148 Lab #

In this lab, we'll learn how to use APIs or Application Programming Interfaces. APIs are super powerful tools that allow different computer programs to interact with each other. In the context of data science, APIs are often used to access data from different sources.

<center><img src="data/675px-Rest-API.png"/></center>

<center>[Image Source](https://www.seobility.net/en/wiki/REST_API)</center>

Using APIs allows you to easily access data and reduces the amount of time it takes to refresh data with updates. In this notebook, we'll explore data from the US Energy Information Administration.

## Part 1: Get Your API Key

In most cases, you will need an API key in order to access an API. Some API keys involve paperwork or payment, but the EIA provides *free* API keys [here](https://www.eia.gov/opendata/register.php). Once you have submitted your information, you will receive a confirmation email. Once you have confirmed, you will receive your key.

**Question 1.1**: Request an EIA API Key and paste it below.

In [None]:
my_api_key = "your API Key here"
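If you would rather not paste the key directly into a notebook you might share, one option is to read it from an environment variable. This is a minimal sketch, not part of the original lab: the variable name `EIA_API_KEY` and the helper `load_api_key` are hypothetical names chosen for illustration.

```python
import os

# Hypothetical variable name for this sketch; set it in your shell or
# notebook environment before starting Jupyter.
ENV_VAR_NAME = "EIA_API_KEY"


def load_api_key(env=os.environ, default="your API Key here"):
    """Return the key stored under ENV_VAR_NAME, or the placeholder if it is unset."""
    return env.get(ENV_VAR_NAME, default)


# Falls back to the placeholder string when the variable is not defined
my_api_key = load_api_key()
```

With this approach the key lives outside the notebook file, so the notebook can be shared without exposing it.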


## Part 2: Accessing the API

Let's first see how an API call works. In this notebook, we'll be using the `requests` library to access the EIA data. As part of this, we'll use the `get` method to pull the data. We'll also be using an API key, which is a unique identifier, much like a password, that allows you to access the data.


Let's start by importing the necessary libraries.

In [None]:
import requests
import os
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
plt.style.use('fivethirtyeight')
plt.rcParams["figure.figsize"] = (20,8)

First, we'll create a variable for the URL location of the data we want to pull. The EIA has a helpful [website](https://www.eia.gov/opendata/browser/) that allows you to select what types of data you would like to use and autogenerates a URL for you. In this example, we'll be looking at $CO_2$ or carbon dioxide emissions.

In [None]:
# The url that stores the EIA data
api_url = "https://api.eia.gov/v2/co2-emissions/co2-emissions-aggregates/data/"

# The API key param and your API key
api_key = "?api_key="+my_api_key

# Selects just California and pulls data values
api_data_pull = "&facets[stateId][]=CA&data[]=value"

# Makes a GET request to pull the data
response = requests.get(api_url+api_key+api_data_pull)

# The response from the API in JSON form
r = response.json()
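The URL assembled in the cell above is a base path followed by a query string of `name=value` pairs joined with `&`. As a rough sketch of that structure, the snippet below rebuilds the same string with Python's standard library and then splits it back apart. It makes no network call, and `YOUR_KEY_HERE` is a placeholder rather than a working key.

```python
from urllib.parse import urlencode, urlsplit, parse_qsl

base_url = "https://api.eia.gov/v2/co2-emissions/co2-emissions-aggregates/data/"

# Each tuple is one query parameter; "YOUR_KEY_HERE" is a placeholder.
query = [
    ("api_key", "YOUR_KEY_HERE"),
    ("facets[stateId][]", "CA"),  # keep only California
    ("data[]", "value"),          # ask for the data values
]

# safe="[]" leaves the square brackets unescaped so the result
# matches the string built by concatenation.
full_url = base_url + "?" + urlencode(query, safe="[]")
print(full_url)

# Splitting the URL back apart recovers the same parameters
recovered = parse_qsl(urlsplit(full_url).query)
print(recovered)
```

Writing the parameters as a list makes it easier to swap in a different state or add another facet without hunting for the right `&` in a long string.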

In [None]:
r["response"]["data"][0]

As you can see, the data from the API is not in the tabular/CSV format we are used to seeing. That is because the API returns it as JSON (JavaScript Object Notation), a format that stores data as nested key-value pairs, much like Python dictionaries and lists. Here, `r["response"]["data"]` is a list of dictionaries, one per row. To convert the JSON response into something that we can manipulate with `pandas`, we can use the `from_dict` method.

In [None]:
emissions = pd.DataFrame.from_dict(r["response"]["data"])

In [None]:
emissions.head()
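To see why a list of dictionaries maps so naturally onto a table, it can help to look at a tiny hand-written payload. The sketch below mimics the `response` → `data` nesting used above; the numbers in it are invented for illustration and are not real EIA figures.

```python
import json

# Hand-written text that imitates the nesting of the API response.
# The values are made up for this example.
raw_text = """
{
  "response": {
    "data": [
      {"period": 2000, "stateId": "CA", "fuel-name": "Coal", "value": 1.0},
      {"period": 2001, "stateId": "CA", "fuel-name": "Coal", "value": 2.0}
    ]
  }
}
"""

parsed = json.loads(raw_text)          # JSON text -> nested Python dicts and lists
records = parsed["response"]["data"]   # a list with one dictionary per row

# The keys of each record become column names; the values become the cells
columns = list(records[0].keys())
rows = [[record[col] for col in columns] for record in records]

print(columns)
for row in rows:
    print(row)
```

Each dictionary plays the role of one row, which is the same shape `pandas` expects when it builds a DataFrame from a list of records.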

Now that we have the data, let's explore it. Let's start by looking at carbon dioxide emissions in California over time.

We can start by selecting `Total carbon dioxide emissions from all sectors` (`sectorId == "TT"`) and `All Fuels`.

In [None]:
ca_all_emissions_all_fuels=emissions[(emissions["sectorId"]=="TT")&
                                     (emissions["fuel-name"]=="All Fuels")].sort_values("period")

In [None]:
ca_all_emissions_all_fuels.head()

Now let's plot these values over time. The `plt.plot` function draws a line plot using its first argument for the x-axis and its second argument for the y-axis. Axis labels are added with `plt.xlabel()` and `plt.ylabel()`. Finally, a title is added with `plt.title()`.

In [None]:
plt.plot(ca_all_emissions_all_fuels["period"],ca_all_emissions_all_fuels["value"])
plt.xlabel("Year")
plt.ylabel("Millions of Metric Tons of CO2")
plt.title("Emissions of $CO_2$ In California from All Sources and All Fuels")
plt.show();

**Question 2.1**: Create a dataframe called `ca_all_emissions_coal` and select data that has `sectorId == "TT"` and `fuel-name == "Coal"`.

*Hint: Look at the previous example if you need a guide.*

In [None]:
ca_all_emissions_coal=emissions[(emissions["sectorId"]=="TT")&
                                     (emissions["fuel-name"]=="Coal")].sort_values("period")

In [None]:
ca_all_emissions_coal.head()

**Question 2.2**: Create a line plot using `plt.plot()` with `period` on the x-axis and `value` on the y-axis using the `ca_all_emissions_coal` dataframe. Add in an x-axis label of `Year` and y-axis label of `Millions of Metric Tons of CO2`. Add an appropriate title.

In [None]:
plt.plot(ca_all_emissions_coal["period"], ca_all_emissions_coal["value"])
plt.xlabel("Year")
plt.ylabel("Millions of Metric Tons of CO2")
plt.title("Emissions of $CO_2$ In California for All Sources with Only Coal")
plt.show();