## Portland General Electric Demand and Temperature Analysis for 2021

### Sub analysis of demand forecast accuracy 

This is a quick analysis to identify whether Portland General Electric (PGE) hourly demand, its demand forecast, and daytime mean temperature are correlated. 

This analysis will also determine the accuracy of PGE's demand forecast. 

This notebook will also serve as a guide for creating API calls on the EIA website. 

Background: PGE is a major public utility which distributes electricity to 44% of Oregon's inhabitants (including customers in Multnomah County). 

The year 2021 was selected because Oregon experienced [record temperatures](https://www.opb.org/article/2022/02/10/oregons-2021-heat-dome-notches-another-record/) that year. Some parts of Oregon reached 119°F. 

The data for this analysis comes from the U.S. Energy Information Administration (https://www.eia.gov/opendata/). Data is made available via their public API. 

### Using EIA for energy data 
To use EIA for energy data, first, generate an API key using this link: 
https://www.eia.gov/opendata/register.php

Save the API key to a `.env` file in your project directory. 
The file should use the same format as a bash environment variable assignment: 

`EIA_API_KEY=your_key`

This will later be sourced by the notebook. 

### Creating GET requests for EIA data 

EIA has a helpful API browser with a GET request formulator. 

For this project, the following is used:

API ROUTE:  
- Electricity
- Electric Power Operations (Daily and Hourly) 
- Hourly Demand, Demand Forecast, Generation, And Interchange  

Frequency: 
- Hourly
- Start: January 1, 2021 
- End: December 31, 2021  

Filtered by:
- Balancing Authority / Region: (PGE) Portland General Electric Company

These filters generate an **API URL** of:  
`https://api.eia.gov/v2/electricity/rto/region-data/data/?frequency=hourly&data[0]=value&sort[0][column]=period&sort[0][direction]=desc&offset=0&length=5000`

The API key must be added to every EIA URL. Add it immediately after the `?` symbol, which separates the URL path from the query parameters. 

It becomes:  
`https://api.eia.gov/v2/electricity/rto/region-data/data/?api_key={EIA_API_KEY}&frequency=hourly&data[0]=value&sort[0][column]=period&sort[0][direction]=desc&offset=0&length=5000`
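The same URL can also be assembled from a dictionary of parameters rather than by editing the string by hand. The following is a minimal sketch using only the standard library; `build_eia_url` is a hypothetical helper name, and `safe="[]"` is passed so that the square brackets stay readable instead of being percent-encoded.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.eia.gov/v2/electricity/rto/region-data/data/"

def build_eia_url(api_key):
    """Return the hourly region-data URL with the API key as the first query parameter."""
    params = {
        "api_key": api_key,
        "frequency": "hourly",
        "data[0]": "value",
        "sort[0][column]": "period",
        "sort[0][direction]": "desc",
        "offset": 0,
        "length": 5000,
    }
    # safe="[]" leaves the brackets unescaped so the result matches the URL shown above
    return BASE_URL + "?" + urlencode(params, safe="[]")

url = build_eia_url("your_key")
print(url)
```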


## Definitions from Hourly Electric Grid Monitor (EIA-930 data)
"Form EIA-930 data collection provides a centralized and comprehensive source for hourly operating data about the high-voltage bulk electric power grid in the Lower 48 states."
You can read more about the Form EIA-930 data collection [here](https://www.eia.gov/electricity/gridmonitor/about). 


"Balancing Authorities (BAs)... are mainly responsible for balancing electricity supply, demand, and interchange on their electric systems in real time."

"**Demand** is a calculated value representing the amount of electricity load within a BA's electric system. A BA derives its demand value by taking the total metered net electricity generation within its electric system and subtracting the total metered net electricity interchange occurring between the BA and its neighboring BAs."

"**Demand forecast:** Each BA produces a day-ahead electricity demand forecast for every hour of the next day. These forecasts help BAs plan for and coordinate the reliable operation of their electric system."

"**Net generation and net generation by energy source:** Net generation represents the metered output of electric generating units in a BA's electric system. This generation only includes generating units that are managed by a BA or whose operations are visible to a BA."




Source: https://www.eia.gov/electricity/gridmonitor/about
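The demand definition quoted above can be checked with simple arithmetic. The sketch below uses the PGE values for 2021-01-31T23 and 2021-01-31T22 from the `pge_all_data.head(10)` output in this notebook; `demand_from_generation` is a hypothetical helper name. In these rows total interchange is negative while demand exceeds net generation, which is consistent with negative interchange representing net inflow.

```python
def demand_from_generation(net_generation_mwh, total_interchange_mwh):
    """Demand = metered net generation minus metered net interchange."""
    return net_generation_mwh - total_interchange_mwh

# (net generation, total interchange, reported demand) in megawatthours
rows = [
    (669.0, -1870.0, 2539.0),  # 2021-01-31T23
    (566.0, -2016.0, 2582.0),  # 2021-01-31T22
]

for net_generation, interchange, reported_demand in rows:
    derived = demand_from_generation(net_generation, interchange)
    print(derived, derived == reported_demand)
```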

In [76]:
import requests
import json
import pandas as pd
import numpy as np 
import os 
from dotenv import dotenv_values #reads key-value pairs from a .env file and can set them as environment variables

In [None]:
# Load the variables defined in the .env file into the notebook's environment
# -o overrides variables that are already set; -v prints verbose output
%load_ext dotenv
%dotenv -o -v

# Get the EIA_API_KEY from the environment 
EIA_API_KEY = os.environ.get("EIA_API_KEY")
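`os.environ.get` returns `None` when the variable is missing, in which case the request URL would contain `api_key=None`. A small guard can fail early instead. This is a sketch only; `require_env` is a hypothetical helper and not part of `python-dotenv`.

```python
import os

def require_env(name, environ=None):
    """Return the value of an environment variable, or raise if it is unset or empty."""
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; check that the .env file was loaded")
    return value

# Demonstration with a plain dict standing in for os.environ
print(require_env("EIA_API_KEY", {"EIA_API_KEY": "your_key"}))
```

In the notebook this could be called as `require_env("EIA_API_KEY")` after the `%dotenv` magic has run.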

### EIA API Call 
Below, the API call is made to get PGE's hourly demand and demand forecast data. 

#### Note on limits and pagination 
"EIA's API limits its data returns to the first 5,000 rows responsive to the request."
[source](https://www.eia.gov/opendata/documentation.php)

Therefore, 12 calls will be made, one for each month, which keeps every response under the 5,000-row limit. The 12 results will then be concatenated into one pandas dataframe. 
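A quick way to confirm that one call per month stays under the limit is to count the expected rows. The sketch below assumes four series per hour (demand, day-ahead demand forecast, net generation, and total interchange) and derives each month's last day with the standard `calendar` module; `month_windows` is an illustrative name.

```python
import calendar

YEAR = 2021
SERIES_PER_HOUR = 4  # assumption: D, DF, NG and TI are each reported once per hour
ROW_LIMIT = 5000     # EIA returns at most this many rows per request

month_windows = []
for month in range(1, 13):
    # monthrange returns (weekday of the first day, number of days in the month)
    last_day = calendar.monthrange(YEAR, month)[1]
    expected_rows = last_day * 24 * SERIES_PER_HOUR
    month_windows.append((f"{month:02d}", f"{last_day:02d}", expected_rows))

largest = max(rows for _, _, rows in month_windows)
print(largest, largest <= ROW_LIMIT)
```

The first two elements of each tuple have the same `('01', '31')` shape as a hard-coded list of month and last-day strings, so they could be used to build the `start` and `end` parameters.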

In [150]:
# (month number, last day of month) pairs as strings
months = [('01','31'),('02','28'),('03','31'),('04','30'),('05','31'),('06','30'),('07','31'),('08','31'),('09','30'),('10','31'),('11','30'),('12','31')]
# empty list to collect the monthly dataframes before concatenating them
dfs = []

for month in months: 
    eia_url = ('https://api.eia.gov/v2/electricity/rto/region-data/data/?' +
          f'api_key={EIA_API_KEY}&' +
          'frequency=hourly&' +
          'data[0]=value&' +
          'facets[respondent][]=PGE&' +
          f'start=2021-{month[0]}-01T00&end=2021-{month[0]}-{month[1]}T23&' +
          'sort[0][column]=period&sort[0][direction]=desc&offset=0&length=5000')
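    response = requests.get(eia_url)
    response.raise_for_status()  # stop early on an HTTP error (e.g. a rejected API key)
    eia_json = response.json()
    # the rows are nested under response -> data in the returned JSON
    month_dataframe = pd.DataFrame.from_dict(eia_json['response']['data'])
    dfs.append(month_dataframe)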
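An alternative to splitting the year by month is to page through the results with the `offset` and `length` parameters that already appear in the URL. The following is a sketch of that loop, not the approach used in this notebook. `fetch_all_rows` and `fetch_page` are hypothetical names, and the request itself is replaced by a stand-in function so the example runs without network access.

```python
def fetch_all_rows(fetch_page, page_size=5000):
    """Collect rows page by page until a page comes back shorter than page_size."""
    rows = []
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        rows.extend(page)
        if len(page) < page_size:
            break  # a short (or empty) page means there is nothing left to fetch
        offset += page_size
    return rows

# Stand-in for a real request: serves slices of an in-memory list
fake_rows = list(range(12))

def fake_fetch(offset, length):
    return fake_rows[offset:offset + length]

all_rows = fetch_all_rows(fake_fetch, page_size=5)
print(len(all_rows))
```

In real use, `fetch_page` would issue the GET request with the given `offset` and `length` and return the list under `['response']['data']` in the JSON.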


In [161]:
pge_all_data = pd.concat(dfs)

In [162]:
pge_all_data.shape

(35039, 7)

In [166]:
pge_all_data.head(10)

| | period | respondent | respondent-name | type | type-name | value | value-units |
|---|---|---|---|---|---|---|---|
| 0 | 2021-01-31T23 | PGE | Portland General Electric Company | D | Demand | 2539.0 | megawatthours |
| 1 | 2021-01-31T23 | PGE | Portland General Electric Company | TI | Total interchange | -1870.0 | megawatthours |
| 2 | 2021-01-31T23 | PGE | Portland General Electric Company | NG | Net generation | 669.0 | megawatthours |
| 3 | 2021-01-31T23 | PGE | Portland General Electric Company | DF | Day-ahead demand forecast | 2536.0 | megawatthours |
| 4 | 2021-01-31T22 | PGE | Portland General Electric Company | D | Demand | 2582.0 | megawatthours |
| 5 | 2021-01-31T22 | PGE | Portland General Electric Company | TI | Total interchange | -2016.0 | megawatthours |
| 6 | 2021-01-31T22 | PGE | Portland General Electric Company | NG | Net generation | 566.0 | megawatthours |
| 7 | 2021-01-31T22 | PGE | Portland General Electric Company | DF | Day-ahead demand forecast | 2566.0 | megawatthours |
| 8 | 2021-01-31T21 | PGE | Portland General Electric Company | D | Demand | 2610.0 | megawatthours |
| 9 | 2021-01-31T21 | PGE | Portland General Electric Company | TI | Total interchange | -2072.0 | megawatthours |


### Filter data by type-name
Now, split the dataframe into one dataframe per data type: 

In [172]:
demand = pge_all_data[pge_all_data['type-name'] == 'Demand'] 
generation = pge_all_data[pge_all_data['type-name'] == 'Net generation'] 
forecast = pge_all_data[pge_all_data['type-name'] == 'Day-ahead demand forecast'] 
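To compare demand with its forecast hour by hour, the long table can also be pivoted so that each type becomes a column. The sketch below uses two hours of values copied from the `head(10)` output above. The column names `error` and `abs_pct_error`, and the choice of mean absolute percentage error as the accuracy measure, are assumptions made for illustration.

```python
import pandas as pd

# Two hours of demand (D) and day-ahead forecast (DF) values from the output above
sample = pd.DataFrame(
    {
        "period": ["2021-01-31T23", "2021-01-31T23", "2021-01-31T22", "2021-01-31T22"],
        "type": ["D", "DF", "D", "DF"],
        "value": [2539.0, 2536.0, 2582.0, 2566.0],
    }
)

# One row per hour, one column per type code
wide = sample.pivot(index="period", columns="type", values="value")

# Forecast error in megawatthours and as a percentage of actual demand
wide["error"] = wide["DF"] - wide["D"]
wide["abs_pct_error"] = wide["error"].abs() / wide["D"] * 100

mape = wide["abs_pct_error"].mean()
print(wide)
print(mape)
```

`pivot` raises an error if any (period, type) pair appears more than once, so the full dataset may need duplicate rows removed first.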