# Overview
We want to estimate the excess electricity production capacity of every U.S. state during a specific month. The goal is to gauge how well each state is positioned to serve excess demand during extreme weather events. For example, a state's excess capacity in July might indicate how well it can handle demand spikes during heat waves. Similarly, its excess capacity in January might indicate how well it can handle cold waves.

# Links
  1. [EIA - Statewise generation capacity](https://www.eia.gov/opendata/browser/electricity/state-electricity-profiles/capability?frequency=annual&data=capability;&start=2022&end=2022&sortColumn=period;&sortDirection=desc;)
  2. [EIA - Regional, statewise electricity consumption](https://www.eia.gov/opendata/browser/electricity/retail-sales?frequency=monthly&data=sales;&start=2024-01&end=2024-01&sortColumn=period;&sortDirection=desc;)

In [4]:
import pandas as pd
import requests
import json

In [19]:
# Obtain the power production capability data from EIA database
url = "https://api.eia.gov/v2/electricity/state-electricity-profiles/capability/data/?frequency=annual&data[0]=capability&start=2022&end=2022&sort[0][column]=period&sort[0][direction]=desc&offset=0&length=5000"
username  = "io61vpxQKgT3vR8pbLybFN6E63l5rbMlgds5t0nF"
# parameters = {
#                 "frequency": "annual",
#                 "data": [
#                     "capability"
#                 ],
#                 "facets": {},
#                 "start": "2022",
#                 "end": "2022",
#                 "sort": [
#                     {
#                         "column": "period",
#                         "direction": "desc"
#                     }
#                 ],
#                 "offset": 0,
#                 "length": 5000
#             }

resp = requests.get(url, auth=(username,""))
resp.raise_for_status()  # stop early on an HTTP error instead of failing while parsing
raw_data = pd.DataFrame(resp.json()['response']['data'])
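
The commented-out `parameters` dictionary above hints at building the query from structured parameters rather than a hand-written URL. A minimal sketch of that idea, using only the standard library, might look like the following. The helper name `build_eia_url` is hypothetical, and the flattened bracket keys simply mirror the query string already used in the cell above.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.eia.gov/v2/electricity/state-electricity-profiles/capability/data/"


def build_eia_url(base_url, params):
    """Hypothetical helper: append params to base_url as a query string.

    Square brackets are left unescaped so the result reads like the
    hand-written URL; other characters are percent-encoded as usual.
    """
    return base_url + "?" + urlencode(params, safe="[]")


# Same settings as the hard-coded URL, one key per query parameter.
capability_params = {
    "frequency": "annual",
    "data[0]": "capability",
    "start": "2022",
    "end": "2022",
    "sort[0][column]": "period",
    "sort[0][direction]": "desc",
    "offset": 0,
    "length": 5000,
}

capability_url = build_eia_url(BASE_URL, capability_params)
print(capability_url)
```

Keeping the settings in a dictionary makes it easier to change the year or month in one place when the same request is reused for another period.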

In [30]:
# Convert capability to float and explore the capabilities data
raw_data['capability'] = raw_data['capability'].astype('float')
print("Sample capabilities data")
print(raw_data.head())
print("Unique state values")
print(raw_data['stateDescription'].unique())
print("No. of unique state values: ", len(raw_data['stateDescription'].unique()))
print('Sanity check: make sure the units are not mixed')
print(raw_data['capability-units'].unique())



Sample raw data
  period stateId stateDescription producertypeid      producerTypeDescription  \
0   2022      MS      Mississippi            IPP  Independent Power Producers   
1   2022      MS      Mississippi            IPP  Independent Power Producers   
2   2022      MS      Mississippi            IPP  Independent Power Producers   
3   2022      MS      Mississippi            TOT                  All sectors   
4   2022      MS      Mississippi            TOT                  All sectors   

  energysourceid energySourceDescription  capability capability-units  
0            SOL                   Solar       219.3        megawatts  
1          SOLPV              Solar - PV       219.3        megawatts  
2            WOO                    Wood       300.8        megawatts  
3            ALL                     All     14723.5        megawatts  
4            COL                    Coal      1444.0        megawatts  
Unique state values
['Mississippi' 'Montana' 'North Carolina' 'No

In [34]:
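# Sum capability for each state, then drop the "United States" and "District of Columbia" rows.
# Caveat: this sum spans every producer type and energy source, including the aggregate
# rows (producertypeid "TOT", energysourceid "ALL") seen in the sample above, so the
# per-state totals overcount actual capability.
capability_df = raw_data.groupby('stateDescription')[['capability']].sum()
capability_df.drop(['United States', 'District of Columbia'], inplace=True)
capability_df['capability-units'] = raw_data.iloc[0]['capability-units']
print(capability_df.head())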

                  capability capability-units
stateDescription                             
Alabama             145267.4        megawatts
Alaska               15310.4        megawatts
Arizona             148095.0        megawatts
Arkansas             72772.4        megawatts
California          454375.6        megawatts
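
The sample output earlier shows that each state has rows for individual producer types and energy sources alongside an "All sectors" / "All" row (14723.5 MW for Mississippi). Summing every row therefore counts the same capability more than once. A minimal sketch of one way to avoid that, assuming the `TOT` / `ALL` row is the state-wide total, is shown below on the five Mississippi rows copied from that sample.

```python
import pandas as pd

# Rows copied from the Mississippi sample printed earlier (capability in megawatts).
sample = pd.DataFrame(
    {
        "stateDescription": ["Mississippi"] * 5,
        "producertypeid": ["IPP", "IPP", "IPP", "TOT", "TOT"],
        "energysourceid": ["SOL", "SOLPV", "WOO", "ALL", "COL"],
        "capability": [219.3, 219.3, 300.8, 14723.5, 1444.0],
    }
)

# Summing every row mixes the state-wide total with its own components.
naive_total = sample.groupby("stateDescription")["capability"].sum()

# Assumption: the TOT/ALL row is the state-wide total, so keep only that row.
is_state_total = (sample["producertypeid"] == "TOT") & (sample["energysourceid"] == "ALL")
state_total = sample[is_state_total].groupby("stateDescription")["capability"].sum()

print("Sum over all rows:", naive_total["Mississippi"])
print("TOT/ALL row only: ", state_total["Mississippi"])
```

Applied to the full `raw_data` frame, the same boolean mask would go before the `groupby` call.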


In [41]:
# Get the monthly consumption data from EIA
url = "https://api.eia.gov/v2/electricity/retail-sales/data/?frequency=monthly&data[0]=sales&start=2024-01&end=2024-01&sort[0][column]=period&sort[0][direction]=desc&offset=0&length=5000"
resp = requests.get(url, auth=(username,""))
resp.raise_for_status()  # stop early on an HTTP error instead of failing while parsing
raw_data = pd.DataFrame(resp.json()['response']['data'])


In [42]:

# Fix data types and explore the consumption data
raw_data['sales'] = raw_data['sales'].astype('float')
print("Sample consumption data")
print(raw_data.head())
print("Unique state values")
print(raw_data['stateDescription'].unique())
print("No. of unique state values: ", len(raw_data['stateDescription'].unique()))
print('Sanity check: make sure the units are not mixed')
print(raw_data['sales-units'].unique())

Sample consumption data
    period stateid    stateDescription sectorid      sectorName        sales  \
0  2024-01    PACC  Pacific Contiguous      COM      commercial  13598.56698   
1  2024-01    PACC  Pacific Contiguous      IND      industrial   6217.68149   
2  2024-01    PACC  Pacific Contiguous      OTH           other          NaN   
3  2024-01    PACC  Pacific Contiguous      RES     residential  14495.83095   
4  2024-01    PACC  Pacific Contiguous      TRA  transportation     61.33932   

             sales-units  
0  million kilowatthours  
1  million kilowatthours  
2  million kilowatthours  
3  million kilowatthours  
4  million kilowatthours  
Unique state values
['Pacific Contiguous' 'Pacific Noncontiguous' 'U.S. Total'
 'South Atlantic' 'East South Central' 'West South Central' 'Mountain'
 'Maine' 'Maryland' 'Massachusetts' 'Michigan' 'Minnesota' 'Mississippi'
 'Missouri' 'Montana' 'Nebraska' 'Nevada' 'New Hampshire' 'New Jersey'
 'New Mexico' 'New York' 'North Carolin
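
The consumption sample has one row per sector for each region or state, and some entries (the `OTH` row) are `NaN`. Any per-state total therefore depends on which sector rows are included. The sketch below uses made-up numbers for a hypothetical state and assumes the response may also carry an all-sector row, labeled `ALL` here; that label is an assumption and should be checked against `raw_data['sectorid'].unique()`.

```python
import pandas as pd

# Made-up values for one hypothetical state, in million kilowatthours.
toy = pd.DataFrame(
    {
        "stateDescription": ["Examplia"] * 6,
        "sectorid": ["ALL", "COM", "IND", "OTH", "RES", "TRA"],
        "sales": [100.0, 40.0, 25.0, float("nan"), 34.0, 1.0],
    }
)

# Summing all rows counts the all-sector row on top of the individual sectors.
naive = toy.groupby("stateDescription")["sales"].sum()

# Exclude the assumed "ALL" row; sum() skips NaN entries by default.
by_sector = toy[toy["sectorid"] != "ALL"].groupby("stateDescription")["sales"].sum()

print("All rows:        ", naive["Examplia"])
print("Sector rows only:", by_sector["Examplia"])
```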

In [50]:
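# Keep only rows whose stateDescription is one of the 50 states in capability_df;
# this drops regional aggregates such as "Pacific Contiguous" and "U.S. Total".
states_only_consumption_df = raw_data[raw_data['stateDescription'].isin(capability_df.index)]
# Sum sales per state. Note: the sum spans every sectorid in the response, so an
# all-sector total row, if present, is counted on top of the individual sectors.
monthly_consumption_df = states_only_consumption_df.groupby('stateDescription')[['sales']].sum()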
print("Monthly consumption data sample:")
print(monthly_consumption_df.head())
print("Sanity check: Ensure there are 50 and only 50 states in the monthly consumption dataframe:")
print("No. of states for which consumption data is available: ", len(monthly_consumption_df.index))

Monthly consumption data sample:
                        sales
stateDescription             
Alabama           16039.63043
Alaska             1176.41614
Arizona           12652.16965
Arkansas           9197.26295
California        40088.03973
Sanity check: Ensure there are 50 and only 50 states in the monthly consumption dataframe:
No. of states for which consumption data is available:  50
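
Capability is reported in megawatts while sales are reported in million kilowatthours, so the two cannot be subtracted directly. One possible way to compare them, sketched below, converts capability into the energy a state could deliver by running at full capability for every hour of the month. This is an assumption-laden upper bound rather than the notebook's established method: real plants do not run at full output continuously, and the capability data here is for 2022 while the sales data is for January 2024. The function name `monthly_headroom_gwh` is hypothetical.

```python
import calendar


def monthly_headroom_gwh(capability_mw, sales_million_kwh, year, month):
    """Full-capability energy for the month minus retail sales, in GWh."""
    days_in_month = calendar.monthrange(year, month)[1]
    hours_in_month = 24 * days_in_month
    # MW x hours = MWh; 1,000 MWh = 1 GWh = 1 million kWh.
    potential_gwh = capability_mw * hours_in_month / 1000.0
    return potential_gwh - sales_million_kwh


# 14723.5 MW is the Mississippi "All sectors" figure from the sample output;
# the sales value of 4000.0 is a placeholder, not a number from the data.
headroom = monthly_headroom_gwh(14723.5, 4000.0, 2024, 1)
print(round(headroom, 3))
```

Dividing the result by the full-capability energy would give a relative margin, which may be easier to compare across states of different sizes.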


