# Energy Consumption

In [None]:
import requests
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import re

## Importing EIA data in physical units

The cells below download EIA (eia.gov) state energy consumption data in physical units. The residential indicators of interest, with their units, are:

* **CLRCP** = Coal consumed by the residential sector - thousand short tons
* **DFRCP** = Distillate fuel oil consumed by the residential sector - thousand barrels
* **ESRCP** = Electricity consumed by (i.e., sold to) the residential sector - million kilowatthours
* **KSRCP** = Kerosene consumed by the residential sector - thousand barrels
* **LGRCP** = LPG consumed by the residential sector - thousand barrels
* **NGRCP** = Natural gas consumed by (delivered to) the residential sector (including supplemental gaseous fuels) - million cubic feet
* **PARCP** = All petroleum products consumed by the residential sector - thousand barrels
* **WDRCP** = Wood consumed by the residential sector - thousand cords

The following indicator is not specific to the residential sector, but may be interesting to look at.

* **HYTCP** = Hydroelectricity, total net generation - million kilowatthours
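
As a minimal sketch, the residential indicators listed above could be selected from a long-format table with `Series.isin`. The frame `toy` and its readings are placeholders invented for illustration, not EIA figures, and `residential_codes` is a hypothetical name.

```python
import pandas as pd

# Toy long-format table; the readings are placeholders, not EIA figures.
toy = pd.DataFrame({
    'State': ['AK', 'AK', 'AK', 'AL'],
    'MSN': ['ABICP', 'CLRCP', 'ESRCP', 'CLRCP'],
    'Year': ['1970', '1970', '1970', '1970'],
    'Reading': [0.0, 1.0, 2.0, 3.0],
})

# Residential indicator codes taken from the list above.
residential_codes = ['CLRCP', 'DFRCP', 'ESRCP', 'KSRCP',
                     'LGRCP', 'NGRCP', 'PARCP', 'WDRCP']

# Keep only the rows whose MSN code is one of the listed indicators.
residential = toy[toy['MSN'].isin(residential_codes)].reset_index(drop=True)
print(residential)
```

Filtering on the code itself keeps the selection independent of where the rows happen to sit in the file.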

In [None]:
!mkdir data

!curl https://www.eia.gov/state/seds/sep_update/use_all_phy_update.csv -o ./data/consumption_phy.csv
!curl https://www.eia.gov/state/seds/sep_use/total/csv/use_all_btu.csv -o ./data/consumption_btu.csv

mkdir: data: File exists
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 2725k  100 2725k    0     0  2725k      0  0:00:01  0:00:01 --:--:-- 2294k
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 2643k  100 2643k    0     0  2643k      0  0:00:01  0:00:01 --:--:-- 1544k


In [None]:
# Imports EIA energy consumption data and deletes unnecessary columns and rows.
consumption_phy = pd.read_csv('./data/consumption_phy.csv')
consumption_phy = consumption_phy.drop(['Data_Status', '1960', '1961', '1962', '1963', '1964', '1965', '1966', '1967', '1968', '1969'], axis=1)
consumption_phy = consumption_phy.drop(consumption_phy.index[910:1040])
consumption_phy = consumption_phy.reset_index()
consumption_phy = consumption_phy.drop(consumption_phy.index[5590:5723])
consumption_phy = consumption_phy.drop('index', axis=1)

# Melts the data so that the years are in a single column.
consumption_phy = pd.melt(frame=consumption_phy, id_vars=['State', 'MSN'], var_name='Year', value_name='Reading')

In [None]:
consumption_phy.head()

,State,MSN,Year,Reading
0,AK,ABICP,1970,0.0
1,AK,ARICP,1970,274.0
2,AK,ARTCP,1970,274.0
3,AK,ARTXP,1970,274.0
4,AK,AVACP,1970,462.0
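
The import cell removes rows by hard-coded positions, which depends on the exact row order of the downloaded file. A label-based filter is one possible alternative. The sketch below assumes the unwanted rows are the `DC` and `US` entries, which the notebook does not state, and uses placeholder values rather than EIA figures.

```python
import pandas as pd

# Toy wide-format table shaped like the EIA file; values are placeholders.
wide = pd.DataFrame({
    'State': ['AK', 'DC', 'US', 'AL'],
    'MSN': ['CLRCP', 'CLRCP', 'CLRCP', 'CLRCP'],
    '1970': [1.0, 2.0, 3.0, 4.0],
    '1971': [5.0, 6.0, 7.0, 8.0],
})

# Assumption: the rows to exclude are identified by these State labels.
excluded = ['DC', 'US']
wide = wide[~wide['State'].isin(excluded)]

# Reshape so that each (State, MSN, Year) combination becomes one row.
long = pd.melt(wide, id_vars=['State', 'MSN'], var_name='Year', value_name='Reading')
print(long)
```

Note that the melted `Year` values are strings, because they come from the column headers.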


## Importing EIA data in BTU

The cells below import EIA (eia.gov) state energy consumption data in BTU. The residential indicators of interest are:

* **GERCB** = Geothermal energy consumed by the residential sector.
* **LORCB** = The residential sector's share of electrical system energy losses.
* **SFRCB** = Supplemental gaseous fuels consumed by the residential sector.
* **SORCB** = Solar energy consumed by the residential sector.
* **TERCB** = Total energy consumed by the residential sector.
* **TERPB** = Total energy consumption per capita in the residential sector.
* **TNRCB** = Total energy consumed by the residential sector excluding the sector's share of electrical system energy losses.


The following indicators are not specific to the residential sector, but may be interesting to look at.

* **FFTCB** = Fossil fuels, total consumption.
* **HYTXB** = Hydropower, total end-use consumption.
* **NUETB** = Nuclear energy consumed for electricity generation, total.
* **RETCB** = Renewable energy total consumption.
* **TETCB** = Total energy consumption.
* **WYEGB** = Wind energy consumed for electricity generation by the electric power sector.
* **WYTXB** = Wind energy, total end-use consumption.

In [None]:
# Imports EIA energy consumption data and deletes unnecessary columns and rows.
consumption_btu = pd.read_csv('./data/consumption_btu.csv')
consumption_btu = consumption_btu.drop(['Data_Status', '1960', '1961', '1962', '1963', '1964', '1965', '1966', '1967', '1968', '1969'], axis=1)
consumption_btu = consumption_btu.drop(consumption_btu.index[1337:1528])
consumption_btu = consumption_btu.reset_index()
consumption_btu = consumption_btu.drop(consumption_btu.index[8213:8407])
consumption_btu = consumption_btu.drop('index', axis=1)

# Melts the data so that the years are in a single column.
consumption_btu = pd.melt(frame=consumption_btu, id_vars=['State', 'MSN'], var_name='Year', value_name='Reading')

In [None]:
consumption_btu.head()

,State,MSN,Year,Reading
0,AK,ABICB,1970,0.0
1,AK,ARICB,1970,1817.0
2,AK,ARTCB,1970,1817.0
3,AK,ARTXB,1970,1817.0
4,AK,AVACB,1970,2331.0


## Importing the state populations data

The following imports the state population data, which were collected and cleaned in another notebook.

In [None]:
# state_dict used for remapping the state names in us_state_pop to state abbreviations.
state_dict = {'Alaska': 'AK',
'Alabama': 'AL',
'Arkansas': 'AR',
'Arizona': 'AZ',
'California': 'CA',
'Colorado': 'CO',
'Connecticut': 'CT',
'Delaware': 'DE',
'Florida': 'FL',
'Georgia': 'GA',
'Hawaii': 'HI',
'Iowa': 'IA',
'Idaho': 'ID',
'Illinois': 'IL',
'Indiana': 'IN',
'Kansas': 'KS',
'Kentucky': 'KY',
'Louisiana': 'LA',
'Massachusetts': 'MA',
'Maryland': 'MD',
'Maine': 'ME',
'Michigan': 'MI',
'Minnesota': 'MN',
'Missouri': 'MO',
'Mississippi': 'MS',
'Montana': 'MT',
'North Carolina': 'NC',
'North Dakota': 'ND',
'Nebraska': 'NE',
'New Hampshire': 'NH',
'New Jersey': 'NJ',
'New Mexico': 'NM',
'Nevada': 'NV',
'New York': 'NY',
'Ohio': 'OH',
'Oklahoma': 'OK',
'Oregon': 'OR',
'Pennsylvania': 'PA',
'Rhode Island': 'RI',
'South Carolina': 'SC',
'South Dakota': 'SD',
'Tennessee': 'TN',
'Texas': 'TX',
'Utah': 'UT',
'Virginia': 'VA',
'Vermont': 'VT',
'Washington': 'WA',
'Wisconsin': 'WI',
'West Virginia': 'WV',
'Wyoming': 'WY'}

us_state_pop = pd.read_csv('./data/US State Populations (1970-2016).csv')
us_state_pop = us_state_pop.drop('Unnamed: 0', axis=1)
us_state_pop = us_state_pop.replace(to_replace=state_dict)
us_state_pop = us_state_pop.set_index(['State', 'Year'])
us_state_pop = us_state_pop.sort_index()
us_state_pop = us_state_pop.reset_index()

In [None]:
us_state_pop.head()

,State,Year,Population
0,AK,1970,302583
1,AK,1971,316000
2,AK,1972,326000
3,AK,1973,333000
4,AK,1974,345000
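
A quick check that every state name was remapped can catch a mistyped dictionary entry. The sketch below uses `Series.map`, which yields `NaN` for names missing from the mapping. `name_to_abbr` is a shortened, hypothetical stand-in for `state_dict`, and every population value other than Alaska's 1970 figure is a placeholder.

```python
import pandas as pd

# Shortened, hypothetical mapping; the notebook's state_dict covers all 50 states.
name_to_abbr = {'Alaska': 'AK', 'Alabama': 'AL', 'New York': 'NY'}

# Toy population table; 'Puerto Rico' is deliberately absent from the mapping.
pop = pd.DataFrame({
    'State': ['Alaska', 'New York', 'Puerto Rico'],
    'Year': [1970, 1970, 1970],
    'Population': [302583, 1, 2],  # 1 and 2 are placeholders
})

# Series.map returns NaN for names missing from the mapping,
# which makes unmapped rows easy to detect.
mapped = pop['State'].map(name_to_abbr)
unmapped = sorted(pop.loc[mapped.isna(), 'State'].unique())
print(unmapped)
```

An empty `unmapped` list would indicate that every name in the table has an abbreviation.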


## Calculating per capita consumption

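The following calculates per capita consumption for each indicator by dividing a state's reading in a given year by that state's population in the same year. The population column is assigned by row position, so the consumption and population tables must contain the same states and years in the same sorted order.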
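
As a worked instance of that division, the sketch below uses the Alaska 1970 figures that appear in this notebook's own tables (the LORCB reading and the state population). The variable names are only illustrative.

```python
# Figures copied from the notebook's own tables for Alaska, 1970.
reading = 7074.0        # LORCB reading
population = 302583     # state population

# Per capita value = reading divided by population for the same state and year.
per_capita = reading / population
print(round(per_capita, 6))
```

The result is expressed in the indicator's own unit per person, so values are comparable only within a single indicator.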

In [None]:
# from consumption_phy:

# CLRCP = coal consumed by the residential sector - thousand short tons
CLRCP = consumption_phy.set_index('MSN')
CLRCP = CLRCP.loc['CLRCP']
CLRCP = CLRCP.reset_index()
CLRCP = CLRCP.set_index(['State', 'Year'])
CLRCP = CLRCP.sort_index()
CLRCP = CLRCP.reset_index()
CLRCP['Population'] = us_state_pop['Population']
CLRCP['Per capita consumption'] = CLRCP['Reading']/CLRCP['Population']
CLRCP = CLRCP[['State', 'Year', 'Population', 'MSN', 'Reading', 'Per capita consumption']]
CLRCP = CLRCP.set_index(['State', 'Year'])

# DFRCP = distillate fuel oil consumed by the residential sector - thousand barrels
DFRCP = consumption_phy.set_index('MSN')
DFRCP = DFRCP.loc['DFRCP']
DFRCP = DFRCP.reset_index()
DFRCP = DFRCP.set_index(['State', 'Year'])
DFRCP = DFRCP.sort_index()
DFRCP = DFRCP.reset_index()
DFRCP['Population'] = us_state_pop['Population']
DFRCP['Per capita consumption'] = DFRCP['Reading']/DFRCP['Population']
DFRCP = DFRCP[['State', 'Year', 'Population', 'MSN', 'Reading', 'Per capita consumption']]
DFRCP = DFRCP.set_index(['State', 'Year'])

# ESRCP = electricity consumed by (i.e., sold to) the residential sector - million kilowatthours
ESRCP = consumption_phy.set_index('MSN')
ESRCP = ESRCP.loc['ESRCP']
ESRCP = ESRCP.reset_index()
ESRCP = ESRCP.set_index(['State', 'Year'])
ESRCP = ESRCP.sort_index()
ESRCP = ESRCP.reset_index()
ESRCP['Population'] = us_state_pop['Population']
ESRCP['Per capita consumption'] = ESRCP['Reading']/ESRCP['Population']
ESRCP = ESRCP[['State', 'Year', 'Population', 'MSN', 'Reading', 'Per capita consumption']]
ESRCP = ESRCP.set_index(['State', 'Year'])

# KSRCP = kerosene consumed by the residential sector - thousand barrels
KSRCP = consumption_phy.set_index('MSN')
KSRCP = KSRCP.loc['KSRCP']
KSRCP = KSRCP.reset_index()
KSRCP = KSRCP.set_index(['State', 'Year'])
KSRCP = KSRCP.sort_index()
KSRCP = KSRCP.reset_index()
KSRCP['Population'] = us_state_pop['Population']
KSRCP['Per capita consumption'] = KSRCP['Reading']/KSRCP['Population']
KSRCP = KSRCP[['State', 'Year', 'Population', 'MSN', 'Reading', 'Per capita consumption']]
KSRCP = KSRCP.set_index(['State', 'Year'])

# LGRCP = LPG consumed by the residential sector - thousand barrels
LGRCP = consumption_phy.set_index('MSN')
LGRCP = LGRCP.loc['LGRCP']
LGRCP = LGRCP.reset_index()
LGRCP = LGRCP.set_index(['State', 'Year'])
LGRCP = LGRCP.sort_index()
LGRCP = LGRCP.reset_index()
LGRCP['Population'] = us_state_pop['Population']
LGRCP['Per capita consumption'] = LGRCP['Reading']/LGRCP['Population']
LGRCP = LGRCP[['State', 'Year', 'Population', 'MSN', 'Reading', 'Per capita consumption']]
LGRCP = LGRCP.set_index(['State', 'Year'])

# NGRCP = natural gas consumed by the residential sector (including supplemental gaseous fuel) - million cubic feet
NGRCP = consumption_phy.set_index('MSN')
NGRCP = NGRCP.loc['NGRCP']
NGRCP = NGRCP.reset_index()
NGRCP = NGRCP.set_index(['State', 'Year'])
NGRCP = NGRCP.sort_index()
NGRCP = NGRCP.reset_index()
NGRCP['Population'] = us_state_pop['Population']
NGRCP['Per capita consumption'] = NGRCP['Reading']/NGRCP['Population']
NGRCP = NGRCP[['State', 'Year', 'Population', 'MSN', 'Reading', 'Per capita consumption']]
NGRCP = NGRCP.set_index(['State', 'Year'])

# PARCP = all petroleum products consumed by the residential sector - thousand barrels
PARCP = consumption_phy.set_index('MSN')
PARCP = PARCP.loc['PARCP']
PARCP = PARCP.reset_index()
PARCP = PARCP.set_index(['State', 'Year'])
PARCP = PARCP.sort_index()
PARCP = PARCP.reset_index()
PARCP['Population'] = us_state_pop['Population']
PARCP['Per capita consumption'] = PARCP['Reading']/PARCP['Population']
PARCP = PARCP[['State', 'Year', 'Population', 'MSN', 'Reading', 'Per capita consumption']]
PARCP = PARCP.set_index(['State', 'Year'])

# WDRCP = wood consumed by the residential sector - thousand cords
WDRCP = consumption_phy.set_index('MSN')
WDRCP = WDRCP.loc['WDRCP']
WDRCP = WDRCP.reset_index()
WDRCP = WDRCP.set_index(['State', 'Year'])
WDRCP = WDRCP.sort_index()
WDRCP = WDRCP.reset_index()
WDRCP['Population'] = us_state_pop['Population']
WDRCP['Per capita consumption'] = WDRCP['Reading']/WDRCP['Population']
WDRCP = WDRCP[['State', 'Year', 'Population', 'MSN', 'Reading', 'Per capita consumption']]
WDRCP = WDRCP.set_index(['State', 'Year'])

In [None]:
# from consumption_btu:

# GERCB = Geothermal energy consumed by the residential sector.

GERCB = consumption_btu.set_index('MSN')
GERCB = GERCB.loc['GERCB']
GERCB = GERCB.reset_index()
GERCB = GERCB.set_index(['State', 'Year'])
GERCB = GERCB.sort_index()
GERCB = GERCB.reset_index()
GERCB['Population'] = us_state_pop['Population']
GERCB['Per capita consumption'] = GERCB['Reading']/GERCB['Population']
GERCB = GERCB[['State', 'Year', 'Population', 'MSN', 'Reading', 'Per capita consumption']]
GERCB = GERCB.set_index(['State', 'Year'])

# LORCB = The residential sector's share of electrical system energy losses.
LORCB = consumption_btu.set_index('MSN')
LORCB = LORCB.loc['LORCB']
LORCB = LORCB.reset_index()
LORCB = LORCB.set_index(['State', 'Year'])
LORCB = LORCB.sort_index()
LORCB = LORCB.reset_index()
LORCB['Population'] = us_state_pop['Population']
LORCB['Per capita consumption'] = LORCB['Reading']/LORCB['Population']
LORCB = LORCB[['State', 'Year', 'Population', 'MSN', 'Reading', 'Per capita consumption']]
LORCB = LORCB.set_index(['State', 'Year'])

# SFRCB = Supplemental gaseous fuels consumed by the residential sector.
SFRCB = consumption_btu.set_index('MSN')
SFRCB = SFRCB.loc['SFRCB']
SFRCB = SFRCB.reset_index()
SFRCB = SFRCB.set_index(['State', 'Year'])
SFRCB = SFRCB.sort_index()
SFRCB = SFRCB.reset_index()
SFRCB['Population'] = us_state_pop['Population']
SFRCB['Per capita consumption'] = SFRCB['Reading']/SFRCB['Population']
SFRCB = SFRCB[['State', 'Year', 'Population', 'MSN', 'Reading', 'Per capita consumption']]
SFRCB = SFRCB.set_index(['State', 'Year'])

# SORCB = Solar energy consumed by the residential sector.
SORCB = consumption_btu.set_index('MSN')
SORCB = SORCB.loc['SORCB']
SORCB = SORCB.reset_index()
SORCB = SORCB.set_index(['State', 'Year'])
SORCB = SORCB.sort_index()
SORCB = SORCB.reset_index()
SORCB['Population'] = us_state_pop['Population']
SORCB['Per capita consumption'] = SORCB['Reading']/SORCB['Population']
SORCB = SORCB[['State', 'Year', 'Population', 'MSN', 'Reading', 'Per capita consumption']]
SORCB = SORCB.set_index(['State', 'Year'])

# TERCB = Total energy consumed by the residential sector.
TERCB = consumption_btu.set_index('MSN')
TERCB = TERCB.loc['TERCB']
TERCB = TERCB.reset_index()
TERCB = TERCB.set_index(['State', 'Year'])
TERCB = TERCB.sort_index()
TERCB = TERCB.reset_index()
TERCB['Population'] = us_state_pop['Population']
TERCB['Per capita consumption'] = TERCB['Reading']/TERCB['Population']
TERCB = TERCB[['State', 'Year', 'Population', 'MSN', 'Reading', 'Per capita consumption']]
TERCB = TERCB.set_index(['State', 'Year'])

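# TERPB = Total energy consumption per capita in the residential sector.
# Note: the TERPB reading is already a per capita figure, so the
# 'Per capita consumption' column computed below is not meaningful for it; use 'Reading'.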
TERPB = consumption_btu.set_index('MSN')
TERPB = TERPB.loc['TERPB']
TERPB = TERPB.reset_index()
TERPB = TERPB.set_index(['State', 'Year'])
TERPB = TERPB.sort_index()
TERPB = TERPB.reset_index()
TERPB['Population'] = us_state_pop['Population']
TERPB['Per capita consumption'] = TERPB['Reading']/TERPB['Population']
TERPB = TERPB[['State', 'Year', 'Population', 'MSN', 'Reading', 'Per capita consumption']]
TERPB = TERPB.set_index(['State', 'Year'])

# TNRCB = Total energy consumed by the residential sector excluding the sector's share of
# electrical system energy losses.
TNRCB = consumption_btu.set_index('MSN')
TNRCB = TNRCB.loc['TNRCB']
TNRCB = TNRCB.reset_index()
TNRCB = TNRCB.set_index(['State', 'Year'])
TNRCB = TNRCB.sort_index()
TNRCB = TNRCB.reset_index()
TNRCB['Population'] = us_state_pop['Population']
TNRCB['Per capita consumption'] = TNRCB['Reading']/TNRCB['Population']
TNRCB = TNRCB[['State', 'Year', 'Population', 'MSN', 'Reading', 'Per capita consumption']]
TNRCB = TNRCB.set_index(['State', 'Year'])

# Look at the following only if necessary.

# FFTCB = Fossil fuels, total consumption.
FFTCB = consumption_btu.set_index('MSN')
FFTCB = FFTCB.loc['FFTCB']
FFTCB = FFTCB.reset_index()
FFTCB = FFTCB.set_index(['State', 'Year'])
FFTCB = FFTCB.sort_index()
FFTCB = FFTCB.reset_index()
FFTCB['Population'] = us_state_pop['Population']
FFTCB['Per capita consumption'] = FFTCB['Reading']/FFTCB['Population']
FFTCB = FFTCB[['State', 'Year', 'Population', 'MSN', 'Reading', 'Per capita consumption']]
FFTCB = FFTCB.set_index(['State', 'Year'])

# HYTXB = Hydropower, total end-use consumption.
HYTXB = consumption_btu.set_index('MSN')
HYTXB = HYTXB.loc['HYTXB']
HYTXB = HYTXB.reset_index()
HYTXB = HYTXB.set_index(['State', 'Year'])
HYTXB = HYTXB.sort_index()
HYTXB = HYTXB.reset_index()
HYTXB['Population'] = us_state_pop['Population']
HYTXB['Per capita consumption'] = HYTXB['Reading']/HYTXB['Population']
HYTXB = HYTXB[['State', 'Year', 'Population', 'MSN', 'Reading', 'Per capita consumption']]
HYTXB = HYTXB.set_index(['State', 'Year'])

# NUETB = Nuclear energy consumed for electricity generation, total.
NUETB = consumption_btu.set_index('MSN')
NUETB = NUETB.loc['NUETB']
NUETB = NUETB.reset_index()
NUETB = NUETB.set_index(['State', 'Year'])
NUETB = NUETB.sort_index()
NUETB = NUETB.reset_index()
NUETB['Population'] = us_state_pop['Population']
NUETB['Per capita consumption'] = NUETB['Reading']/NUETB['Population']
NUETB = NUETB[['State', 'Year', 'Population', 'MSN', 'Reading', 'Per capita consumption']]
NUETB = NUETB.set_index(['State', 'Year'])

# RETCB = Renewable energy total consumption.
RETCB = consumption_btu.set_index('MSN')
RETCB = RETCB.loc['RETCB']
RETCB = RETCB.reset_index()
RETCB = RETCB.set_index(['State', 'Year'])
RETCB = RETCB.sort_index()
RETCB = RETCB.reset_index()
RETCB['Population'] = us_state_pop['Population']
RETCB['Per capita consumption'] = RETCB['Reading']/RETCB['Population']
RETCB = RETCB[['State', 'Year', 'Population', 'MSN', 'Reading', 'Per capita consumption']]
RETCB = RETCB.set_index(['State', 'Year'])

# TETCB = Total energy consumption.
TETCB = consumption_btu.set_index('MSN')
TETCB = TETCB.loc['TETCB']
TETCB = TETCB.reset_index()
TETCB = TETCB.set_index(['State', 'Year'])
TETCB = TETCB.sort_index()
TETCB = TETCB.reset_index()
TETCB['Population'] = us_state_pop['Population']
TETCB['Per capita consumption'] = TETCB['Reading']/TETCB['Population']
TETCB = TETCB[['State', 'Year', 'Population', 'MSN', 'Reading', 'Per capita consumption']]
TETCB = TETCB.set_index(['State', 'Year'])

# WYEGB = Wind energy consumed for electricity generation by the electric power sector.
WYEGB = consumption_btu.set_index('MSN')
WYEGB = WYEGB.loc['WYEGB']
WYEGB = WYEGB.reset_index()
WYEGB = WYEGB.set_index(['State', 'Year'])
WYEGB = WYEGB.sort_index()
WYEGB = WYEGB.reset_index()
WYEGB['Population'] = us_state_pop['Population']
WYEGB['Per capita consumption'] = WYEGB['Reading']/WYEGB['Population']
WYEGB = WYEGB[['State', 'Year', 'Population', 'MSN', 'Reading', 'Per capita consumption']]
WYEGB = WYEGB.set_index(['State', 'Year'])

# WYTXB = Wind energy, total end-use consumption.
WYTXB = consumption_btu.set_index('MSN')
WYTXB = WYTXB.loc['WYTXB']
WYTXB = WYTXB.reset_index()
WYTXB = WYTXB.set_index(['State', 'Year'])
WYTXB = WYTXB.sort_index()
WYTXB = WYTXB.reset_index()
WYTXB['Population'] = us_state_pop['Population']
WYTXB['Per capita consumption'] = WYTXB['Reading']/WYTXB['Population']
WYTXB = WYTXB[['State', 'Year', 'Population', 'MSN', 'Reading', 'Per capita consumption']]
WYTXB = WYTXB.set_index(['State', 'Year'])
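
The blocks above repeat the same steps for every MSN code. One possible refactoring, sketched below under stated assumptions, wraps those steps in a helper and joins on `State` and `Year` instead of relying on matching row order. The function `per_capita_frame` and the dictionary `frames` are hypothetical names, and the sketch assumes the population table stores `Year` as integers.

```python
import pandas as pd

def per_capita_frame(consumption, population, msn):
    """Return one indicator's rows with population and per capita columns."""
    # Select the rows for this MSN code.
    subset = consumption[consumption['MSN'] == msn].copy()
    # Melted year labels come from column headers, so cast them to int
    # to match the (assumed integer) Year column of the population table.
    subset['Year'] = subset['Year'].astype(int)
    # Join on the keys instead of relying on matching row order.
    merged = subset.merge(population, on=['State', 'Year'], how='left')
    merged['Per capita consumption'] = merged['Reading'] / merged['Population']
    columns = ['State', 'Year', 'Population', 'MSN', 'Reading', 'Per capita consumption']
    return merged[columns].set_index(['State', 'Year']).sort_index()

# Toy inputs; the Alaska rows reuse figures shown in this notebook.
consumption = pd.DataFrame({
    'State': ['AK', 'AK'],
    'MSN': ['LORCB', 'LORCB'],
    'Year': ['1971', '1970'],
    'Reading': [8316.0, 7074.0],
})
population = pd.DataFrame({
    'State': ['AK', 'AK'],
    'Year': [1970, 1971],
    'Population': [302583, 316000],
})

# One frame per MSN code, keyed by the code.
frames = {msn: per_capita_frame(consumption, population, msn) for msn in ['LORCB']}
print(frames['LORCB'])
```

With a left join, a state or year missing from the population table shows up as `NaN` rather than silently shifting every later row.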

Each energy source has its own dataframe, named after its MSN code, so you can look up an indicator by referencing the dataframe with that name.

In [None]:
LORCB.head()

State,Year,Population,MSN,Reading,Per capita consumption
AK,1970,302583,LORCB,7074.0,0.023379
AK,1971,316000,LORCB,8316.0,0.026316
AK,1972,326000,LORCB,8963.0,0.027494
AK,1973,333000,LORCB,9488.0,0.028492
AK,1974,345000,LORCB,10533.0,0.03053
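
Because each frame is indexed by `State` and `Year`, individual values can be retrieved with `.loc`. The sketch below rebuilds a small frame, `LORCB_demo` (a hypothetical name), from the rows printed above. It uses integer years. In the notebook's frames the year labels come from column headers and may be strings such as `'1970'`.

```python
import pandas as pd

# Toy frame rebuilt from the LORCB rows printed above.
LORCB_demo = pd.DataFrame({
    'State': ['AK', 'AK', 'AK'],
    'Year': [1970, 1971, 1972],
    'Population': [302583, 316000, 326000],
    'MSN': ['LORCB'] * 3,
    'Reading': [7074.0, 8316.0, 8963.0],
}).set_index(['State', 'Year'])

# One state and year: pass the full index key as a tuple.
single = LORCB_demo.loc[('AK', 1971), 'Reading']
print(single)

# Every year for one state: select on the first index level.
alaska = LORCB_demo.loc['AK']
print(alaska['Reading'].tolist())
```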
