# Introduction

Since the Industrial Revolution, Earth has experienced a human-caused increase in greenhouse gases, contributing to global climate change. How we generate energy is among the top sources of greenhouse gases, so let's take a look at recent trends in the various sources of energy in the US.

Is the US shifting to more renewable sources of energy?

In [17]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib notebook

In [18]:
eia = pd.read_csv("EIA_energy_generation_for_all_sectors.csv")

eia[["2001","2002","2003","2004","2005","2006","2007","2008","2009","2010","2011","2012",'2013',"2014",'2015',"2016","2017"]].apply(pd.to_numeric)


Unnamed: 0,2001,2002,2003,2004,2005,2006,2007,2008,2009,2010,2011,2012,2013,2014,2015,2016,2017
0,3736644,3858452,3883185,3970555,4055423,4064702,4156745,4119388,3950331,4125060,4100141,4047765,4065964,4093606,4077601,4076675,4034268
1,1903956,1933130,1973737,1978301,2012873,1990511,2016456,1985801,1755904,1847290,1733430,1514043,1581115,1581710,1352398,1239149,1205835
2,114647,78701,102734,100391,99840,44460,49505,31917,25972,23337,16086,13403,13820,18276,17372,13008,12414
3,10233,15867,16672,20754,22385,19706,16234,14325,12964,13724,14096,9787,13344,11955,10877,11197,8976
4,639129,691006,649908,710100,760960,816441,896590,882981,920979,987697,1013689,1225894,1124836,1126609,1333482,1378307,1296415
5,9039,11463,15600,15252,13464,14177,13453,11707,10632,11313,11566,11898,12853,12022,13117,12807,12469
6,768826,780064,763733,788528,781986,787219,806425,806208,798855,806968,790204,769331,789016,797166,797178,805694,804950
7,216961,264329,275806,268417,270321,289246,247510,254831,273445,260203,319355,276240,268565,259367,249080,267812,300333
8,70769,79109,79486,83067,87330,96526,105238,126101,144279,167172,193982,218333,253509,279212,295162,341633,386278
9,6737,10354,11187,14144,17811,26589,34450,55363,73886,94652,120177,140822,167840,181655,190719,226993,254303


This dataset is from the [US Energy Information Administration](https://www.eia.gov/electricity/data/browser/#/topic/0?agg=2,0,1&fuel=vtvv&geo=g&sec=g&linechart=ELEC.GEN.ALL-US-99.A~ELEC.GEN.COW-US-99.A~ELEC.GEN.NG-US-99.A~ELEC.GEN.NUC-US-99.A~ELEC.GEN.HYC-US-99.A~ELEC.GEN.WND-US-99.A~ELEC.GEN.TSN-US-99.A&columnchart=ELEC.GEN.ALL-US-99.A~ELEC.GEN.COW-US-99.A~ELEC.GEN.NG-US-99.A~ELEC.GEN.NUC-US-99.A~ELEC.GEN.HYC-US-99.A~ELEC.GEN.WND-US-99.A&map=ELEC.GEN.ALL-US-99.A&freq=A&ctype=linechart&ltype=pin&rtype=s&maptype=0&rse=0&pin=). It contains energy production data for the years 2001 through 2017, broken down by sector and reported in thousand megawatt hours, and lists a source key for each sector.

The United States Energy Information Administration (EIA) is a principal agency of the U.S. Federal Statistical System responsible for collecting, analyzing, and disseminating energy information. The EIA is part of the US Department of Energy. By law, EIA’s products are prepared independently of policy considerations. EIA neither formulates nor advocates any policy conclusions.

## What is the total energy generation trend in the United States?

In order to contextualize our analysis, we need to first understand how US energy demand is changing over time. 

In [19]:
eia_tot = eia.iloc[0]  # row 0 holds total generation across all sources
year_cols = [str(year) for year in range(2001, 2018)]  # year columns are labelled with strings
total_prod = pd.to_numeric(eia_tot[year_cols]).to_numpy()
years = np.arange(2001, 2018)

plt.figure(figsize=(8, 4))
plt.plot(years,total_prod, marker='', color='b', linewidth=2)
plt.ylabel("Thousand Megawatt Hours (log scale)")
plt.yscale("log")
plt.xlabel("Year")
plt.xticks(ticks=years, rotation=45)
plt.title("Energy generated annually in the US")
plt.ylim(bottom=3700000, top=4200000 )
plt.show()



Wow, we see interesting activity! Generation grew significantly over the first 6 years, then dipped sharply in 2009. The rapid early rise could be from the wide adoption of electronics in the household and workplace. The dip plausibly reflects the 2008 financial crisis, which led to the infamous bank bailout (the Emergency Economic Stabilization Act) in October of 2008. People and companies alike experienced serious financial hardship as a result of the housing bubble popping. Since that happened in late 2008, it makes sense that we wouldn't see its effects until the following year.

Generation rebounded promptly in 2010 to near pre-dip values and has largely stayed flat since, with a slight downward tail. That tail could result from a more mindful public and companies reducing their energy use, from higher adoption of personal solar panels or higher-efficiency electrical equipment, from some combination of these, or from something else entirely. We'd have to keep observing to see where this trend goes.
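To put numbers on the dip and the rebound, here is a minimal sketch that computes the year-over-year percent change in total generation. The values are copied by hand from row 0 of the table above rather than read from the CSV, and the helper name `year_over_year_change` is purely illustrative.

```python
# Total US generation in thousand MWh, copied from row 0 of the table above.
total_generation = {
    2001: 3736644, 2002: 3858452, 2003: 3883185, 2004: 3970555,
    2005: 4055423, 2006: 4064702, 2007: 4156745, 2008: 4119388,
    2009: 3950331, 2010: 4125060, 2011: 4100141, 2012: 4047765,
    2013: 4065964, 2014: 4093606, 2015: 4077601, 2016: 4076675,
    2017: 4034268,
}


def year_over_year_change(series):
    """Return {year: percent change relative to the previous year}."""
    ordered_years = sorted(series)
    return {
        year: 100.0 * (series[year] - series[prev]) / series[prev]
        for prev, year in zip(ordered_years, ordered_years[1:])
    }


yoy = year_over_year_change(total_generation)
for year, pct in yoy.items():
    print(f"{year}: {pct:+.2f}%")
```

With these figures, 2009 is the largest single-year drop (about 4.1%) and 2010 the largest single-year gain (about 4.4%), which matches the shape of the plot.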

We've seen the trend over the 17 years of available data, but let's take a closer look at the most recent year we have.

## What is the energy production source breakdown for 2017?

In [33]:
# Data to plot
labels = 'Natural Gas', 'Coal', 'Nuclear', 'Hydro-Electric', 'Wind', 'Other' #'Biomass', 'Solar', 'Geo-Thermal', 'Petroleum'
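# Read the 2017 column directly so this cell does not depend on variables defined in a later cell.
# Row positions: 1 coal, 2 and 3 petroleum, 4 natural gas, 6 nuclear, 7 hydro, 9 wind, 10 solar, 11 geothermal, 12 biomass.
gen_2017 = pd.to_numeric(eia['2017'])
sizes = [gen_2017.iloc[4], gen_2017.iloc[1], gen_2017.iloc[6], gen_2017.iloc[7], gen_2017.iloc[9], gen_2017.iloc[[2, 3, 10, 11, 12]].sum()]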
colors = ['plum', 'peru', 'SpringGreen', 'lightcoral', 'lightskyblue','slategray'] #,'r','g','b'
explode = (0.05, 0, 0, 0, 0, 0)  # explode 1st slice
 
# Plot
plt.figure(figsize=(10,10))
plt.pie(sizes, explode=explode, labels=labels, colors=colors, autopct='%1.1f%%', shadow=False, startangle=90) 
plt.axis('equal')
plt.title("US Energy Production Sources, 2017")
plt.show()


Now we have a visual breakdown of US energy production by source for 2017!
The top 5 energy source contributors for 2017 are:
   1. Natural Gas (32.3%)
   2. Coal (30%)
   3. Nuclear (20%)
   4. Hydro-Electric (7.5%)
   5. Wind (6.3%)

Together, the top 5 make up 96.1%, which shows that we have invested in only a few types of energy production, with the more renewable or environmentally friendly types at the bottom of the top 5.
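As a rough cross-check of these shares, the sketch below divides each top-5 source's 2017 value by the reported total in row 0 of the table. This is an assumption that differs from the pie chart, which normalizes by the sum of its own slices, so the results come out a fraction of a percentage point lower than the figures listed above.

```python
# 2017 generation in thousand MWh, copied from the table above.
total_2017 = 4034268  # row 0: all sources combined

top_five_2017 = {
    "Natural Gas": 1296415,     # row 4
    "Coal": 1205835,            # row 1
    "Nuclear": 804950,          # row 6
    "Hydro-Electric": 300333,   # row 7
    "Wind": 254303,             # row 9
}

# Share of each source relative to the reported total (not the pie's denominator).
shares = {source: 100.0 * value / total_2017 for source, value in top_five_2017.items()}
top_five_share = sum(shares.values())

for source, pct in shares.items():
    print(f"{source}: {pct:.1f}%")
print(f"Top 5 combined: {top_five_share:.1f}%")
```

Either way the ranking is the same, and the five largest sources account for well over 95% of generation.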

But one may ask: has it always looked like this? Or, more specifically:

## What are the trends of energy generation by source type from 2001 to 2017?

In [38]:
years = np.arange(2001, 2018)
year_cols = [str(year) for year in years]  # year columns are labelled with strings

def yearly_values(row):
    """Return a row's 2001-2017 values as a numeric array for plotting."""
    return pd.to_numeric(row[year_cols]).to_numpy()

# Each *_prod is a row of the EIA table; each *_production is its numeric yearly series.
coal_prod = eia.iloc[1]
coal_production = yearly_values(coal_prod)

# Petroleum combines rows 2 and 3; convert before adding so the values are summed as numbers.
petrol_prod = pd.to_numeric(eia.iloc[2][year_cols]) + pd.to_numeric(eia.iloc[3][year_cols])
petrol_production = yearly_values(petrol_prod)

nat_gas_prod = eia.iloc[4]
nat_gas_production = yearly_values(nat_gas_prod)

nuc_prod = eia.iloc[6]
nuc_production = yearly_values(nuc_prod)

hydro_prod = eia.iloc[7]
hydro_production = yearly_values(hydro_prod)

wind_prod = eia.iloc[9]
wind_production = yearly_values(wind_prod)

solar_prod = eia.iloc[10]
solar_production = yearly_values(solar_prod)

geo_prod = eia.iloc[11]
geo_production = yearly_values(geo_prod)

bio_prod = eia.iloc[12]
bio_production = yearly_values(bio_prod)

plt.figure(figsize=(8,8))

plt.plot(years,total_prod, label="Total", color='k', linewidth=4, linestyle='dashed')
plt.plot(years,nat_gas_production, label="Natural Gas", color="plum", linewidth=4)
plt.plot(years,coal_production, label="Coal", color="peru", linewidth=4)
plt.plot(years,nuc_production, label="Nuclear", color="SpringGreen", linewidth=4)
plt.plot(years,hydro_production, label="Hydro-Electric", color="lightcoral", linewidth=4)
plt.plot(years,wind_production, label="Wind", color="lightskyblue", linewidth=4)
plt.plot(years,bio_production, label="Biomass",color="SaddleBrown", linewidth=4)
plt.plot(years,solar_production, label="Solar", color="Gold", linewidth=4)
plt.plot(years,petrol_production, label="Petroleum-Based", color="mediumorchid", linewidth=4)
plt.plot(years,geo_production, label="Geo-Thermal", color="wheat", linewidth=4)

plt.xlabel("Year")
plt.ylabel("Thousand MWh (log scale)")
plt.yscale("log")
plt.xticks(ticks=years,rotation=45)
plt.title("US Energy Production")
plt.legend()
#plt.ylim(bottom=400, top= 3000000)
plt.show()


Great, we can see that coal energy production is on the decline, likely due in part to the accurate public perception of coal as a dirty energy source. Coal is non-renewable, and its main byproduct, carbon dioxide, is the largest contributor to the greenhouse effect due to its sheer volume. Coal power plant efficiency currently runs around 33%, with the opportunity to reach 40%. [CoalSource](https://www.worldcoal.org/reducing-co2-emissions/high-efficiency-low-emission-coal)

Energy production from natural gas (methane) is on the rise, possibly due to the negative public perception of coal. Methane is nonetheless a greenhouse gas, and it actually has 25 times the global warming potential of carbon dioxide; it just isn't as voluminous, so it hasn't gotten the negative spotlight just yet. The American Gas Association claims a 90% efficiency for methane and describes this as 3 times the efficiency of electrical power. [NatGasSource1](http://climatechange.lta.org/get-started/learn/co2-methane-greenhouse-effect/), [NatGas Source2](https://www.aga.org/policy/environment/energy-efficiency-natural-gas-utilities/)

Nuclear energy production has largely remained the same over the 2001 to 2017 period. Although nuclear accounts for roughly 20% of the energy production of the United States, one of its biggest obstacles remains the "Not in my backyard" mentality. More nuclear facilities are planned to come online after 2020. [Source](http://www.world-nuclear.org/information-library/country-profiles/countries-t-z/usa-nuclear-power.aspx)

Wind and solar energy production have done quite well. Both have a relatively low efficiency (45% and 22%, respectively), but they have the benefits of being renewable and having little to no environmental impact. [SolarSource](https://news.energysage.com/best-solar-panels-complete-ranking/), [WindSource](https://greenliving.lovetoknow.com/Efficiency_of_Wind_Energy)
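The trends described above can be quantified with a small sketch that compares each source's 2001 and 2017 values. The numbers are copied by hand from the table shown earlier, using the same row positions as the plotting code (coal in row 1, natural gas in row 4, nuclear in row 6, hydro in row 7, wind in row 9). Solar, geothermal, and biomass are left out because their rows are not displayed in that table.

```python
# (2001 value, 2017 value) in thousand MWh, copied from the table above.
generation = {
    "Coal": (1903956, 1205835),
    "Natural Gas": (639129, 1296415),
    "Nuclear": (768826, 804950),
    "Hydro-Electric": (216961, 300333),
    "Wind": (6737, 254303),
}

# Percent change from the first to the last year of the dataset.
change = {
    source: 100.0 * (last - first) / first
    for source, (first, last) in generation.items()
}

for source, pct in sorted(change.items(), key=lambda item: item[1]):
    print(f"{source}: {pct:+.1f}%")
```

On these figures coal fell by roughly 37%, natural gas roughly doubled, nuclear moved by less than 5%, and wind grew many times over from a very small base.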


# Further Research

We've uncovered a lot of information here, but what would certainly go a long way is a regression model with a specified confidence limit to predict how the overall trend of energy production will go, both in total and for each of the renewable and harmful sectors.
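As a starting point, here is a minimal sketch of such a model: an ordinary least-squares line fitted to total generation, with a 95% prediction interval for 2018. It is an illustration under stated assumptions, not a recommended forecast. The data are copied from row 0 of the table above, the function names are hypothetical, and the t critical value is hard-coded for this sample size. A straight line is also a crude fit for a series that rises and then plateaus.

```python
import math

years = list(range(2001, 2018))
# Total US generation in thousand MWh, copied from row 0 of the table above.
total_generation = [
    3736644, 3858452, 3883185, 3970555, 4055423, 4064702, 4156745,
    4119388, 3950331, 4125060, 4100141, 4047765, 4065964, 4093606,
    4077601, 4076675, 4034268,
]


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def prediction_interval(x, y, x_new, t_crit):
    """Return (low, forecast, high) for a new observation at x_new."""
    n = len(x)
    slope, intercept = fit_line(x, y)
    x_mean = sum(x) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual standard error with n - 2 degrees of freedom.
    s = math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))
    forecast = intercept + slope * x_new
    half_width = t_crit * s * math.sqrt(1 + 1 / n + (x_new - x_mean) ** 2 / sxx)
    return forecast - half_width, forecast, forecast + half_width


# Two-sided 95% critical value of Student's t for 17 - 2 = 15 degrees of freedom.
T_CRIT_95_DF15 = 2.131

low, forecast, high = prediction_interval(years, total_generation, 2018, T_CRIT_95_DF15)
print(f"2018 forecast: {forecast:,.0f} thousand MWh (95% PI {low:,.0f} to {high:,.0f})")
```

The same two functions could be applied to any individual source's series. For fast-growing sources such as wind, fitting the logarithm of the values may be more appropriate than fitting the raw values.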