<h1><center>DTSC 5301 | Final Project Report</center></h1>
<h2><center>Team: Energy</center></h2>

In [12]:
# As the notebook needs to be rendered in nbviewer for interactive plots
!pip install plotly



### Authors



*   Rona Guo, rugu3582@colorado.edu
*   Sasi Jyothirmai Bonu, sabo8713@colorado.edu
*   Shivam Pandey, shpa5426@colorado.edu
*   Tushar Sharma, tush4938@colorado.edu
*   Veronica Martinez, veronica.martinez@colorado.edu
*   Wyett Considine, wyco0384@colorado.edu





### Table Of Contents

1. Introduction
2. Questions
3. Data Source
4. Analysis
5. Conclusion
6. Citations



### Introduction


Climate change is the central issue in the public discourse on energy. A climate catastrophe would jeopardize our current well-being, the well-being of those who follow us, and the surrounding natural world. Many international treaties have been signed with the aim of reducing CO2 emissions. However, the world still lacks large-scale alternatives to fossil fuels that are secure, affordable, and low-carbon. Our analysis studies the make-up of energy consumption by source over the past five decades, quantitatively evaluates compliance by region, and assesses total energy consumption by region.





### Questions
- How has global per capita energy consumption changed over time?
- How does the renewable vs. non-renewable makeup of this energy consumption compare?


### Data Source
Data was accessed from [Our World in Data](https://ourworldindata.org/explorers/energy?tab=table&facet=none&country=USA~GBR~CHN~OWID_WRL~IND~BRA~ZAF&Total+or+Breakdown=Total&Energy+or+Electricity=Primary+energy&Metric=Per+capita+consumption). The dataset used in this analysis contains energy consumption per capita, measured in kWh per person per year (1965-2022) and broken out by country and energy source. This data was originally sourced from the U.S. Energy Information Administration, Energy Institute Statistical Review of World Energy, Gapminder (v7), United Nations World Population Prospects, HYDE (v3.2), and Gapminder (Systema Globalis). Five decades of energy data should provide a reasonable historical sample for investigating our questions.

### Analysis
To analyze the data, we used Python to read in our dataset, clean the data, and visualize the data to make comparisons. Below is the source code along with an explanation of steps taken and interpretations of the results.



In [1]:
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots

# If using Jupyter in VSC:
# import plotly.io as pio
# pio.renderers.default='notebook'


#### Cleaning the Data

After downloading the dataset, we cleaned the data to prepare it for analysis. This included renaming the column headers for improved readability and aggregating the data into two categories, renewable and non-renewable energy sources. We also calculated total energy consumption for each category as well as an overall total of energy consumption to create a plot for comparison.

In [2]:
#Get data
energyDF = pd.read_csv('https://docs.google.com/spreadsheets/d/e/2PACX-1vSKFSOeC0K8NxNq_0ILnFa8bQguWi8QZopcNYLXLpc5NbXCffx5TV5rmPhSjnaT4aa0hoYlbXwoBY53/pub?gid=970132790&single=true&output=csv',
                    header = 0)

#Note: the __PerCap columns are measured in kWh or kWh equivalents
energyDF.rename(columns={'Entity':'Entity', 'Code':'EntCode', 'Year':'Year', 'Coal per capita (kWh)':'CoalPerCap',
       'Oil per capita (kWh)':'OilPerCap', 'Gas per capita (kWh)':'GasPerCap',
       'Nuclear per capita (kWh - equivalent)':'NuclearPerCap',
       'Hydro per capita (kWh - equivalent)':'HydroPerCap',
       'Wind per capita (kWh - equivalent)':'WindPerCap',
       'Solar per capita (kWh - equivalent)':'SolarPerCap',
       'Other renewables per capita (kWh - equivalent)':'OtherRenewablesPerCap'}, inplace=True)

energyDF.drop('EntCode', axis=1, inplace=True)
energyDF = energyDF.fillna(0)

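# Note: nuclear is counted in the 'renewable' group in this analysis, although it is low-carbon rather than strictly renewable
renewables = ['NuclearPerCap', 'HydroPerCap', 'WindPerCap', 'SolarPerCap','OtherRenewablesPerCap']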
nonrenewables = ['CoalPerCap', 'OilPerCap', 'GasPerCap']

energyDF.head(2)

| | Entity | Year | CoalPerCap | OilPerCap | GasPerCap | NuclearPerCap | HydroPerCap | WindPerCap | SolarPerCap | OtherRenewablesPerCap |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Africa | 1965 | 1006.3736 | 1064.3536 | 29.777592 | 0.0 | 127.91771 | 0.0 | 0.0 | 0.0 |
| 1 | Africa | 1966 | 980.1728 | 1123.739 | 32.45205 | 0.0 | 139.12254 | 0.0 | 0.0 | 0.0 |


In [3]:
energyDF['TotalEnerygyPerCap'] = energyDF[renewables].sum(axis=1) + energyDF[nonrenewables].sum(axis=1)
energyDF['TotalNonRenewablePerCap'] = energyDF[nonrenewables].sum(axis=1)
energyDF['TotalRenewablePerCap'] = energyDF[renewables].sum(axis=1)
energyDF.head(5)

| | Entity | Year | CoalPerCap | OilPerCap | GasPerCap | NuclearPerCap | HydroPerCap | WindPerCap | SolarPerCap | OtherRenewablesPerCap | TotalEnerygyPerCap | TotalNonRenewablePerCap | TotalRenewablePerCap |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Africa | 1965 | 1006.3736 | 1064.3536 | 29.777592 | 0.0 | 127.91771 | 0.0 | 0.0 | 0.0 | 2228.422502 | 2100.504792 | 127.91771 |
| 1 | Africa | 1966 | 980.1728 | 1123.739 | 32.45205 | 0.0 | 139.12254 | 0.0 | 0.0 | 0.0 | 2275.48639 | 2136.36385 | 139.12254 |
| 2 | Africa | 1967 | 976.7317 | 1091.7719 | 31.268766 | 0.0 | 141.5766 | 0.0 | 0.0 | 0.0 | 2241.348966 | 2099.772366 | 141.5766 |
| 3 | Africa | 1968 | 990.00665 | 1125.0367 | 30.886887 | 0.0 | 161.39375 | 0.0 | 0.0 | 0.0 | 2307.323987 | 2145.930237 | 161.39375 |
| 4 | Africa | 1969 | 973.52277 | 1118.1862 | 35.162052 | 0.0 | 183.53688 | 0.0 | 0.0 | 0.0 | 2310.407902 | 2126.871022 | 183.53688 |


To visualize a comparison between renewable and non-renewable energy consumption over the past 50 years, we opted for a stacked bar chart. First we aggregated the data by decade so that the data can fit nicely on a plot.

In [4]:
# extract only data from 1970 and after.
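# .copy() keeps the new 'Decade' column assignment below from acting on a slice of the original frame
energyDF = energyDF[energyDF['Year']>=1970].copy()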

# create a new column containing the decade for each data point
energyDF['Decade'] = energyDF['Year'] - (energyDF['Year']%10)

# group data by decade
df_decade = energyDF.groupby(by=['Decade','Entity']).sum().reset_index()

df_decade = df_decade[df_decade['Decade']!=2020]
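Note that the grouping above sums the annual per capita values within each decade, so the plotted figures are decade totals rather than typical annual values. The sketch below, on a small made-up table (the numbers are placeholders, not values from the dataset), contrasts that sum with a decade mean, which some readers may find easier to compare against annual figures. The names `toy`, `decade_sum`, and `decade_mean` are illustrative only.

```python
import pandas as pd

# Toy per capita values (kWh per person per year); invented for illustration.
toy = pd.DataFrame({
    "Entity": ["World"] * 4,
    "Year": [1970, 1971, 1980, 1981],
    "TotalRenewablePerCap": [100.0, 120.0, 200.0, 240.0],
})
toy["Decade"] = toy["Year"] - toy["Year"] % 10

# Select the value columns explicitly so 'Year' is not aggregated as well.
grouped = toy.groupby(["Decade", "Entity"])[["TotalRenewablePerCap"]]

decade_sum = grouped.sum().reset_index()    # total over the years in each decade
decade_mean = grouped.mean().reset_index()  # average annual value in each decade

print(decade_sum)
print(decade_mean)
```

Either aggregate preserves the decade-to-decade trend as long as every decade contains the same number of years; only the scale of the y-axis changes.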

#### Visualizations

With the data restructured for analysis, we then created a stacked bar chart of total world energy consumption per capita per decade for five decades. From the chart we can quickly see that renewable energy consumption has increased over the past 50 years. During the same period, non-renewable energy consumption decreased in the 80s and 90s, but then increased more than renewable energy consumption did in the 2000s and 2010s.

In [5]:
# Total world energy consumption (Renewable vs Non-Renewable), bar chart

# Create a new dataframe with only 'World' values and sort by decade.
df_world = df_decade[df_decade['Entity']=='World']
df_world = df_world.sort_values(by='Decade')

# Create a customized plot to compare renewable vs. non-renewable consumption
fig = go.Figure()

# fig.add_trace(go.Scatter(x=df_world['Decade'].sort_values(),
#                          y=df_world['TotalEnerygyPerCap']/1000,
#                          name='Total Energy',
#                          marker=dict(color='black', size=8),
#                          line=dict(width=3)
#                          ))

fig.add_trace(go.Bar(x=df_world['Decade'].sort_values(),
                     y=df_world['TotalRenewablePerCap']/1000, #convert to MWh
                     name='Renewable Energy',
                     marker=dict(color='#009e60')
                     ))

fig.add_trace(go.Bar(x=df_world['Decade'].sort_values(),
                     y=df_world['TotalNonRenewablePerCap']/1000, #convert to MWh
                     name='Non-Renewable Energy',
                     marker=dict(color='#ff6700')
                     ))

fig.update_layout(autosize=False,
                  width=1000,
                  height=700,
                  barmode='stack',
                  bargap=0.7,
                  xaxis_title=dict(text='Decade'),
                  yaxis_title=dict(text='Energy Consumption Per Cap (MWh)'),
                  title=dict(text='World Per Capita Energy Consumption and Source Over the Past 5 Decades'))

fig.show()

We also looked at data from the United States. The United States is one of the largest energy users in the world, so we wanted to see which energy sources make up this consumption and whether renewable energy usage is increasing or decreasing. This matters because climate change is accelerating, and where energy is sourced will shape outcomes over the next few decades.

The bar chart shows that the United States used less energy per capita in the 2010s than in any of the previous four decades. Additionally, the United States is using less non-renewable energy than in the 1970s and has increased its renewable energy consumption over the years. However, its per capita energy consumption is still significantly larger than the global figure: around 800 MWh vs. ~200 MWh, where each value is the sum of annual per capita consumption over a decade.

It's important to note that the total global consumption we saw in the first chart includes the United States. It would be a better comparison to see how the United States compares to the rest of the world.
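One way such a comparison could be set up is sketched below. It assumes population figures are available, which the dataset used in this report does not include, so the function name `rest_of_world_per_capita` and all input numbers are hypothetical. The idea is to convert per capita values back to totals, subtract the United States, and divide by the remaining population.

```python
def rest_of_world_per_capita(world_pc, world_pop, us_pc, us_pop):
    """Per capita consumption of the world excluding one country.

    world_pc, us_pc: per capita consumption (same unit, e.g. kWh per person)
    world_pop, us_pop: populations (same unit, e.g. millions of people)
    """
    world_total = world_pc * world_pop
    us_total = us_pc * us_pop
    return (world_total - us_total) / (world_pop - us_pop)


# Hypothetical inputs, not taken from the dataset.
example = rest_of_world_per_capita(
    world_pc=20_000.0, world_pop=8_000.0, us_pc=80_000.0, us_pop=400.0
)
print(round(example, 1))  # 16842.1
```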

In [6]:
# Total USA energy consumption (Renewable vs Non-Renewable), Bar Chart

# Create a new dataframe with only 'United States' values and sort by decade.
df_US = df_decade[df_decade['Entity']=='United States']
df_US = df_US.sort_values(by='Decade')

# Create a customized plot to compare renewable vs. non-renewable consumption
fig = go.Figure()

# fig.add_trace(go.Scatter(x=df_US['Decade'].sort_values(),
#                          y=df_US['TotalEnerygyPerCap']/1000,
#                          name='Total Energy',
#                          marker=dict(color='black', size=8),
#                          line=dict(width=3)
#                          ))

fig.add_trace(go.Bar(x=df_US['Decade'].sort_values(),
                     y=df_US['TotalRenewablePerCap']/1000,  #convert to MWh
                     name='Renewable Energy',
                     marker=dict(color='#009e60')
                     ))

fig.add_trace(go.Bar(x=df_US['Decade'].sort_values(),
                     y=df_US['TotalNonRenewablePerCap']/1000, #convert to MWh
                     name='Non-Renewable Energy',
                     marker=dict(color='#ff6700')
                     ))

fig.update_layout(autosize=False,
                  width=1000,
                  height=700,
                  barmode='stack',
                  bargap=0.7,
                  xaxis_title=dict(text='Decade'),
                  yaxis_title=dict(text='Energy Consumption Per Cap (MWh)'),
                  title=dict(text='United States Per Capita Energy Consumption and Source Over the Past 5 Decades'))

fig.show()

Next we looked at regional data.

Potential biases: regions contain a mix of wealthy and less affluent countries. These dynamics and their impact on the data cannot be understood from regional totals alone. Additional analysis comparing regions that are similar to each other could tell a more complete story.

In [7]:
opec_countries = ['Algeria','Iran','Iraq','Kuwait','Saudi Arabia','United Arab Emirates','Venezuela']

us_and_canada = ['United States', 'Canada']

region_considered = ['Asia', 'European Union (27)', 'OPEC Countries', 'US & Canada']

# .copy() makes df_region an independent DataFrame, so the assignments below do not trigger SettingWithCopyWarning
df_region = df_decade[(df_decade['Entity'].isin(['Asia', 'European Union (27)']+opec_countries+us_and_canada))].copy()

# Note: member countries keep separate rows under the shared label; their per capita values are not averaged
df_region['Entity'] = df_region['Entity'].apply(lambda x: 'OPEC' if x in opec_countries else x)

df_region['Entity'] = df_region['Entity'].apply(lambda x: 'US & Canada' if x in us_and_canada else x)



In [8]:
fig = make_subplots(rows=2,
                    cols=2,
                    subplot_titles=("US & Canada", "European Union", "Asia", "OPEC Countries"))

fig.add_trace(go.Bar(x=df_region[df_region['Entity']=='US & Canada']['Decade'].sort_values(),
                     y=df_region[df_region['Entity']=='US & Canada']['TotalRenewablePerCap']/1000,
                     name='Renewable Energy',
                     marker=dict(color='#009e60')
                     ), row=1, col=1)

fig.add_trace(go.Bar(x=df_region[df_region['Entity']=='US & Canada']['Decade'].sort_values(),
                     y=df_region[df_region['Entity']=='US & Canada']['TotalNonRenewablePerCap']/1000,
                     name='Non-Renewable Energy',
                     marker=dict(color='#ff6700')
                     ), row=1, col=1)

fig.add_trace(go.Bar(x=df_region[df_region['Entity']=='European Union (27)']['Decade'].sort_values(),
                     y=df_region[df_region['Entity']=='European Union (27)']['TotalRenewablePerCap']/1000,
                    #  name='Reneweable Enerygy',
                     marker=dict(color='#009e60',),
                     showlegend = False
                     ), row=1, col=2)

fig.add_trace(go.Bar(x=df_region[df_region['Entity']=='European Union (27)']['Decade'].sort_values(),
                     y=df_region[df_region['Entity']=='European Union (27)']['TotalNonRenewablePerCap']/1000,
                    #  name='Non-Renewable Energy',
                     marker=dict(color='#ff6700'),
                     showlegend = False
                     ), row=1, col=2)

fig.add_trace(go.Bar(x=df_region[df_region['Entity']=='Asia']['Decade'].sort_values(),
                     y=df_region[df_region['Entity']=='Asia']['TotalRenewablePerCap']/1000,
                    #  name='Reneweable Enerygy',
                     marker=dict(color='#009e60'),
                     showlegend = False
                     ), row=2, col=1)

fig.add_trace(go.Bar(x=df_region[df_region['Entity']=='Asia']['Decade'].sort_values(),
                     y=df_region[df_region['Entity']=='Asia']['TotalNonRenewablePerCap']/1000,
                    #  name='Non-Renewable Energy',
                     marker=dict(color='#ff6700'),
                     showlegend = False
                     ), row=2, col=1)

fig.add_trace(go.Bar(x=df_region[df_region['Entity']=='OPEC']['Decade'].sort_values(),
                     y=df_region[df_region['Entity']=='OPEC']['TotalRenewablePerCap']/1000,
                    #  name='Reneweable Enerygy',
                     marker=dict(color='#009e60'),
                     showlegend = False
                     ), row=2, col=2)

fig.add_trace(go.Bar(x=df_region[df_region['Entity']=='OPEC']['Decade'].sort_values(),
                     y=df_region[df_region['Entity']=='OPEC']['TotalNonRenewablePerCap']/1000,
                    #  name='Non-Renewable Energy',
                     marker=dict(color='#ff6700'),
                     showlegend = False
                     ), row=2, col=2)

fig.update_traces(marker_line_width=0)

fig.update_layout(height=900,
                  width=1000,
                  bargap=0.4,
                  barmode='stack',
                  title_text="Energy Consumption by Region (MWh)",
                  xaxis=dict(tickvals = df_region['Decade'].unique()),
                  xaxis2=dict(tickvals = df_region['Decade'].unique()),
                  xaxis3=dict(tickvals = df_region['Decade'].unique()),
                  xaxis4=dict(tickvals = df_region['Decade'].unique())
                  )

fig.show()

Comparing all four charts, it is clear that the US and Canada region and the OPEC countries consume around 5 to 10 times the energy consumed per person in the European Union, which comprises 27 countries, and around 13 to 26 times the energy consumed per person in Asia. The charts also show that the OPEC countries are the highest consumers of non-renewable energy sources.

Even in the US and Canada and the European Union, the regions making the most use of renewable sources, renewable energy made up only around 17% and 33%, respectively, of total energy consumed per capita in the 2010s.
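Shares like these can be computed directly from the aggregated columns. The sketch below uses invented numbers and combines member countries with an unweighted mean, which is an assumption made here for illustration; a population-weighted mean would be more faithful but needs population data that the dataset does not provide. The `Region` and `RenewableShare` column names and the `region_map` dictionary are illustrative.

```python
import pandas as pd

# Toy rows standing in for the decade-level table; values are invented.
toy = pd.DataFrame({
    "Entity": ["United States", "Canada", "European Union (27)"],
    "Decade": [2010, 2010, 2010],
    "TotalRenewablePerCap": [100.0, 300.0, 150.0],
    "TotalNonRenewablePerCap": [700.0, 500.0, 350.0],
})

# Map member countries to a region label; unmapped entities keep their own name.
region_map = {"United States": "US & Canada", "Canada": "US & Canada"}
toy["Region"] = toy["Entity"].map(region_map).fillna(toy["Entity"])

# Unweighted mean across member countries (an assumption, see the text above).
value_cols = ["TotalRenewablePerCap", "TotalNonRenewablePerCap"]
by_region = toy.groupby(["Region", "Decade"])[value_cols].mean().reset_index()

# Renewable share of total per capita consumption.
by_region["RenewableShare"] = by_region["TotalRenewablePerCap"] / (
    by_region["TotalRenewablePerCap"] + by_region["TotalNonRenewablePerCap"]
)
print(by_region)
```

Aggregating to one row per region and decade before plotting also guarantees that each bar represents a single regional value.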

In [9]:
# .copy() makes energyDF_region an independent DataFrame, so the assignments below do not trigger SettingWithCopyWarning
energyDF_region = energyDF[(energyDF['Entity'].isin(['Asia', 'European Union (27)']+opec_countries+us_and_canada))].copy()

energyDF_region['Entity'] = energyDF_region['Entity'].apply(lambda x: 'OPEC' if x in opec_countries else x)

energyDF_region['Entity'] = energyDF_region['Entity'].apply(lambda x: 'US & Canada' if x in us_and_canada else x)



In [10]:
fig = make_subplots(rows=2,
                    cols=2,
                    subplot_titles=("US & Canada", "European Union", "Asia", "OPEC Countries"))

fig.add_trace(go.Scatter(x=energyDF_region[energyDF_region['Entity']=='US & Canada']['Year'].sort_values(),
                     y=energyDF_region[energyDF_region['Entity']=='US & Canada']['TotalEnerygyPerCap']/1000,
                     name='Total Energy',
                     marker=dict(color='black')
                     ), row=1, col=1)

fig.add_trace(go.Scatter(x=energyDF_region[energyDF_region['Entity']=='European Union (27)']['Year'].sort_values(),
                     y=energyDF_region[energyDF_region['Entity']=='European Union (27)']['TotalEnerygyPerCap']/1000,
                    #  name='Reneweable Enerygy',
                     marker=dict(color='black',),
                     showlegend = False
                     ), row=1, col=2)

fig.add_trace(go.Scatter(x=energyDF_region[energyDF_region['Entity']=='Asia']['Year'].sort_values(),
                     y=energyDF_region[energyDF_region['Entity']=='Asia']['TotalEnerygyPerCap']/1000,
                    #  name='Reneweable Enerygy',
                     marker=dict(color='black'),
                     showlegend = False
                     ), row=2, col=1)

fig.add_trace(go.Scatter(x=energyDF_region[energyDF_region['Entity']=='OPEC']['Year'].sort_values(),
                     y=energyDF_region[energyDF_region['Entity']=='OPEC']['TotalEnerygyPerCap']/1000,
                    #  name='Reneweable Enerygy',
                     marker=dict(color='black'),
                     showlegend = False
                     ), row=2, col=2)

fig.update_traces(marker_line_width=0)

fig.update_layout(height=900,
                  width=1000,
                  bargap=0.4,
                #   barmode='stack',
                  title_text="Total Energy Consumption by Region (MWh)",
                  xaxis=dict(tickvals = df_region['Decade'].unique()),
                  xaxis2=dict(tickvals = df_region['Decade'].unique()),
                  xaxis3=dict(tickvals = df_region['Decade'].unique()),
                  xaxis4=dict(tickvals = df_region['Decade'].unique())
                  )

fig.show()

By region (US and Canada, OPEC countries, Asia, and the European Union), the line graphs display the trend in total energy use per person over the previous five decades. The US and Canada have essentially followed the same pattern as the European Union, although with greater swings. Per-person energy consumption in the US and Canada peaked in 1985 and 1988 at about 118 MWh. In contrast, the EU's energy consumption peaked at 43 MWh in 2004 and 2006, only 36% of the US and Canada peak. The trend appears to be downward, which is good news.

With their volatile curve, the OPEC nations turn out to be the heaviest energy consumers. In 2014 they reached 209 MWh per person, the highest consumption any region has seen. Since 2014, however, there has been a sharp decline in energy use, and in 2022 they used less energy per person than the US & Canada and the European Union.

Asian energy consumption per person appears to be rising almost continuously, which would be concerning. However, even at its peak (19 MWh in 2022), Asian energy consumption per capita is still lower than that of the EU and the US & Canada. If the trend persists, though, Asia may eventually surpass other regions.
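Peak years like those quoted above can be read off the chart, or extracted programmatically. A minimal sketch follows, assuming one row per region and year; the values are placeholders rather than figures from the dataset, and the column name is spelled as in the notebook.

```python
import pandas as pd

# Toy yearly totals (kWh per person); invented for illustration.
toy = pd.DataFrame({
    "Entity": ["Asia"] * 3 + ["European Union (27)"] * 3,
    "Year": [2004, 2013, 2022, 2004, 2013, 2022],
    "TotalEnerygyPerCap": [10_000.0, 12_000.0, 15_000.0,
                           40_000.0, 42_000.0, 38_000.0],
})

# idxmax returns, per region, the row label of the maximum value;
# .loc then pulls the full row so the peak year comes along with it.
peak_rows = toy.loc[toy.groupby("Entity")["TotalEnerygyPerCap"].idxmax()]
peaks = peak_rows.set_index("Entity")[["Year", "TotalEnerygyPerCap"]]
print(peaks)
```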

### Conclusion


Before the analysis, we posed two questions.

- How has global per capita energy consumption changed over time?
- How does the renewable vs. non-renewable makeup of this energy consumption compare?


To conclude, we summarize the answers suggested by the analysis in this report.

Based on the data visualizations, we observed the following.

- Global per capita energy consumption has increased markedly over the past decades, and its trajectory has been influenced by various factors, including technological advancements, economic development, population growth, and shifts in energy sources. Due to these advancements, the demand for energy has grown steadily, with no sign of slowing yet. We expect energy consumption to keep increasing as societies strive to move forward, fueled by the desire to improve quality of life.

- Up until 1990, total energy used per person remained essentially constant. Since 1990, the average person's energy usage has been rising at about 7% per decade. The proportion of renewable energy, on the other hand, has risen only slowly, at about 1% per decade, with Asia seeing the most growth in energy consumption of all the regions. This growing demand raises one major issue: which energy source is used to meet it. That question is at the heart of the debate between renewable and non-renewable energy sources.

- The rise of renewable energy sources and the environmental movement has made consumers more eco-friendly. Efforts to combat climate change and reduce greenhouse gas emissions are driving the global transition toward a greater reliance on renewable energy sources. However, the pace and success of this transition depend on numerous factors, including political will, economic considerations, and technological advancements. In some countries and regions, the transition to renewables has been more pronounced, with a growing percentage of electricity coming from wind, solar, and hydroelectric power. The OPEC countries have not yet reached sustainability and therefore rely more on non-renewable sources of energy. We attribute this to a lack of knowledge and awareness, which in turn harms both the environment and consumers. Growth in the renewable energy sector has been most notable in the US & Canada, averaging about 2% per decade, and in Europe, at about 3% per decade.
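Per-decade growth rates such as those quoted in the bullets above can be derived from the decade-level table with a percentage change between consecutive decades. The sketch below uses invented values chosen so that each step is a 7% increase; it is an illustration of the calculation, not a reproduction of the report's figures, and the `GrowthPct` column name is hypothetical.

```python
import pandas as pd

# Toy decade-level totals for a single entity; values are invented.
toy = pd.DataFrame({
    "Decade": [1990, 2000, 2010],
    "TotalEnerygyPerCap": [100.0, 107.0, 114.49],
})

# Sort so that each row is compared with the decade immediately before it.
toy = toy.sort_values("Decade").reset_index(drop=True)

# pct_change gives the fractional change from the previous row; the first row is NaN.
toy["GrowthPct"] = toy["TotalEnerygyPerCap"].pct_change() * 100
print(toy)
```

For a table with several entities, the same calculation would be applied within each entity, for example via `groupby('Entity')` before `pct_change()`.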


### Citations

1. Hannah Ritchie, Max Roser and Pablo Rosado (2022) - "Energy". Published online at OurWorldInData.org. Retrieved from: 'https://ourworldindata.org/energy' [Online Resource]

2. Data from Feenstra et al. (2015) Penn World Table v10.0 via Our World in Data.

*   Feenstra, Robert C., Robert Inklaar and Marcel P. Timmer (2015), “The Next Generation of the Penn World Table” American Economic Review, 105(10), 3150-3182, available for download at www.ggdc.net/pwt.

*   Max Roser (2013) – “Economic Growth”. Published online at OurWorldInData.org. Retrieved from: ‘https://ourworldindata.org/economic-growth’ [Online Resource]
