# A Study of Energy Sources used for Global Electricity Generation

___
### __Goals of the Visualizations__

There has been a worldwide movement in recent years to adopt more renewable energy sources in order to mitigate the impacts of global warming.  Calls are being made to increase the use of renewable energy for the generation of electricity by moving away from fossil fuels.

Electricity is generated by converting another form of energy into electrical power.  These primary sources are either low-carbon, such as hydro, solar, wind, biofuels, and nuclear, or high-carbon, such as coal, oil, and natural gas.

The world currently depends heavily on burning fossil fuels to generate electricity.  Despite good intentions to adopt renewables, the reality is that we will remain critically dependent on fossil fuels for electricity generation for many years to come.

The goal of these visualizations is to show how heavily humans rely on fossil fuels to generate electricity, and the sobering challenge of moving away from these sources to address global warming.

___
### __Dataset Import & Transform__

The "Data on Energy" dataset was selected from the "Our World in Data" GitHub repository (https://github.com/owid/energy-data).  This dataset contains extensive information on energy production and consumption.

A subset of the data from 1985 to 2018 was taken, focusing on the types of energy used to generate electricity on each continent.  The "energy-data.csv" and "continents.csv" files contain the raw data from the Our World in Data repository.  This data was cleaned externally using a Knime workflow to produce the "energy.csv" file, which is used here to study how electricity is generated globally.

#### *Knime workflow:*

![workflow](./workflow.svg)

In [3]:
# Importing Python packages
import pandas as pd
import altair as alt

In [48]:
# Import energy dataframe
energy = pd.read_csv('https://raw.githubusercontent.com/ryan-bulger/electricity-sources/main/data/energy.csv')

# Set N/A = 0
energy = energy.fillna(0)

In [76]:
# Format energy dataframe

# Change GDP to trillions of dollars
energy['GDP ($T)'] = energy['GDP ($)'] / 10**12

# Energy source groups
low_carbon_cols = ['Biofuel Power (TWh)', 'Hydro Power (TWh)', 'Other Renewable Power (TWh)',
                   'Solar Power (TWh)', 'Wind Power (TWh)', 'Nuclear Power (TWh)']
high_carbon_cols = ['Coal Power (TWh)', 'Oil Power (TWh)', 'Natural Gas Power (TWh)']
energy['Low Carbon Sources (TWh)'] = energy[low_carbon_cols].sum(axis=1)
energy['High Carbon Sources (TWh)'] = energy[high_carbon_cols].sum(axis=1)

# Per capita calculations
energy['GDP per capita ($/person)'] = energy['GDP ($)'] / energy['Population']
# TWh -> kWh: multiply by 10**9
energy['Electricity Generation per capita (kWh per person)'] = (energy['Low Carbon Sources (TWh)'] +
                                                                energy['High Carbon Sources (TWh)']) / energy['Population'] * 10**9
# MM tonnes -> tonnes: multiply by 10**6
energy['GHGs per capita (Tonnes CO2e per person)'] = energy['GHGs from Electricity Generation (MM tonnes of CO2e)'] / energy['Population'] * 10**6

# Display energy dataframe
energy

Unnamed: 0,Year,Continent,Biofuel Power (TWh),Coal Power (TWh),Natural Gas Power (TWh),Hydro Power (TWh),Nuclear Power (TWh),Oil Power (TWh),Other Renewable Power (TWh),Solar Power (TWh),...,Population,GDP ($),Electricity Demand (TWh),GHGs from Electricity Generation (MM tonnes of CO2e),Low Carbon Sources (TWh),High Carbon Sources (TWh),GDP per capita ($/person),Electricity Generation per capita (kWh per person),GHGs per capita (Tonnes CO2e per person),GDP ($T)
0,1985,Africa,0.00,135.403,9.236,10.904,5.315,11.911,0.00,0.00,...,540131194,1225082245818,0.00,0.00,16.219,156.550,2268.119782,319.864881,0.000000,1.225082
1,1986,Africa,0.00,135.988,10.289,11.488,8.803,13.242,0.00,0.00,...,555618223,1251227005661,0.00,0.00,20.291,159.519,2251.954587,323.621495,0.000000,1.251227
2,1987,Africa,0.00,142.773,11.716,11.911,6.167,15.011,0.00,0.00,...,571509829,1267464622308,0.00,0.00,18.078,169.500,2217.747724,328.214828,0.000000,1.267465
3,1988,Africa,0.00,143.032,13.156,13.110,10.493,16.779,0.00,0.00,...,587756630,1323389864320,0.00,0.00,23.603,172.967,2251.594957,334.441144,0.000000,1.323390
4,1989,Africa,0.00,148.462,13.715,13.629,11.099,17.412,0.00,0.00,...,604293660,1364475496994,0.00,0.00,24.728,179.589,2257.967586,338.108793,0.000000,1.364475
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
199,2014,South America,56.44,48.470,219.320,636.750,19.720,96.790,0.00,0.82,...,408493800,6223611926995,1094.33,243.65,729.250,364.580,15235.511352,2677.715060,0.596459,6.223612
200,2015,South America,58.74,52.190,233.360,636.930,20.410,96.790,0.00,1.71,...,412362690,6391625133915,1127.52,254.33,745.080,382.340,15500.008340,2734.049484,0.616763,6.391625
201,2016,South America,59.78,52.190,217.070,653.650,22.650,79.470,0.00,3.21,...,416164870,5922122056469,1129.29,235.09,780.100,348.730,14230.230573,2712.458646,0.564896,5.922122
202,2017,South America,60.59,49.390,222.240,659.150,20.570,66.320,0.06,5.39,...,419903920,5944422958512,1136.69,226.66,797.330,337.950,14156.626493,2703.666115,0.539790,5.944423
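
The per capita columns rely on two unit conversions: 1 TWh is 10^9 kWh, and 1 MM tonnes is taken here to mean 10^6 tonnes. As a quick sanity check, the sketch below redoes the arithmetic in plain Python for the South America 2017 row displayed above. The variable names are illustrative only and are not part of the notebook.

```python
# Values copied from the South America 2017 row of the energy dataframe
low_carbon_twh = 797.33
high_carbon_twh = 337.95
population = 419903920
gdp_dollars = 5944422958512
ghg_mm_tonnes = 226.66

# 1 TWh = 10**9 kWh
generation_kwh_per_person = (low_carbon_twh + high_carbon_twh) / population * 10**9
# 1 MM tonnes = 10**6 tonnes (assumed meaning of "MM")
ghg_tonnes_per_person = ghg_mm_tonnes / population * 10**6
gdp_per_person = gdp_dollars / population

print(round(generation_kwh_per_person, 3))
print(round(ghg_tonnes_per_person, 6))
print(round(gdp_per_person, 3))
```

The printed values should agree with the corresponding per capita columns of that row.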


___
### __Tasks of the Visualizations__

#### *Task Elicitation*  

>**Task 1**  
- Goal:
  - To understand how the mixture of energy sources used for global electricity generation has changed over time.
- Means:
    - Users will navigate the visualizations by scanning through the years to see what the breakdown has been between high and low carbon generation sources.
- Characteristics:
    - The task intention is to determine if there has been a trend downwards or upwards of the use of high carbon sources compared to low carbon sources.
- Target data:
    - The data used will be created from the total amount of electricity generated from each generation type.  These generation types will be broken out into two groups: High-carbon sources (Coal, Oil, & Natural Gas), and Low-carbon sources (Wind, Solar, Hydro, Biofuels, Nuclear, and Other Renewables).  These generation types and groups will be compared to the Year data.
- Workflow:
    - The visualization will be broken down into two charts.  The first chart will show a 100% area chart comparing the two generation groups (High-carbon & Low-carbon sources) on the y-axis versus the Year on the x-axis.  Users will be able to hover over this chart and a vertical line will track the cursor to show which year is currently being evaluated.  A second chart will be a waterfall chart showing the total generation and a breakdown of each generation type for the year that is being highlighted.
- Roles:
    - This visualization will be targeted towards the general public to assist in their understanding of global electricity generation.
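
The 100% breakdown described in this workflow comes down to dividing each group's generation by the yearly total. Below is a minimal pandas sketch using the Low/High Carbon totals from the South America 2014 and 2017 rows displayed earlier. The `toy` and `shares` names are illustrative assumptions, not part of the notebook.

```python
import pandas as pd

# Low/High Carbon totals (TWh) copied from two rows of the energy dataframe
toy = pd.DataFrame({
    'Year': [2014, 2014, 2017, 2017],
    'SourceType': ['High Carbon Sources', 'Low Carbon Sources'] * 2,
    'TWh': [364.58, 729.25, 337.95, 797.33],
})

# Share of each group within its year (each year's shares sum to 1)
toy['Share'] = toy['TWh'] / toy.groupby('Year')['TWh'].transform('sum')

# One row per year, one column per source group
shares = toy.pivot(index='Year', columns='SourceType', values='Share')
print(shares.round(3))
```

In Altair, the same normalization can be requested inside the chart with `stack='normalize'` on the y encoding, so the shares do not need to be precomputed.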
 
>**Task 2**  
- Goal:
    - To understand how greenhouse gases and the generation of electricity are connected.
- Means:
    - Users will use this visualization to compare the amount of greenhouse gases each continent emits over time.
- Characteristics:
    - Users will evaluate the low-level characteristics of the visualization by comparing the overall trends of greenhouse gases over time.
- Target data:
    - The greenhouse gas emissions will be compared to the Year for each individual continent.
- Workflow:
    - There will be 6 area charts, one for each continent, faceted into 3 rows by 2 columns.  Users will scan and compare the trends between each continent.
- Roles:
    - This visualization will be targeted towards the general public to assist in their understanding of global electricity generation.

>**Task 3**  
- Goal:
    - To understand how the population and per capita GDP of a continent affect the type of electricity generation and the resulting per capita greenhouse gas emissions from that generation.
- Means:
    - Users will be able to organize the data by filtering the view to a single continent to understand how generation and emissions are related to GDP.
- Characteristics:
    - The users will observe the high-level characteristics of patterns and overall trends for each continent, and how those trends are comparable to the other continents.
- Target data:
    - The GDP per capita will be compared to both the Electricity Generation per capita and the Greenhouse Gases per capita.
- Workflow:
    - There will be two bubble charts with the GDP per capita on the x-axis for both charts.  The first chart will compare the Generation per capita on the y-axis to the GDP per capita for each continent, and the second chart will compare the Greenhouse Gases per capita on the y-axis to the GDP per capita for each continent.  The size of the bubbles will indicate the population.  All years will be displayed, with bubbles getting a darker hue by year to allow the user to observe how the data changes over time.
- Roles:
    - This visualization will be targeted towards the general public to assist in their understanding of global electricity generation.

___
### __Visualization Implementation__

##### *Low-fidelity Prototyping:*

<img src="Task1-2.jpg" width=500>
<img src="Task3.jpg" width=500>

#### *Summary & Justification of the Key Design Elements:*


___
### __Visualization Evaluation__

The target question you want to answer:  
- .....  

The 3 people who you recruited to answer that question:  
- ....

Evaluation Procedure  
- The kinds of measures you would use to answer your data (e.g., insight depth, use cases, accuracy) and what these measures would tell you about the core question
- The approach you will use to answer that question (e.g., a journaling study, a formal experiment, etc.)
- How you would instantiate those methods (i.e., what would your participants do?)
- What criteria would you use to indicate that your visualization was successful

Results of the evaluation
- ....

How has your plan changed after the evaluation was completed?
- ....

___
### __Visualizations__

#### Task 1 - Mixture of Global Electricity Generation Types

In [74]:
# Create generation dataframe

# Renaming cols
generation = energy.rename({'Biofuel Power (TWh)':'Biofuel',
                            'Coal Power (TWh)':'Coal',
                            'Natural Gas Power (TWh)':'Nat. Gas',
                            'Hydro Power (TWh)':'Hydro',
                            'Nuclear Power (TWh)':'Nuclear',
                            'Oil Power (TWh)':'Oil',
                            'Other Renewable Power (TWh)':'Other',
                            'Solar Power (TWh)':'Solar',
                            'Wind Power (TWh)':'Wind'
                            },
                            axis=1)

# Pivoting dataframe from wide to long format
generation = generation.melt(id_vars=['Year'], 
                            value_vars=['Coal','Biofuel','Nat. Gas','Hydro','Nuclear','Oil','Other','Solar','Wind'],
                            var_name='EnergyType',
                            value_name='TWh')

# Adding High/Low Carbon nomenclature to dataframe
high_c = ['Coal', 'Oil', 'Nat. Gas']
generation['SourceType'] = ['High Carbon Sources' if x in high_c else 'Low Carbon Sources' for x in generation['EnergyType']]

# Display generation dataframe
generation

Unnamed: 0,Year,EnergyType,TWh,SourceType
0,1985,Coal,135.403,High Carbon Sources
1,1986,Coal,135.988,High Carbon Sources
2,1987,Coal,142.773,High Carbon Sources
3,1988,Coal,143.032,High Carbon Sources
4,1989,Coal,148.462,High Carbon Sources
...,...,...,...,...
1831,2014,Wind,15.520,Low Carbon Sources
1832,2015,Wind,27.290,Low Carbon Sources
1833,2016,Wind,40.810,Low Carbon Sources
1834,2017,Wind,51.570,Low Carbon Sources


In [54]:
# Create High/Low Carbon stacked bar chart
bar_genType = alt.Chart(generation).mark_bar(
    color='grey',
    opacity=0.7,
    size=13
).encode(
    x=alt.X(
        'Year:N',
        title=None),
    y=alt.Y(
        'sum(TWh):Q',
        stack=True,
        #axis=alt.Axis(format='%'),
        title='Terawatt Hours'),
    color=alt.Color(
        'SourceType:N', 
        scale=alt.Scale(range=['#303030','green']),
        legend=alt.Legend(
            direction='horizontal',
            orient='top'),
        title=None),
    order = alt.Order('SourceType', sort='descending'),
).properties(
    height=300,
    width=450,
)

# Create Generation Energy Type bar chart
bar_genSource = alt.Chart(generation).mark_bar().encode(
    x=alt.X(
        'EnergyType:N',
        sort='-y',
        title=None,
        ),
    y=alt.Y(
        'sum(TWh):Q',
        title='Terawatt Hours',
        scale=alt.Scale(
            domainMin=0
        )),
    color=alt.Color(
        'EnergyType',
        scale=alt.Scale(
            domain=['Coal','Oil','Nat. Gas','Hydro','Solar','Wind','Biofuel','Nuclear','Other'],
            range=['#303030','#303030','#303030','green','green','green','green','green','green'],
        ),
        legend=None
    )
)

In [123]:
# Multi-selection on Year, triggered by hovering (hold <Shift> to add years)
interval = alt.selection_multi(fields=['Year'], on='mouseover')

# Background genType bar chart
bar_genType_background = bar_genType.add_selection(interval)

# Selected genType bar chart 
bar_genType_selected = bar_genType.transform_filter(interval).mark_bar().encode(
    y=alt.Y(
        'sum(TWh):Q',
        stack=True),
    color=alt.Color(
        'SourceType:N', 
        scale=alt.Scale(range=['#303030','green']),
        legend=alt.Legend(
            direction='horizontal',
            orient='top'),
        title=None)
)

# Selected genSource bar chart
bar_genSource_selected = bar_genSource.transform_filter(interval)

# Concatenate charts
alt.concat(bar_genType_background + bar_genType_selected, bar_genSource_selected
    ).resolve_scale(color='independent',
    ).properties(title=alt.TitleParams(
        text='Global Electricity Generation Sources',
        fontSize=20,
        subtitle='*Hover mouse to highlight a year.  Hold down <Shift> to select multiple years.',
        subtitleFontSize=10
    ))

#### Task 2 - Greenhouse Gas Emissions from Electricity Generation

In [73]:
# Create Greenhouse Gases dataframe

# Filter to years >= 2000 because GHGs were not recorded for all continents prior to 2000.
# .copy() makes ghg an independent dataframe, which avoids pandas' SettingWithCopyWarning.
ghg = energy[energy['Year']>1999].copy()

# Rename cols
ghg = ghg.rename(columns={'GHGs from Electricity Generation (MM tonnes of CO2e)':'GHGs'})
ghg = ghg[['Year','Continent','GHGs']]

# Display ghg dataframe
ghg


Unnamed: 0,Year,Continent,GHGs
15,2000,Africa,236.50
16,2001,Africa,244.20
17,2002,Africa,256.58
18,2003,Africa,273.31
19,2004,Africa,287.44
...,...,...,...
199,2014,South America,243.65
200,2015,South America,254.33
201,2016,South America,235.09
202,2017,South America,226.66


In [127]:
# Create GHGs visualization

# Continent selector
selection = alt.selection_multi(fields=['Continent'], on='mouseover')

# Faceted GHGs area charts
ghg_facet = alt.Chart(ghg).mark_area(
    opacity=0.7,
).encode(
    x=alt.X('Year:N', title=None, axis=None),
    y=alt.Y('GHGs', title=None, axis=None),
    color=alt.Color('Continent:N', legend=None),
    opacity=alt.condition(selection, alt.value(1), alt.value(0.4)),
    facet=alt.Facet(
        'Continent:N',
        columns=2,
        title=None,
        header=alt.Header(labelOrient='bottom', labelAnchor='start')),
).properties(
    width=200,
    height=80,
).add_selection(selection)

# GHGs details bar chart
ghg_zoom = alt.Chart(ghg).mark_bar(color='#303030').encode(
    x=alt.X('Year:N', title=None),
    y=alt.Y('sum(GHGs):Q', title='Emissions (MM tonnes CO2e)'),
).properties(
    width=400,
    height=300,
).transform_filter(selection)

# Concatenating to create GHGs visualization
alt.concat(ghg_facet, ghg_zoom
).resolve_scale(color='independent',
).properties(title=alt.TitleParams(
    text='Greenhouse Gas Emissions Created by Electricity Generation',
    fontSize=20,
    subtitle='*Hover mouse over area charts to view discrete emissions details.',
    subtitleFontSize=10)
)


#### Task 3 - Influence that Population and GDP have on Electricity Demand and Greenhouse Gases

In [81]:
# Create GDP dataframe

# Filter to years >= 2000 because GHGs were not recorded for all continents prior to 2000
gdp = energy[energy['Year']>1999]

# Display gdp dataframe
gdp

Unnamed: 0,Year,Continent,Biofuel Power (TWh),Coal Power (TWh),Natural Gas Power (TWh),Hydro Power (TWh),Nuclear Power (TWh),Oil Power (TWh),Other Renewable Power (TWh),Solar Power (TWh),...,Population,GDP ($),Electricity Demand (TWh),GHGs from Electricity Generation (MM tonnes of CO2e),Low Carbon Sources (TWh),High Carbon Sources (TWh),GDP per capita ($/person),Electricity Generation per capita (kWh per person),GHGs per capita (Tonnes CO2e per person),GDP ($T)
15,2000,Africa,2.15,194.58,96.99,74.51,13.01,38.55,0.43,0.00,...,804634502,2150734280526,424.24,236.50,90.33,330.12,2672.933208,522.535386,0.293922,2.150734
16,2001,Africa,2.14,199.19,103.82,81.21,10.72,39.17,0.48,0.00,...,824299002,2292132649491,440.66,244.20,95.01,342.18,2780.705356,530.377932,0.296252,2.292133
17,2002,Africa,2.13,205.12,120.74,83.40,11.99,38.02,0.39,0.02,...,844449025,2450288937929,464.51,256.58,98.36,363.88,2901.642213,547.386504,0.303843,2.450289
18,2003,Africa,2.00,219.94,127.92,81.69,12.66,39.52,0.79,0.02,...,865145959,2622270527264,487.25,273.31,97.77,387.38,3031.015171,560.772428,0.315912,2.622271
19,2004,Africa,2.23,229.72,144.04,86.54,14.28,36.65,1.03,0.02,...,886457068,2823553364349,516.33,287.44,104.88,410.41,3185.211632,581.291547,0.324257,2.823553
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
199,2014,South America,56.44,48.47,219.32,636.75,19.72,96.79,0.00,0.82,...,408493800,6223611926995,1094.33,243.65,729.25,364.58,15235.511352,2677.715060,0.596459,6.223612
200,2015,South America,58.74,52.19,233.36,636.93,20.41,96.79,0.00,1.71,...,412362690,6391625133915,1127.52,254.33,745.08,382.34,15500.008340,2734.049484,0.616763,6.391625
201,2016,South America,59.78,52.19,217.07,653.65,22.65,79.47,0.00,3.21,...,416164870,5922122056469,1129.29,235.09,780.10,348.73,14230.230573,2712.458646,0.564896,5.922122
202,2017,South America,60.59,49.39,222.24,659.15,20.57,66.32,0.06,5.39,...,419903920,5944422958512,1136.69,226.66,797.33,337.95,14156.626493,2703.666115,0.539790,5.944423


In [121]:
# Create GDP vs Generation chart
gdp_gen = alt.Chart(gdp).mark_circle().encode(
    x='GDP per capita ($/person):Q',
    y='Electricity Generation per capita (kWh per person):Q',
    color='Continent:N',
    opacity='Population:Q',
    size=alt.Size('Year:O', legend=alt.Legend(symbolLimit=10)),
).properties(
    height=400,
    width=300
).interactive()

# Create GDP vs GHGs chart
gdp_ghg = gdp_gen.encode(
    y='GHGs per capita (Tonnes CO2e per person):Q'
)

# Create Per Capita Demand / GHGs visualization
alt.hconcat(gdp_gen, gdp_ghg).properties(
    title=alt.TitleParams(
        text='Per Capita Electricity Demand and Greenhouse Gas Emissions',
        fontSize=20,
        subtitle='*Scroll to zoom and drag to move around charts.',
        subtitleFontSize=10)
)

___
### __Synthesis of Visualization Findings__

- what elements of your approach worked well
- what elements you would refine in future iterations
    - task 1
        - don't like the 1990 and 1999 steps in the 100% area chart
        - no easy way to make a waterfall chart in Altair
    - I didn't like how the 2nd viz relied on motion (users can't remember where things were)
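
One possible route to the waterfall chart mentioned above is to precompute where each bar starts and ends, then draw floating bars. The sketch below does that precomputation in pandas for the South America 2017 values displayed earlier. The Wind value is inferred as the Low Carbon total minus the other low-carbon columns, and the `start` and `end` column names are assumptions made for illustration.

```python
import pandas as pd

# Generation by type (TWh) for South America, 2017
waterfall = pd.DataFrame({
    'EnergyType': ['Hydro', 'Nat. Gas', 'Oil', 'Biofuel', 'Wind',
                   'Coal', 'Nuclear', 'Solar', 'Other'],
    'TWh': [659.15, 222.24, 66.32, 60.59, 51.57, 49.39, 20.57, 5.39, 0.06],
})

# Largest contributors first, then stack each bar on top of the previous one
waterfall = waterfall.sort_values('TWh', ascending=False).reset_index(drop=True)
waterfall['end'] = waterfall['TWh'].cumsum()
waterfall['start'] = waterfall['end'] - waterfall['TWh']

print(waterfall)
```

Each row could then be drawn as a bar spanning `start` to `end`, for example by mapping the two columns to Altair's `y` and `y2` encoding channels.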