# A Study of Energy Sources used for Global Electricity Generation

___
### __Goals of the Visualizations__

There has been a worldwide movement in recent years to adopt more renewable energy sources in order to mitigate the impacts of global warming.  Calls are being made to increase the use of renewable energy for the generation of electricity by moving away from fossil fuels.

Power is generated by converting some other form of energy from its raw form into electricity.  These raw sources can be low-carbon, such as hydro, solar, wind, biofuels, and nuclear, or high-carbon, such as coal, oil, and natural gas.

Worldwide, electricity generation is currently heavily dependent on the burning of fossil fuels.  Despite the good intentions behind adopting renewables, the reality is that we will need to remain critically dependent on fossil fuels for electricity generation for many years to come.

The goal of these visualizations is to show the overwhelming reliance humans have on fossil fuels for the generation of electricity and the sobering reality of the challenge of moving away from these sources to overcome global warming.

___
### __Dataset Import & Transform__

The "Data on Energy" dataset was selected from the "Our World in Data" GitHub repository (https://github.com/owid/energy-data).  This dataset contains a large amount of information on energy production and consumption.

A subset of the data was taken from 1965 to 2018, focusing on the types of energy used to generate electricity on each continent.  The "energy-data.csv" and "continents.csv" files contain the raw data from the Our World in Data repository.  This data was cleaned externally using a Knime workflow to produce the "energy.csv" file.  The resulting "energy" dataset will be used to study how electricity is generated globally.

#### *Knime workflow:*

![workflow](./workflow.svg)

#### *Initializing Python packages:*

In [3]:
import pandas as pd
import altair as alt

#### *Importing the "energy" dataframe & adding new columns:*

In [304]:
energy = pd.read_csv('https://raw.githubusercontent.com/ryan-bulger/electricity-sources/main/data/energy.csv')
energy = energy.fillna(0)
energy

Unnamed: 0,Year,Continent,Biofuel Power (TWh),Coal Power (TWh),Natural Gas Power (TWh),Hydro Power (TWh),Nuclear Power (TWh),Oil Power (TWh),Other Renewable Power (TWh),Solar Power (TWh),Wind Power (TWh),Population,GDP ($),Electricity Demand (TWh),GHGs from Electricity Generation (MM tonnes of CO2e)
0,1965,Africa,0.00,0.00,0.00,3.382,0.00,0.00,0.00,0.00,0.00,217004448,467523374071,0.00,0.00
1,1966,Africa,0.00,0.00,0.00,3.270,0.00,0.00,0.00,0.00,0.00,222484248,477387315747,0.00,0.00
2,1967,Africa,0.00,0.00,0.00,3.335,0.00,0.00,0.00,0.00,0.00,228103883,484779771596,0.00,0.00
3,1968,Africa,0.00,0.00,0.00,4.646,0.00,0.00,0.00,0.00,0.00,233886377,508462028666,0.00,0.00
4,1969,Africa,0.00,0.00,0.00,5.741,0.00,0.00,0.00,0.00,0.00,239862748,553104580527,0.00,0.00
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
319,2014,South America,56.44,48.47,219.32,636.750,19.72,96.79,0.00,0.82,15.52,408493800,6223611926995,1094.33,243.65
320,2015,South America,58.74,52.19,233.36,636.930,20.41,96.79,0.00,1.71,27.29,412362690,6391625133915,1127.52,254.33
321,2016,South America,59.78,52.19,217.07,653.650,22.65,79.47,0.00,3.21,40.81,416164870,5922122056469,1129.29,235.09
322,2017,South America,60.59,49.39,222.24,659.150,20.57,66.32,0.06,5.39,51.57,419903920,5944422958512,1136.69,226.66


In [305]:

#Change Year column to datetime
energy['Year'] = pd.to_datetime(energy['Year'], format='%Y')

# Energy source groups
low_carbon_cols = ['Biofuel Power (TWh)', 'Hydro Power (TWh)', 'Other Renewable Power (TWh)',
                   'Solar Power (TWh)', 'Wind Power (TWh)', 'Nuclear Power (TWh)']
high_carbon_cols = ['Coal Power (TWh)', 'Oil Power (TWh)', 'Natural Gas Power (TWh)']
# Row-wise totals for each group (missing values were already filled with 0 above)
energy['Low Carbon Sources (TWh)'] = energy[low_carbon_cols].sum(axis=1)
energy['High Carbon Sources (TWh)'] = energy[high_carbon_cols].sum(axis=1)

# Per capita calculations
energy['GDP per capita ($/person)'] = energy['GDP ($)'] / energy['Population']
# Multiplying by 1,000,000 expresses the value per one million people
energy['Electricity Generation per capita (TWh/1MM people)'] = (energy['Low Carbon Sources (TWh)'] + \
                                                                energy['High Carbon Sources (TWh)']) / energy['Population'] * 1000000
energy['GHGs per capita (MM Tonnes CO2e per MM people)'] = energy['GHGs from Electricity Generation (MM tonnes of CO2e)'] / energy['Population'] * 1000000
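
As a quick sanity check on the units, the per capita arithmetic can be reproduced by hand for a single row. The sketch below uses the South America 2017 values shown in the dataframe output of this notebook; the helper name `per_million_people` is hypothetical and is not part of the notebook.

```python
def per_million_people(total, population):
    """Scale a total (e.g. TWh or MM tonnes) to a value per one million people."""
    return total / population * 1_000_000


# South America, 2017 (values copied from the dataframe output in this notebook)
low_carbon_twh = 797.33
high_carbon_twh = 337.95
ghg_mm_tonnes = 226.66
population = 419_903_920
gdp_dollars = 5_944_422_958_512

generation_pc = per_million_people(low_carbon_twh + high_carbon_twh, population)
ghg_pc = per_million_people(ghg_mm_tonnes, population)
gdp_pc = gdp_dollars / population

print(f'Generation per capita: {generation_pc:.6f} TWh per MM people')
print(f'GHGs per capita:       {ghg_pc:.6f} MM tonnes CO2e per MM people')
print(f'GDP per capita:        {gdp_pc:.2f} $/person')
```

These figures should agree with the last row of the final "energy" dataframe to the displayed precision.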

#### *Displaying the final version of the "energy" dataframe:*

In [306]:
energy

Unnamed: 0,Year,Continent,Biofuel Power (TWh),Coal Power (TWh),Natural Gas Power (TWh),Hydro Power (TWh),Nuclear Power (TWh),Oil Power (TWh),Other Renewable Power (TWh),Solar Power (TWh),Wind Power (TWh),Population,GDP ($),Electricity Demand (TWh),GHGs from Electricity Generation (MM tonnes of CO2e),Low Carbon Sources (TWh),High Carbon Sources (TWh),GDP per capita ($/person),Electricity Generation per capita (TWh/1MM people),GHGs per capita (MM Tonnes CO2e per MM people)
0,1965-01-01,Africa,0.00,0.00,0.00,3.382,0.00,0.00,0.00,0.00,0.00,217004448,467523374071,0.00,0.00,3.382,0.00,2154.441434,0.015585,0.000000
1,1966-01-01,Africa,0.00,0.00,0.00,3.270,0.00,0.00,0.00,0.00,0.00,222484248,477387315747,0.00,0.00,3.270,0.00,2145.712876,0.014698,0.000000
2,1967-01-01,Africa,0.00,0.00,0.00,3.335,0.00,0.00,0.00,0.00,0.00,228103883,484779771596,0.00,0.00,3.335,0.00,2125.258743,0.014621,0.000000
3,1968-01-01,Africa,0.00,0.00,0.00,4.646,0.00,0.00,0.00,0.00,0.00,233886377,508462028666,0.00,0.00,4.646,0.00,2173.970264,0.019864,0.000000
4,1969-01-01,Africa,0.00,0.00,0.00,5.741,0.00,0.00,0.00,0.00,0.00,239862748,553104580527,0.00,0.00,5.741,0.00,2305.921137,0.023935,0.000000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
319,2014-01-01,South America,56.44,48.47,219.32,636.750,19.72,96.79,0.00,0.82,15.52,408493800,6223611926995,1094.33,243.65,729.250,364.58,15235.511352,2.677715,0.596459
320,2015-01-01,South America,58.74,52.19,233.36,636.930,20.41,96.79,0.00,1.71,27.29,412362690,6391625133915,1127.52,254.33,745.080,382.34,15500.008340,2.734049,0.616763
321,2016-01-01,South America,59.78,52.19,217.07,653.650,22.65,79.47,0.00,3.21,40.81,416164870,5922122056469,1129.29,235.09,780.100,348.73,14230.230573,2.712459,0.564896
322,2017-01-01,South America,60.59,49.39,222.24,659.150,20.57,66.32,0.06,5.39,51.57,419903920,5944422958512,1136.69,226.66,797.330,337.95,14156.626493,2.703666,0.539790


___
### __Tasks of the Visualizations__

#### *Task Elicitation*  

>**Task 1**  
- Goal:
    - To understand how the mixture of energy sources used for global electricity generation has changed over time.
- Means:
    - Users will navigate the visualizations by scanning through the years to see what the breakdown has been between high and low carbon generation sources.
- Characteristics:
    - The task intention is to determine whether the use of high carbon sources has trended downwards or upwards compared to low carbon sources.
- Target data:
    - The data used will be created from the total amount of electricity generated from each generation type.  These generation types will be broken out into two groups: High-carbon sources (Coal, Oil, & Natural Gas), and Low-carbon sources (Wind, Solar, Hydro, Biofuels, Nuclear, and Other Renewables).  These generation types and groups will be compared to the Year data.
- Workflow:
    - The visualization will be broken down into two charts.  The first chart will show a 100% area chart comparing the two generation groups (High-carbon & Low-carbon sources) on the y-axis versus the Year on the x-axis.  Users will be able to hover over this chart and a vertical line will track the cursor to show which year is currently being evaluated.  A second chart will be a waterfall chart showing the total generation and a breakdown of each generation type for the year that is being highlighted.
- Roles:
    - This visualization will be targeted towards the general public to assist in their understanding of global electricity generation.
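
A 100% area chart plots each group's share of the total rather than its absolute value. The arithmetic behind that normalization can be sketched for a single row as follows; the function name `normalized_shares` is hypothetical, and the example uses only the South America 2017 row, whereas the actual chart sums over all continents for each year.

```python
def normalized_shares(totals):
    """Return each group's fraction of the grand total (fractions sum to 1)."""
    grand_total = sum(totals.values())
    return {name: value / grand_total for name, value in totals.items()}


# South America, 2017 (TWh, copied from the dataframe output in this notebook)
south_america_2017 = {'Low Carbon': 797.33, 'High Carbon': 337.95}

shares = normalized_shares(south_america_2017)
for name, share in shares.items():
    print(f'{name}: {share:.1%}')
```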
 
>**Task 2**  
- Goal:
    - To understand how greenhouse gases and the generation of electricity are connected.
- Means:
    - Users will use this visualization to compare the amount of greenhouse gases emitted by each continent over time.
- Characteristics:
    - Users will evaluate the low-level characteristics of the visualization by comparing the overall trends of greenhouse gases over time.
- Target data:
    - The greenhouse gas emissions will be compared to the Year for each individual continent.
- Workflow:
    - There will be 6 area charts, one for each continent, faceted into 3 rows by 2 columns.  Users will scan and compare the trends between each continent.
- Roles:
    - This visualization will be targeted towards the general public to assist in their understanding of global electricity generation.

>**Task 3**  
- Goal:
    - To understand how the population and per capita GDP of a continent affect the type of electricity generation and the subsequent per capita amount of greenhouse gas emissions that come from that generation.
- Means:
    - Users will be able to organize the data by filtering the view to a single continent to understand how generation and emissions are related to GDP.
- Characteristics:
    - The users will observe the high-level characteristics of patterns and overall trends for each continent, and how those trends are comparable to the other continents.
- Target data:
    - The GDP per capita will be compared to both the Electricity Generation per capita and the Greenhouse Gases per capita.
- Workflow:
    - There will be two bubble charts with the GDP per capita on the x-axis for both charts.  The first chart will compare the Generation per capita on the y-axis to the GDP per capita for each continent, and the second chart will compare the Greenhouse Gases per capita on the y-axis to the GDP per capita for each continent.  The size of the bubbles will indicate the population.  All years will be displayed, with bubbles getting a darker hue by year to allow the user to observe how the data changes over time.
- Roles:
    - This visualization will be targeted towards the general public to assist in their understanding of global electricity generation.

___
### __Visualization Implementation__

##### *Low-fidelity Prototyping:*

<img src="Task1-2.jpg" width=500>
<img src="Task3.jpg" width=500>

#### *Summary & Justification of the Key Design Elements:*


___
### __Visualization Evaluation__

The target question you want to answer:  
- .....  

The 3 people who you recruited to answer that question:  
- ....

Evaluation Procedure  
- The kinds of measures you would use to answer your data (e.g., insight depth, use cases, accuracy) and what these measures would tell you about the core question
- The approach you will use to answer that question (e.g., a journaling study, a formal experiment, etc.)
- How you would instantiate those methods (i.e., what would your participants do?)
- What criteria would you use to indicate that your visualization was successful

Results of the evaluation
- ....

How has your plan changed after the evaluation was completed?
- ....

___
### __Visualizations__

#### Task 1 - Mixture of Global Electricity Generation Types

In [430]:
# Create energy_mix dataframe
energy_mix = energy.rename({'Low Carbon Sources (TWh)':'Low Carbon', 'High Carbon Sources (TWh)':'High Carbon'}, axis=1)
energy_mix = energy_mix.melt(id_vars=['Year'], value_vars=['Low Carbon','High Carbon'], var_name='SourceType', value_name='TWh')
energy_mix

Unnamed: 0,Year,SourceType,TWh
0,1965-01-01,Low Carbon,3.382
1,1966-01-01,Low Carbon,3.270
2,1967-01-01,Low Carbon,3.335
3,1968-01-01,Low Carbon,4.646
4,1969-01-01,Low Carbon,5.741
...,...,...,...
643,2014-01-01,High Carbon,364.580
644,2015-01-01,High Carbon,382.340
645,2016-01-01,High Carbon,348.730
646,2017-01-01,High Carbon,337.950


In [432]:
# Create Generation Type base chart
area_chart = alt.Chart(energy_mix).mark_area().encode(
    x=alt.X(
        'Year:T',
        title=None),
    # stack='normalize' rescales each year's stack so the groups sum to 100%
    y=alt.Y(
        'sum(TWh):Q',
        stack='normalize',
        axis=alt.Axis(format='%'),
        title=None),
    # The color domain sorts alphabetically: High Carbon is dark grey, Low Carbon is green
    color=alt.Color(
        'SourceType:N',
        scale=alt.Scale(range=['#303030','green']),
        legend=alt.Legend(
            direction='horizontal',
            orient='top'),
        title=None),
    order=alt.Order('SourceType', sort='ascending'),
).properties(
    height=300,
    width=450,
)
area_chart

In [444]:
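# Shared base: stacked area of generation (TWh) by source group.
# The color encoding below overrides any mark-level color, so only opacity is set on the mark.
base = alt.Chart(energy_mix).mark_area(
    opacity=0.3
).encode(
    x='Year:T',
    y='sum(TWh):Q',
    color=alt.Color(
        'SourceType:N',
        scale=alt.Scale(range=['#303030','green']),
        legend=alt.Legend(
            direction='horizontal',
            orient='top'),
        title=None),
)

# Interval selection along the x-axis; an empty brush selects every year
brush = alt.selection_interval(encodings=['x'], empty='all')

# The faded background carries the brush; the brushed years are redrawn at full opacity
background = base.add_selection(brush)
selected = base.transform_filter(brush).mark_area()

# Left: overview with the brushed range highlighted; right: the brushed range only
background + selected | selected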

In [383]:
# Add vertical line to Generation Type area chart

# Create a selection that chooses the nearest point & selects based on Year
nearest = alt.selection(
    type='single',
    nearest=True,
    on='mouseover',
    fields=['Year'],
    empty='none')

# Year value on mouse hover
x_selectors = alt.Chart(energy_mix).mark_point().encode(
    x='Year:T',
    opacity=alt.value(0),
).add_selection(nearest)

# Draw a vertical line at the location of the mouse hover
rules = alt.Chart(energy_mix).mark_rule(color='black').encode(
    x='Year:T',
).transform_filter(nearest)

# Layer line on top of Generation Type area chart
area_chart = alt.layer(area_chart, x_selectors, rules)

area_chart

In [365]:
# Create generation dataframe
generation = energy.rename({'Biofuel Power (TWh)':'Biofuel',
                            'Coal Power (TWh)':'Coal',
                            'Natural Gas Power (TWh)':'Nat. Gas',
                            'Hydro Power (TWh)':'Hydro',
                            'Nuclear Power (TWh)':'Nuclear',
                            'Oil Power (TWh)':'Oil',
                            'Other Renewable Power (TWh)':'Other',
                            'Solar Power (TWh)':'Solar',
                            'Wind Power (TWh)':'Wind'
                            },
                            axis=1)
generation = generation.melt(id_vars=['Year'], 
                            value_vars=['Coal','Biofuel','Nat. Gas','Hydro','Nuclear','Oil','Other','Solar','Wind'],
                            var_name='Source Type',
                            value_name='TWh')
generation

Unnamed: 0,Year,Source Type,TWh
0,1965-01-01,Coal,0.00
1,1966-01-01,Coal,0.00
2,1967-01-01,Coal,0.00
3,1968-01-01,Coal,0.00
4,1969-01-01,Coal,0.00
...,...,...,...
2911,2014-01-01,Wind,15.52
2912,2015-01-01,Wind,27.29
2913,2016-01-01,Wind,40.81
2914,2017-01-01,Wind,51.57


In [386]:
# Create Total Generation bar chart (used in place of the planned waterfall chart)
breakdown = alt.Chart(generation).mark_bar().encode(
    x=alt.X(
        'Source Type:N',
        sort='-y'),
    y=alt.Y(
        'sum(TWh):Q',
        title='Terawatt Hours'),
    color=alt.Color(
        'Source Type',
        scale=alt.Scale(
            domain=['Coal','Oil','Nat. Gas','Hydro','Solar','Wind','Biofuel','Nuclear','Other'],
            range=['#303030','#303030','#303030','green','green','green','green','green','green'],
        ),
        legend=None
    )
)
breakdown

In [387]:
# Vertically concatenate the Generation Type area chart with the Total Generation bar chart
charts = area_chart & breakdown
charts.properties(
    title='Mixture of Global Electricity Generation Types'
).configure_title(
    fontSize=20
)

In [151]:
import altair as alt
from vega_datasets import data

states = alt.topo_feature(data.us_10m.url, 'states')
source = data.income.url

alt.Chart(source).mark_geoshape().encode(
    shape='geo:G',
    color='pct:Q',
    tooltip=['name:N', 'pct:Q'],
    facet=alt.Facet('group:N', columns=2),
).transform_lookup(
    lookup='id',
    from_=alt.LookupData(data=states, key='id'),
    as_='geo'
).properties(
    width=300,
    height=175,
).project(
    type='albersUsa'
)

___
### __Synthesis of Visualization Findings__

- what elements of your approach worked well
- what elements you would refine in future iterations
    - task 1
        - don't like the steps at 1990 and 1999 in the 100% area chart
        - no easy way to make a waterfall chart in Altair
    - I didn't like how the 2nd viz relied on motion (users can't remember where things were)