![Callysto.ca Banner](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-top.jpg?raw=true)

<a href="https://hub.callysto.ca/jupyter/hub/user-redirect/git-pull?repo=https%3A%2F%2Fgithub.com%2Fcallysto%2Fcurriculum-notebooks&branch=master&subPath=Science/CanadianElectricityGeneration/canadian-electricity-generation.ipynb&depth=1" target="_parent"><img src="https://raw.githubusercontent.com/callysto/curriculum-notebooks/master/open-in-callysto-button.svg?sanitize=true" width="123" height="24" alt="Open in Callysto"/></a>

# Canadian Electricity Generation

We are going to investigate how electricity is produced in Canada. We'll start by importing the data from a CSV file that we've prepared from 2018 data published by [Natural Resources Canada](https://www.nrcan.gc.ca/our-natural-resources/energy-sources-distribution/10728). We can then visualize it with a [stacked bar graph](https://chartio.com/learn/charts/stacked-bar-chart-complete-guide).

Run all of the code in this notebook by clicking the `►►` button above, or by clicking on the `Cell` menu and selecting `Run All`.

In [None]:
import pandas as pd
elec = pd.read_csv('canadian-electricity-generation.csv')

import plotly.express as px
px.bar(elec, x='Area', y='Percent', color='Source', title='Electricity Generation in Canada by Area')

<br><br>
The stacked bar graph we just created is interactive. You can click on a word in the legend to remove or add that source, or double-click a word to show only that source.

What are some things you observe about electricity generation in Canada based on this graph?

## Filtering the Data

We can also show only the renewable energy sources from the data set using the code in the following cell.

You can change

`elec[elec['Renewable']=='Renewable']`

to

`elec[elec['Renewable']=='Nonrenewable']`

and then click the `►Run` button if you'd like to see a graph with the nonrenewable sources.

*(you might also want to change `title='Renewable` to `title='Nonrenewable`)*

In [None]:
px.bar(elec[elec['Renewable']=='Renewable'], x='Area', y='Percent', color='Source', title='Renewable Electricity Generation in Canada')
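
The filter in the cell above works by building a boolean mask. Here is a minimal sketch of that step on its own, using a small made-up table. The `sample` rows, area names, and numbers are illustrative assumptions, not the real 2018 data.

```python
import pandas as pd

# Hypothetical rows for illustration only; not the real dataset
sample = pd.DataFrame({
    'Area': ['A', 'A', 'B', 'B'],
    'Source': ['Hydro', 'Coal', 'Wind', 'Natural Gas'],
    'Renewable': ['Renewable', 'Nonrenewable', 'Renewable', 'Nonrenewable'],
    'Percent': [60.0, 40.0, 25.0, 75.0],
})

# Comparing a column to a value gives one True/False per row
mask = sample['Renewable'] == 'Renewable'

# Indexing with the mask keeps only the rows where it is True
renewable_only = sample[mask]
print(renewable_only)
```

Writing `sample[sample['Renewable'] == 'Renewable']` on one line, as the notebook does, is the same two steps combined.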

<br><br>
We can also look at a single province or territory in a pie chart using the following code.

Feel free to change `area = 'Alberta'` to something like `area = 'Ontario'`, then run the cell.

In [None]:
area = 'Alberta'
px.pie(elec[elec['Area']==area], names='Source', values='Percent', title='Electricity Generation in '+area)

## Population Data

Since the provinces and territories have different population sizes, we probably want to look at electricity production relative to population. We will combine our electricity dataset with a CSV file that includes population data from the [Statistics Canada 2016 Census](https://www12.statcan.gc.ca/census-recensement/2016/dp-pd/prof/index.cfm?Lang=E). We'll multiply the electricity generation percentages by this population data to get a new data table.

In [None]:
pop = pd.read_csv('canadian-populations.csv')
df = pd.merge(elec, pop, on='Area', how='left')
df['Proportion'] = df['Percent']*df['Population']
df
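
To see what the merge and multiplication do, here is a small sketch with made-up tables. The names `elec_sample` and `pop_sample` and all of the numbers are assumptions for illustration; only the column names follow the cell above.

```python
import pandas as pd

# Hypothetical values for illustration only
elec_sample = pd.DataFrame({
    'Area': ['A', 'A', 'B'],
    'Source': ['Hydro', 'Coal', 'Wind'],
    'Percent': [60.0, 40.0, 100.0],
})
pop_sample = pd.DataFrame({
    'Area': ['A', 'B'],
    'Population': [1000, 200],
})

# how='left' keeps every electricity row and copies in the matching population
merged = pd.merge(elec_sample, pop_sample, on='Area', how='left')

# Weight each percentage by the population of its area
merged['Proportion'] = merged['Percent'] * merged['Population']
print(merged)
```

The resulting `Proportion` is a population-weighted percentage rather than a measured amount of electricity, so it is best read as a rough comparison between areas.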

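From this new data table we can generate a bar graph that gives a rough picture of relative production amounts in Canada. This treats population as a stand-in for total generation, which assumes electricity production per person is similar across areas.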

In [None]:
px.bar(df, x='Source', y='Proportion', text='Area', title='Proportional Electricity Generation in Canada')

What do you notice from the graph above?

## Sunburst Charts

Sunburst charts are another type of visualization we can use with this data set. This first sunburst chart shows proportional production by province or territory, then by source.

In [None]:
px.sunburst(df, path=['Area','Source'], values='Proportion', title='Electricity Generation in Canada')

We can also include whether the source is renewable or not.

In [None]:
px.sunburst(df, path=['Area','Renewable','Source'], values='Proportion', title='Electricity Generation in Canada')

Notice what happens if we change the order of the columns we are using from the data set.

In [None]:
px.sunburst(df, path=['Renewable','Area','Source'], values='Proportion', title='Electricity Generation in Canada')

In [None]:
px.sunburst(df, path=['Source','Area'], values='Proportion', title='Electricity Generation in Canada')
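
One way to think about the `path` order is as nested grouping: the first column in `path` forms the innermost ring, and each later column subdivides it. The sketch below mimics that with `groupby` on a made-up table. The `sample` data is an assumption for illustration, and this only approximates how the chart totals its rings.

```python
import pandas as pd

# Hypothetical rows for illustration only
sample = pd.DataFrame({
    'Area': ['A', 'A', 'B', 'B'],
    'Source': ['Hydro', 'Coal', 'Hydro', 'Wind'],
    'Proportion': [60.0, 40.0, 30.0, 70.0],
})

# The outermost wedges hold the same values whichever column comes first
by_area_then_source = sample.groupby(['Area', 'Source'])['Proportion'].sum()
by_source_then_area = sample.groupby(['Source', 'Area'])['Proportion'].sum()

# The innermost ring changes: totals per area versus totals per source
area_totals = sample.groupby('Area')['Proportion'].sum()
source_totals = sample.groupby('Source')['Proportion'].sum()
print(area_totals)
print(source_totals)
```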

We can also create a sunburst using data from only one province or territory.

In [None]:
area = 'Alberta'
px.sunburst(df[df['Area']==area], path=['Renewable','Source'], values='Percent', title='Electricity Generation in '+area)

Or we can filter the data to show only the territories.

In [None]:
px.sunburst(df[df['Type']=='Territory'], path=['Area','Renewable','Source'], values='Proportion', title='Electricity Generation in the Territories')

## Alternate Data Sources

We can also obtain data from tables in a Wikipedia article. These datasets are likely more current, but may contain errors or stations that are not yet operational.

As an example, let's get the data for [electrical generating stations in Alberta](https://en.wikipedia.org/wiki/List_of_generating_stations_in_Alberta). The article's format or table order may have changed since this notebook was written, in which case the code will need to be updated to match. It is a good idea to compare the article to the categories list in the code before trusting the results.

In [None]:
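# read_html returns a list with one DataFrame per table in the article
tables = pd.read_html('https://en.wikipedia.org/wiki/List_of_generating_stations_in_Alberta')
# Assumed to match the order of the tables in the article; check before relying on it
categories = ['coal','natural gas','dual fuel','biomass','geothermal','hydroelectric','wind','solar']
values = []
for i, c in enumerate(categories):
    # tables[0] is skipped; non-numeric capacities become NaN and are ignored by sum()
    total = int(pd.to_numeric(tables[i+1]['Capacity (MW)'], errors='coerce').sum())
    values.append(total)
print(categories)
print(values)

px.pie(names=categories, values=values)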
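
The loop above relies on `errors='coerce'` to cope with messy table cells. Here is a short sketch of that behaviour on a made-up column. The `capacity` values are assumptions chosen to resemble what a scraped table might contain, not actual entries from the article.

```python
import pandas as pd

# Hypothetical capacity column as it might come out of a scraped table
capacity = pd.Series(['100', '250.5', 'n/a', '45[1]'])

# Values that cannot be parsed as numbers become NaN instead of raising an error
numeric = pd.to_numeric(capacity, errors='coerce')

# sum() skips NaN by default, so only the parsable values are added
total = int(numeric.sum())
print(numeric)
print(total)
```

Note that a value carrying a footnote marker, such as `'45[1]'`, is dropped silently rather than counted, so totals computed this way may understate the real capacity.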

## Conclusion

We created some bar, pie, and sunburst charts from data about electricity generation in Canada, including population data.

What other visualizations do you think we could create from these data sets?

[![Callysto.ca License](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-bottom.jpg?raw=true)](https://github.com/callysto/curriculum-notebooks/blob/master/LICENSE.md)