![Callysto.ca Banner](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-top.jpg?raw=true)

# Downloading and Visualizing Statistics Canada Open Data

You can use this notebook as a sample of what it looks like to download, parse, visualize and interpret data in the context of our hackathon. 

In this notebook we will use a dataset provided by Statistics Canada. Statistics Canada produces statistics that help Canadians better understand their country: its population, resources, economy, society and culture. Learn more about StatsCan on their website: https://www.statcan.gc.ca/eng/about/about?MM=as


The questions we will aim to answer are: if we were colonizing Mars, how would we generate electricity? What can we learn from current electric power generation? Are there electric power generation methods that would be suitable candidates for Mars? 

We start by citing our source:

Statistics Canada. Table 25-10-0015-01 Electric power generation, monthly generation by type of electricity.
DOI: https://doi.org/10.25318/2510001501-eng

Run the cell below to import modules and functions.

In [1]:
from time import sleep
%run -i ./StatsCan/helpers.py
%run -i ./StatsCan/scwds.py
%run -i ./StatsCan/sc.py

The product ID for this dataset (Table 25-10-0015-01, Electric power generation, monthly generation by type of electricity) is `25-10-0015-01`.

A list of product IDs you can use is provided in the main Statistics Canada directory.

Run the cell below to download the data.

In [25]:
# Download data
productId = "25-10-0015-01"

# Download the table for this product ID
download_tables(str(productId))

# Parse the downloaded table into a pandas DataFrame
df_fullDATA = zip_table_to_dataframe(productId)

# Clean up full dataset - remove internal use columns
cols = list(df_fullDATA.loc[:, 'REF_DATE':'UOM']) + ['SCALAR_FACTOR', 'VALUE']
df_less = df_fullDATA[cols]
df_less2 = df_less.drop(["DGUID"], axis=1)

# Display only first five entries
df_less2.head()

Downloading Dataset: 100%|██████████| 1/1 [00:01<00:00,  1.90s/it]


PARSING DATA AS PANDAS DATAFRAME


,REF_DATE,GEO,Class of electricity producer,Type of electricity generation,UOM,SCALAR_FACTOR,VALUE
0,2008-01-01,Canada,Total all classes of electricity producer,Total all types of electricity generation,Megawatt hours,units,59082501.0
1,2008-01-01,Canada,Total all classes of electricity producer,Hydraulic turbine,Megawatt hours,units,36647695.0
2,2008-01-01,Canada,Total all classes of electricity producer,Conventional steam turbine,Megawatt hours,units,11021441.0
3,2008-01-01,Canada,Total all classes of electricity producer,Nuclear steam turbine,Megawatt hours,units,8963878.0
4,2008-01-01,Canada,Total all classes of electricity producer,Internal combustion turbine,Megawatt hours,units,105643.0
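
The cleaning step relies on a label-based column slice, which in pandas includes both endpoints. The sketch below reproduces that selection on a toy DataFrame. The values and the `UOM_ID` column are made up for illustration; only the column layout is meant to resemble the table above.

```python
import pandas as pd

# Toy frame with hypothetical values; it only mimics the column layout.
toy = pd.DataFrame({
    "REF_DATE": ["2008-01-01", "2008-02-01"],
    "GEO": ["Canada", "Canada"],
    "DGUID": ["id-1", "id-1"],           # placeholder identifier
    "Type of electricity generation": ["Solar", "Solar"],
    "UOM": ["Megawatt hours", "Megawatt hours"],
    "UOM_ID": [1, 1],                    # assumed internal-use column
    "SCALAR_FACTOR": ["units", "units"],
    "VALUE": [10.0, 12.0],
})

# A label slice on columns keeps everything from REF_DATE through UOM, inclusive.
kept = list(toy.loc[:, "REF_DATE":"UOM"].columns) + ["SCALAR_FACTOR", "VALUE"]

# Drop the identifier column that fell inside the slice.
toy_clean = toy[kept].drop(columns=["DGUID"])
print(toy_clean.columns.tolist())
```

Columns that sit after `UOM` are dropped unless they are added back by name, which is why `SCALAR_FACTOR` and `VALUE` are appended explicitly.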


### Important Categories: Electricity Producers

StatsCan data is extensive. 

Run the cell below to get all possible categories for the class of electricity producer. 

In [3]:
df_less2["Class of electricity producer"].unique()

array(['Total all classes of electricity producer',
       'Electricity producers, electric utilities',
       'Electricity producers, industries'], dtype=object)

### Important Categories: Type of Electricity Generation

Run the cell below to get all possible categories for the type of electricity generation.

In [4]:
df_less2["Type of electricity generation"].unique()

array(['Total all types of electricity generation', 'Hydraulic turbine',
       'Conventional steam turbine', 'Nuclear steam turbine',
       'Internal combustion turbine', 'Combustion turbine',
       'Tidal power turbine', 'Wind power turbine', 'Solar',
       'Other types of electricity generation',
       'Total electricity production from combustible fuels',
       'Total electricity production from non-renewable combustible fuels',
       'Total electricity production from biomass'], dtype=object)
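
Several of these categories begin with "Total" and appear to be aggregates of other categories, so summing across every type could count the same electricity more than once. The sketch below separates the two groups, assuming the "Total" prefix is a reliable marker of an aggregate (this is a naming assumption, not something the dataset guarantees).

```python
import pandas as pd

# Category names copied from the output above.
types = pd.Series([
    "Total all types of electricity generation", "Hydraulic turbine",
    "Conventional steam turbine", "Nuclear steam turbine",
    "Internal combustion turbine", "Combustion turbine",
    "Tidal power turbine", "Wind power turbine", "Solar",
    "Other types of electricity generation",
    "Total electricity production from combustible fuels",
    "Total electricity production from non-renewable combustible fuels",
    "Total electricity production from biomass",
])

# Assumption: names starting with "Total" are aggregate rows.
is_aggregate = types.str.startswith("Total")
individual_types = types[~is_aggregate].tolist()
aggregate_types = types[is_aggregate].tolist()

print(individual_types)
print(aggregate_types)
```

The same boolean mask could be applied to the "Type of electricity generation" column of the full table before computing sums.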

### Plotting our data

Substitute your own values for the variables `class_of_electricity_producer` and `type_electricity_generation`. An example has been given for you.

## Plotting different kinds of electric power generation by province

The types of electricity generation are:

- 'Total all types of electricity generation'
- 'Hydraulic turbine'
- 'Conventional steam turbine'
- 'Nuclear steam turbine'
- 'Internal combustion turbine'
- 'Combustion turbine'
- 'Tidal power turbine'
- 'Wind power turbine'
- 'Solar'
- 'Other types of electricity generation'
- 'Total electricity production from combustible fuels'
- 'Total electricity production from non-renewable combustible fuels'
- 'Total electricity production from biomass'

There is no water or biological material on Mars to generate electricity, so we need help from provinces that specialize in solar and wind power production to get us started.

In [5]:
import plotly.express as px

In [60]:
def plot_class_type_electricity(class_of_electricity_producer, type_electricity_generation):
    """Plot VALUE over time, one line per region, for one producer class and generation type."""
    # Exclude the national total so only provinces and territories are plotted
    geo_condition = df_less2["GEO"] != "Canada"
    condition_1 = df_less2["Class of electricity producer"] == class_of_electricity_producer
    condition_2 = df_less2["Type of electricity generation"] == type_electricity_generation

    # Keep only the rows that satisfy all three conditions
    final_df = df_less2[geo_condition & condition_1 & condition_2]

    fig = px.line(final_df,
                  x="REF_DATE",
                  y="VALUE",
                  title=class_of_electricity_producer + " " + type_electricity_generation,
                  color="GEO")
    return fig
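
The filtering inside the function combines three boolean masks with `&`. The sketch below shows the same pattern on a small toy DataFrame with made-up values, so the row selection can be checked without drawing a plot.

```python
import pandas as pd

# Hypothetical rows; only the column names match the real table.
toy = pd.DataFrame({
    "GEO": ["Canada", "Ontario", "Ontario", "Quebec"],
    "Class of electricity producer":
        ["Total all classes of electricity producer"] * 4,
    "Type of electricity generation":
        ["Solar", "Solar", "Wind power turbine", "Solar"],
    "VALUE": [100.0, 60.0, 30.0, 5.0],
})

# Each comparison yields a boolean Series; parentheses are needed around
# each one because & binds more tightly than == and !=.
mask = (
    (toy["GEO"] != "Canada")
    & (toy["Class of electricity producer"]
       == "Total all classes of electricity producer")
    & (toy["Type of electricity generation"] == "Solar")
)

subset = toy[mask]
print(subset)
```

Only the provincial solar rows remain; the national row and the wind row are filtered out.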

In [61]:
# Subsetting data
type_electricity_generation = "Solar"
class_of_electricity_producer = "Total all classes of electricity producer"
plot_class_type_electricity(class_of_electricity_producer,type_electricity_generation)

Ontario generates the most electricity from solar power, and its output peaks during the summer months.
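
One way to check a seasonal pattern like this numerically is to average the values by calendar month. The sketch below does so on made-up numbers for a single region; in the notebook, the same grouping could be applied to the filtered solar rows, assuming `REF_DATE` holds datetime values.

```python
import pandas as pd

# Hypothetical monthly solar values (MWh) for one region, for illustration only.
toy = pd.DataFrame({
    "REF_DATE": pd.to_datetime(
        ["2018-01-01", "2018-07-01", "2019-01-01", "2019-07-01"]
    ),
    "VALUE": [10.0, 50.0, 14.0, 58.0],
})

# Average VALUE for each calendar month (1 = January, ..., 12 = December)
# across all years in the data.
monthly_mean = toy.groupby(toy["REF_DATE"].dt.month)["VALUE"].mean()

# Month number with the highest average generation.
peak_month = monthly_mean.idxmax()

print(monthly_mean)
print("Peak month:", peak_month)
```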

In [62]:
# Subsetting data
type_electricity_generation = "Wind power turbine"
class_of_electricity_producer = "Total all classes of electricity producer"
plot_class_type_electricity(class_of_electricity_producer,type_electricity_generation)


Ontario and Quebec generate the most electricity from wind turbines.

In [58]:
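# Rows are regions, columns are generation types;
# pivot_table aggregates with the mean by default
pivot_electricity_generation = df_less2.pivot_table(index="GEO", columns="Type of electricity generation")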

In [24]:
pivot_electricity_generation["VALUE"]

GEO,Combustion turbine,Conventional steam turbine,Hydraulic turbine,Internal combustion turbine,Nuclear steam turbine,Other types of electricity generation,Solar,Tidal power turbine,Total all types of electricity generation,Total electricity production from biomass,Total electricity production from combustible fuels,Total electricity production from non-renewable combustible fuels,Wind power turbine
Alberta,661239.2,2569751.0,142840.3,4299.044118,,8592.146667,725.666667,,3791423.0,104357.666667,4098578.0,4027359.0,186818.245614
British Columbia,111143.3,243084.7,3274962.0,3309.291829,,0.0,42.6,0.0,3642181.0,241470.333333,348774.3,111261.3,79909.453333
Canada,1394119.0,5288155.0,20898010.0,61989.555556,7772006.0,9447.241758,68623.873239,1251.204678,33917260.0,537662.333333,7358602.0,7151804.0,961112.242105
Manitoba,1233.475,11319.76,2424846.0,952.874459,,0.0,0.106667,,1930160.0,6040.666667,7039.787,1713.667,39442.414966
New Brunswick,122737.1,308867.1,226034.9,91.5,238142.6,0.0,0.0,0.0,730771.5,30389.0,299338.3,271699.0,45913.0
Newfoundland and Labrador,15294.4,71817.11,2928675.0,3698.508621,,0.0,0.0,0.0,2306858.0,2849.0,110499.5,150745.3,8791.530612
Northwest Territories,4220.333,,17458.35,19964.222222,,,15.013333,,38751.17,0.0,27998.59,27942.67,1128.336283
Nova Scotia,39075.69,505310.5,67908.82,,,0.0,0.0,1251.204678,589848.5,13492.333333,443061.6,545532.3,42487.204678
Nunavut,0.0,,0.0,11926.614719,,,0.0,,13147.86,0.0,15968.94,18073.5,0.0
Ontario,443665.7,621881.2,2054824.0,2969.246032,7410314.0,45.21519,68162.21831,,8286783.0,67727.0,754144.1,714738.7,367467.263158
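
The numbers in this table are averages, because `pivot_table` uses the mean as its default aggregation, and a missing combination shows up as `NaN`. The sketch below illustrates both points on a toy DataFrame with made-up values. It passes `values="VALUE"` and `aggfunc="mean"` explicitly, which differs slightly from the call above.

```python
import pandas as pd

# Hypothetical rows: two solar readings for Ontario, one solar and one
# wind reading for Quebec.
toy = pd.DataFrame({
    "GEO": ["Ontario", "Ontario", "Quebec", "Quebec"],
    "Type of electricity generation":
        ["Solar", "Solar", "Solar", "Wind power turbine"],
    "VALUE": [10.0, 30.0, 4.0, 8.0],
})

# One row per region, one column per generation type, mean of VALUE in each cell.
pivot = toy.pivot_table(
    index="GEO",
    columns="Type of electricity generation",
    values="VALUE",
    aggfunc="mean",
)
print(pivot)
```

Ontario's solar cell is the mean of its two readings, and its wind cell is `NaN` because the toy data has no Ontario wind rows.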


In [87]:
piv_not_canada = pivot_electricity_generation[pivot_electricity_generation.index != "Canada"]
px.bar(piv_not_canada,
       x=piv_not_canada.index,
       y=piv_not_canada["VALUE"]["Wind power turbine"],
       title="Average Electricity Generation using Wind Power Turbine by Province"
      ).update_xaxes(categoryorder = "total ascending")

We see that, on average, Ontario is the largest producer of electricity using wind power turbines, followed by Quebec. 

In [88]:
px.density_heatmap(df_less2,"Type of electricity generation","GEO",
                   title="Type of electricity generation: all types, all provinces")
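
When `px.density_heatmap` is called with only `x` and `y`, each cell shows how many rows fall into that combination, not how much electricity was generated. The sketch below computes the same kind of count table with `pd.crosstab` on made-up rows, as a way to read the heatmap.

```python
import pandas as pd

# Hypothetical rows; each row stands for one monthly record.
toy = pd.DataFrame({
    "GEO": ["Ontario", "Ontario", "Quebec", "Quebec"],
    "Type of electricity generation":
        ["Solar", "Solar", "Solar", "Wind power turbine"],
})

# Count the number of rows for every (region, generation type) pair.
counts = pd.crosstab(toy["GEO"], toy["Type of electricity generation"])
print(counts)
```

A cell with a low count therefore indicates few reported records for that region and generation type, which may or may not correspond to low generation.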

## Conclusion

If all we have on Mars is wind and the sun, a first step toward generating electric power would be to learn from what Quebec and Ontario have done to implement these types of electricity generation, how costly they are, and what is needed to make them happen. We would also need to study optimal locations on Mars to maximize electricity generation.

[![Callysto.ca License](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-bottom.jpg?raw=true)](https://github.com/callysto/curriculum-notebooks/blob/master/LICENSE.md)