<a href="https://colab.research.google.com/github/luke-scot/emissions-tracking/blob/main/energy_consumption.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Energy consumption


Run the first two cells to set up the notebook.





In [2]:
%%capture
"""Installation and downloads"""
# Install floweaver and display widget packages
%pip install floweaver ipysankeywidget

# Import necessary packages
from floweaver import *

"""Import example data"""
# Import packages
import gdown, os
from google.colab import files

# Import and unzip files -> You can then view them in the left files panel
folder, zip_path = 'example_data', 'example_data.zip'
if not os.path.exists(folder): 
  gdown.download('https://drive.google.com/uc?id=1qriY29v7eKJIs07UxAw5RlJirfwuLnyP', zip_path ,quiet=True)
  ! unzip $zip_path -d 'example_data'
  ! rm $zip_path

In [3]:
"""Display setup"""
# Enable widget display for Sankeys in Colab
from google.colab import output
output.enable_custom_widget_manager()


## Task 1 - US example 

Step through this section to see an example for the US, based on the [Sankey diagrams of US energy consumption from the Lawrence Livermore National Laboratory](https://flowcharts.llnl.gov/). Thanks to John Muth for the suggestion and for transcribing the data.

In [9]:
"""Load the dataset"""
dataset = Dataset.from_csv('example_data/us-energy-consumption.csv',
                           dim_process_filename='example_data/us-energy-consumption-processes.csv')
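
For orientation, the two CSV files follow a flows-plus-processes layout: a flows table with `source`, `target` and `value` columns (plus a `type` column used later for colouring), and a process table indexed by process id. The sketch below builds a tiny stand-in for that layout in pandas. The rows, numbers and the `electricity` and `rejected` type labels are invented for illustration and are not taken from the example files.

```python
import pandas as pd

# Illustrative flows table: one row per flow (values are made up).
flows = pd.DataFrame(
    [
        ("Solar", "Electricity_Generation", "Solar", 1.0),
        ("Electricity_Generation", "Residential", "Electricity", 0.6),
        ("Electricity_Generation", "Rejected_Energy", "Rejected energy", 0.4),
    ],
    columns=["source", "target", "type", "value"],
)

# Illustrative process table: one row per process, indexed by its id.
# The 'electricity' and 'rejected' labels are hypothetical.
processes = pd.DataFrame(
    {"type": ["source", "electricity", "use", "rejected"]},
    index=pd.Index(
        ["Solar", "Electricity_Generation", "Residential", "Rejected_Energy"],
        name="id",
    ),
)

# Every process named in the flows should have a row in the process table.
used = set(flows["source"]) | set(flows["target"])
missing = used - set(processes.index)
print(missing)
```

If the layout assumption holds, frames like these could be passed directly as `Dataset(flows, dim_process=processes)` instead of being loaded from CSV.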

In [10]:
"""Define the order the nodes appear in"""
sources = ['Solar', 'Nuclear', 'Hydro', 'Wind', 'Geothermal',
           'Natural_Gas', 'Coal', 'Biomass', 'Petroleum']

uses = ['Residential', 'Commercial', 'Industrial', 'Transportation']

In [11]:
"""Define the Sankey diagram: nodes, ordering and bundles"""
nodes = {
    'sources': ProcessGroup('type == "source"', Partition.Simple('process', sources), title='Sources'),
    'imports': ProcessGroup(['Net_Electricity_Import'], title='Net electricity imports'),
    'electricity': ProcessGroup(['Electricity_Generation'], title='Electricity Generation'),
    'uses': ProcessGroup('type == "use"', partition=Partition.Simple('process', uses)),
    
    'energy_services': ProcessGroup(['Energy_Services'], title='Energy services'),
    'rejected': ProcessGroup(['Rejected_Energy'], title='Rejected energy'),
    
    'direct_use': Waypoint(Partition.Simple('source', [
        # This is a hack to hide the labels of the partition, there should be a better way...
        (' '*i, [k]) for i, k in enumerate(sources)
    ])),
}

ordering = [
    [[], ['sources'], []],
    [['imports'], ['electricity', 'direct_use'], []],
    [[], ['uses'], []],
    [[], ['rejected', 'energy_services'], []]
]

bundles = [
    Bundle('sources', 'electricity'),
    Bundle('sources', 'uses', waypoints=['direct_use']),
    Bundle('electricity', 'uses'),
    Bundle('imports', 'uses'),
    Bundle('uses', 'energy_services'),
    Bundle('uses', 'rejected'),
    Bundle('electricity', 'rejected'),
]
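
The label-hiding trick in the `direct_use` waypoint is easier to follow in isolation. The sketch below reproduces only the list comprehension, on a shortened source list chosen for illustration: each group gets a label made of `i` spaces, so the labels are distinct strings but render as blank.

```python
# Shortened list for illustration; the notebook uses the full `sources` list.
sources = ['Solar', 'Nuclear', 'Hydro']

# Each group is (label, [members]); the label is i spaces long, so every
# label is unique while still appearing empty in the diagram.
groups = [(' ' * i, [k]) for i, k in enumerate(sources)]
labels = [label for label, _ in groups]

print(groups)
# → [('', ['Solar']), (' ', ['Nuclear']), ('  ', ['Hydro'])]
```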

In [12]:
"""Define the colours to roughly imitate the original Sankey diagram"""
palette = {
    'Solar': 'gold',
    'Nuclear': 'red',
    'Hydro': 'blue',
    'Wind': 'purple',
    'Geothermal': 'brown',
    'Natural_Gas': 'steelblue',
    'Coal': 'black',
    'Biomass': 'lightgreen',
    'Petroleum': 'green',
    'Electricity': 'orange',
    'Rejected energy': 'lightgrey',
    'Energy services': 'dimgrey',
}

And here's the result!

In [13]:
sdd = SankeyDefinition(nodes, bundles, ordering,
                       flow_partition=dataset.partition('type'))
weave(sdd, dataset, palette=palette) \
    .to_widget(width=700, height=450, margins=dict(left=100, right=120), debugging=True)

VBox(children=(SankeyWidget(groups=[{'id': 'sources', 'type': 'process', 'title': 'Sources', 'nodes': ['source…

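You can save a copy of the Sankey by adding `.auto_save_png('filename.png')` or `.auto_save_svg('filename.svg')` after the `.to_widget(...)` call in the previous cell. With `debugging=True` the cell returns a `VBox` wrapping the widget (see the output above), so you may need to remove that argument first.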

## Task 2 - Create your own

Follow the steps below to create an equivalent Sankey for your own country.

  1. Find and download the IEA World Energy Balances Highlights spreadsheet, from the webpage: https://www.iea.org/reports/world-energy-balances-overview. Then upload it to Colab using the `upload` button in the left panel.

  2. Import the Excel sheet to a pandas DataFrame. To find appropriate functions for the next steps either have a look at the [pandas documentation](https://pandas.pydata.org/docs/reference/index.html), or remember [your best friend](https://www.google.com/) when writing code.



In [14]:
"""Read in an Excel file"""
import pandas as pd
fileName = 'WorldEnergyBalancesHighlights2021.xlsx'
sheetName = 'TimeSeries_1971-2020'
data = pd.read_excel(fileName,sheet_name=sheetName,header=1)

3. Filter the DataFrame for only the desired country

In [39]:
"""Get desired country"""
country = 'Australia'
countryData = data.loc[data['Country'] == country]

4. Filter the DataFrame to only contain 'Product', 'Flow' and value for the latest full year.

In [41]:
"""Get values for latest year"""
lastYear = max([colName for colName in data.columns if isinstance(colName, int)])
filterData = countryData[['Product','Flow',lastYear]]
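
The two filtering steps can be tried out on a small stand-in table before running them on the full spreadsheet. This is a minimal sketch: the numbers are made up, and it assumes, as in the cells above, that the year columns are read in as integers. The `pd.to_numeric` step is an optional extra in case the sheet marks missing values with text.

```python
import pandas as pd

# Toy stand-in for the IEA sheet (made-up numbers); year columns are integers.
data = pd.DataFrame({
    "Country": ["Australia", "Australia", "Austria"],
    "Product": ["Natural gas", "Nuclear", "Natural gas"],
    "Flow": ["Production (PJ)", "Production (PJ)", "Production (PJ)"],
    2019: [10.0, 0.0, 1.0],
    2020: [11.0, 0.0, 2.0],
})

country = "Australia"
country_data = data.loc[data["Country"] == country]

# Year columns are the integer-named ones; the latest year is the largest.
year_cols = [c for c in data.columns if isinstance(c, int)]
last_year = max(year_cols)

# Keep the three columns of interest and coerce the values to numbers,
# turning any non-numeric placeholder into NaN.
subset = country_data[["Product", "Flow", last_year]].copy()
subset[last_year] = pd.to_numeric(subset[last_year], errors="coerce")
print(subset)
```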

5. Find all sources as the unique values in the 'Product' column, and all targets as the unique values in the 'Flow' column.

In [49]:
"""Get all unique types of sources and targets listed in products and flows respectively"""
sources = filterData['Product'].unique()
targets = filterData['Flow'].unique()
display(sources, targets)

array(['Coal, peat and oil shale', 'Crude, NGL and feedstocks',
       'Oil products', 'Natural gas', 'Nuclear', 'Renewables and waste',
       'Electricity', 'Heat', 'Total', 'Fossil fuels',
       'Renewable sources'], dtype=object)

array(['Production (PJ)', 'Imports (PJ)', 'Exports (PJ)',
       'Total energy supply (PJ)',
       'Electricity, CHP and heat plants (PJ)',
       'Oil refineries, transformation (PJ)',
       'Total final consumption (PJ)', 'Industry (PJ)', 'Transport (PJ)',
       'Residential (PJ)', 'Commercial and public services (PJ)',
       'Other final consumption (PJ)', 'Electricity output (GWh)'],
      dtype=object)

6. Take the Sankey definition from the US energy consumption example and adapt it to fit your new source and target values.

In [106]:
sources = ['Coal, peat and oil shale', 'Crude, NGL and feedstocks', 'Oil products', 'Natural gas', 'Nuclear', 'Renewables and waste']
imports = ['Imports (PJ)']
electricity = ['Electricity']
uses = ['Industry (PJ)', 'Transport (PJ)', 'Residential (PJ)', 'Commercial and public services (PJ)','Other final consumption (PJ)']
transformation = ['Electricity, CHP and heat plants (PJ)', 'Oil refineries, transformation (PJ)']

In [123]:
"""Define the Sankey diagram definition"""
nodes = {
    'sources': ProcessGroup('type == "source"', Partition.Simple('process', sources), title='Sources'),
    # 'imports': ProcessGroup(imports, title='Net electricity imports'),
    # 'electricity': ProcessGroup(electricity, title='Electricity Generation'),
    'uses': ProcessGroup('type == "use"', partition=Partition.Simple('process', uses)),
    # 'transformation': ProcessGroup(transformation, title='Energy services'),
    #'rejected': ProcessGroup(['Rejected_Energy'], title='Rejected energy')
    
    # 'direct_use': Waypoint(Partition.Simple('source', [
    #     # This is a hack to hide the labels of the partition, there should be a better way...
    #     (' '*i, [k]) for i, k in enumerate(sources)
    # ])),
}

ordering = [
    [[], ['sources'], []],
    # [['imports'], ['electricity']],
    [[],['uses'],[]]#, ['transformation']]
   # [[], ['rejected', 'energy_services'], []]
]

bundles = [
    # Bundle('sources', 'electricity'),
    Bundle('sources', 'uses'),
    # Bundle('electricity', 'uses'),
    # Bundle('imports', 'uses'),
    # Bundle('sources', 'transformation'),
    # Bundle('uses', 'rejected'),
    # Bundle('electricity', 'rejected'),
]

In [125]:
"""Define the colours to roughly imitate the original Sankey diagram"""
palette = {
    'Coal, peat and oil shale': 'black',
    'Crude, NGL and feedstocks':'grey',
    'Oil products': 'purple',
    'Natural gas': 'steelblue',
    'Nuclear': 'red',
    'Renewables and waste':'green',
    'Electricity': 'orange',
    'Heat': 'red',
    'Fossil fuels': 'darkgrey',
    'Renewable sources':'lightgreen'
}

In [129]:
sdd = SankeyDefinition(nodes, bundles, ordering,
                       flow_partition=dataset.partition('type'))
weave(sdd, Dataset(filterData), palette=palette) \
    .to_widget(width=700, height=450, margins=dict(left=100, right=120), debugging=True)

UndefinedVariableError: ignored
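
The error above is consistent with `filterData` not being in the shape the definition expects: the flows need `source`, `target` and `value` columns, and the `type == "source"` and `type == "use"` queries need a process table with a `type` column. One possible way to reshape the data is sketched below on a toy frame with made-up numbers. The names `filter_data`, `flows` and `dim_process` are illustrative, and colouring each flow by its source product is an assumption rather than part of the original notebook.

```python
import pandas as pd

last_year = 2020
# Toy stand-in for filterData (made-up numbers).
filter_data = pd.DataFrame({
    "Product": ["Natural gas", "Natural gas", "Electricity", "Total"],
    "Flow": ["Industry (PJ)", "Production (PJ)", "Industry (PJ)", "Industry (PJ)"],
    last_year: [5.0, 20.0, 3.0, 8.0],
})
sources = ["Natural gas"]
uses = ["Industry (PJ)"]

# Keep only source -> use rows and rename to source/target/value.
keep = filter_data["Product"].isin(sources) & filter_data["Flow"].isin(uses)
flows = filter_data.loc[keep].rename(
    columns={"Product": "source", "Flow": "target", last_year: "value"}
)
flows["type"] = flows["source"]  # assumption: colour each flow by its source

# Process table so that queries such as 'type == "source"' can be evaluated.
dim_process = pd.DataFrame(
    {"type": ["source"] * len(sources) + ["use"] * len(uses)},
    index=pd.Index(sources + uses, name="id"),
)

# With floweaver imported as in the setup cell, these could then be used as:
# dataset = Dataset(flows, dim_process=dim_process)
# sdd = SankeyDefinition(nodes, bundles, ordering,
#                        flow_partition=dataset.partition('type'))
```

Building the partition from the new dataset, rather than from the US `dataset`, keeps the flow types in line with the palette keys defined for the IEA products.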

  3. Check out the IEA Sankey as a guide: https://www.iea.org/sankey/

  4. Organise the data into the form required for the csv files, using the two files us-energy-consumption.csv and us-energy-consumption-processes.csv as your base (find these in your Downloads folder). This will involve allocating the IEA balance categories to the labels used in the US example. Flows into and out of each slice need to be balanced. The IEA has no allocation to 'energy services' and 'rejected energy', so you will need to guess or use the ratios from the US example. You can also check your numbers using SankeyMATIC.

  5. Adapt the notebook to create a Sankey for your country.

  6. Save a copy by running the final cell.
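
The balance requirement in step 4 can be checked programmatically before drawing anything. The sketch below is one way to do it, assuming flows in the `source`/`target`/`value` layout of the US example; the rows and numbers are made up. It compares total inflow and outflow for every process that has both.

```python
import pandas as pd

# Made-up flows in the source/target/value layout of the US example.
flows = pd.DataFrame(
    [
        ("Coal", "Electricity_Generation", 10.0),
        ("Electricity_Generation", "Residential", 4.0),
        ("Electricity_Generation", "Rejected_Energy", 6.0),
        ("Residential", "Energy_Services", 2.6),
        ("Residential", "Rejected_Energy", 1.4),
    ],
    columns=["source", "target", "value"],
)

inflow = flows.groupby("target")["value"].sum()
outflow = flows.groupby("source")["value"].sum()

# Only intermediate processes (with both inflows and outflows) must balance;
# pure sources and final sinks appear on one side only.
both = inflow.index.intersection(outflow.index)
imbalance = (inflow[both] - outflow[both]).round(6)

# Any process listed here has flows that do not balance.
print(imbalance[imbalance != 0])
```

An empty result means every intermediate process balances; a non-zero entry gives the size and sign of the mismatch for that process.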