# Using SankeyMATIC

SankeyMATIC is an online tool for drawing Sankey diagrams, built on the open-source library D3.js and its Sankey module. It makes the capabilities of the D3 Sankey tool available for anyone to use.

## 1. Familiarise

Open SankeyMATIC in a web browser: http://sankeymatic.com

Read the home page, then spend some time experimenting with the example in the "Build a Sankey Diagram" tab until you understand what each of the options does.

## 2. Prepare an example dataset

Here you will load a dataset, convert it into the format SankeyMATIC expects, and then return to SankeyMATIC to visualise it.

1. Convert the data in [`example_data/us-energy-consumption.csv`](./example_data/us-energy-consumption.csv) into the form "source [value] target" required by SankeyMATIC.
   Below is an example of preparing this data with Python.
   You can also use Excel, although this is not recommended; consider the `&` operator or the `CONCATENATE` function.
1. Paste the data into SankeyMATIC and "Preview". 
   This example is based on the [Sankey diagrams of US energy consumption from the Lawrence Livermore National Laboratory](https://flowcharts.llnl.gov/commodities/energy). 
   Play with the functions in SankeyMATIC, including dragging the flows, to get it to look like the LLNL example. 
   Take a screenshot.  
1. Make a note of some of the frustrations you have using this online tool.  

### Create a simple flow table

In [None]:
# Example Python script to convert data in the file

import pandas as pd

# Load the data
df = pd.read_csv("example_data/us-energy-consumption.csv")

# View the data
display(df.head())

In [None]:
# Prepare the data for SankeyMATIC
sankey_matic_data = df["source"] + " [" + df["value"].astype(str) + "] " + df["target"]

# View the data
display(sankey_matic_data.head())

In [None]:
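# Save the data (create the output folder first if it does not exist)
import os

os.makedirs("outputs", exist_ok=True)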
sankey_matic_data.to_csv("outputs/sankey_matic_input.txt", index=False, header=False)
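
Note that `to_csv` applies CSV quoting rules, so a line containing a comma or a double quote would be wrapped in quotes in the output file. If your node names might contain such characters, one alternative is to write the lines as plain text. The following is a minimal sketch only: `write_lines` is a hypothetical helper, the flows are made up for illustration, and a temporary folder is used so that nothing in your project is overwritten.

```python
from pathlib import Path
import tempfile


def write_lines(lines, path):
    """Write each item on its own line, creating parent folders as needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("\n".join(str(line) for line in lines) + "\n", encoding="utf-8")
    return path


# Illustrative flows only (not taken from the example dataset);
# the first one contains a comma, which to_csv would wrap in quotes
example_lines = ["Coal [10] Electricity, grid", "Solar [2.5] Residential"]

out_path = write_lines(example_lines, Path(tempfile.mkdtemp()) / "outputs" / "example.txt")
written_text = out_path.read_text(encoding="utf-8")
print(written_text)
```

In the notebook, the same helper could be called with `sankey_matic_data` and a path of your choice.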

### Add colour

You can add colour settings per specific flow (`source [value] target #colour`) or per node (`:node #colour`).
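
As an illustration of the per-flow form, the sketch below appends a colour to each flow line. The flows and the choice to colour each flow by its source node are assumptions made for this example, and `flow_colours` is a hypothetical name.

```python
# Illustrative flows only (made-up values, not read from the CSV file)
flows = [
    ("Solar", 5, "Electricity_Generation"),
    ("Coal", 10, "Electricity_Generation"),
]

# Hypothetical choice: colour each flow by its source node
flow_colours = {"Solar": "#FFD700", "Coal": "#2F4F4F"}

# Build lines in the form "source [value] target #colour"
coloured_flows = [
    f"{source} [{value}] {target} {flow_colours[source]}"
    for source, value, target in flows
]
print("\n".join(coloured_flows))
```

With a DataFrame that has `source`, `value` and `target` columns, the same tuples could be obtained from `df[["source", "value", "target"]].itertuples(index=False)`.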

Here, we will use hex colour codes to define node colours.

In [None]:
# Map each node name to a hex colour code
colours = {
    "Solar": "#FFD700",
    "Nuclear": "#808080",
    "Hydro": "#1E90FF",
    "Wind": "#00BFFF",
    "Geothermal": "#FF4500",
    "Natural_Gas": "#FFA500",
    "Coal": "#2F4F4F",
    "Biomass": "#228B22",
    "Petroleum": "#A52A2A",
    "Net_Electricity_Import": "#000000",
    "Electricity_Generation": "#696969",
    "Residential": "#FF69B4",
    "Commercial": "#8A2BE2",
    "Industrial": "#D2691E",
    "Transportation": "#DC143C",
}
# SankeyMATIC expects the format ":Node Colour"
colours_df = pd.Series([f":{k} {v}" for k, v in colours.items()])

# We can then append this to the original data
sankey_matic_data_with_colours = pd.concat([sankey_matic_data, colours_df])

# We can see that the original data is at the top, and the colour settings are at the bottom
display(sankey_matic_data_with_colours.head())
display(sankey_matic_data_with_colours.tail())

In [None]:
# Save the data
sankey_matic_data_with_colours.to_csv(
    "outputs/sankey_matic_input_with_colours.txt", index=False, header=False
)

### Add settings

Settings can be appended to the text file in the format `<setting_group> <setting> <value>`.

In the full settings listing further below, some lines are indented. An indented line belongs to the setting group named on the nearest non-indented line above it.
For instance, `w` and `h` are part of the `size` setting group.
To update `h` to 900 pixels, we would add to our text file: `size h 900`.
To update both `w` and `h` we could write:

```yaml
size w 900
  h 900
```
or, with the group name repeated on each line:

```yaml
size w 900
size h 900
```
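
If you keep your settings in a Python dictionary, lines in the repeated-group form can be generated with a short loop. This is a sketch of one possible approach; `settings_to_lines` and `example_settings` are hypothetical names introduced here, not part of SankeyMATIC.

```python
def settings_to_lines(settings):
    """Turn {group: {setting: value}} into "<group> <setting> <value>" lines."""
    lines = []
    for group, options in settings.items():
        for name, value in options.items():
            lines.append(f"{group} {name} {value}")
    return lines


# Hypothetical settings to change; the suffix value keeps its own quotes
example_settings = {
    "size": {"w": 900, "h": 900},
    "value": {"suffix": "' PJ'"},
}

setting_lines = settings_to_lines(example_settings)
print("\n".join(setting_lines))
```

The resulting lines can be appended to the flow data in the same way as the colour lines.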

```yaml

// === Settings ===

size w 600
 h 600
margin l 12
 r 12
 t 18
 b 20
bg color #ffffff
 transparent N
node w 12
 h 50
 spacing 75
 border 0
 theme a
 color #888888
 opacity 1
flow curvature 0.5
 inheritfrom outside-in
 color #999999
 opacity 0.45
layout order automatic
 justifyorigins N
 justifyends N
 reversegraph N
 attachincompletesto nearest
labels color #000000
 hide N
 highlight 0.75
 fontface sans-serif
 linespacing 0.2
 relativesize 110
 magnify 100
labelname appears Y
 size 16
 weight 400
labelvalue appears Y
 fullprecision Y
 position below
 weight 400
labelposition autoalign 0
 scheme auto
 first before
 breakpoint 5
value format ',.'
 prefix ''
 suffix ''
themeoffset a 2
 b 0
 c 0
 d 0
meta mentionsankeymatic Y
 listimbalances Y
```
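
To make the indentation rule concrete, the sketch below reads a few lines in this layout into a nested dictionary. It is a simplified illustration, not SankeyMATIC's own parser: `parse_settings` is a hypothetical helper, and it assumes that lines starting with `//` are comments, as in the listing above.

```python
def parse_settings(text):
    """Parse "<group> <setting> <value>" lines, where indented lines
    reuse the group of the nearest non-indented line above them."""
    settings = {}
    current_group = None
    for raw_line in text.splitlines():
        stripped = raw_line.strip()
        if not stripped or stripped.startswith("//"):
            continue  # skip blank lines and comment lines
        if raw_line[0].isspace():
            # Indented line: only "<setting> <value>", group carried over
            name, value = stripped.split(maxsplit=1)
        else:
            # Non-indented line: starts a (possibly new) group
            current_group, name, value = stripped.split(maxsplit=2)
        settings.setdefault(current_group, {})[name] = value
    return settings


sample = """\
// === Settings ===
size w 600
 h 600
value format ',.'
 suffix ''
"""
parsed = parse_settings(sample)
print(parsed)
```

All values are kept as strings here; converting them to numbers is left out to keep the sketch short.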

Here, we will update just one setting: the `suffix` setting in the `value` group, which adds a unit to the data.

In [None]:
# Setting line that adds the unit suffix " PJ" to the values
setting = "value suffix ' PJ'"

sankey_matic_data_with_colours_and_suffix = pd.concat(
    [sankey_matic_data_with_colours, pd.Series([setting])]
)

sankey_matic_data_with_colours_and_suffix.to_csv(
    "outputs/sankey_matic_input_with_colours_and_suffix.txt", index=False, header=False
)
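
Before pasting a combined file into SankeyMATIC, it can help to check that each line looks like one of the three kinds used in this notebook: a flow, a node colour, or a setting. The sketch below uses simplified regular expressions based only on the formats described above; SankeyMATIC itself may accept more variations, and `classify_line` is a hypothetical helper.

```python
import re

# Simplified patterns based on the formats described in this notebook:
# "source [value] target" with an optional " #colour", and ":node #colour"
FLOW_PATTERN = re.compile(r"^.+? \[[0-9.]+\] .+?( #[0-9A-Fa-f]{6})?$")
NODE_PATTERN = re.compile(r"^:.+ #[0-9A-Fa-f]{6}$")


def classify_line(line):
    """Return "node", "flow", or "other" (for example, a settings line)."""
    if NODE_PATTERN.match(line):
        return "node"
    if FLOW_PATTERN.match(line):
        return "flow"
    return "other"


# Illustrative lines only (made-up flow values)
examples = [
    "Solar [5] Residential",
    "Coal [10] Industrial #2F4F4F",
    ":Solar #FFD700",
    "value suffix ' PJ'",
]
kinds = [classify_line(line) for line in examples]
print(list(zip(examples, kinds)))
```

Lines classified as "other" are not necessarily wrong; they simply need a manual look.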