# Interactive Visualisation and Dashboards

In [1]:
import pypsa
import atlite
import pandas as pd
import geopandas as gpd
import xarray as xr
import matplotlib.pyplot as plt
plt.style.use('bmh')





In [2]:
from urllib.request import urlretrieve
from os.path import basename

urls = [
    "https://tubcloud.tu-berlin.de/s/2oogpgBfM5n4ssZ/download/PORTUGAL-2013-01-era5.nc",
]
for url in urls:
    urlretrieve(url, basename(url))

## Load Example Data

First, let's load a few example datasets you know from previous tutorials.

A PyPSA network:

In [3]:
n = pypsa.Network("https://tubcloud.tu-berlin.de/s/kpWaraGc9LeaxLK/download/network-cem.nc")

INFO:pypsa.io:Retrieving network data from https://tubcloud.tu-berlin.de/s/kpWaraGc9LeaxLK/download/network-cem.nc
INFO:pypsa.io:Imported network network-cem.nc has buses, carriers, generators, global_constraints, loads, storage_units


In [4]:
n.optimize(solver_name='cbc');

INFO:linopy.model: Solve linear problem using Cbc solver
INFO:linopy.io:Writing objective.
INFO:linopy.io: Writing time: 0.3s



INFO:linopy.constants: Optimization successful: 
Status: ok
Termination condition: optimal
Solution: 21906 primals, 50377 duals
Objective: 6.58e+10
Solver model: not available
Solver message: Optimal - objective value 65813180032.18475342







Wind, solar and demand time series:

In [5]:
url = "https://tubcloud.tu-berlin.de/s/nwCrNLrtL6LAN3W/download/time-series-lecture-2.csv"
ts = pd.read_csv(url, index_col=0, parse_dates=True)

Power plants in Europe:

In [6]:
url = "https://raw.githubusercontent.com/PyPSA/powerplantmatching/master/powerplants.csv"
ppl = pd.read_csv(url, index_col=0)
geometry = gpd.points_from_xy(ppl['lon'], ppl['lat'])
ppl = gpd.GeoDataFrame(ppl, geometry=geometry, crs=4326)

NUTS2 regions:

In [7]:
url = "https://tubcloud.tu-berlin.de/s/RHZJrN8Dnfn26nr/download/NUTS_RG_10M_2021_4326.geojson"
nuts = gpd.read_file(url).set_index('id').query("LEVL_CODE == 2")

An `atlite` cutout:

In [8]:
cutout = atlite.Cutout("PORTUGAL-2013-01-era5.nc")

## Limitations of Static Plotting with Matplotlib

`matplotlib` is great for static plots in reports, but it lacks features for interactive visualisation, such as zooming, panning or hovering over data points.

There are many Python-based interactive plotting libraries out there, and it can be [confusing to keep an overview](https://medium.com/mlearning-ai/top-python-libraries-for-data-visualization-static-and-interactive-visualization-e5f1bc72de41). This tutorial introduces you to two of them:

- [hvPlot](https://hvplot.holoviz.org/index.html), which is a high-level API mostly for [bokeh](https://docs.bokeh.org/en/latest/) plots that integrates nicely with `pandas`.
- [plotly.express](https://plotly.com/python/plotly-express/), which is a high-level API for [plotly](https://plotly.com) plots.

These two tools allow you to produce shiny interactive figures with minimal code, but at the expense of fewer customisation options.

## hvPlot

<img src="https://hvplot.holoviz.org/assets/diagram.svg" width="900px" /> 

`.hvplot()` is a powerful and interactive Pandas-like `.plot()` API. You just replace `.plot()` with `.hvplot()` and you get an interactive figure. Simple as that.

It can be installed via `conda` or `mamba` in the following way:

```sh
conda install -c pyviz hvplot geoviews
```

Documentation can be found here: https://hvplot.holoviz.org/index.html

To use it, we have to import `hvplot.pandas`. This makes the `.hvplot` accessor available on Pandas DataFrame and Series objects: after the import, `df.hvplot` is a valid statement, whereas before it would raise an error.

Let's try it by plotting onshore wind time series for the year...

... or the load time series for February
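The two plots mentioned above could be produced roughly as follows. This is a minimal sketch on synthetic data: the column names `onwind` and `load` are assumptions about the layout of `ts`, and the helper names `select_month` and `interactive_line` are hypothetical. The `hvplot.pandas` import sits inside the plotting helper only so that the data preparation also runs without hvPlot installed; in a notebook you would typically import it once at the top.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for `ts` (hourly values for 2015). The column names
# "onwind" and "load" are assumptions, not taken from the real file.
index = pd.date_range("2015-01-01", "2015-12-31 23:00", freq="h")
rng = np.random.default_rng(0)
demo_ts = pd.DataFrame(
    {
        "onwind": rng.uniform(0, 1, len(index)),
        "load": rng.uniform(40, 80, len(index)),
    },
    index=index,
)


def select_month(series, month):
    """Return the part of a time series that falls in a month like '2015-02'."""
    return series.loc[month]


def interactive_line(series):
    """Same idea as `series.plot()`, but returning an interactive figure."""
    import hvplot.pandas  # noqa: F401  (registers the `.hvplot` accessor)

    return series.hvplot()


february_load = select_month(demo_ts["load"], "2015-02")
# In a notebook, display the figures with:
#   interactive_line(demo_ts["onwind"])
#   interactive_line(february_load)
```

With the real data, the same pattern would presumably reduce to one-liners on `ts`, provided the columns are named as assumed here.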

We can also plot geographic data with **hvPlot**, for instance, the locations of all hard coal power plants in Europe.

The `geo=True` argument declares that the data is given in geographic coordinates.
Once **hvPlot** knows this, you can use the `tiles` keyword argument to overlay the plot on top of map tiles.

:::{note}
For a list of available tiles, look [here](https://holoviews.org/reference/elements/bokeh/Tiles.html).
:::

Like in `geopandas`, we can tell **hvPlot** to plot the point sizes and colors according to columns of the `pandas.DataFrame`. We can also change the opacity with `alpha` and the colormap with `cmap`. 

A few more options of the graph can be tweaked via `.opts()`, such as which tools should be active by default.
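The options discussed above can be combined in a single call. The following is a sketch on a tiny made-up table rather than the real `ppl` data: the plant names and values are invented, the column names (`Fueltype`, `Capacity`, `DateIn`, `lat`, `lon`) are assumed to match the powerplantmatching CSV, and `plot_plants` is a hypothetical helper. It uses `.hvplot.points()` on a plain DataFrame with longitude and latitude columns; `geo=True` additionally requires `geoviews`.

```python
import pandas as pd

# Tiny synthetic stand-in for `ppl`; all entries are made up.
demo_ppl = pd.DataFrame(
    {
        "Name": ["Plant A", "Plant B", "Plant C"],
        "Fueltype": ["Hard Coal", "Lignite", "Hard Coal"],
        "Capacity": [800.0, 1200.0, 450.0],
        "DateIn": [1985, 1998, 2012],
        "lat": [51.5, 51.1, 53.5],
        "lon": [7.0, 14.5, 10.0],
    }
)

# Keep only the hard coal plants.
coal = demo_ppl.query("Fueltype == 'Hard Coal'")


def plot_plants(df):
    """Interactive map: marker size by capacity, colour by commissioning year."""
    import hvplot.pandas  # noqa: F401  (geo=True also needs geoviews)

    return df.hvplot.points(
        "lon",
        "lat",
        geo=True,  # coordinates are longitude / latitude
        tiles="OSM",  # background map tiles
        s="Capacity",  # marker size taken from a column
        c="DateIn",  # marker colour taken from a column
        cmap="viridis",
        alpha=0.5,
        hover_cols=["Name"],
    ).opts(active_tools=["pan", "wheel_zoom"])


# In a notebook, display the map with: plot_plants(coal)
```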

All of this works not only with points but also with shapes. We can also pick the columns that should be shown when hovering over a shape using `hover_cols`.

We can also use **hvPlot** for `xarray` datasets (e.g. `atlite` cutouts).

For that, we need to import the corresponding `xarray` accessors.

So let's try it by plotting the wind speeds in Portugal as provided by ERA5. The nice thing you will notice is that it will automatically open a panel for dimensions that we did not select explicitly. In this case we can easily sweep across the time dimension. Notice also the customisation options we use here.

We can also plot the time series of solar generation in Germany on a heatmap:
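One way to build such a heatmap is to put the day of the year on one axis and the hour of the day on the other. The sketch below does this explicitly on a synthetic solar profile; the helper names and the reshaping step are assumptions chosen for clarity, not necessarily how the original notebook does it.

```python
import numpy as np
import pandas as pd

# Synthetic solar capacity factors for 2015: a half sine wave during the day,
# zero at night. This only mimics the shape of a real solar time series.
index = pd.date_range("2015-01-01", "2015-12-31 23:00", freq="h")
hours = index.hour.to_numpy()
solar = pd.Series(
    np.clip(np.sin((hours - 6) / 12 * np.pi), 0, None), index=index, name="solar"
)


def to_day_hour_table(series):
    """Long table with one row per (day of year, hour of day) pair."""
    return pd.DataFrame(
        {
            "day": series.index.dayofyear,
            "hour": series.index.hour,
            "value": series.to_numpy(),
        }
    )


def plot_heatmap(table):
    """Heatmap with days on the x-axis, hours on the y-axis, colour by value."""
    import hvplot.pandas  # noqa: F401  (registers the `.hvplot` accessor)

    return table.hvplot.heatmap(x="day", y="hour", C="value", cmap="plasma")


table = to_day_hour_table(solar)
# In a notebook, display the figure with: plot_heatmap(table)
```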

**hvPlot** also offers stacked area charts that come in handy for plotting the power dispatch of a solved PyPSA network:
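A stacked area chart of the dispatch might look like the sketch below. It uses a small made-up table instead of the solved network, and it clips negative values (for example a storage unit that is charging) to zero before stacking, because negative entries tend to distort stacked areas. That clipping step and the helper names are assumptions for this illustration.

```python
import pandas as pd

# Synthetic stand-in for hourly dispatch in GW; the carrier names are made up.
index = pd.date_range("2015-02-01", periods=4, freq="h")
demo_dispatch = pd.DataFrame(
    {
        "wind": [10.0, 12.0, 8.0, 5.0],
        "solar": [0.0, 3.0, 6.0, 2.0],
        "battery": [-2.0, -1.0, 1.5, 4.0],  # negative values mean charging
    },
    index=index,
)


def supply_only(dispatch):
    """Keep only the generating side by setting negative values to zero."""
    return dispatch.clip(lower=0)


def plot_dispatch(dispatch):
    """Stacked area chart with one band per column."""
    import hvplot.pandas  # noqa: F401  (registers the `.hvplot` accessor)

    return supply_only(dispatch).hvplot.area(stacked=True, ylabel="GW")


supply = supply_only(demo_dispatch)
# In a notebook, display the figure with: plot_dispatch(demo_dispatch)
```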

**hvPlot** also has a nice explorer that can be displayed in a Jupyter notebook and that can be used to quickly create customized plots.

## Plotly Express

> The `plotly.express` module (usually imported as px) contains functions that can create entire figures at once. Plotly Express is a built-in part of the `plotly` library, and is the recommended starting point for creating most common figures. Every Plotly Express function uses graph objects internally and returns a plotly.graph_objects.Figure instance. Throughout the plotly documentation, you will find the Plotly Express way of building figures at the top of any applicable page, followed by a section on how to use graph objects to build similar figures. Any figure created in a single function call with Plotly Express could be created using graph objects alone, but with between 5 and 100 times more code.

Documentation is available here: https://plotly.com/python/plotly-express/

It can be installed via `conda` or `mamba` in the following way:

```sh
conda install -c conda-forge plotly
```

:::{note}
We need to import `plotly.io` and `plotly.offline`, so that the interactive plots are also visible on the course's static website.
:::

Let's reproduce the plots we previously created with **hvPlot**. Onshore wind capacity factor time series:

Load time series in February:

Hard coal power plants in Europe:
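The three figures listed above could be sketched with Plotly Express as follows. Again, the data is synthetic, the column names are assumptions, and `line_figure` and `plant_map` are hypothetical helpers. The map uses `px.scatter_geo` with `scope="europe"`, which is only one of several map functions Plotly Express offers.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for `ts` and `ppl`; all values are made up and the
# column names are assumptions about the real datasets.
index = pd.date_range("2015-01-01", "2015-12-31 23:00", freq="h")
rng = np.random.default_rng(1)
demo_ts = pd.DataFrame(
    {
        "onwind": rng.uniform(0, 1, len(index)),
        "load": rng.uniform(40, 80, len(index)),
    },
    index=index,
)
demo_ppl = pd.DataFrame(
    {
        "Name": ["Plant A", "Plant B", "Plant C"],
        "Fueltype": ["Hard Coal", "Lignite", "Hard Coal"],
        "Capacity": [800.0, 1200.0, 450.0],
        "DateIn": [1985, 1998, 2012],
        "lat": [51.5, 51.1, 53.5],
        "lon": [7.0, 14.5, 10.0],
    }
)


def line_figure(df, column):
    """Line chart of one column; the DataFrame index is used as the x-axis."""
    import plotly.express as px  # imported here so the data prep runs without plotly

    return px.line(df, y=column)


def plant_map(df):
    """Bubble map of plants: size by capacity, colour by commissioning year."""
    import plotly.express as px

    return px.scatter_geo(
        df,
        lat="lat",
        lon="lon",
        size="Capacity",
        color="DateIn",
        hover_name="Name",
        scope="europe",
    )


february = demo_ts.loc["2015-02"]
coal = demo_ppl.query("Fueltype == 'Hard Coal'")
# In a notebook, display the figures with:
#   line_figure(demo_ts, "onwind")
#   line_figure(february, "load")
#   plant_map(coal)
```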

The integration with `xarray` datasets is not as nice as in **hvPlot**.

But in `plotly`, hovering information on the area chart works much better.

In [None]:
dispatch = pd.concat([n.generators_t.p, n.storage_units_t.p], axis=1).loc["2015-02"].div(1e3)
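
The cell above only builds the `dispatch` table. A plotting call along the following lines could turn it into an area chart. This is a sketch on made-up numbers: the inputs are assumed to be in MW (hence the division by 1000 to get GW), negative storage values are clipped to zero as an illustrative choice, and `area_figure` is a hypothetical helper.

```python
import pandas as pd

index = pd.date_range("2015-02-01", periods=3, freq="h")
# Made-up dispatch in MW for generators and storage units (names are hypothetical).
gen_p = pd.DataFrame(
    {"wind": [9000.0, 12000.0, 7000.0], "solar": [0.0, 2000.0, 5000.0]}, index=index
)
sto_p = pd.DataFrame({"battery": [-1500.0, 0.0, 2500.0]}, index=index)

# Same pattern as in the cell above: join column-wise, then convert MW to GW.
demo_dispatch = pd.concat([gen_p, sto_p], axis=1).div(1e3)


def area_figure(dispatch):
    """Stacked area chart with a shared hover label for all columns."""
    import plotly.express as px  # imported here so the data prep runs without plotly

    fig = px.area(dispatch.clip(lower=0), labels={"value": "GW"})
    # Show the values of all columns at the hovered time step in one box.
    fig.update_layout(hovermode="x unified")
    return fig


# In a notebook, display the figure with: area_figure(demo_dispatch)
```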

## Interactive Dashboards

There are many different options for building interactive dashboards. Some are brand new, some have been around for a few years.

<img src="https://global-uploads.webflow.com/5d3ec351b1eba4332d213004/5f99e10dafbd69a99c875340_C8_qX8dvzv60T4LVZ9GftX-ZH-VJzq3sjUroWWH5XSWw8RFHnCCPPrC6jB3EFVuQdwiqhoEMQKFV-dFz7t6fqaRpSZGvBKI0i1Utj38_j9a54GXMuzi1BiepdIMjOK4ATVdF2131.png" width="900px" /> 

Each of them has different characteristics, for instance in terms of customisation options and ease of use.

<img src="https://global-uploads.webflow.com/5d3ec351b1eba4332d213004/5fa3b0d4a2043bcf84d49134_z87mnMfsPGOF7L3sGULQBusJnJTWGZHWtoizufuDR1q1A6JggFWO9IYHXf8wFyqgKhuG6hEGOPM4Acb-qmNXxwCFW95DPX9r7Syewkejb7itbmm8I_os2XI8bightYGJq7Gg-FXo.png" width="900px" /> 

If you want to read a detailed comparison, the best one I found is this one:

https://www.datarevenue.com/en-blog/data-dashboarding-streamlit-vs-dash-vs-shiny-vs-voila


> **Just tell me which one to use**
>
> As always, “it depends” – but if you’re looking for a quick answer, you should probably use:
>
>    - Dash if you already use Python for your analytics and you want to build production-ready data dashboards for a larger company.
>    - **Streamlit if you already use Python for your analytics and you want to get a prototype of your dashboard up and running as quickly as possible.**
>    - Shiny if you already use R for your analytics and you want to make the results more accessible to non-technical teams.
>    - Jupyter if your team is very technical and doesn’t mind installing and running developer tools to view analytics.
>    - Voila if you already have Jupyter Notebooks and you want to make them accessible to non-technical teams.
>    - Flask if you want to build your own solution from the ground up.
>    - Panel if you already have Jupyter Notebooks, and Voila is not flexible enough for your needs.

In this tutorial, we look at `streamlit` because it is the quickest way to get results. However, compared to other dashboarding libraries, it offers fewer configuration options.

Documentation for this package can be found here: https://docs.streamlit.io/

Streamlit can be installed, for example, with `conda`, `mamba` or `pip`:

```sh
conda install -c conda-forge "streamlit>=1.18"
```

or

```sh
pip install "streamlit>=1.18"
```

:::{note}
This tutorial requires `streamlit>=1.18`.
:::

This tutorial is hosted on GitHub, with instructions on how to install, run and deploy it:

https://github.com/fneum/streamlit-tutorial

You can see a live demo of the final product here:

https://ppm-dash.streamlit.app/
