# Investigation of Python data visualization tools

## Introduction

See the very useful 3-part post on the [Anaconda website](https://www.anaconda.com/python-data-visualization-2018-why-so-many-libraries/).

![python-vis-landscape](https://www.anaconda.com/wp-content/uploads/2019/01/PythonVisLandscape.jpg)

## Matplotlib (https://matplotlib.org/3.1.1/index.html)

`matplotlib` is by far the most popular and widely used package in the scientific data visualization community. It is an old and very mature project, and yet still sees very active development.
It is the go-to library for creating publication-quality vector-graphics (PDF) figures for articles, and it specialises in 2D plots.

It is either already installed on most systems, or very easy to install on all platforms. Its strength lies in high-quality static figures, and it lags behind other projects when it comes to interactivity.

It has some 3D capabilities through the `Axes3D` class.
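
As a minimal sketch of these 3D capabilities (the Gaussian data and all variable names are illustrative assumptions, not taken from the original examples), requesting the `"3d"` projection makes `add_subplot` return an `Axes3D` instance:

```python
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401 (registers the "3d" projection on older versions)

# Illustrative data: a 2D Gaussian bump sampled on a regular grid
grid = np.linspace(-3.0, 3.0, 60)
xx, yy = np.meshgrid(grid, grid)
zz = np.exp(-(xx**2 + yy**2) / 2.0)

fig3d = plt.figure()
ax3d = fig3d.add_subplot(111, projection="3d")  # an Axes3D instance
surface = ax3d.plot_surface(xx, yy, zz, cmap="viridis")
ax3d.set_xlabel("x")
ax3d.set_ylabel("y")
ax3d.set_zlabel("z")
```

In a script, a final `plt.show()` is needed to open the window; in a notebook the figure is displayed when the cell finishes.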

To see what `matplotlib` can do, a good place to go is [its sample plots page](https://matplotlib.org/tutorials/introductory/sample_plots.html).

A small example is shown below, to illustrate both the syntax and the output generated.

In [None]:
import numpy as np
import matplotlib.pyplot as plt

In [None]:
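# Sample data shared by the examples in this report
N = 50
x = np.arange(N)
y = np.arange(N)
z = np.random.rand(N, N)

# 2x2 grid of axes, indexed as ax[row][col]
fig, ax = plt.subplots(2, 2)

# Top left: first column of z drawn as a line
ax[0][0].plot(x, z[:, 0])
ax[0][0].set_title("1D line plot")

# Top right: distribution of the same column
ax[0][1].hist(z[:, 0], bins=20)
ax[0][1].set_title("1D histogram")

# Bottom left: the full array shown as an image
ax[1][0].imshow(z)
ax[1][0].set_title("2D image")

# Bottom right: two columns plotted against each other, coloured by a third
ax[1][1].scatter(z[:, 0], z[:, 1], c=z[:, 2])
ax[1][1].set_title("Scatter plot")

fig.tight_layout()  # avoid overlapping titles and tick labels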

### Interface

A common complaint about `matplotlib` is that its API is difficult to navigate, especially for an inexperienced user or programmer. This is somewhat true, but to some extent inevitable in a highly customizable tool. It is, however, relatively straightforward to hide the interface from users by providing pre-made wrapper functions that cover 90% of the cases they are after.

Examples of this are very common: the `pandas` library, for instance, uses `matplotlib` to create plots from the contents of its data frames via a simple `.plot` method.
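
A minimal sketch of such a wrapper is shown below. The function name `quick_plot` and its arguments are hypothetical, invented here for illustration; the point is only that a single call can hide the figure and axes handling from the user.

```python
import numpy as np
import matplotlib.pyplot as plt


def quick_plot(values, title="", xlabel="x", ylabel="y"):
    """Plot a 1D array as a line, or a 2D array as an image.

    Hypothetical convenience wrapper: the figure and axes are returned so
    that advanced users can still customise the result afterwards.
    """
    values = np.asarray(values)
    if values.ndim not in (1, 2):
        raise ValueError("quick_plot only handles 1D or 2D data")
    fig, ax = plt.subplots()
    if values.ndim == 1:
        ax.plot(values)
    else:
        image = ax.imshow(values, origin="lower")
        fig.colorbar(image, ax=ax)
    ax.set_title(title)
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    return fig, ax


# One call covers the common cases
fig_line, ax_line = quick_plot(np.random.rand(50), title="1D line plot")
fig_image, ax_image = quick_plot(np.random.rand(50, 50), title="2D image")
```

Returning the figure and axes is a deliberate choice in this sketch: the simple call stays simple, while the full `matplotlib` API remains available when needed.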

### Interactivity

A common misconception is that `matplotlib` offers virtually no interactivity. In fact, the library supports interaction through mouse events (scrolling, clicking) as well as through its own set of `widgets`. These include sliders, radio buttons and check boxes, and are a nice alternative to `ipywidgets` that is guaranteed to work out of the box with `matplotlib` figures.
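
As an illustration of the mouse-event mechanism, the sketch below registers a callback for `button_press_event` through `mpl_connect`. The callback name `on_click` and the list `clicked_points` are assumptions made for this example.

```python
import numpy as np
import matplotlib.pyplot as plt

fig_click, ax_click = plt.subplots()
ax_click.plot(np.random.rand(50))
clicked_points = []  # data coordinates of every click inside the axes


def on_click(event):
    """Record and mark the position of a mouse click."""
    if event.inaxes is not ax_click:
        return  # ignore clicks outside the plotting area
    clicked_points.append((event.xdata, event.ydata))
    ax_click.plot(event.xdata, event.ydata, "ro")
    fig_click.canvas.draw_idle()


# mpl_connect returns an id that can later be passed to mpl_disconnect
connection_id = fig_click.canvas.mpl_connect("button_press_event", on_click)
```

Other event names, such as `scroll_event` or `motion_notify_event`, are connected in the same way.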

The image below is a diagram listing all the available widgets.
![mpl widgets](https://matplotlib.org/3.1.0/_images/inheritance-9d71a95f6d3ec40d1246549117ad7959f7b88c66.png)
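
A minimal sketch of one of these widgets, a `Slider` controlling the frequency of a sine wave, could look as follows. The data, axes positions and variable names are illustrative assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.widgets import Slider

t = np.linspace(0.0, 1.0, 500)
initial_frequency = 3.0

fig_widget, ax_widget = plt.subplots()
fig_widget.subplots_adjust(bottom=0.25)  # leave room for the slider
(line,) = ax_widget.plot(t, np.sin(2 * np.pi * initial_frequency * t))

# The slider lives in its own small axes below the main plot
slider_ax = fig_widget.add_axes([0.2, 0.1, 0.6, 0.03])
frequency_slider = Slider(slider_ax, "Frequency", 1.0, 10.0, valinit=initial_frequency)


def update(frequency):
    """Redraw the sine wave with the frequency selected on the slider."""
    line.set_ydata(np.sin(2 * np.pi * frequency * t))
    fig_widget.canvas.draw_idle()


frequency_slider.on_changed(update)
```

A reference to the slider object has to be kept alive (here `frequency_slider`), otherwise the widget stops responding once it is garbage collected.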

Interactivity is available both in the default Qt render window and in the Jupyter notebook. Note that for the latter, the magic command `%matplotlib notebook` needs to be added **in a separate cell** at the start of the notebook, otherwise only static images are produced. Hiding this cell from the notebook inside a library import (to save the user from having to enter it in every new notebook) is apparently not yet possible.

[A few notable examples of interactive plots](examples/matplotlib.ipynb#interactive_examples)

### Performance

`matplotlib` is very fast, especially for generating images with regularly-sized pixels via its `imshow` function, which is much faster than the web-based solutions listed in this report.

Saving figures as vector graphics (PDF) can, however, sometimes be quite slow.
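
One possible mitigation, offered here as a suggestion rather than something tested for this report, is to rasterize only the dense artists while keeping axes and labels as vector graphics. The sketch below uses the `rasterized=True` artist option and writes the PDF to an in-memory buffer; the data and names are illustrative.

```python
import io

import numpy as np
import matplotlib.pyplot as plt

points = np.random.rand(100000, 2)

fig_pdf, ax_pdf = plt.subplots()
# rasterized=True embeds the markers as a bitmap, while the axes, ticks
# and labels remain vector graphics in the PDF
ax_pdf.scatter(points[:, 0], points[:, 1], s=1, rasterized=True)
ax_pdf.set_xlabel("x")
ax_pdf.set_ylabel("y")

pdf_buffer = io.BytesIO()  # stands in for a file on disk
fig_pdf.savefig(pdf_buffer, format="pdf", dpi=150)
```

The `dpi` argument sets the resolution of the rasterized parts only, so it trades file size and speed against sharpness of the markers.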

Interactivity with plots shows mixed performance. Zooming in on lines or images is fast, but interaction with widgets feels sluggish.

### Standard use cases
See [here](examples/matplotlib.ipynb) for a notebook demonstrating how some standard use cases are achieved using `matplotlib`.

### Summary

<span style="color:green">**Pros:**</span>

- Widespread availability on existing systems
- Easily installable
- Large well-maintained project with many developers

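<span style="color:red">**Cons:**</span>

- API can be difficult to navigate for inexperienced users
- Lags behind other projects on interactivity; widgets feel sluggish
- Saving to vector graphics (PDF) can be slow
- Only some 3D capabilities (via `Axes3D`)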

## Plotly (https://plot.ly/python/)

`plotly` is a very popular library for creating highly interactive plots to be used in a web browser. It is JavaScript based, but also has a complete Python API, making it well suited for Jupyter notebook use.

It boasts high performance, especially for scatter plots, by making use of WebGL (see [here](https://plot.ly/python/webgl-vs-svg/)).

One very nice feature of `plotly` is that its output is standalone HTML code that can be embedded in any website.

The same example as for `matplotlib` is shown below.

In [None]:
import plotly.graph_objects as go
from plotly.subplots import make_subplots

In [None]:
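# 2x2 grid of subplots; rows and columns are numbered from 1
fig = make_subplots(rows=2, cols=2)

# 1D line plot
fig.add_trace(go.Scatter(x=x, y=z[:, 0]), row=1, col=1)

# 1D histogram with a bin width of 0.05
fig.add_trace(go.Histogram(x=z[:, 0], xbins={"size": 0.05}), row=1, col=2)

# 2D image
fig.add_trace(go.Heatmap(z=z, showscale=False), row=2, col=1)

# Scatter plot, coloured by a third column of z
fig.add_trace(go.Scatter(x=z[:, 0], y=z[:, 1], mode='markers',
    marker={"color": z[:, 2]}), row=2, col=2)

fig.show()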

### Performance

`plotly`'s performance is very good for line/scatter plots, even for very large datasets (1M points), via the `Scattergl` trace type.

Performance for 3D visualization seems reasonable.

### Dash

The makers of `plotly` also offer a framework called `Dash`, which aims to make it very easy to create web apps with interactive menus, buttons, sliders and plots. Examples can be found at https://dash-gallery.plotly.host/Portal/, and some are indeed very impressive.

It is also now possible to embed `Dash` apps as a separate window in the JupyterLab environment, to allow both clickable and scriptable interaction with an app: https://github.com/plotly/jupyterlab-dash

This could be a very nice alternative for developing GUIs such as the interfaces in `Mantid`.

### Community
A large community uses `plotly`.
Developers are responsive to posts on the forum, and seem very happy to help.
The main developer appears to be https://github.com/jonmmease.

### Standard use cases
See [here](examples/plotly.ipynb) for a notebook with the `plotly` standard use cases.

### Summary

<span style="color:green">**Pros:**</span>

- Easy to install
- Large well-maintained project
- Very complete in terms of features
- 3D capable

<span style="color:red">**Cons:**</span>

- Not easily usable outside of a web environment. It is, however, possible to use a custom renderer and open a `plotly`
  graph inside a Qt window (see [here](examples/plotly.ipynb#plotly_in_qt))

## PyQtGraph (http://www.pyqtgraph.org/)

`pyqtgraph` is a very impressive tool based on OpenGL that provides 1D-3D data visualization with astonishing performance. Zooming in on a line made up of millions of points is flawless, and the same goes for a 5000x5000 image.

The OpenGL shaders can be used to create beautiful effects in 3D environments.

Installation is simple via `pip` or `conda`, and the package comes with a large set of useful examples.

### Interactivity

Interaction with graphs via mouse events is extremely smooth, even for large datasets. It is also possible to define regions-of-interest which can be moved/reshaped with the mouse.

As the name suggests, `pyqtgraph` integrates very well in `Qt` windows, and it is relatively simple to connect graphs to sliders and buttons to enhance interactivity.

### Concerns

The project does, however, have some major drawbacks.

#### Lone developer

`pyqtgraph` is developed by only one person, Luke Campagnola, and development has been slow for the last ~2 years. There are 200+ issues and 100 PRs open on GitHub, including some that are several years old. While many different people seem to be creating PRs, all of them appear to fail the CI tests; they are nevertheless still getting merged.

#### Use in Jupyter has limitations

When using `pyqtgraph` in a Jupyter notebook, one cannot embed the plots in the notebook; a separate `Qt` window has to be spawned. This causes problems when running on a remote machine, and makes it difficult to remember which plot corresponds to which cell in a lengthy notebook.

In addition, when spawning a `Qt` window via the
```Python
from pyqtgraph.Qt import QtGui

app = QtGui.QApplication([])
QtGui.QApplication.exec_()  # blocks until the window is closed
```
method, the notebook cell blocks and everything below it can only be executed once the plot window has been closed, making the use of Jupyter rather pointless. This can, however, be alleviated by using the magic command
```Python
%gui qt
```

#### 3D visualization is not fully developed

While scatter plots, surfaces and similar representations work very well in 3D, it feels like development was never fully completed. For instance, there are no annotated axes for 3D plots: only a mesh that serves as a point of reference for the eye is available, with no tick marks, labels or annotations. While this may not be too difficult to add, it is not obvious that development would be swift, judging by the number of open PRs.

### Standard use cases
See [here](examples/pyqtgraph.ipynb) for a notebook with the `pyqtgraph` standard use cases.

### Summary

<span style="color:green">**Pros:**</span>

- Easy to install
- Lightning fast performance
- Very complete in terms of features, including clickable/draggable points/lines
- Good integration with `Qt`
- 3D capable

<span style="color:red">**Cons:**</span>

- Only one developer
- Cannot embed plots in Jupyter notebook
- 3D visualization is not fully developed (missing axes)

## Bqplot

## Holoviz

## QtCharts

## Ipyvolume

## Pyvista