# Python for (open) Neuroscience

_Lecture 3.4_ - Data visualization

Luigi Petrucco

Jean-Charles Mariani

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vigji/python-cimec/blob/main/lectures/Lecture3.4_Data-visualization.ipynb)

## Outline

<span style="color:indianred">Some color</span>

 - Some theory
 - The basic plot _galateo_
 - Interactive plots
 - `seaborn` hacks
 - Configuring `matplotlib`
 - `matplotlib` hacks
 - 3D visualizations with `napari`

A great reference

![tufte_book](https://kurtgippert.cdn.bibliopolis.com/pictures/015820.jpg?v=1498601505)

In [None]:
from matplotlib import pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd

## The data-ink ratio principle

    “Graphical excellence is that which gives 
    the greatest number of ideas in the shortest 
    time with the least ink in the smallest space”

When plotting, always strive to maximise the amount of information conveyed per unit of ink!

### A low hanging fruit: ban dynamite plots!

Following this principle, dynamite plots (bar plots with error bars) are among the worst kinds of plot you can conceive:
 - each bar conveys only two numbers (a mean and an error) while taking up a lot of space!
 - SEMs hide the actual variability of the data
 

In [None]:
plt.rc('axes.spines', **{'bottom':True, 'left':True, 'right':True, 'top':True})
np.random.seed(40)
data1 = np.abs(np.random.normal(0, 2, 30)) + 5
data2 = -data1
data2 -= (np.mean(data2) - np.mean(data1)) 

In [None]:
plt.figure(figsize=(2, 2))
sns.barplot([data1, data2], errorbar="se")

In [None]:
plt.figure(figsize=(2, 2))
sns.swarmplot([data1, data2])

Respect each data point! They're costly and painful to get. If they're not out in your chart (and/or publicly available) they are lost forever!
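A quick numerical check makes the same point. The sketch below rebuilds `data1` and `data2` as in the cell above and compares their summary statistics; `scipy.stats.sem` and `scipy.stats.skew` are not used elsewhere in this lecture and are brought in here only for illustration. The two samples share the mean and SEM that a dynamite plot would show, yet their distributions are mirror images of each other.

```python
import numpy as np
from scipy import stats

# Rebuild the two samples exactly as in the cell above
np.random.seed(40)
data1 = np.abs(np.random.normal(0, 2, 30)) + 5
data2 = -data1
data2 -= np.mean(data2) - np.mean(data1)

# Mean and SEM (what a dynamite plot shows) versus skewness (what it hides)
for name, d in [("data1", data1), ("data2", data2)]:
    print(f"{name}: mean={np.mean(d):.3f}, "
          f"SEM={stats.sem(d):.3f}, skew={stats.skew(d):.3f}")
```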

### Dynamite plots must die! Do your part!

Remove the clutter

In [None]:
f, ax = plt.subplots(figsize=(1.5, 3))
sns.swarmplot(data1)
# sns.despine(bottom=True)
ax.set(xlabel="Some data", ylabel="Count")
plt.show()

In [None]:
f, ax = plt.subplots(figsize=(1.5, 3))
sns.swarmplot(data1)
sns.despine(bottom=True)
ax.set(xlabel="Some data", ylabel="Count", xticks=[])
plt.show()

Do not use different layers of information in a redundant way:

In [None]:
x = np.random.normal(0, 1, 300)
y = x + np.random.normal(0, 1, 300)
plt.figure()
plt.scatter(x, y, c=y)
plt.show()

In [None]:
x = np.random.normal(0, 1, 300)
y = x + np.random.normal(0, 1, 300)
plt.figure()
plt.scatter(x, y)
plt.show()
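By contrast, color earns its ink when it encodes something that is not already on an axis. A minimal sketch, assuming a hypothetical third variable `z` that is independent of `x` and `y` (it is not part of the original example):

```python
import numpy as np
from matplotlib import pyplot as plt

rng = np.random.default_rng(0)
x = rng.normal(0, 1, 300)
y = x + rng.normal(0, 1, 300)
# Hypothetical third variable, independent of x and y
z = rng.uniform(0, 1, 300)

fig, ax = plt.subplots(figsize=(3, 2.5))
# Color now carries new information, so it also needs a colorbar
points = ax.scatter(x, y, c=z, cmap="viridis", s=10)
fig.colorbar(points, ax=ax, label="z (hypothetical)")
ax.set(xlabel="x", ylabel="y")
```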

Stratify information in the plot

In [None]:
plt.figure(figsize=(2, 2))
sns.swarmplot([data1, data2])

## Colormaps!

Colormaps should be not only beautiful but also informative.

Perceptual linearity is important: equal steps in the data should look like equal steps in color. That's why you should avoid the `jet` colormap.
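One rough way to see the problem with `jet` is to check how brightness changes along a colormap. The sketch below uses the Rec. 709 luma weights on the RGB values as a crude stand-in for perceived lightness; this is an assumption made for brevity, and a proper analysis would use a perceptual color space such as CIELAB. The helper `approx_luminance` is a hypothetical name introduced here.

```python
import numpy as np
from matplotlib import pyplot as plt

def approx_luminance(cmap_name, n=256):
    """Rough brightness (Rec. 709 luma) of n colors sampled along a colormap."""
    rgb = plt.get_cmap(cmap_name)(np.linspace(0, 1, n))[:, :3]
    return rgb @ np.array([0.2126, 0.7152, 0.0722])

# Fraction of steps along the colormap where brightness goes up
for name in ["jet", "pink", "viridis"]:
    lum = approx_luminance(name)
    frac_up = np.mean(np.diff(lum) > 0)
    print(f"{name}: brightness increases on {frac_up:.0%} of steps")
```

In `jet`, brightness rises and then falls again, so two very different data values can look equally bright; in colormaps designed for ordered data, brightness changes in one direction.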

In [None]:
import numpy as np
from scipy.ndimage import gaussian_filter
np.random.seed(42)
img = np.random.normal(0, 1, (300, 300))
img = gaussian_filter(-img, 50)

In [None]:
f, axs = plt.subplots(2,2,figsize=(4, 3))
for i, lims in enumerate([(0.008, None), (0.01, -0.01)]):
    for j, cmap in enumerate(["jet", "pink"]):
        axs[i, j].imshow(img, cmap=cmap, vmax=lims[0], vmin=lims[1])
        axs[i, j].axis("off")

## Practical plot hacks

### `seaborn`

 - A great library for quick visualizations

 - Very well integrated with `pandas`; it works best when the dataset has the right shape (one row per observation, one column per variable)

 - We will look at only a few functions. You should check out more in the [gallery](https://seaborn.pydata.org/examples/index.html)
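As an illustration of what the "right shape" usually means in practice, the sketch below reshapes a wide table (one column per condition) into a long one (one row per observation) with `DataFrame.melt`. The column names `control`, `treated`, `condition` and `value` are hypothetical, and the data are synthetic.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Wide form: one column per condition (hypothetical column names)
wide = pd.DataFrame({
    "control": rng.normal(5, 1, 30),
    "treated": rng.normal(6, 1, 30),
})

# Long form: one row per observation, one column per variable
tidy = wide.melt(var_name="condition", value_name="value")
print(tidy.head())
```

With the long table, a call such as `sns.swarmplot(data=tidy, x="condition", y="value")` maps columns to axes by name, and further columns can be passed to `hue`.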

### Basic seaborn plotting functions

### `sns.swarmplot()`

### `sns.boxplot()`

In [None]:
plt.figure(figsize=(3, 1))
sns.boxplot(pd.DataFrame(dict(data1=data1, data2=data2)), orient="h")

In [None]:
?sns.boxplot

### `sns.violinplot()`

Kernel density estimation (KDE) is a way of estimating a continuous probability density from a finite sample of data points.
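To get a feeling for KDE outside of `seaborn`, here is a minimal sketch using `scipy.stats.gaussian_kde` on synthetic data. The estimate is built by placing a small Gaussian bump on each data point and averaging the bumps. The sample size, the grid and the use of SciPy here are choices made for illustration, not necessarily what `sns.violinplot()` does internally.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
sample = rng.normal(0, 1, 200)

# Gaussian kernels; the bandwidth defaults to Scott's rule
kde = gaussian_kde(sample)
grid = np.linspace(-6, 6, 601)
density = kde(grid)

# A density is non-negative and integrates to (approximately) 1
area = np.sum(density) * (grid[1] - grid[0])
print(f"Area under the estimated density: {area:.3f}")
```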

### Advanced, multi-layer plots

### lineplot with error bars

In [None]:
fmri = sns.load_dataset("fmri")
fmri.head()

In [None]:
sns.lineplot(data=fmri, x="timepoint", y="signal", hue="event")

### lineplot with colors based on feature

In [None]:
dots = sns.load_dataset("dots").query("align == 'dots'")
dots.head()

In [None]:
sns.lineplot(
    data=dots, x="time", y="firing_rate", hue="coherence", style="choice",
)

### Multiple linear regression plots

In [None]:
penguins = sns.load_dataset("penguins")

g = sns.lmplot(
    data=penguins,
    x="bill_length_mm", y="bill_depth_mm", hue="species",
    height=3
)

### KDE joint plot

In [None]:

# Show the joint distribution using kernel density estimation
# (jointplot creates its own figure, so no plt.figure() call is needed)
g = sns.jointplot(
    data=penguins,
    x="bill_length_mm", y="bill_depth_mm", hue="species",
    kind="kde",
    height=4
)

## From plots to figures

General recommendation:

    Do **not** edit your figures in 
    Inkscape/Illustrator after generating them!

Data change all the time (new inclusions/exclusions, different preprocessing...). You do not want to manually edit figures multiple times!

### A good general approach

Fix as much as possible directly in the plotting code, so that making a similar figure, or updating it with new data (or new pipelines), takes no extra time.

Illustrator can be used to compose the final figure by linking the individual panels.
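One possible way to follow this approach is to wrap each panel in a small function that takes the data and an axes object and does all the styling in code. The helper `plot_panel` below is a hypothetical example with synthetic data, not a recipe from the lecture.

```python
import numpy as np
from matplotlib import pyplot as plt

def plot_panel(values, ax, label="Some data"):
    """Draw one panel with all styling done in code (hypothetical helper)."""
    # Small horizontal jitter so that individual points stay visible
    jitter = np.random.default_rng(0).uniform(-0.1, 0.1, len(values))
    ax.scatter(jitter, values, s=8, color="k")
    ax.set(xlabel=label, ylabel="Value", xticks=[], xlim=(-0.5, 0.5))
    for side in ("top", "right", "bottom"):
        ax.spines[side].set_visible(False)
    return ax

# Synthetic data; rerunning with new data regenerates the identical layout
data = np.abs(np.random.default_rng(1).normal(0, 2, 30)) + 5
fig, ax = plt.subplots(figsize=(1.5, 3))
plot_panel(data, ax)
```

When the data or the preprocessing change, rerunning the cell is enough to regenerate the panel with the same formatting.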

### Ensure good formatting in exports
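A minimal sketch of export settings that may help here, assuming vector output (PDF or SVG) is wanted for later composition into a multi-panel figure. The `pdf.fonttype` and `svg.fonttype` options are standard `matplotlib` rcParams; the specific values and the in-memory buffer (used instead of a file path such as `"panel.pdf"`) are choices made for this example.

```python
import io
from matplotlib import pyplot as plt

plt.rcParams["pdf.fonttype"] = 42      # embed TrueType fonts in PDFs
plt.rcParams["svg.fonttype"] = "none"  # keep text as text in SVGs

fig, ax = plt.subplots(figsize=(2, 2))
ax.plot([0, 1, 2], [0, 1, 4])
ax.set(xlabel="x", ylabel="y")

# In-memory stand-in for a file path; bbox_inches="tight" trims empty margins
buffer = io.BytesIO()
fig.savefig(buffer, format="pdf", bbox_inches="tight")
```

Setting the figure size in inches at creation time (`figsize`) means that font sizes in the exported panel match what ends up on the page, as long as the panel is not rescaled afterwards.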


Plotting many lines

Label your curves

Despine axes

Use lines and color spans

Contour plots

(Practical 3.4.0)