# Matplotlib

- Henry Webel at [NNF CPR](https://www.cpr.ku.dk/staff/rasmussen-group/?pure=en/persons/662319)
- Python Tsunami 2020 at [SUND]()
- Session: `Day 1, 15:30 - 18:00`


<table align="left">
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/pythontsunami/teaching/blob/matplotlib/matplotlib.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
</table>

### Saving the notebook in Drive
Save a copy in your drive if you want to save your changes: `File` -> `Save a copy in Drive`


![Save Colab Notebook in Google Drive](figures/colab_save_in_drive.png)

## Tutorial selection
- [cheat sheets](https://github.com/rougier/matplotlib-cheatsheet)
- [SciPy 2019](https://github.com/story645/mpl_tutorial)
- [Usage Guide](https://matplotlib.org/tutorials/introductory/usage.html#sphx-glr-tutorials-introductory-usage-py)
- [The lifecycle of a plot](https://matplotlib.org/tutorials/introductory/lifecycle.html)


## Objectives

1. Matplotlib's two [APIs](https://matplotlib.org/3.1.1/api/index.html#usage-patterns): the `pyplot` API (MATLAB-like interface) vs. the object-oriented API (OOM)
2. Distinction between figure, axes and axis
3. labels, ticks, legends, annotations
4. Some use-cases
5. No confusion when you search for help on [stackoverflow](https://stackoverflow.com/questions/tagged/matplotlib)

> The goal is not to introduce most parts of the API, but to make it accessible

## [Matplotlib](https://matplotlib.org/3.1.1/index.html)

- versatile set of instructions for plotting figures
- widely used by third party libraries
- supports many [backends](https://matplotlib.org/tutorials/introductory/usage.html#backends) (application, machine or operating system specific)

In [None]:
import matplotlib

matplotlib.__version__

In [None]:
matplotlib.get_backend()

## Matplotlib API

- The global `pyplot` functionality mirrors the plotting functionality of MATLAB.
- Recommendation by matplotlib: use object-oriented plotting, see [usage guide](https://matplotlib.org/3.1.1/tutorials/introductory/usage.html#figure)


### Example
1. `pyplot` plotting
2. Object-oriented plotting

### Pyplot-API

In [None]:
import matplotlib.pyplot as plt

fig = plt.figure()
ax = plt.subplot()

### Object-Oriented API

In [None]:
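import matplotlib.figure  # import the submodule explicitly; `import matplotlib` alone does not guarantee it is loaded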

fig = matplotlib.figure.Figure()
ax = fig.add_subplot(1, 1, 1)
fig

### Mixing up both

> In my impression, this is the most common usage.

In [None]:
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
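
A minimal sketch of what mixing the two interfaces means in practice, assuming a freshly created figure: `pyplot` functions act on the "current" axes, while object-oriented calls act on the object you hold. Right after `plt.subplots()` both refer to the same `Axes`. The plotted values are arbitrary placeholders.

```python
import matplotlib.pyplot as plt

# pyplot creates the figure and the axes and registers them as "current"
fig, ax = plt.subplots()

# object-oriented call: acts on the Axes object we hold
ax.plot([0, 1, 2], [0, 1, 4], label="via ax.plot")

# pyplot calls: act on the current axes, which is the same object here
plt.plot([0, 1, 2], [0, 2, 8], label="via plt.plot")
plt.title("Set via pyplot")

same_axes = plt.gca() is ax  # both interfaces share one Axes
n_lines = len(ax.lines)      # lines drawn on this Axes by either interface
ax.legend()
```

Many answers on stackoverflow use one style or the other; knowing that `plt.gca()` and `plt.gcf()` return the current axes and figure helps to translate between them.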

## Naming

- common language vs (painful) _application programming interface_ (API) naming
- A *figure* can contain multiple **ax*e*s**, each of which in a 2D plot has two [**ax*e*s**](https://www.merriam-webster.com/dictionary/axis#:~:text=Language%20Learners%20Dictionary-,axis,axes%5C%20%CB%88ak%2D%E2%80%8B%CB%8Cs%C4%93z%20%5C) (which in singular are the **x-axis** and **y-axis**)

![Figure, Axes and Axis](figures/matplotlib/fig_axes_axis.png)
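
The naming can be checked directly on the objects. The following is a small sketch, assuming a figure with two subplots: the `Figure` holds a list of `Axes`, and each `Axes` holds an `xaxis` and a `yaxis` object.

```python
import matplotlib.pyplot as plt

fig, axes = plt.subplots(nrows=1, ncols=2)

n_axes = len(fig.axes)  # the figure contains one Axes per subplot

ax = axes[0]            # a single Axes (one plot area)
x_axis = ax.xaxis       # the x-axis of this Axes
y_axis = ax.yaxis       # the y-axis of this Axes

print(type(fig).__name__, n_axes)
print(type(x_axis).__name__, type(y_axis).__name__)
```

Ticks and tick labels are configured on the axis objects (`ax.xaxis`, `ax.yaxis`), while plotting methods such as `ax.plot` belong to the `Axes`.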

## Anatomy of Figure
- [code](https://matplotlib.org/3.1.1/gallery/showcase/anatomy.html)

![[Matplotlib Anatomy of a Figure](https://matplotlib.org/3.1.1/gallery/showcase/anatomy.html)](https://matplotlib.org/3.1.1/_images/anatomy.png)

### Plot without the annotation

In [None]:
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import AutoMinorLocator, FuncFormatter, MultipleLocator

# Numpy part - please skip
np.random.seed(19680801)

X = np.linspace(0.5, 3.5, 100)
Y1 = 3 + np.cos(X)
Y2 = 1 + np.cos(1 + X / 0.75) / 2
Y3 = np.random.uniform(Y1, Y2, len(X))

# Matplotlib part
fig = plt.figure(figsize=(8, 8))
ax = fig.add_subplot(1, 1, 1, aspect=1)


def minor_tick(x, pos):
    if not x % 1.0:
        return ""
    return "%.2f" % x


ax.xaxis.set_major_locator(MultipleLocator(1.000))
ax.xaxis.set_minor_locator(AutoMinorLocator(4))
ax.yaxis.set_major_locator(MultipleLocator(1.000))
ax.yaxis.set_minor_locator(AutoMinorLocator(4))
ax.xaxis.set_minor_formatter(FuncFormatter(minor_tick))

ax.set_xlim(0, 4)
ax.set_ylim(0, 4)

ax.tick_params(which="major", width=1.0)
ax.tick_params(which="major", length=10)
ax.tick_params(which="minor", width=1.0, labelsize=10)
ax.tick_params(which="minor", length=5, labelsize=10, labelcolor="0.25")

ax.grid(linestyle="--", linewidth=0.5, color=".25", zorder=-10)

ax.plot(X, Y1, c=(0.25, 0.25, 1.00), lw=2, label="Blue signal", zorder=10)
ax.plot(X, Y2, c=(1.00, 0.25, 0.25), lw=2, label="Red signal")
ax.plot(X, Y3, linewidth=0, marker="o", markerfacecolor="w", markeredgecolor="k")

ax.set_title("Anatomy of a figure", fontsize=20, verticalalignment="bottom")
ax.set_xlabel("X axis label")
ax.set_ylabel("Y axis label")

_ = ax.legend()

## Plotting Ecosystem

Libraries using matplotlib
- seaborn 
- pandas `.plot` method ([guide](https://pandas.pydata.org/pandas-docs/stable/user_guide/visualization.html), 
    [method-doc](https://pandas.pydata.org/pandas-docs/stable/reference/frame.html#plotting), 
    [plotting-sublibrary](https://pandas.pydata.org/pandas-docs/stable/reference/plotting.html))
    
> Claim: You will hardly use matplotlib directly, when your data is in a [`pandas.DataFrame`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html)  
> If you program in pure numpy this would be different.
    

In [None]:
import seaborn as sns

sns.__version__

In [None]:
import pandas as pd

pd.__version__

In [None]:
# help(pd.plotting)

## Example using matplotlib, seaborn and pandas plotting together

### Data

In [None]:
data_fasta_uniprot = {
    "protein_id": {
        0: 0,
        1: 1173665,
        2: 703064,
        3: 469464,
        4: 301013,
        5: 191521,
        6: 117915,
        7: 72008,
        8: 48178,
        9: 33798,
        10: 23277,
    },
    "protein": {
        0: 0,
        1: 1758959,
        2: 749718,
        3: 346667,
        4: 153179,
        5: 81073,
        6: 42527,
        7: 24434,
        8: 14458,
        9: 9126,
        10: 6935,
    },
    "gene": {
        0: 19444,
        1: 3105730,
        2: 64634,
        3: 11683,
        4: 3670,
        5: 1866,
        6: 931,
        7: 821,
        8: 469,
        9: 407,
        10: 237,
    },
}

## Figure with 4 subplots

In [None]:
fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(10, 10))
axes

### Matplotlib bar plot

- check [`axes` API](https://matplotlib.org/3.1.0/api/axes_api.html) documentation


In [None]:
x = list(data_fasta_uniprot["protein_id"].keys())
x

In [None]:
y = list(data_fasta_uniprot["protein_id"].values())
y

In [None]:
ax = axes[0, 0]
ax.bar(x, y)

In [None]:
fig

### Bar plot using Pandas

In [None]:
df_fasta_uniprot = pd.DataFrame(data_fasta_uniprot)
df_fasta_uniprot

In [None]:
df_fasta_uniprot.plot(kind="bar", ax=axes[0, 1])

In [None]:
fig

#### Side Note: Matplotlib also works with `pandas.DataFrames`

> Don't use it. Use Pandas directly!

In [None]:
axes[1,0].clear()
axes[1,0].bar(x=list(range(len(df_fasta_uniprot))), height='protein_id', data=df_fasta_uniprot)

In [None]:
fig

In [None]:
axes[1,0].clear()

### Barplot in Seaborn

- seaborn summarizes the data 

In [None]:
sns.barplot(data=df_fasta_uniprot.T, ax=axes[1, 1])
fig
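
What "summarizes" means here: with its default settings, `sns.barplot` draws one bar per category showing the mean of the values, plus an error bar. For wide-form input such as `df_fasta_uniprot.T`, each column is treated as one category. The sketch below reproduces the expected bar heights with pandas only, using the first three rows of the data above; it assumes the seaborn defaults and does not call seaborn itself.

```python
import pandas as pd

# first three rows (keys 0, 1, 2) of the data used above
subset = pd.DataFrame(
    {
        "protein_id": {0: 0, 1: 1173665, 2: 703064},
        "protein": {0: 0, 1: 1758959, 2: 749718},
        "gene": {0: 19444, 1: 3105730, 2: 64634},
    }
)

wide = subset.T            # columns 0, 1, 2; one row per original column
heights = wide.mean()      # mean per column: the default bar height in seaborn
print(heights)
```

So the bars in the seaborn subplot are averages over `protein_id`, `protein` and `gene`, not the raw counts shown in the other subplots.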

### Exercise: Plot something in the last subplot

## Extras: Some spotlights

> Collection of options for figure layouts (tbc)

### Different sized subplots

In [None]:
fig = plt.figure()
# define axes [left, bottom, width, height] as fractions of figure width and height.
frames = [[0.04, 0.08, .22, .90], [0.4, 0.08, .63, .90]]
axes = (fig.add_axes(frames[0], frame_on=False), fig.add_axes(frames[1], frame_on=True))

In [None]:
fig, axes = plt.subplots(ncols=2, gridspec_kw={"width_ratios": [5, 1], "wspace": 0.2}, figsize=(10,4))

In [None]:
_ = axes[1].axis("off")
fig

In [None]:
fig.clear() # fig.clf()
fig

### Reusing code

- define a function which takes as first argument an `axes` object: `myfunc(ax, ...)`
- use only the object-oriented matplotlib (OOM) API

> Reference: [Coding Style](https://matplotlib.org/3.1.1/tutorials/introductory/usage.html#coding-styles) section in Usage Guide

In [None]:
#ToDo
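
A possible sketch of such a reusable function. The name `plot_counts` and its parameters are hypothetical and only serve as an illustration; the values are taken from the data above (keys 1 to 3).

```python
import matplotlib.pyplot as plt


def plot_counts(ax, x, y, title=None):
    """Draw a bar plot of y over x on the given Axes and return the Axes."""
    ax.bar(x, y)
    ax.set_xlabel("x")
    ax.set_ylabel("count")
    if title is not None:
        ax.set_title(title)
    return ax


x = [1, 2, 3]
fig, axes = plt.subplots(ncols=2, figsize=(8, 3))

# the same function fills any Axes it is given
plot_counts(axes[0], x, [1173665, 703064, 469464], title="protein_id")
plot_counts(axes[1], x, [1758959, 749718, 346667], title="protein")
```

Because the function never touches `pyplot` state, it works the same for a single plot and for any position in a grid of subplots.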

## Gallery
- [gallery](https://matplotlib.org/gallery) 
- check out some examples: [XKCD](https://matplotlib.org/3.1.1/gallery/showcase/xkcd.html#sphx-glr-gallery-showcase-xkcd-py)


## Covid19 data

Let's plot some Covid19 aggregates.

In [None]:
import os
import pandas as pd

FOLDER_DATA = 'data/covid-19/data/'
COUNTRIES_AGG = os.path.join(FOLDER_DATA, 'countries-aggregated.csv')
TIME_SERIES_COVID19 = os.path.join(FOLDER_DATA, 'time-series-19-covid-combined.csv')
REFERENCE = os.path.join(FOLDER_DATA, 'reference.csv')

data_covid19 = pd.read_csv(COUNTRIES_AGG, index_col='Date')
data_covid19_reference = pd.read_csv(REFERENCE)

## Different kinds of plots

matplotlib | pandas | seaborn
---------- | ------ | -------
`ax.plot` | `df.plot.line` | `sns.lineplot`

## Color maps and Styles

- [Color maps](https://matplotlib.org/3.3.1/tutorials/colors/colormaps.html#sphx-glr-tutorials-colors-colormaps-py), abbreviated `cmap`, map numeric values (of a certain range) to colors.
- [style sheets](https://matplotlib.org/3.3.1/gallery/style_sheets/style_sheets_reference.html#sphx-glr-gallery-style-sheets-style-sheets-reference-py) define several aspects at once
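
Both ideas can be tried in a few lines. This is a minimal sketch, assuming the built-in `viridis` color map and the `ggplot` style sheet that ship with matplotlib; the plotted values are arbitrary.

```python
import matplotlib.pyplot as plt
import numpy as np

values = np.linspace(0, 1, 5)

# a color map is callable: numbers in [0, 1] are mapped to RGBA colors
cmap = plt.get_cmap("viridis")
colors = cmap(values)  # one RGBA row per value

# a style sheet changes many defaults at once; the context manager limits
# the change to the figures created inside the with-block
with plt.style.context("ggplot"):
    fig, ax = plt.subplots()
    points = ax.scatter(values, values**2, c=values, cmap="viridis")
    fig.colorbar(points, ax=ax, label="value")
```

`plt.style.available` lists the names of all installed style sheets.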

## Scientific Figures - Case Study

In [None]:
data_fasta_uniprot = {
    "protein_id": {
        0: 0,
        1: 1173665,
        2: 703064,
        3: 469464,
        4: 301013,
        5: 191521,
        6: 117915,
        7: 72008,
        8: 48178,
        9: 33798,
        10: 23277,
    },
    "protein": {
        0: 0,
        1: 1758959,
        2: 749718,
        3: 346667,
        4: 153179,
        5: 81073,
        6: 42527,
        7: 24434,
        8: 14458,
        9: 9126,
        10: 6935,
    },
    "gene": {
        0: 19444,
        1: 3105730,
        2: 64634,
        3: 11683,
        4: 3670,
        5: 1866,
        6: 931,
        7: 821,
        8: 469,
        9: 407,
        10: 237,
    },
}

In [None]:
x = np.linspace(0, 2, 100)

plt.plot(x, x, label='linear')
plt.plot(x, x**2, label='quadratic')
plt.plot(x, x**3, label='cubic')

plt.xlabel('x label')
plt.ylabel('y label')

plt.title("Simple Plot")

plt.legend()

plt.show()

In [None]:
x = np.arange(11)

# convert the dict views to lists, as in the bar plot example above
plt.plot(x, list(data_fasta_uniprot['protein_id'].values()), label='protein_id')
plt.plot(x, list(data_fasta_uniprot['protein'].values()), label='protein')
plt.plot(x, list(data_fasta_uniprot['gene'].values()), label='gene')

plt.xlabel('x label')
plt.ylabel('y label')

plt.title("Simple Plot")

plt.legend()

plt.show()

## Remarks
- matplotlib is powerful and thus may be frightening in the beginning
- many issues and technical details are still a mystery to me
- this notebook is not a reference; please browse the official documentation for the latest news
    - examples are referenced in the text.

## Matplotlib Documentation
- [Tutorials](https://matplotlib.org/tutorials/index.html)