![Matplotlib](http://upload.wikimedia.org/wikipedia/en/5/56/Matplotlib_logo.svg)

<h1 id="tocheading">Table of Contents</h1>
<div id="toc"></div>

In [None]:
%%javascript
$.getScript('https://kmahelona.github.io/ipython_notebook_goodies/ipython_notebook_toc.js')

<a id=setup></a>
# Notebook Setup (run me first!)

To work with Matplotlib, the library must be imported first. To save typing, we give it a shorter name:

In [None]:
import matplotlib.pyplot as plt

We apply a "magic command" to have interactive plots in the notebook.

This is using the `ipympl` package. 

Without this, you will only get non-interactive versions of the plots.

In [None]:
# only for the notebook
%matplotlib widget

# only in the ipython shell
# %matplotlib

Matplotlib works best with numpy arrays, so we import `numpy` as well:

In [None]:
import numpy as np

In [None]:
# this is a matplotlib tutorial, we are going to open lots of figures ;-)
plt.rcParams['figure.max_open_warning'] = 100

<a id=line_plots></a>
# Basics

## Line plots

In [None]:
x = np.linspace(0, 1, 250) # 250 numbers from 0 to 1

# create a new plot, this is required when using ipympl
# as otherwise, code in a new cell will affect a previously 
# created plot
plt.figure()

# plot some data
plt.plot(x, x**2)


# If not interactive, e.g. in a script: 
# plt.show()

## Using different styles for plots

In [None]:
t = np.linspace(0, 2 * np.pi, 50)   # 50 points between 0 and 2π

# constrained_layout is a new feature in matplotlib, 
# to automatically adjust items in a plot to best use the available space
plt.figure(constrained_layout=True)
plt.plot(t, np.sin(t));

In [None]:
plt.figure(constrained_layout=True)
plt.plot(t, np.sin(t), '--');

In [None]:
plt.figure(constrained_layout=True)

# plt.plot(t, np.sin(t), 'go')

# same thing! But with better keywords
plt.plot(t, np.sin(t), color='green', marker='o', linestyle='');

None

In [None]:
x = np.linspace(0, 1, 100)

plt.figure(constrained_layout=True)
for n in range(9):
    
    # C<N> is the nth color of the current color cycle
    plt.plot(x**(n + 1), color=f'C{n}')

All styles and colors: [matplotlib.axes.Axes.plot](http://matplotlib.org/api/_as_gen/matplotlib.axes.Axes.plot.html#matplotlib.axes.Axes.plot)
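The format strings used above are shorthand for the `color`, `marker` and `linestyle` keywords. As a minimal sketch (the variable names `short` and `explicit` are only illustrative), the following plots with both spellings and inspects the resulting line objects. Marker and line style come out identical; the short code `'g'` and the color name `'green'` are very similar but not exactly the same shade.

```python
import matplotlib.pyplot as plt
import numpy as np

t = np.linspace(0, 2 * np.pi, 50)

fig, ax = plt.subplots(constrained_layout=True)

# shorthand format string: color 'g', marker 'o', no connecting line
short, = ax.plot(t, np.sin(t), 'go')

# explicit keywords for the same marker and line style
explicit, = ax.plot(t, np.cos(t), color='green', marker='o', linestyle='')

for line in (short, explicit):
    print(line.get_color(), line.get_marker(), line.get_linestyle())
```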



<a id=setting_limits></a>
## Setting x and y limits

In [None]:
plt.figure(constrained_layout=True)

plt.plot(t, np.sin(t))
plt.xlim(0, 2*np.pi)
plt.ylim(-1.2, 1.2)

None # Do not show matplotlib return value in notebook

## Axes labels

![XKCD comic on why you should label your axes.](http://imgs.xkcd.com/comics/convincing.png "And if you labeled your axes, I could tell you exactly how MUCH better.")


In [None]:
with plt.xkcd():
    
    plt.figure(constrained_layout=True)
    plt.title('Axes with labels')
    plt.plot(t, np.sin(t))
    
    plt.xlabel('t / s')
    plt.ylabel('U / V')
    
    plt.ylim(-1.1, 1.1)
    plt.xlim(0, 2*np.pi)

### A side note on units


Contrary to some customs in HEP or astronomy, the BIPM has strict rules on how to typeset quantities, numbers and units in table headers and axis labels:

> Symbols for units are treated as mathematical entities. In expressing the value of a
quantity as the product of a numerical value and a unit, both the numerical value and
the unit may be treated by the ordinary rules of algebra.
It is often convenient to
write the quotient of a quantity and a unit in this way for the heading of a column in a
table, so that the entries in the table are all simply numbers. [...]
The axes of a graph may also be labelled in this way, so that the tick marks are
labelled only with numbers


[Bureau International des Poids et Mesures, The International System of Units, Chapter 5, Section 4](https://www.bipm.org/utils/common/pdf/si-brochure/SI-Brochure-9-EN.pdf)

A physical quantity is always the product of a number and a unit, so what is shown on the axis is the quantity divided by the unit.

Square brackets in particular have a completely different meaning and become highly problematic in mathematical operations on quantities, such as $\log_{10}(E  \,/\, 1\,\mathrm{GeV})$.
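The same logic carries over to code. The sketch below assumes, purely for illustration, that energies are stored as plain numbers in multiples of 1 GeV (the constants `GEV` and `TEV` are hypothetical helpers, not part of any library). Dividing by the unit yields a pure number that can be passed to the logarithm, and the axis label states exactly that operation.

```python
import numpy as np

# assumption for this sketch: energies are stored as plain numbers in GeV
GEV = 1.0
TEV = 1000.0 * GEV

energy = np.array([0.5, 1.0, 20.0]) * TEV

# dividing by the unit leaves a pure number, which is what log10 needs
log_energy = np.log10(energy / GEV)

# the axis label spells out the operation that was applied
label = r'$\log_{10}(E \,/\, \mathrm{GeV})$'
print(log_energy, label)
```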

<a id=label_formatting></a>
### Label formatting

These are preferably set outside the script in a matplotlibrc file,
see https://matplotlib.org/users/customizing.html
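As a rough sketch of that approach without touching any file, the same settings can be applied temporarily with `plt.rc_context`. The keys below are standard rcParams; the chosen values simply mirror the font dictionaries used in the next cell. In a matplotlibrc file the equivalent lines would be written as `axes.labelsize: 18` and so on.

```python
import matplotlib.pyplot as plt
import numpy as np

# values chosen to mirror the per-call font dictionaries of this section
label_settings = {
    'axes.labelsize': 18,
    'axes.titlesize': 24,
    'axes.titleweight': 'bold',
    'font.family': 'serif',
}

t = np.linspace(0, 2 * np.pi, 50)

# the settings only apply inside the with-block
with plt.rc_context(label_settings):
    fig, ax = plt.subplots(constrained_layout=True)
    ax.plot(t, np.sin(t))
    ax.set_xlabel('t / s')
    ax.set_ylabel('U / V')
    ax.set_title('Always label your plots!')
```

The advantage over passing font dictionaries to every call is that plotting code stays free of styling details.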

In [None]:
plt.figure(constrained_layout=True)

plt.plot(t, np.sin(t))

title_font = {'fontsize': 24, 'fontweight': 'bold', 'family': 'serif'}
axes_font = {'fontsize': 18}

plt.xlabel('t / s', axes_font)
plt.ylabel('U / V', axes_font)
plt.title('Always label your plots!', title_font);

### Using LaTeX

Matplotlib can handle a rather complete subset of LaTeX in any text.

In [None]:
plt.figure(constrained_layout=True)

plt.plot(t, np.sin(t))

# leading r means "raw string", so that '\' does not need to be escaped
plt.xlabel(r'$t \,/\, \mathrm{s}$')
plt.ylabel(r"$\int_0^t \cos(t') \, \mathrm{d}t'$");

## Legends

Matplotlib can create legends automatically for plot objects that have a label.

In [None]:
plt.figure(constrained_layout=True)

plt.plot(t, np.sin(t), label=r'$\sin(t)$')
plt.plot(t, np.cos(t), label=r'$\cos(t)$')
plt.legend()
# plt.legend(loc='upper center')

None # only to avoid cluttering the notebook

In [None]:
plt.figure(constrained_layout=True)

x = np.linspace(0, 1)

plt.plot(x, x**2, label=r'$x^2$')
plt.plot(x, x**4)
plt.plot(x, x**6, 'o', label=r'$x^6$')

plt.legend(loc='best');

**Remember**: Legend entries are only generated for plot objects that have a label (note x⁴ is missing)!
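To check which entries a legend will contain before drawing it, one option is to ask the axes directly. This sketch uses the object-oriented interface (`fig, ax = plt.subplots()`), which is introduced later in this notebook, and `Axes.get_legend_handles_labels`, which returns only the labeled artists.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 1)

fig, ax = plt.subplots(constrained_layout=True)
ax.plot(x, x**2, label=r'$x^2$')
ax.plot(x, x**4)                        # no label, so no legend entry
ax.plot(x, x**6, 'o', label=r'$x^6$')

# only artists with a label are returned
handles, labels = ax.get_legend_handles_labels()
print(labels)

ax.legend(handles, labels, loc='best')
```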

### Legend outside the plot

In [None]:
plt.figure(constrained_layout=True)

plt.plot(t, np.sin(t), label=r'$\sin(t)$')
plt.plot(t, np.cos(t), label=r'$\cos(t)$')
plt.legend()

# put the legend in the center above the plot
# coordinates are relative to the axes size.
plt.legend(
    bbox_to_anchor=(0.5, 1.01),
    loc='lower center',  # this now sets the anchor of the legend
    ncol=2,
)

## Grids

In [None]:
plt.figure(constrained_layout=True)
plt.plot(t, np.sin(t))
plt.grid()

## Axis-Scales

In [None]:
plt.figure(constrained_layout=True)

x = np.linspace(0, 10)
# x = np.logspace(-1, 2, 100)

plt.plot(x, np.exp(-x))
plt.yscale('log')
# plt.xscale('log')

## Custom Ticks

In [None]:
x = np.linspace(0, 2*np.pi)

plt.figure(constrained_layout=True)

plt.plot(x, np.sin(x))
plt.xlim(0, 2*np.pi)
# First argument: position, second argument: labels
plt.xticks(
    np.arange(0, 2*np.pi + 0.1, np.pi/2),
    [r"$0$", r"$\frac{1}{4}\tau$", r"$\frac{1}{2}\tau$", r"$\frac{3}{4}\tau$", r"$\tau$"]
)
plt.title(r"$\tau$ FTW!")   # https://tauday.com/tau-manifesto
None

In [None]:
months = ['January',
          'February',
          'March',
          'April',
          'May',
          'June',
          'July',
          'August',
          'September',
          'October',
          'November',
          'December']

plt.figure(constrained_layout=True)

rng = np.random.default_rng(0)

plt.bar(np.arange(12), rng.uniform(0, 1, 12))
plt.xticks(
    np.arange(12),
    months,
    rotation=45,
    rotation_mode='anchor',
    horizontalalignment='right',  # or ha
    verticalalignment='top',      # or va
);

# Error bars

In [None]:
x = np.linspace(0, 2*np.pi, 10)

rng = np.random.default_rng(0)
errX = rng.normal(0, 0.4, 10)
errY = rng.normal(0, 0.4, 10)

plt.figure(constrained_layout=True)

# error bar lengths must be non-negative, so use the known standard deviation
plt.errorbar(x + errX, x + errY, xerr=0.4, yerr=0.4, linestyle='');

## Asymmetrical errors

Pass two arrays to the `xerr` or `yerr` keyword arguments: the first holds the lower errors, the second the upper errors.


In [None]:
x = np.linspace(0, 1, 10)

plt.figure(constrained_layout=True)

plt.errorbar(
    x, 
    np.sin(2 * np.pi * x),
    yerr=[np.full_like(x, 0.5), np.full_like(x, 0.1)],
    linestyle='',
    marker='o',
)

## Upper and lower limits


Often, we want to give uncertainties for some values, but upper or lower limits for others.

In [None]:
# create some random "spectrum"
bins = np.logspace(2, 4, 15)
x = (bins[:-1] + bins[1:]) / 2

rng = np.random.default_rng(0)

y = x**(-2.7)
yerr = y * 0.3
y += rng.normal(0, yerr)

# mask for which points are upper limits
uplims = np.zeros_like(x, dtype=bool)

# last points are only upper limits
y[-3:] += 3 * y[-3:]
yerr[-3:] = 0.3 * y[-3:] # yerr determines length of limit arrow
uplims[-3:] = True 

Now we can plot the data

In [None]:
plt.figure(constrained_layout=True)

plt.errorbar(
    x,
    y,
    xerr=np.diff(bins) / 2,
    yerr=yerr,
    uplims=uplims,
    ls='none',
)

# raw strings, so that the backslashes do not need to be escaped
plt.xlabel(r'$E \ / \ \mathrm{GeV}$')
plt.ylabel(r'$Flux \ / \ \mathrm{GeV}^{-1} \mathrm{s}^{-1} \mathrm{m}^{-2} \mathrm{sr}^{-1}$')
plt.xscale('log')
plt.yscale('log')

# Polar Plots

In [None]:
r = np.linspace(0, 10, 50)
# r = np.linspace(0, 10, 1000)
theta = 2*np.pi*r

plt.figure(constrained_layout=True)

plt.polar(theta, r);

# Histograms

## 1D Histograms

In [None]:
# Generate random data:

rng = np.random.default_rng(0)
x = rng.normal(0, 1, 1000)

plt.figure(constrained_layout=True)
plt.hist(x, bins=25);

When comparing two distributions, make sure to use the same binning for both:
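If a sensible range is not known in advance, one possible approach is to derive the shared edges from the combined sample. The sketch below uses `np.histogram_bin_edges` for that; the choice of 50 equal-width bins is an assumption matching the cell that follows, not a recommendation.

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(-1, 1, 1000)
x2 = rng.normal(1, 1, 1000)

# 50 equal-width bins spanning the full range of both samples
bin_edges = np.histogram_bin_edges(np.concatenate([x1, x2]), bins=50)

# both histograms now use identical edges and can be compared bin by bin
counts1, _ = np.histogram(x1, bins=bin_edges)
counts2, _ = np.histogram(x2, bins=bin_edges)
```

The resulting `bin_edges` array can be passed to `plt.hist(..., bins=bin_edges)` in the same way as the hand-written edges.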

In [None]:
x1 = rng.normal(-1, 1, 1000)
x2 = rng.normal(1, 1, 1000)

bin_edges = np.linspace(-6, 6, 51)  # 50 bins between -6 and 6

plt.figure(constrained_layout=True)

plt.hist(x1, bins=bin_edges, histtype='step', label='x1')
plt.hist(x2, bins=bin_edges, histtype='step', label='x2')

plt.legend();

Beware of binning effects: there should never be bins that are always empty.

For example, for integer data, use bins of integer width that are centered on the integers:
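The effect can also be checked numerically without plotting. In this sketch the edges are placed at half-integers between the smallest and largest observed value, so that every bin is centered on one integer. Deriving the range from the data is a choice made for this example; the cell below uses fixed edges instead.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.poisson(5, size=500)

# 20 equal-width bins: bins stay empty whenever the data
# contain fewer than 20 distinct integer values
counts_naive, _ = np.histogram(x, bins=20)

# one bin per integer, with edges at half-integers
bin_edges = np.arange(x.min() - 0.5, x.max() + 1.5, 1)
counts_int, _ = np.histogram(x, bins=bin_edges)

print('empty bins with bins=20:     ', np.sum(counts_naive == 0))
print('empty bins with integer bins:', np.sum(counts_int == 0))
```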

In [None]:
x = rng.poisson(5, size=500)


plt.figure(constrained_layout=True)

plt.subplot(2, 1, 1)
plt.hist(x, bins=20)

plt.subplot(2, 1, 2)
bin_edges = np.arange(-0.5, 15, 1)

plt.hist(x, bins=bin_edges, edgecolor='w', linewidth=2)

bin_edges

## 2D Histograms

In [None]:
mean = np.array([2, 1])
cov = np.array(
    [[9, 2],
     [2, 4]]
)

x, y = rng.multivariate_normal(mean, cov, size=10000).T

plt.figure(constrained_layout=True)

plt.hist2d(x, y)
# plt.hist2d(x, y, bins=50)
# plt.hist2d(x, y, bins=[25, 50], range=[[-10, 14], [-5, 7]])

plt.colorbar(label='Counts');

### Using Norms – e.g. Logarithmic colorscale

In [None]:
from matplotlib.colors import LogNorm, SymLogNorm, TwoSlopeNorm

plt.figure(constrained_layout=True)


plt.hist2d(
    x,
    y,
    bins=50,
    norm=LogNorm(),
)
plt.colorbar(label='Counts');

### SymLogNorm

The SymLogNorm uses two logscales, one for the negative, one for the positive numbers.
Around 0, a linear scale is used. The threshold value for the linear scale
has to be given.
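To get a feeling for the mapping, the norm can be called directly on a few numbers. This sketch assumes limits of ±1000 and passes `base=10` explicitly; both values are chosen only for illustration. The result is the position on the colormap between 0 and 1, with zero landing in the middle.

```python
import numpy as np
from matplotlib.colors import SymLogNorm

# linear between -10 and 10, logarithmic outside of that range
norm = SymLogNorm(linthresh=10, vmin=-1000, vmax=1000, base=10)

values = np.array([-1000, -100, -10, 0, 10, 100, 1000])

# position of each value on the colormap, from 0 (vmin) to 1 (vmax)
mapped = np.asarray(norm(values), dtype=float)
print(np.round(mapped, 3))
```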

In [None]:
x1, y1 = rng.multivariate_normal(mean, cov / 3, size=100000).T
x2, y2 = rng.multivariate_normal(mean, cov, size=100000).T


hist1, xedges, yedges = np.histogram2d(x1, y1, bins=100, range=[[-8, 13], [-7, 9]])
hist2, _, _ = np.histogram2d(x2, y2, bins=[xedges, yedges])

plt.figure(constrained_layout=True)

plt.pcolormesh(
    xedges,
    yedges,
    hist1 - hist2,
    cmap='RdBu_r',
    norm=SymLogNorm(10),
)

plt.colorbar();

### Colormaps

* Can influence perception greatly
* Physicists' most beloved colormaps (rainbow, jet) are objectively bad
    * Do not work when printed black/white
    * Not colorblind friendly
    * Not perceptually uniform
    
* Use the modern colormaps in matplotlib
    * `viridis` (default since 2.0)
    * `inferno`
    * `magma`
    * `plasma`
    * `cividis`
    
* Use domain specific colormaps where appropriate:   
  sequential vs. diverging vs. qualitative vs. cyclic

More here: 
* https://www.youtube.com/watch?v=xAoljeRJ3lU
* https://matplotlib.org/3.1.0/tutorials/colors/colormaps.html
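The black-and-white problem can be illustrated with a crude check. The helper `approx_lightness` below is hypothetical and uses the Rec. 601 luma weights as a simple stand-in for a grayscale conversion; it is not a proper perceptual lightness model. It samples a colormap and tests whether the gray value only ever increases.

```python
import matplotlib.pyplot as plt
import numpy as np


def approx_lightness(cmap_name, n=256):
    """Rough grayscale value (Rec. 601 luma weights) along a colormap."""
    rgb = plt.get_cmap(cmap_name)(np.linspace(0, 1, n))[:, :3]
    return rgb @ np.array([0.299, 0.587, 0.114])


for name in ['viridis', 'jet']:
    lightness = approx_lightness(name)
    monotonic = bool(np.all(np.diff(lightness) >= 0))
    print(f'{name}: gray value increases monotonically: {monotonic}')
```

A colormap whose gray value rises and falls again maps different data values to the same shade of gray when printed without color.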

<a id="scatter"></a>
# Scatter Plots

In [None]:
x1, y1 = rng.multivariate_normal([1, 1], [[1, 0], [0, 1]], 1000).T
x2, y2 = rng.multivariate_normal([-1, -1], [[1, 0], [0, 1]], 1000).T

plt.figure(constrained_layout=True)

plt.scatter(x1, y1)
plt.scatter(x2, y2)

None

In [None]:
# we can also set the size and colors of each point individually
x = np.append(x1, x2)
y = np.append(y1, y2)

s = rng.uniform(5, 50, 2000)
label = np.append(np.ones_like(x1), np.zeros_like(x2))

plt.figure(constrained_layout=True)
plt.scatter(x, y, c=label, s=s);

# Multiple plots in the same figure

In [None]:
x = np.linspace(0, 2*np.pi)

plt.figure(constrained_layout=True)

# subplot arguments: number of rows, number of columns, plot index (starts at 1, counted row by row)
plt.subplot(2, 1, 1)
plt.plot(x, x**2)
plt.xlim(0, 2*np.pi)

plt.subplot(2, 1, 2)
plt.plot(x, np.sin(x))
plt.xlim(0, 2*np.pi);

## Spacing

You should almost always use `constrained_layout=True` (still a bit experimental) when creating the figure,
or call `plt.tight_layout()` once all plot elements have been added.

In [None]:
x = np.linspace(0, 2*np.pi)

plt.figure(constrained_layout=False)

plt.subplot(2, 1, 1)
plt.plot(x, x**2)
plt.xlim(0, 2*np.pi)
plt.xlabel(r'$x \ / \ \mathrm{m}$')
plt.title(r"$f(x)=x^2$")

plt.subplot(2, 1, 2)
plt.plot(x, np.sin(x))
plt.xlim(0, 2*np.pi)
plt.xlabel(r'$x \ / \ \mathrm{m}$')
plt.title(r"$f(x)=\sin(x)$")

plt.tight_layout()   # try commenting this line out!

## Inset Plots (plot inside a plot)

In [None]:
plt.figure(constrained_layout=True)

plt.plot(x, x**2)
plt.title("Outer Plot")

# axes coordinates: (0,0) is lower left, (1,1) upper right
plt.axes([0.2, 0.45, 0.3, 0.3])
plt.plot(x, x**3)
plt.title("Inner Plot");

# Using the object-oriented syntax

Matplotlib has two APIs (yes, it's strange).

* The MATLAB-like syntax we have used so far:
    * Easier to write
    * Familiar to MATLAB users
    * Frequently uses global states
* Object-oriented syntax:
    * More powerful
    * More control over the plots
    * Preferable for library code
    * No (or at least very few) global states
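The global state behind the `plt.*` functions can be made visible with a small sketch: pyplot remembers a "current" figure and axes, and calls such as `plt.plot` act on whatever is current. Here `plt.sca` is used to select the current axes explicitly, while the second line is drawn through the axes object itself.

```python
import matplotlib.pyplot as plt
import numpy as np

t = np.linspace(0, 2 * np.pi, 100)

fig, (ax1, ax2) = plt.subplots(2, 1, constrained_layout=True)

# implicit interface: make ax1 current, then plt.plot draws into it
plt.sca(ax1)
plt.plot(t, np.sin(t))

# explicit interface: the target axes is stated in the call itself
ax2.plot(t, np.cos(t))

print(plt.gcf() is fig, plt.gca() is ax1)
```

With the explicit form there is no question about which axes receives the data, which is why it is usually preferred in functions and library code.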

In [None]:
import matplotlib.pyplot as plt
import numpy as np

t = np.linspace(0, 2*np.pi, 1000)


fig, (ax1, ax2) = plt.subplots(2, 1, constrained_layout=True)

# note that plot is now a method of ax1, not the global plt object
ax1.plot(t, np.sin(t), color='C0')
ax1.set_title(r"$f(t)=\sin(t)$")   # use object-oriented get/set syntax

ax2.plot(t, np.cos(t), color='C1')
ax2.set_title(r"$f(t)=\cos(t)$")

for ax in (ax1, ax2):
    ax.set_xlabel("$t$")
    ax.set_xlim(0, 2*np.pi)
    ax.set_ylim(-1.1, 1.1)


## Shared Axes and GridSpec

In [None]:
from math import factorial  # np.math is no longer available in current NumPy

def poisson(x, k):
    return np.exp(-x)*x**k / factorial(k)

x = np.linspace(0, 12, 40)
y = poisson(x, 2)
y_noise = y + np.random.normal(0, 0.01, len(y))
z = np.linspace(0, 12, 100)

gridspec = {'height_ratios': [2, 1]}

fig, (ax1, ax2) = plt.subplots(
    2, 1,
    sharex=True,
    gridspec_kw=gridspec,
    constrained_layout=True,
)

ax1.plot(x, y_noise, 'ko')
ax1.plot(z, poisson(z, 2))
ax1.set_ylim(-0.05, 0.30)
ax1.set_ylabel('Flux')
ax1.set_yticks(ax1.get_yticks()[1:])    # remove bottom y-tick

ax2.plot(x, y_noise - y, 'ko')
ax2.axhline(y=0, color='black', linestyle='--', linewidth=1)
ax2.set_xlabel('Energy')
ax2.set_ylim(-0.03, 0.04)
ax2.set_ylabel('Residuals')
ax2.set_yticks(ax2.get_yticks()[:-1])   # remove top y-tick

fig.suptitle('\nFake Spectrum', fontweight='bold');

# Secondary Axis

Astronomers often use a date format called Modified Julian Date (MJD),
continuously counting fractional days since 1858-11-17T00:00.


* I am not able to convert MJD to normal dates in my head
* Your audience probably is also not able to do it
* Solution: provide both a human-readable and an MJD axis

Matplotlib uses days since 1970-01-01T00:00:00 by default since version 3.3 for internal datetime representation.

New in matplotlib 3.1, Secondary Axis: https://matplotlib.org/3.1.0/gallery/subplots_axes_and_figures/secondary_axis.html

Note: this can also be done using `astropy.time.Time`
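Since MJD is just a count of days from a fixed epoch, the conversion can be sanity-checked with the standard library alone. The helper `datetime_to_mjd` is a hypothetical name for this sketch; it assumes timezone-naive datetimes in UTC and, like `datetime` itself, ignores leap seconds.

```python
from datetime import datetime

MJD_EPOCH = datetime(1858, 11, 17)


def datetime_to_mjd(date):
    """Fractional days since the MJD epoch (naive datetimes, assumed UTC)."""
    return (date - MJD_EPOCH).total_seconds() / 86400


print(datetime_to_mjd(datetime(1970, 1, 1)))       # 40587.0
print(datetime_to_mjd(datetime(2022, 1, 1, 12)))   # 59580.5
```

The first value is the offset between the MJD epoch and the default Matplotlib epoch mentioned above.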

In [None]:
from datetime import datetime, timedelta
from matplotlib.dates import get_epoch

# constants for ordinal and mjd date representation
MJD_EPOCH = datetime(1858, 11, 17)
MPL_EPOCH = datetime.fromisoformat(get_epoch())
DELTA = (MJD_EPOCH - MPL_EPOCH).total_seconds() / 86400


# convert functions from one unit to the other

def ordinal_to_mjd(ordinal):
    ''' Converts ordinal date (days since mpl epoch) to MJD (days since 1858-11-17T00:00)'''
    return ordinal - DELTA

def mjd_to_ordinal(mjd):
    return mjd + DELTA


# create some random data
n_on = np.random.poisson(60, 25)
n_off = np.random.poisson(30, 25)
n_signal = n_on - 0.2 * n_off
n_signal_err = np.sqrt(n_on + 0.2**2 * n_off)

# create some dates
dates = [datetime(2022, 1, 1) + timedelta(days=i) for i in range(25)]

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)

ax.errorbar(dates, n_signal, yerr=n_signal_err, ls='')


ax.set_ylim(0, 80)
ax.set_ylabel(r'Signal Rate / $\mathrm{h}^{-1}$')


fig.autofmt_xdate()

# create a second axis, using the same y-axis
ax_mjd = ax.secondary_xaxis('top', functions=(ordinal_to_mjd, mjd_to_ordinal))
ax_mjd.set_xlabel('MJD')

fig.tight_layout()

# Plots for Publications

* Use a full-blown LaTeX installation via the `pgf` backend
* Same font and font sizes as in your publication
* Really high quality, publication ready plots
* See example in `matplotlib_pgf`
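As a rough idea of what such a setup involves, the sketch below collects a few `pgf`-related rcParams. The concrete values (LuaLaTeX, the `amsmath` package, a 10 pt serif font) are assumptions for illustration and should be replaced by whatever the publication uses. This is not the content of the `matplotlib_pgf` example. Actually saving with the backend needs a working LaTeX installation, so that line is left commented out.

```python
import matplotlib as mpl

# illustrative values; adapt them to the document class of the publication
pgf_settings = {
    'pgf.texsystem': 'lualatex',               # or 'xelatex' / 'pdflatex'
    'pgf.rcfonts': False,                      # do not set up fonts from rcParams
    'pgf.preamble': r'\usepackage{amsmath}',
    'font.family': 'serif',
    'font.size': 10,                           # assumed body font size
}

# settings are validated on entry and restored on exit of the with-block
with mpl.rc_context(pgf_settings):
    print(mpl.rcParams['pgf.texsystem'])
    # fig.savefig('plot.pdf', backend='pgf')   # requires LaTeX to be installed
```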

# Styles

List available styles:

In [None]:
print(plt.style.available)

In [None]:
from scipy import stats

def plot_stuff():
    plt.figure(constrained_layout=True)
    
    plt.subplot(2, 2, 1)
    x = np.linspace(-1, 1, 1000)
    plt.plot(x, np.sin(50*x**3)/(x))
    plt.grid()

    plt.subplot(2, 2, 2)
    x = np.linspace(-1, 1, 10)
    y = np.exp(-2.2*x) + np.random.normal(0, 0.1, 10)
    yerr = np.abs(np.random.normal(0, 0.2, 10))  # error bar lengths must be non-negative
    plt.errorbar(x, y, yerr, fmt='o', capsize=3)
    plt.yscale('log')

    plt.subplot(2, 2, 3)
    x = stats.skewnorm.rvs(10, size=1000)
    plt.hist(x, bins=50)

    plt.subplot(2, 2, 4)
    x, y = np.mgrid[-1:1:.01, -1:1:.01]
    pos = np.dstack((x, y))
    z = stats.multivariate_normal([0.1, 0.3], [[0.2, 0.1], [0.1, 0.4]])  # covariance must be symmetric
    plt.contourf(x, y, z.pdf(pos))

# the seaborn styles carry a 'seaborn-v0_8' prefix since matplotlib 3.6
for plot_style in ['classic', 'bmh', 'fivethirtyeight', 'ggplot', 'seaborn-v0_8']:

    with plt.style.context(plot_style):   # use context manager so that changes are temporary
        plot_stuff()
        plt.suptitle('Plot Style: ' + plot_style, fontweight='bold')

# Saving figures into files

Use `plt.savefig` to save your figure.

You can give either a path relative to your working directory or an absolute path.
Not sure what the current working directory is?

In [None]:
pwd()

In [None]:
x = np.linspace(-5, 5)

plt.figure(constrained_layout=True)

plt.plot(x, x**3, marker='s')
plt.title("My Awesome Plot")

# save in current directory; extension determines file type
plt.savefig('awesome_plot.pdf')
plt.savefig('awesome_plot.svg')
plt.savefig('awesome_plot.eps') # old vector graphics format required by some journals
plt.savefig('awesome_plot.png', dpi=300)   # bitmap graphics; lossless compression; don't use me for publications!
plt.savefig('awesome_plot.jpg', dpi=300)   # bitmap graphics; lossy compression; don't use me either!

# relative path with subdirectory
# plt.savefig('build/awesome_plot.pdf')

# absolute path
# plt.savefig('/path/to/output/directory/awesome_plot.pdf')

# Animations

In [None]:
from matplotlib.animation import FuncAnimation
from tqdm.auto import tqdm

fps = 25
frames = 400
progress = tqdm(total=frames)

fig = plt.figure(figsize=(12.8, 7.2), dpi=100)  # 720p
ax = fig.add_subplot(1, 1, 1)
ax.set_ylim(-1.1, 1.1)
ax.set_xlim(0, 1)
ax.axvline(0.5, 0, 1, color='gray')
fig.tight_layout()

line, = ax.plot([], [])
dot, = ax.plot([], [], 'o', ms=20)
text = ax.text(0.05, 0.95, '', va='top', ha='left', transform=ax.transAxes, family='monospace', size=32)

x = np.linspace(0, 1, 200)


def wave(x, t, lamb=2 * np.pi * 3, omega=2* np.pi / 5):
    return np.sin(omega * t - lamb * x)

def init():
    # set_data expects sequences, also for a single point
    dot.set_data([0.5], [wave(0.5, 0)])
    line.set_data(x, wave(x, 0))
    text.set_text('t =  0.0 s')
    return dot, line, text

def update(frame):
    progress.update()
    
    t = frame / fps
    dot.set_ydata([wave(0.5, t)])
    line.set_ydata(wave(x, t))
    text.set_text(f't = {t:4.1f} s')
    return dot, line, text # return all plot objects that changed


ani = FuncAnimation(
    fig,
    update,
    init_func=init,
    frames=frames, 
    interval=1000/fps,
    repeat=False,
)

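# write the video file that is displayed in the next cell (requires ffmpeg)
ani.save('sine_wave.mp4', fps=fps, dpi=100)

plt.close(fig)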

In [None]:
from IPython.display import Video

Video('sine_wave.mp4', width=640)
# you might need to reload without caches after you made changes, e.g. ctrl + shift + r in Firefox

# 3D-Plots

Just a teaser, see https://matplotlib.org/mpl_toolkits/mplot3d/tutorial.html for more

In [None]:
x, y, z = rng.normal(0, 2, size=(3, 100))


fig = plt.figure()
ax = fig.add_subplot(projection='3d')

ax.scatter(x, y, z)