# Plotting with matplotlib

So far we have resorted to printing data to the screen whenever we've wanted to inspect the state of variables. Being able to represent data visually is key to interpreting what the data shows. Here we will explore how to make publication-quality plots using the module [`matplotlib`](https://matplotlib.org).

Before we dig into the details, it is worth pointing out that `matplotlib` offers some very helpful resources for finding out how to make and customise many different types of plot. In particular, it is always worth reviewing their [gallery](https://matplotlib.org/stable/gallery/index.html) or the more focused [plot types](https://matplotlib.org/stable/plot_types/index.html) page, both of which show a large number of different plots. Clicking on any of these plots shows you the code used to generate it, which is very helpful when trying to find out how to do something. They also provide some [cheatsheets](https://matplotlib.org/cheatsheets/), which are handy to refer to when you can't remember a command or option.

## Importing

In almost all examples across many resources people tend to use the following convention to import matplotlib

In [None]:
import matplotlib.pyplot as plt

and we will stick with this convention here. We will also import `numpy` to help us with generating some data to plot.

In [None]:
import numpy as np

## Visualising an array

Suppose we have a set of measurements in an array and we just want to get an idea of what the data looks like. We can simply plot it with

In [None]:
# First generate some pretend data
data = np.array([10.0,10.123,14.0,9.05,4.0,15.0])
plt.plot(data)
plt.show()

This will show a jagged blue line varying between 4 and 15 across an x-range of 0 to 5. Assuming these are repeat measurements of some quantity, it doesn't really make sense to draw a line between the values. We can change that by adding a marker and removing the line:

In [None]:
plt.plot(data, linestyle='', marker='o')
plt.show()
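
When `plot` is given a single array, the x values are the array indices, which is why the x-range above runs from 0 to 5. The following minimal sketch checks this by inspecting the line object that `plt.plot` returns. It also uses the format-string shorthand `'o'`, which should behave like `linestyle='', marker='o'`. The names `line` and `x_values` are chosen purely for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

data = np.array([10.0, 10.123, 14.0, 9.05, 4.0, 15.0])

# plt.plot returns a list of line objects; we only drew one, so unpack it.
# The format string 'o' asks for circle markers with no connecting line.
(line,) = plt.plot(data, 'o')

# No x values were supplied, so the indices 0, 1, ..., len(data) - 1 are used
x_values = line.get_xdata()
print(x_values)
plt.show()
```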

## Plotting y against x

More typically we might want to plot some values against some other quantity. For example, suppose we measure the period of an oscillating pendulum as its length changes and we want to plot this.

In [None]:
# Record our measurements
length = np.array([1,2,3,4,5]) # Known lengths
period = np.array([2.00486328, 2.83530484, 3.47252507, 4.00972656, 4.48301058]) # Measured periods

#Plot period vs length
plt.plot(length, period, marker = 'x', color = 'green')
plt.show()

This looks OK, but how will anyone know what this plot represents? We need to add labels:

In [None]:
#Plot period vs length
plt.plot(length, period, marker = 'x', color = 'green')
plt.xlabel('Length (m)')
plt.ylabel('Period (s)')
plt.title('Measured pendulum period as a function of string length')
plt.show()

Suppose we want to indicate that there's some uncertainty on our measurements? We can use the [`errorbar`](https://matplotlib.org/stable/api/_as_gen/matplotlib.axes.Axes.errorbar.html#matplotlib.axes.Axes.errorbar) plot type:

In [None]:
#Plot period vs length
plt.errorbar(length, period, xerr = 0.0, yerr=0.1, marker = 'o', color = 'green', ecolor = 'red')
plt.xlabel('Length (m)')
plt.ylabel('Period (s)')
plt.title('Measured pendulum period as a function of string length')
plt.show()
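
The call above uses the same uncertainty for every point. `errorbar` also accepts an array for `yerr` (and `xerr`), so each point can have its own error bar. In the sketch below the values in `period_err` are invented for illustration and are not real measurement uncertainties; `capsize` adds small caps to the ends of the bars.

```python
import matplotlib.pyplot as plt
import numpy as np

length = np.array([1, 2, 3, 4, 5])  # Known lengths
period = np.array([2.00486328, 2.83530484, 3.47252507, 4.00972656, 4.48301058])  # Measured periods

# Hypothetical per-point uncertainties (made up for this sketch)
period_err = np.array([0.05, 0.08, 0.10, 0.12, 0.15])

# One error bar size per data point; capsize is the cap length in points
container = plt.errorbar(length, period, yerr=period_err, marker='o',
                         color='green', ecolor='red', capsize=3)
plt.xlabel('Length (m)')
plt.ylabel('Period (s)')
plt.title('Measured pendulum period as a function of string length')
plt.show()
```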

## Plotting multiple curves

Suppose we repeat our pendulum experiment three times. Do we need to make three plots? No, we can simply add more plot statements before we show the image.

In [None]:
# Add other measurements
period2 = np.array([2.0312352 , 2.87260036, 3.51820256, 4.0624704 , 4.54197998])
period3 = np.array([1.95467808, 2.76433225, 3.38560174, 3.90935615, 4.37079305])

#Plot period vs length, we use label to add a name to each curve
plt.errorbar(length, period, xerr = 0.0, yerr=0.1, marker = 'o', ecolor = 'red', label = 'Case 1')
plt.errorbar(length, period2, xerr = 0.0, yerr=0.1, marker = 'o', ecolor = 'red', label = 'Case 2')
plt.errorbar(length, period3, xerr = 0.0, yerr=0.1, marker = 'o', ecolor = 'red', label = 'Case 3')
plt.xlabel('Length (m)')
plt.ylabel('Period (s)')
plt.title('Measured pendulum period as a function of string length')
# Show a legend so we know which curve is which
plt.legend(loc='best')
plt.show()
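
With more data sets, repeating the `errorbar` line by hand becomes tedious. One possible alternative, sketched here, is to collect the arrays in a list and loop over them, building each label from the loop counter. The list name `all_periods` is an assumption made for this example.

```python
import matplotlib.pyplot as plt
import numpy as np

length = np.array([1, 2, 3, 4, 5])
period = np.array([2.00486328, 2.83530484, 3.47252507, 4.00972656, 4.48301058])
period2 = np.array([2.0312352, 2.87260036, 3.51820256, 4.0624704, 4.54197998])
period3 = np.array([1.95467808, 2.76433225, 3.38560174, 3.90935615, 4.37079305])

# Collect the repeated measurements so we can loop over them
all_periods = [period, period2, period3]

# enumerate(..., start=1) gives the case number used in each label
for case, measured in enumerate(all_periods, start=1):
    plt.errorbar(length, measured, yerr=0.1, marker='o', ecolor='red',
                 label=f'Case {case}')

plt.xlabel('Length (m)')
plt.ylabel('Period (s)')
plt.title('Measured pendulum period as a function of string length')
legend = plt.legend(loc='best')
plt.show()
```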

It can be helpful to transform the plot from a linear scale to a log one. We can use `xscale` and `yscale` to control this, e.g.

In [None]:
#Plot period vs length, we use label to add a name to each curve
plt.errorbar(length, period, xerr = 0.0, yerr=0.1, marker = 'o', ecolor = 'red', label = 'Case 1')
plt.errorbar(length, period2, xerr = 0.0, yerr=0.1, marker = 'o', ecolor = 'red', label = 'Case 2')
plt.errorbar(length, period3, xerr = 0.0, yerr=0.1, marker = 'o', ecolor = 'red', label = 'Case 3')
plt.xlabel('Length (m)')
plt.ylabel('Period (s)')
plt.title('Measured pendulum period as a function of string length')
plt.xscale('log')
plt.yscale('log')
# Show a legend so we know which curve is which
plt.legend(loc='best')
# Add a grid just because it looks nice and helps demonstrate the logarithmic scale
plt.grid(True, which='both')
plt.show()

Note how these now look like straight lines. This suggests a power-law dependence (i.e. period is proportional to length to some power).
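
A straight line on log-log axes with slope $n$ corresponds to period being proportional to length to the power $n$. One way to estimate that exponent, sketched here, is to fit a straight line to the logarithms of the data with `np.polyfit`. This is an unweighted least-squares fit that ignores the error bars, and the names `exponent`, `log_prefactor` and `prefactor` are illustrative.

```python
import numpy as np

length = np.array([1, 2, 3, 4, 5])
period = np.array([2.00486328, 2.83530484, 3.47252507, 4.00972656, 4.48301058])

# Fit log(period) = exponent * log(length) + log_prefactor.
# polyfit returns the coefficients with the highest power first.
exponent, log_prefactor = np.polyfit(np.log(length), np.log(period), 1)
prefactor = np.exp(log_prefactor)

print(f'period is approximately {prefactor:.3f} * length^{exponent:.3f}')
```

For this data the fitted exponent comes out very close to 0.5, consistent with the period scaling as the square root of the length.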

## Saving figures

It is very helpful to be able to save the figures you create directly, rather than, say, taking a screenshot. Fortunately `matplotlib` provides the [`savefig`](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.savefig.html) function for this. Call it after creating the plot but before `plt.show()`, since the figure may no longer be available once `plt.show()` has returned. The output format (png, pdf etc.) can also be controlled. A simple example is

In [None]:
#Plot period vs length, we use label to add a name to each curve
plt.errorbar(length, period, xerr = 0.0, yerr=0.1, marker = 'o', ecolor = 'red', label = 'Case 1')
plt.errorbar(length, period2, xerr = 0.0, yerr=0.1, marker = 'o', ecolor = 'red', label = 'Case 2')
plt.errorbar(length, period3, xerr = 0.0, yerr=0.1, marker = 'o', ecolor = 'red', label = 'Case 3')
plt.xlabel('Length (m)')
plt.ylabel('Period (s)')
plt.title('Measured pendulum period as a function of string length')
plt.xscale('log')
plt.yscale('log')
# Show a legend so we know which curve is which
plt.legend(loc='best')
# Add a grid just because it looks nice and helps demonstrate the logarithmic scale
plt.grid(True, which='both')
plt.savefig('example.png', dpi = 300) #Set a high dpi for this example
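
By default `savefig` works out the format from the file name extension, so changing the extension is usually enough to change the format. The sketch below saves the same figure as both a PDF and a PNG. The file names are chosen for illustration, and `bbox_inches='tight'` is an optional setting that trims surplus whitespace around the plot.

```python
import matplotlib.pyplot as plt
import numpy as np

length = np.array([1, 2, 3, 4, 5])
period = np.array([2.00486328, 2.83530484, 3.47252507, 4.00972656, 4.48301058])

plt.plot(length, period, marker='x', color='green')
plt.xlabel('Length (m)')
plt.ylabel('Period (s)')

# The format is taken from the extension of the (illustrative) file name
plt.savefig('pendulum_example.pdf')  # PDF is a vector format
plt.savefig('pendulum_example.png', dpi=300, bbox_inches='tight')
```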

## Using LaTeX in labels

We would often like to make use of symbols, superscripts etc. This is relatively straightforward in matplotlib using LaTeX notation. For example, suppose we want to plot the electrostatic potential, $\phi$, squared against some distance $\rho_e$. We can do the following

In [None]:
rho = np.array([0, 1, 2, 3, 4])
phi = np.array([5, 4, 3, 2, 1])
# Plot the potential squared so the curve matches the y-axis label
plt.plot(rho, phi**2)
plt.xlabel(r'$\rho_e$')
plt.ylabel(r'$\phi^2$')
plt.show()

Specifically, we put `r` in front of the first quote. This makes the string a raw string, which means Python leaves the backslashes alone rather than treating them as the start of an escape sequence. Then, inside the string, we start a LaTeX section by adding a `$`, and when we have finished writing LaTeX we add another `$`.
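
To see why the raw string matters, compare the same label written with and without the `r` prefix. In an ordinary string Python reads `\r` as a carriage return character, so the backslash that LaTeX needs is lost. This small sketch uses only plain Python strings, and the names `plain` and `raw` are illustrative.

```python
plain = '$\rho_e$'   # '\r' is interpreted as a single carriage return character
raw = r'$\rho_e$'    # the backslash is kept literally, as LaTeX needs

print(len(plain))      # 7
print(len(raw))        # 8
print('\\' in plain)   # False: the backslash has gone
print('\\' in raw)     # True
```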