# Creating publication-quality plots

In this week's lesson we will discuss some general tips for creating high-quality plots using Matplotlib. The lesson is divided into two parts:

1. Creating publication-quality plots using Matplotlib
2. Considering accessibility in designing your plots

As in previous weeks, the lesson does not follow a strict plan; the content will be adjusted based on input from those attending the online session.

## Creating some fake data

First, we need to create some fake data that we can use for plotting. Rather than loading a data file we'll just generate some random data to give an idea of how the plot formatting can be applied. In our case we'll create two pandas DataFrames:

1. A set of x values and 4 sets of corresponding y values for plotting lines.
2. 1000 random x-y points, each with a random value that can be used to color the points on a scatter plot.

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

In [None]:
# DataFrame of lines

# Set number of points, x min and max, and offset between y-lines
num_points = 21
xmin = 0
xmax = 20
y_offset = 0.25

# Create DataFrame
lines_df = pd.DataFrame(columns=['x', 'y1', 'y2', 'y3', 'y4'])

# Create numpy arrays for dataframe columns
x = np.linspace(xmin, xmax, num_points)
y1 = np.random.rand(num_points)
y2 = np.random.rand(num_points) + y_offset
y3 = np.random.rand(num_points) + 2 * y_offset
y4 = np.random.rand(num_points) + 3 * y_offset

# Fill DataFrame with numpy values
lines_df['x'] = x
lines_df['y1'] = y1
lines_df['y2'] = y2
lines_df['y3'] = y3
lines_df['y4'] = y4

In [None]:
# DataFrame of scatter points

# Set number of points, x and y max
num_points = 1000
xmax = 20
ymax = 20

# Create DataFrame
scatter_df = pd.DataFrame(columns=['x', 'y', 'color'])

# Create numpy arrays for dataframe values
x_pts = np.random.rand(num_points) * xmax
y_pts = np.random.rand(num_points) * ymax
color = np.random.rand(num_points)

# Fill DataFrame values
scatter_df['x'] = x_pts
scatter_df['y'] = y_pts
scatter_df['color'] = color

## Creating publication-quality plots using Matplotlib

### Matplotlib style sheets

Matplotlib has many different built-in styles that can be used for formatting the visual appearance of the plot. Many of these are nicer looking than the default plot settings. You can find information about the available plot styles in the [Matplotlib style sheets reference](https://matplotlib.org/stable/gallery/style_sheets/style_sheets_reference.html).

In [None]:
# Plot data for two lines
lines_df[['y1', 'y2']].plot()

You can specify a plot style to use with the `plt.style.use()` function.

In [None]:
# Define plot style
# Note: Matplotlib 3.6 renamed the seaborn styles; use "seaborn" with older versions
plot_style = "seaborn-v0_8"
plt.style.use(plot_style)

# Plot data for two lines
lines_df[['y1', 'y2']].plot()
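
Style names vary between Matplotlib versions, so it can help to list what is installed before picking one. The sketch below also applies a style temporarily with `plt.style.context()`, which leaves the global settings untouched. It uses small stand-in arrays instead of `lines_df`, and the choice of the `ggplot` style is only an illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# List the style names available in the installed Matplotlib version
available_styles = plt.style.available
print(available_styles)

# Stand-in data (an assumption, not the lesson's lines_df)
x = np.linspace(0, 20, 21)
y = np.sin(x / 3)

# Apply a style only inside the with block; global settings stay unchanged
with plt.style.context("ggplot"):
    fig, ax = plt.subplots()
    ax.plot(x, y)
```

Using the context manager is convenient when only one figure in a notebook should differ from the rest.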

### Using colormaps for line colors

Matplotlib also has a large number of [built-in colormaps](https://matplotlib.org/stable/tutorials/colors/colormaps.html) that can be used to define the colors of plot objects, often for things like filled contour plots. You can also use a colormap to color the lines on a plot, which is handy if you have a series of lines for different time periods, for example. To do this, call `plt.cm.colormap()`, replacing the word `colormap` with the name of a Matplotlib colormap, and pass it an array of values between 0 and 1. The call returns one color for each value. Let's see an example.

In [None]:
# Original plot, now for four lines
lines_df[['y1', 'y2', 'y3', 'y4']].plot()

In [None]:
# Define colors to use from inferno colormap
colors = plt.cm.inferno(np.linspace(0, 1, 4))

# Modified plot with colormap colors for four lines
lines_df[['y1', 'y2', 'y3', 'y4']].plot(color=colors)
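
The same idea can be wrapped in a small helper so it works for any number of lines. This is a minimal sketch: `line_colors` is a hypothetical helper name, the data are stand-in arrays rather than `lines_df`, and stopping at 0.85 is an assumption made to avoid the very pale end of `inferno`, which can be hard to see on a white background.

```python
import matplotlib.pyplot as plt
import numpy as np


def line_colors(cmap_name, n_lines, start=0.0, stop=0.85):
    """Return n_lines RGBA colors sampled evenly from part of a colormap."""
    cmap = plt.get_cmap(cmap_name)
    return cmap(np.linspace(start, stop, n_lines))


# Hypothetical helper applied to the inferno colormap
colors = line_colors("inferno", 4)

# Stand-in data: four offset curves
x = np.linspace(0, 20, 21)
fig, ax = plt.subplots()
for i, color in enumerate(colors):
    ax.plot(x, np.sin(x / 3) + 0.25 * i, color=color, label=f"y{i + 1}")
ax.legend()
```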

### More advanced subplot layouts
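
One possible approach to layouts that go beyond a regular grid is a `GridSpec`, created here with `fig.add_gridspec()`. The sketch below, using stand-in data and an arbitrary figure size, places one wide panel above two smaller ones.

```python
import matplotlib.pyplot as plt
import numpy as np

# Stand-in data
x = np.linspace(0, 20, 21)

# Create a figure with a 2 x 2 grid of cells
fig = plt.figure(figsize=(8, 6))
gs = fig.add_gridspec(nrows=2, ncols=2)

# The top panel spans both columns; the bottom panels use one cell each
ax_top = fig.add_subplot(gs[0, :])
ax_left = fig.add_subplot(gs[1, 0])
ax_right = fig.add_subplot(gs[1, 1])

ax_top.plot(x, np.sin(x / 3))
ax_left.plot(x, np.cos(x / 3))
ax_right.scatter(x, x ** 2)

ax_top.set_title("Wide panel")
ax_left.set_title("Lower left")
ax_right.set_title("Lower right")

# Adjust spacing so titles and tick labels do not overlap
fig.tight_layout()
```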



### Formatting plot ticks
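
As a minimal sketch of tick formatting, the `matplotlib.ticker` module provides locators (where ticks go) and formatters (how tick labels look). The tick spacing, label format, and tick lengths below are arbitrary choices for illustration, and the data are stand-in values.

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import FormatStrFormatter, MultipleLocator

# Stand-in data
x = np.linspace(0, 20, 21)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x / 3))

# Major x ticks every 5 units, minor x ticks every 1 unit
ax.xaxis.set_major_locator(MultipleLocator(5))
ax.xaxis.set_minor_locator(MultipleLocator(1))

# Show y tick labels with two decimal places
ax.yaxis.set_major_formatter(FormatStrFormatter("%.2f"))

# Draw ticks pointing into the plot, with longer major ticks
ax.tick_params(axis="both", which="major", direction="in", length=6)
ax.tick_params(axis="both", which="minor", direction="in", length=3)
```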



### Other handy tips

#### Reversing plot axes
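
A common case for a reversed axis is a depth profile, where depth should increase downward. The sketch below uses `ax.invert_yaxis()`; the depth and temperature values are made up for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# Made-up profile: temperature decreasing with depth
depth = np.linspace(0, 100, 11)
temperature = 20 - 0.15 * depth

fig, ax = plt.subplots()
ax.plot(temperature, depth)
ax.set_xlabel("Temperature")
ax.set_ylabel("Depth")

# Flip the y-axis so that depth increases downward
ax.invert_yaxis()
```

`ax.invert_xaxis()` does the same for the x-axis.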

#### Filling between lines
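
One way to shade the region between two curves, for example an uncertainty band around a mean, is `ax.fill_between()`. This is a sketch with stand-in data and an assumed constant uncertainty of 0.2.

```python
import matplotlib.pyplot as plt
import numpy as np

# Stand-in data: a curve with a constant (assumed) uncertainty
x = np.linspace(0, 20, 21)
y_mean = np.sin(x / 3)
y_err = 0.2

fig, ax = plt.subplots()
ax.plot(x, y_mean, color="C0", label="mean")

# Shade the area between the lower and upper bounds
band = ax.fill_between(
    x, y_mean - y_err, y_mean + y_err, color="C0", alpha=0.3, label="uncertainty"
)
ax.legend()
```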



#### Using f-strings for plot labels and titles
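
An f-string lets you insert variable values, optionally formatted, directly into label and title text. In this sketch the `units` variable and the data are hypothetical placeholders.

```python
import matplotlib.pyplot as plt
import numpy as np

# Stand-in data and a hypothetical unit label
x = np.linspace(0, 20, 21)
y = np.linspace(0, 1, 21)
units = "km"

fig, ax = plt.subplots()
ax.plot(x, y)

# Insert values into the text; :.2f formats a number with two decimals
ax.set_xlabel(f"Distance ({units})")
ax.set_title(f"{len(x)} points, mean y = {y.mean():.2f}")
```

Because the text is built from the variables, the labels stay correct if the data or units change.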

### Sometimes manual editing is still required (or faster)

Practically speaking, I do most of the plot formatting in Python, but often clean things up in another program such as Adobe Illustrator, Inkscape, CorelDRAW, or Affinity Designer. Certain formatting processes are simply easier to customize by hand. (Geochron paper example)

## Considering accessibility in plot design

### Tips for creating line plots
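
One widely used tip, offered here as a suggestion rather than a complete list, is to avoid relying on color alone to tell lines apart. The sketch below gives each line its own line style and marker, so the lines remain distinguishable in grayscale or for colorblind viewers. The data are stand-in values.

```python
import matplotlib.pyplot as plt
import numpy as np

# Stand-in data: four offset curves
x = np.linspace(0, 20, 21)
linestyles = ["-", "--", "-.", ":"]
markers = ["o", "s", "^", "D"]

fig, ax = plt.subplots()
for i, (linestyle, marker) in enumerate(zip(linestyles, markers)):
    # Each line differs in style and marker, not only in color
    ax.plot(
        x,
        np.sin(x / 3) + 0.25 * i,
        linestyle=linestyle,
        marker=marker,
        markersize=4,
        label=f"y{i + 1}",
    )
ax.legend()
```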

### Choosing colormaps

- sequential: Lightness increases monotonically in the colors
- diverging: Color diverges on either side of a neutral value in the middle of the color range
- cyclic: Colors at either end of the colormap are equal
- qualitative: Often used to classify data values that are discrete
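
To get a feel for the four categories, the sketch below draws one colormap from each as a horizontal color strip. The specific colormaps (`viridis`, `RdBu`, `twilight`, `tab10`) are just one example per category, chosen for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# One example colormap per category (illustrative choices)
examples = {
    "sequential": "viridis",
    "diverging": "RdBu",
    "cyclic": "twilight",
    "qualitative": "tab10",
}

# A 1 x 256 image whose values run from 0 to 1
gradient = np.linspace(0, 1, 256).reshape(1, -1)

fig, axes = plt.subplots(nrows=len(examples), figsize=(6, 3))
for ax, (kind, name) in zip(axes, examples.items()):
    # Draw the gradient with the colormap and label the strip
    ax.imshow(gradient, aspect="auto", cmap=name)
    ax.set_title(f"{kind}: {name}", fontsize=9, loc="left")
    ax.set_axis_off()
fig.tight_layout()
```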

#### Supporting colorblind viewers

Matplotlib has several perceptually uniform colormaps available:

- viridis
- plasma
- inferno
- magma
- cividis
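
Any of these colormaps can be selected by name with the `cmap` parameter. The sketch below colors scatter points with `cividis`; it generates its own random points (with a fixed seed, which is an assumption added for reproducibility) in the same spirit as `scatter_df`.

```python
import matplotlib.pyplot as plt
import numpy as np

# Stand-in random points, similar in spirit to scatter_df
rng = np.random.default_rng(0)
x_pts = rng.random(1000) * 20
y_pts = rng.random(1000) * 20
color = rng.random(1000)

fig, ax = plt.subplots()

# Color the points by value using a perceptually uniform colormap
points = ax.scatter(x_pts, y_pts, c=color, cmap="cividis", s=15)

# Add a colorbar so the color values can be read off the plot
fig.colorbar(points, ax=ax, label="color value")
```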

#### Avoiding data distortion
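
One common reading of data distortion, assumed here, is that a colormap with uneven lightness can make some changes in the data look larger or smaller than they are. The sketch below compares a rough lightness estimate for `jet` and `viridis`. The `approx_lightness` helper is hypothetical and uses a simple weighted sum of the RGB channels, which is only a crude proxy for perceived lightness.

```python
import matplotlib.pyplot as plt
import numpy as np


def approx_lightness(cmap_name, n=256):
    """Rough lightness proxy: weighted sum of the RGB channels (Rec. 709 weights)."""
    rgba = plt.get_cmap(cmap_name)(np.linspace(0, 1, n))
    return rgba[:, :3] @ np.array([0.2126, 0.7152, 0.0722])


fig, ax = plt.subplots()
for name in ["jet", "viridis"]:
    # Plot the lightness proxy along the full range of each colormap
    ax.plot(np.linspace(0, 1, 256), approx_lightness(name), label=name)
ax.set_xlabel("Position in colormap")
ax.set_ylabel("Approximate lightness")
ax.legend()
```

In this rough measure, `jet` is brightest somewhere in the middle of its range, while `viridis` ends brighter than it starts.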



## Resources

- [s-ink science graphics collection](https://s-ink.org/)
- [Scientific color maps](https://www.fabiocrameri.ch/colourmaps/)
- [Matplotlib plot gallery](https://matplotlib.org/stable/gallery/index.html)
- [Coblis colorblindness simulator](https://www.color-blindness.com/coblis-color-blindness-simulator/)
- [SciPy 2015 conference presentation on perceptually uniform colormaps](https://youtu.be/xAoljeRJ3lU)
