--- 
# Introduction to Matplotlib

---
# 1. Matplotlib in the Wild

Matplotlib is a powerful plotting library that can generate a [wide range of plot types](https://matplotlib.org/stable/tutorials/introductory/sample_plots.html#sphx-glr-tutorials-introductory-sample-plots-py).

In this tutorial, we focus only on x/y plots. 

## 1.1 Import matplotlib

In [None]:
import matplotlib.pyplot as plt

## 1.2 Matplotlib Jargon

The _figure_ is the window or exportable area that can contain one or more plots.

The _axes_ is the plot itself.

Our general process: 
1. Make a _figure_ with one or more _axes_ in it
2. Populate the _axes_ with our data
3. Format the _axes_ and the _figure_
4. Export our _figure_ for use elsewhere
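
The four steps above can be sketched roughly as follows. The data values, labels and output file name are placeholders chosen for illustration, and the figure is written to a temporary folder so the sketch does not clutter your working directory.

```python
import tempfile
from pathlib import Path

import matplotlib.pyplot as plt

# 1. Make a figure with one axes in it
fig, ax = plt.subplots(1, 1, figsize=(6, 3))

# 2. Populate the axes with data (placeholder values)
ax.plot([3, 5, 9], [10, 20, 30])

# 3. Format the axes and the figure
ax.set_xlabel('x')
ax.set_ylabel('y')
fig.suptitle('Example figure')

# 4. Export the figure (hypothetical file name in a temporary folder)
out_file = Path(tempfile.mkdtemp()) / 'example_figure.png'
fig.savefig(out_file, dpi=300)
```

Each of these steps is covered in more detail in the sections that follow.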

---
# 2. A Pick and Mix of Syntax

There are many ways to make a matplotlib plot. 

We will start with a live demo of two common approaches and then discuss other variations.

## 2.1 Demo: Pyplot-style plot

- Best for a quick look at your data or for a figure with just one plot
- Generates the figure and axis at the same time 
- Not recommended for functions or scripts that will be reused as part of a larger project

In [None]:
x = [3,5,9]
y = [10,20,30]
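
The cell above is left open for the live demo. One possible way to complete it, as a minimal sketch rather than the exact code used in the demo, is shown here. The axis labels and title are placeholders.

```python
import matplotlib.pyplot as plt

x = [3, 5, 9]
y = [10, 20, 30]

# pyplot creates the figure and axes implicitly on the first plotting call
plt.plot(x, y)

# formatting calls act on whichever axes is currently active
plt.xlabel('x')
plt.ylabel('y')
plt.title('Pyplot-style plot')
```

Because nothing is named explicitly, every `plt.` call relies on matplotlib's notion of the "current" figure and axes, which is why this style gets harder to manage in larger projects.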



## 2.2 Demo: Object-oriented style plot

- Best for figure layouts with one or more plots that are the same size
- Figure and axis are specified explicitly
- Suitable for functions or scripts that will be reused in a larger project

In [None]:
# one plot (axis) in one figure

x = [3,5,9]
y = [10,20,30]
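
The cell above is left open for the live demo. A minimal sketch of one way to fill it in, with placeholder labels, might look like this:

```python
import matplotlib.pyplot as plt

x = [3, 5, 9]
y = [10, 20, 30]

# create the figure and a single axes explicitly
fig, ax = plt.subplots(1, 1, figsize=(6, 3))

# plot into the axes, then format it through the same object
ax.plot(x, y)
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_title('Object-oriented plot')
```

Note that the formatting methods are called on `ax` (for example `ax.set_xlabel`) rather than on `plt`.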




In [None]:
# two plots (axis) in one figure

x = [3,5,9]
y = [10,20,30]
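
For two plots, one possible completion of the cell above (a sketch, assuming a side-by-side layout; the names `ax0` and `ax1` are arbitrary) unpacks one name per axes:

```python
import matplotlib.pyplot as plt

x = [3, 5, 9]
y = [10, 20, 30]

# one row, two columns: subplots returns the two axes, unpacked here by name
fig, (ax0, ax1) = plt.subplots(1, 2, figsize=(8, 3))

ax0.plot(x, y)      # line plot in the left axes
ax1.scatter(x, y)   # scatter plot in the right axes

# adjust spacing so labels do not overlap
fig.tight_layout()
```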




In [None]:
# Four plots (axis) in one figure

x = [3,5,9]
y = [10,20,30]
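
For four plots, a sketch of one possible completion of the cell above uses nested unpacking, with one tuple per row. The axes names and the choice of what to draw in each are placeholders.

```python
import matplotlib.pyplot as plt

x = [3, 5, 9]
y = [10, 20, 30]

# two rows, two columns: each inner tuple holds the axes of one row
fig, ((ax0, ax1), (ax2, ax3)) = plt.subplots(2, 2, figsize=(6, 6))

ax0.plot(x, y)              # top left
ax1.scatter(x, y)           # top right
ax2.plot(x, y, marker='o')  # bottom left
ax3.scatter(y, x)           # bottom right

fig.tight_layout()
```

With this approach you need to remember which name belongs to which position in the grid.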




## 2.3 Two Other Object Oriented Approaches

### (1) Define axis locations as you plot into them

- Similar syntax to the example above, but we use a general term (axs) in the first line and then specify the row, column location of a plot as we use it
- Positive = you don't have to keep track of which axis is in which location
- Negative = more code on each line
- Best for complex figure layouts with multiple plots of the same size
- Figure and axis are specified explicitly
- Suitable for functions or scripts that will be reused in a larger project

In [None]:
fig, axs = plt.subplots(2, 2, figsize=(6, 6))

# [row, column]
axs[0,0].plot(x,y)
axs[1,0].scatter(x,y)
axs[0,1].plot()
axs[1,1].plot()

fig.tight_layout()

### (2) GridSpec for different sized plots (axis)

- Enables you to define plots (axis) of various sizes by setting the width or height of columns and rows
- Best for complex figure layouts with multiple plots of various sizes
- Figure and axis are specified explicitly
- Suitable for functions or scripts that will be reused in a larger project

For more on GridSpec: https://matplotlib.org/stable/gallery/userdemo/demo_gridspec03.html

In [None]:
from matplotlib.gridspec import GridSpec

x = [3,5,9]
y = [10,20,30]

# make the figure
fig = plt.figure(figsize=(6,6))

# define the gridspec
spec = GridSpec(
    ncols=2, 
    nrows=2, 
    width_ratios=[1,2], 
    height_ratios=[3,1]
)

# relate the gridspec to axis
ax0 = fig.add_subplot(spec[0])
ax1 = fig.add_subplot(spec[1])
ax2 = fig.add_subplot(spec[2])
ax3 = fig.add_subplot(spec[3])

# make your plots
ax0.plot(x,y)
ax1.scatter(x,y)

# fix the layout
fig.tight_layout()


---
# 3. Matplotlib and Pandas

Matplotlib works well with Pandas.

## 3.1 Import Pandas

In [None]:
import pandas as pd

## 3.2 Import Data as a Pandas Dataframe

In [None]:
# import data from a csv file
p3 = pd.read_csv('Data/P3.csv')

# import data from an Excel file
tvz = pd.read_excel('Data/TVZ.xlsx', sheet_name='Data')

## 3.3 Look at the Dataframes

In [None]:
# columns, info, describe, head, tail

print(tvz.columns)
print(p3.columns)
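
The other inspection tools listed in the cell comment can be tried in the same way. The sketch below uses a small made-up DataFrame (the column names and values are placeholders, not the tutorial data) so that it runs without the data files.

```python
import pandas as pd

# placeholder DataFrame standing in for data loaded from a file
df = pd.DataFrame({
    'Depth_m': [100.0, 200.0, 300.0, 400.0],
    'Porosity_pct': [12.5, 10.1, 8.3, 7.9],
})

print(df.columns)     # column names
df.info()             # column dtypes and non-null counts (prints directly)
print(df.describe())  # summary statistics for the numeric columns
print(df.head(2))     # first two rows
print(df.tail(2))     # last two rows
```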

## 3.4 Plot from a Dataframe

Data can be passed into a matplotlib plot directly from a Pandas dataframe.

In [None]:
fig, ax = plt.subplots(1,1,figsize=(6,3))

ax.scatter(
    tvz.EffectivePorosity_VolPercent, # x axis data
    tvz.SampleDepth_mMD, # y axis data
)

ax.scatter(
    p3['Value effective porosity (%)'], # syntax when there is a space in column name
    p3.MaxDepth_mTVD, # syntax when there is not a space in the column name
);


Strictly speaking, we should convert the columns of the Pandas Dataframe into an np.array by using .values (demo below). 

However, I'm yet to encounter an issue plotting a column of data from a Pandas Dataframe without doing this.

In [None]:
# Same as above but with columns converted to np.array

fig, ax = plt.subplots(1,1,figsize=(6,3))

ax.scatter(
    tvz.EffectivePorosity_VolPercent.values, 
    tvz.SampleDepth_mMD.values,
)

ax.scatter(
    p3['Value effective porosity (%)'].values, 
    p3.MaxDepth_mTVD.values,
);
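
As a side note, the pandas documentation recommends `.to_numpy()` as the preferred alternative to `.values`. The sketch below uses a made-up DataFrame (placeholder column names and values) to check that plotting a column directly and plotting its NumPy array give the same points.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# placeholder data, not the tutorial datasets
df = pd.DataFrame({
    'porosity': [12.5, 10.1, 8.3],
    'depth': [100.0, 200.0, 300.0],
})

fig, (ax0, ax1) = plt.subplots(1, 2, figsize=(6, 3))

# left: pass the DataFrame columns directly
pts_series = ax0.scatter(df.porosity, df.depth)

# right: convert each column to a NumPy array first
pts_array = ax1.scatter(df.porosity.to_numpy(), df.depth.to_numpy())

# compare the plotted (x, y) positions of the two scatter plots
same = np.allclose(pts_series.get_offsets(), pts_array.get_offsets())
print(same)
```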

---
# 4. Save Your Figure

In [None]:
fig, ax = plt.subplots(1,1)

ax.scatter([5,2],[10,15])



# Can save a figure as...
# Raster formats: png (default format) or jpg 
# Vector formats: pdf or svg 

# other kwargs to try...
# dpi=300 (300 is good for general use, 500-600 is high resolution)
# facecolor='w' (sets the colour of the area outside the plot)
# transparent=True (makes the figure and plot backgrounds transparent; the default is False)

# If the figure is cut off by the export, use the following kwarg to fit the export to the plot
# bbox_inches='tight' 
# If using the above kwarg, you can add padding around the tight frame using the following kwarg
# pad_inches=0.1

# For more information on saving figures:
# https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.savefig.html
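
The save call itself is left open for the live demo in the cell above. A minimal sketch of what it could look like follows, using `fig.savefig` with the kwargs listed in the comments. The file names are placeholders, and the files go to a temporary folder here; swap `out_dir` for your own folder in practice.

```python
import tempfile
from pathlib import Path

import matplotlib.pyplot as plt

fig, ax = plt.subplots(1, 1)
ax.scatter([5, 2], [10, 15])

# temporary output folder (an assumption for this sketch)
out_dir = Path(tempfile.mkdtemp())

# raster export: the format is inferred from the file extension
fig.savefig(
    out_dir / 'YourFigure.png',
    dpi=300,
    facecolor='w',
    bbox_inches='tight',
    pad_inches=0.1,
)

# vector export: dpi matters much less here because lines and text stay as vectors
fig.savefig(out_dir / 'YourFigure.pdf', bbox_inches='tight')
```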


---
# 5. Handy Hints

If you are making and then saving dozens of figures inside a for loop, then it is useful to close each figure after saving that figure so you don't run out of memory. Use "plt.close()" to do this. 
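
A sketch of that pattern is shown here. The dataset names, values and output folder are placeholders for illustration.

```python
import tempfile
from pathlib import Path

import matplotlib.pyplot as plt

# temporary output folder and made-up datasets (assumptions for this sketch)
out_dir = Path(tempfile.mkdtemp())
datasets = {
    'dataset_a': ([3, 5, 9], [10, 20, 30]),
    'dataset_b': ([1, 2, 3], [30, 20, 10]),
}

for name, (x, y) in datasets.items():
    fig, ax = plt.subplots(1, 1, figsize=(6, 3))
    ax.plot(x, y)
    ax.set_title(name)

    fig.savefig(out_dir / f'{name}.png')

    # close this figure once it is saved so figures do not pile up in memory
    plt.close(fig)
```

Passing `fig` to `plt.close` closes that specific figure; calling `plt.close()` with no argument closes the current one.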

You need to use "plt.show()" if you want to see your figure when working outside of a Jupyter notebook. However, if "plt.show()" is called before saving, the saved figure may be empty, because the figure is typically discarded once its window is closed. Place "plt.show()" after "plt.savefig('YourFigure.png')".

---
Tutorial created by [Irene Wallis](https://www.cubicearth.nz/)
