# Python Modules - Matplotlib and Seaborn

*By Dr Chas Nelson and Mikolaj Kundegorski*

## Objectives

* Know about the plotting functions provided by Matplotlib (`matplotlib`)
* Know about the plotting functions provided by Seaborn (`seaborn`)
* Know how to plot a scatterplot (with a regression model) with Seaborn
* Know how to plot boxplots with Seaborn
* Know how to edit and save plots with Matplotlib
* See that it is possible to do more complex plotting with Seaborn

## Matplotlib

Matplotlib (`matplotlib`) is the most widely used scientific plotting module in Python. Many other modules are built upon Matplotlib and we will explore one of these in particular: Seaborn.

Matplotlib is a huge module and we will only introduce you to a few plotting tools today.

To make Jupyter display plots inside the notebook (rather than only saving them with `plt.savefig()`), we use a 'magic' command: `%matplotlib inline`

Most of the functions we will need are in the `matplotlib.pyplot` submodule - so we will only import that today.

<div style="background-color:#abd9e9; border-radius: 5px; padding: 10pt"><strong>Task 12.1:</strong> In the cell below, add a line to import the <code>matplotlib.pyplot</code> submodule. It is conventional to give this the alias <code>plt</code>. In the same cell import the <code>pandas</code> module.
<br/>
If you get stuck, see the video <a href='https://youtu.be/8eMkAYZYGEs'>here</a> for a walkthrough, which also covers the next task.</div>

In [None]:
import pandas as pd

<div style="background-color:#abd9e9; border-radius: 5px; padding: 10pt"><strong>Task 12.2:</strong> Find the Matplotlib documentation online. Can you easily navigate the documentation to find useful functions such as <code>plot</code>?
<br/>
</div>

In [None]:
%matplotlib inline

# Add your imports here

cars = pd.read_excel('cars.xlsx')
display(cars.head())

In [None]:
# Show the unique values in Make
cars.Make.unique()

In [None]:
# Show the counts for each unique category in Type
cars.Type.value_counts()

### Scatter Plotting with Matplotlib

Plotting with Matplotlib is powerful but can be complicated (especially when you first start).

The basic framework for a Matplotlib figure is the following:

```python
plt.figure()

<FIGURE CODE>  # Where line and scatter plots are added

plt.legend()

plt.title('Plot Title')
plt.xlabel('X Axis Label')
plt.ylabel('Y Axis Label')

plt.savefig('myplot.png')  # To save the plot
plt.show()  # To display the plot

```
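As a concrete illustration of this framework, here is a minimal sketch that fills in the figure code with made-up data rather than the `cars` dataset. The data values, the labels and the file name `framework_example.png` are assumptions chosen only for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# Made-up data for illustration only (not from cars.xlsx)
x = np.linspace(0, 10, 20)
y = 2.0 * x + 1.0

fig = plt.figure(figsize=[5, 5])

# Figure code: one scatter plot and one line plot on the same axes
plt.scatter(x, y, color='purple', label='points')
plt.plot(x, y, color='orange', label='line')

# The legend uses the 'label' given to each plot above
plt.legend()

plt.title('Plot Title')
plt.xlabel('X Axis Label')
plt.ylabel('Y Axis Label')

plt.savefig('framework_example.png')  # Save before showing the plot
plt.show()
```

Saving before calling `plt.show()` is the safer order, because some backends clear the figure once it has been shown.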

<div style="background-color:#abd9e9; border-radius: 5px; padding: 10pt"><strong>Task 12.3:</strong> Run the following cell to show how to create a scatter plot between two variables for each car type (using a different colour) with a linear regression model fit for each. Don't worry about understanding everything - this is just to show you the complexities of plotting.
<br/>
</div>

In [None]:
# Imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import scipy.stats

In [None]:
# Create a figure that is 5 by 5 inches
plt.figure(figsize=[5,5])

### Pick a color from <a href='https://matplotlib.org/stable/tutorials/colors/colors.html#sphx-glr-tutorials-colors-colors-py'>here</a>

In [None]:
# Draw a histogram; try modifying the number of bins and the color
plt.hist(cars.Weight)

In [None]:
#Draw a scatter plot, please change color and marker
plt.scatter(cars.Weight,cars.Length)

In [None]:
#Draw a bar chart
plt.bar(cars.Type,cars.Weight)

In [None]:
# Create a mask for each Type
maskSUV = cars.loc[:,'Type']=='SUV'
maskSedan = cars.loc[:,'Type']=='Sedan'
maskSport = cars.loc[:,'Type']=='Sports'
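Each mask above is a Series of `True`/`False` values, one per row, and passing it to `.loc` keeps only the rows marked `True`. A minimal sketch of the idea, using a small made-up table (the `toy` DataFrame and its values are assumptions, not the real `cars` data):

```python
import pandas as pd

# Small made-up table for illustration (not the real cars data)
toy = pd.DataFrame({
    'Type': ['SUV', 'Sedan', 'SUV', 'Sports'],
    'Weight': [4500, 3200, 4800, 2900],
})

# Comparing a column with a value gives one True/False per row
mask_suv = toy.loc[:, 'Type'] == 'SUV'
print(mask_suv.tolist())  # [True, False, True, False]

# Using the mask with .loc selects only the matching rows
suv_weights = toy.loc[mask_suv, 'Weight']
print(suv_weights.tolist())  # [4500, 4800]
```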

In [None]:
# Plot a scatter for each Type in a unique colour showing Weight against Length
plt.scatter(cars.loc[maskSUV,'Length'],cars.loc[maskSUV,'Weight'],color='purple',label='SUV')
plt.scatter(cars.loc[maskSedan,'Length'],cars.loc[maskSedan,'Weight'],color='lightblue',label='Sedan')
plt.scatter(cars.loc[maskSport,'Length'],cars.loc[maskSport,'Weight'],color='orange',label='Sports')

# Calculate a linear regression model for each TYPE
slopeSUV, interceptSUV, r_valueSUV, p_valueSUV, std_errSUV = scipy.stats.linregress(cars.loc[maskSUV,'Length'],cars.loc[maskSUV,'Weight'])
slopeSedan, interceptSedan, r_valueSedan, p_valueSedan, std_errSedan = scipy.stats.linregress(cars.loc[maskSedan,'Length'],cars.loc[maskSedan,'Weight'])
slopeSport, interceptSport, r_valueSport, p_valueSport, std_errSport = scipy.stats.linregress(cars.loc[maskSport,'Length'],cars.loc[maskSport,'Weight'])

# Plot a line for each model over the range of car Length, reusing the colour of the matching scatter
xSUV = np.linspace(cars.loc[maskSUV,'Length'].min(),cars.loc[maskSUV,'Length'].max(),100)
ySUV = slopeSUV*xSUV + interceptSUV
plt.plot(xSUV,ySUV,color='purple')

xSedan = np.linspace(cars.loc[maskSedan,'Length'].min(),cars.loc[maskSedan,'Length'].max(),100)
ySedan = slopeSedan*xSedan + interceptSedan
plt.plot(xSedan,ySedan,color='lightblue')

xSport = np.linspace(cars.loc[maskSport,'Length'].min(),cars.loc[maskSport,'Length'].max(),100)
ySport = slopeSport*xSport + interceptSport
plt.plot(xSport,ySport,color='orange')

# Add a legend
plt.legend()

# Add a title and axis labels
plt.title('Car Length Against Car Weight')
plt.ylabel('car_weight')
plt.xlabel('car_length')

plt.savefig('myMatplotlibFigure.png')

## Seaborn

I'm sure we all agree that that's quite a lot of code - and quite daunting if you've never seen it before. But don't worry! Seaborn is here to make your life easier.

Matplotlib is an extremely powerful module. However, it can be complex, so some packages, like Seaborn, build upon Matplotlib to make plotting a little quicker and easier.

### Scatter Plotting with `seaborn`

Let's start by recreating the plot above.

<div style="background-color:#abd9e9; border-radius: 5px; padding: 10pt"><strong>Task 12.4:</strong> Run the following cell to show how to import <code>seaborn</code> and to create a scatter plot between length and weight for each car type (using a different colour) with a linear regression model fit for each.
<br/>
When you've done this, or if you get stuck, see the video <a href='https://youtu.be/YQrCY9YWUr0'>here</a> for a walkthrough.</div>

In [None]:
# Imports
import seaborn as sns

# Create a plot of Length vs Weight where colour (hue) is controlled by the type
#
# 'height' controls the figure height in inches
# 'truncate' prevents the regression extending beyond the data
carsimple=cars[cars.Type.isin(['SUV','Sedan','Sports'])]
sns.lmplot(x='Length',y='Weight',data=carsimple,hue='Type',height=5,truncate=True)

# Save figure
plt.savefig('mySeabornFigure.png')

### Faceted plotting with Seaborn

Isn't that a lot simpler?!

Seaborn is doing all the hard work for you - it creates the figure, the scatter plots, the legend and it does the regression and plots the model with error bounds too.

But what if we want to split the data across three plots? Again, Seaborn comes to the rescue.

<div style="background-color:#fdae61; border-radius: 5px; padding: 10pt"><strong>Exercise 12.5:</strong> Compare the following code cell to the code cell above. Can you spot the difference? Run the cell to show how easy it is to create a faceted plot (which is what this is called).
<br/>
When you've done this, or if you get stuck, see the video <a href='https://youtu.be/rLcfMBMMNKM'>here</a> for a walkthrough.</div>

In [None]:
# Imports
import seaborn as sns

# Create a plot of Length vs Weight where colour (hue) is controlled by the Type
# height controls the figure height in inches
# truncate prevents the regression extending beyond the data
sns.lmplot(x='Length',y='Weight',data=carsimple,hue='Type',col='Type',height=5,truncate=True)

# Save figure
plt.savefig('myFacetedFigure.png')
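Because Seaborn is built on Matplotlib, the figure it creates can still be edited afterwards. The sketch below assumes a small made-up DataFrame (`toy`) in place of `carsimple`; the axis labels and the file name are illustrative. `sns.lmplot` returns a `FacetGrid`, and each panel in its `axes` array is an ordinary Matplotlib axes object.

```python
import pandas as pd
import seaborn as sns

# Made-up data standing in for carsimple (values are illustrative only)
toy = pd.DataFrame({
    'Length': [170, 180, 190, 200, 175, 185, 195, 205],
    'Weight': [3000, 3300, 3500, 3900, 4000, 4300, 4700, 5000],
    'Type': ['Sedan'] * 4 + ['SUV'] * 4,
})

# lmplot returns a FacetGrid holding one Matplotlib axes per panel
g = sns.lmplot(x='Length', y='Weight', data=toy, hue='Type', col='Type',
               height=4, truncate=True)

# Edit the plot after Seaborn has drawn it
g.set_axis_labels('Car length', 'Car weight')
for ax in g.axes.flat:
    ax.grid(True)  # a plain Matplotlib call on each panel

# Save the whole grid
g.savefig('faceted_example.png')
```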

## Boxplots

Scatter and line plots are all part of Seaborn's relational plot tools. But sometimes we have categorical data (such as Type) and might want to use box plots to explore this data.

<div style="background-color:#fdae61; border-radius: 5px; padding: 10pt"><strong>Exercise 12.6:</strong> Read the cell below. This cell aims to create a boxplot using <code>seaborn</code> for the Weight of each car type (each type should be a different colour). Create a new Markdown cell below and write down, in plain English, what each line is doing.
<br/>
</div>

In [None]:
# Plot the data
sns.catplot(x='Type',
            y='Weight',
            data=cars,
            kind="box",
            height=5,
            aspect=1);

# Save the plot
plt.savefig('myBoxplot.png')
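A box plot summarises each group by its quartiles: the box spans the 25th to the 75th percentile and the line inside it marks the median. As a rough sketch, the same numbers can be computed directly with pandas. The `toy` table and its values are made up for illustration and are not the real `cars` data.

```python
import pandas as pd

# Made-up weights for two car types (illustrative values only)
toy = pd.DataFrame({
    'Type': ['SUV'] * 5 + ['Sedan'] * 5,
    'Weight': [4000, 4200, 4400, 4600, 4800,
               3000, 3100, 3200, 3300, 3400],
})

# One row per Type, one column per quantile (0.25, 0.5, 0.75)
quartiles = toy.groupby('Type')['Weight'].quantile([0.25, 0.5, 0.75]).unstack()
print(quartiles)
```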

## Plotting Contexts

Finally, we often make plots for different purposes, such as a notebook, a paper, a talk or a poster. Seaborn has, yet again, got us covered.

<div style="background-color:#fdae61; border-radius: 5px; padding: 10pt"><strong>Exercise 12.7:</strong> Run the following cell to show how the same scatter plot as above can be easily replicated with subtle display differences for four different contexts.
<br/>
When you've done this, or if you get stuck, see the video <a href='https://youtu.be/HSj180pc0q4'>here</a> for a walkthrough.</div>

In [None]:
with sns.plotting_context('notebook'):
    sns.lmplot(x='Length',y='Weight',data=cars,hue='Type',height=5,truncate=True)
    
with sns.plotting_context('paper'):
    sns.lmplot(x='Length',y='Weight',data=cars,hue='Type',height=5,truncate=True)
    
with sns.plotting_context('talk'):
    sns.lmplot(x='Length',y='Weight',data=cars,hue='Type',height=5,truncate=True)
    
with sns.plotting_context('poster'):
    sns.lmplot(x='Length',y='Weight',data=cars,hue='Type',height=5,truncate=True)
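To see what a context actually changes, it can help to inspect the settings it carries. The following is a small sketch, assuming only that `seaborn` is installed: `sns.plotting_context(name)` returns a dictionary of Matplotlib settings, and here we look up the `font.size` entry for each of the four contexts.

```python
import seaborn as sns

# Collect the base font size that each named context would apply
sizes = {}
for name in ['paper', 'notebook', 'talk', 'poster']:
    context = sns.plotting_context(name)  # dictionary of Matplotlib settings
    sizes[name] = context['font.size']

# Print the contexts side by side to compare them
for name, size in sizes.items():
    print(name, size)
```

The font size should grow from `paper` to `poster`, which is why the same plot looks suitably scaled for each purpose.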

## Key Points

* `matplotlib` adds plotting functionality to your Python code
* `seaborn` makes plotting lots of data very quick and easy
* `matplotlib` can be used to modify plots produced by `seaborn`
* `sns.plotting_context()` can be used to create different plots for different purposes
* Knowing how to plot exactly what you want will come with time, practice and a bit of online searching!