In [None]:
from IPython.display import display_html, display, Image

# Joint plots
## Via Matplotlib and friends


### Preamble:

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import whattheplot as wtp

In [None]:
%matplotlib inline

In [None]:
# Tweak the plot sizing:
plt.rcParams['figure.figsize'] = (8, 6)
# plt.rcParams['figure.dpi']=100

# And also tweak the font-sizing via a seaborn shortcut:
sns.set_context('talk')

### Data

In [None]:
df = wtp.data.load_iris()
features_df = df[df.columns[:4]]
features_df.columns.tolist()

### Joint plots
With a multi-dimensional dataset (particularly a multi-class one), it's often useful to get an overview of how the features are distributed and correlated.

For this, especially with larger datasets, joint plots that combine scatterplots, histograms and KDEs / contour plots can be very useful.

Seaborn excels in this arena, so long as you want one of the ready-made options, so we'll jump right in with it (we'll reproduce some results in Matplotlib at the end for comparison).

In [None]:
x_var, y_var = 'sepal_width', 'petal_length'

In [None]:
# Pass x and y as keywords: recent seaborn releases no longer accept them positionally.
sns.jointplot(data=df, x=x_var, y=y_var, kind='scatter')

Those scattered points overlap too much to read easily, and the histograms are chunky.
Let's flip a switch:

In [None]:
sns.jointplot(data=df, x=x_var, y=y_var, kind='kde')

Nice, but it doesn't tell us about the class breakdown. For that, we turn to the `pairplot` routine,
which is quite snazzy (though note that, by default, the subplots above the diagonal simply mirror those below it!)

In [None]:
sns.pairplot(df, hue='target', vars=features_df.columns)