## What is Matplotlib?

> matplotlib is a library for making 2D plots of arrays in Python
>
> *-matplotlib.org/users/intro.html*

In [None]:
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.dates as mdt
import pandas as pd

In [None]:
matplotlib.__version__

In [None]:
alta = pd.read_csv('../data/snow-alta-1990-2017.csv')
alta['DATE'] = pd.to_datetime(alta.DATE)

In [None]:
weekly = (alta
    .set_index('DATE')
    .resample('W')
    .agg({'LATITUDE': 'first', 'LONGITUDE': 'last',
          'SNOW': 'mean', 'SNWD': 'mean',
         'TMAX': 'max', 'TMIN': 'min', 'TOBS': 'mean'}))

x = weekly.index
y = weekly.SNWD
alta_x = x
alta_y = y

In [None]:
plt.plot(x,y)

In [None]:
# last 60 weeks
weeks = 60
x_weeks = x[-weeks:]
y_weeks = y.iloc[-weeks:]
plt.plot(x_weeks, y_weeks)

In [None]:
# Increase size
# last 60 weeks
fig, ax = plt.subplots(figsize=(8,6))
weeks = 60
x_weeks = x[-weeks:]
y_weeks = y.iloc[-weeks:]
ax.plot(x_weeks, y_weeks)

In [None]:
# Add 6 week MA
# Increase size
# last 60 weeks
fig, ax = plt.subplots(figsize=(8,6))
weeks = 60
x_weeks = x[-weeks:]
y_weeks = y.iloc[-weeks:]
y_weeks_ma = y.rolling(6).mean().iloc[-weeks:]

ax.plot(x_weeks, y_weeks, linewidth=2)
ax.plot(x_weeks, y_weeks_ma, color='b', linestyle='--')

In [None]:
# Add legend
# Add 6 week MA
# Increase size
# last 60 weeks
fig, ax = plt.subplots(figsize=(8,6))
weeks = 60
x_weeks = x[-weeks:]
y_weeks = y.iloc[-weeks:]
y_weeks_ma = y.rolling(6).mean().iloc[-weeks:]

ax.plot(x_weeks, y_weeks, linewidth=2)
ax.plot(x_weeks, y_weeks_ma, color='b', linestyle='--', label="MA")
ax.legend()

In [None]:
# Annotate High
# Add legend
# Add 6 week MA
# Increase size
# last 60 weeks
fig, ax = plt.subplots(figsize=(8,6))
weeks = 60
x_weeks = x[-weeks:]
y_weeks = y.iloc[-weeks:]
y_weeks_ma = y.rolling(6).mean().iloc[-weeks:]

ax.plot(x_weeks, y_weeks, linewidth=2)
ax.plot(x_weeks, y_weeks_ma, color='b', linestyle='--', label="MA")
ax.legend()

max_val = y_weeks.max()  # Series.max skips NaN, matching idxmax
max_idx = y_weeks.idxmax()
ax.annotate(f'Max {max_val:.1f}', xy=(mdt.date2num(max_idx), max_val),
           weight='bold', size=14)

min_val = y_weeks.min()  # Series.min skips NaN, matching idxmin
min_idx = y_weeks.idxmin()
ax.annotate(f'Min {min_val:.1f}', xy=(mdt.date2num(min_idx), min_val+10),
           weight='bold', size=14)

In [None]:
# Add title, remove spines
# Annotate High
# Add legend
# Add 6 week MA
# Increase size
# last 60 weeks
fig, ax = plt.subplots(figsize=(8,6))
weeks = 60
x_weeks = x[-weeks:]
y_weeks = y.iloc[-weeks:]
y_weeks_ma = y.rolling(6).mean().iloc[-weeks:]

ax.plot(x_weeks, y_weeks, linewidth=2)
ax.plot(x_weeks, y_weeks_ma, color='b', linestyle='--', label="MA")
ax.legend()

max_val = y_weeks.max()  # Series.max skips NaN, matching idxmax
max_idx = y_weeks.idxmax()
ax.annotate(f'Max {max_val:.1f}', xy=(mdt.date2num(max_idx), max_val),
           weight='bold', size=14)


min_val = y_weeks.min()  # Series.min skips NaN, matching idxmin
min_idx = y_weeks.idxmin()
ax.annotate(f'Min {min_val:.1f}', xy=(mdt.date2num(min_idx), min_val+10),
           weight='bold', size=14)

ax.get_yaxis().set_visible(False)
for side in ['left', 'top', 'right', 'bottom']:
    ax.spines[side].set_visible(False)
ax.set_title(f'{weeks} Weeks of Snow')
fig.autofmt_xdate()

In [None]:
# save
fig.savefig('demo-plot.png', dpi=300)

In [None]:
# Non-Jupyter - render plot
plt.show()

## Interfaces

Matplotlib has an object-oriented interface for creating *figures*, adding *axes*, and plotting on the *axes*. In addition, there is a *state machine* interface in the ``pyplot`` module, which mimics MATLAB.

OO Interface:

* **Figure** - The canvas. You can add axes to it.
* **Axes** - A single plot. It can have a title, an x label, and a y label. A 2D plot has two *Axis* objects; a 3D plot has three.
* **Axis** - (Note the spelling) - Holds the ticks and tick labels and defines the limits. You can customize its *Locator* and *Formatter* to adjust tick positions and labels.
* **Artist** - (Confusing term) Anything that can be drawn on a figure, such as lines, text, and patches.
* **Backend** - The rendering engine that creates the output (pdf, png, svg, etc.) as well as the user interface (pygtk, wx, macos, inline). For Jupyter, we specify the line magic ``%matplotlib inline``.
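
The relationship between these objects can be sketched in a few lines. This is a minimal illustration, not part of the original notebook: the data is synthetic and the tick spacing and label suffix are arbitrary choices made for the example.

```python
import numpy as np
from matplotlib.figure import Figure
from matplotlib.ticker import MultipleLocator, FuncFormatter

# Synthetic data (an assumption; stands in for the snow-depth series)
xs = np.arange(0, 50)
ys = np.sqrt(xs) * 10

fig = Figure(figsize=(8, 6))      # Figure: the canvas
ax = fig.add_subplot(111)         # Axes: one plot on the figure
line, = ax.plot(xs, ys)           # Line2D: an Artist drawn on the Axes

# Each Axis holds the ticks: a Locator places them, a Formatter labels them
ax.xaxis.set_major_locator(MultipleLocator(10))
ax.yaxis.set_major_formatter(FuncFormatter(lambda val, pos: f'{val:.0f} in'))

# The line is one of the Axes' child artists
print(line in ax.get_children())
```

In a notebook, ``display(fig)`` would render the result, since a bare ``Figure`` is not managed by ``pyplot``.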

In [None]:
# OO Example
from matplotlib.figure import Figure
from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas
from IPython.display import display

fig = Figure()
FigureCanvas(fig)  # Figure needs a canvas (pyplot does this for us)
ax = fig.add_subplot(111)
ax.plot(x, y)
display(fig)

In [None]:
# pyplot example
import matplotlib.pyplot as plt
plt.plot(x,y)

In [None]:
# In practice I create figures or axes w/ pyplot
ax = plt.subplot(111)
# fig, ax = plt.subplots()  # to get figure as well
ax.plot(x, y)

The ``plt`` interface has a few functions (``figure``, ``legend``, ``title``, ``xlabel``, ``xlim``, ``xscale``, ``xticks``, ``ylabel``, ``ylim``, ``yscale``, ``yticks``) that have corresponding accessors on ``ax``, e.g. ``ax.get_xticks()``. In addition, you can use ``ax.set(title='foo', xlabel='bar', xlim=(1,100))`` to set multiple attributes at once.
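
As a rough sketch of that correspondence, the cell below sets a title and x limits through ``plt``, reads them back through ``ax``, and then repeats the same settings with a single ``ax.set`` call. The three-point data set is a toy assumption used only to have something to plot.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 10, 100], [1, 2, 3])   # toy data (an assumption)

# State-machine style: these act on the "current" axes, which is ax here
plt.title('foo')
plt.xlim(1, 100)

# The same properties are reachable through accessors on the axes
ax.set_xlabel('bar')
print(ax.get_title(), ax.get_xlabel(), ax.get_xlim())
print(ax.get_xticks())

# ax.set bundles several set_* calls into one
ax.set(title='foo', xlabel='bar', xlim=(1, 100))
```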

## Lab Data
This section will load the lab data.

In [None]:
#%%time
nyc = pd.read_csv('../data/central-park-raw.csv', parse_dates=[0])
nyc_weekly = (nyc
 .rename(columns={'Mean TemperatureF': 'temp'})
 .set_index('EST')
 .resample('W')
 ['temp']
 .mean()
)

## Exercise: Interface
Using the ``nyc`` data:
* Create a ``nyc_weekly`` variable that is a series with a weekly index and the average temp
* Plot the average temp on a weekly level using the OO style interface
* Plot the average temp on a weekly level using the ``plt`` style interface
* Bump the figure size up to 10 inches wide by 8 inches tall in both plots

## Basic Plots
Matplotlib supports a variety of plots out of the box.

In [None]:
# Line Plot
fig, ax = plt.subplots()
ax.plot(x, y)

In [None]:
# Bar Plot
fig, ax = plt.subplots(figsize=(8,6))
ax.bar(x, y)

In [None]:
# Bar Plot
# width may need to be tweaked
fig, ax = plt.subplots(figsize=(8,6))
ax.bar(x, y, width=20)

In [None]:
# Scatter Plot - Using .scatter can be slower than .plot. Use .scatter when you want to
# tweak attributes (size, color) of individual points
fig, ax = plt.subplots(figsize=(8,6))
ax.scatter(x, y, marker='o', alpha=.5)
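
To make the trade-off concrete, here is a small side-by-side sketch on synthetic random data (an assumption, not the Alta data). ``plot`` with ``linestyle=''`` draws markers that all share one size and color, while ``scatter`` accepts per-point sizes and colors.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)     # synthetic data (an assumption)
xs = rng.normal(size=500)
ys = rng.normal(size=500)

fig, (left, right) = plt.subplots(ncols=2, figsize=(8, 4))

# plot: a single Line2D artist, every marker shares the same size and color
lines = left.plot(xs, ys, linestyle='', marker='o', alpha=.5)

# scatter: a PathCollection, size and color can vary per point
points = right.scatter(xs, ys, s=20 + 80 * np.abs(ys), c=xs,
                       cmap='viridis', alpha=.5)
```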

In [None]:
fig, ax = plt.subplots(figsize=(8,6))
s = ax.scatter(alta.TOBS, alta.SNOW, c=alta.DATE.dt.month,
           s=alta.SNWD**1.2,
           marker='o', 
           alpha=.5,
           cmap='viridis')
# add colorbar
plt.colorbar(s)

In [None]:
# boxplot
fig, ax = plt.subplots(figsize=(8,6))
data = [(name, list(g.SNOW.fillna(0))) 
        for name, g in alta.groupby(alta.DATE.dt.year)]
year_data = [x[1] for x in data]
years = [x[0] for x in data]
_ = ax.boxplot(year_data,
              labels=years, vert=False
              )
plt.tight_layout()

In [None]:
# violin plot
fig, ax = plt.subplots(figsize=(8,6))
data = [(name, list(g.SNOW.fillna(0))) 
        for name, g in alta.groupby(alta.DATE.dt.year)]
year_data = [x[1] for x in data]
years = [x[0] for x in data]
size = 10
ax.violinplot(year_data[:size])
# No labels parameter for violinplot...
ax.set_xticks(range(1, size + 1))  # tell labels to start at 1 instead of 0
_ = ax.set_xticklabels(years[:size])

In [None]:
# Histogram - careful of values that you feed it
# (NaN's will cause it to fail with older matplotlibs)
fig, ax = plt.subplots(figsize=(8,6))
ax.hist(y.values)

In [None]:
# Histogram - change bins
fig, ax = plt.subplots(figsize=(8,6))
_ = ax.hist(y.dropna(), bins=100)

In [None]:
# Pie
fig, ax = plt.subplots(figsize=(8,6))
_=ax.pie([10, 5], labels=['10', '5'])
ax.legend()

## Exercise: Plot Types

*  Plot a line plot of the ``nyc_weekly`` data
*  Plot a bar plot of the ``nyc_weekly`` data
* Plot a scatter plot of the ``nyc_weekly`` data
* Plot a histogram of the ``nyc_weekly`` data
* Plot a pie chart of the ``nyc_weekly`` data

## Architecture

In [None]:
# Notice what ax.plot returns
fig, ax = plt.subplots(figsize=(8,6))
ax.plot(x, y)

In [None]:
fig, ax = plt.subplots(figsize=(8,6))
res = ax.plot(x, y)

In [None]:
print(dir(res[0]))

In [None]:
help(res[0])

In [None]:
# Change color and line style
fig, ax = plt.subplots(figsize=(8,6))
res = ax.plot(x, y)
line = res[0]  # Once we have line we can use tab completion
line.set_c('#c07fef')
line.set_linestyle('--')

In [None]:
# Set the title
fig, ax = plt.subplots(figsize=(8,6))
res = ax.plot(x, y)
line = res[0]  # Once we have line we can use tab completion
line.set_c('#c07fef')
line.set_linestyle('--')
title = ax.set_title('Alta Snow Levels')

In [None]:
title

In [None]:
# Tweak title position
fig, ax = plt.subplots(figsize=(8,6))
res = ax.plot(x, y)
line = res[0]  # Once we have line we can use tab completion
line.set_c('#c07fef')
line.set_linestyle('--')
title = ax.set_title('Alta Snow Levels')
title.set_position((.2,.7))

In [None]:
# let's look at the ax
print(dir(ax))

In [None]:
yax = ax.get_yaxis()

In [None]:
for member in dir(yax):
    if member.startswith('get'):
        try:
            print(f'{member:20}: {getattr(yax, member)()}')
        except TypeError:
            print(f'**ERR with {member}')
yax.get_scale()

In [None]:
# Customize tick locations
fig, ax = plt.subplots(figsize=(8,6))
res = ax.plot(x, y)
line = res[0]  # Once we have line we can use tab completion
line.set_c('#c07fef')
line.set_linestyle('--')
title = ax.set_title('Alta Snow Levels')
title.set_position((.2,.7))

yax = ax.get_yaxis()
yax.set_ticks([-10, 0, 20, 200])

In [None]:
# Remove spines

fig, ax = plt.subplots(figsize=(8,6))
res = ax.plot(x, y)
line = res[0]  # Once we have line we can use tab completion
line.set_c('#c07fef')
line.set_linestyle('--')
title = ax.set_title('Alta Snow Levels')
title.set_position((.2,.7))

yax = ax.get_yaxis()
yax.set_ticks([-10, 0, 20, 200])

import matplotlib
for c in ax.get_children():
    if isinstance(c, matplotlib.spines.Spine):
        c.set_visible(False)

In [None]:
# Remove ticks

fig, ax = plt.subplots(figsize=(8,6))
res = ax.plot(x, y)
line = res[0]  # Once we have line we can use tab completion
line.set_c('#c07fef')
line.set_linestyle('--')
title = ax.set_title('Alta Snow Levels')
title.set_position((.2,.7))

yax = ax.get_yaxis()
yax.set_ticks([-10, 0, 20, 200])

import matplotlib
for c in ax.get_children():
    if isinstance(c, matplotlib.spines.Spine):
        c.set_visible(False)
        
ax.tick_params(bottom=False, left=False)

In [None]:
# Jupyter hint
ax.tick_params??

## Exercise: Architecture

Using the ``nyc_weekly`` data set, create a line plot, then
* Bump up the figure size to ``(8,6)``
* Set the yticks to ``[0, 32, 80, 100]``
* Add a title in the middle of the plot ``Central Park Temp``
* Remove top and right spine

## Annotating Charts

In [None]:
# Add text to chart in data coordinates
# (Note: the Jupyter version might fail without ``clip_on=True``)
fig, ax = plt.subplots(figsize=(8,6))
res = ax.plot(x, y)
ax.text(ax.get_xlim()[0], 120, 'Snow Levels', clip_on=True)

In [None]:
# Tweak text properties
fig, ax = plt.subplots(figsize=(8,6))
res = ax.plot(x, y)
# Need TTF/OTF fonts (TTC currently fail)
ax.text(ax.get_xlim()[0], 120, 'Snow Levels', clip_on=True,
       family='Comic Sans MS', size=20)
# bbox is a mpl.patches.Rectangle
ax.text(ax.get_xlim()[0], 140, 'Box', clip_on=True,
       family='Serif', size=20, style='italic', bbox={'facecolor': 'red', 'alpha': .5})

In [None]:
# Add text to chart USING 0-1 coordinates
fig, ax = plt.subplots(figsize=(8,6))
res = ax.plot(x, y)
# Need TTF/OTF fonts (TTC currently fail)
ax.text(0, .8, 'Snow Levels', clip_on=True,
       family='Comic Sans MS', size=20,
       transform=ax.transAxes)
# bbox is a mpl.patches.Rectangle
ax.text(.2, .9, 'Box', clip_on=True,
       family='Serif', size=20, style='italic', 
        bbox={'facecolor': 'red', 'alpha': .5},
       transform=ax.transAxes)

In [None]:
x

In [None]:
# Add text to chart in data coordinates
fig, ax = plt.subplots(figsize=(8,6))
res = ax.plot(x, y)

ax.annotate(f'Min {y.min()}', xy=(x.min(), y.min()), 
            xytext=(x.min(), y.min() + 3))

# arrowprops is a mpl.patches.FancyArrowPatch
_ = ax.annotate(f'Max {y.max():.1f}', xy=(y.idxmax(), y.max()), 
                # place the label 100 weeks to the right of the peak
                # (Timestamp.freq was removed in pandas 2.0)
                xytext=(y.idxmax() + pd.Timedelta(weeks=100), y.max() - 20),
                family='comic sans ms',
                arrowprops={})

In [None]:
# Add text to chart USING 0-1 coordinates
fig, ax = plt.subplots(figsize=(8,6))
res = ax.plot(x, y)

ax.annotate(f'Min {y.min()}',xy=(0, 0), 
            xytext=(0, .1),
            xycoords='axes fraction', 
            textcoords='axes fraction')
# arrowprops is a mpl.patches.FancyArrowPatch
_ = ax.annotate(f'Max {y.max():.1f}', xy=(y.idxmax(), y.max()), 
                xytext=(.5, .8),
                textcoords='axes fraction',
                family='comic sans ms',
                arrowprops={})

## Exercise: Annotation
Using the last 20 rows of ``nyc_weekly`` make a new dataset ``c3``. Plot a bar plot. Remove the y axis. Label each bar with its value right above (or inside of the top of) the bar. (See https://matplotlib.org/users/text_props.html page for help with rotation or vertical or horizontal alignment) (Might need to tweak ``width`` parameter of call to ``ax.bar``)

## Configuring Matplotlib

In [None]:
plt.style.use(plt.rcParamsDefault)
fig, ax = plt.subplots(figsize=(8,6))
res = ax.plot(x, y)

In [None]:
# Default style is stored here
print(plt.rcParamsDefault)

In [None]:
print(plt.style.available)

In [None]:
# note that changing the style leaves it changed
plt.style.use('ggplot')
fig, ax = plt.subplots(figsize=(8,6))
res = ax.plot(x, y)

In [None]:
# use a context manager for temporary changes
with plt.style.context('fivethirtyeight'):
    fig, ax = plt.subplots(figsize=(8,6))
    res = ax.plot(x, y)

In [None]:
# go back to defaults
matplotlib.rcdefaults()

fig, ax = plt.subplots(figsize=(8,6))
res = ax.plot(x, y)

In [None]:
# can create a configuration file in $MPLCONFIGDIR environment variable
# MPLCONFIGDIR/
#      matplotlibrc - default styles
#      stylelib/CUSTOM.mplstyle - can use matplotlib.style.use('CUSTOM') 
#             (Might need to plt.style.reload_library() )
matplotlib.get_configdir()

In [None]:
# style loading done at import time. need to reload
import os
folder = os.path.join(matplotlib.get_configdir(), 'stylelib')
os.makedirs(folder, exist_ok=True)  # also creates missing parent folders
cfg_name = os.path.join(folder, 'big.mplstyle')
with open(cfg_name, 'w') as fout:
    fout.write("""
axes.labelsize : 36
lines.linewidth : 4
xtick.labelsize : 24
ytick.labelsize : 32
    """)
print(cfg_name)
plt.style.reload_library()
with plt.style.context('big'):
    fig, ax = plt.subplots(figsize=(8,6))
    # these fail in Jupyter currently (Sep 2018 - https://github.com/jupyter/notebook/issues/3385)
    matplotlib.rcParams['grid.color'] = 'r'
    matplotlib.rcParams['grid.linestyle'] = ':'
    matplotlib.rc('grid', color='r', alpha=1, linestyle='-', linewidth=1)
    res = ax.plot(x, y)
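
For temporary changes to individual ``rcParams`` (rather than a whole style), ``plt.rc_context`` can be used in the same way as ``plt.style.context``. The sketch below is an illustration on toy data (an assumption); the parameter values are arbitrary.

```python
import matplotlib
import matplotlib.pyplot as plt

before = matplotlib.rcParams['lines.linewidth']

# rc_context restores the previous rcParams when the block exits
with plt.rc_context({'lines.linewidth': 4, 'grid.linestyle': ':'}):
    fig, ax = plt.subplots(figsize=(8, 6))
    line, = ax.plot([0, 1, 2], [0, 1, 4])   # toy data (an assumption)
    ax.grid(True)
    inside = matplotlib.rcParams['lines.linewidth']

after = matplotlib.rcParams['lines.linewidth']
print(before, inside, after)
```

The line keeps the width it was created with, while the global setting returns to its earlier value.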

In [None]:
# linestyle - see help for ax.plot
fig, ax = plt.subplots(figsize=(8,6))

for i, (name, shortcut) in enumerate([('solid', '-',), ('dashed', '--'), 
    ('dashdot', '-.'), ('dotted', ':')]):
    ax.plot(y[-10:]+i*20, linestyle=shortcut, label=name, linewidth=3)
ax.legend()

In [None]:
# Adjust axis limits
fig, ax = plt.subplots()
ax.set_ylim((0, 300))
ax.plot(y, linestyle=':', label='dotted', linewidth=3)

In [None]:
# Adjust Labels
# Can also set with ax.axis, which can also set x/y scale to same value with 'equal'.
fig, ax = plt.subplots()
#ax.axis('equal')
ax.plot(y, linestyle=':', label='dotted', linewidth=3)
ax.set_title('Alta Snow Levels')
ax.set_xlabel('Year')
ax.set_ylabel('Inches')

In [None]:
# colormaps (useful on scatter plots)
fig, ax = plt.subplots()
ax.set_ylim((0, 150))
ax.scatter(x, y, alpha=.3, c=y, cmap='PiYG')

In [None]:
# see https://matplotlib.org/examples/color/colormaps_reference.html
import numpy as np
colormaps = [x for x in plt.cm.datad.keys() if not x.endswith('_r')]

fig, axes = plt.subplots(nrows=len(colormaps), figsize=(10,18))
plt.subplots_adjust(top=0.8,bottom=0.05,left=0.01,right=0.99)
gradient = np.linspace(0, 1, 256)
gradient = np.vstack((gradient, gradient))
for i, m in enumerate(sorted(colormaps)):
    ax = axes[i]
    ax.set_axis_off()
    ax.imshow(gradient,aspect='auto',
               cmap=m)
    pos = list(ax.get_position().bounds)
    x_text = pos[0] - 0.01
    y_text = pos[1] + pos[3]/2.
    fig.text(x_text, y_text, m, va='center', ha='right', fontsize=10)


In [None]:
# Line Plot - Colors (can specify with HEX)
fig, ax = plt.subplots()
ax.plot(x, y, color='#C07FEF', linewidth=3, linestyle='--')

In [None]:
# Bar Plot color - pass in a single color or parallel sequence
last10 = y.iloc[-10:]
fig, ax = plt.subplots()
colors = ['red' for val in last10]
colors[0] = '#c07fef'
colors[-1] = '#fef70c'
ax.bar(last10.index, last10, color=colors, width=4)
fig.autofmt_xdate()  # tweak dates

## Exercise: Customize

* Create a line plot of ``nyc_weekly`` temporarily using the ``'dark_background'`` style, dotted lines, and a linewidth of 5
* Using the bar plot from the annotation exercise, color all the bars grey, except color the minimum value red and the maximum value green.

## 3D and other Tools

In [None]:
pd.to_numeric(alta.DATE) #.astype(int)

In [None]:
%matplotlib notebook
from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(xs=pd.to_numeric(alta.DATE), ys=alta.SNOW, zs=alta.SNWD, alpha=.5)

In [None]:
%matplotlib inline
# Personally try to avoid 3D (though interactivity helps)
# Alternative is to plot scatterplots of pairs of variables
# Annoying in Matplotlib ... but

import seaborn as sns
res = sns.pairplot(data=alta.reset_index()[['DATE', 'SNOW', 'SNWD']])
#res.axes[0][0].plot(range(10))

In [None]:
# notice the ``axes`` and ``fig`` attributes
print(dir(res))

In [None]:
%matplotlib inline

# pandas
ax = y.plot()
ax.set_title('Pandas plot of Snowdepth')

In [None]:
%matplotlib inline

# pandas
ax = y.iloc[-20:].plot.bar()
ax.set_title('Pandas plot of Snowdepth')

In [None]:
%matplotlib inline

# pandas
ax = y.iloc[-20:].plot.bar()
ax.set_title('Pandas plot of Snowdepth')
fig = ax.get_figure()
fig.set_size_inches(8,6)
fig.autofmt_xdate()

In [None]:
%matplotlib inline

# pandas
ax = y.iloc[-20:].plot.barh()
ax.set_title('Pandas plot of Snowdepth')

In [None]:
# note Matplotlib as of 1.5 has some support for pandas DataFrames
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot('DATE', 'SNWD', data=alta)  
#ax.plot(x='DATE', y='SNWD', data=alta)  # fails!

## Other options for Plotting
* Bokeh - Aimed at HTML plots (interactive)
* Plotly - Service for plotting
* Altair - Declarative visualization

## Exercise: 3D and Pandas
* Plot a 3D scatter plot with the ``nyc`` dataset, plotting 'Mean TemperatureF', ' Mean Humidity', and 'EST' (Hint: you might need to limit the number of rows and only look at 1_000 or so)
