**Matplotlib** is a desktop plotting package designed for creating (mostly two-dimensional)
publication-quality plots. The project was started by John Hunter in
2002 to enable a MATLAB-like plotting interface in Python. The matplotlib and IPython
communities have collaborated to simplify interactive plotting from the IPython
shell (and now, Jupyter notebook). matplotlib supports various GUI backends on all
operating systems and additionally can export visualizations to all of the common
vector and raster graphics formats (PDF, SVG, JPG, PNG, BMP, GIF, etc.).  
  
Over time, matplotlib has spawned a number of add-on toolkits for data visualization
that use matplotlib for their underlying plotting. One of these is seaborn, which we
explore later in this chapter.  
  
The simplest way to follow the code examples in the chapter is to use interactive plotting
in the Jupyter notebook. To set this up, execute `%matplotlib notebook` in a Jupyter
notebook (or simply `%matplotlib` in IPython):

In [1]:
%matplotlib notebook

# A Brief matplotlib API Primer

With matplotlib, we use the following import convention:

In [4]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
data = np.arange(10)
data

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [3]:
plt.plot(data)

<IPython.core.display.Javascript object>

[<matplotlib.lines.Line2D at 0x1fdf1f23520>]

## Figures and Subplots
Plots in matplotlib reside within a `Figure` object. You can create a new figure with plt.figure, which has a number of options; notably,
figsize sets the size and aspect ratio the figure will have if saved to disk. You can’t make a plot with a blank figure; you have to create one or more subplots using add_subplot:

In [4]:
fig = plt.figure()
ax1 = fig.add_subplot(2, 2, 1)
ax2 = fig.add_subplot(2, 2, 2)
ax3 = fig.add_subplot(2, 2, 3)

# (2, 2, 1) this means that the figure should be 2 × 2 
# (so up to 4 plots in total), and we’re selecting the first of four 
# subplots (numbered from 1).

<IPython.core.display.Javascript object>

When you issue a plotting command like plt.plot([1.5, 3.5, -2, 1.6]), matplotlib
draws on the last figure and subplot used (creating one if necessary), thus hiding
the figure and subplot creation. So if we add the following command, the line is
drawn on the third subplot, the last one created:

In [5]:
fig = plt.figure()
ax1 = fig.add_subplot(2, 2, 1)
ax2 = fig.add_subplot(2, 2, 2)
ax3 = fig.add_subplot(2, 2, 3)
plt.plot(np.random.randn(50).cumsum(), 'k--')

<IPython.core.display.Javascript object>

[<matplotlib.lines.Line2D at 0x1fdf2126af0>]

In [6]:
fig = plt.figure()
ax1 = fig.add_subplot(2, 2, 1)
ax2 = fig.add_subplot(2, 2, 2)
ax3 = fig.add_subplot(2, 2, 3)
ax1.hist(np.random.randn(100), bins=20, color='k', alpha=0.3)
ax2.scatter(np.arange(30), np.arange(30) + 3 * np.random.randn(30))
ax3.plot(np.random.randn(50).cumsum(), 'k--')
# 'k--' is a style option instructing matplotlib to plot a black dashed line

<IPython.core.display.Javascript object>

[<matplotlib.lines.Line2D at 0x1fdf2244880>]

Creating a figure with a grid of subplots is a very common task, so matplotlib
includes a convenience method, plt.subplots, that creates a new figure and returns
a NumPy array containing the created subplot objects:

In [7]:
fig, axes = plt.subplots(2, 3)
axes

<IPython.core.display.Javascript object>

array([[<AxesSubplot:>, <AxesSubplot:>, <AxesSubplot:>],
       [<AxesSubplot:>, <AxesSubplot:>, <AxesSubplot:>]], dtype=object)

This is very useful, as the axes array can be easily indexed like a two-dimensional
array; for example, axes[0, 1]. You can also indicate that subplots should have the
same x- or y-axis using sharex and sharey, respectively. This is especially useful
when you’re comparing data on the same scale; otherwise, matplotlib autoscales plot
limits independently. See the table below for more on this method.

![image.png](attachment:image.png)
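
As a minimal sketch of the indexing and axis sharing described above: the random data and the seed are illustrative assumptions, and the Agg backend is selected only so that the snippet also runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; not needed inside a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # illustrative seed

# 2 x 3 grid whose panels share both the x- and the y-axis
fig, axes = plt.subplots(2, 3, sharex=True, sharey=True)

# axes is a two-dimensional object array, so it is indexed like any NumPy array
axes[0, 1].plot(rng.standard_normal(50).cumsum(), 'k--')
axes[1, 2].plot(10 * rng.standard_normal(50).cumsum(), 'k-')

# With sharey=True, every panel reports the same y-limits
shared_ylim = axes[0, 0].get_ylim()
```

Because the y-axis is shared, the series with the larger range determines the limits for every panel, including the empty ones.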

## Adjusting the spacing around subplots

By default matplotlib leaves a certain amount of padding around the outside of the
subplots and spacing between subplots. This spacing is all specified relative to the
height and width of the plot, so that if you resize the plot either programmatically or
manually using the GUI window, the plot will dynamically adjust itself. You can
change the spacing using the subplots_adjust method on Figure objects, also available
as a top-level function:
  
`subplots_adjust(left=None, bottom=None, right=None, top=None,
                 wspace=None, hspace=None)`
  
wspace and hspace control the spacing between subplots, expressed as a fraction of the
average subplot width and height, respectively. Here is a small example where I shrink the
spacing all the way to zero:

In [8]:
fig, axes = plt.subplots(2, 2, sharex=True, sharey=True)
for i in range(2):
    for j in range(2):
        axes[i, j].hist(np.random.randn(500), bins=50, color='k', alpha=0.5)
plt.subplots_adjust(wspace=0, hspace=0)

<IPython.core.display.Javascript object>
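
The same adjustment is available as a method on the Figure itself. The sketch below uses illustrative spacing values (they are not taken from the text) and reads them back from the figure's subplotpars attribute to confirm they were applied.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; not needed inside a notebook
import matplotlib.pyplot as plt

fig, axes = plt.subplots(2, 2)

# Method form of plt.subplots_adjust; the values here are illustrative
fig.subplots_adjust(wspace=0.05, hspace=0.3)

# The figure stores the current layout parameters on fig.subplotpars
spacing = (fig.subplotpars.wspace, fig.subplotpars.hspace)
```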

## Colors, Markers, and Line Styles

Matplotlib’s main plot function accepts arrays of x and y coordinates and optionally a
string abbreviation indicating color and line style. For example, to plot x versus y
with green dashes, you would execute:  
`ax.plot(x, y, 'g--')`  
This way of specifying both color and line style in a string is provided as a convenience;
in practice if you were creating plots programmatically you might prefer not
to have to munge strings together to create plots with the desired style. The same plot
could also have been expressed more explicitly as:  
`ax.plot(x, y, linestyle='--', color='g')`  
There are a number of color abbreviations provided for commonly used colors, but
you can use any color on the spectrum by specifying its hex code (e.g., '#CECECE').
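
A small sketch, using made-up data, that compares the shorthand style string with the explicit keyword form and also draws a line with a hex color. It inspects the returned Line2D objects rather than the rendered image.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; not needed inside a notebook
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
import numpy as np

x = np.arange(10)
y = x ** 2  # illustrative data

fig, ax = plt.subplots()
(short_line,) = ax.plot(x, y, 'g--')                         # style string
(explicit_line,) = ax.plot(x, y, linestyle='--', color='g')  # keyword form
(hex_line,) = ax.plot(x, y + 5, color='#CECECE')             # hex color

# Both spellings resolve to the same color and line style
same_color = (mcolors.to_rgba(short_line.get_color())
              == mcolors.to_rgba(explicit_line.get_color()))
same_style = short_line.get_linestyle() == explicit_line.get_linestyle()
```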

Line plots can additionally have markers to highlight the actual data points. Since
matplotlib creates a continuous line plot, interpolating between points, it can occasionally
be unclear where the points lie. The marker can be part of the style string,
which must have color followed by marker type and line style:

In [9]:
plt.plot(np.random.randn(30).cumsum(), 'ko--')
# This could also have been written more explicitly as:
# plt.plot(np.random.randn(30).cumsum(), color='k', linestyle='dashed', 
#          marker='o')

<IPython.core.display.Javascript object>

[<matplotlib.lines.Line2D at 0x1fdf34459a0>]

For line plots, you will notice that subsequent points are linearly interpolated by
default. This can be altered with the drawstyle option:

In [11]:
data = np.random.randn(30).cumsum()
plt.plot(data, 'k--', label='Default')
plt.plot(data, 'k-', drawstyle='steps-post', label='steps-post')
plt.legend(loc='best')

<IPython.core.display.Javascript object>

<matplotlib.legend.Legend at 0x1fdf34a1430>

## Ticks, Labels, and Legends

For most kinds of plot decorations, there are two main ways to do things: using the
procedural pyplot interface (i.e., matplotlib.pyplot) and the more object-oriented
native matplotlib API.
The pyplot interface, designed for interactive use, consists of functions like xlim and
xticks. These control the plot range and the tick locations and labels,
respectively. They can be used in two ways:
+ Called with no arguments, the function returns the current parameter value (e.g., plt.xlim() returns the current x-axis plotting range)
+ Called with parameters, it sets the parameter value (e.g., plt.xlim([0, 10]) sets the x-axis range to 0 to 10)  

All such methods act on the active or most recently created AxesSubplot. Each of
them corresponds to two methods on the subplot object itself; in the case of xlim
these are ax.get_xlim and ax.set_xlim. I prefer to use the subplot instance methods
myself in the interest of being explicit (and especially when working with multiple
subplots), but you can certainly use whichever you find more convenient.
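
The two interfaces can be compared side by side. This is a minimal sketch with illustrative data and limits; both calls act on the same subplot because it is the current one.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; not needed inside a notebook
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
ax.plot(np.arange(20))  # illustrative data

# pyplot interface: acts on the current (most recently created) subplot
plt.xlim([0, 10])           # called with an argument: sets the range
pyplot_range = plt.xlim()   # called with no arguments: returns the range

# Equivalent methods on the subplot object itself
ax.set_xlim(2, 8)
explicit_range = ax.get_xlim()
```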

## Setting the title, axis labels, ticks, and ticklabels
To set the x-axis ticks, it’s easiest to use set_xticks and set_xticklabels. The
former instructs matplotlib where to place the ticks along the data range; by default
these locations will also be the labels. But we can set any other values as the labels
using set_xticklabels. In the example below, the rotation option sets the x tick labels
at a 30-degree rotation, set_xlabel gives a name to the x-axis, and set_title sets the subplot title:

In [12]:
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.plot(np.random.randn(1000).cumsum())
ticks = ax.set_xticks([0, 250, 500, 750, 1000])
labels = ax.set_xticklabels(['one', 'two', 'three', 'four', 'five'],
                            rotation=30, fontsize='small')
ax.set_title('My first matplotlib plot')
ax.set_xlabel('Stages')

<IPython.core.display.Javascript object>

Text(0.5, 0, 'Stages')

Modifying the y-axis consists of the same process, substituting y for x in the above.
The axes class has a set method that allows batch setting of plot properties. From the
prior example, we could also have written:  
`props = {
            'title': 'My first matplotlib plot',
            'xlabel': 'Stages'
         }
ax.set(**props)`
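
A runnable sketch of the batch form follows. The 'ylabel' entry is an addition for illustration and is not part of the original example; the getters confirm what was set.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; not needed inside a notebook
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
ax.plot(np.random.randn(1000).cumsum())

props = {
    'title': 'My first matplotlib plot',
    'xlabel': 'Stages',
    'ylabel': 'Cumulative sum',  # illustrative extra property
}
ax.set(**props)  # equivalent to calling set_title, set_xlabel, set_ylabel

title, xlabel, ylabel = ax.get_title(), ax.get_xlabel(), ax.get_ylabel()
```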

## Adding legends

Legends are another critical element for identifying plot elements. There are a couple
of ways to add one. The easiest is to pass the label argument when adding each piece
of the plot and then call ax.legend() or plt.legend() to automatically
create a legend.  
The loc argument tells matplotlib where to place the legend. If you aren’t picky, 'best' is a good
option, as it will choose a location that is most out of the way. To exclude one or more
elements from the legend, pass no label or label='\_nolegend_'.

In [14]:
from numpy.random import randn
fig = plt.figure(); ax = fig.add_subplot(1, 1, 1)
ax.plot(randn(500).cumsum(), 'k', label='one')
ax.plot(randn(500).cumsum(), 'k--', label='two')
ax.plot(randn(500).cumsum(), 'k.', label='three')
ax.legend(loc='best')

<IPython.core.display.Javascript object>

<matplotlib.legend.Legend at 0x1fdf3609f40>
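
To illustrate excluding an element from the legend, the sketch below gives the third line the label '_nolegend_' and then reads the legend entries back. The data and seed are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; not needed inside a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)  # illustrative seed

fig, ax = plt.subplots()
ax.plot(rng.standard_normal(100).cumsum(), 'k', label='one')
ax.plot(rng.standard_normal(100).cumsum(), 'k--', label='two')
# This line is drawn but left out of the legend
ax.plot(rng.standard_normal(100).cumsum(), 'k.', label='_nolegend_')

legend = ax.legend(loc='best')
legend_labels = [text.get_text() for text in legend.get_texts()]
```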

## Annotations and Drawing on a Subplot
In addition to the standard plot types, you may wish to draw your own plot annotations,
which could consist of text, arrows, or other shapes. You can add annotations
and text using the text, arrow, and annotate functions. text draws text at given
coordinates (x, y) on the plot with optional custom styling:  
`ax.text(x, y, 'Hello world!', family='monospace', fontsize=10)`  
Annotations can draw both text and arrows arranged appropriately. As an example,
let’s plot the closing S&P 500 index price since 2007 (obtained from Yahoo! Finance)
and annotate it with some of the important dates from the 2008–2009 financial crisis.

In [17]:
from datetime import datetime
import pandas as pd

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
data = pd.read_csv('spx.csv', index_col=0, parse_dates=True)
spx = data['SPX']
spx.plot(ax=ax, style='k-')

crisis_data = [
(datetime(2007, 10, 11), 'Peak of bull market'),
(datetime(2008, 3, 12), 'Bear Stearns Fails'),
(datetime(2008, 9, 15), 'Lehman Bankruptcy')
]

for date, label in crisis_data:
    ax.annotate(label, xy=(date, spx.asof(date) + 75),
                xytext=(date, spx.asof(date) + 225),
                arrowprops=dict(facecolor='black', headwidth=4, width=2,
                                headlength=4),
                horizontalalignment='left', verticalalignment='top')
    
# Zoom in on 2007-2010
ax.set_xlim(['1/1/2007', '1/1/2011'])
ax.set_ylim([600, 1800])
ax.set_title('Important dates in the 2008-2009 financial crisis')

<IPython.core.display.Javascript object>

Text(0.5, 1.0, 'Important dates in the 2008-2009 financial crisis')

There are a couple of important points to highlight in this plot: the ax.annotate
method can draw labels at the indicated x and y coordinates. We use the set_xlim
and set_ylim methods to manually set the start and end boundaries for the plot
rather than using matplotlib’s default. Lastly, ax.set_title adds a main title to the
plot.

Drawing shapes requires some more care. matplotlib has objects that represent many
common shapes, referred to as patches. Some of these, like Rectangle and Circle, are
found in matplotlib.pyplot, but the full set is located in matplotlib.patches.  
To add a shape to a plot, you create the patch object shp and add it to a subplot by
calling ax.add_patch(shp):

In [18]:
fig = plt.figure()
ax = fig.add_subplot(1,1,1)

rect = plt.Rectangle((0.2, 0.75), 0.4, 0.15, color='k', alpha=0.3)
circ = plt.Circle((0.7, 0.2), 0.15, color='b', alpha=0.3)
pgon = plt.Polygon([[0.15, 0.15], [0.35, 0.4], [0.2, 0.6]],
                   color='g', alpha=0.5)

ax.add_patch(rect)
ax.add_patch(circ)
ax.add_patch(pgon)

<IPython.core.display.Javascript object>

<matplotlib.patches.Polygon at 0x1fdf949cfd0>
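
Shapes that are not exposed through pyplot can be imported from matplotlib.patches directly. The sketch below uses Ellipse as one such example; the coordinates and colors are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; not needed inside a notebook
import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse, Rectangle

fig, ax = plt.subplots()

# Rectangle takes the lower-left corner, then width and height
rect = Rectangle((0.2, 0.75), 0.4, 0.15, color='k', alpha=0.3)
# Ellipse takes its center, then full width and full height
ellipse = Ellipse((0.6, 0.3), width=0.5, height=0.2, color='r', alpha=0.3)

ax.add_patch(rect)
ax.add_patch(ellipse)

n_patches = len(ax.patches)  # patches currently attached to the subplot
```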

## Saving Plots to File

You can save the active figure to file using plt.savefig. This method is equivalent to
the figure object’s savefig instance method. For example, to save an SVG version of a
figure, you need only type:  
`plt.savefig('figpath.svg')`  
The file type is inferred from the file extension. So if you used .pdf instead, you
would get a PDF. There are a couple of important options that I use frequently for
publishing graphics: dpi, which controls the dots-per-inch resolution, and
bbox_inches, which can trim the whitespace around the actual figure. To get the
same plot as a PNG with minimal whitespace around the plot and at 400 DPI, you
would do:  
`plt.savefig('figpath.png', dpi=400, bbox_inches='tight')`  
savefig doesn’t have to write to disk; it can also write to any file-like object, such as a
BytesIO:  
`from io import BytesIO
buffer = BytesIO()
plt.savefig(buffer)
plot_data = buffer.getvalue()`  
![image.png](attachment:image.png)
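
A self-contained sketch of writing to an in-memory buffer. Since a buffer has no file extension to infer the type from, the format is passed explicitly here; the dpi value is illustrative. The check at the end only confirms that the bytes begin with the PNG signature.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; not needed inside a notebook
import matplotlib.pyplot as plt
from io import BytesIO

fig, ax = plt.subplots()
ax.plot([1.5, 3.5, -2, 1.6])

buffer = BytesIO()
# No filename, so state the format explicitly
fig.savefig(buffer, format='png', dpi=100, bbox_inches='tight')
plot_data = buffer.getvalue()

# Every PNG file starts with this 8-byte signature
is_png = plot_data.startswith(b'\x89PNG\r\n\x1a\n')
```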

## matplotlib Configuration

matplotlib comes configured with color schemes and defaults that are geared primarily
toward preparing figures for publication. Fortunately, nearly all of the default
behavior can be customized via an extensive set of global parameters governing figure
size, subplot spacing, colors, font sizes, grid styles, and so on. One way to modify the
configuration programmatically from Python is to use the rc method; for example, to
set the global default figure size to be 10 × 10, you could enter:
`plt.rc('figure', figsize=(10, 10))`  
The first argument to rc is the component you wish to customize, such as 'figure',
'axes', 'xtick', 'ytick', 'grid', 'legend', or many others. After that can follow a
sequence of keyword arguments indicating the new parameters. An easy way to write
down the options in your program is as a dict:  
`font_options = {'family': 'monospace', 'weight': 'bold', 'size': 'small'}
plt.rc('font', **font_options)`  
For more extensive customization and to see a list of all the options, matplotlib comes
with a configuration file matplotlibrc in the matplotlib/mpl-data directory. If you customize
this file and place it in your home directory titled .matplotlibrc, it will be
loaded each time you use matplotlib.
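
The effect of plt.rc can be checked through plt.rcParams, the dictionary-like object that holds the global settings. The sketch below also uses plt.rc_context for a temporary override and plt.rcdefaults to restore matplotlib's defaults; the sizes are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; not needed inside a notebook
import matplotlib.pyplot as plt

# Global change: affects every figure created afterwards
plt.rc('figure', figsize=(10, 10))
global_size = list(plt.rcParams['figure.figsize'])

# Temporary change: applies only inside the with block
with plt.rc_context({'figure.figsize': (4, 3)}):
    temp_size = list(plt.rcParams['figure.figsize'])

restored_size = list(plt.rcParams['figure.figsize'])

# Return to matplotlib's built-in defaults
plt.rcdefaults()
```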

# Plotting with pandas and seaborn
matplotlib can be a fairly low-level tool. You assemble a plot from its base components:
the data display (i.e., the type of plot: line, bar, box, scatter, contour, etc.), legend,
title, tick labels, and other annotations.
In pandas we may have multiple columns of data, along with row and column labels.
pandas itself has built-in methods that simplify creating visualizations from DataFrame
and Series objects. Another library is seaborn, a statistical graphics library created
by Michael Waskom. Seaborn simplifies creating many common visualization
types.  
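_In older seaborn versions (before 0.8), importing seaborn modified the default
matplotlib color schemes and plot styles to improve readability and aesthetics.
In later versions these styles are applied only when you call seaborn.set()
(named seaborn.set_theme() in newer releases). Even if you do not use the seaborn API,
this is a simple way to improve the visual aesthetics of general matplotlib plots._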

## Line Plots
Series and DataFrame each have a plot attribute for making some basic plot types. By
default, plot() makes line plots.

In [20]:
s = pd.Series(np.random.randn(10).cumsum(), index=np.arange(0, 100, 10))
s

0     0.747722
10    1.484124
20   -0.129029
30   -0.646146
40   -1.997061
50   -1.945959
60   -0.002310
70   -0.586768
80   -1.277649
90   -1.862676
dtype: float64

The Series object’s index is passed to matplotlib for plotting on the x-axis, though you
can disable this by passing use_index=False. The x-axis ticks and limits can be
adjusted with the xticks and xlim options, and y-axis respectively with yticks and ylim.

In [21]:
s.plot()

<IPython.core.display.Javascript object>

<AxesSubplot:>

See the table below for a full listing of plot options.
![image.png](attachment:image.png)

Most of pandas’s plotting methods accept an optional ax parameter, which can be a
matplotlib subplot object. This gives you more flexible placement of subplots in a grid
layout.
DataFrame’s plot method plots each of its columns as a different line on the same
subplot, creating a legend automatically:

In [24]:
df = pd.DataFrame(np.random.randn(10, 4).cumsum(0), 
                  columns=['A', 'B', 'C', 'D'], index=np.arange(0, 100, 10))
df.plot()

<IPython.core.display.Javascript object>

<AxesSubplot:>
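
The optional ax parameter mentioned above can be sketched as follows: one Series is drawn twice, each time on a chosen cell of a subplot grid. The grid shape, figure size, and titles are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; not needed inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

fig, axes = plt.subplots(1, 2, figsize=(8, 3))
s = pd.Series(np.random.randn(10).cumsum(), index=np.arange(0, 100, 10))

# Each call draws on the subplot passed in and returns that same object
left = s.plot(ax=axes[0], title='line')
right = s.plot.bar(ax=axes[1], title='bar')
```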

The plot attribute contains a “family” of methods for different plot types. For example, df.plot() is equivalent to df.plot.line().
![image.png](attachment:image.png)
![image-2.png](attachment:image-2.png)

## Bar Plots
The plot.bar() and plot.barh() methods make vertical and horizontal bar plots, respectively.
In this case, the Series or DataFrame index will be used as the x (bar) or y
(barh) ticks:

In [29]:
fig, axes = plt.subplots(2,1)
data = pd.Series(np.random.rand(16), index=list('abcdefghijklmnop'))
data.plot.bar(ax=axes[0], color='k', alpha=0.7)
data.plot.barh(ax=axes[1], color='k', alpha=0.7)

<IPython.core.display.Javascript object>

<AxesSubplot:>

The options color='k' and alpha=0.7 set the color of the plots to black and use partial
transparency on the filling.

With a DataFrame, bar plots group the values in each row together in a group of bars,
side by side, with one bar per column.

In [30]:
df = pd.DataFrame(np.random.rand(6, 4), 
                  index=['one', 'two', 'three', 'four', 'five', 'six'], 
                  columns=pd.Index(['A', 'B', 'C', 'D'], name='Genus'))
df

Genus,A,B,C,D
one,0.91549,0.657331,0.748207,0.942404
two,0.196607,0.52595,0.059539,0.751468
three,0.816425,0.210343,0.328233,0.888429
four,0.578146,0.519837,0.071736,0.28723
five,0.219798,0.225786,0.717291,0.251091
six,0.729386,0.639939,0.050955,0.40076


In [31]:
df.plot.bar()

<IPython.core.display.Javascript object>

<AxesSubplot:>

In [34]:
df.plot.barh(stacked=True)

<IPython.core.display.Javascript object>

<AxesSubplot:>

_A useful recipe for bar plots is to visualize a Series’s value frequency
using value_counts: s.value_counts().plot.bar()._
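
A short sketch of this recipe on a made-up Series of letters: value_counts tallies each distinct value, sorted from most to least frequent, and plot.bar draws one bar per value.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

s = pd.Series(list('aabbbc'))  # hypothetical categorical data

counts = s.value_counts()      # frequency of each distinct value
ax = counts.plot.bar(color='k', alpha=0.7)

n_bars = len(ax.patches)       # one rectangle per distinct value
```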

Returning to the tipping dataset used earlier in the book, suppose we wanted to make
a stacked bar plot showing the percentage of data points for each party size on each
day. I load the data using read_csv and make a cross-tabulation by day and party size:

In [5]:
tips = pd.read_csv('tips.csv')
tips.head()

,total_bill,tip,smoker,day,time,size
0,16.99,1.01,No,Sun,Dinner,2
1,10.34,1.66,No,Sun,Dinner,3
2,21.01,3.5,No,Sun,Dinner,3
3,23.68,3.31,No,Sun,Dinner,2
4,24.59,3.61,No,Sun,Dinner,4


In [36]:
party_counts = pd.crosstab(tips['day'], tips['size'])
party_counts

day / size,1,2,3,4,5,6
Fri,1,16,1,1,0,0
Sat,2,53,18,13,1,0
Sun,0,39,15,18,3,1
Thur,1,48,4,5,1,3


In [38]:
# Not many 1- and 6-person parties
party_counts = party_counts.loc[:, 2:5]
party_counts

day / size,2,3,4,5
Fri,16,1,1,0
Sat,53,18,13,1
Sun,39,15,18,3
Thur,48,4,5,1


In [39]:
# Normalize so that each row sums to 1 and then plot
party_pcts = party_counts.div(party_counts.sum(1), axis=0)
party_pcts

day / size,2,3,4,5
Fri,0.888889,0.055556,0.055556,0.0
Sat,0.623529,0.211765,0.152941,0.011765
Sun,0.52,0.2,0.24,0.04
Thur,0.827586,0.068966,0.086207,0.017241
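
The row normalization can be checked on a small table. The sketch below reuses the Friday and Saturday counts for party sizes 2 and 3 from the output above; dividing along axis=0 aligns the row totals with the row index, so each row ends up summing to 1.

```python
import numpy as np
import pandas as pd

# Subset of the counts shown above (party sizes 2 and 3 only)
counts = pd.DataFrame({2: [16, 53], 3: [1, 18]}, index=['Fri', 'Sat'])

row_totals = counts.sum(axis=1)         # Fri: 17, Sat: 71
pcts = counts.div(row_totals, axis=0)   # divide each row by its own total

rows_sum_to_one = np.allclose(pcts.sum(axis=1), 1.0)
```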


In [41]:
# Breakdown -
party_counts.sum(1)

day
Fri     18
Sat     85
Sun     75
Thur    58
dtype: int64

In [42]:
party_pcts.plot.bar()

<IPython.core.display.Javascript object>

<AxesSubplot:xlabel='day'>

So you can see that party sizes appear to increase on the weekend in this dataset.
With data that requires aggregation or summarization before making a plot, using the
seaborn package can make things much simpler. Let’s look now at the tipping percentage
by day with seaborn:

In [7]:
import seaborn as sns
tips['tip_pct'] = tips['tip'] / (tips['total_bill'] - tips['tip'])
tips.head()

,total_bill,tip,smoker,day,time,size,tip_pct
0,16.99,1.01,No,Sun,Dinner,2,0.063204
1,10.34,1.66,No,Sun,Dinner,3,0.191244
2,21.01,3.5,No,Sun,Dinner,3,0.199886
3,23.68,3.31,No,Sun,Dinner,2,0.162494
4,24.59,3.61,No,Sun,Dinner,4,0.172069


In [8]:
sns.barplot(x='tip_pct', y='day', data=tips, orient='h')

<IPython.core.display.Javascript object>

<AxesSubplot:xlabel='tip_pct', ylabel='day'>

Tipping percentage by day with error bars

Plotting functions in seaborn take a data argument, which can be a pandas DataFrame.
The other arguments refer to column names. Because there are multiple
observations for each value in the day, the bars are the average value of tip_pct. The
black lines drawn on the bars represent the 95% confidence interval (this can be configured
through optional arguments).

seaborn.barplot has a hue option that enables us to split by an additional categorical
value:

In [9]:
sns.barplot(x='tip_pct', y='day', hue='time', data=tips, orient='h')

<IPython.core.display.Javascript object>

<AxesSubplot:xlabel='tip_pct', ylabel='day'>

Notice that seaborn has automatically changed the aesthetics of plots: the default
color palette, plot background, and grid line colors. You can switch between different
plot appearances using seaborn.set:  

In [11]:
sns.set(style="whitegrid")

## Histograms and Density Plots
A histogram is a kind of bar plot that gives a discretized display of value frequency.
The data points are split into discrete, evenly spaced bins, and the number of data
points in each bin is plotted. Using the tipping data from before, we can make a histogram
of tip percentages of the total bill using the plot.hist method on the Series:

In [12]:
tips['tip_pct'].plot.hist(bins=50)

<IPython.core.display.Javascript object>

<AxesSubplot:ylabel='Frequency'>

In [13]:
tips['tip_pct'].plot.density()

<IPython.core.display.Javascript object>

<AxesSubplot:ylabel='Density'>

Seaborn makes histograms and density plots even easier through its distplot
function, which can plot both a histogram and a continuous density estimate simultaneously
(distplot is deprecated as of seaborn 0.11 in favor of histplot and displot).
As an example, consider a bimodal distribution consisting of draws from
two different normal distributions:

In [14]:
comp1 = np.random.normal(0, 1, size=200)
comp2 = np.random.normal(10, 2, size=200)
values = pd.Series(np.concatenate([comp1, comp2]))
sns.distplot(values, bins=100, color='k')



<IPython.core.display.Javascript object>

<AxesSubplot:ylabel='Density'>

## Scatter or Point Plots

Point plots or scatter plots can be a useful way of examining the relationship between
two one-dimensional data series. For example, here we load the macrodata dataset
from the statsmodels project, select a few variables, then compute log differences:

In [16]:
macro = pd.read_csv('macrodata.csv')
data = macro[['cpi', 'm1', 'tbilrate', 'unemp']]
trans_data = np.log(data).diff().dropna()
trans_data[-5:]

,cpi,m1,tbilrate,unemp
198,-0.007904,0.045361,-0.396881,0.105361
199,-0.021979,0.066753,-2.277267,0.139762
200,0.00234,0.010286,0.606136,0.160343
201,0.008419,0.037461,-0.200671,0.127339
202,0.008894,0.012202,-0.405465,0.04256
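
The log-difference transform can be verified on a tiny made-up table (the values below are hypothetical, not taken from macrodata.csv). Differencing the logs of a series is the same as taking the log of the ratio between consecutive values.

```python
import numpy as np
import pandas as pd

# Hypothetical values for two of the columns
data = pd.DataFrame({'cpi': [100.0, 102.0, 101.0],
                     'm1': [50.0, 55.0, 60.5]})

# Same chain of calls as in the cell above
log_diff = np.log(data).diff().dropna()

# Equivalent formulation: log of the period-over-period ratio
log_ratio = np.log(data / data.shift(1)).dropna()

matches = np.allclose(log_diff, log_ratio)
```

The first row is dropped because it has no previous value to difference against.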


We can then use seaborn’s regplot function, which makes a scatter plot and fits a linear
regression line:

In [18]:
sns.regplot(x='m1', y='unemp', data=trans_data)
plt.title('Changes in log %s versus log %s' % ('m1', 'unemp'))



<IPython.core.display.Javascript object>

Text(0.5, 1.0, 'Changes in log m1 versus log unemp')

In exploratory data analysis it’s helpful to be able to look at all the scatter plots among
a group of variables; this is known as a pairs plot or scatter plot matrix. Making such a
plot from scratch is a bit of work, so seaborn has a convenient pairplot function,
which supports placing histograms or density estimates of each variable along the
diagonal:

In [19]:
sns.pairplot(trans_data, diag_kind='kde', plot_kws={'alpha': 0.2})

<IPython.core.display.Javascript object>

<seaborn.axisgrid.PairGrid at 0x27acdbda550>

You may notice the plot_kws argument. This enables us to pass down configuration
options to the individual plotting calls on the off-diagonal elements. Check out the
seaborn.pairplot docstring for more granular configuration options.

## Facet Grids and Categorical Data

What about datasets where we have additional grouping dimensions? One way to visualize
data with many categorical variables is to use a facet grid. Seaborn has a useful
built-in function catplot (called factorplot before seaborn 0.9) that simplifies making
many kinds of faceted plots:

In [20]:
sns.catplot(x='day', y='tip_pct', hue='time', col='smoker',
            kind='bar', data=tips[tips.tip_pct < 1])



<IPython.core.display.Javascript object>

<seaborn.axisgrid.FacetGrid at 0x27acdbefd00>

Instead of grouping by 'time' by different bar colors within a facet, we can also
expand the facet grid by adding one row per time value:

In [21]:
# catplot is the current name of factorplot (renamed in seaborn 0.9)
sns.catplot(x='day', y='tip_pct', row='time', col='smoker',
            kind='bar', data=tips[tips.tip_pct < 1])



<IPython.core.display.Javascript object>

<seaborn.axisgrid.FacetGrid at 0x27acdbe1250>

This function supports other plot types that may be useful depending on what you are
trying to display. For example, box plots (which show the median, quartiles, and outliers)
can be an effective visualization type:

In [24]:
# kind='box' draws one box plot per day; catplot replaces the older factorplot
sns.catplot(x='tip_pct', y='day', kind='box',
            data=tips[tips.tip_pct < 0.5])



<IPython.core.display.Javascript object>

<seaborn.axisgrid.FacetGrid at 0x27ad156b8e0>

# Other Python Visualization Tools
As is common with open source, there are a plethora of options for creating graphics
in Python (too many to list). Since 2010, much development effort has been focused
on creating interactive graphics for publication on the web. With tools like Bokeh and
Plotly, it’s now possible to specify dynamic, interactive graphics in Python that are
destined for a web browser.
For creating static graphics for print or web, I recommend defaulting to matplotlib
and add-on libraries like pandas and seaborn for your needs. For other data visualization
requirements, it may be useful to learn one of the other available tools out there.
I encourage you to explore the ecosystem as it continues to evolve and innovate into
the future.

#### *Note - Most of the contents like images, examples, statements, etc in my notebooks / notes belongs to author "Wes McKinney" of book "Python for Data Analysis". I have collected / integrated them for study purpose and I don't own it.*