# Basic Plotting with matplotlib

#### Alberto Cairo: The Functional Art
#### Edward R. Tufte: The Visual Display of Quantitative Information

### Alberto Cairo's visualization wheel for design

#### Complex & deeper [generally used by scientists and engineers]

    Abstraction
    Functionality
    Density
    Multidimensionality
    Originality
    Novelty

#### Intelligible & shallower [generally used by artists and journalists]

    Figuration
    Decoration
    Lightness
    Unidimensionality
    Familiarity
    Redundancy

#### Graphical heuristics
Graphical heuristics are not a procedure or a science to be followed; they are conventions of good practice to keep in mind while plotting data.

Graphical heuristics mainly concentrate on two subtopics:
    
    Data-ink ratio
    Chart junk

The data-ink ratio is the share of a plot's ink that is used to present the data itself. The goal is a high data-ink ratio, which is achieved by removing the elements of the plot that add no value or information.
Examples: background colour, borders, grids, colours on the observations, legends, etc.
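
As a rough illustration of raising the data-ink ratio, the sketch below draws a small bar chart and then strips the top and right borders, the tick marks and the grid. The sample values and the names `values`, `fig` and `ax` are made up for this example, and which elements count as junk remains a judgment call for each plot.

```python
import matplotlib.pyplot as plt

values = [3, 5, 2, 7]  # made-up sample data, not from the notebook

fig, ax = plt.subplots()
ax.bar(range(len(values)), values, color="gray")

# hide the top and right borders (spines), which carry no data
for side in ("top", "right"):
    ax.spines[side].set_visible(False)

# drop the tick marks but keep the tick labels
ax.tick_params(axis="both", length=0)

# make sure no grid is drawn
ax.grid(False)
```

The left and bottom spines are kept here so the bars still have a visible baseline.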


Matplotlib is a powerful open-source toolkit for representing and visualizing data. It was created by John Hunter.

To enable web-based rendering we use

    %matplotlib notebook

There are many ways to render the output of matplotlib. Since we are using the web-based Jupyter notebook, we use the IPython magic %matplotlib notebook.
Remember that in the Jupyter Notebook, IPython magics are just helper functions that set up the environment so that web-based rendering can be enabled.

### Matplotlib Architecture

#### Backend

    Deals with the rendering of plots to the screen or to files
    In the Jupyter notebook the inline backend is commonly used (this notebook uses the interactive nbAgg backend instead)
    There are also hard-copy backends, which support rendering to graphics formats such as scalable vector graphics (SVG) or PNG
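
To make the idea of hard-copy backends concrete, here is a minimal sketch that renders one figure to both PNG and SVG in memory. The data and the buffer names (`png_buffer`, `svg_buffer`) are assumptions made for this example; it only relies on `Figure.savefig` with the `format` argument.

```python
import io

from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.figure import Figure

fig = Figure()
FigureCanvasAgg(fig)  # attach a raster (Agg) canvas to the figure
ax = fig.add_subplot(111)
ax.plot([1, 2, 3], [1, 4, 9], "-o")

# render the same figure to two hard-copy formats, in memory
png_buffer = io.BytesIO()
svg_buffer = io.BytesIO()
fig.savefig(png_buffer, format="png")
fig.savefig(svg_buffer, format="svg")

png_bytes = png_buffer.getvalue()
svg_bytes = svg_buffer.getvalue()
print(len(png_bytes), len(svg_bytes))
```

Passing a file name instead of a buffer writes the output to disk in the same way.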
    
#### Artist layer

    Contains containers such as Figure, Subplot and Axes
    Contains primitives, such as Line2D and Rectangle, and collections, such as PathCollection
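
The containers and primitives listed above can be inspected directly. The following is a small sketch with made-up data; the variable names are assumptions for illustration only.

```python
from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.collections import PathCollection
from matplotlib.figure import Figure
from matplotlib.lines import Line2D
from matplotlib.patches import Rectangle

fig = Figure()             # container
FigureCanvasAgg(fig)       # backend canvas the figure draws onto
ax = fig.add_subplot(111)  # container (an Axes)

line, = ax.plot([1, 2, 3], [1, 2, 3])      # primitive: Line2D
points = ax.scatter([1, 2, 3], [3, 2, 1])  # collection: PathCollection
bars = ax.bar([1, 2, 3], [1, 1, 1])        # container of Rectangle patches

print(type(line).__name__)
print(type(points).__name__)
print([type(bar).__name__ for bar in bars])
```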
    
#### Scripting layer
    
    Simplifies the access to the artist and backend layers i.e. pyplot
    The pyplot scripting layer is a procedural method for building a visualization

In [137]:
%matplotlib notebook

In [138]:
import matplotlib as mpl
import matplotlib.pyplot as plt  # pyplot is used as plt in the cells below
mpl.get_backend()

'nbAgg'

In [139]:
plt.plot?

In [140]:
plt.plot(3,2)

<IPython.core.display.Javascript object>

[<matplotlib.lines.Line2D at 0x15ba4410>]

In [141]:
plt.plot(3, 2, '*')

[<matplotlib.lines.Line2D at 0x15b9f190>]

Here plt.plot takes *args, i.e. a variable number of arguments, given as pairs of x and y values.

The optional third argument is a format string that sets how the data point is drawn, for example '*' for a star marker.
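
As a quick sketch of the variable-argument form, a single call can carry several x, y, format-string groups, and each group becomes its own line object. The data values below are made up for the example.

```python
import matplotlib.pyplot as plt

plt.figure()
# two (x, y, format string) groups in a single call
lines = plt.plot([1, 2, 3], [1, 4, 9], "*",
                 [1, 2, 3], [2, 3, 4], "o")

print(len(lines))
print(lines[0].get_marker(), lines[1].get_marker())
```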

The plot is interactive because of the %matplotlib notebook magic. Other backends, such as inline, are selected with the %matplotlib inline magic instead. The inline backend is not interactive and draws a new static plot in the output of each cell.

The pyplot scripting layer manages a lot of objects for us. It keeps track of the latest figure, its subplots, the axes objects and so on, so there is no need to interact with the artist layer directly; the pyplot module does all the bookkeeping needed for plotting.
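
The bookkeeping can be observed with `plt.gcf()` and `plt.gca()`, which return the figure and axes that pyplot currently considers active. This is only a small sketch with made-up data; the names `same_figure` and `line_in_axes` are introduced here for illustration.

```python
import matplotlib.pyplot as plt

fig = plt.figure()
line, = plt.plot([1, 2, 3], [2, 4, 6], "-o")

current_fig = plt.gcf()  # the figure pyplot is tracking
current_ax = plt.gca()   # the axes pyplot is tracking

same_figure = current_fig is fig
line_in_axes = line in current_ax.get_lines()
print(same_figure, line_in_axes)
```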

In [143]:
plt.plot(2.9,1.77, '.')
new = plt.gca()
new.axis([0,5,0,6])

[0, 5, 0, 6]

Let's see how to make a plot without using the scripting layer.

In [144]:
# First let's set the backend without using mpl.use() from the scripting layer
from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.figure import Figure

# create a new figure
fig = Figure()

# associate fig with the backend
canvas = FigureCanvasAgg(fig)

# add a subplot to the fig
ax = fig.add_subplot(111)

# plot the point (3,2)
ax.plot(3, 2, '.')

# save the figure to test.png
# you can see this figure in your Jupyter workspace afterwards by going to
# https://hub.coursera-notebooks.org/
canvas.print_png('test.png')

In [145]:
# create a new figure
plt.figure()

# plot the point (3,2) using the circle marker
plt.plot(3, 2, '-o')

# get the current axes
ax = plt.gca()

# Set axis properties [xmin, xmax, ymin, ymax]
ax.axis([0,6,0,10])

<IPython.core.display.Javascript object>

[0, 6, 0, 10]

In [146]:
# create a new figure
plt.figure()

# plot the point (1.5, 1.5) using the circle marker
plt.plot(1.5, 1.5, 'o')
# plot the point (2, 2) using the point marker
plt.plot(2, 2, '.')
# plot the point (2.5, 2.5) using the star marker
plt.plot(2.5, 2.5, '*')
k = plt.gca()
k.axis([0,5,0,5])

<IPython.core.display.Javascript object>

[0, 5, 0, 5]

In [147]:
# get current axes
ax = plt.gca()
ax.axis([0,6,0,6])
# get all the child objects the axes contains
ax.get_children()

[<matplotlib.lines.Line2D at 0x15c813b0>,
 <matplotlib.lines.Line2D at 0x15c81b70>,
 <matplotlib.lines.Line2D at 0x15c81ff0>,
 <matplotlib.spines.Spine at 0x15c9b570>,
 <matplotlib.spines.Spine at 0x15c9b350>,
 <matplotlib.spines.Spine at 0x15c9b6f0>,
 <matplotlib.spines.Spine at 0x15c9b890>,
 <matplotlib.axis.XAxis at 0x15c8fff0>,
 <matplotlib.axis.YAxis at 0x15c9bd50>,
 <matplotlib.text.Text at 0x1586d450>,
 <matplotlib.text.Text at 0x1586d430>,
 <matplotlib.text.Text at 0x1586d9d0>,
 <matplotlib.patches.Rectangle at 0x1586d190>]

### Scatterplots

In [148]:
import numpy as np

x = np.array([1,2,3,4,5,6,7,8])
y = x

plt.figure()
plt.scatter(x, y) # similar to plt.plot(x, y, '.'), but the underlying child objects in the axes are not Line2D

<IPython.core.display.Javascript object>

<matplotlib.collections.PathCollection at 0x15c43410>

In [149]:
import numpy as np

x = np.array([1,2,3,4,5,6,7,8])
y = x

# create a list of colors for each point to have
# ['green', 'green', 'green', 'green', 'green', 'green', 'green', 'red']
colors = ['green']*(len(x)-1)
colors.append('red')

plt.figure()

# plot the point with size 100 and chosen colors
plt.scatter(x, y, s=100, c=colors)

<IPython.core.display.Javascript object>

<matplotlib.collections.PathCollection at 0x15c2dcd0>

In [150]:
# convert the two lists into a list of pairwise tuples
zip_generator = zip([1,2,3,4,5], [6,7,8,9,10])

print(list(zip_generator))
# the above prints:
# [(1, 6), (2, 7), (3, 8), (4, 9), (5, 10)]

zip_generator = zip([1,2,3,4,5], [6,7,8,9,10])
# The single star * unpacks a collection into positional arguments
print(list(zip(*zip_generator)))
# the above prints:
# [(1, 2, 3, 4, 5), (6, 7, 8, 9, 10)]

[(1, 6), (2, 7), (3, 8), (4, 9), (5, 10)]
[(1, 2, 3, 4, 5), (6, 7, 8, 9, 10)]


In [151]:
# use zip to convert 5 tuples with 2 elements each to 2 tuples with 5 elements each
print(list(zip((1, 6), (2, 7), (3, 8), (4, 9), (5, 10))))
# the above prints:
# [(1, 2, 3, 4, 5), (6, 7, 8, 9, 10)]


zip_generator = zip([1,2,3,4,5], [6,7,8,9,10])
# let's turn the data back into 2 lists
x, y = zip(*zip_generator) # This is like calling zip((1, 6), (2, 7), (3, 8), (4, 9), (5, 10))
print(x)
print(y)
# the above prints:
# (1, 2, 3, 4, 5)
# (6, 7, 8, 9, 10)

[(1, 2, 3, 4, 5), (6, 7, 8, 9, 10)]
(1, 2, 3, 4, 5)
(6, 7, 8, 9, 10)


In [152]:
plt.figure()
# plot a data series 'Tall students' in red using the first two elements of x and y
plt.scatter(x[:2], y[:2], s=100, c='red', label='Tall students')
# plot a second data series 'Short students' in blue using the last three elements of x and y 
plt.scatter(x[2:], y[2:], s=100, c='blue', label='Short students')

<IPython.core.display.Javascript object>

<matplotlib.collections.PathCollection at 0x186c0d30>

In [153]:
# add a label to the x axis
plt.xlabel('The number of times the child kicked a ball')
# add a label to the y axis
plt.ylabel('The grade of the student')
# add a title
plt.title('Relationship between ball kicking and grades')

<matplotlib.text.Text at 0x169cf630>

In [154]:
# add a legend (uses the labels from plt.scatter)
plt.legend()

<matplotlib.legend.Legend at 0x15b63e10>

In [None]:
# add the legend at loc=2 (the upper left hand corner), also gets rid of the frame and adds a title
plt.legend(loc=2, frameon=False, title='Legend')

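Here the loc codes 1 to 4 follow the quadrants of the Cartesian plane:
    
    1 for the 1st quadrant (upper right)
    2 for the 2nd quadrant (upper left)
    3 for the 3rd quadrant (lower left) and
    4 for the 4th quadrant (lower right)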
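
The numeric codes also have string equivalents, which tend to be easier to read. The sketch below keeps its own small lookup table, `LOC_NAMES`, which is a helper made up for this example and not part of matplotlib; the scatter data is also illustrative.

```python
import matplotlib.pyplot as plt

# hypothetical helper: quadrant-style code -> matplotlib location string
LOC_NAMES = {
    1: "upper right",
    2: "upper left",
    3: "lower left",
    4: "lower right",
}

fig, ax = plt.subplots()
ax.scatter([1, 2], [6, 7], s=100, c="red", label="Tall students")
ax.scatter([3, 4, 5], [8, 9, 10], s=100, c="blue", label="Short students")

# same placement as loc=2, written with the string form
legend = ax.legend(loc=LOC_NAMES[2], frameon=False, title="Legend")
labels = [text.get_text() for text in legend.get_texts()]
print(labels)
```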

In [159]:
# get children from current axes (the legend is the second to last item in this list)
plt.gca().get_children()

[<matplotlib.collections.PathCollection at 0x186c0610>,
 <matplotlib.collections.PathCollection at 0x186c0d30>,
 <matplotlib.spines.Spine at 0x169b7d10>,
 <matplotlib.spines.Spine at 0x169b7ad0>,
 <matplotlib.spines.Spine at 0x169b7bf0>,
 <matplotlib.spines.Spine at 0x169b7e30>,
 <matplotlib.axis.XAxis at 0x169b7f30>,
 <matplotlib.axis.YAxis at 0x169b7250>,
 <matplotlib.text.Text at 0x169cf630>,
 <matplotlib.text.Text at 0x169cf670>,
 <matplotlib.text.Text at 0x169cf6b0>,
 <matplotlib.legend.Legend at 0x186cd710>,
 <matplotlib.patches.Rectangle at 0x169cf6d0>]

In [160]:
# get the legend from the current axes
legend = plt.gca().get_children()[-2]
legend

<matplotlib.legend.Legend at 0x186cd710>

In [161]:
# you can use get_children to navigate through the child artists
legend.get_children()[0].get_children()[1].get_children()[0].get_children()

[<matplotlib.offsetbox.HPacker at 0x186dc7d0>,
 <matplotlib.offsetbox.HPacker at 0x186dc7f0>]

In [162]:
# import the artist class from matplotlib
from matplotlib.artist import Artist

def rec_gc(art, depth=0):
    if isinstance(art, Artist):
        # indent by the current depth for pretty printing
        print("  " * depth + str(art))
        for child in art.get_children():
            # recurse into each child with an increased depth
            rec_gc(child, depth+2)

# Call this function on the legend artist to see what the legend is made up of
rec_gc(plt.legend())

Legend
    <matplotlib.offsetbox.VPacker object at 0x186E7990>
        <matplotlib.offsetbox.TextArea object at 0x186E7810>
            Text(0,0,'None')
        <matplotlib.offsetbox.HPacker object at 0x186E71B0>
            <matplotlib.offsetbox.VPacker object at 0x186E71D0>
                <matplotlib.offsetbox.HPacker object at 0x186E77B0>
                    <matplotlib.offsetbox.DrawingArea object at 0x186E7370>
                        <matplotlib.collections.PathCollection object at 0x186E7490>
                    <matplotlib.offsetbox.TextArea object at 0x186E71F0>
                        Text(0,0,'Tall students')
                <matplotlib.offsetbox.HPacker object at 0x186E77D0>
                    <matplotlib.offsetbox.DrawingArea object at 0x186E7650>
                        <matplotlib.collections.PathCollection object at 0x186E7770>
                    <matplotlib.offsetbox.TextArea object at 0x186E74D0>
                        Text(0,0,'Short students')
    FancyBboxPatch

### Line Plots

In [163]:
import numpy as np

linear_data = np.array([1,2,3,4,5,6,7,8])
exponential_data = linear_data**2

plt.figure()
# plot the linear data and the exponential data
plt.plot(linear_data, '-o',  exponential_data, '-o')

<IPython.core.display.Javascript object>

[<matplotlib.lines.Line2D at 0x18975a10>,
 <matplotlib.lines.Line2D at 0x18975b10>]

The difference between a normal plot and a line plot is as follows:
    
    Both use the same plt.plot function. For a normal plot the arguments are x, y and a format string:
    plt.plot(x, y, string)
    For a line plot the format string starts with '-', and the arguments can be y1, -string, y2, -string, where y1 and y2 are lists or arrays of numbers:
    plt.plot(y1, -string, y2, -string)
    In this case the index of each value acts as the other half of the pair, i.e. the points are (index, y1) and (index, y2).
    The x values can also be given explicitly as x1, y1, -string, x2, y2, -string:
    plt.plot(x1, y1, -string, x2, y2, -string)
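
To check the index behaviour described above, the x data that matplotlib actually used can be read back from the line object. This is a sketch with made-up values; `x_used` is a name introduced for this example.

```python
import matplotlib.pyplot as plt
import numpy as np

values = np.array([10, 20, 30, 40])  # only y values are supplied

plt.figure()
line, = plt.plot(values, "-o")

# matplotlib fills in the x values from the positions of the y values
x_used = line.get_xdata()
print(x_used)
```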

In [171]:
# plot another series with a dashed red line
plt.plot([22,44,55], '--r')

[<matplotlib.lines.Line2D at 0x19138f50>]

In [172]:
plt.xlabel('Some data')
plt.ylabel('Some other data')
plt.title('A title')
# add a legend with legend entries (because we didn't have labels when we plotted the data series)
plt.legend(['Baseline', 'Competition', 'Us'])

<matplotlib.legend.Legend at 0x1967b7b0>

In [173]:
# fill the area between the linear data and exponential data
plt.gca().fill_between(range(len(linear_data)), 
                       linear_data, exponential_data, 
                       facecolor='blue', 
                       alpha=0.25)

<matplotlib.collections.PolyCollection at 0x196c5970>

In [174]:
plt.figure()

observation_dates = np.arange('2017-01-01', '2017-01-09', dtype='datetime64[D]')
print(observation_dates)
plt.plot(observation_dates, linear_data, '-o',  observation_dates, exponential_data, '-o')

<IPython.core.display.Javascript object>

['2017-01-01' '2017-01-02' '2017-01-03' '2017-01-04' '2017-01-05'
 '2017-01-06' '2017-01-07' '2017-01-08']


[<matplotlib.lines.Line2D at 0x196fb9b0>,
 <matplotlib.lines.Line2D at 0x19711470>]

In [175]:
import pandas as pd

plt.figure()
observation_dates = np.arange('2017-01-01', '2017-01-09', dtype='datetime64[D]')
observation_dates = list(map(pd.to_datetime, observation_dates)) 
plt.plot(observation_dates, linear_data, '-o',  observation_dates, exponential_data, '-o')

<IPython.core.display.Javascript object>

[<matplotlib.lines.Line2D at 0x1999f110>,
 <matplotlib.lines.Line2D at 0x199afcb0>]

In [176]:
x = plt.gca().xaxis

# rotate the tick labels for the x axis
for item in x.get_ticklabels():
    item.set_rotation(45)

In [177]:
# adjust the subplot so the text doesn't run off the image
plt.subplots_adjust(bottom=0.25)

In [178]:
ax = plt.gca()
ax.set_xlabel('Date')
ax.set_ylabel('Units')
ax.set_title('Exponential vs. Linear performance')

<matplotlib.text.Text at 0x1999f930>

In [179]:
# you can add mathematical expressions in any text element
ax.set_title("Exponential ($x^2$) vs. Linear ($x$) performance")

<matplotlib.text.Text at 0x1999f930>

In [180]:
x = np.array([1,2,3,4,5])
y = np.array([6,7,8,3,10])
plt.figure()
plt.plot(x,y,'*')
myplot = plt.gca()
myplot.axis([0,6,0,11])

<IPython.core.display.Javascript object>

[0, 6, 0, 11]

In [181]:
x = np.array([1,2,3,4,5])
y = np.array([6,7,8,3,10])
plt.figure()
plt.plot(x,y,'-*')
myplot = plt.gca()
myplot.axis([0,6,0,11])

<IPython.core.display.Javascript object>

[0, 6, 0, 11]

The difference between a regular plot and a line plot is the third argument, the format string: if the string contains a -, lines are drawn between the points, provided there is more than one value for x and y.
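
The effect of the `-` can be confirmed by asking each line for its line style. This small sketch reuses the x and y values from the two cells above; the names `markers_only` and `with_line` are introduced here for illustration.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [6, 7, 8, 3, 10]

plt.figure()
markers_only, = plt.plot(x, y, "*")   # markers, no connecting line
with_line, = plt.plot(x, y, "-*")     # markers joined by a solid line

print(markers_only.get_linestyle())
print(with_line.get_linestyle())
```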

### Bar Charts

In [182]:
plt.figure()
xvals = range(len(linear_data))
plt.bar(xvals, linear_data, width = 0.3)

<IPython.core.display.Javascript object>

<Container object of 8 artists>

In [183]:
new_xvals = []

# plot another set of bars, adjusting the new xvals to make up for the first set of bars plotted
for item in xvals:
    new_xvals.append(item+0.3)

plt.bar(new_xvals, exponential_data, width = 0.3 ,color='pink')

<Container object of 8 artists>

In [184]:
mydata = [3,2,6,2,6,8,9,3]
another = []
for i in xvals:
    another.append(i+0.6)
    
plt.bar(another, mydata, width = 0.3, color = 'yellow')

<Container object of 8 artists>

In [185]:
plt.figure()
xvals = range(len(linear_data))
plt.bar(xvals, linear_data, width = 0.3)

new_xvals = []

# plot another set of bars, adjusting the new xvals to make up for the first set of bars plotted
for item in xvals:
    new_xvals.append(item+0.3)

plt.bar(new_xvals, exponential_data, width = 0.3 ,color='pink')

from random import randint
linear_err = [randint(0,15) for x in range(len(linear_data))] 

# This will plot a new set of bars with errorbars using the list of random error values
plt.bar(xvals, linear_data, width = 0.3, yerr=linear_err)

<IPython.core.display.Javascript object>

<Container object of 8 artists>

In [186]:
# stacked bar charts are also possible
plt.figure()
xvals = range(len(linear_data))
plt.bar(xvals, linear_data, width = 0.3, color='b')
plt.bar(xvals, exponential_data, width = 0.3, bottom=linear_data, color='r')

<IPython.core.display.Javascript object>

<Container object of 8 artists>

In [187]:
# or use barh for horizontal bar charts
plt.figure()
xvals = range(len(linear_data))
plt.barh(xvals, linear_data, height = 0.3, color='b')
plt.barh(xvals, exponential_data, height = 0.3, left=linear_data, color='r')

<IPython.core.display.Javascript object>

<Container object of 8 artists>

###  Subplots

In [188]:
%matplotlib notebook

import matplotlib.pyplot as plt
import numpy as np

plt.subplot?

Subplot takes three arguments:
    
    1st is the number of rows
    2nd is the number of columns
    3rd is the index of the current axes
    The grid is divided accordingly, i.e. if the command is 
    plt.subplot(2,2,1), then 4 plots in total can be displayed, 2 in the 1st row and 2 in the 2nd row, with the current axes pointing to the 1st plot
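
As a sketch of the 2 by 2 case described above, the loop below visits each position in the grid in turn. The titles are made up for the example.

```python
import matplotlib.pyplot as plt

fig = plt.figure()

# positions are numbered from 1, row by row
for position in range(1, 5):
    ax = plt.subplot(2, 2, position)
    ax.set_title("subplot {}".format(position))

titles = [a.get_title() for a in fig.axes]
print(titles)
```

After the loop, the current axes is the last one created, so a plain `plt.plot(...)` call would draw into the 4th subplot.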

In [189]:
plt.figure()
# subplot with 1 row, 2 columns, and current axis is 1st subplot axes
plt.subplot(1, 2, 1)

linear_data = np.array([1,2,3,4,5,6,7,8])

plt.plot(linear_data, '-o')

<IPython.core.display.Javascript object>

[<matplotlib.lines.Line2D at 0x1acc6bf0>]

In [190]:
exponential_data = linear_data**2 

# subplot with 1 row, 2 columns, and current axis is 2nd subplot axes
plt.subplot(1, 2, 2)
plt.plot(exponential_data, '-o')

[<matplotlib.lines.Line2D at 0x1a9df330>]

To modify a previously created plot, make it the current plot again by calling plt.subplot(nrows, ncolumns, plot number)
    
    plt.subplot(1,2,1) in the above case to make changes to the 1st plot

In [191]:
# plot exponential data on 1st subplot axes
plt.subplot(1, 2, 1)
plt.plot(exponential_data, '-x')

[<matplotlib.lines.Line2D at 0x1af71930>]

Using the code in the cell above we added the exponential data points to the plot that already held the linear data. Note that the y axis of the 1st plot was rescaled to accommodate the exponential data points, so a plot's axes are vulnerable to change whenever more data is added. In cases where the axis range should stay consistent between plots, the x axis, the y axis or both can be shared between subplots.
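
A minimal sketch of what sharing means in practice: with a shared y axis, changing the limits on one subplot changes them on the other as well. This version uses `plt.subplots(..., sharey=True)`, and the limit values are picked arbitrarily for the example.

```python
import matplotlib.pyplot as plt
import numpy as np

linear_data = np.array([1, 2, 3, 4, 5, 6, 7, 8])
exponential_data = linear_data ** 2

fig, (ax1, ax2) = plt.subplots(1, 2, sharey=True)
ax1.plot(linear_data, "-o")
ax2.plot(exponential_data, "-x")

# setting the limits on one axes also applies to the axes sharing with it
ax2.set_ylim(0, 70)
print(ax1.get_ylim(), ax2.get_ylim())
```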

In [192]:
plt.figure()
ax1 = plt.subplot(1, 2, 1)
plt.plot(linear_data, '-o')
# pass sharey=ax1 to ensure the two subplots share the same y axis
ax2 = plt.subplot(1, 2, 2, sharey=ax1)
plt.plot(exponential_data, '-x')

<IPython.core.display.Javascript object>

[<matplotlib.lines.Line2D at 0x1afc69b0>]

In [193]:
plt.figure()
# the right hand side is equivalent shorthand syntax
plt.subplot(1,2,1) == plt.subplot(121)

<IPython.core.display.Javascript object>

True

The right hand side is an equivalent shorthand syntax for subplot; it only works when the number of rows, the number of columns and the plot number are each a single digit.

In [194]:
# create a 3x3 grid of subplots
fig, ((ax1,ax2,ax3), (ax4,ax5,ax6), (ax7,ax8,ax9)) = plt.subplots(3, 3, sharex=True, sharey=True)
# plot the linear_data on the 5th subplot axes 
ax5.plot(linear_data, '-')

<IPython.core.display.Javascript object>

[<matplotlib.lines.Line2D at 0x1b669830>]

In [195]:
# set inside tick labels to visible
for ax in plt.gcf().get_axes():
    for label in ax.get_xticklabels() + ax.get_yticklabels():
        label.set_visible(True)

In [196]:
# necessary on some systems to update the plot
plt.gcf().canvas.draw()

In [197]:
#plt.figure()
plt.subplots(3, 3, sharex=True, sharey=True)

<IPython.core.display.Javascript object>

(<matplotlib.figure.Figure at 0x1b937610>,
 array([[<matplotlib.axes._subplots.AxesSubplot object at 0x1B937270>,
         <matplotlib.axes._subplots.AxesSubplot object at 0x1B95FB70>,
         <matplotlib.axes._subplots.AxesSubplot object at 0x1B98AE90>],
        [<matplotlib.axes._subplots.AxesSubplot object at 0x19C72A10>,
         <matplotlib.axes._subplots.AxesSubplot object at 0x1999B170>,
         <matplotlib.axes._subplots.AxesSubplot object at 0x15C85CD0>],
        [<matplotlib.axes._subplots.AxesSubplot object at 0x15F4B090>,
         <matplotlib.axes._subplots.AxesSubplot object at 0x150C8230>,
         <matplotlib.axes._subplots.AxesSubplot object at 0x157C7530>]], dtype=object))

In [202]:
#plt.figure()
plt.subplots(3, 3, sharex=True, sharey=True)
ax6 = plt.subplot(3,3,6)
x = [1,2,3]
y = [4,5,6]
ax6 = plt.plot(x,y, '-o')
b = plt.gca().xaxis
for item in b.get_ticklabels():
    item.set_rotation(45)
b = plt.gca()
b.axis([0,4,0,7])

<IPython.core.display.Javascript object>

[0, 4, 0, 7]

###  Histograms

In [206]:
# create 2x2 grid of axis subplots
fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, sharex=True)
axs = [ax1,ax2,ax3,ax4]

# draw n = 10, 100, 1000, and 10000 samples from the normal distribution and plot corresponding histograms
for n in range(0,len(axs)):
    sample_size = 10**(n+1)
    sample = np.random.normal(loc=0.0, scale=1.0, size=sample_size)
    axs[n].hist(sample)
    axs[n].set_title('n={}'.format(sample_size))

<IPython.core.display.Javascript object>

In [207]:
# repeat with number of bins set to 100
fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, sharex=True)
axs = [ax1,ax2,ax3,ax4]

for n in range(0,len(axs)):
    sample_size = 10**(n+1)
    sample = np.random.normal(loc=0.0, scale=1.0, size=sample_size)
    axs[n].hist(sample, bins=100)
    axs[n].set_title('n={}'.format(sample_size))

<IPython.core.display.Javascript object>

In [208]:
plt.figure()
Y = np.random.normal(loc=0.0, scale=1.0, size=10000)
X = np.random.random(size=10000)
plt.scatter(X,Y)

<IPython.core.display.Javascript object>

<matplotlib.collections.PathCollection at 0x19d7e2b0>

### Pandas Visualization

In [209]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

%matplotlib notebook

In [210]:
# see the pre-defined styles provided.
plt.style.available

['seaborn-darkgrid',
 'seaborn-dark-palette',
 'seaborn-white',
 'ggplot',
 'seaborn-talk',
 'seaborn-notebook',
 'classic',
 'seaborn-whitegrid',
 'seaborn-dark',
 'seaborn-ticks',
 'seaborn-paper',
 'fivethirtyeight',
 'grayscale',
 'bmh',
 'seaborn-colorblind',
 'seaborn-bright',
 'seaborn-poster',
 'seaborn-deep',
 'seaborn-pastel',
 'seaborn-muted',
 'dark_background']

In [211]:
# use the 'seaborn-colorblind' style
plt.style.use('seaborn-colorblind')

### DataFrame.plot

In [215]:
np.random.seed(123)

df = pd.DataFrame({'A': np.random.randn(365).cumsum(0), 
                   'B': np.random.randn(365).cumsum(0) + 20,
                   'C': np.random.randn(365).cumsum(0) - 20}, 
                  index=pd.date_range('1/1/2017', periods=365))
df.head()

                   A          B          C
2017-01-01 -1.085631  20.059291 -20.230904
2017-01-02 -0.088285  21.803332 -16.659325
2017-01-03  0.194693  20.835588 -17.055481
2017-01-04 -1.311601  21.255156 -17.093802
2017-01-05 -1.890202  21.462083 -19.518638


In [216]:
df.plot(); # add a semi-colon to the end of the plotting call to suppress unwanted output

<IPython.core.display.Javascript object>

In [217]:
df.plot('A','B', kind = 'scatter');

<IPython.core.display.Javascript object>

You can also choose the plot kind by using the `DataFrame.plot.kind` methods instead of providing the `kind` keyword argument.

`kind` :
- `'line'` : line plot (default)
- `'bar'` : vertical bar plot
- `'barh'` : horizontal bar plot
- `'hist'` : histogram
- `'box'` : boxplot
- `'kde'` : Kernel Density Estimation plot
- `'density'` : same as 'kde'
- `'area'` : area plot
- `'pie'` : pie plot
- `'scatter'` : scatter plot
- `'hexbin'` : hexbin plot
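
To illustrate the two equivalent spellings, the sketch below draws the same box plot once with the `kind` keyword and once with the accessor method. The small random DataFrame is made up for this example and is not the `df` used elsewhere in the notebook.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

np.random.seed(0)
sample_df = pd.DataFrame({"A": np.random.randn(50),
                          "B": np.random.randn(50)})

# keyword form and accessor-method form of the same plot kind
ax_keyword = sample_df.plot(kind="box")
ax_method = sample_df.plot.box()

print(type(ax_keyword).__name__, type(ax_method).__name__)
```

Each call opens its own figure, so the two results can be compared side by side.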

In [219]:
# create a scatter plot of columns 'A' and 'C', with changing color (c) and size (s) based on column 'B'
df.plot.scatter('A', 'C', c='B', s=df['B'], colormap='viridis')

<IPython.core.display.Javascript object>

<matplotlib.axes._subplots.AxesSubplot at 0x152a2d90>

In [220]:
ax = df.plot.scatter('A', 'C', c='B', s=df['B'], colormap='viridis')
ax.set_aspect('equal')

<IPython.core.display.Javascript object>

In [221]:
df.plot.box();

<IPython.core.display.Javascript object>

In [222]:
df.plot.hist(alpha=0.7);

<IPython.core.display.Javascript object>

[Kernel density estimation plots](https://en.wikipedia.org/wiki/Kernel_density_estimation) are useful for deriving a smooth continuous function from a given sample.
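
For intuition, a Gaussian kernel density estimate can be written in a few lines of NumPy: every sample contributes a small bell curve, and the estimate is their average. This is a hand-rolled sketch, not the routine pandas uses internally; the function name `gaussian_kde_sketch` and the fixed bandwidth of 0.4 are assumptions chosen for the example.

```python
import numpy as np

def gaussian_kde_sketch(sample, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate on the points in grid."""
    sample = np.asarray(sample, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # scaled distance from every grid point to every sample point
    z = (grid[:, None] - sample[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)
    # average the kernels and rescale by the bandwidth
    return kernels.sum(axis=1) / (len(sample) * bandwidth)

rng = np.random.RandomState(0)
sample = rng.normal(loc=0.0, scale=1.0, size=200)
grid = np.linspace(-6, 6, 601)

density = gaussian_kde_sketch(sample, grid, bandwidth=0.4)

# a density should integrate to roughly 1 over the grid
area = np.sum(density) * (grid[1] - grid[0])
print(round(area, 3))
```

A smaller bandwidth gives a bumpier curve and a larger one gives a smoother curve.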

In [224]:
df.plot.kde();

<IPython.core.display.Javascript object>

### pandas.tools.plotting

In [225]:
iris = pd.read_csv('iris.csv')
iris.head()

   SepalLength  SepalWidth  PetalLength  PetalWidth         Name
0          5.1         3.5          1.4         0.2  Iris-setosa
1          4.9         3.0          1.4         0.2  Iris-setosa
2          4.7         3.2          1.3         0.2  Iris-setosa
3          4.6         3.1          1.5         0.2  Iris-setosa
4          5.0         3.6          1.4         0.2  Iris-setosa


In [226]:
pd.tools.plotting.scatter_matrix(iris);

<IPython.core.display.Javascript object>

In [227]:
plt.figure()
pd.tools.plotting.parallel_coordinates(iris, 'Name');

<IPython.core.display.Javascript object>

###  Seaborn

In [231]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

%matplotlib notebook


In [232]:
np.random.seed(1234)

v1 = pd.Series(np.random.normal(0,10,1000), name='v1')
v2 = pd.Series(2*v1 + np.random.normal(60,15,1000), name='v2')

In [236]:
v1.head(5)

0     4.714352
1   -11.909757
2    14.327070
3    -3.126519
4    -7.205887
Name: v1, dtype: float64

In [237]:
v2.head(5)

0    45.695583
1    11.877633
2    89.352568
3    28.549529
4    66.526610
Name: v2, dtype: float64

In [238]:
plt.figure()
plt.hist(v1, alpha=0.7, bins=np.arange(-50,150,5), label='v1');
plt.hist(v2, alpha=0.7, bins=np.arange(-50,150,5), label='v2');
plt.legend();

<IPython.core.display.Javascript object>

In [239]:
# plot a kernel density estimation over a stacked barchart
plt.figure()
plt.hist([v1, v2], histtype='barstacked', normed=True);
v3 = np.concatenate((v1,v2))
sns.kdeplot(v3);

<IPython.core.display.Javascript object>


In [240]:
plt.figure()
# we can pass keyword arguments for each individual component of the plot
sns.distplot(v3, hist_kws={'color': 'Teal'}, kde_kws={'color': 'Navy'});

<IPython.core.display.Javascript object>


In [241]:
sns.jointplot(v1, v2, alpha=0.4);

<IPython.core.display.Javascript object>

In [242]:
grid = sns.jointplot(v1, v2, alpha=0.4);
grid.ax_joint.set_aspect('equal')

<IPython.core.display.Javascript object>

In [243]:
sns.jointplot(v1, v2, kind='hex');

<IPython.core.display.Javascript object>

In [244]:
# set the seaborn style for all the following plots
sns.set_style('white')

sns.jointplot(v1, v2, kind='kde', space=0);

<IPython.core.display.Javascript object>


In [245]:
iris = pd.read_csv('iris.csv')
iris.head()

   SepalLength  SepalWidth  PetalLength  PetalWidth         Name
0          5.1         3.5          1.4         0.2  Iris-setosa
1          4.9         3.0          1.4         0.2  Iris-setosa
2          4.7         3.2          1.3         0.2  Iris-setosa
3          4.6         3.1          1.5         0.2  Iris-setosa
4          5.0         3.6          1.4         0.2  Iris-setosa


In [246]:
sns.pairplot(iris, hue='Name', diag_kind='kde', size=2);

<IPython.core.display.Javascript object>


In [247]:
plt.figure(figsize=(8,6))
plt.subplot(121)
sns.swarmplot('Name', 'PetalLength', data=iris);
plt.subplot(122)
sns.violinplot('Name', 'PetalLength', data=iris);

<IPython.core.display.Javascript object>