# Holoviews for Matplotlib Users

## Purpose
This notebook is a working document intended to become a useful resource for matplotlib users transitioning to holoviews.

The overall vision would be to create a resource where people could find direct examples translating matplotlib commands/tasks they already know into their holoviews equivalent.  I think a good way to do this might be to create quasi-identical matplotlib and holoviews plots, each with a subset of tasks carried out.  Then each of these examples can be tagged with the tasks they contain.  This would make it really easy to search, for example, for all the examples that include manually setting the color of a line plot.

The segment of users wanting to switch from matplotlib to holoviews will probably be primarily motivated by the excellent bokeh backend.  With the matplotlib backend, users who already know matplotlib would have less of an incentive to use holoviews instead of just dropping back to the mpl they know and love.  I would therefore recommend that this document focus on using only the bokeh backend.

Furthermore, many users will be writing code in Jupyter notebooks, but a fair number of old-school guys like me will probably want code they can put in scripts as well.  This means that it would be nice to have both notebook (i.e. magics) and script examples for each task.

## List of possible tags

These are tasks that I know how to do in matplotlib but want to know how to do in holoviews.

### Types of plots
* `line_plot`: *draw a simple line plot*
* `scatter_plot`: *draw a scatter plot varying position, shape, size and color*
* `histogram_plot`: *plot histograms*
* `area_plot`: *draw (stacked) area plots*
* `error_bar_plot`: *errorbar plot with mean +/- sigma specified*
* `contour_plot`: *draw a contour plot of some function*
* `image_plot`: *plot images (either actual images or 2-d arrays)*
* `bar_plot`: *create a bar plot*



### Plot styling
* `line_color`: *set a line color*
* `line_width`: *set a line width*
* `line_style`: *make a line solid, dashed, dotted, etc*
* `line_opacity`: *change line opacity*
* `marker`: *set marker shape/size*
* `fig_size`: *manually set a figure size*
* `overlay`: *put multiple traces on one plot*
* `subplots`: *do subplots*
* `twinx`: *overlap two different plots with vastly different y scales*
* `log_axes`: *set either/both axes to log*
* `axis_limits`: *set axis limits on either/both x, y axes*
* `grid`: *set grids on/off*
* `legend_manual`: *manually place legend*
* `legend_best`: *auto place legend at "best" location*
* `text`: *overlay text*
* `xylabel`: *axes labels*
* `title`: *plot titles*
* `xticks`: *manually set where the xticks of a graph will lie*
* `xticklabels`: *manually define what the tick labels will be*




# Preambles
Before using either matplotlib or holoviews, you need to run imports and some basic configuration.  Below are the "preambles" you need to run for matplotlib, holoviews, and just for working with the examples.



## Matplotlib Preamble

In [None]:
%matplotlib inline
import pylab as pl

## Holoviews Preamble

In [None]:
import holoviews as hv
hv.extension('bokeh')


## Preamble for working with examples
This preamble defines some arrays we will use for our examples

In [None]:
import pandas as pd
import numpy as np

# Generate some data
time = np.linspace(0, 2 * np.pi, 30)
ysin = np.sin(time)
ycos = np.cos(time)

# Introduction

Holoviews is an incredibly powerful new visualization tool built by the people at Anaconda.  It is a fairly large project, and as a longtime matplotlib user, I was a bit bewildered when getting started.  I am writing this in the hopes that it will help other old-school matplotlibbers like myself get up to speed quickly and start appreciating the power of Holoviews.

## Assumptions about you
* You know matplotlib pretty well
* You like to do your analysis in Jupyter notebooks
* You like interactive plots that allow you to zoom/pan around your data

## Why should I learn Holoviews?  Matplotlib does what I need
* **Bokeh!**  For me the big selling point for Holoviews was the Bokeh library.  The charts it produces are really slick, and much more suited to data exploration than the charts matplotlib creates in the notebook.

* **Datashader!**  If you have large datasets, interactively exploring them can totally freeze your computer.  Holoviews incorporates the Datashader tool which will let you easily explore very large datasets.

* **Advanced Features**  Once you get past the basics covered in this tutorial, you'll find some really powerful abstractions for exploring your datasets that are simply not available in Matplotlib.

# Hello World
Plotting a sine wave seems to be the "hello world!" application of data visualization, so let's just get that out of the way right now.

## Matplotlib

In [None]:
# make a simple plot
pl.plot(time, ysin);


## Holoviews
Holoviews calls line plots "Curves."  So here we generate a curve.  Notice that the x, y arrays are passed in as a tuple instead of as separate arguments.  Also notice that Holoviews automatically places a series of controls beside the plot.  These allow for quickly zooming/panning around your data.

In [None]:
# make a simple plot.  Jupyter automatically displays a plot when it is the final statement in a cell
hv.Curve((time, ysin))

# Tags for actions executed in this cell: line_plot

# Advanced Hello World
Now we move on to a slightly more advanced version of the sine wave plot where we'd like to style the trace with color, markers and linestyles.  We also want to add labels to the x and y axes and include a title.

## Matplotlib
As a matplotlib user, one thing you have probably never paid attention to is the way you style plots.  Some of the commands you use for styling are part of the plotting command itself (for example: `'ro-'`) and others are separate commands (for example: `pl.xlabel('Time')`).  Holoviews does things differently, as you'll see below.

In [None]:
# Specify the figure size you want to generate
pl.figure(figsize=(10, 4))

# Make a simple plot with styled trace
pl.plot(time, ysin, 'ro-')

# Add axis labels and plot title
pl.xlabel('Time')
pl.ylabel('Amplitude')
pl.title('Sine Wave example');


##  Holoviews

Here is where things start to get weird for a matplotlib user.  Holoviews is designed so that your styling commands are always separate from your plotting commands.  What is even weirder is that, in the notebook, these styling commands are issued through IPython magic functions.  This really bothered me at first because magics only work in the notebook, but Holoviews provides non-magic equivalents as well.  The advantage of using the IPython magics is compactness and simplicity.  

I know this feels really awkward at first because you are so used to combining everything together, but after you get the hang of it, you will see that it enables some really cool capability. 

One final thing you'll have to get used to is that there are two levels of controlling how your figures look.  One is at the Holoviews level (controls layout, figure size, etc.) and the other is at the Bokeh level.  Now, what does that mean?  Holoviews is designed to work with a bunch of different backends.  As a matter of fact, matplotlib itself is an allowable backend.  (Try it.  Change the extension in the Holoviews preamble from `'bokeh'` to `'matplotlib'` and you will see a simple matplotlib figure being generated.)  The point is this: styling individual components of your plot, such as line properties, colors, etc., is the responsibility of the plotting backend.  Backends other than Bokeh are beyond the scope of this tutorial.

One thing to note is that IPython magic functions are a bit finicky in their syntax.  They MUST be at the top of your cell, have no space between `%%` and `opts`, and have a space between (in this case) `Curve` and `[width=...]`.

**You can use tab completion within the [ ... ] and ( ... ) to see what options are available to you.**

FIND A PLACE FOR THESE
-# make a simple plot.  Jupyter will automatically display the chart if it is the last statement
-# in a cell.  You can use the display command to show plots not at the end of a cell like this
-# from IPython.display import display
-# display(hv.Curve((x, y)))

In [None]:
%%opts Curve [width=400 height=200] 
%%opts Curve (color='red' line_dash='dashdot' line_alpha=.2 line_width=10)

# --------- All styling happens above this line. All plot specification below this line ------------

# Same simple form as above
hv.Curve((time, ysin))

# Tags for actions executed in this cell: 
#     line_plot line_color line_width line_style line_opacity fig_size

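Adding labels/title.  In the cell below, the axis labels come from the `kdims` and `vdims` names and the title comes from `label`.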

In [None]:
%%opts Curve [width=500 height=300] 
%%opts Curve (color='blue')


# --------- All styling happens above this line. All plot specification below this line ------------


# make a simple plot
hv.Curve((time, ysin), kdims=['Time'], vdims=['Amplitude'], label='Sine Wave Example')

# Tags for actions executed in this cell: 
#     line_plot line_color fig_size xylabel title


Here you will start to see the power of separating styling from plot specification.  Try different things.  In particular, notice how easy it is to switch from an overlay to subplots just by changing the operator.

I ACTUALLY MIGHT WANT TO CHANGE THIS EXAMPLE TO SHOW HOW MUCH EASIER IT IS TO SWITCH BETWEEN
SUBPLOTS AND OVERLAYS

In [None]:
%%opts Curve [width=300 height=300] 
%%opts Curve.sin (color='blue')
%%opts Curve.cos (color='red')

# --------- All styling happens above this line. All plot specification below this line ------------


# define some curves
curve1 = hv.Curve((time, ysin), kdims=['Time'], vdims=['Amplitude'], group=('sin', 'Sine Wave Example'))
curve2 = hv.Curve((time, ycos), kdims=['Time'], vdims=['Amplitude'], group=('cos', 'Cosine Wave Example'))

# Change this variable to see the different ways to create subplots and overlays
kind = 'overlay'

#TODO: BREAK THESE UP INTO DIFFERENT CELLS
# "Adding" two curves together results in a subplot (defaults to 1 x 2)
if kind == 'subplot12':
    composite = curve1 + curve2

# The layout object created by adding two curves has a .cols() method.
# Here we use it to specify 1 column, thereby creating a 2 x 1 subplot.
# Using the return value of .cols() avoids relying on it modifying the layout in place.
elif kind == 'subplot21':
    composite = (curve1 + curve2).cols(1)
    
# "Multiplying" two curves together results in them being displayed on the same plot.
# This is called an overlay.  But let's say you wanted to add a title to your overlay.
# There's no way to do that with a * operator, so you can explicitly create an overlay.
# Then, the cool thing with holoviews is that you can compose the titled and untitled
# overlays into subplots by just "adding" them together.
elif kind == 'overlay':
    # Short-cut way of making overlays
    untitled = curve1 * curve2
    
    # Explicitly making overlays lets you create a label
    titled = hv.Overlay([curve1, curve2], label='Harmonic')
    
    # You can compose your plots using the +/* operators which is pretty cool
    composite = untitled + titled


composite
# Tags for actions executed in this cell: 
#     line_plot line_color fig_size xylabel subplots title

* `line_color`: *set a line color*
* `line_width`: *set a line width*
* `line_style`: *make a line solid, dashed, dotted, etc*
* `line_opacity`: *change line opacity*
* `marker`: *set marker shape/size*
* `fig_size`: *manually set a figure size*
* `subplots`: *do subplots*
* `twinx`: *overlap two different plots with vastly different y scales*
* `log_axes`: *set either/both axes to log*
* `axis_limits`: *set axis limits on either/both x, y axes*
* `grid`: *set grids on/off*
* `legend_manual`: *manually place legend*
* `legend_best`: *auto place legend at "best" location*
* `text`: *overlay text*
* `xticks`: *manually set where the xticks of a graph will lie*
* `xticklabels`: *manually define what the tick labels will be*



In [None]:
%matplotlib inline
import pandas_datareader.data as web
import datetime
import pylab as pl
import pandas as pd


In [None]:
start = datetime.datetime(2016, 1, 1)
end = datetime.datetime(2016, 12, 31)
panel = web.DataReader(['aapl', 'goog', 'amzn'], start=start, end=end, data_source='yahoo')
dfc = panel['Close']
dfc = dfc.sort_index().reset_index().rename(
    columns={'Date': 'date'}
).reset_index().rename(
    columns={'index': 'day'}
)

dfv = panel['Volume']
dfv = dfv.sort_index().reset_index().rename(
    columns={'Date': 'date'}
).reset_index().rename(
    columns={'index': 'day'}
)

In [None]:
dfc.head()

In [None]:
dfv.head()

In [None]:
dfc.to_csv('closing.csv', index=False)
dfv.to_csv('volume.csv', index=False)

In [None]:
dfx = pd.read_csv('volume.csv')
dfx.head()

In [None]:
# The plots below use the closing prices (assumed to be what `df` was meant to hold)
df = dfc

# simple line plot with axis labels and title
pl.plot(df.day, df.aapl)
pl.xlabel('day')
pl.ylabel('price')
pl.title('AAPL Stock Price')

In [None]:
# simple plot with log axes
pl.loglog(df.day, df.aapl)
pl.xlabel('day')
pl.ylabel('price')
pl.title('AAPL Stock Price')

# another plot, on a new figure, with only the x axis being log
pl.figure()
pl.semilogx(df.day, df.aapl)
pl.xlabel('day')
pl.ylabel('price')
pl.title('AAPL Stock Price')


In [None]:
# line plot with markers, custom axes, and grid, and custom legend placement
pl.plot(df.day, df.aapl, 'x-', label='apple')
ax = pl.gca()
ax.set_xlim((0, 100))
ax.set_ylim((0, 115))
pl.grid(True)
pl.legend(loc=3)


In [None]:
# plots on same axis with best placement of legend
pl.plot(df.day, df.aapl, 'r.-')
pl.plot(df.day, df.goog, 'b.-')
pl.plot(df.day, df.amzn, 'g.-')
pl.legend(loc='best')



In [None]:
# twinx plot (really want this for bokeh backend)
pl.plot(df.day, df.aapl, 'r.-')
pl.xlabel('Days')
pl.ylabel('Apple Price', color='red')
ax = pl.gca()
ax.twinx()

pl.plot(df.day, df.goog, 'b.-')
pl.ylabel('Google Price', color='blue')

In [None]:
# transparent scatter plot
pl.scatter(df.aapl, df.goog, alpha=.1, s=80)
pl.scatter(df.aapl, df.amzn, alpha=.1, s=80)



In [None]:
# simple histogram setting bins and range
_ = pl.hist(df.aapl, bins=20, range=(100, 110))

In [None]:
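# transparent histogram with legend and labels
# (density=True replaces the normed=True argument, which newer matplotlib releases no longer accept)
_ = pl.hist(df.goog, bins=20, density=True, alpha=.2, label='google')
_ = pl.hist(df.amzn, bins=20, density=True, alpha=.2, label='amazon')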
pl.legend(loc='best')
pl.xlabel('price')
pl.ylabel('Normalized Frequency')

In [None]:
# two figures with different subplot layouts
pl.subplot(211)
pl.plot(df.day, df.amzn)
pl.subplot(212)
pl.plot(df.day, df.goog)

pl.figure()
pl.subplot(121)
pl.plot(df.day, df.amzn)
pl.subplot(122)
pl.plot(df.day, df.goog)


In [None]:
# simple stacked filled chart
pl.fill_between(df.day, df.goog, label='google')
pl.fill_between(df.day, df.amzn + df.goog, df.goog, label='amazon')
pl.legend(loc='best')


