As we discovered in the [Introduction](Introduction.ipynb), HoloPlot provides two ways of working. For convenience we will stick with the patching approach here, and for demonstration purposes we will use the ``intake`` library, a Python API for bulk loading of data using a declarative YAML specification. To begin with we import our libraries, including the HoloPlot modules that apply the patches:

In [None]:
import intake
import numpy as np
import holoplot.pandas
import holoplot.dask

Intake's ``DataSource`` objects will now support the HoloPlot plot API. The HoloPlot API closely mirrors the [Pandas plotting API](https://pandas.pydata.org/pandas-docs/stable/visualization.html), but instead of generating static images when used in a notebook, it uses HoloViews to generate either static or dynamically streaming Bokeh plots. Static plots can be used in any context, while streaming plots require a live [Jupyter notebook](http://jupyter.org) or a deployed [Bokeh Server app](https://bokeh.pydata.org/en/latest/docs/user_guide/server.html).

HoloViews provides a rich set of objects and operations on them, as described in the [HoloViews User Guide](http://holoviews.org/user_guide/index.html). Here, however, we will focus on the most essential mechanisms needed to make your data visualizable, without having to worry about the mechanics going on behind the scenes.

We will be focusing on two different datasets:

- A small CSV file of US crime data, broken down by state
- A larger Parquet-format file of airline data

Here we use the `intake` package as a convenient way to get data into Python, but the plotting commands below will work the same whether you use an Intake object or supply a Pandas or Dask dataframe directly.

In [None]:
crime   = intake.cat.us_crime.get().read()
print(type(crime))
crime.head()

In [None]:
flights = intake.cat.airline_flights.get().read()
print(type(flights))
flights.head()

## The plot interface

The ``dask.dataframe.DataFrame.plot``, ``pandas.DataFrame.plot`` and ``intake.DataSource.plot`` interfaces (and Series equivalents) from HoloPlot provide a powerful high-level API to generate complex plots. The ``.plot`` API can be called directly or used as a namespace to generate specific plot types.

### The plot method

The simplest way to use the plotting API is to specify the names of columns to plot on the ``x``- and ``y``-axis respectively:

In [None]:
crime.plot(x='Year', y='Violent Crime rate')

As you'll see in more detail below, you can choose which kind of plot you want to use for the data:

In [None]:
crime.plot(x='Year', y='Violent Crime rate', kind='scatter')

On top of this explicit API you can also specify a ``by`` variable, which groups the data by one or more additional columns. As an example, here we plot the departure delay ('depdelay') as a function of 'distance', grouped by 'carrier'. There are many carriers, so we select only two of them to keep the plot readable:

In [None]:
flights[flights.carrier.isin([b'OH', b'F9'])].plot(x='distance', y='depdelay', by='carrier', kind='scatter', alpha=0.2)
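
To make the role of ``by`` concrete, the sketch below reproduces the selection and grouping with plain Pandas on a tiny made-up table. The values are invented for illustration; only the column names come from the airline data, and the carrier codes are assumed to be byte strings as in the cell above. Conceptually, each group corresponds to one set of points drawn in its own color.

```python
import pandas as pd

# Tiny invented stand-in for the airline data; carrier codes are byte strings
toy_flights = pd.DataFrame({
    "carrier":  [b"OH", b"F9", b"AA", b"OH", b"F9"],
    "distance": [300, 900, 1200, 450, 850],
    "depdelay": [5, -2, 30, 12, 0],
})

# Keep only the two carriers of interest, as in the plot call above
subset = toy_flights[toy_flights.carrier.isin([b"OH", b"F9"])]

# ``by='carrier'`` conceptually splits the rows into one group per carrier
groups = {key: grp[["distance", "depdelay"]]
          for key, grp in subset.groupby("carrier")}

for key, grp in groups.items():
    print(key, len(grp), "rows")
```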

Here we have specified the ``x`` axis explicitly, which can be omitted if the Pandas index column is already set to what you want on the x-axis. Similarly, here we specified the ``y`` axis; by default all of the non-index columns would be plotted (which would be a lot of data in this case). If you don't specify the ``y`` axis, it will have a default label named 'value', but you can then provide a y-axis label explicitly using the ``value_label`` option.

Putting all of this together, we will plot violent crime, robbery, and burglary rates on the y-axis, specify 'Year' as the x-axis, and relabel the y-axis to display the 'Rate':

In [None]:
crime.plot(x='Year', y=['Violent Crime rate', 'Robbery rate', 'Burglary rate'],
           value_label='Rate (per 100k people)')
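
One way to think about the default 'value' label is that several ``y`` columns are stacked into a single long-format column. The sketch below shows that reshaping with ``pandas.melt`` on invented numbers. It is only a conceptual illustration, not necessarily how HoloPlot handles multiple columns internally, and the ``var_name`` of 'Crime' is an arbitrary choice made for this example.

```python
import pandas as pd

# Invented numbers; only the column names mirror the crime dataset
toy_crime = pd.DataFrame({
    "Year": [2000, 2001],
    "Robbery rate": [145.0, 148.5],
    "Burglary rate": [728.8, 741.8],
})

# Stack the rate columns into one 'value' column, keyed by 'Year' and 'Crime'
long_form = toy_crime.melt(id_vars="Year", var_name="Crime", value_name="value")
print(long_form)
```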

### The plot namespace

Instead of using the ``kind`` argument to the plot call, we can use the ``plot`` namespace, which lets us easily discover the range of plot types that are supported. Plot types available include:

* <a href="#Area">``.area()``</a>: Plots an area chart, similar to a line chart except for filling the area under the curve and optionally stacking
* <a href="#Bars">``.bar()``</a>: Plots a bar chart that can be stacked or grouped
* <a href="#Bivariate">``.bivariate()``</a>: Plots 2D density of a set of points 
* <a href="#Box-Whisker-Plots">``.box()``</a>: Plots a box-whisker chart comparing the distribution of one or more variables
* <a href="#HeatMap">``.heatmap()``</a>: Plots a heatmap to visualize a variable across two independent dimensions
* <a href="#HexBins">``.hexbin()``</a>: Plots hex bins
* <a href="#Histogram">``.hist()``</a>: Plots the distribution of one or more variables as a set of bins
* <a href="#KDE">``.kde()``</a>: Plots the kernel density estimate of one or more variables
* <a href="#The-plot-method">``.line()``</a>: Plots a line chart (such as for a time series)
* <a href="#Scatter">``.scatter()``</a>: Plots a scatter chart comparing two variables
* <a href="#Tables">``.table()``</a>: Generates a SlickGrid DataTable
* <a href="#Violin-Plots">``.violin()``</a>: Plots a violin plot comparing the distribution of one or more variables using the kernel density estimate

#### Area

Like most other plot types, the ``area`` chart supports the three ways of defining a plot outlined above. An area chart is most useful when plotting multiple variables in a stacked chart. This can be achieved by specifying ``x``, ``y``, and ``by`` columns or using the ``columns`` and ``index``/``use_index`` (equivalent to ``x``) options:

In [None]:
crime.plot.area(x='Year', y=['Robbery', 'Aggravated assault'], stacked=True)

We can also explicitly set ``stacked`` to False and define an ``alpha`` value to compare the values directly:

In [None]:
crime.plot.area(x='Year', y=['Robbery', 'Aggravated assault'], stacked=False, alpha=0.6)
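
What ``stacked=True`` changes is where each area starts: conceptually, every column is drawn on top of the running total of the columns before it. The following is a minimal Pandas sketch of that idea with invented counts; it is not HoloPlot's actual implementation.

```python
import pandas as pd

# Invented counts indexed by year
toy = pd.DataFrame(
    {"Robbery": [10, 20, 30], "Aggravated assault": [5, 15, 25]},
    index=pd.Index([2000, 2001, 2002], name="Year"),
)

# Upper edge of each stacked area: cumulative sum across the columns
upper = toy.cumsum(axis=1)
# Lower edge: the upper edge of the previous column (zero for the first)
lower = upper - toy
print(upper)
print(lower)
```

With ``stacked=False`` every area would instead start at zero, which is why a partially transparent ``alpha`` helps to keep overlapping areas visible.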

#### Bars

In the simplest case we can use ``source.plot.bar`` to plot ``x`` against ``y``:

In [None]:
crime.plot.bar('Year', 'Violent Crime rate', rot=90)

If we want to compare multiple columns instead, we can again place 'Year' on the x-axis and pass a list of the columns to compare as ``y``. Using the ``stacked`` option we can then compare the column values more easily:

In [None]:
crime.plot.bar('Year', ['Violent crime total', 'Property crime total'],
               stacked=True, rot=90, width=800)

#### Scatter

The scatter plot supports all the same features as the other chart types we have seen so far but can also be colored by another variable using the ``c`` option and allows declaring a ``cmap``.

In [None]:
crime.plot.scatter('Violent Crime rate', 'Burglary rate', c='Year', cmap='viridis', size=6, colorbar=True)

#### HexBins

You can create hexagonal bin plots with the ``hexbin`` method. Hexbin plots can be a useful alternative to scatter plots if your data are too dense to plot each point individually.

In [None]:
flights.plot.hexbin(x='airtime', y='arrdelay', width=600, height=500)

#### Bivariate

You can create a 2D density plot with the ``bivariate`` method. Bivariate plots can be a useful alternative to scatter plots if your data are too dense to plot each point individually.

In [None]:
crime.plot.bivariate('Violent Crime rate', 'Burglary rate', colorbar=True, width=600, height=500)

#### HeatMap

A ``HeatMap`` lets us view the relationship between three variables, so we specify the 'x' and 'y' variables and an additional 'C' variable. Additionally we can define a ``reduce_function`` that computes the values for each bin from the samples that fall into it. Here we plot the 'depdelay' (i.e. departure delay) for each day of the month and carrier in the dataset:

In [None]:
flights.plot.heatmap(x='day', y='carrier', C='depdelay', reduce_function=np.mean,
                     colorbar=True)
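
The role of ``reduce_function`` can be sketched in plain Python: samples are collected per (x, y) cell and each cell is then collapsed to a single number. The helper ``bin_and_reduce`` and the sample values below are hypothetical and exist only for this illustration; they are not part of HoloPlot.

```python
from collections import defaultdict
from statistics import mean

# Invented (day, carrier, depdelay) samples
samples = [
    (1, "AA", 10.0), (1, "AA", 20.0), (1, "OH", 4.0),
    (2, "AA", -5.0), (2, "OH", 6.0), (2, "OH", 8.0),
]

def bin_and_reduce(rows, reduce_function):
    """Group values by their (x, y) cell and reduce each cell to one number."""
    cells = defaultdict(list)
    for x, y, value in rows:
        cells[(x, y)].append(value)
    return {cell: reduce_function(values) for cell, values in cells.items()}

# Swapping the reduction changes what each heatmap cell would display
mean_delay = bin_and_reduce(samples, mean)
max_delay = bin_and_reduce(samples, max)
print(mean_delay[(1, "AA")], max_delay[(2, "OH")])
```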

#### Tables

Unlike all other plot types, a table only supports one signature: either all columns are plotted, or a subset of columns can be selected by defining the ``columns`` explicitly:

In [None]:
crime.plot.table(columns=['Year', 'Population', 'Violent Crime rate'], width=400)

### Distributions

Plotting distributions differs slightly from other plots since they plot only one variable in the simple case rather than plotting two or more variables against each other. Therefore when plotting these plot types no ``index`` or ``x`` value needs to be supplied. Instead:

1. Declare a single ``y`` variable, e.g. ``source.plot.hist(variable)``, or
2. Declare a ``y`` variable and ``by`` variable, e.g. ``source.plot.hist(variable, by='Group')``, or
3. Declare columns or plot all columns, e.g. ``source.plot.hist()`` or ``source.plot.hist(columns=['A', 'B', 'C'])``

#### Histogram

The Histogram is the simplest example of a distribution; often we simply plot the distribution of a single variable, in this case the 'Violent Crime rate'. Additionally we can define a range over which to compute the histogram and the number of bins using the ``bin_range`` and ``bins`` arguments respectively:

In [None]:
crime.plot.hist('Violent Crime rate')
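
To see what the number of bins and the bin range control, the sketch below computes a histogram directly with NumPy on invented values. It assumes that ``bins`` and ``bin_range`` behave like the ``bins`` and ``range`` arguments of ``numpy.histogram``, which is an assumption for illustration rather than something this guide states.

```python
import numpy as np

# Invented values, two of which fall outside the chosen range
values = np.array([-30, -5, 0, 3, 12, 18, 45, 99, 150])

# Four equal-width bins covering -20 to 100; values outside are ignored
counts, edges = np.histogram(values, bins=4, range=(-20, 100))
print(counts)
print(edges)
```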

Or we can plot the distribution of multiple columns:

In [None]:
columns = ['Violent Crime rate', 'Property crime rate', 'Burglary rate']
crime.plot.hist(y=columns, bins=20, alpha=0.5)

We can also group the data by another variable:

In [None]:
flights[flights.carrier.isin([b'AA', b'US', b'OH'])].plot.hist('depdelay', by='carrier', bins=20, bin_range=(-20, 100), alpha=0.3)

#### KDE

You can also create density plots using the ``DataSource.plot.kde()`` method:

In [None]:
crime.plot.kde('Violent Crime rate')
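
For readers unfamiliar with kernel density estimates, here is a minimal NumPy sketch of a one-dimensional Gaussian KDE. The function ``gaussian_kde_1d``, the sample values, and the fixed bandwidth are all hypothetical choices for this example; the bandwidth selection used by the plotting backend may well differ.

```python
import numpy as np

def gaussian_kde_1d(samples, grid, bandwidth):
    """Average of Gaussian bumps of width ``bandwidth`` centred on each sample."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # One row per grid point, one column per sample
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2 * np.pi))
    return kernels.mean(axis=1)

samples = [200.0, 250.0, 260.0, 400.0, 420.0]  # invented values
grid = np.linspace(-200.0, 800.0, 2001)
density = gaussian_kde_1d(samples, grid, bandwidth=40.0)

# A density should integrate to roughly 1 over a wide enough grid
area = density.sum() * (grid[1] - grid[0])
print(round(area, 3))
```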

Comparing the distribution of multiple columns is also possible:

In [None]:
columns=['Violent Crime rate', 'Property crime rate', 'Burglary rate']
crime.plot.kde(y=columns, alpha=0.5, value_label='Rate')

``DataSource.plot.kde`` also supports the ``by`` keyword:

In [None]:
flights[flights.carrier.isin([b'AA', b'US', b'OH'])].plot.kde('depdelay', by='carrier', alpha=0.3, xlim=(-20, 70))

#### Box-Whisker Plots

Just like the other distribution-based plot types, the box-whisker plot supports plotting a single column:

In [None]:
crime.plot.box('Violent Crime rate')
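
The summary statistics behind a box-whisker plot can be computed by hand. The sketch below uses NumPy on invented values and assumes the common Tukey convention, in which whiskers reach the most extreme points within 1.5 times the interquartile range; this guide does not state which whisker rule the plotting backend applies.

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 30], dtype=float)  # invented

# The box spans the first to third quartile, with a line at the median
q1, median, q3 = np.percentile(values, [25, 50, 75])
iqr = q3 - q1

# Tukey convention (an assumption here): points beyond 1.5 * IQR are outliers
low_limit, high_limit = q1 - 1.5 * iqr, q3 + 1.5 * iqr
inside = values[(values >= low_limit) & (values <= high_limit)]
whiskers = (inside.min(), inside.max())
outliers = values[(values < low_limit) | (values > high_limit)]
print(q1, median, q3, whiskers, outliers)
```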

It also supports multiple columns:

In [None]:
columns=['Burglary rate', 'Larceny-theft rate', 'Motor vehicle theft rate',
         'Property crime rate', 'Violent Crime rate']
crime.plot.box(y=columns, group_label='Crime', legend=False, value_label='Rate (per 100k)', invert=True)

Lastly, it also supports using the ``by`` keyword to split the data into multiple subsets:

In [None]:
flights[flights.carrier.isin([b'AA', b'US', b'OH'])].plot.box('depdelay', by='carrier', ylim=(-10, 70))

## Composing Plots

One of the core strengths of HoloViews is the ease of composing
different plots. Individual plots can be composed using the ``*`` and
``+`` operators, which overlay and compose plots into layouts
respectively. For more information on composing objects, see the
HoloViews [User Guide](http://holoviews.org/user_guide/Composing_Elements.html).

By using these operators we can combine multiple plots into composite plots. A simple example is overlaying two plot types:

In [None]:
crime.plot('Year', 'Violent Crime rate') * crime.plot.scatter('Year', 'Violent Crime rate', size=30)

We can also lay out different plots and tables together:

In [None]:
(crime.plot.bar('Year', 'Violent Crime rate', rot=90, width=550) +
 crime.plot.table(['Year', 'Population', 'Violent Crime rate'], width=420))

## Large data

The previous examples summarized the fairly large airline dataset using statistical plot types that aggregate the data into a feasible subset for plotting.  We can instead aggregate the data directly into the viewable image using [datashader](http://datashader.org), which provides a rendering of the entire set of raw data available (as far as the resolution of the screen allows). Here we plot the 'airtime' against the 'distance':

In [None]:
flights.plot.scatter('distance', 'airtime', datashade=True)
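
The core idea of aggregating into the viewable image can be sketched with NumPy: every point is binned into a fixed grid of pixels and the hits per pixel are counted. Datashader itself does considerably more than this, so the code below is only a rough conceptual stand-in; the data and the grid size are invented.

```python
import numpy as np

rng = np.random.default_rng(0)
n_points = 100_000

# Invented data: airtime roughly proportional to distance, plus noise
distance = rng.uniform(100, 2500, n_points)
airtime = distance / 8.0 + rng.normal(0, 10, n_points)

# Bin every point into a fixed grid of "pixels" and count hits per pixel;
# the grid size, not the number of points, bounds what has to be rendered
width, height = 300, 200
image, x_edges, y_edges = np.histogram2d(distance, airtime, bins=(width, height))
print(image.shape, int(image.sum()))
```

However many points go in, the result is always a ``width`` by ``height`` array, which is what makes this approach feasible for large datasets.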

## Groupby

Thanks to the ability of HoloViews to explore a parameter space with a set of widgets we can apply a groupby along a particular column or dimension. For example we can view the distribution of departure delays by carrier grouped by day, allowing the user to choose which day to display:

In [None]:
flights.plot.violin('depdelay', by='carrier', groupby='dayofweek', ylim=(-20, 60), height=500, dynamic=False)

## Customizing the visualization

In addition to specific options for different plot types the plotting
API exposes a number of general options including:

- ``colorbar`` (default=False): Enables colorbar
- ``grid`` (default=False): Whether to show a grid
- ``hover`` (default=True): Whether to show hover tooltips
- ``invert`` (default=False): Swaps x- and y-axis
- ``legend`` (default=True): Whether to show a legend
- ``logx``/``logy`` (default=False): Enables logarithmic x- and y-axis respectively
- ``loglog`` (default=False): Enables logarithmic x- and y-axis
- ``shared_axes`` (default=False): Whether to link axes between plots
- ``title`` (default=''): Title for the plot
- ``xlim``/``ylim`` (default=None): Plot limits of the x- and y-axis
- ``xticks``/``yticks`` (default=None): Ticks along x- and y-axis specified as an integer, list of tick positions, or list of tuples of the tick positions and labels
- ``width`` (default=800)/``height`` (default=300): The width and height of the plot in pixels
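
Because these are ordinary keyword arguments, a set of options can be collected in a dictionary and reused across calls. The sketch below only builds such a dictionary and illustrates the three forms of ``xticks`` described in the list above. The specific values are invented, and the commented-out call assumes the ``crime`` dataframe and the HoloPlot patch from earlier in this guide.

```python
# Three accepted forms for ticks, per the option list above
xticks_count = 5                                  # an integer number of ticks
xticks_positions = [1960, 1980, 2000]             # explicit tick positions
xticks_labelled = [(1960, "'60"), (1980, "'80"), (2000, "'00")]  # (position, label)

# Invented values for a handful of the general options
common_opts = dict(
    grid=True,
    legend=False,
    logy=True,
    ylim=(100, 1000),
    xticks=xticks_labelled,
    title="Violent crime rate",
    width=600,
    height=300,
)

# With the setup from this guide in place, the options would be applied as:
# crime.plot(x='Year', y='Violent Crime rate', **common_opts)
print(sorted(common_opts))
```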

In addition, options can be passed directly to HoloViews, providing greater control over the plots. The options can be provided as dictionaries via the ``plot_opts`` and ``style_opts`` keyword arguments. You can also apply options using the HoloViews API (for more information see the HoloViews [User Guide](http://holoviews.org/user_guide/Customizing_Plots.html)). 

In general, the objects returned by HoloPlot are full HoloViews objects, which can be overlaid, laid out, or composed with other HoloViews objects, and sampled, sliced, selected, or annotated like any HoloViews objects.  The [HoloViews](http://holoviews.org) website explains all the functionality available, but what's on this HoloPlot website should be enough to get you up and running for typical usage.  