In [2]:
#ignore
import hvplot.sample_data
import pandas as pd
import numpy as np
import holoviews as hv
import holoviews.plotting.bokeh

from bokeh.sampledata import iris, stocks 

hv.Store.set_current_backend('bokeh')
hv.opts.defaults(
    hv.opts.Violin(violin_color='Type of crime', cmap='Category10', show_legend=False))

crime = hvplot.sample_data.us_crime.get().read()
iris = iris.flowers

<img src="./images/hvplot-wm.png" width="150px"></img>

**A high-level plotting API for the PyData ecosystem - built on HoloViews.**

<img src="./images/hvplot_collage.png" width=100%></img>

We are very pleased to introduce a new visualization library called hvPlot, which is closely modeled on the pandas and xarray ``.plot`` API. Unlike the standard pandas plotting API, hvPlot outputs HoloViews objects, which render as interactive Bokeh plots. hvPlot is designed to work well with a wide array of libraries in the PyData ecosystem, including:

* [Pandas](http://pandas.pydata.org): DataFrame, Series (columnar/tabular data)
* [xarray](http://xarray.pydata.org): Dataset, DataArray (multidimensional arrays)
* [Dask](http://dask.pydata.org): DataFrame, Series (columnar data)
* [Streamz](http://streamz.readthedocs.io): DataFrame(s), Series(s) (streaming columnar data)
* [Intake](http://github.com/ContinuumIO/intake): DataSource (data catalogues)
* [GeoPandas](http://geopandas.org/): GeoDataFrame (geometry data)
* [NetworkX](https://networkx.github.io/documentation/stable/): Graph (network data)

### Why a new library?

The Python visualization landscape is already very crowded, and many libraries in the PyData ecosystem ship with their own plotting APIs, usually built on top of matplotlib, which provide a quick and easy way to generate plots. Matplotlib is a very powerful library, but when using it you miss out on the interactive features of modern web-based plotting tools. hvPlot makes the transition from familiar tools to interactive plotting as simple as switching out an import. Since hvPlot aims to provide plotting tools for all the major libraries in the PyData ecosystem, plots generated from these different libraries can be flexibly combined. Similar efforts such as [Pandas-Bokeh](https://github.com/PatrikHlobil/Pandas-Bokeh) provide many of the same features but are not as ambitious in scope.

Additionally, hvPlot gives the user an entry point to many of the features in HoloViews and Bokeh:

* Exploring multi-dimensional parameter spaces using auto-generated widgets
* Scaling visualizations to millions or even billions of datapoints using dask and datashader integration
* Exploring interactive visualizations, including streaming plots, in the notebook and seamlessly transitioning to a standalone server



### A shared, consistent and familiar API

Whether you are plotting pandas, xarray, dask, streamz, intake or geopandas data, you only need to learn one plotting API, with extensive documentation for all the options.

#### Interactivity

Let us jump straight into what hvPlot can do by generating a DataFrame containing a number of timeseries and plotting it. By importing ``hvplot.pandas`` we add ``.hvplot`` to the pandas DataFrame and Series methods and can immediately start using it. The same concept applies to the other supported libraries, e.g. import ``hvplot.dask`` to add the method to dask DataFrame/Series.

In [3]:
import hvplot.pandas

idx = pd.date_range('1/1/2000', periods=1000)
df  = pd.DataFrame(np.random.randn(1000, 4), index=idx, columns=list('ABCD')).cumsum()

df.hvplot()

Thanks to Bokeh, we automatically get a fully interactive plot with hover, zoom, and an interactive legend.

#### Fully tab-completable

When working in a Jupyter(Lab) notebook or at an IPython prompt, both the plot types and the supported options are fully tab-completable, which makes the options easy to discover.

<div style="margin: auto; display: block; width: 60%"><img src="./images/tab_complete.gif"></img></div>

#### A wide range of plot types

As you have seen in the collage, hvPlot supports a wide range of plot types, including line, scatter, area, step, bar, box-whisker, violin, KDE, hexbin, histogram, image, contour, filled contour, polygon, and graph plots.

In [4]:
columns = ['Burglary rate', 'Larceny-theft rate', 'Robbery rate', 'Violent Crime rate']
crime.hvplot.violin(y=columns, group_label='Type of crime', value_label='Rate per 100k', invert=True)

#### Support for geographic plots

Thanks to integration with [GeoViews](https://geoviews.org) and [Cartopy](https://scitools.org.uk/cartopy/docs/latest/), we can ingest and project data from and to any coordinate system. This lets you combine columnar data from pandas, gridded data from [xarray](http://xarray.pydata.org/en/stable/) and geometry data from [GeoPandas](http://geopandas.org/), while GeoViews handles projections automatically.

In [24]:
import xarray as xr
import hvplot.xarray 
import cartopy.crs as crs
import geoviews as gv

xr_ds = xr.tutorial.open_dataset('air_temperature').load()
air_temp = xr_ds.air.isel(time=0)
proj = crs.Orthographic(-90, 30)

air_plot = air_temp.hvplot.quadmesh(
    'lon', 'lat', projection=proj, project=True, global_extent=True,
    width=525, height=450, cmap='viridis', rasterize=True)

air_plot * gv.feature.coastline

#### Streaming data

With the ``streamz`` library we can easily generate streaming plots, which efficiently stream the data when a ``streamz.DataFrame`` object is updated.

In [None]:
import hvplot.streamz
from streamz.dataframe import Random

streaming_df = Random(freq='50ms', interval='100ms')
rolling_df = streaming_df.rolling('500ms').mean()

(rolling_df.hvplot.hexbin(x='x', y='z', backlog=2000, height=400, width=500, padding=0.1) +
 rolling_df.hvplot(backlog=100, height=400, width=500, padding=0.1))

<div style="margin: auto; display: block; width: 60%"><img src="./images/hvplot_streamz.gif" ></img></div>

#### Network graphs

In addition to the general plotting API supported for the other libraries, hvPlot also ships with a plotting interface for NetworkX which mirrors the plotting functions in the ``nx.`` (networkx) namespace. The ``hvnx`` namespace as defined below therefore provides a drop-in replacement for the usual plotting functions, letting you plot interactive graphs with nodes, edges and labels:

In [7]:
import networkx as nx
import hvplot.networkx as hvnx

G = nx.karate_club_graph()

hvnx.draw_spring(G, labels='club', font_size='10pt', node_color='club', cmap='Category10', width=500, height=500)

#### Datashader integration

When your data is larger than can ordinarily be plotted, hvPlot makes it incredibly simple to activate datashading, which dynamically aggregates your data depending on the current zoom level. This allows plotting millions or even billions of datapoints. For example, below we load and then display 300 million datapoints (one for every resident in the 2010 US census), using dask and datashader:

In [None]:
import hvplot.dask
import dask.dataframe as dd

# Path to a local copy of the census data; adjust it for your own machine
ddf = dd.read_parquet('/Users/philippjfr/datashader/examples/data/census.parq/').persist()
gv.tile_sources.CartoDark * ddf.hvplot.points('meterswest', 'metersnorth', datashade=True, cmap='viridis', height=500)

<div style="margin: auto; display: block; width: 60%"><video src='./images/hvplot_census.mp4' controls></video></div>

#### Intake catalogs

The [Intake](https://intake.readthedocs.io/en/latest/quickstart.html) library provides a plugin system for loading your data and defining data catalogs. On top of the many different types of data it supports via various plugins, it also natively supports ``hvPlot`` as part of its YAML catalogue specification. This means you can define custom plots declaratively as part of a catalogue definition, letting you ship some default plots alongside your data.

```yaml
sources:
  nyc_taxi:
    description: NYC Taxi dataset
    driver: parquet
    args:
      urlpath: 's3://datashader-data/nyc_taxi_wide.parq'
    metadata:
      plots:
        dropoff_scatter:
          kind: scatter
          x: dropoff_x
          y: dropoff_y
          datashade: True
          width: 800
          height: 600
```

Viewing a plot defined in a catalogue is as simple as calling: ``intake.cat.nyc_taxi.hvplot.dropoff_scatter()``

## Try it out

We hope you'll give hvPlot a try and that it makes your visualization workflows a little bit easier and more interactive. Let us know how it goes, and don't hesitate to file issues or suggest improvements to the library. To get started, follow the installation instructions below and [visit the website](https://hvplot.pyviz.org/).

### Installation

hvPlot supports Python 2.7, 3.5, 3.6 and 3.7 on Linux, Windows, or Mac and can be installed with conda:

```
conda install -c pyviz hvplot
```

or with pip:

```
pip install hvplot
```

For JupyterLab support, the jupyterlab_pyviz extension is also required:

```
jupyter labextension install @pyviz/jupyterlab_pyviz
```

### Acknowledgements

hvPlot was built with the support of Anaconda Inc. Special thanks to all the contributors:

* Philipp Rudiger (@philippjfr)
* Julia Signell (@jsignell)
* James A. Bednar (@jbednar)
* Andrew Huang (@ahuang11)
* Jean-Luc Stevens (@jlstevens)