
<img src="./images/collage_ds0.13.png" width="100%"></img>

## What is Datashader?

[Datashader](https://datashader.org) is an open-source Python library for rendering large datasets quickly and accurately.


## Announcing Datashader 0.13!

We are very pleased to announce the 0.13 release of Datashader (building on 0.12.1)! This release includes new features from a slew of different contributors, plus maintenance and bug fixes from Jim Bednar, Philipp Rudiger, Peter Roelants, Thuy Do Thi Minh, Chris Ball, and Jean-Luc Stevens. What's new:

## Matplotlib Artist for Datashader

_Thanks to Nezar Abdennur (nvictus), Trevor Manz, Thomas Caswell, and Philipp Rudiger._

Datashader works best when embedded in an interactive plotting library, so that data can be revealed at every spatial scale by zooming and panning. Thomas Caswell made a [draft of Datashader support for Matplotlib](https://github.com/holoviz/datashader/pull/200) during SciPy 2016, when Datashader was first announced, but a lot of work was still needed to make it general. Various people made suggestions, but the sketch largely sat waiting for someone to finish it. In the meantime, Thomas Robitaille made a simpler points-only renderer, [mpl-scatter-density](https://github.com/astrofrog/mpl-scatter-density), which is useful if that's all that's needed. During sprints at SciPy 2020, Nezar Abdennur and Trevor Manz resuscitated Tom's work, and it has now been released at last! You can now use all the power of Datashader with any of Matplotlib's many backends, e.g. here for the `osx` backend:

<div style="margin: 0 auto; width: 800px"><img src="./images/dsshow_code.png"></div>
<div style="margin: 0 auto; width: 800px"><video src="./images/dsshow.mp4" controls width=800></video></div>

See [getting_started/Interactivity](https://datashader.org/getting_started/Interactivity.html#native-support-for-matplotlib) for how to use it.

## Much more powerful categorical plotting

_Thanks to Oleg Smirnov._

One of Datashader's most powerful features is its categorical binning and categorical colormapping, which allow detailed understanding of how the distribution of data differs by some other variable, such as this plot of how population is segregated by race in New York City:

https://examples.pyviz.org/census/census.html

To build such a plot, Datashader calculates a stack of aggregate arrays simultaneously, one per category, instead of a single aggregate array as in the non-categorical case.

Previously, categorical binning and plotting were limited to a `count()` reduction, i.e., counting how many datapoints fell into each pixel, by category, implemented using a special `count_cat()` reduction. Categorical plotting has now been fully generalized into a new `ds.by()` reduction, which accepts a categorical column along with `count()` or any other reduction (`max()`, `min()`, `mean()`, `sum()`, etc.). Thus it's now possible to plot the mean value of any column, per pixel, per category. See the [Pipeline docs](https://datashader.org/getting_started/Pipeline.html) for details.
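To make the idea of a per-category stack of aggregates concrete, here is a minimal plain-Python sketch that computes a per-pixel, per-category mean. It only illustrates the concept and is not how Datashader implements it; the function name `mean_by_category`, the toy data, and the column names mentioned below are all hypothetical.

```python
from collections import defaultdict

def mean_by_category(points, width, height, x_range, y_range):
    """Return {category: grid}, where grid[row][col] is the mean value of that
    category's points falling in the pixel, or None if the pixel is empty."""
    (x0, x1), (y0, y1) = x_range, y_range
    # One running-sum grid and one count grid per category (the "stack").
    sums = defaultdict(lambda: [[0.0] * width for _ in range(height)])
    counts = defaultdict(lambda: [[0] * width for _ in range(height)])
    for x, y, cat, val in points:
        if not (x0 <= x <= x1 and y0 <= y <= y1):
            continue  # outside the plotted region
        # Map data coordinates to pixel indices, keeping the upper edge in range.
        col = min(int((x - x0) / (x1 - x0) * width), width - 1)
        row = min(int((y - y0) / (y1 - y0) * height), height - 1)
        sums[cat][row][col] += val
        counts[cat][row][col] += 1
    return {
        cat: [
            [
                sums[cat][r][c] / counts[cat][r][c] if counts[cat][r][c] else None
                for c in range(width)
            ]
            for r in range(height)
        ]
        for cat in sums
    }

# Toy data: (x, y, category, value)
points = [
    (0.1, 0.1, "a", 2.0),
    (0.2, 0.2, "a", 4.0),
    (0.9, 0.9, "a", 10.0),
    (0.1, 0.1, "b", 7.0),
]
stack = mean_by_category(points, width=2, height=2, x_range=(0, 1), y_range=(0, 1))
```

In Datashader itself, the equivalent reduction would be requested with something along the lines of `ds.by('cat', ds.mean('val'))`, where `cat` and `val` stand in for your own column names.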

You can also now use categorical binning and plotting with numerical columns using the new functions `category_modulo` and `category_binning`, which opens up entirely new applications for Datashader. For instance, `by(category_binning('z', 0, 10, 16))` will bin by the floating-point column `z`, with 16 categories (0: 0<=z<10, 1: 10<=z<20, etc.). `category_binning` effectively gives Datashader the power to do 3D aggregations of numeric axes, not just the usual 2D.

`category_modulo` is useful when working with very large numbers of unsorted integers, using a `modulo` operator on an integer column to reduce a large number of categories down to something more tractable for plotting.
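Both ideas boil down to mapping a number to a small category index. The plain-Python sketch below shows that mapping for the two cases described above. The helper names and their parameters (`lower`, `bin_width`, `nbins`, `modulo`) are hypothetical and chosen to mirror the example in the text; they are not the signatures of Datashader's `category_binning` or `category_modulo`, and clamping out-of-range values to the edge bins is an assumption of this sketch.

```python
def bin_category(value, lower, bin_width, nbins):
    """Equal-width binning: category k covers lower + k*bin_width <= value < lower + (k+1)*bin_width.
    Values outside the covered range are clamped to the first or last bin
    (an assumption made for this sketch)."""
    index = int((value - lower) // bin_width)
    return max(0, min(index, nbins - 1))

def modulo_category(value, modulo):
    """Fold an arbitrary integer ID into one of `modulo` categories."""
    return value % modulo

# 16 bins of width 10 starting at 0, as in the example in the text.
z_values = [0.0, 9.99, 10.0, 25.0, 159.9, 1000.0, -3.0]
z_cats = [bin_category(z, lower=0, bin_width=10, nbins=16) for z in z_values]

# Many distinct integer IDs reduced to at most 20 categories.
ids = [3, 1003, 42, 1_000_007]
id_cats = [modulo_category(i, 20) for i in ids]
```

Each resulting index then selects which layer of the categorical aggregate stack a datapoint contributes to, which is what makes a binned numeric column behave like a third aggregation axis.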

See [#875](https://github.com/holoviz/datashader/pull/875) and [#927](https://github.com/holoviz/datashader/pull/927) for details on `by`, `category_modulo`, and `category_binning`.

## dynspread that actually works!

_Thanks to Jim Bednar_.

Datashader's points plotting is designed to aggregate datapoints by pixel, accurately counting how many datapoints fell into each pixel. For large datasets, such a plot will accurately reveal the spatial distribution of the data over the axes plotted. However, a consequence is that an individual datapoint not surrounded by others will show up as a single pixel, which can be difficult to see on a high-resolution monitor, and its color is almost impossible to make out. To alleviate this issue and make it easier to go back and forth between the big picture and individual datapoints, Datashader has long offered the `dynspread` output-transformation function, which takes each pixel and dilates it (increases it in size) until the density of such points reaches a specified metric value. However, `dynspread` never worked very well in practice, always either doing no spreading or one step of spreading (a 3x3 kernel). After a fresh look at the code, it became clear that the first step of spreading was artificially increasing the estimated pixel density, making it very unlikely that a second or third step would ever be done.

`dynspread` now spreads each pixel by an integer radius `px` up to the maximum radius `max_px`, stopping earlier if a specified fraction of data points have non-empty neighbors within the radius. This new definition provides predictable, well-behaved `dynspread` behavior even for large values of `max_px`, making isolated datapoints easily visible. ([#1001](https://github.com/holoviz/datashader/pull/1001))
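The stopping rule described above can be sketched in plain Python. This is a simplified reading of the description, not Datashader's actual code: the helper names are hypothetical, the `threshold` parameter name and its default are assumptions, the neighborhood is square, and the choice to settle on the last radius before the threshold is exceeded is also an assumption made for illustration.

```python
def neighbor_fraction(grid, radius):
    """Fraction of non-empty pixels that have at least one other non-empty
    pixel within `radius` pixels (square neighborhood)."""
    height, width = len(grid), len(grid[0])
    filled = [(r, c) for r in range(height) for c in range(width) if grid[r][c]]
    if not filled:
        return 0.0
    with_neighbor = 0
    for r, c in filled:
        found = any(
            grid[rr][cc]
            for rr in range(max(0, r - radius), min(height, r + radius + 1))
            for cc in range(max(0, c - radius), min(width, c + radius + 1))
            if (rr, cc) != (r, c)
        )
        with_neighbor += found
    return with_neighbor / len(filled)

def choose_spread_radius(grid, threshold=0.5, max_px=3):
    """Grow the radius one pixel at a time, stopping before the radius at which
    more than `threshold` of the points would run into a neighbor."""
    chosen = 0
    for px in range(1, max_px + 1):
        if neighbor_fraction(grid, px) > threshold:
            break
        chosen = px
    return chosen

def make_grid(size, points):
    """Build a size x size grid of counts with a 1 at each (row, col) in `points`."""
    grid = [[0] * size for _ in range(size)]
    for r, c in points:
        grid[r][c] = 1
    return grid

isolated = make_grid(7, [(0, 0), (6, 6)])   # far apart: spread to the maximum
nearby = make_grid(7, [(0, 0), (0, 3)])     # 3 pixels apart: stop at radius 2
touching = make_grid(7, [(0, 0), (0, 1)])   # already adjacent: no spreading

radii = [choose_spread_radius(g) for g in (isolated, nearby, touching)]
```

The point of the sketch is that the decision depends only on the original, unspread points, so an earlier spreading step cannot inflate the density estimate used for later steps.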

<div style="margin: 0 auto; width: 800px"><video src="https://user-images.githubusercontent.com/1695496/114293108-0ca77500-9a59-11eb-86ed-9679603e7fd1.mp4" controls width=800></video></div>

Note that this definition is only compatible with points, as they are spatially isolated; any usage of `dynspread` with datatypes other than points should be replaced with `spread()`, which will do what was probably intended by the original `dynspread` call anyway (i.e., to make a line or polygon edge thicker).

## Aggregate spreading

_Thanks to Jean-Luc Stevens_.

Spreading previously worked only on RGB arrays, not numerical aggregate arrays, which meant that Datashader users had to choose between seeing isolated datapoints and having interactive features like Bokeh's hover tool and colorbars that require access to the numerical aggregate values. `spread` and `dynspread` now work equally well with either RGB arrays or numerical aggregates, and we now recommend that users spread at the numerical aggregate level in all supported cases. ([#771](https://github.com/holoviz/datashader/pull/771), [#954](https://github.com/holoviz/datashader/pull/954))
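As a rough illustration of what spreading a numerical aggregate means, the plain-Python sketch below dilates each non-empty pixel of a small count grid into its square neighborhood, summing where neighborhoods overlap. It is not Datashader's implementation: the function name `spread_counts` is hypothetical, and the square shape and additive combination are assumptions chosen for simplicity.

```python
def spread_counts(grid, px=1):
    """Copy each non-zero value into every pixel within `px` of it,
    adding values together where spread neighborhoods overlap."""
    height, width = len(grid), len(grid[0])
    out = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            value = grid[r][c]
            if not value:
                continue
            for rr in range(max(0, r - px), min(height, r + px + 1)):
                for cc in range(max(0, c - px), min(width, c + px + 1)):
                    out[rr][cc] += value
    return out

agg = [
    [0, 0, 0, 0],
    [0, 5, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 2],
]
spread_agg = spread_counts(agg, px=1)
```

Because the result is still a grid of numbers rather than colors, tools that read the underlying values, such as a hover tool or a colorbar, keep working after spreading.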

## Anti-aliasing (experimental)

_Thanks to Valentin Haenel._

Datashader's line aggregations (also used in trimesh and network plotting) count how many times a line crosses a given pixel. The resulting line plots are very blocky, because of binary transitions between rows and columns depending on where the underlying line lands in the aggregate array grid. To improve the appearance of such lines (at the cost of making them less easy to interpret as counts of crossings), Datashader now supports antialiased lines. This support is only partial and is still experimental; it's enabled by adding `antialias=True` to the `Canvas.line()` method call and is currently restricted to `sum` and `max` reductions only, and to a single-pixel line width. ([#916](https://github.com/holoviz/datashader/pull/916))

<hr>

## Help us!

Datashader is an open-source project and we are always looking for new contributors. Join the discussion on [Discourse](https://discourse.holoviz.org/), and we would be very excited to help you get started contributing! Also, please [get in touch with us](mailto:jbednar@anaconda.com) if you work at an organization that would like to support future Datashader development, fund new Datashader features, or set up a support contract.