!["Anaconda"](img/anaconda-logo.png)
<br>
*Copyright Continuum 2012-2016 All Rights Reserved.*

# Dask Dependency Graphs

## Table of Contents
* [Dask Dependency Graphs](#Dask-Dependency-Graphs)
	* [Set-up](#Set-up)
* [Dask.array vs NumPy Array](#Dask.array-vs-NumPy-Array)
	* [1D Array of Ones](#1D-Array-of-Ones)
	* [2D Array: More Interesting Graphs](#2D-Array:-More-Interesting-Graphs)
* [Execution Examples](#Execution-Examples)
	* [Embarrassingly Parallel](#Embarrassingly-Parallel)
	* [More Reduction Required](#More-Reduction-Required)


## Set-up

In [None]:
# Requires graphviz install
# !conda install -y graphviz

In [None]:
# from dask.dot import dot_graph
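
By default, rendering a graph with `visualize` relies on Graphviz: both the system `dot` executable and the Python `graphviz` bindings. A minimal availability check, using only the standard library, might look like the sketch below. The variable names are illustrative and not part of Dask.

```python
import importlib.util
import shutil

# The Graphviz command-line renderer (installed by `conda install graphviz`).
has_dot_binary = shutil.which("dot") is not None
# The Python bindings that Dask imports when it draws a graph.
has_python_bindings = importlib.util.find_spec("graphviz") is not None

print("dot executable found:", has_dot_binary)
print("graphviz Python package found:", has_python_bindings)
```

If either check prints `False`, the `visualize` calls later in this notebook will likely fail until the missing piece is installed.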


Example using Dask.array to generate Dask graphs
=============================

This example builds intuition for block-parallel algorithms on n-dimensional arrays, and for how Dask works, by visualizing the task graphs that dask.array generates for increasingly complex computations.

It tends to be nicer to work through with a terminal and an auto-refreshing PDF viewer side by side.  The GUI Graphviz client 

# Dask.array vs NumPy Array

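Dask.array looks and feels like NumPy, except that operations are lazy: they build a task graph, and nothing is evaluated until `.compute()` is called.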

In [None]:
import numpy as np
import dask.array as da

In [None]:
np.arange(10)

In [None]:
da.arange(10, chunks=5)

In [None]:
da.arange(10, chunks=5).compute()
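
To make the difference concrete, the sketch below builds the same array lazily, inspects its block structure, and only then materializes it as a NumPy array. The variable names `lazy` and `result` are illustrative.

```python
import numpy as np
import dask.array as da

lazy = da.arange(10, chunks=5)   # builds a task graph; no values are computed yet
print(type(lazy))                # a dask array, not a NumPy array
print(lazy.chunks)               # block sizes along each axis

result = lazy.compute()          # runs the graph and returns a NumPy array
print(type(result))
```

The `chunks` attribute is what determines how many blocks, and therefore how many independent tasks, each operation produces.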

## 1D Array of Ones

This breaks up an array of 15 ones into three blocks, each of size 5.

If you are doing this in the terminal, pass an appropriate filename to the `visualize` call, like so:

    x.visualize('dask.pdf')

In [None]:
x = da.ones(15, chunks=(5,))
x.visualize('dask.dot')
#x.visualize('dask.svg')

In [None]:
# Open the rendered file with the platform's default viewer.
# On Windows, rely on file associations:
#!dask.dot
# On macOS, use the `open` utility:
!open dask.dot
# On Linux, use one of these utilities:
#!xdg-open dask.dot
#!mimeopen -d dask.dot

In [None]:
(x + 1).visualize('dask.dot')
#(x + 1).visualize('dask.svg')

In [None]:
(x + 1).sum().visualize('dask.dot')
#(x + 1).sum().visualize('dask.svg')
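
The same graphs can be examined without rendering anything. As a rough sketch, the code below counts the tasks in each graph through the `__dask_graph__` collection method. The exact counts depend on the installed Dask version, so they are printed rather than stated here.

```python
import dask.array as da

x = da.ones(15, chunks=(5,))
print(x.chunks)      # block sizes along the single axis
print(x.numblocks)   # number of blocks along each axis

# Each operation adds tasks to the graph; count them without drawing it.
for label, arr in [("x", x), ("x + 1", x + 1), ("(x + 1).sum()", (x + 1).sum())]:
    n_tasks = len(dict(arr.__dask_graph__()))
    print(label, "->", n_tasks, "tasks")

# Running the last graph adds 1 to each of the 15 ones and sums them.
total = (x + 1).sum().compute()
print(total)
```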

## 2D Array: More Interesting Graphs

This is a 15x15 array of ones, broken up into a 3x3 grid of blocks.  Each block has size 5x5.

In [None]:
x = da.ones((15, 15), chunks=(5, 5))
x.visualize('dask.dot')
#x.visualize('dask.svg')

In [None]:
(x + 1).sum(axis=0).visualize('dask.dot')
#(x + 1).sum(axis=0).visualize('dask.svg')
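
A reduction along one axis removes that axis from the block grid while the other axis keeps its blocks. The sketch below checks this on the same 15x15 array. The names `col_sums` and `values` are illustrative.

```python
import dask.array as da

x = da.ones((15, 15), chunks=(5, 5))
col_sums = (x + 1).sum(axis=0)

print(x.numblocks)       # blocks along each axis of the input
print(col_sums.chunks)   # axis 0 is reduced away; axis 1 keeps its three blocks

# Every column holds fifteen 2s, so every entry of the result is the same.
values = col_sums.compute()
print(values)
```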

In [None]:
# this one is fun to guess before you render it to the screen
(x + x.T).visualize('dask.dot')
#(x + x.T).visualize('dask.svg')
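
One way to reason about the shape of this graph: block `(i, j)` of `x.T` is the transpose of block `(j, i)` of `x`, so each output block needs two input blocks. The NumPy-only sketch below spells out that pairing by hand. It illustrates the idea and is not Dask's actual implementation, and the helper `block` is hypothetical.

```python
import numpy as np

n, b = 15, 5                          # array side and block side, as in the cell above
nb = n // b                           # number of blocks along each axis
a = np.arange(n * n).reshape(n, n)    # distinct values, so a wrong pairing would show


def block(m, i, j):
    """Return block (i, j) of matrix m as a view."""
    return m[i * b:(i + 1) * b, j * b:(j + 1) * b]


out = np.empty_like(a)
for i in range(nb):
    for j in range(nb):
        # Output block (i, j) combines block (i, j) with the transposed block (j, i).
        out[i * b:(i + 1) * b, j * b:(j + 1) * b] = block(a, i, j) + block(a, j, i).T

print(np.array_equal(out, a + a.T))
```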

In [None]:
# Now we just start showing off
(x.dot(x.T)).visualize('dask.dot')
#(x.dot(x.T)).visualize('dask.svg')
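
The fan-in visible in this graph matches the textbook blocked matrix multiply, in which output block `(i, j)` is the sum over `k` of block `(i, k)` of the left operand times block `(k, j)` of the right operand. The NumPy-only sketch below shows that algorithm under the same 5x5 blocking. It is an illustration, not a copy of what Dask does internally.

```python
import numpy as np

n, b = 15, 5                              # array side and block side
nb = n // b                               # blocks along each axis
rng = np.random.default_rng(0)
a = rng.integers(0, 10, size=(n, n))      # integer data keeps the comparison exact
y = a.T                                   # right operand, mirroring x.dot(x.T)

out = np.zeros((n, n), dtype=a.dtype)
for i in range(nb):
    for j in range(nb):
        rows = slice(i * b, (i + 1) * b)
        cols = slice(j * b, (j + 1) * b)
        for k in range(nb):
            inner = slice(k * b, (k + 1) * b)
            # Accumulate one partial product into output block (i, j).
            out[rows, cols] += a[rows, inner] @ y[inner, cols]

print(np.array_equal(out, a.dot(y)))
```

Each output block depends on three partial products, which is why the rendered graph shows a reduction step on top of the multiplications.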

In [None]:
(x.dot(x.T + 1)).visualize('dask.dot')
#(x.dot(x.T + 1)).visualize('dask.svg')

In [None]:
(x.dot(x.T + 1) - x.mean(axis=0)).visualize('dask.dot')
#(x.dot(x.T + 1) - x.mean(axis=0)).visualize('dask.svg')

In [None]:
(x.dot(x.T + 1) - x.mean(axis=0)).std().visualize('dask.dot')
#(x.dot(x.T + 1) - x.mean(axis=0)).std().visualize('dask.svg')

In [None]:
# This actually runs the computation: worker threads walk the full graph,
# executing every task (the circles in the rendered graph) to produce the final result
%time (x.dot(x.T + 1) - x.mean(axis=0)).std().compute()
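
As a sanity check, the same expression can be evaluated with plain NumPy and compared with the Dask result. The sketch below also passes the `scheduler` keyword to `compute` to run the graph on a thread pool and then on a single thread. The variable names are illustrative.

```python
import numpy as np
import dask.array as da

x = da.ones((15, 15), chunks=(5, 5))
expr = (x.dot(x.T + 1) - x.mean(axis=0)).std()

threaded = expr.compute(scheduler="threads")      # thread pool, the default for arrays
serial = expr.compute(scheduler="synchronous")    # one thread, handy for debugging

# The same computation in NumPy, with no blocking at all.
x_np = np.ones((15, 15))
expected = (x_np.dot(x_np.T + 1) - x_np.mean(axis=0)).std()

print(threaded, serial, expected)
```

Because every entry of the intermediate array is identical, the standard deviation is zero, which makes this a convenient correctness check rather than a benchmark.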

# Execution Examples

The animations below show different computations from those illustrated above, but they give a feel for how computation flows across many cores.

## Embarrassingly Parallel

First, an embarrassingly parallel example:

<img src="img/embarrassing.gif"/>
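
The animation does not say which computation it shows. As an assumed stand-in, an elementwise operation is a typical embarrassingly parallel case, because each output block depends on exactly one input block. The sketch below uses `map_blocks` to apply a NumPy function to each block independently.

```python
import numpy as np
import dask.array as da

x = da.arange(15, chunks=5)
squared = x.map_blocks(np.square)   # one independent task per block, no communication

print(squared.chunks)               # same block structure as the input
result = squared.compute()
print(result)
```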

## More Reduction Required

Cases that need more reduction steps parallelize less cleanly, but Dask still resolves them:

<img src="img/fail-case.gif"/>
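
The computation behind this animation is likewise not specified. As a loosely related sketch, the code below builds the same sum as a deep and as a shallow reduction tree using the `split_every` keyword that Dask reductions accept. Task counts are printed rather than stated, since they may vary between Dask versions.

```python
import dask.array as da

x = da.ones(64, chunks=4)            # 16 blocks of 4 elements each

deep = x.sum(split_every=2)          # combine two partial results at a time
shallow = x.sum(split_every=16)      # combine all 16 partial results at once

# A deeper tree means more, smaller reduction tasks in the graph.
n_deep = len(dict(deep.__dask_graph__()))
n_shallow = len(dict(shallow.__dask_graph__()))
print("tasks with split_every=2:", n_deep)
print("tasks with split_every=16:", n_shallow)

deep_total = deep.compute()
shallow_total = shallow.compute()
print(deep_total, shallow_total)
```

Both trees give the same answer. They trade the number of reduction rounds against how many partial results each task has to hold at once.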
