# BMI565 - Bioinformatics Programming & Scripting

## Data Visualization With [D3](https://d3js.org)

### Table of Contents

1. [Introduction](#Introduction)
    * Dependencies
    * Installation
    * Setup
2. [In-Class Exercises](#In---Class-Exercises)
3. [Reference](#Reference)

### Introduction

#### Dependencies

The following dependencies are required to run this notebook.

1. Python
    * Python 3.x.x
2. Python Libraries
    * Pandas 0.20.3
    * Jupyter 1.0.0
    * Jupyter Dashboards 0.7.0
    * Jupyter Dashboards Bundler 0.9.1
    * Ipywidgets 6.0.0
3. JavaScript
    * Node.js 5.x +
    * NPM 3.5.x +
    
#### Installation
Paste the following commands into your terminal and run them.
```bash
pip3 install pandas==0.20.3  # Pandas (Introduces DataFrame data structure, similar to R)
pip3 install jupyter==1.0.0  # Jupyter Notebook
pip3 install jupyter_dashboards==0.7.0  # Jupyter Dashboards
jupyter dashboards quick-setup --sys-prefix  # Jupyter Dashboards - Extension Setup Step
pip3 install jupyter_dashboards_bundlers==0.9.1  # Jupyter Dashboard Bundler
jupyter bundlerextension enable --sys-prefix --py dashboards_bundlers  # Jupyter Bundler - Extension Setup Step
pip3 install ipywidgets==6.0.0 # Ipywidgets
jupyter nbextension enable --py --sys-prefix widgetsnbextension  # Ipywidgets - Extension Setup Step
```

#### Setup

In [37]:
# Render Plots Inline
%matplotlib notebook

# Imports: Standard Library
import re as Rgx
import math as Math
import json as JSON

# Imports: Third Party
import pandas as Pandas
import matplotlib.pyplot as Plot
import mpld3
from ipywidgets import *
from IPython.display import display
from IPython.display import clear_output
from IPython.display import Javascript

In the previous lesson, we used a native Python library (`mpld3`) to render a D3 plot from matplotlib. It is also possible to render interactive data visualizations in pure D3. Offloading visualization tasks to D3 and other JavaScript visualization libraries has a number of advantages, along with some disadvantages:

##### Advantages
* Highly customizable visualization
* Custom event-based interactions
* Crisp rendering via SVG
* No superfluous tool bars / UI
* Improved performance
* Still able to leverage the power of Python for data manipulation

##### Disadvantages
* Requires multiple languages (JavaScript, HTML, CSS, in addition to Python)
* Number of data points often limited by browser memory (roughly 2,000 - 10,000 points)
* Within a Jupyter Notebook, debugging can be tricky

While D3.js is the de facto library for visualization in JavaScript, it can be a bit complex at first. To simplify things, we can use a library abstraction that lets us

#### Running JavaScript Using Jupyter Magics

In [38]:
%%javascript 

// Call a Beloved JavaScript Alert
alert("I Ran From a JavaScript Cell in a Python Notebook!")

<IPython.core.display.Javascript object>

Above is a JavaScript cell `magic`. It tells Jupyter to treat the contents of the cell as JavaScript, which runs in the browser, instead of as Python or Markdown.

To generate an interactive visualization, we need to do four things:
* Load D3.js into the notebook environment
* Make the data we'd like to visualize available to the JavaScript engine
* Create an HTML element to bind our SVG visualization to
* Write the visualization code

The first two are fairly trivial and have been adapted from [this blogpost](http://blog.thedataincubator.com/2015/08/embedding-d3-in-an-ipython-notebook/).

#### Load Visualization JavaScript Libraries

In [39]:
%%javascript

// Loading a JS Library
require.config({
    paths: {
        // Notice We Omit The '.js' Extension From The CDN Paths
        vega : '//cdnjs.cloudflare.com/ajax/libs/vega/3.0.2/vega.min'
    }
});

<IPython.core.display.Javascript object>

Here we use the require.js library (which Jupyter loads automatically for its own purposes) to register a new library, Vega. This method of loading external libraries from a CDN ([content delivery network](https://en.wikipedia.org/wiki/Content_delivery_network)) can be used to load any JavaScript library into the Jupyter environment.
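The same configuration can also be generated from Python, which is handy when the list of libraries lives in Python code. The sketch below is one possible approach, not part of the original lesson: `build_require_config` is a hypothetical helper name, and it only builds the JavaScript text. It relies on the fact that a JSON object is also a valid JavaScript object literal.

```python
import json


def build_require_config(paths):
    """Return a require.config(...) call as JavaScript source text.

    `paths` maps a module name to its CDN URL (hypothetical helper).
    """
    # require.js appends ".js" itself, so strip the extension if present
    cleaned = {
        name: url[:-3] if url.endswith(".js") else url
        for name, url in paths.items()
    }
    # json.dumps produces text that is also a valid JavaScript object literal
    return "require.config({});".format(json.dumps({"paths": cleaned}, indent=2))


config_js = build_require_config({
    "vega": "//cdnjs.cloudflare.com/ajax/libs/vega/3.0.2/vega.min.js",
})
print(config_js)
```

Inside the notebook, the resulting string could then be passed to `Javascript(config_js)` from `IPython.display` to have the same effect as the `%%javascript` cell above.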

#### Loading Data

In [40]:
with open("data/iris.json", "r") as json:
    data = JSON.load(json)

#### Passing the Data

The data needs to be made available to Vega. We can do this by binding a new property, `visualizationData`, to the global `window` object.

In [41]:
Javascript("window.visualizationData={};".format(JSON.dumps(data)))

<IPython.core.display.Javascript object>

Let's break this down, as there's a lot going on here:

    Javascript("window.visualizationData={};".format(JSON.dumps(data)))
    |________|  |__________________________| |_______________________|
         |                    |                          |
       Python            JavaScript                    Python

Let's start with the JavaScript. We first bind a new property, `visualizationData`, to the `window` object, which is effectively the `global` scope in JavaScript (i.e. it is accessible to any function created within the browser's JavaScript environment).

On the right, Python uses the `format` method to insert our data, serialized as a JSON (JavaScript Object Notation) string, into the JavaScript statement in place of the `{}` placeholder.

All of this happens before any JavaScript runs. The Python on the left then wraps the finished string in the `Javascript` display class, imported from `IPython.display`, which sends it to the browser for evaluation.
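To see what the browser actually receives, it can help to build the string outside the notebook. The following is a minimal sketch: the two rows in `sample` are made up for illustration and stand in for the real contents of `data/iris.json`.

```python
import json

# Two made-up rows standing in for the contents of data/iris.json
sample = [
    {"sepalLength": 5.1, "species": "setosa"},
    {"sepalLength": 7.0, "species": "versicolor"},
]

# Same pattern as the cell above: "{}" is replaced by the JSON text
script = "window.visualizationData={};".format(json.dumps(sample))
print(script)

# Everything between "=" and the trailing ";" is plain JSON,
# so it parses back into the same Python rows
payload = script[len("window.visualizationData="):-1]
assert json.loads(payload) == sample
```

One caveat with this pattern: `str.format` treats every `{` and `}` in the template as a placeholder, so a JavaScript template containing its own braces would need them doubled (`{{` and `}}`) or would need to be assembled by string concatenation instead.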

#### Create an HTML Element to Bind Our Visualization To

In [42]:
%%javascript

// Define Visualization & Append To Global Scope
window.visualizationDef = {
  "$schema": "https://vega.github.io/schema/vega/v3.0.json",
  "width": 500,
  "padding": 5,

  "config": {
    "axisBand": {
      "bandPosition": 1,
      "tickExtra": true,
      "tickOffset": 0
    }
  },

  "signals": [
    { "name": "fields",
      "value": ["petalWidth", "petalLength", "sepalWidth", "sepalLength"] },
    { "name": "plotWidth", "value": 60 },
    { "name": "height", "update": "(plotWidth + 10) * length(fields)"},
    { "name": "bandwidth", "value": 0,
      "bind": {"input": "range", "min": 0, "max": 0.5, "step": 0.005} },
    { "name": "steps", "value": 100,
      "bind": {"input": "range", "min": 10, "max": 500, "step": 1} }
  ],

  "data": [
    {
      "name": "iris",
      "values": window.visualizationData,
      "transform": [
        {
          "type": "fold",
          "fields": {"signal": "fields"},
          "as": ["organ", "value"]
        }
      ]
    }
  ],

  "scales": [
    {
      "name": "layout",
      "type": "band",
      "range": "height",
      "domain": {"data": "iris", "field": "organ"}
    },
    {
      "name": "xscale",
      "type": "linear",
      "range": "width", "round": true,
      "domain": {"data": "iris", "field": "value"},
      "zero": true, "nice": true
    },
    {
      "name": "color",
      "type": "ordinal",
      "range": "category"
    }
  ],

  "axes": [
    {"orient": "bottom", "scale": "xscale", "zindex": 1},
    {"orient": "left", "scale": "layout", "tickCount": 5, "zindex": 1}
  ],

  "marks": [
    {
      "type": "group",
      "from": {
        "facet": {
          "data": "iris",
          "name": "organs",
          "groupby": "organ"
        }
      },

      "encode": {
        "enter": {
          "yc": {"scale": "layout", "field": "organ", "band": 0.5},
          "height": {"signal": "plotWidth"},
          "width": {"signal": "width"}
        }
      },

      "data": [
        {
          "name": "density",
          "transform": [
            {
              "type": "density",
              "steps": {"signal": "steps"},
              "distribution": {
                "function": "kde",
                "from": "organs",
                "field": "value",
                "bandwidth": {"signal": "bandwidth"}
              }
            },
            {
              "type": "stack",
              "groupby": ["value"],
              "field": "density",
              "offset": "center",
              "as": ["y0", "y1"]
            }
          ]
        },
        {
          "name": "summary",
          "source": "organs",
          "transform": [
            {
              "type": "aggregate",
              "fields": ["value", "value", "value"],
              "ops": ["q1", "median", "q3"],
              "as": ["q1", "median", "q3"]
            }
          ]
        }
      ],

      "scales": [
        {
          "name": "yscale",
          "type": "linear",
          "range": [0, {"signal": "plotWidth"}],
          "domain": {"data": "density", "field": "density"}
        }
      ],

      "marks": [
        {
          "type": "area",
          "from": {"data": "density"},
          "encode": {
            "enter": {
              "fill": {"scale": "color", "field": {"parent": "organ"}}
            },
            "update": {
              "x": {"scale": "xscale", "field": "value"},
              "y": {"scale": "yscale", "field": "y0"},
              "y2": {"scale": "yscale", "field": "y1"}
            }
          }
        },
        {
          "type": "rect",
          "from": {"data": "summary"},
          "encode": {
            "enter": {
              "fill": {"value": "black"},
              "height": {"value": 2}
            },
            "update": {
              "yc": {"signal": "plotWidth / 2"},
              "x": {"scale": "xscale", "field": "q1"},
              "x2": {"scale": "xscale", "field": "q3"}
            }
          }
        },
        {
          "type": "rect",
          "from": {"data": "summary"},
          "encode": {
            "enter": {
              "fill": {"value": "black"},
              "width": {"value": 2},
              "height": {"value": 8}
            },
            "update": {
              "yc": {"signal": "plotWidth / 2"},
              "x": {"scale": "xscale", "field": "median"}
            }
          }
        }
      ]
    }
  ]
}

<IPython.core.display.Javascript object>
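The `fold` transform in the `data` block above is what turns each flower record into one row per measured field, with the field name stored under `organ` and the measurement under `value`. As a rough illustration of that reshaping, here is a plain-Python approximation. It is not Vega's implementation, the `fold` function is a hypothetical stand-in, and the sample row is made up.

```python
def fold(records, fields, as_=("key", "value")):
    """Approximate a fold transform: one output row per (record, field) pair."""
    key_name, value_name = as_
    folded = []
    for record in records:
        for field in fields:
            row = dict(record)               # keep the original fields
            row[key_name] = field            # name of the folded field
            row[value_name] = record[field]  # value of the folded field
            folded.append(row)
    return folded


# A single made-up record in the same shape as the iris data
rows = [{"petalWidth": 0.2, "petalLength": 1.4, "species": "setosa"}]

long_rows = fold(rows, ["petalWidth", "petalLength"], as_=("organ", "value"))
for row in long_rows:
    print(row["organ"], row["value"])
```

Each input record produces as many output rows as there are folded fields, which is why the later `facet` on `organ` can draw one violin per measurement.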

In [43]:
%%javascript

// Write HTML String To Create Visualization Element
var html = '<div id="visualization"></div>'

// Append String As HTML To Output Element – Uses jQuery
element.append(html);

// Require Vega Library Scope
require(['vega'], function (vega) {
    // Generate Vega Visualization
    var view = new vega.View(vega.parse(window.visualizationDef))
        .renderer('svg')              // Set Renderer Type (Canvas Or SVG)
        .initialize('#visualization') // Initialize View Within Parent Element
        .hover()                      // Enable Hover Encode Set Processing
        .run();                       // Call Visualization
    
});

<IPython.core.display.Javascript object>

Above, `element` is a reference to the HTML element of the current cell's output area, which Jupyter makes available to JavaScript cells. It is a jQuery object (jQuery being a widely used JavaScript library for document manipulation), which conveniently lets us call jQuery's `append` method to add an HTML element to the output.
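One thing to watch for with a fixed `id="visualization"` is that re-running the cell, or rendering a second chart, leaves several elements with the same id, and the `#visualization` selector then matches only the first one. A possible workaround, sketched below under the assumption that the script is later displayed with `Javascript(...)` in a classic notebook, is to generate the render script from Python with a fresh id each time. `build_render_script` is a hypothetical helper, not part of the lesson's code.

```python
import uuid


def build_render_script(spec_name="visualizationDef"):
    """Return (div_id, JavaScript text) rendering window[spec_name] into a new <div>."""
    # Unique id so a re-run never targets a container from an earlier run
    div_id = "vis-" + uuid.uuid4().hex
    # %-style formatting avoids having to escape the JavaScript braces
    template = """
element.append('<div id="%(div_id)s"></div>');
require(['vega'], function (vega) {
    new vega.View(vega.parse(window.%(spec_name)s))
        .renderer('svg')
        .initialize('#%(div_id)s')
        .hover()
        .run();
});
"""
    return div_id, template % {"div_id": div_id, "spec_name": spec_name}


div_id, render_js = build_render_script()
print(render_js)
```

The JavaScript body mirrors the cell above; only the container id changes between runs.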