# BMI565 - Bioinformatics Programming & Scripting

## Data Visualization With [D3](https://d3js.org)

### Table of Contents

1. [Introduction](#Introduction)
    * Dependencies
    * Installation
    * Setup
    * Overview
2. [JavaScript & Jupyter](#JavaScript-&-Jupyter)
    * Running JavaScript Using Jupyter Magics
    * Loading JavaScript Libraries
    * Passing Python Objects to JavaScript
    * Creating HTML Elements
3. [An Example Visualization With Vega.js](#An-Example-Visualization-With-Vega.js)
4. [In-Class Exercises](#In---Class-Exercises)
5. [Reference](#Reference)

### Introduction

#### Dependencies

The following dependencies are required to run this notebook.

1. Python
    * Python 3.x.x
2. Python Libraries
    * Pandas 0.20.3
    * Jupyter 1.0.0
3. JavaScript
    * Node.js 5.x +
    * NPM 3.5.x +
    
#### Installation
Paste the following command into your terminal and run it.
```bash
pip3 install jupyter==1.0.0  # Jupyter Notebook
```
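To confirm which of the Python dependencies listed above are already installed, a short check along these lines may help. This is a minimal sketch: the helper name `installed_versions` is hypothetical, and it assumes Python 3.8 or later, where `importlib.metadata` is part of the standard library.

```python
from importlib import metadata

# Package names taken from the dependency list above; adjust as needed.
REQUIRED = ["pandas", "jupyter"]

def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

for name, version in installed_versions(REQUIRED).items():
    print(f"{name}: {version or 'not installed'}")
```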

#### Setup

In [26]:
# Render Plots Inline
%matplotlib notebook

# Imports: Standard Library
import json as JSON

# Imports: Third Party
from IPython.display import Javascript # Included With Jupyter Installation

#### Overview
In the previous lesson, we used a native Python library (`mpld3`) to render a D3 plot using matplotlib & Python. It's also possible to render interactive data visualizations in pure D3. There are a number of advantages to offloading visualization tasks to D3 and other JavaScript visualization libraries, along with some disadvantages:

##### Advantages
* Highly customizable, potentially novel visualizations
* Highly customizable event-based interactions
* Crisp rendering via SVG
* No superfluous tool bars / UI
* Robust performance
* Still able to leverage the power of Python for data manipulation

##### Disadvantages
* Requires multiple languages (JavaScript, HTML, and CSS in addition to Python, not counting libraries)
* Number of data points often limited by browser memory (roughly 2,000 - 10,000 points)
* Within a Jupyter Notebook, debugging can be tricky

While D3.js is the de facto library for visualization in JavaScript, it can be a bit complex. Before embarking down the road of creating an interactive visualization with D3 for your data, it's worth considering whether D3 is the best solution for your use case.

Before showing a full example, we'll look at how to:
* Run JavaScript in a Jupyter notebook
* Dynamically load JavaScript libraries as needed
* Expose our data to the JavaScript environment
* Inject new HTML into a notebook cell

All of the above points are precursors to being able to do a notebook visualization; that said, they're not too tricky. Note that some of this workflow has been adapted from [this blog post](http://blog.thedataincubator.com/2015/08/embedding-d3-in-an-ipython-notebook/). Okay, let's jump in.

### JavaScript & Jupyter

#### Running JavaScript Using Jupyter Magics

In [27]:
%%javascript 

// Call a Beloved JavaScript Alert
alert("I Ran From a JavaScript Cell in a Python Notebook!")

<IPython.core.display.Javascript object>

Above is a JavaScript cell `magic`. It tells Jupyter to run the contents of this cell as JavaScript in the browser, rather than as code in the kernel's language (in this case, Python) or as Markdown.

#### Loading JavaScript Libraries

In [28]:
%%javascript

// Loading a JS Library
require.config({
    paths: {
        // Notice We Omit The '.js' Extension And Internet Protocol (e.g. https:) From The CDN Paths
        vega : '//cdnjs.cloudflare.com/ajax/libs/vega/3.0.2/vega.min'
    }
});

<IPython.core.display.Javascript object>

Here we're using the require.js library (automatically loaded by Jupyter for its own processes) to load a new library called Vega. This method of loading external libraries from a CDN ([content delivery network](https://en.wikipedia.org/wiki/Content_delivery_network)) can be used to load any arbitrary JavaScript library into the Jupyter environment as needed. However, it is best to load any libraries you'll need upfront.
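If several libraries need to be registered, the `require.config` call can also be generated from Python. The sketch below is one possible approach, not part of the original workflow: `require_path` and `require_config_js` are hypothetical helper names, and the helpers only apply the two rules noted in the cell above (drop the protocol and the `.js` extension).

```python
import json
import re

def require_path(url):
    """Strip the protocol and a trailing '.js' so require.js accepts the URL."""
    without_protocol = re.sub(r"^https?:", "", url)
    return re.sub(r"\.js$", "", without_protocol)

def require_config_js(libraries):
    """Build a require.config(...) call from a {name: url} mapping."""
    paths = {name: require_path(url) for name, url in libraries.items()}
    # The {} placeholder is replaced by the JSON-encoded configuration object.
    return "require.config({});".format(json.dumps({"paths": paths}))

# Usage with the Vega CDN URL from the cell above, written out in full.
print(require_config_js({
    "vega": "https://cdnjs.cloudflare.com/ajax/libs/vega/3.0.2/vega.min.js",
}))
```

The returned string could then be passed to `Javascript(...)` from `IPython.display` in place of a `%%javascript` cell.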

#### Passing Python Objects to JavaScript

The data we'll be using needs to be loaded and made available to Vega. We can do this by first loading the data into Python, then binding a new property `visualizationData` to the global `window` scope and injecting the data there as a JavaScript object. Don't worry if you find this code complicated at first; it is quite dense.

In [29]:
# Load The Classic Iris Dataset
with open("data/iris.json", "r") as json:
    data = JSON.load(json)

# Bind The Data To The Browser's Global Scope
Javascript("window.visualizationData={};".format(JSON.dumps(data)))

<IPython.core.display.Javascript object>

Let's break the line above down, as there's a lot going on:

    Javascript("window.visualizationData={};".format(JSON.dumps(data)))
    |_________||____________________________||________________________|
         |                    |                           |
      Python        String of JavaScript               Python

Let's start with the string of JavaScript `"window.visualizationData={};"`. Once the string becomes executable JavaScript, it will bind a new property `visualizationData` to the `window` object, which is the global scope in the browser's JavaScript environment (i.e. it is accessible to any JavaScript code).

On the right, the Python code `.format(JSON.dumps(data))` first uses the `dumps()` function of Python's `json` module (imported here as `JSON`) to serialize the loaded data to a JSON string. It then uses Python's string `format` method to insert that string into the string of JavaScript where the curly braces `{}` are.

Once the string has been formatted, Python passes it to `Javascript("some JavaScript string")`, imported from `IPython.display`. Because the resulting object is the last expression in the cell, Jupyter displays it, and the browser executes the string as JavaScript.
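The two Python steps can be run on their own to see exactly what string reaches the browser. The sketch below uses a two-record stand-in for the iris data (illustrative values only) and the standard `json` module under its usual name. It also shows one caveat that follows from how `str.format` works: literal braces in the JavaScript template have to be doubled.

```python
import json

# A two-record stand-in for the iris data (illustrative values only).
records = [
    {"petalLength": 1.4, "species": "setosa"},
    {"petalLength": 4.7, "species": "versicolor"},
]

# Step 1: serialize the Python object to a JSON string.
payload = json.dumps(records)

# Step 2: splice it into the JavaScript template at the {} placeholder.
script = "window.visualizationData={};".format(payload)
print(script)

# Caveat: str.format treats every brace as part of a placeholder, so literal
# braces in the JavaScript template must be doubled ({{ and }}).
template = "if (true) {{ window.visualizationData={}; }}"
guarded = template.format(payload)
print(guarded)
```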

#### Creating HTML Elements

The final piece is to create a new HTML element that can serve as the parent element our visualization lives in. Fortunately, this is very straightforward:

In [30]:
%%javascript

// Write HTML String To Create Visualization Element
var html = '<div id="some-element-id"></div>'

// Append String As HTML To Output Element – Uses jQuery
element.append(html);

<IPython.core.display.Javascript object>

Above, `element` is a reference that Jupyter provides to the HTML element of this cell's output area. Jupyter selects this element using jQuery, a widely used JavaScript document manipulation library, making it a jQuery object. Conveniently, this lets us call jQuery's built-in `append` method, which appends our new HTML element to the output cell.
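The same snippet can be generated from Python when the element id is only known at run time. This is a sketch under assumptions: `append_div_js` is a hypothetical helper, and it relies on `json.dumps` producing a double-quoted, escaped string literal that is also valid JavaScript.

```python
import json

def append_div_js(element_id):
    """Return JavaScript that appends an empty <div> to the cell output."""
    html = '<div id="{}"></div>'.format(element_id)
    # json.dumps quotes and escapes the HTML so it is a safe JS string literal.
    return "element.append({});".format(json.dumps(html))

print(append_div_js("visualization"))
# element.append("<div id=\"visualization\"></div>");
```

Passing the result to `Javascript(...)` from `IPython.display` should have the same effect as the `%%javascript` cell above, assuming `element` is defined for such output as it is for the cell magic.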

### An Example Visualization With [Vega.js](https://vega.github.io/vega/)

Vega.js is a JavaScript library that aims to define a declarative grammar for visualization that allows for the development of portable visualization models. The library uses D3.js to generate visualizations for this grammar. Below is a boilerplate example. While you may find it to be quite complex (perhaps needlessly so), there is fortunately a pared-down version of Vega.js called [Vega-Lite](https://vega.github.io/vega-lite/) that provides much of the core functionality of Vega with far less complexity.

In [33]:
%%javascript

// Define Visualization & Append To Global Scope
window.visualizationDef = {
  "$schema": "https://vega.github.io/schema/vega/v3.0.json",
  "width": 500,
  "padding": 5,

  "config": {
    "axisBand": {
      "bandPosition": 1,
      "tickExtra": true,
      "tickOffset": 0
    }
  },

  "signals": [
    { "name": "fields",
      "value": ["petalWidth", "petalLength", "sepalWidth", "sepalLength"] },
    { "name": "plotWidth", "value": 60 },
    { "name": "height", "update": "(plotWidth + 10) * length(fields)"},
    { "name": "bandwidth", "value": 0,
      "bind": {"input": "range", "min": 0, "max": 0.5, "step": 0.005} },
    { "name": "steps", "value": 100,
      "bind": {"input": "range", "min": 10, "max": 500, "step": 1} }
  ],

  "data": [
    {
      "name": "iris",
      "values": window.visualizationData,
      "transform": [
        {
          "type": "fold",
          "fields": {"signal": "fields"},
          "as": ["organ", "value"]
        }
      ]
    }
  ],

  "scales": [
    {
      "name": "layout",
      "type": "band",
      "range": "height",
      "domain": {"data": "iris", "field": "organ"}
    },
    {
      "name": "xscale",
      "type": "linear",
      "range": "width", "round": true,
      "domain": {"data": "iris", "field": "value"},
      "zero": true, "nice": true
    },
    {
      "name": "color",
      "type": "ordinal",
      "range": "category"
    }
  ],

  "axes": [
    {"orient": "bottom", "scale": "xscale", "zindex": 1},
    {"orient": "left", "scale": "layout", "tickCount": 5, "zindex": 1}
  ],

  "marks": [
    {
      "type": "group",
      "from": {
        "facet": {
          "data": "iris",
          "name": "organs",
          "groupby": "organ"
        }
      },

      "encode": {
        "enter": {
          "yc": {"scale": "layout", "field": "organ", "band": 0.5},
          "height": {"signal": "plotWidth"},
          "width": {"signal": "width"}
        }
      },

      "data": [
        {
          "name": "density",
          "transform": [
            {
              "type": "density",
              "steps": {"signal": "steps"},
              "distribution": {
                "function": "kde",
                "from": "organs",
                "field": "value",
                "bandwidth": {"signal": "bandwidth"}
              }
            },
            {
              "type": "stack",
              "groupby": ["value"],
              "field": "density",
              "offset": "center",
              "as": ["y0", "y1"]
            }
          ]
        },
        {
          "name": "summary",
          "source": "organs",
          "transform": [
            {
              "type": "aggregate",
              "fields": ["value", "value", "value"],
              "ops": ["q1", "median", "q3"],
              "as": ["q1", "median", "q3"]
            }
          ]
        }
      ],

      "scales": [
        {
          "name": "yscale",
          "type": "linear",
          "range": [0, {"signal": "plotWidth"}],
          "domain": {"data": "density", "field": "density"}
        }
      ],

      "marks": [
        {
          "type": "area",
          "from": {"data": "density"},
          "encode": {
            "enter": {
              "fill": {"scale": "color", "field": {"parent": "organ"}}
            },
            "update": {
              "x": {"scale": "xscale", "field": "value"},
              "y": {"scale": "yscale", "field": "y0"},
              "y2": {"scale": "yscale", "field": "y1"}
            }
          }
        },
        {
          "type": "rect",
          "from": {"data": "summary"},
          "encode": {
            "enter": {
              "fill": {"value": "black"},
              "height": {"value": 2}
            },
            "update": {
              "yc": {"signal": "plotWidth / 2"},
              "x": {"scale": "xscale", "field": "q1"},
              "x2": {"scale": "xscale", "field": "q3"}
            }
          }
        },
        {
          "type": "rect",
          "from": {"data": "summary"},
          "encode": {
            "enter": {
              "fill": {"value": "black"},
              "width": {"value": 2},
              "height": {"value": 8}
            },
            "update": {
              "yc": {"signal": "plotWidth / 2"},
              "x": {"scale": "xscale", "field": "median"}
            }
          }
        }
      ]
    }
  ]
}

<IPython.core.display.Javascript object>
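The `fold` transform in the `data` block is what turns the four measurement columns into the `organ` and `value` fields that the rest of the specification relies on. A rough Python analogue is sketched below. It assumes, as Vega's documentation describes, that each output row keeps the original columns and gains the two new ones. The function itself is illustrative and not part of Vega.

```python
def fold(records, fields, key="organ", value="value"):
    """Emit one output row per (record, field) pair, like Vega's fold."""
    rows = []
    for record in records:
        for field in fields:
            row = dict(record)          # keep the original columns
            row[key] = field            # which measurement this row holds
            row[value] = record[field]  # the measurement itself
            rows.append(row)
    return rows

# One illustrative record using the field names from the spec above.
sample = [{"petalWidth": 0.2, "petalLength": 1.4,
           "sepalWidth": 3.5, "sepalLength": 5.1}]
fields = ["petalWidth", "petalLength", "sepalWidth", "sepalLength"]

for row in fold(sample, fields):
    print(row["organ"], row["value"])
```

Each input record becomes four rows, one per measurement, which is why the specification can then facet the data by `organ` and draw one density plot per group.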

Developing your visualization model is certainly the trickiest part. Actually rendering the visualization is easy:

In [37]:
%%javascript

// Write HTML String To Create Visualization Element
var html = '<div id="visualization"></div>'

// Append String As HTML To Output Element – Uses jQuery
element.append(html);

// Require The Vega Library
require(['vega'], function (vega) {
    
    // Generate Vega Visualization
    var view = new vega.View(vega.parse(window.visualizationDef))
        .renderer('svg')              // Set Renderer Type (Canvas Or SVG)
        .initialize('#visualization') // Initialize View Within Parent Element
        .hover()                      // Enable Hover Encode Set Processing
        .run();                       // Call Visualization
    
});

<IPython.core.display.Javascript object>
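For comparison with the full Vega specification, a Vega-Lite specification for a simple scatter plot is far shorter and can be assembled as a plain Python dict. The sketch below makes several assumptions: the helper name `vega_lite_scatter` is hypothetical, the `species` field is assumed to exist in the iris records, and the schema URL targets Vega-Lite v2. Rendering it would also require loading the Vega-Lite library to compile the specification to Vega, which is not shown here.

```python
import json

def vega_lite_scatter(records, x_field, y_field, color_field):
    """Build a minimal Vega-Lite scatter plot spec as a Python dict."""
    return {
        "$schema": "https://vega.github.io/schema/vega-lite/v2.json",
        "data": {"values": records},
        "mark": "point",
        "encoding": {
            "x": {"field": x_field, "type": "quantitative"},
            "y": {"field": y_field, "type": "quantitative"},
            "color": {"field": color_field, "type": "nominal"},
        },
    }

# Illustrative records; in the notebook these would come from data/iris.json.
records = [
    {"petalLength": 1.4, "petalWidth": 0.2, "species": "setosa"},
    {"petalLength": 4.7, "petalWidth": 1.4, "species": "versicolor"},
]

spec = vega_lite_scatter(records, "petalLength", "petalWidth", "species")
print(json.dumps(spec, indent=2))
```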