In [None]:
%matplotlib inline
import matplotlib
import seaborn as sns
matplotlib.rcParams['savefig.dpi'] = 144

# An HTML Primer
<!-- requirement: small_data/Charm_City_Circulator_Ridership.csv -->
<!-- requirement: small_data/gettysburg.json -->

In this notebook:
* HTML, DOM and D3: how a website works
* Manipulating Data with JavaScript
* Live-updating visualizations with D3


This notebook provides a template [taken from here](http://nbviewer.ipython.org/github/abarto/embedding_interactive_charts_on_an_ipython_notebook/blob/master/embedding_interactive_charts_on_an_ipython_notebook.ipynb#sub_est_2012_df_by_state_template) for how to embed HTML/JavaScript in an IPython notebook, and how to link up data you may analyze in pandas or other Python tooling with a web frontend, all from the same ipynb.

## Data-Driven Documents

D3 stands for "Data-Driven Documents", which is a fairly apt name.  While it can be used for plotting, its fundamental purpose is connecting data to elements of a document.  We'll demonstrate that here by connecting data to a text document.

Specifically, the document is the Gettysburg Address.  The data are values from a (simplistic) sentiment analysis of the text.  Each word gets two numbers: a measure of positive sentiment and a measure of negative sentiment.  This essentially gives us two axes to explore.  The difference of the scores tells us whether a word is positive or negative overall, while the sum tells us how emotionally charged that word is.
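To make the two derived axes concrete, here is a minimal sketch that computes them for a single entry.  The `word`, `pos`, and `neg` field names match those used later in this notebook; the sample scores are invented, and the helper names `polarity` and `charge` are hypothetical.

```javascript
// Hypothetical entry; the scores are invented for illustration.
var entry = { word: "liberty", pos: 0.5, neg: 0.25 };

// Difference of the scores: a positive result means the word leans positive.
function polarity(d) {
  return d.pos - d.neg;
}

// Sum of the scores: a larger result means a more emotionally charged word.
function charge(d) {
  return d.pos + d.neg;
}

console.log(polarity(entry)); // 0.25
console.log(charge(entry));   // 0.75
```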

In [None]:
import ihtml
import simplejson as json
# Use a context manager so the file handle is closed after loading.
with open('small_data/gettysburg.json', 'r') as f:
    gettysburg = json.load(f)
gettysburg[:5]

We will reassemble the address from this list, coloring each word according to the positive (blue) and negative (red) sentiments:

```js
function textColor(d) {
  var red = Math.round(d.neg * 255);
  var blue = Math.round(d.pos * 255);
  return "rgb(" + red + ", 0, " + blue + ")";
}
```
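As a quick check of how this mapping behaves, the sketch below calls `textColor` on a few made-up entries.  It assumes the scores lie between 0 and 1, which is what the multiplication by 255 suggests; the sample values are not taken from the data file.

```javascript
function textColor(d) {
  var red = Math.round(d.neg * 255);
  var blue = Math.round(d.pos * 255);
  return "rgb(" + red + ", 0, " + blue + ")";
}

// Hypothetical scores, assumed to lie between 0 and 1.
console.log(textColor({ pos: 1, neg: 0 }));     // "rgb(0, 0, 255)": pure blue
console.log(textColor({ pos: 0, neg: 1 }));     // "rgb(255, 0, 0)": pure red
console.log(textColor({ pos: 0.5, neg: 0.5 })); // "rgb(128, 0, 128)": purple
console.log(textColor({ pos: 0, neg: 0 }));     // "rgb(0, 0, 0)": black
```

Words with no sentiment at all come out black, so neutral text looks like ordinary text.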

## (A) Create document manually, using DOM

First, we will do this "by hand", without using D3.  We will put the address inside of a `<div>` with the id of *address*, so we select that.
```js
  var div = document.querySelector("#address");
```
For each word in the address,
```js
  for (var i in gettysburg) {
```
we create a `<span>`,
```js
    var span = document.createElement("span");
```
set its text and color from the data,
```js
    span.textContent = gettysburg[i].word;
    span.style.color = textColor(gettysburg[i]);
```
and then append it to the `<div>`.
```js
    div.appendChild(span);
  }
```
This is all done inside of a "DOMContentLoaded" handler, to ensure the DOM is ready for our manipulations.

Here it is in action:

## Code and output:

In [None]:
%%ihtml
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <title>Gettysburg</title>

    <script>
      function textColor(d) {
        var red = Math.round(d.neg * 255);
        var blue = Math.round(d.pos * 255);
        return "rgb(" + red + ", 0, " + blue + ")";
      }

      document.addEventListener("DOMContentLoaded",
        function(e) {
          var gettysburg = {{ gettysburg | json }}
          var div = document.querySelector("#address");
          for (var i in gettysburg) {
            var span = document.createElement("span");
            span.textContent = gettysburg[i].word;
            span.style.color = textColor(gettysburg[i]);
            div.appendChild(span);
          }
        }
      )
    </script>

    <style>
      #address {
        white-space: pre-wrap;
      }
    </style>
  </head>

  <body>
    <h1>Gettysburg Address</h1>
    <div id="address"></div>
  </body>
</html>

## (B) Create document with D3

Now let's do the same thing, but with D3.  Only two things change: we need to load the D3 JavaScript file, and we build the document with a D3 pipeline.  Once again, we start by selecting the address `<div>`.
```js
  d3.select("#address")
```
D3 methods tend to return other D3 objects, which allow us to chain calls together.  Inside of this element, we will select all of the `<span>` elements.  (There are none right now, but that's okay.)
```js
    .selectAll("span")
```
Now we associate each span with one element of our data.
```js
    .data(gettysburg)
```
There are leftover data not associated with any elements (since there were no `<span>`s to start with).  We tell D3 that we want to do something for each of those data.
```js
    .enter()
```
Specifically, we will append a `<span>`,
```js
    .append("span")
```
set its text,
```js
    .text(function(d) { return d.word; })
```
and set its color.
```js
    .style("color", textColor);
```
Each of these last two calls takes a function that is called once for each data point.

Et voila!
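The two patterns used in the pipeline above, chaining and function-valued arguments, can be illustrated without D3 at all.  The sketch below is a toy model and not D3's implementation: `TinySelection` and its `set` method are hypothetical names, and it wraps plain objects instead of DOM nodes.

```javascript
// A toy stand-in for a selection. It wraps plain objects instead of DOM nodes.
// This is NOT how D3 is implemented; it only mimics the calling pattern.
function TinySelection(items) {
  this.items = items; // each item looks like { datum: ..., props: {} }
}

// Set a property on every item. If `value` is a function, call it with the
// item's datum and index; otherwise use it as a constant.
TinySelection.prototype.set = function (name, value) {
  this.items.forEach(function (item, i) {
    item.props[name] = typeof value === "function" ? value(item.datum, i) : value;
  });
  return this; // returning the selection is what makes chaining possible
};

var sel = new TinySelection([
  { datum: { word: "Four" }, props: {} },
  { datum: { word: "score" }, props: {} }
]);

sel.set("text", function (d) { return d.word; })
   .set("weight", "bold");

console.log(sel.items[0].props.text);   // "Four"
console.log(sel.items[1].props.weight); // "bold"
```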

## Code and output:

In [None]:
%%ihtml
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <title>Gettysburg</title>

    <script src="https://d3js.org/d3.v3.min.js" charset="utf-8"></script>
    <script>
      function textColor(d) {
        var red = Math.round(d.neg * 255);
        var blue = Math.round(d.pos * 255);
        return "rgb(" + red + ", 0, " + blue + ")";
      }

      document.addEventListener("DOMContentLoaded",
        function(e) {
          var gettysburg = {{ gettysburg | json }}
          d3.select("#address")
            .selectAll("span")
            .data(gettysburg)
            .enter()
            .append("span")
            .text(function(d) { return d.word; })
            .style("color", textColor);
        }
      )
    </script>

    <style>
      #address {
        white-space: pre-wrap;
      }
    </style>
  </head>

  <body>
    <h1>Gettysburg Address</h1>
    <div id="address"></div>
  </body>
</html>

## Explanation

This is a little bit more readable than the first example, but it isn't *that* big of an improvement.  The real advantage of D3 comes when we want to change how the data inform the document displaying them.

It can be a bit difficult to tell which words are highly charged just from the colors.  Let's change the font size based on the total positive and negative sentiment.  This could make it hard to read, so we'll add a checkbox to the document.  When it's checked, we'll change the font sizes; when it's not, we'll set them back to a constant value.

There's some boilerplate necessary to get the interaction set up, but the D3-related changes are all in the function that gets called when we click that checkbox.  It takes a single boolean flag for whether the font size should be adjusted.
```js
function update(sizing) {
```
As before, we select all `<span>`s in the address `<div>`.  (There are more than zero now.)
```js
  d3.select("#address")
    .selectAll("span")
```
And we associate each span with an entry in our data.  Since we've already created one span per datum, they match up one-to-one.
```js
    .data(gettysburg)
```
Therefore, we don't have to make any more elements.  We can just modify the existing `<span>`s.  Let's be fancy and add a 1000ms transition...
```js
    .transition()
    .duration(1000)
```
...to a new font size.
```js
    .style("font-size", function(d) {
      return (1 + sizing * (d.pos + d.neg)) * 16 + "px";
    });
}
```
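To see what the font-size expression evaluates to, here is the same arithmetic pulled out into a standalone function.  The name `fontSize` and the sample scores are hypothetical.  Note that the boolean `sizing` is coerced to 0 or 1 by the multiplication.

```javascript
// Same expression as in update(), isolated so it can be run on its own.
function fontSize(sizing, d) {
  return (1 + sizing * (d.pos + d.neg)) * 16 + "px";
}

// Hypothetical scores chosen so the arithmetic is easy to follow.
var d = { pos: 0.5, neg: 0.25 };

console.log(fontSize(false, d)); // "16px": (1 + 0 * 0.75) * 16
console.log(fontSize(true, d));  // "28px": (1 + 1 * 0.75) * 16
```

With the checkbox unchecked every word is 16px; with it checked, a word's size grows in proportion to its total sentiment.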

## Add interactivity: font size toggle

In [None]:
%%ihtml
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <title>Gettysburg</title>

    <script src="https://d3js.org/d3.v3.min.js" charset="utf-8"></script>
    <script>
      function textColor(d) {
        var red = Math.round(d.neg * 255);
        var blue = Math.round(d.pos * 255);
        return "rgb(" + red + ", 0, " + blue + ")";
      }
      
      var gettysburg = {{ gettysburg | json }}

      function update(sizing) {
        d3.select("#address")
          .selectAll("span")
          .data(gettysburg)
          .transition()
          .duration(1000)
          .style("font-size", function(d) {
            return (1 + sizing * (d.pos + d.neg)) * 16 + "px";
          });
      }

      document.addEventListener("DOMContentLoaded",
        function(e) {
          d3.select("#address")
            .selectAll("span")
            .data(gettysburg)
            .enter()
            .append("span")
            .text(function(d) { return d.word; })
            .style("color", textColor)

          document.querySelector("#sizing")
            .addEventListener("change", function(event) {
              update(event.target.checked);
            });
        }
      )
    </script>

    <style>
      #address {
        white-space: pre-wrap;
      }
    </style>
  </head>

  <body>
    <h1>Gettysburg Address</h1>
    <form>
        <label><input type="checkbox" id="sizing" />Toggle font size</label>
    </form>
      <br>
    <div id="address"></div>
  </body>
</html>

## continue

In this case, we have an exact match between the data and the HTML elements.  This won't always happen.  If there are extra data, `.enter()` can be used to handle them.  If there are extra elements, `.exit()` can be used.  For more details on how D3 matches elements to data, see this [tutorial](https://bost.ocks.org/mike/selection/).
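The three cases can be modeled in a few lines of plain JavaScript.  This is a simplified sketch of index-based matching, not D3's actual code: the function `joinByIndex` is hypothetical, and it only counts elements rather than tracking DOM nodes.

```javascript
// Simplified model of an index-based data join: element i pairs with datum i.
// Real D3 selections track DOM nodes; here we only count positions.
function joinByIndex(elementCount, data) {
  var shared = Math.min(elementCount, data.length);
  return {
    update: data.slice(0, shared),                 // data paired with existing elements
    enter: data.slice(shared),                     // data with no element yet
    exit: Math.max(elementCount - data.length, 0)  // number of surplus elements
  };
}

var words = ["Four", "score", "and"];

console.log(joinByIndex(0, words)); // all three data enter
console.log(joinByIndex(3, words)); // exact match: nothing enters or exits
console.log(joinByIndex(5, words)); // two surplus elements exit
```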

## Manipulating Data with JavaScript

So far, we have used pandas to load and manipulate data and written the results into JSON objects for JavaScript consumption.  We could also have done the above by loading and manipulating the data directly in JavaScript.  Unfortunately, this takes a little more work.  For example, we'll have to
- Read the CSV file ourselves using `d3.csv` (notice the callback).
- Write a `mean` function in JavaScript.
- Filter out invalid numbers using `isNaN`.
- Manually write our own aggregation function.

This can be done using `underscore.js`, a library that provides convenient functional primitives in JavaScript.  The benefit is that our code is 100% JavaScript.  This reduces the complexity considerably because we no longer need a bunch of Python and JavaScript code to agree on an intermediate format.  For example, this is easier to deploy because all the logic is handled on the client and we no longer have to worry about server-side logic.  Of course, facilities for handling data are much more primitive in JavaScript, so we do not get the benefit of pandas.
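As a rough sketch of the last three steps in the list above, here is a version that uses only built-in array methods instead of `underscore.js`.  The helper `meanBy` and the sample rows are hypothetical; the column names follow the ridership data used in this notebook.

```javascript
// Mean of the numeric entries, ignoring NaN; returns NaN if nothing is left.
function filterMean(values) {
  var valid = values.filter(function (x) { return !isNaN(x); });
  if (valid.length === 0) { return NaN; }
  return valid.reduce(function (a, b) { return a + b; }, 0) / valid.length;
}

// Group rows by `prop` and average the (string-valued) `field` in each group.
function meanBy(rows, prop, field) {
  var groups = {};
  rows.forEach(function (row) {
    var key = row[prop];
    groups[key] = groups[key] || [];
    groups[key].push(parseInt(row[field], 10));
  });
  var result = {};
  Object.keys(groups).forEach(function (key) {
    result[key] = filterMean(groups[key]);
  });
  return result;
}

// Hypothetical rows. CSV parsing yields strings, and "" parses to NaN.
var rows = [
  { day: "Monday", orangeAverage: "100" },
  { day: "Monday", orangeAverage: "" },
  { day: "Monday", orangeAverage: "300" },
  { day: "Tuesday", orangeAverage: "50" }
];

console.log(meanBy(rows, "day", "orangeAverage")); // { Monday: 200, Tuesday: 50 }
```

The empty Monday value is dropped rather than counted as zero, which is why the Monday mean is 200 and not about 133.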

In [None]:
%%javascript

require(['d3', 'nvd3'], function(d3, nv){
  var mean = function(values) {
    if (values.length) {
      return _.reduce(values, function(x ,y) { return x + y; }) / values.length;
    } else {
      return NaN;
    };
  };

  var filterMean = function(values) {
    return mean(_.filter(values, function(x) { return !isNaN(x); }));
  };
    
  var calcAggByProp = function(data, prop) {
    var agg = _.map(_.groupBy(data, prop), function(values, day) {
      return { 
        day: day,
        // pass an explicit radix so the strings are always parsed as base 10
        orangeAverage: filterMean(_.map(values, function(val) { return parseInt(val.orangeAverage, 10); } )),
        purpleAverage: filterMean(_.map(values, function(val) { return parseInt(val.purpleAverage, 10); } )),
        greenAverage: filterMean(_.map(values, function(val) { return parseInt(val.greenAverage, 10); } ))
      }
    });

    return {
      orangeAverage: _.object(_.map(agg, function(val) { return [val.day, val.orangeAverage]; })),
      purpleAverage: _.object(_.map(agg, function(val) { return [val.day, val.purpleAverage]; })),
      greenAverage: _.object(_.map(agg, function(val) { return [val.day, val.greenAverage]; }))
    };
  }

  d3.csv("small_data/Charm_City_Circulator_Ridership.csv", function(data) {
    // data is a list of objects with day, date, [color]Average, and more
    var aggByDay = calcAggByProp(data, 'day');

    var data2 = _.map(data, function(d) {
      d.year = d.date.split('/')[2];
      return d;
    });
    var aggByYear = calcAggByProp(data2, 'year');
      
    // first we need to include nvd3's stylesheet because hacks
    $('#nvd3style').remove();
    $("head").append("<link id='nvd3style' rel='stylesheet' type='text/css' href='https://cdn.rawgit.com/novus/nvd3/v1.8.1/build/nv.d3.css'/>");
      
    // convert the aggregated data into nvd3's format (done by copying the example)
    var chart1data = window.convertKeys(aggByYear);
    var chart2data = window.convertKeys(aggByDay, window.WEEKDAYS_IN_ORDER);
    console.log(chart2data)
    // making some display ports
    $("#chart6").remove();
    $("#chart7").remove();
    element.append("<div id='chart6' />");
    element.append("<br/>");
    element.append("<div id='chart7' />");
    $("#chart6").append("<svg height='500px'></svg>");
    $("#chart7").append("<svg height='500px'></svg>");

    nv.addGraph(window.genGraph(chart1data, "#chart6 svg"));
    nv.addGraph(window.genGraph(chart2data, "#chart7 svg"));
  });
});

## Live-updating visualizations
Remember how `enter` selected only the data points that did not yet have a node?  There is an analogous `exit` function that returns only the existing nodes whose data are no longer present.  We can use these to create live data visualizations where data enter and exit the visualization on the fly.  This is particularly useful for time series and other streaming data.
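When data arrive and leave over time, matching by position is not enough, so data are matched by a key instead.  The following is a simplified model of that idea in plain JavaScript, not D3's implementation; the function `joinByKey` and the sample messages are hypothetical.

```javascript
// Simplified model of a keyed join between previously bound data and new data.
function joinByKey(boundData, newData, key) {
  var oldKeys = {};
  boundData.forEach(function (d) { oldKeys[key(d)] = true; });
  var newKeys = {};
  newData.forEach(function (d) { newKeys[key(d)] = true; });
  return {
    enter: newData.filter(function (d) { return !oldKeys[key(d)]; }),   // newly arrived
    update: newData.filter(function (d) { return oldKeys[key(d)]; }),   // still present
    exit: boundData.filter(function (d) { return !newKeys[key(d)]; })   // no longer present
  };
}

// Hypothetical stream: message 1 drops out, message 4 arrives.
var before = [{ id: 1 }, { id: 2 }, { id: 3 }];
var after = [{ id: 2 }, { id: 3 }, { id: 4 }];
var result = joinByKey(before, after, function (d) { return d.id; });

console.log(result.enter);  // [ { id: 4 } ]
console.log(result.update); // [ { id: 2 }, { id: 3 } ]
console.log(result.exit);   // [ { id: 1 } ]
```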

One time, one of this notebook's authors wrote a [chat client](https://github.com/cmoscardi/bubblechat) that uses D3 to visualize a live chatroom via a tree of replies. One application of this code could plot reddit conversations in real-time (or, for a big thread, animate the thread tree's growth over time).

Below are the key parts of that chat visualization.

In [None]:
%%javascript
function buildTree(messages, activeMessage) {
  content.innerHTML = "";

  var margin = {top: 20, right: 120, bottom: 20, left: 120},
      width = 960 - margin.right - margin.left,
      height = 600 - margin.top - margin.bottom,
      r = 300;

  var tree = d3.layout.tree()
    .size([360, r]) // this is pretty fancy
    .separation(separation)
    .sort(function(a, b){
      return b.id - a.id;
    });


  var svg = d3.select("#content").append("svg")
    .attr("width", width + margin.right + margin.left)
    .attr("height", height + margin.top + margin.bottom)
    .style("float", "left")
    .append("g")
    .attr("transform", "translate(" + (r*1.5) + "," + (r) + ")");

  var messageArea = d3.select("#content").append("div")
    .style("float", "left")
    .style("width", "300px")
    .style("margin-top", "100px")
    .style('max-width', '300px')
    .attr('class', 'messageArea');


  // text color is a CSS property, so set it with .style rather than .attr
  messageArea
    .append('p')
    .attr('class', 'username')
    .style('color', 'black');
  messageArea
    .append('p')
    .attr('class', 'message')
    .style('color', 'black');


  var colors = d3.scale.category20();
  var messageColor = function(d){
    return d.username;
  }
  var source = messages[0];
  update(source);

  function update(root) {
    var nodes = tree.nodes(root);
    var links = tree.links(nodes);

    var link = svg.selectAll(".link")
      .data(links, function(d){return d.source.id + "-" + d.target.id});

    var linkEnter = link
      .enter()
      .append("line")
      .attr("class", "link")
      .attr("stroke", "black")
      .style("stroke-opacity", 1)
      .style("stroke-width", function(d) { return 1 })
      .attr("x1", function(d){return rtc(d.source, true).x})
      .attr("y1", function(d){return rtc(d.source, true).y})
      .attr("x2", function(d){return rtc(d.source, true).x})
      .attr("y2", function(d){return rtc(d.source, true).y})

    var node = svg.selectAll(".node")
      .data(nodes, function(d){return d.id})
      //keeps them above the links
      .each(function(){ this.parentNode.appendChild(this);});

    var nodeEnter = node
      .enter()
      .append("circle")
      .attr("class", "node")
      .attr("r", function(d) {return (50/(1 + d.depth))})
      .attr("fill", function(d){return colors(messageColor(d))})
      .attr("stroke", "black")
      .attr("transform", function(d) { return "rotate(" + (d.parent && d.parent.x0? d.parent.x0 - 90:0) + ")translate(" + (d.parent && d.parent.y0? d.parent.y0:0) + ")"})
      .on("mouseover", function(d){
        messageArea
          .style('background', colors(messageColor(d)));
        messageArea.select('p.username')
          .text("user: " + d.username);
        messageArea.select('p.message')
          .text("message: " + d.message);
      })
      .on("click", function(d){
        if(activeMessage.el){
          activeMessage.el.attr("stroke", "black");
        }
        activeMessage.message = d;
        activeMessage.el = $(this);
        $(this).attr("stroke", "red");
      });

    transitionLinks(link);
    transitionNodes(node);
    
    //set this for the transitions later
    nodes.forEach(function(n){
      n.x0 = n.x;
      n.y0 = n.y;
    });
  }
  return update;
}

function radialToCartesian(d, usex0){
  if(!d.x){
    d.x = 90;
  }
  if(!d.x0){
    d.x0 = 90;
  }
  var r = d.y;
  var theta = d.x;
  if(usex0){
    r = d.y0;
    theta = d.x0;
  }
  theta= (theta - 90) / 180 * Math.PI;
  return {x: r*Math.cos(theta), y: r*Math.sin(theta)};
}
var rtc = radialToCartesian;

function separation(a, b) {
  return (a.parent == b.parent ? .5 : 1) / a.depth;
}

function transitionLinks(link){
  link.transition()
      .duration(700)
      .attr("x1", function(d) { return rtc(d.source).x; })
      .attr("y1", function(d) { return rtc(d.source).y; })
      .attr("x2", function(d) { return rtc(d.target).x; })
      .attr("y2", function(d) { return rtc(d.target).y; });
}

function transitionNodes(node){
  node.transition()
      .duration(700)
      .attr("transform", function(d) { return "rotate(" + (d.x? d.x-90 :0) + ")translate(" + d.y + ")"}); 
}
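
The coordinate conversion in `radialToCartesian` is easier to follow with the fallback handling stripped away.  The sketch below keeps only the arithmetic; the name `polarToXY` is hypothetical.  Angles are in degrees, and the 90-degree offset means an angle of 0 points straight up on screen, since SVG's y axis points down.

```javascript
// Convert (radius, angle in degrees) to x/y, with angle 0 pointing up.
function polarToXY(r, angleDegrees) {
  var theta = (angleDegrees - 90) / 180 * Math.PI;
  return { x: r * Math.cos(theta), y: r * Math.sin(theta) };
}

var up = polarToXY(100, 0);     // x is approximately 0, y is -100 (up in SVG)
var right = polarToXY(100, 90); // x is 100, y is 0

console.log(up, right);
```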

*Copyright &copy; 2016 The Data Incubator.  All rights reserved.*