# Introduction

D3.js is a JavaScript library whose name stands for Data-Driven Documents. It manipulates documents based on data and helps create visualizations using HTML, CSS and SVG. This tutorial teaches the basics of DOM (Document Object Model) manipulation using D3 syntax. We will start by using D3 to create basic shapes before moving on to graphs and visualizations.

A look at the D3.js website shows us the massive capabilities of D3 for building visualizations. However, while D3 is a relatively easy library to learn, it cannot be taught in one tutorial. This tutorial therefore focuses on **basic D3 functionality** and **how D3 can be used in Python notebooks**, rather than on how D3 can be used for creative, interactive visualizations.

## Tutorial Content

We will cover the following topics:
- Introduction to DOM
- Using D3 to create basic shapes
- Using data with D3
- Creating a bar chart using data from Python
- Creating a simple framework to generate D3 charts
- Conclusion
- References and further reading

## Introduction to DOM

To begin with, we will have to import the ```HTML``` class from the ```IPython.display``` module.

In [1]:
from IPython.display import HTML

This class allows us to display HTML content as the output of our code block.

In [2]:
HTML('''
<h1>Hello World</h1>
''')

We can even use basic CSS to style our HTML.

In [3]:
HTML('''
<style scoped>
.heading {
  color: red;
  font: 16px Times New Roman;
}
</style>
<h1 class="heading">Hello World</h1>
''')

We will now use the D3 library to manipulate the HTML. Every loaded HTML page generates its Document Object Model (DOM). JavaScript and, by extension, D3 can use this DOM to manipulate the HTML content of the loaded page.

To use D3 in our notebook, we will have to include it first.

**Note:** We will be using D3 version 3.5.6 in this tutorial. At the time of writing, the latest version of D3 is version 5, and it contains certain syntactical differences from the version that we have used.

In [4]:
HTML('<script src="lib/d3.min.js"></script>')

In [5]:
HTML('''
<style scoped>
.heading1 {
  color: steelblue;
}
</style>
<div id="d3-div-1"></div>
<script>

d3.select("#d3-div-1").append('p')
      .attr("class","heading1")
      .style("font-size", "30px")
      .text("Hello World");

</script>
''')

Here is a breakdown of the functions we used:
- select: Selects the first HTML element that matches a selector such as a tag name, style class or ID (we selected an HTML division using its ID)
- append: Adds an HTML element *inside* the selected outer HTML element (we appended an HTML paragraph)
- attr: Adds or modifies an attribute of an HTML element
- style: Adds or modifies a property in the style attribute of an HTML element
- text: Sets the text content of an HTML element

The result of the D3 code above will be the following HTML document:

In [6]:
HTML('''
<style scoped>
.heading1 {
  color: steelblue;
}
</style>
<div id="d3-div-1_1">
    <p class="heading1" style="font-size: 30px;">Hello World</p>
</div>
''')

## Using D3 to create basic shapes

We are now going to use D3 to create some basic shapes. In this section, we will be using a Scalable Vector Graphics (SVG) element. You will get a better idea of the co-ordinate plane of an SVG and how to "navigate" it to generate visualizations. One can think of navigating through an SVG using a reference "cursor". The cursor is not actually visible, but it can point to any single point in the XY co-ordinate plane.

The first thing that you have to keep in mind is that SVG geometry works like regular geometry with one key difference: the Y-axis is inverted. Therefore, as the Y co-ordinate increases, the "cursor" moves downward and the origin, the (0,0) point, is in the top left corner instead of the bottom left corner.
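The inversion amounts to a single subtraction. The sketch below is a hypothetical Python helper (`to_svg_y` is not part of D3 or of this notebook) that converts a Y co-ordinate measured from the bottom edge into the SVG equivalent measured from the top edge, assuming the height of the SVG is known.

```python
def to_svg_y(y_cartesian, svg_height):
    """Convert a bottom-left-origin Y value to SVG's top-left-origin Y value."""
    return svg_height - y_cartesian

# A point 50 units above the bottom edge of a 200-unit-tall SVG
# sits 150 units below its top edge.
svg_y = to_svg_y(50, 200)
```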

Now we will see how to append a square at the centre of an SVG using D3:
1. First we will append an SVG element to an HTML division
2. We will give the SVG a size and background colour
3. We will append the square to the SVG
4. We will move the square to the centre of the SVG

In [7]:
HTML('''
<div id="d3-div-2"></div>
<script>

d3.select("#d3-div-2").append('svg')
      .attr("height","200px")
      .attr("width","400px")
      .style("background-color", "#DDD");

</script>
''')

Now that we have added our SVG, we need to append a square. A square is represented using the SVG ```rect``` element.

In [8]:
HTML('''
<script>

d3.select("#d3-div-2").select('svg')
      .append("rect")
      .attr("height","20px")
      .attr("width","20px")
      .style("fill", "#000");

</script>
''')

We will now move the square to the centre using the ```transform``` attribute. The value of this attribute contains the co-ordinates to which the square should be "translated". Since our SVG is 400 x 200, the centre is (200,100).

**Note:** You will need to scroll up to see the results

In [9]:
HTML('''
<script>

d3.select("#d3-div-2").select('svg').select('rect')
      .attr("transform","translate(200,100)");

</script>
''')

As you can see, the square is slightly off-centre. That is because we have not accounted for the square's dimensions. The "origin" of the square is its top left corner. To centre it within the SVG, we need to subtract half of its width and half of its height from the centre co-ordinates.

**Note:** The values given to the ```translate``` function are measured from the origin of the SVG, not from the square's current position, so each new ```transform``` value replaces the previous one.

In [10]:
HTML('''
<script>

d3.select("#d3-div-2").select('svg').select('rect')
      .attr("transform","translate(190,90)");

</script>
''')
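The arithmetic behind (190,90) can be written out in Python. This is only an illustrative sketch: `centred_translate` is a hypothetical helper, and it assumes the shape is a rectangle whose origin is its top left corner.

```python
def centred_translate(svg_width, svg_height, shape_width, shape_height):
    """Return the translate() string that centres a rectangle in an SVG."""
    x = (svg_width - shape_width) / 2
    y = (svg_height - shape_height) / 2
    # "{:g}" drops the trailing ".0" when the offsets are whole numbers
    return "translate({:g},{:g})".format(x, y)

# The 20 x 20 square inside the 400 x 200 SVG used above
transform = centred_translate(400, 200, 20, 20)
```

For the square above this yields the same string that we passed to the ```transform``` attribute.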

Now we'll see how this could have been done in one go.

In [11]:
HTML('''
<div id="d3-div-3"></div>
<script>

d3.select("#d3-div-3").append('svg')
      .attr("height","200px")
      .attr("width","400px")
      .style("background-color", "#DDD")
      .append('rect')
      .attr("transform","translate(190,90)")
      .attr("height","20px")
      .attr("width","20px")
      .style("fill", "#000");

</script>
''')

We will now see examples of how to append lines and circles within an SVG using D3.

In [12]:
HTML('''
<div id="d3-div-4"></div>
<script>

d3.select("#d3-div-4").append('svg')
      .attr("id","d3svg")
      .attr("height","200px")
      .attr("width","400px")
      .style("background-color", "#DDD")
      .append('circle')
      .attr("cx","200")
      .attr("cy","100")
      .attr("r","20px")
      .style("fill", "#000");

d3.select("#d3svg")
      .append('line')
      .attr("x1","200")
      .attr("x2","200")
      .attr("y1","0")
      .attr("y2","200")
      .attr("stroke", "red");
      
d3.select("#d3svg")
      .append('line')
      .attr("x1","000")
      .attr("x2","400")
      .attr("y1","100")
      .attr("y2","100")
      .attr("stroke", "red");
</script>
''')

To learn more about other SVG elements and their relevant attributes you can visit W3Schools [here](https://www.w3schools.com/graphics/svg_intro.asp).

## Using data with D3

We will now see the basics of how D3 interacts with data to generate charts and graphs. We are going to use an array of sizes to generate multiple squares of those given sizes.

In [13]:
HTML('''
<div id="d3-div-5"></div>
<script>
sizes = [10,20,30,40,50]
d3.select("#d3-div-5").append('svg')
      .attr("height","200px")
      .attr("width","400px")
      .style("background-color", "#DDD")
      .selectAll('rect').data(sizes).enter()
      .append('rect')
      .attr("transform",function(d,i){return "translate("+(((i+1)*(400/(sizes.length+1)))-(d/2))+","+(90-d/2)+")"})
      .attr("height",function(d,i){return d+"px"})
      .attr("width",function(d,i){return d+"px"})
      .style("fill", "#000");

</script>
''')

We can see some new functions here:
- selectAll: Selects multiple elements as a group
- data: Performs a "data-join" between the elements of the selection and the data variable that is passed to the function
- enter: Returns placeholders for the data items that have no matching element yet, so that an element can be appended for each of them

Together, the ```selectAll```, ```data``` and ```enter``` functions bind each data item to an SVG element, creating the elements that do not exist yet. Any function passed to ```attr``` or ```style``` is then called once per element as ```function(d,i)```, where ```d``` is the data item bound to the current element and ```i``` is its index.
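One way to picture the data-join is as a comparison between the elements that already exist and the data items, matched by position. The Python sketch below is a conceptual model only, not D3's implementation; `join_by_index` is a hypothetical name, and it assumes the default behaviour of matching by index rather than by a key.

```python
def join_by_index(elements, data):
    """Split a join into update, enter and exit groups, matching by index."""
    shared = min(len(elements), len(data))
    update = list(zip(elements[:shared], data[:shared]))  # element and datum both exist
    enter = data[shared:]      # data items with no element yet
    exit_ = elements[shared:]  # elements with no data item left
    return update, enter, exit_

# The freshly appended SVG contains no <rect> elements,
# so every size lands in the "enter" group.
update, enter, exit_ = join_by_index([], [10, 20, 30, 40, 50])
```

Because our SVG starts out empty, all five sizes "enter", which is why one ```rect``` is appended per size.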

You can find more information about D3 selections [here](https://bost.ocks.org/mike/selection/), and [here](https://www.intothevoid.io/data-visualization/understanding-d3-data-vs-datum/).

## Creating a bar chart using data from Python

Now that we have seen how D3 interacts with a data structure, we need to pass it data from Python code. Since D3 is a JavaScript library, it works naturally with JSON data. We will convert a ```DataFrame``` into a JSON string and plug it into our JavaScript code using ```Template``` and ```substitute```.

We will encounter more D3 helper functions used to easily configure chart staples like axes and scale.

Let's start by creating a data frame.

In [14]:
from string import Template
import pandas as pd
import json, random

In [15]:
data = pd.DataFrame({'region': ['East','West','North','South'], 'value':[250,430,600,190]})
data.head()

| | region | value |
|---|---|---|
| 0 | East | 250 |
| 1 | West | 430 |
| 2 | North | 600 |
| 3 | South | 190 |

We will now create a ```Template``` for our HTML content. The reason we are using templates is for value substitution. We will create a placeholder for the data in the template which we can plug in later.

In [16]:
html_template = Template('''
<style>
    .bar {
      fill: steelblue;
    }

    .axis {
      font: 10px sans-serif;
    }

    .axis path,
    .axis line {
      fill: none;
      stroke: #000;
      shape-rendering: crispEdges;
    }
</style>
<div id="chart"></div>
<script>
    var margin = {top: 20, right: 20, bottom: 30, left: 40},
        width = 500 - margin.left - margin.right,
        height = 300 - margin.top - margin.bottom;

    var svg = d3.select("#chart").append("svg")
        .attr("width", width + margin.left + margin.right)
        .attr("height", height + margin.top + margin.bottom)
        .append("g")
        .attr("transform", "translate(" + margin.left + "," + margin.top + ")");

    var data = $data ;
    
    var x = d3.scale.ordinal()
        .rangeBands([0, width], .1)
        .domain(data.map(function(d) { return d.region; }));

    var y = d3.scale.linear()
        .range([height, 0])
        .domain([0, d3.max(data, function(d) { return d.value; })]);
        
    var xAxis = d3.svg.axis()
        .scale(x)
        .orient("bottom");

    var yAxis = d3.svg.axis()
        .scale(y)
        .orient("left");

    svg.append("g")
      .attr("class", "x axis")
      .attr("transform", "translate(0," + height + ")")
      .call(xAxis);

    svg.append("g")
      .attr("class", "y axis")
      .call(yAxis);

    svg.selectAll(".bar")
      .data(data)
      .enter().append("rect")
      .attr("class", "bar")
      .attr("x", function(d) { return x(d.region); })
      .attr("width", x.rangeBand())
      .attr("y", function(d) { return y(d.value); })
      .attr("height", function(d) { return height - y(d.value); });
</script>
''')

As you can see, the data variable contains a placeholder, ```$data```.

Now we will convert our Pandas data frame into a JSON string and plug it into the HTML template.

In [17]:
jsonData = json.dumps(data.to_dict(orient='records'))
HTML(html_template.substitute({'data': jsonData}))
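The substitution step can be tried in isolation. The sketch below uses a plain list of dictionaries in place of the data frame (the same shape that ```to_dict(orient='records')``` produces) and a made-up one-line template. It also shows that a literal dollar sign in the JavaScript must be written as ```$$``` so that ```Template``` does not treat it as a placeholder.

```python
import json
from string import Template

# Same shape as DataFrame.to_dict(orient='records'): one dict per row
records = [
    {"region": "East", "value": 250},
    {"region": "West", "value": 430},
]

# Hypothetical template; "$$" is an escaped literal dollar sign
snippet = Template("var data = $data; var price = '$$5';")
script = snippet.substitute(data=json.dumps(records))
```

Note that ```substitute``` raises a ```KeyError``` when a placeholder has no value, whereas ```safe_substitute``` leaves the placeholder untouched.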

To understand how this chart was constructed we need to understand D3 scales.

D3 scales are functions that map an input domain to an output range. We can do this not only for linear variables but for ordinal variables as well. In the graph above, the X-axis is mapped to an ordinal variable (region) and the Y-axis is mapped to a linear, or numerical, variable (value).

#### Ordinal Scale

```d3.scale.ordinal()``` is used to construct a *discrete* scale which can be mapped over an array of values. That array is passed to the ```domain``` function of the scale. In our example, our domain is all 4 regions in the data frame. The ```rangeBands``` function sets the output range from the specified continuous interval. The interval is an array containing the lower and upper bounds of the range. If the domain has n items, the range will be divided into n equally sized, evenly spaced bands. This function is also used to specify the padding between two bands. The range in our example is the entire width of the chart with a 10% padding between bands.

Below is a visual representation of how an ordinal scale sets rangeBands.

<img src="img/ordinalScale.png">
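To make the band arithmetic concrete, here is a Python sketch of how the band positions could be computed. It follows the behaviour described in the D3 3.x documentation as we understand it and is not D3's own code: `range_bands` and its parameter names are hypothetical, and we assume that the outer padding defaults to the same value as the padding between bands.

```python
def range_bands(domain, start, stop, padding=0.0, outer_padding=None):
    """Return ({item: band_start}, band_width) for an ordinal band layout."""
    if outer_padding is None:
        outer_padding = padding  # assumption: outer padding defaults to inner padding
    # One "step" is a band plus the gap that follows it
    step = (stop - start) / (len(domain) - padding + 2 * outer_padding)
    first = start + step * outer_padding
    starts = {item: first + i * step for i, item in enumerate(domain)}
    band_width = step * (1 - padding)
    return starts, band_width

# Chart width from the template: 500 - 40 - 20 = 440
starts, band_width = range_bands(["East", "West", "North", "South"], 0, 440, 0.1)
```

Under these assumptions, ```starts``` plays the role of ```x(d.region)``` and ```band_width``` the role of ```x.rangeBand()``` in the bar chart code.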

#### Linear Scale

```d3.scale.linear()``` is used to construct a *continuous* scale over an interval of values. Similar to the ordinal scale, that interval is passed to the ```domain``` function of the scale. In our example, we map the domain from 0 to the maximum value present in the Y-axis variable, value. The ```range``` function specifies the lower and upper bounds of the output. We have set the range as the entire height of the chart. You will notice that instead of [0, height], the range is [height, 0]. The reason for this is the inverted nature of the Y-axis in SVG geometry. Notice that since we have inverted our scale, the height of each bar is obtained by subtracting its scaled value from the height of the chart.
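The inverted range and the bar-height subtraction can be checked with a few lines of Python. This is a minimal sketch of linear interpolation, not D3's implementation; `linear_scale` is a hypothetical helper, and the numbers are taken from the bar chart above.

```python
def linear_scale(domain, output_range):
    """Return a function mapping values in `domain` linearly onto `output_range`."""
    d0, d1 = domain
    r0, r1 = output_range

    def scale(value):
        return r0 + (value - d0) * (r1 - r0) / (d1 - d0)

    return scale

height = 250  # 300 - 20 - 30, as in the template
y = linear_scale((0, 600), (height, 0))  # domain [0, max value], inverted range

bar_top = y(250)              # distance of the bar's top edge from the top of the chart
bar_height = height - y(250)  # the value given to the "height" attribute
```

The largest value maps to 0 (the top of the chart) and a value of 0 maps to ```height``` (the bottom), so taller bars start closer to the top.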

#### D3 Axis

```d3.svg.axis()``` can be used to create both X and Y axes for a chart. We used two main helper functions to define the axes. The ```scale``` function defines which scale is used to mark the axis, and the ```orient``` function defines on which side of the axis line the ticks are marked.

Each axis is contained within an SVG group element and is displayed using the ```call``` function.

## Creating a simple framework to generate D3 charts

Now that we have seen how we can use the ```Template``` class to plug data into JavaScript code, we will extend this functionality to allow us to generate D3 charts easily.

In order to do this, the JavaScript code for the D3 chart will need to be stored in a separate JavaScript file (.js). We will create a Python class which can also be stored separately as a .py file. We can store any further styling configuration in separate .css files.

For this example we will create a line graph (this will allow us to look at a new type of scale, the time scale). The code for the line graph is given below. However, we will be storing that code in a separate file when making the graph.

In [18]:
js_code = HTML('''
var margin = {top: 30, right: 20, bottom: 30, left: 50},
    width = 600 - margin.left - margin.right,
    height = 270 - margin.top - margin.bottom;

var parseDate = d3.time.format("%d-%b-%y").parse;
var formatTime = d3.time.format("%d-%b-%y");

var x = d3.time.scale().range([0, width]);
var y = d3.scale.linear().range([height, 0]);

var xAxis = d3.svg.axis().scale(x)
    .orient("bottom").ticks(5);

var yAxis = d3.svg.axis().scale(y)
    .orient("left").ticks(5);

var valueline = d3.svg.line()
    .x(function(d) { return x(d.date); })
    .y(function(d) { return y(d.value); });
    
d3.select("#maindiv${divnum}").selectAll("svg").remove();
var svg = d3.select("#maindiv${divnum}")
    .append("svg")
        .attr("width", width + margin.left + margin.right)
        .attr("height", height + margin.top + margin.bottom)
    .append("g")
        .attr("transform", 
              "translate(" + margin.left + "," + margin.top + ")");
              
    data = $data
    
    data.forEach(function(d) {
        d.date = parseDate(d.date);
        d.value = +d.value;
    });

    x.domain(d3.extent(data, function(d) { return d.date; }));
    y.domain([0, d3.max(data, function(d) { return d.value; })]);

    svg.append("path")
        .attr("class", "line")
        .attr("d", valueline(data));

    svg.selectAll("circle")
        .data(data)
      .enter().append("circle")
        .attr("r", 5)
        .attr("cx", function(d) { return x(d.date); })
        .attr("cy", function(d) { return y(d.value); })
        .style("fill", "#FFF")
        .style("stroke-width", "2")
        .style("stroke", "#F00");

    svg.append("g")
        .attr("class", "x axis")
        .attr("transform", "translate(0," + height + ")")
        .call(xAxis);

    svg.append("g")
        .attr("class", "y axis")
        .call(yAxis);
''')
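The time scale used for the X-axis behaves like the linear scale, except that its domain consists of dates. The Python sketch below illustrates the idea with the standard library. It is not D3's implementation: `parse_date` and `time_scale` are hypothetical helpers, the sample dates are made up rather than taken from the CSV file, and we assume English month abbreviations.

```python
from datetime import datetime

def parse_date(text):
    """Parse dates such as '1-May-12' (the same pattern as "%d-%b-%y" above)."""
    return datetime.strptime(text, "%d-%b-%y")

def time_scale(dates, width):
    """Return a function mapping a date linearly onto [0, width]."""
    first, last = min(dates), max(dates)  # plays the role of d3.extent

    def scale(date):
        # Dividing two timedeltas gives the fraction of the interval elapsed
        return width * ((date - first) / (last - first))

    return scale

# Hypothetical sample dates, not taken from data/line_chart.csv
dates = [parse_date(s) for s in ["1-May-12", "16-May-12", "31-May-12"]]
x = time_scale(dates, 530)  # 600 - 50 - 20, as in the chart code
```

The earliest date maps to the left edge of the chart and the latest date to the right edge, with every other date placed proportionally in between.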

And our CSS file will contain the following styling configuration.

In [19]:
css_code = HTML('''
body { font: 12px Arial;}

path { 
    stroke: red;
    stroke-width: 2;
    fill: none;
}

.axis path,
.axis line {
    fill: none;
    stroke: grey;
    stroke-width: 1;
    shape-rendering: crispEdges;
}
''')

The JavaScript and CSS code will be stored separately in the files /js/line_chart.js and /css/line_chart.css respectively. The file name (without its extension) will also be the identifier we pass to our Python class when we want to generate this chart.

Now we will write the Python class which will be used to reference these files and generate the D3 chart.

In [20]:
import inspect, os, random
from string import Template

class D3Generator:
    def this_dir(self):
        # Directory containing the file in which this class is defined
        this_file = inspect.getfile(inspect.currentframe())
        return os.path.dirname(os.path.abspath(this_file))


    def set_styles(self, css_file_names):
        # Accept either a single file name or a list of file names
        if isinstance(css_file_names, str):
            css_file_names = [css_file_names]
        style = ''
        # Read the content of every CSS file and concatenate it
        for css_file_name in css_file_names:
            with open(self.this_dir() + '/css/' + css_file_name + '.css', 'r') as css_file:
                style += css_file.read()
        return "<style>" + style + "</style>"


    def draw_graph(self, chart_type, data_dict):

        # Create a template with an empty division and placeholders for the division ID and the JavaScript code
        JS_text = Template('''

                    <div id='maindiv${divnum}'></div>

                    <script>
                        $main_text
                    </script>

                    ''')
        # Give the division a randomly selected ID so that it is unlikely to clash with another division in the HTML page
        divnum = random.randint(0, 9999999999)
        # Read the JavaScript code and substitute the placeholders in the Template
        with open(self.this_dir() + '/js/' + chart_type + '.js', 'r') as js_file:
            main_text_template = Template(js_file.read())
        main_text = main_text_template.safe_substitute({'divnum': divnum, 'data': data_dict})

        return JS_text.safe_substitute({'divnum': divnum, 'main_text': main_text})

Now let's read a CSV file and convert its data into a JSON string.

In [21]:
line_data = pd.read_csv('data/line_chart.csv')
line_jsonData = json.dumps(line_data.to_dict(orient='records'))

By creating an object of class D3Generator, we can access the set_styles function. This function takes the chart type(s) as an argument and reads all the CSS files corresponding to those chart types. It is important that the filename matches the chart type. After reading all the CSS files, it returns their concatenated content wrapped in a style tag, which we will render as HTML content.

In [22]:
NewChart = D3Generator()
HTML(NewChart.set_styles('line_chart'))

Now that the CSS for the chart is in place, we can draw the chart by passing the chart type and its data as arguments to the draw_graph function. The function randomly generates an ID for the HTML division that will contain the SVG for our chart. It then reads the appropriate JavaScript file, which contains two placeholders: one for the division ID and another for the data. After plugging the ID and data into the JavaScript text, the function plugs the division ID and the JavaScript text into the main HTML Template. The resulting text is returned and rendered as HTML.

In [23]:
HTML(NewChart.draw_graph('line_chart',line_jsonData))
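The two substitution stages can also be exercised without any files on disk. The sketch below is a simplified stand-in for draw_graph, not a replacement for it: `render_chart` is a hypothetical function, and the one-line script stands in for the contents of js/line_chart.js.

```python
import random
from string import Template

def render_chart(js_source, data_json):
    """Fill a chart script and wrap it in a <div>/<script> pair sharing one ID."""
    page = Template("<div id='maindiv${divnum}'></div>\n<script>\n$main_text\n</script>")
    divnum = random.randint(0, 9999999999)
    # Stage 1: plug the division ID and the data into the JavaScript text
    main_text = Template(js_source).safe_substitute(divnum=divnum, data=data_json)
    # Stage 2: plug the ID and the finished script into the HTML wrapper
    return divnum, page.safe_substitute(divnum=divnum, main_text=main_text)

# Hypothetical one-line chart script standing in for js/line_chart.js
js_source = 'd3.select("#maindiv${divnum}").datum($data);'
divnum, html = render_chart(js_source, '[{"value": 1}]')
```

The same ID ends up in both the division and the script, which is what lets the script find the division it should draw into.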

## Conclusion

You now have the basic skills to use D3 for DOM manipulation and to create simple graphs. D3 is a vast library and can be used to create interactive charts with animation. The charts used in the tutorial were kept simple by design so we could focus more on the basics of D3. I wholeheartedly recommend learning more about D3 and its other features. Links for further reading are provided in the references.

## References and further reading

1. http://www.bogotobogo.com/python/IPython/iPython_Jupyter_Notebook_with_Embedded_D3.php
2. https://blog.thedataincubator.com/2015/08/embedding-d3-in-an-ipython-notebook/
3. http://www.machinalis.com/blog/embedding-interactive-charts-on-an-ipython-nb/
4. https://www.intothevoid.io/data-visualization/understanding-d3-data-vs-datum/
5. https://bost.ocks.org/mike/selection/
6. http://d3-wiki.readthedocs.io/zh_CN/master/Ordinal-Scales/
7. http://bl.ocks.org/d3noob/38744a17f9c0141bcd04
8. https://d3js.org/
9. https://github.com/d3/d3/wiki
10. https://github.com/d3/d3/wiki/Tutorials
11. https://www.dashingd3js.com/table-of-contents