[(previous)](05%20-%20-%20-%20Introduction%20-%20to%20-%20data%20-%20as%20-%20a%20-%20science.ipynb) | [(index)](00%20-%20-%20-%20Introduction%20-%20to%20-%20Python.ipynb)

# JavaScript tools for visualisation

<div class="alert alert-block alert-warning">
    <b>Learning outcomes:</b>
    <br>
    <ul>
        <li>Learn how data management and analysis support the Model-View-Controller approach to application development.</li>
        <li>Investigate and apply JavaScript tools, such as D3, Plotly and Leaflet, for data presentation and visualisation.</li>
        <li>Learn how to improve your knowledge and experience through online documentation and question-and-answer communities.</li>
    </ul>
</div>

While designing web applications is beyond the scope of this short introduction, the various components of this course have prepared you for the high-level approach to application abstraction.

In [Foundations for Open Data](https://docs.google.com/document/d/1g_aJrN91xHXGFi6wUaeQNQSRf5AndBqRexIgmF8mjUk/edit#) you learned how to prepare standardised, machine-readable CSV files from spreadsheets. Core to that was restructuring the data so that each row has all the information required to give it context and meaning. While that leads to duplicated fields, it permits easy lookups, filters and joins/merges with other data.
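As a reminder of what that restructuring looks like, here is a minimal sketch in plain Python. The `wide_rows` table, its values and the `to_long` helper are hypothetical and exist only for illustration; they are not part of the course data.

```python
# Hypothetical spreadsheet-style data: one row per governorate, one column per date.
# The numbers are made up for illustration.
wide_rows = [
    {"Governorate": "Ibb", "2018-01-07": 10, "2018-01-14": 12},
    {"Governorate": "Abyan", "2018-01-07": 3, "2018-01-14": 4},
]


def to_long(rows, id_field, value_name):
    """Return one record per (id, date) pair so that every row is self-describing."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_field:
                continue  # the identifier is repeated on every output row instead
            long_rows.append({id_field: row[id_field], "Date": column, value_name: value})
    return long_rows


long_rows = to_long(wide_rows, "Governorate", "Deaths")

# Because each row carries its own context, filtering is a simple test per row.
ibb_rows = [r for r in long_rows if r["Governorate"] == "Ibb"]
print(ibb_rows)
```

The governorate name is duplicated on every row, which is exactly the trade-off described above: more repetition, but each row can be filtered or merged without reference to its neighbours.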

In sections [4](04%20-%20Python%20for%20data%20analysis.ipynb) and [5](05%20-%20Introduction%20to%20data%20as%20a%20science.ipynb) of this course, you learned how to create functions which encapsulate analysis and logic in preparing data for presentation.

This section focuses on some JavaScript libraries and methods for presenting data.

These three components make up the Model-View-Controller (MVC) approach to software abstraction.

## Model-View-Controller (MVC)

The MVC approach to abstraction separates discrete parts of an application into different modules, supporting interoperability between multiple applications and giving the developer greater flexibility.

- **Model**: The representation and logic for the data, from structural metadata (the way data are represented in the database) to the functionality for querying data and extracting slices from the database;
- **Controller**: The core logic and functionality of the application, which defines its purpose;
- **View**: The interface and visual components presented to the user, and with which they interact.

![MVC - RegisFrey via Wikipedia](https://upload.wikimedia.org/wikipedia/commons/thumb/a/a0/MVC-Process.svg/436px-MVC-Process.svg.png)

Hopefully, by this stage of the course, you can see how a well-structured data file may be used to define the model, and how our experiments with analysing data in the last two sections can be seen to be the control. While we did create some visualisations, these were static. Modern web applications are based on the ability of users to interact with data and charts.
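The separation can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names `model_query`, `control_rank` and `view_render` are hypothetical, and the three records are copied from the data slice used later in this section.

```python
# Model: the data, plus the functions that query it.
RECORDS = [
    {"Governorate": "Ibb", "Date": "2018-01-14", "Deaths": 286},
    {"Governorate": "Abyan", "Date": "2018-01-14", "Deaths": 35},
    {"Governorate": "Taizz", "Date": "2018-01-14", "Deaths": 187},
]


def model_query(date):
    """Model: extract the slice of records for one date."""
    return [r for r in RECORDS if r["Date"] == date]


def control_rank(records):
    """Control: application logic, here ranking governorates by deaths."""
    return sorted(records, key=lambda r: r["Deaths"], reverse=True)


def view_render(records):
    """View: turn the result into something a user can read."""
    return "\n".join(f"{r['Governorate']}: {r['Deaths']}" for r in records)


report = view_render(control_rank(model_query("2018-01-14")))
print(report)
```

Because each part only depends on the output of the one before it, the text view could be swapped for an interactive chart without touching the model or the control.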

Most web applications are composed of three separate pieces of code: HTML, CSS and JavaScript. These are the core components of the World Wide Web. It may be helpful to think of the HTML as the model, CSS as the view, and JavaScript as the control/logic operating entirely in the browser (where the web application server has simply presented data - the result of the application's model output - to the browser for further processing).

You can find introductions to [HTML](https://learn-html.org/) and [JavaScript](https://learn-js.org/) at these links.

There is a large and diverse range of JavaScript libraries that support data visualisation and interactivity in the browser. This is by no means an exhaustive list:

- [D3.js](https://d3js.org/) is a comprehensive (and quite difficult) library which is capable of producing almost any visualisation of interest. A great resource for examples is Mike Bostock's [Bl.ocks](https://bl.ocks.org/). D3 is entirely open source.
- [Plot.ly](https://plot.ly/) is a series of libraries (including plotly.js) permitting a diverse range of charts. This is more structured than D3. The JavaScript library is open source, as is the Python library, while the company also sells commercial hosted products.
- [Leaflet.js](https://leafletjs.com/) is an open source library for producing mobile-friendly interactive maps based on [OpenStreetMap](https://www.openstreetmap.org). To visualise maps on Leaflet in Jupyter Notebooks, we'll use the [Folium](https://github.com/python-visualization/folium) Python library.

In this last section of the course, we will use Leaflet to present the choropleth map we produced with Matplotlib in the previous section. To do this, we need to convert our tabular data into JSON format.

## Preparing JSON notation

In [1]:
# As in Section 5, import our Pandas and GeoPandas library
import pandas as pd
import geopandas as gpd
# Import our saved Yemen data slice
data_slice = pd.read_csv("data/yemen_cholera_data_slice.csv")
# Open the shapefile called "yem_admin1.shp"
shape_data = gpd.GeoDataFrame.from_file("data/yem_admin1.shp")
# We have no data for Socotra island, so we can drop this row
shape_data = shape_data.loc[~shape_data.name_en.isin(["Socotra"])]
# And now we can merge our existing data_slice to produce our map data
map_data = pd.merge(shape_data, data_slice, how="outer", left_on="name_en", right_on="Governorate", indicator=False)
# Let's filter to only the data and date we want to show on the map (which will make it faster)
map_slice = map_data[map_data.Date=="2018-01-14"][["Governorate", "Deaths", "geometry"]]
map_slice

| | Governorate | Deaths | geometry |
|---|---|---|---|
| 5 | Ibb | 286 | POLYGON ((44.07799093200003 14.38560877300006,... |
| 140 | Abyan | 35 | POLYGON ((46.29079443600006 14.05947152400006,... |
| 277 | Amanat Al Asimah | 70 | POLYGON ((44.38133503500006 15.57250071700003,... |
| 413 | Al Bayda | 34 | (POLYGON ((45.47230606000005 14.66985415100004... |
| 549 | Taizz | 187 | POLYGON ((43.61001368100005 13.91468914900003,... |
| 684 | Al Dhale'e | 81 | POLYGON ((44.99620555600006 14.21919927700003,... |
| 821 | Al Jawf | 22 | POLYGON ((46.10000018300008 17.24999977900006,... |
| 957 | Hajjah | 420 | (POLYGON ((42.41165918000007 16.09624915500007... |
| 1093 | Al Hudaydah | 280 | (POLYGON ((42.74680318000003 13.64650590000002... |
| 1224 | Hadramaut | 2 | POLYGON ((50.77950706500008 16.97111052800005,... |


[JSON](https://json.org/) (JavaScript Object Notation) is a data interchange format. Although it has *JavaScript* in its name, it is language-agnostic and can be easily read and produced by most programming languages.

JSON is built on two structures:

- A collection of name/value pairs. In various languages, this is realized as an object, record, struct, dictionary, hash table, keyed list, or associative array.
- An ordered list of values. In most languages, this is realized as an array, vector, list, or sequence.
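In Python these two structures correspond to `dict` and `list`, and the standard-library `json` module converts between them and JSON text. The sketch below is only an illustration; the `tags` field is a hypothetical addition and does not appear in the course data.

```python
import json

# A name/value collection (dict) that contains an ordered list of values (list).
record = {"Governorate": "Ibb", "Deaths": 286, "tags": ["cholera", "2018-01-14"]}

# Serialise to JSON text, as you would before sending it to a browser.
text = json.dumps(record)
print(text)

# Parse the text back into Python objects.
restored = json.loads(text)
print(type(restored).__name__, type(restored["tags"]).__name__)
```

The round trip returns an equal `dict`, which is what makes JSON convenient for passing data between a Python back-end and JavaScript in the browser.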

[GeoJSON](https://en.wikipedia.org/wiki/GeoJSON) is a JSON-based format for encoding geographic features, such as points, lines and polygons, together with their properties, which makes it suitable for presentation on a map. Fortunately, there is no need to worry much more about it than that, as GeoPandas will produce appropriately structured GeoJSON for you.

You'll note that the GeoJSON has the following structure:

    {"type": "FeatureCollection", 
     "features": [{"id": ,
                   "type": "Feature", 
                   "properties": {"Governorate": ,
                                  "Deaths": },
                   "geometry": {"type": "Polygon",
                                "coordinates": [[[]]]}
                 }]
    } 
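To make that skeleton concrete, here is a minimal sketch that builds a one-feature collection by hand and then walks through it. The governorate name, the death count and the triangle's coordinates are placeholders invented for illustration; real boundaries come from the shapefile.

```python
import json

feature = {
    "id": "0",
    "type": "Feature",
    "properties": {"Governorate": "Example", "Deaths": 0},
    "geometry": {
        "type": "Polygon",
        # One ring of [longitude, latitude] positions.
        # The first and last positions are identical, which closes the ring.
        "coordinates": [[[44.0, 14.0], [45.0, 14.0], [45.0, 15.0], [44.0, 14.0]]],
    },
}
collection = {"type": "FeatureCollection", "features": [feature]}

geojson_text = json.dumps(collection)

# Reading it back: every feature carries its own properties and geometry.
parsed = json.loads(geojson_text)
for item in parsed["features"]:
    ring = item["geometry"]["coordinates"][0]
    print(item["properties"]["Governorate"], item["geometry"]["type"], len(ring), "positions")
```

The triple nesting of `coordinates` in the skeleton above reflects this layout: a polygon is a list of rings, a ring is a list of positions, and a position is a list of numbers.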

In [2]:
map_json = map_slice.to_json()
map_json

'{"type": "FeatureCollection", "features": [{"id": "5", "type": "Feature", "properties": {"Governorate": "Ibb", "Deaths": 286}, "geometry": {"type": "Polygon", "coordinates": [[[44.077990932000034, 14.385608773000058], [44.075525818000074, 14.377315683000063], [44.075525808000066, 14.377315649000025], [44.07594207900007, 14.377206757000067], [44.07952731000006, 14.376268903000039], [44.08186399700003, 14.375316759000043], [44.085634431000074, 14.373780398000065], [44.08673081300003, 14.371951617000036], [44.08764130600008, 14.37043290300005], [44.08764126500006, 14.370432493000067], [44.08735218000004, 14.367537653000056], [44.08586143700006, 14.365779305000046], [44.08586130900005, 14.365779153000062], [44.08596232200006, 14.364889860000062], [44.086035681000055, 14.364244025000062], [44.08796543100004, 14.364035649000073], [44.08899578200004, 14.365554505000034], [44.08975243400005, 14.366669898000055], [44.09151673700006, 14.36811521800007], [44.09151680800005, 14.368115276000026], 

A data-driven web application will have components that deal with:

1. Web-addressable data, where data are imported and restructured in your application via APIs (application programming interfaces), which permit you to access data from outside sources;
2. Logic, including algorithmic functions, which processes the data into a usable format (for the specific application), permitting it to be stored in a database (if that is required);
3. Data presentation, where the final version of the data is presented to the front-end (a user application, such as a browser) for viewing.

That final step will often involve converting the data into JSON. As you can see above, it's quite a straightforward step. Often, the most taxing process is the first: importing, restructuring and aggregating data for use by your application.
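The three components above can be sketched end to end using only the standard library. This is a simplified illustration, not a real application: `RAW_CSV` stands in for a response fetched from an API, and the function names `import_rows`, `process` and `present` are hypothetical.

```python
import csv
import io
import json

# Stand-in for the body of a response from a web API (three rows from our data slice).
RAW_CSV = """Governorate,Date,Deaths
Ibb,2018-01-14,286
Abyan,2018-01-14,35
Taizz,2018-01-14,187
"""


def import_rows(csv_text):
    """1. Import: parse the raw text into one dictionary per row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def process(rows):
    """2. Logic: convert types and keep only what the front-end needs."""
    return [{"Governorate": r["Governorate"], "Deaths": int(r["Deaths"])} for r in rows]


def present(records):
    """3. Presentation: package the result as JSON for the browser."""
    total = sum(r["Deaths"] for r in records)
    return json.dumps({"records": records, "total_deaths": total})


payload = present(process(import_rows(RAW_CSV)))
print(payload)
```

Note that most of the code deals with reading and reshaping the input, while the conversion to JSON is a single call.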

## GeoJSON and Choropleth maps

This section is based on a [Folium tutorial](http://nbviewer.jupyter.org/github/python-visualization/folium/blob/master/examples/GeoJSON_and_choropleth.ipynb). This is also in a Jupyter Notebook, so feel free to download it.

To install the Folium library, go to the Environments tab in Anaconda Navigator and select your environment (base, by default). Open a terminal and type the following:

    conda install -c conda-forge folium
    
Accept the requirements and install the library. You will now be able to run the following:

In [3]:
import folium
from folium import plugins

# Get Yemen's Latitude and Longitude (https://www.latlong.net/place/sana-a-yemen-10298.html): 15.5527 N, 48.5164 E
m = folium.Map([15.5, 48.5], zoom_start=6)
m

You can play around with the starting position and zoom to get a better view of the map, but this is a useful place to begin.

Adding Yemen's provincial borders is as simple as adding the GeoJSON to the map object.

In [4]:
folium.GeoJson(map_json).add_to(m)
m

We can style each of the shapes on our map using the `style_function` argument. A `lambda` is a small anonymous function written inline. Folium calls it once for each `feature` in the GeoJSON's list of features, and it returns a dictionary of style characteristics for the view (here, the same dictionary for every feature).

In [5]:
m = folium.Map([15.5, 48.5], zoom_start=6)
folium.GeoJson(
    map_json,
    style_function=lambda feature: {
        "fillColor": "#ffff00",
        "color": "black",
        "weight": 2,
        "dashArray": "5, 5"
    }
).add_to(m)
m
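To see what happens to the `style_function`, it can help to call one by hand. The sketch below uses no Folium at all: it assumes, as a simplification, that a mapping library calls the function once per feature, and the two `features` are trimmed-down stand-ins containing only `properties`.

```python
# Trimmed-down stand-ins for GeoJSON features (geometry omitted for brevity).
features = [
    {"properties": {"Governorate": "Ibb", "Deaths": 286}},
    {"properties": {"Governorate": "Abyan", "Deaths": 35}},
]

# The inline form used in the cell above.
style_lambda = lambda feature: {
    "fillColor": "#ffff00",
    "color": "black",
    "weight": 2,
    "dashArray": "5, 5",
}


# The same thing written as an ordinary named function.
def style_named(feature):
    return {
        "fillColor": "#ffff00",
        "color": "black",
        "weight": 2,
        "dashArray": "5, 5",
    }


# Roughly what happens behind the scenes: one call per feature.
styles = [style_lambda(feature) for feature in features]
for feature, style in zip(features, styles):
    print(feature["properties"]["Governorate"], style["fillColor"])
```

Both features get the same yellow fill because the function ignores its `feature` argument.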

Because the `lambda` is a function of the `feature`, it can return a different colour for each feature. To do that, we need a colour scale, and then we can map a value from each feature's `properties` ('Deaths' in our data) to a colour on that scale.

Folium provides a set of tools in the `branca` library for preparing colour maps. We set the bottom and top of the colour range using `map_slice.Deaths.min()` and `map_slice.Deaths.max()`. `linear` contains a number of predefined colour ranges. We'll use `YlOrRd_03`, a yellow-orange-red range.

In [6]:
from branca.colormap import linear

colormap = linear.YlOrRd_03.scale(
    map_slice.Deaths.min(),
    map_slice.Deaths.max())

print(colormap(5.0))

colormap

#ffec9f
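To show the idea behind a linear colour scale, here is a plain-Python sketch that blends between two colours. It is not how `branca` is implemented and will not reproduce its exact colours. The helper names are hypothetical, the two end colours are chosen arbitrarily, and the range 2 to 420 is taken from the smallest and largest `Deaths` values in the rows shown earlier.

```python
def hex_to_rgb(hex_colour):
    """Convert '#rrggbb' to a tuple of three integers."""
    digits = hex_colour.lstrip("#")
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(rgb):
    """Convert a tuple of three integers to '#rrggbb'."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def make_linear_scale(low_colour, high_colour, vmin, vmax):
    """Return a function mapping a value in [vmin, vmax] to a blended colour."""
    low, high = hex_to_rgb(low_colour), hex_to_rgb(high_colour)

    def scale(value):
        # Clamp to the data range, then measure how far along the range we are.
        clamped = min(max(value, vmin), vmax)
        fraction = (clamped - vmin) / (vmax - vmin)
        return rgb_to_hex(tuple(round(a + (b - a) * fraction) for a, b in zip(low, high)))

    return scale


colour_for = make_linear_scale("#ffff00", "#ff0000", 2, 420)


def style_function(feature):
    """Pick the fill colour from the feature's own 'Deaths' property."""
    return {
        "fillColor": colour_for(feature["properties"]["Deaths"]),
        "color": "black",
        "weight": 1,
    }


print(style_function({"properties": {"Governorate": "Hajjah", "Deaths": 420}}))
```

A style function of this shape is what lets each governorate receive its own colour: the lowest value maps to yellow, the highest to red, and everything else falls in between.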


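We'll use the `choropleth` method of Folium's `Map` object. To understand how to use it, let's look at its docstring. Note that later Folium releases deprecate this method in favour of the `folium.Choropleth` class, which takes similar arguments.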

In [12]:
print(folium.Map.choropleth.__doc__)


        Apply a GeoJSON overlay to the map.

        Plot a GeoJSON overlay on the base map. There is no requirement
        to bind data (passing just a GeoJSON plots a single-color overlay),
        but there is a data binding option to map your columnar data to
        different feature objects with a color scale.

        If data is passed as a Pandas DataFrame, the "columns" and "key-on"
        keywords must be included, the first to indicate which DataFrame
        columns to use, the second to indicate the layer in the GeoJSON
        on which to key the data. The 'columns' keyword does not need to be
        passed for a Pandas series.

        Colors are generated from color brewer (http://colorbrewer2.org/)
        sequential palettes on a D3 threshold scale. The scale defaults to the
        following quantiles: [0, 0.5, 0.75, 0.85, 0.9]. A custom scale can be
        passed to `threshold_scale` of length <=6, in order to match the
        color brewer range.

        Topo

In [7]:
m = folium.Map([15.5, 48.5], zoom_start=6)
m.choropleth(
    geo_data = map_json,
    data = map_slice,
    columns = ["Governorate", "Deaths"],
    key_on = "feature.properties.Governorate",
    fill_color = "YlOrRd",
    legend_name = "Yemen deaths from cholera, January 2018"
)
m

Continuing to develop and practise your skills takes time and effort. It's as easy to forget how to code as it is to forget any new language.

One of the best ways to ensure your skills are maintained is to find a small hobby project you can work on. After that, you will need access to resources and people with skills who can guide your progress and help out when required.

## Online documentation and support

There are two key ways to learn coding. The first is to read the manual, and the more popular software libraries usually have comprehensive documentation. The second is to ask random strangers for help via the world's most popular question-and-answer site, [Stack Overflow](https://stackoverflow.com).

You will help your learning process massively by having a specific problem to solve, and that means you need a project. Whatever your interests, find something that will motivate you to challenge yourself. Whether it be exploring and charting a dataset, or even building a complete web application, push yourself to produce working code, and solve all the problems along the way.

Here are the links to all the libraries and modules used in this course:

- [Foundations for Open Data](https://docs.google.com/document/d/1g_aJrN91xHXGFi6wUaeQNQSRf5AndBqRexIgmF8mjUk/edit#)
- [LearnPython.org](https://learnpython.org/)
- [Beginning Python](http://hetland.org/writing/beginning-python-2/) by Magnus Lie Hetland, Apress
- [10 Minutes to pandas](https://pandas.pydata.org/pandas-docs/stable/10min.html), from the pandas documentation
- [Numpy tutorial](https://docs.scipy.org/doc/numpy-dev/user/quickstart.html)
- [Matplotlib tutorials](https://matplotlib.org/tutorials/index.html)
- [Seaborn](https://seaborn.pydata.org/introduction.html)
- [HTML](https://learn-html.org/)
- [JavaScript](https://learn-js.org/)
- [D3.js tutorial](https://d3js.org/#introduction)
- [Plot.ly JavaScript tutorial](https://plot.ly/javascript/)
- [Leaflet.js](https://leafletjs.com/)
- [OpenStreetMap](https://www.openstreetmap.org)
- [Folium documentation](https://python-visualization.github.io/folium/docs-v0.5.0/) and [Jupyter Notebooks](https://nbviewer.jupyter.org/github/python-visualization/folium/tree/master/examples/)

And take a [tour](https://stackoverflow.com/tour) of Stack Overflow to get an idea of how millions of coders from around the world help each other. Critically, no coder knows all code. Most coders spend most of their working day researching how to solve coding problems, and they often copy and paste code they find online.

This is not school, and you are encouraged to find the smartest person available to you and to copy their work. That's considered quite a good thing to do, with the proviso that you acknowledge the contribution made by that person and provide attribution in your work to those who helped you.

Good luck and have fun.
