In [1]:
from IPython.display import Image

# V00: Introduction to Data Visualisation

You'll likely have heard the term 'data visualisation' (commonly abbreviated to 'data vis') before. It's a general term for helping people understand data by placing it in a visual context. Patterns, trends and correlations that might go undetected in text-based data can be spotted and highlighted more easily with data visualisation software and languages such as R and, of course, Python.

More recently, data vis has grown beyond Excel spreadsheets and charts and has become more sophisticated, allowing data to be displayed as GIS maps, infographics, sparklines, heatmaps and more.

## Data Vis in Python

Python has some excellent packages for data visualisation and we'll be giving an overview of some of these in this chapter.


![title](img/matplot.png)

<a href ="http://matplotlib.org/">Matplotlib</a> is probably the most popular data vis library in Python. It was originally created in 2002, making it one of the older Python libraries still in widespread use, and its plotting interface is modelled on MATLAB's.

Matplotlib can be used in Python scripts, Jupyter, web application servers, and graphical user interface toolkits.
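As a taste of what this looks like in practice, here is a minimal sketch of a Matplotlib line chart. The data and labels are made up purely for illustration and are not part of the course material.

```python
import matplotlib.pyplot as plt

# Made-up illustrative data: monthly totals for a hypothetical series
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
totals = [12, 15, 9, 20, 24, 18]

# Create a figure with a single set of axes
fig, ax = plt.subplots(figsize=(6, 3))

# Draw a line with a marker at each data point
ax.plot(months, totals, marker="o")

# Label the chart so it can be read on its own
ax.set_title("Monthly totals (made-up data)")
ax.set_xlabel("Month")
ax.set_ylabel("Total")
```

In Jupyter the figure is normally rendered below the cell; in a plain Python script you would typically finish with `plt.show()` or `fig.savefig("totals.png")`.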

![title](img/seaborn.png)

<a href = "https://stanford.edu/~mwaskom/software/seaborn/index.html#">Seaborn</a> is a library for making attractive and informative statistical graphics in Python. It is built on top of matplotlib and tightly integrated with the PyData stack, including support for numpy and pandas data structures and statistical routines from scipy and statsmodels.

Some of the features that seaborn offers are:

* Several built-in themes that improve on the default matplotlib aesthetics
* Tools for choosing color palettes to make beautiful plots that reveal patterns in your data
* Functions for visualizing univariate and bivariate distributions or for comparing them between subsets of data
* Tools that fit and visualize linear regression models for different kinds of independent and dependent variables
* Functions that visualize matrices of data and use clustering algorithms to discover structure in those matrices
* A function to plot statistical timeseries data with flexible estimation and representation of uncertainty around the estimate
* High-level abstractions for structuring grids of plots that let you easily build complex visualizations
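To make a couple of these features concrete, the sketch below applies one of the built-in themes and fits a linear regression to a small pandas DataFrame. The column names and values are invented for illustration only.

```python
import pandas as pd
import seaborn as sns

# Apply one of seaborn's built-in styles to subsequent plots
sns.set_style("whitegrid")

# Made-up illustrative data: hours studied against test score
df = pd.DataFrame({
    "hours": [1, 2, 3, 4, 5, 6, 7, 8],
    "score": [52, 55, 61, 60, 68, 71, 75, 80],
})

# Scatter the points and overlay a fitted linear regression line
# with a shaded confidence band around the estimate
ax = sns.regplot(x="hours", y="score", data=df)
ax.set_title("Score vs hours studied (made-up data)")
```

Because seaborn is built on matplotlib, `regplot` returns a matplotlib `Axes` object, so the usual matplotlib methods can be used to tweak titles, labels and limits afterwards.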

You can install it with `pip install seaborn` (or `conda install seaborn` if you use Anaconda).

![title](img/bokeh.png)

<a href = "http://bokeh.pydata.org/en/latest/">Bokeh</a> is a Python interactive visualization library that targets modern web browsers for presentation. Its goal is to provide elegant, concise construction of novel graphics in the style of D3.js, and to extend this capability with high-performance interactivity over very large or streaming datasets. Bokeh can help anyone who would like to quickly and easily create interactive plots, dashboards, and data applications.

Also (if you needed any more incentive to use it!), Bokeh is made by Continuum Analytics, the very same people responsible for putting Anaconda together, and it comes as part of the standard Anaconda installation.
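For a rough idea of how a Bokeh plot is put together, here is a minimal sketch that builds an interactive figure and renders it to a standalone HTML string. The data, title and choice of tools are illustrative assumptions rather than anything prescribed by the course.

```python
from bokeh.embed import file_html
from bokeh.plotting import figure
from bokeh.resources import CDN

# Made-up illustrative data
x = [1, 2, 3, 4, 5]
y = [6, 7, 2, 4, 5]

# Create a figure with a few interactive tools, including hover tooltips
p = figure(
    title="Simple interactive line (made-up data)",
    x_axis_label="x",
    y_axis_label="y",
    tools="pan,wheel_zoom,hover,reset",
)

# Add a line and a marker at each data point
p.line(x, y, line_width=2)
p.scatter(x, y, size=8)

# Render the plot to a standalone HTML document (as a string),
# loading the BokehJS resources from the CDN
html = file_html(p, CDN, "Bokeh example")
```

In a notebook you would more commonly call `output_notebook()` followed by `show(p)` (both from `bokeh.io`) to display the plot inline; the HTML string above is just a convenient way to show that the output targets the browser.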

![title](img/plotly.png)

<a href = "https://plot.ly/">Plotly</a> is an online analytics and data visualization tool that provides online graphing, analytics, and stats tools for individuals and collaboration. It can also be integrated with other software and languages such as Python, R, MATLAB, Perl, Julia, Arduino, and REST. Up until very recently Plotly was a 'paid' service (and still is if you want to <a href = "https://plot.ly/products/cloud/">host files online</a>); however, they've recently taken the decision to <a href = "https://plot.ly/javascript/open-source-announcement/">go open source</a>.

Plotly isn't a 'typical' Python library in that, whilst you can use it offline, much of the content is posted to the web rather than output in Jupyter. This can make it difficult to work with sensitive data and adds a layer of complexity.

You can install it with `pip install plotly`.

We won't be going through Plotly as part of this course; however, there are some excellent tutorials available <a href = "http://nbviewer.jupyter.org/github/plotly/python-user-guide/blob/master/Index.ipynb">here</a>.

![title](img/lightning.png)

Similar to Plotly, <a href = "http://lightning-viz.org">Lightning</a> integrates with a number of programming languages and produces some quite swanky-looking graphs. Note that whilst the graphs are interactive to an extent, they don't appear to have pop-up tooltips, which is a shame.

You can install the Python client with `pip install lightning-python`.


## Structure of this Section

Data vis in Python is a massive area, and you could quite easily fill a training course with examples and exercises for each of the libraries listed. As such, the training here will show the basics for a few libraries and signpost you to further information and material so you can learn more after the course.