# Vincent - The data capabilities of Python. The visualization capabilities of JavaScript.

This tutorial shows how to use Vincent, a data visualization tool for Python.

Author: Tenzin Chhosphel, CSC 599.69, CCNY

# Introduction

Vega is a visualization grammar, a declarative format for creating and saving interactive visualization designs. With Vega you can describe data visualizations in a JSON format, and generate interactive views using either HTML5 Canvas or SVG. 
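To make the idea of a declarative JSON description concrete, here is a minimal sketch that writes a chart description as a plain Python dictionary and serializes it with the standard json module. The property names follow the general shape of early Vega specifications (data, scales, axes, marks), but the spec is abbreviated and has not been validated against Vega, so treat the details as assumptions.

```python
import json

# An abbreviated, illustrative specification in the style of early Vega.
# It is a reading aid only and has not been validated against Vega itself.
spec = {
    "width": 400,
    "height": 200,
    "data": [
        {"name": "table", "values": [{"x": 0, "y": 10}, {"x": 1, "y": 20}]}
    ],
    "scales": [
        {"name": "x", "type": "ordinal", "range": "width",
         "domain": {"data": "table", "field": "x"}},
        {"name": "y", "type": "linear", "range": "height",
         "domain": {"data": "table", "field": "y"}},
    ],
    "axes": [{"type": "x", "scale": "x"}, {"type": "y", "scale": "y"}],
    "marks": [{"type": "rect", "from": {"data": "table"}}],
}

# The whole chart description is just data, so it serializes to JSON directly.
spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

The point of the sketch is that nothing in the description is executable code: a renderer reads the JSON and decides how to draw it.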

Built on top of D3, Vega makes building visualizations easier. Vega is built by Trifacta, a company that builds data wrangling and exploratory analysis platforms.

Vincent makes it easy to build Vega specifications with Python. It lets you build them in a Pythonic way, and performs type-checking to help ensure that your specifications are correct. It also has a number of convenience chart-building methods that quickly turn Python data structures into Vega visualization grammar, enabling graphical exploration. It allows for quick iteration on visualization designs via getters and setters on grammar elements, and outputs the final visualization to JSON. Perhaps most importantly, Vincent has Pandas-Fu, and is built specifically to allow for quick plotting of DataFrames and Series.
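The idea of type-checked getters and setters can be sketched with a Python property. The class below is hypothetical and is not Vincent's own code; it only illustrates how a setter can reject a bad value before it ever reaches the JSON output.

```python
class ChartSketch:
    """Hypothetical stand-in for a chart object; not part of Vincent."""

    def __init__(self):
        self._width = None

    @property
    def width(self):
        # Getter: return the stored value unchanged.
        return self._width

    @width.setter
    def width(self, value):
        # Setter: accept only non-negative integers (bool is excluded on purpose).
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError("width must be an integer")
        if value < 0:
            raise ValueError("width must be non-negative")
        self._width = value


chart = ChartSketch()
chart.width = 400          # accepted
try:
    chart.width = "wide"   # rejected by the setter
except TypeError as err:
    print("rejected:", err)
```

Because the check runs at assignment time, a mistake surfaces on the line that caused it rather than later, when the chart fails to render.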

The core concept of Vincent is that it combines the data capabilities of Python with the visualization capabilities of JavaScript.

More information can be found at http://vincent.readthedocs.io/


# Installation

You can install Vincent with pip:

*$ pip install vincent*

Vincent depends on Pandas, so make sure Pandas is installed first. If it is not, install it with pip:

*$ pip install pandas*

Pandas in turn depends on NumPy, so make sure NumPy is installed before Pandas. If it is not, install it with pip:

*$ pip install numpy*
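If you are unsure which of these packages are already present, a small check like the following can help. It is a sketch that uses only the standard library; the helper name missing_packages is made up for this tutorial.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Lists whichever of the three packages still need to be installed.
print(missing_packages(["numpy", "pandas", "vincent"]))
```

An empty list means all three packages were found in the current environment.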



Having trouble with the installation?

A full installation guide can be found at https://github.com/wrobstory/vincent



In [75]:
# Before you use Vincent, you need to import Vincent
import vincent

# Then before anything, you need to load all necessary JavaScript libraries
vincent.initialize_notebook()

# Get Started

To get started using Vincent, first import Vincent with the above command. Once imported, you need to load all necessary JavaScript libraries using initialize_notebook().

Once the libraries are loaded, you are ready to go!

You can easily plot many different kinds of graphs with Vincent. To keep this tutorial simple, we'll start with some of the basic ones.

We'll plot the following:

    * Bar Graph
    * Area Graph
    * Line Graph
    * Scatter Plots
   
Once we are familiar with the basic graphs, we'll plot a more complex, real-world dataset. But you need to know the basics first in order to understand the more complex example.

In [76]:
# let's make a list of some integers
list_data = [0, 10, 20, 30, 40, 50, 40, 30, 20, 10, 0]

# now we'll use Vincent's Bar object to plot a bar graph
bar = vincent.Bar(list_data)

# set the size of the graph
bar.width = 400
bar.height = 200

# label axes
bar.axis_titles(x='Index', y='Value')

# let's see how the bar graph looks
bar.display()

# Bar Graph

Bar graphs can be plotted by simply passing a list of data to the vincent.Bar(data) object. You can assign the graph to a variable, and you can set the size of the graph by setting its width and height. From now on, I will refer to this variable as the 'graph variable.'
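When a plain list is plotted, each value is drawn against its position in the list, which is why the x axis above is labeled 'Index'. The sketch below shows that pairing with the standard library only. The helper list_to_records and the record keys idx and val are illustrative assumptions, not a description of Vincent's internals.

```python
def list_to_records(values):
    """Pair each value with its position in the list (hypothetical helper)."""
    return [{"idx": i, "val": v} for i, v in enumerate(values)]


list_data = [0, 10, 20, 30, 40, 50, 40, 30, 20, 10, 0]
records = list_to_records(list_data)

# The first few bars: position on the x axis, height on the y axis.
for record in records[:3]:
    print(record)
```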

You can display the graph in the notebook by calling display(), or simply by putting the graph variable on the last line of the cell. In fact, if you replace the bar.display() statement with just bar, you should see the same graph.
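A bare variable on the last line of a cell is rendered because the notebook asks the object for a rich representation. One common hook for this is a method named _repr_html_. The class below is a hypothetical illustration of that hook; whether Vincent relies on exactly this mechanism is an assumption here.

```python
class HtmlChartSketch:
    """Hypothetical object that a notebook could render as HTML."""

    def __init__(self, title):
        self.title = title

    def _repr_html_(self):
        # A notebook front end can call this to get HTML for the cell output.
        return "<div>{}</div>".format(self.title)


chart = HtmlChartSketch("bar chart placeholder")
html = chart._repr_html_()
print(html)
```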

### Label Axes

Labeling the axes is simple. Just pass the label names for x and y to axis_titles(x, y), as shown above.

In [77]:
# let's make a list of some integers
list_data = [40, 50, 25, 30, 60, 50, 20, 10, 45, 70]

# now we'll use Vincent's Area object to plot an area graph
area = vincent.Area(list_data)

# set the size of the graph
area.width = 400
area.height = 200

# label axes
area.axis_titles(x='Index', y='Value')

# let's see how the area graph looks
area.display()

# Area Graph

Like bar graphs, area graphs can be plotted by simply passing a list of data to the vincent.Area(data) object. The area under the line formed by the data is shaded, as shown above. And, as with bar graphs, you can assign the graph to a variable and set its size through its width and height.

You can display the graph in the notebook by calling display(), or simply by putting the graph variable on the last line of the cell. In fact, if you replace the area.display() statement with just area, you should see the same graph.

### Label Axes

Labeling the axes is simple. Just pass the label names for x and y to axis_titles(x, y), as shown above.

In [78]:
import random

# let's generate some random data for the example
# we'll generate four lists of 21 integers each
# each set will have random numbers from 1 to 100
dataLabels = ['Data A', 'Data B', 'Data C', 'Data D']
index = range(0, 21)
dataset = {'index': index}
for label in dataLabels:
    dataset[label] = [random.randint(1, 100) for val in index]

# now we'll use Vincent's Line object to plot a line graph
# of the four datasets
line = vincent.Line(dataset, iter_idx='index')

# set the size of the graph
line.width = 400
line.height = 200

# label axes
line.axis_titles(x='Index', y='Value')

# add legend
line.legend(title='Data Categories')

# let's see how the line graph looks
line.display()


# Line Graph

Line graphs can be plotted by simply passing a list of data to the vincent.Line(data) object. But you can also pass in a dictionary that holds several data lists. The graph shown above is built from a dictionary with four data lists, passed as a parameter to the vincent.Line(data, iter_idx) object. Here data is the dictionary, and iter_idx names the key whose values serve as the shared index, that is, the x-axis values ('index' in this example). Every other key, from Data A to Data D, is plotted as its own line against that index.

Each line gets a different color by default. And, as with bar graphs, you can assign the graph to a variable and set its size through its width and height.

You can display the graph in the notebook by calling display(), or simply by putting the graph variable on the last line of the cell. In fact, if you replace the line.display() statement with just line, you should see the same graph.
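One way to picture what happens to such a dictionary is to flatten it into one record per point, where the list named by iter_idx supplies the x value and every other key supplies a series. The function below is a sketch of that idea under stated assumptions: the name flatten_series and the record keys idx, col, and val are chosen for illustration and should not be read as Vincent's API.

```python
def flatten_series(dataset, iter_idx):
    """Turn {index_key: [...], series_name: [...], ...} into flat records."""
    series = dict(dataset)               # copy so the caller's dict is untouched
    index = list(series.pop(iter_idx))   # shared x values
    records = []
    for name in sorted(series):
        values = list(series[name])
        if len(values) != len(index):
            raise ValueError("every series must be as long as the index")
        for idx, val in zip(index, values):
            records.append({"idx": idx, "col": name, "val": val})
    return records


dataset = {"index": range(0, 3), "Data A": [5, 7, 9], "Data B": [2, 4, 6]}
records = flatten_series(dataset, iter_idx="index")
for record in records:
    print(record)
```

Each distinct col value corresponds to one line in the chart, which is also what a legend would list.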

### Label Axes
Labeling the axes is simple. Just pass the label names for x and y to axis_titles(x, y), as shown above.

### Legend
Adding a legend is simple. Call legend(title) on the graph variable, and set the title to whatever you want to call your legend. Each legend entry gets a different color by default.

In [79]:
import random

# let's generate some random data for the example
# we'll generate nine lists of 20 integers each
# each set will have random numbers from 10 to 100
dataLabels2 = ['Data ' + str(x) for x in range(1,10)]
index2 = range(1, 21)
dataset2 = {'index': index2}
for label in dataLabels2:
    dataset2[label] = [random.randint(10, 100) for val in index2]

# now we'll use Vincent's Scatter object to plot a scatter plot
# of the nine datasets
scatter = vincent.Scatter(dataset2, iter_idx='index')

# set the size of the graph
scatter.width = 400
scatter.height = 200

# label axes
scatter.axis_titles(x='Index', y='Value')

# add legend
scatter.legend(title='Data Categories')

# let's see how the scatter plot looks
scatter.display()



# Scatter Plot

Scatter plots can be plotted by simply passing a list of data to the vincent.Scatter(data) object. But you can also pass in a dictionary that holds several data lists. The graph shown above is built from a dictionary with nine data lists, passed as a parameter to the vincent.Scatter(data, iter_idx) object. Here data is the dictionary, and iter_idx names the key whose values serve as the shared index, that is, the x-axis values ('index' in this example). Every other key, from Data 1 to Data 9, is plotted as its own set of points against that index.

You can display the graph in the notebook by calling display(), or simply by putting the graph variable on the last line of the cell. In fact, if you replace the scatter.display() statement with just scatter, you should see the same graph.

### Label Axes
Labeling the axes is simple. Just pass the label names for x and y to axis_titles(x, y), as shown above.

### Legend
Adding a legend is simple. Call legend(title) on the graph variable, and set the title to whatever you want to call your legend. Each legend entry gets a different color by default.

==========================================================================

# More complex graph with real world data

Hopefully, at this point, you are familiar with plotting different graphs using Vincent. Pretty easy, isn't it?

Now we'll plot something more complex, a real-world dataset, and walk through each step in the process.

