# EXTRA CREDIT: Let's Visualize

Another great feature of doing Python analysis in a Jupyter notebook is the ability to visualize your data with the [Bokeh visualization library](http://bokeh.pydata.org/en/latest/). We won't go into great detail on the step-by-step process of creating beautiful graphics in your notebook, but you can see what's possible below. You can read more in the Bokeh [user guide](http://bokeh.pydata.org/en/latest/docs/user_guide.html#userguide).

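Note: You'll need to install Bokeh. You can do this by running `pip install bokeh` on your command line. The examples below use the high-level `bokeh.charts` interface, which was removed from later Bokeh releases and split out into the separate `bkcharts` package, so they need an older Bokeh version to run as written.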

Next, let's load the datasets we created in [Part 3](Part%203.ipynb).

In [96]:
import pandas as pd
njcountycrashes = pd.read_csv('_visuals/scatterplot.csv')
countydeaths = pd.read_csv('_visuals/bargraphs.csv', header=None)
crashesbydate = pd.read_csv('_visuals/linegraph.csv')
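
The `header=None` argument above tells pandas that the file has no header row, so the columns come back numbered `0`, `1`, and so on, and need names assigned later. A minimal sketch of that behavior, using a small made-up CSV string (the values are illustrative, not the real dataset):

```python
import io
import pandas as pd

# Hypothetical two-column CSV with no header row (illustrative values only)
raw = "ALPHA,10\nBETA,7\n"

# header=None keeps the first line as data and numbers the columns 0, 1, ...
no_header = pd.read_csv(io.StringIO(raw), header=None)

# Without header=None, pandas treats the first line as the column names
with_header = pd.read_csv(io.StringIO(raw))

print(list(no_header.columns))
print(len(no_header), len(with_header))
```

In the second call the first record is consumed as the header, which is why a headerless file read without `header=None` appears to be missing a row.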

# Scatterplot

<b>Total Killed vs. Pedestrians Killed in each county</b>

In [97]:
from bokeh.charts import Scatter, output_file, show, output_notebook

# Render plots inline in the notebook
output_notebook()

# One point per county: total deaths on x, pedestrian deaths on y
scatterplot = Scatter(njcountycrashes, x='Total Killed', y='Pedestrians Killed',
                      title="Total Killed vs Pedestrians Killed in every County",
                      xlabel="Total Killed")

output_file("_visuals/scatterplot.html")

show(scatterplot)


This will create a new [scatterplot HTML page](scatterplot.html). You can also save the plot as a PNG, for example with the save tool in the plot's toolbar.

# Bar Graph

<b>Total Killed in each county</b>

Because this CSV was read with `header=None`, we first need to add column names, then sort the rows by the number killed.

In [98]:
countydeaths.columns = ["County", "TotalKilled"]

In [99]:
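# sort_values returns a new DataFrame, so assign it to keep the sorted order
countydeaths = countydeaths.sort_values(by="TotalKilled", ascending=False)
countydeaths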

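| | County | TotalKilled |
|---|---|---|
| 0 | MIDDLESEX | 308 |
| 1 | OCEAN | 284 |
| 2 | BURLINGTON | 249 |
| 3 | ESSEX | 247 |
| 4 | CAMDEN | 230 |
| 5 | MONMOUTH | 230 |
| 6 | ATLANTIC | 187 |
| 7 | UNION | 182 |
| 8 | BERGEN | 166 |
| 9 | PASSAIC | 160 |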
9,PASSAIC,160


In [100]:
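from bokeh.charts import Bar
from bokeh.charts.attributes import cat

# cat(..., sort=False) keeps the counties in DataFrame order instead of alphabetical order
p = Bar(countydeaths, label=cat(columns='County', sort=False), values='TotalKilled',
        title="Total Killed by County", bar_width=0.6, color="purple")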

output_file("_visuals/bar.html")

show(p)


# Line Graph

<b>New Jersey crashes over time (2008-2013)</b>

In [101]:
crashesbydate.columns = ["Date", "Crashes"]

In [102]:
from bokeh.charts import TimeSeries


data = dict(crashesbydate=crashesbydate["Crashes"], Date=crashesbydate["Date"])

p = TimeSeries(data, 'Date', title="NJ Crashes by Date", ylabel='Car Accidents')

output_file("_visuals/timeseries.html")

show(p)

<bokeh.io._CommsHandle at 0x11f4f8c50>
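
Time series plots generally work best when the x-axis column holds real datetimes rather than strings. The notebook does not show the format of the `Date` column, so the sketch below assumes ISO-style date strings and uses made-up values to show one way to convert and order them with `pd.to_datetime` before plotting:

```python
import pandas as pd

# Made-up rows standing in for linegraph.csv; the date format is an assumption
demo = pd.DataFrame({"Date": ["2008-03-01", "2008-01-01", "2008-02-01"],
                     "Crashes": [30, 10, 20]})

# Convert strings to datetime64 so the axis is treated as time, not as categories
demo["Date"] = pd.to_datetime(demo["Date"], format="%Y-%m-%d")

# Sort chronologically so the line is drawn from earliest to latest date
demo = demo.sort_values("Date").reset_index(drop=True)

print(demo["Date"].dt.year.tolist())
print(demo["Crashes"].tolist())
```

If the real file uses a different date layout, the `format` string would need to change to match it.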