## Time-Series Visualization using plotly

### Jing Song

**This notebook demonstrates interactive time-series visualization techniques using plotly.**

In [1]:
import plotly.plotly as py
import plotly.graph_objs as go

from datetime import datetime
import pandas as pd
import numpy as np

In [2]:
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
from IPython.display import display, HTML

init_notebook_mode(connected=True)

In [3]:
import plotly
plotly.__version__

'2.5.1'

In [4]:
crime_data = pd.read_csv('Monthly_Property_Crime_2005_to_2015.csv')

In [5]:
crime_data.head()

Unnamed: 0,Date,Category,IncidntNum
0,02/01/2014 12:00:00 AM,BURGLARY,506
1,02/01/2007 12:00:00 AM,VANDALISM,531
2,07/01/2012 12:00:00 AM,BURGLARY,522
3,07/01/2013 12:00:00 AM,LARCENY/THEFT,3318
4,08/01/2010 12:00:00 AM,VANDALISM,694


### 1) Line charts

In [6]:
crime_data['Category'].value_counts()

STOLEN PROPERTY    132
VEHICLE THEFT      132
VANDALISM          132
LARCENY/THEFT      132
ARSON              132
BURGLARY           132
Name: Category, dtype: int64

In [7]:
date = pd.to_datetime(crime_data['Date']).sort_values()

In [8]:
crime_data['Date'] = pd.to_datetime(crime_data['Date'])
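
The `Date` strings in the `head()` output above appear to follow a month/day/year pattern with a 12-hour clock. A minimal sketch, assuming every row uses that same pattern, passes an explicit `format` to `pd.to_datetime` so the parse does not rely on inference. The three strings are copied from the output above; `sample` and `parsed` are illustrative names.

```python
import pandas as pd

# Three date strings copied from the head() output above.
sample = pd.Series([
    '02/01/2014 12:00:00 AM',
    '02/01/2007 12:00:00 AM',
    '07/01/2012 12:00:00 AM',
])

# Assumption: month/day/year, 12-hour clock, AM/PM marker on every row.
parsed = pd.to_datetime(sample, format='%m/%d/%Y %I:%M:%S %p')

print(parsed.dt.strftime('%Y-%m-%d').tolist())
```

If some rows deviate from the assumed pattern, the call raises an error instead of silently guessing, which is usually the safer failure mode for a time axis.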

In [9]:
STOLEN_PROPERTY = crime_data.loc[crime_data['Category']=='STOLEN PROPERTY', ].sort_values('Date')

In [11]:
trace = go.Scatter(
    x = STOLEN_PROPERTY['Date'],
    y = STOLEN_PROPERTY['IncidntNum'],
    name = 'STOLEN PROPERTY',
)

data = [trace]

layout = go.Layout(title = 'Monthly Property Crime in San Francisco from 2005 to 2015',
              xaxis = dict(title = 'Year'),
              yaxis = dict(title = 'Number of Incidents'),
              showlegend=True)

fig = go.Figure(data=data, layout=layout)
iplot(fig, filename='basic-line')

**The line chart above shows the monthly number of stolen-property incidents in San Francisco from 2005 to 2015.**

### 2) Multi line charts

In [12]:
data = []
for crime in crime_data['Category'].unique():
    df = crime_data.loc[crime_data['Category']==crime, ].sort_values('Date')
    
    trace = go.Scatter(
    x = df['Date'],
    y = df['IncidntNum'],
    name = crime,
    )
    
    data.append(trace)

layout = go.Layout(title = 'Monthly Property Crime in San Francisco from 2005 to 2015',
              xaxis = dict(title = 'Year'),
              yaxis = dict(title = 'Number of Incidents'),
              showlegend=True)

fig = go.Figure(data=data, layout=layout)
iplot(fig, filename='multi-line')

**The multi-line chart above shows the monthly number of property-crime incidents in San Francisco from 2005 to 2015. Each crime category is drawn in a different color.**
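
The loop above filters and sorts the long-format table once per category. One possible alternative, sketched here on a small made-up table that reuses the notebook's column names, is to pivot to a wide table first and then emit one trace per column. The traces are written as plain dictionaries, which plotly accepts in place of `go.Scatter` objects; `toy`, `wide`, and `traces` are illustrative names, and the numbers are invented.

```python
import pandas as pd

# Made-up long-format data with the same columns as crime_data.
toy = pd.DataFrame({
    'Date': pd.to_datetime(['2005-02-01', '2005-01-01', '2005-01-01', '2005-02-01']),
    'Category': ['ARSON', 'ARSON', 'BURGLARY', 'BURGLARY'],
    'IncidntNum': [12, 10, 500, 480],
})

# One row per month, one column per category; the pivoted index is sorted by date.
wide = toy.pivot(index='Date', columns='Category', values='IncidntNum')

# One dictionary-style scatter trace per category column.
traces = [
    dict(type='scatter', x=list(wide.index), y=wide[col].tolist(), name=col)
    for col in wide.columns
]
```

The resulting `traces` list could be passed as `data` to `go.Figure` in the same way as the list built in the loop.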

### 3) Bar charts

In [15]:
data = []
for crime in ['STOLEN PROPERTY']:
    df = crime_data.loc[crime_data['Category']==crime, ].sort_values('Date')
    
    trace = go.Bar(
    x = df['Date'],
    y = df['IncidntNum'],
    name = crime,
    )
    
    data.append(trace)

layout = go.Layout(title = 'Monthly Property Crime in San Francisco from 2005 to 2015',
              xaxis = dict(title = 'Year'),
              yaxis = dict(title = 'Number of Incidents'),
              barmode='group',  # 'bar' is not a valid barmode; 'group' draws plain side-by-side bars
              showlegend=True)

fig = go.Figure(data=data, layout=layout)
iplot(fig, filename='bar')

**The bar chart above shows the monthly number of stolen-property incidents in San Francisco from 2005 to 2015.**

### 4) Stacked area charts

In [16]:
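# Sum only the incident counts per month so the text column 'Category' is not aggregated
crime_data_sum = crime_data.groupby('Date')['IncidntNum'].sum().reset_index()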
crime_data_agg = pd.merge(crime_data, crime_data_sum, how='left', on=['Date'])
crime_data_agg['IncidntPct'] = crime_data_agg['IncidntNum_x']/ crime_data_agg['IncidntNum_y']
crime_data_agg.head()

Unnamed: 0,Date,Category,IncidntNum_x,IncidntNum_y,IncidntPct
0,2014-02-01,BURGLARY,506,4164,0.121518
1,2007-02-01,VANDALISM,531,3067,0.173133
2,2012-07-01,BURGLARY,522,4459,0.117067
3,2013-07-01,LARCENY/THEFT,3318,5074,0.653922
4,2010-08-01,VANDALISM,694,3855,0.180026


In [19]:
data = []
y_stack = []
for crime in crime_data['Category'].unique():
    df = crime_data_agg.loc[crime_data_agg['Category']==crime, ].sort_values('Date')
    y_original = np.around(df['IncidntPct'].reset_index(drop=True)*100, decimals=2)
    
    if len(y_stack) == 0:
        y_stack = y_original
    else:
        y_stack = [y0+y1 for y0, y1 in zip(y_original, y_stack)]
        
    y_text = [str(y0)+'%' for y0 in y_original]
    
    
    trace = go.Scatter(
    x = df['Date'],
    y = y_stack,
    text=y_text,
    name = crime,
    hoverinfo='x+name+text',
    mode='lines',
    fill='tonexty'
    )
    
    data.append(trace)

layout = go.Layout(title = 'Composition of Monthly Property Crime in San Francisco from 2005 to 2015',
              xaxis = dict(title = 'Year'),
              yaxis = dict(title = 'Percentage of Total Incidents'),
              showlegend=True)

fig = go.Figure(data=data, layout=layout)
iplot(fig, filename='stacked-area-plot')

**The stacked area chart above shows the monthly composition of property crime in San Francisco from 2005 to 2015. Each crime category is drawn in a different color.**
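
The stacking in the cell above works by adding each category's percentage to a running total, so each trace marks the upper boundary of its band. A compact sketch of the same idea, on a small made-up table with the notebook's column names, computes the shares and the running totals with `pivot`, `div`, and `cumsum`. The names `toy`, `pct`, and `stacked` are illustrative and the numbers are invented.

```python
import pandas as pd

# Made-up data: three categories over two months.
toy = pd.DataFrame({
    'Date': pd.to_datetime(['2005-01-01'] * 3 + ['2005-02-01'] * 3),
    'Category': ['ARSON', 'BURGLARY', 'VANDALISM'] * 2,
    'IncidntNum': [10, 40, 50, 20, 20, 60],
})

# One row per month, one column per category.
wide = toy.pivot(index='Date', columns='Category', values='IncidntNum')

# Share of each category in the monthly total, in percent.
pct = wide.div(wide.sum(axis=1), axis=0) * 100

# Running sum across categories: column k is the upper boundary of band k.
stacked = pct.cumsum(axis=1)
```

Because the running sum here is taken before any rounding, the last column stays at 100 for every month up to floating-point error; rounding each share first, as in the cell above, can shift the top boundary by a few hundredths.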

### 5) Stacked bar charts

In [22]:
data = []
for crime in ['STOLEN PROPERTY', 'ARSON']:
    df = crime_data.loc[crime_data['Category']==crime, ].sort_values('Date')
    
    trace = go.Bar(
    x = df['Date'],
    y = df['IncidntNum'],
    name = crime,
    )
    
    data.append(trace)

layout = go.Layout(title = 'Monthly Property Crime in San Francisco from 2005 to 2015',
              xaxis = dict(title = 'Year'),
              yaxis = dict(title = 'Number of Incidents'),
              barmode='stack',
              showlegend=True)

fig = go.Figure(data=data, layout=layout)
iplot(fig, filename='grouped-bar-stacked')

**The stacked bar chart above shows the monthly number of stolen-property and arson incidents in San Francisco from 2005 to 2015.**

### 6) Heatmap

In [87]:
import datetime

In [20]:
data = []
z = []

for crime in crime_data['Category'].unique():
    df = crime_data.loc[crime_data['Category']==crime, ].sort_values('Date')
    z.append(list(df['IncidntNum']))
    
trace = go.Heatmap(
    z = z, 
    x = pd.to_datetime(crime_data['Date'].sort_values().unique()).to_pydatetime(),
    # One label per row of z, in the same order the loop above built the rows
    y = crime_data['Category'].unique(),
    colorscale='Viridis',
)
    
data = [trace]

layout = go.Layout(title = 'Monthly Property Crime in San Francisco from 2005 to 2015',
              xaxis = dict(title = 'Year'),
              yaxis = dict(title='Category',
                            titlefont=dict(size=14),
                            showticklabels=True,
                            tickangle=45,
                            tickfont=dict(size=8)),
              showlegend=True)

fig = go.Figure(data=data, layout=layout)
iplot(fig, filename='heatmap')

**The heat map above shows the monthly number of property-crime incidents in San Francisco from 2005 to 2015. Each crime category is shown in its own row.**
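
A heatmap needs the rows of `z` to line up with the `y` labels and the columns with the `x` values. One way to keep the three in step, sketched here on a small made-up table with the notebook's column names, is to take all of them from a single pivoted table. The names `toy`, `wide`, `y_labels`, and `x_dates` are illustrative and the numbers are invented.

```python
import pandas as pd

# Made-up long-format data with the same columns as crime_data.
toy = pd.DataFrame({
    'Date': pd.to_datetime(['2005-02-01', '2005-01-01', '2005-01-01', '2005-02-01']),
    'Category': ['ARSON', 'ARSON', 'BURGLARY', 'BURGLARY'],
    'IncidntNum': [12, 10, 500, 480],
})

# Rows are categories, columns are months.
wide = toy.pivot(index='Category', columns='Date', values='IncidntNum')

z = wide.values.tolist()      # one inner list per category
y_labels = list(wide.index)   # row labels, same order as z
x_dates = list(wide.columns)  # column labels, sorted by date
```

These three objects could then be passed as `z`, `y`, and `x` to `go.Heatmap`, so a label mismatch cannot arise from building them separately.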