# Plotly and Cufflinks

Plotly is a library for creating interactive plots that can be embedded in dashboards or websites, or saved as HTML files or static images. Cufflinks connects Plotly to pandas so that plots can be called directly from a DataFrame.

## Installation

To call plots directly off a pandas DataFrame, you'll need to install both plotly and cufflinks. This guide installs them with **pip**. Run the following at your command line/terminal:

    pip install plotly
    pip install cufflinks

**NOTE: If you have more than one installation of Python on your computer, make sure pip installs into the same one your notebook runs on; otherwise the imports below may fail.**
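
As a minimal sketch for checking which interpreter is in use and whether the two libraries can be imported from it (the helper name `is_installed` is made up for this example and is not part of either library):

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if this interpreter can find `module_name`."""
    return importlib.util.find_spec(module_name) is not None


# The interpreter running this code; pip has to install into the same one.
print(sys.executable)

for name in ("plotly", "cufflinks"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If a library shows up as missing, running `python -m pip install plotly cufflinks` instead of bare `pip` ties the installation to that specific interpreter.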

## Imports and Set-up

In [2]:
import pandas as pd
import numpy as np
%matplotlib inline

In [3]:
from plotly import __version__

# import modules to work with data visualization offline
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot

# ensure plotly version is newer than 1.9
print(__version__) # requires version >= 1.9.0

4.10.0
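
The version check above is done by eye. If you would rather check it in code, here is a small sketch, assuming plain dotted version strings; `version_tuple` and `MINIMUM` are hypothetical names used only for illustration. Comparing the raw strings is unreliable, so each part is converted to an integer first.

```python
import re


def version_tuple(version):
    """Convert a dotted version such as '4.10.0' into a tuple of ints."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


MINIMUM = (1, 9, 0)
installed = "4.10.0"  # the value printed above

print(version_tuple(installed) >= MINIMUM)  # True

# Plain string comparison goes character by character and gets this wrong:
print("1.10.0" >= "1.9.0")                  # False
print(version_tuple("1.10.0") >= MINIMUM)   # True
```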


In [4]:
# import the cufflinks library
import cufflinks as cf

In [5]:
# connect plotly's JavaScript to the notebook with init_notebook_mode()
init_notebook_mode(connected=True)

In [6]:
# configure cufflinks to work offline
cf.go_offline()

### Synthetic Data

In [7]:
df = pd.DataFrame(np.random.randn(100,4),columns='A B C D'.split())  # split() with no argument separates on whitespace
df.head(10)

Unnamed: 0,A,B,C,D
0,-1.813929,0.066075,-0.630102,-1.011188
1,-0.274329,-1.528108,0.43573,-2.316364
2,0.625302,-0.041633,-0.175123,-1.624289
3,-1.455487,-0.770648,-0.331721,-0.851712
4,3.132855,0.911796,1.086027,0.531461
5,0.583075,0.316033,1.55781,-0.805405
6,-1.002052,-0.352143,0.10921,1.036499
7,0.806417,-0.026186,0.479757,1.563519
8,0.581193,1.26445,0.686849,0.620682
9,-0.994444,-0.906819,0.813937,0.275637
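
Column labels built with `str.split` deserve a quick check, because a stray space becomes part of the label and later lookups such as `y='C'` will then fail. A short sketch of the difference, using only built-in string methods:

```python
raw = 'A, B, C, D'

# Splitting on ',' alone keeps the space that follows each comma.
with_spaces = raw.split(',')
print(with_spaces)  # ['A', ' B', ' C', ' D']

# Stripping each piece gives clean labels.
clean = [label.strip() for label in raw.split(',')]
print(clean)  # ['A', 'B', 'C', 'D']

# split() with no argument separates on any run of whitespace.
print('A B C D'.split())  # ['A', 'B', 'C', 'D']
```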


In [8]:
# build a second DataFrame from a dictionary
pp = {'Category':['A','B','C'],'Values':[32,43,50]}
df2 = pd.DataFrame(pp)

In [9]:
df2

Unnamed: 0,Category,Values
0,A,32
1,B,43
2,C,50


## Using Cufflinks and iplot()

The `kind` argument of `.iplot()` selects the plot type. The types covered here include:

* scatter
* bar
* box
* spread
* ratio
* heatmap
* surface
* histogram
* bubble

## Line plots

Use the **.iplot()** method to generate a line plot from the dataset. Clicking an element in the legend hides or shows the corresponding trace, which is pretty neat. Move the cursor to the top right of the plot to reveal the toolbar with the plot's other features. We can also zoom in on specific areas of the plot.

In [10]:
df

Unnamed: 0,A,B,C,D
0,-1.813929,0.066075,-0.630102,-1.011188
1,-0.274329,-1.528108,0.435730,-2.316364
2,0.625302,-0.041633,-0.175123,-1.624289
3,-1.455487,-0.770648,-0.331721,-0.851712
4,3.132855,0.911796,1.086027,0.531461
...,...,...,...,...
95,0.790519,0.414697,0.552252,-0.697892
96,-1.448122,-0.105886,-0.533866,-0.561039
97,1.054877,-0.961029,-0.763568,-0.738045
98,-0.688087,1.712704,-1.310234,0.252366


In [11]:
df['A'].iplot()
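
Since these plots can be saved as HTML files, here is a hedged sketch of one way to do that. It assumes cufflinks' `asFigure=True` option (which returns the figure instead of displaying it) and the `plot` function imported from `plotly.offline` earlier. The function name `save_line_plot` and the output path are made up for this example.

```python
def save_line_plot(frame, column, path):
    """Write an interactive line plot of one column to a standalone HTML file."""
    # Imported here so that defining the function does not require the libraries.
    import cufflinks  # noqa: F401  (importing it adds .iplot() to pandas objects)
    from plotly.offline import plot

    # asFigure=True returns the figure rather than rendering it in the notebook
    fig = frame[column].iplot(asFigure=True)
    # auto_open=False stops a browser tab from opening automatically
    return plot(fig, filename=path, auto_open=False)
```

With the DataFrame from this notebook, a call would look like `save_line_plot(df, 'A', 'line_a.html')`.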

## Scatter

Use the .iplot() method with the arguments kind (plot type), x (x-axis variable), and y (y-axis variable). The mode argument, set to 'markers', removes the connecting lines that plotly draws by default. The plot can be zoomed in or out as needed.

In [12]:
# kind sets the plot type
# x and y name the columns for the x-axis and y-axis (the names must match the column labels exactly)
# mode='markers' removes the connecting lines that plotly draws by default
# size sets the marker size
df.iplot(kind='scatter',x='A',y='C',mode='markers',size=10)

## Bar Plots

In [13]:
df2

Unnamed: 0,Category,Values
0,A,32
1,B,43
2,C,50


In [14]:
df2.iplot(kind='bar',x='Category',y='Values')

In [15]:
df

Unnamed: 0,A,B,C,D
0,-1.813929,0.066075,-0.630102,-1.011188
1,-0.274329,-1.528108,0.435730,-2.316364
2,0.625302,-0.041633,-0.175123,-1.624289
3,-1.455487,-0.770648,-0.331721,-0.851712
4,3.132855,0.911796,1.086027,0.531461
...,...,...,...,...
95,0.790519,0.414697,0.552252,-0.697892
96,-1.448122,-0.105886,-0.533866,-0.561039
97,1.054877,-0.961029,-0.763568,-0.738045
98,-0.688087,1.712704,-1.310234,0.252366


In [16]:
df.iplot(kind = 'bar')

In [17]:
# plot only a selected column
df['A'].iplot(kind = 'bar')

In [18]:
# aggregate methods return one value per column, so each bar represents a column
df.count().iplot(kind = 'bar') # count() method
df.sum().iplot(kind='bar') # sum() method

In [19]:
df.mean().iplot(kind = 'bar')
df.std().iplot(kind = 'bar')
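
To see why these aggregate bar charts have one bar per column, it helps to look at what the aggregates return. A small sketch with a made-up two-column DataFrame (`small` is not part of the notebook's data):

```python
import pandas as pd

small = pd.DataFrame({'A': [1.0, 2.0, 3.0], 'B': [4.0, 5.0, 9.0]})

# Each aggregate collapses the rows and returns a Series indexed by column name.
totals = small.sum()
print(type(totals).__name__)   # Series
print(totals.index.tolist())   # ['A', 'B']
print(totals.tolist())         # [6.0, 18.0]

print(small.mean().tolist())   # [2.0, 6.0]
print(small.count().tolist())  # [3, 3]
```

The Series index becomes the x-axis of the bar chart and its values become the bar heights.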

## Boxplots

In [20]:
df.iplot(kind='box')
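
A box plot summarizes each column by its quartiles. As a rough sketch of the numbers behind one box, computed with pandas on made-up values (plotly calculates its own quartiles, so the figures shown on hover may differ slightly):

```python
import pandas as pd

values = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9])

# The box spans the first to third quartile, with a line at the median.
q1, median, q3 = values.quantile([0.25, 0.5, 0.75])
print(q1, median, q3)  # 3.0 5.0 7.0

# The interquartile range is the height of the box.
print(q3 - q1)  # 4.0
```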

## 3D Surface

In [21]:
df3 = pd.DataFrame({'x':[1,2,3,4,5],'y':[10,20,30,20,10],'z':[10,4,3,2,1]})
df3

Unnamed: 0,x,y,z
0,1,10,10
1,2,20,4
2,3,30,3
3,4,20,2
4,5,10,1


In [22]:
# kind='surface' draws a 3D surface
# the optional colorscale argument changes the plot colors; see
# https://plot.ly/python/builtin-colorscales/ for built-in color scales
df3.iplot(kind='surface')
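
For surface plots, cufflinks appears to treat the DataFrame's values as a grid of heights, with the index and columns supplying the two horizontal axes. Under that assumption, a smoother surface can be built by evaluating a function on a grid. The sketch below uses only numpy and pandas, and the function `x**2 + y**2` is an arbitrary choice for illustration:

```python
import numpy as np
import pandas as pd

x = np.linspace(-2, 2, 5)   # 5 grid positions along one axis
y = np.linspace(-1, 1, 3)   # 3 grid positions along the other

# meshgrid returns two arrays of shape (len(y), len(x))
xx, yy = np.meshgrid(x, y)
heights = xx ** 2 + yy ** 2

# One row per y value, one column per x value
grid = pd.DataFrame(heights, index=y, columns=x)
print(grid.shape)  # (3, 5)

# In a notebook with cufflinks set up, the grid could then be plotted with:
# grid.iplot(kind='surface')
```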

## Histogram

In [23]:
df.head()

Unnamed: 0,A,B,C,D
0,-1.813929,0.066075,-0.630102,-1.011188
1,-0.274329,-1.528108,0.43573,-2.316364
2,0.625302,-0.041633,-0.175123,-1.624289
3,-1.455487,-0.770648,-0.331721,-0.851712
4,3.132855,0.911796,1.086027,0.531461


In [24]:
df['A'].iplot(kind='hist',bins=100)
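
The bins argument controls how finely the values are grouped. As an illustration of what binning does, here is a sketch using `numpy.histogram` on a seeded random sample. This is numpy's binning, not plotly's; plotly may treat the requested number of bins as an upper limit, so the counts in the interactive plot can differ.

```python
import numpy as np

rng = np.random.default_rng(0)
sample = rng.standard_normal(100)

# 10 equal-width bins between the smallest and largest value
counts, edges = np.histogram(sample, bins=10)
print(len(counts), len(edges))  # 10 11
print(counts.sum())             # 100
```

With 100 points, a setting such as bins=100 leaves very few points per bin, so a smaller number often gives a more readable shape.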

## Bubble plots

In [1]:
# size names the column whose values set the size of each bubble
df.iplot(kind='bubble',x='A',y='B',size='C')

## scatter_matrix()

Similar to seaborn's sns.pairplot(): it plots every numeric column of the DataFrame against every other column.

In [109]:
df.scatter_matrix()
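
To get a feel for how large the resulting grid is, here is a small sketch that enumerates the panels for the four columns used in this notebook. It only counts the pairs and assumes one panel per ordered pair of columns, as in a typical pair plot; it does not draw anything.

```python
from itertools import product

columns = ['A', 'B', 'C', 'D']

# One panel for every ordered pair of columns
panels = list(product(columns, repeat=2))
print(len(panels))  # 16

# Diagonal panels pair a column with itself
diagonal = [pair for pair in panels if pair[0] == pair[1]]
print(len(diagonal))  # 4
```

The number of panels grows with the square of the number of columns, so selecting a few columns first keeps the plot readable.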