Plotly is a library for creating interactive plots that you can use in dashboards or websites (you can save them as HTML files or static images).<br>
Cufflinks is the library that connects plotly with pandas.

https://plotly.com

In [1]:
# !pip install cufflinks

In [2]:
# !pip install plotly

In [3]:
import pandas as pd
import numpy as np

In [4]:
from plotly import __version__
print(__version__)

4.3.0


In [5]:
import cufflinks as cf

In [6]:
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot

In [7]:
init_notebook_mode(connected=True)

In [8]:
cf.go_offline()

We need to make sure the plotly version is up to date so that everything works properly.<br>

Plotly, the company, also offers online hosting for your data visualizations, but since we’re using plotly as an open-source library, we’ll work offline.

To make everything work in the notebook, call init_notebook_mode(connected=True). This connects the JavaScript to your notebook: plotly essentially connects pandas and Python to an interactive JavaScript library, and this call lets your notebook access those visualizations.

## Random Data

In [9]:
df = pd.DataFrame(np.random.randn(100,4),columns=['A','B','C','D'])
df.head()

Unnamed: 0,A,B,C,D
0,-0.135757,-0.319707,-0.589489,-0.96098
1,0.816116,-0.915979,-0.185242,-1.16376
2,-0.218207,-0.258473,1.282925,0.730469
3,0.189067,1.484393,-0.711062,1.61157
4,0.080558,2.361988,0.127462,0.600993


In [10]:
df2 = pd.DataFrame({'Category':['A','B','C'], 'Values':[32,43,50]})
df2

Unnamed: 0,Category,Values
0,A,32
1,B,43
2,C,50


In [11]:
df.iplot()

What would normally be a static matplotlib plot is now an interactive plotly figure: hover over it and it shows the values at that particular index point. It’s the same plot, except now it’s interactive. We can zoom in, check values, and double-click to zoom back out.<br>
It has its own toolbar. We can save and edit the plot, download it as a PNG, pan around, zoom in and out, reset the axes, and choose between showing the closest data point on hover or comparing data on hover.
You can also click the column names in the legend to show or hide them.

In [12]:
## Scatter Plot

df.iplot(kind='scatter',x='A',y='B')

By default, plotly connects all the points with lines.
To get an actual scatter plot, pass mode='markers'.

In [13]:
df.iplot(kind='scatter',x='A',y='B',mode='markers')

In [14]:
## Bar Plot

df2.iplot(kind='bar',x='Category',y='Values')

Your data won’t always be as conveniently arranged as in the example above. In that case, call groupby or an aggregate function on your data to get it into a form that makes sense for a bar plot with iplot.
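
A minimal sketch of that aggregation step, using a made-up `sales` DataFrame (the name and the values are hypothetical, chosen only for illustration). The cufflinks call is left as a comment so the snippet runs with pandas alone.

```python
import pandas as pd

# Hypothetical raw data: several rows per category (values are made up).
sales = pd.DataFrame({'Category': ['A', 'B', 'A', 'C', 'B', 'A'],
                      'Values': [10, 20, 5, 7, 3, 1]})

# Collapse to one row per category so each bar has a single height.
summary = sales.groupby('Category', as_index=False)['Values'].sum()
print(summary)

# With cufflinks loaded as shown earlier, this could then be plotted with:
# summary.iplot(kind='bar', x='Category', y='Values')
```

After the groupby, `summary` has the same shape as `df2` above: one label column and one numeric column.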

In [15]:
df.sum().iplot(kind='bar')

In [16]:
## Box Plot

df.iplot(kind='box')

In [17]:
## 3D Surface Plot

df3 = pd.DataFrame({'x':[1,2,3,4,5],
                    'y':[10,20,30,20,10],
                    'z':[5,4,3,2,1]})
df3

Unnamed: 0,x,y,z
0,1,10,5
1,2,20,4
2,3,30,3
3,4,20,2
4,5,10,1


In [18]:
df3.iplot(kind='surface')

In [19]:
## Histogram Plot

df['A'].iplot(kind='hist',bins=50)

If you don’t specify a column, you’ll get a histogram in which all the columns overlap each other.
You can turn them on and off in the legend to compare them.

In [20]:
df.iplot(kind='hist',bins=50)

In [21]:
## Spread Plot

df[['A','B']].iplot(kind='spread')

We get a plot and a subplot.<br>
The first one is a line plot that shows the two columns against each other.<br>
The second one is a spread plot, which shows the spread between them.
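
To make the idea of a spread concrete, here is a small sketch that computes it by hand. It assumes the spread is the row-wise difference between the first and second column (A minus B), which is my understanding of what the subplot shows; the `demo` DataFrame and the fixed random seed are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Seeded random data so the example is reproducible (seed chosen arbitrarily).
rng = np.random.RandomState(0)
demo = pd.DataFrame(rng.randn(100, 2), columns=['A', 'B'])

# Assumed definition of the spread: row-wise difference between the columns.
spread = demo['A'] - demo['B']

# Summary statistics give a quick feel for how far apart the columns run.
print(spread.describe())
```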

In [22]:
## Bubble Plot

df.iplot(kind='bubble',x='A',y='B',size='C')

A bubble plot is very similar to a scatter plot, except that the size of each point changes based on another variable.<br>
We specify x and y as the two columns to compare, and size as the column whose values determine the bubble sizes.

You see this kind of plot for things like world GDP (Gross Domestic Product) compared with population, and perhaps a happiness factor.
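
Since a column such as C can contain negative numbers, the raw values cannot be used directly as marker sizes; they have to be mapped onto a positive range first. The sketch below shows one simple way to do that with min-max scaling. The helper `scale_sizes` and its default size range are hypothetical, and this is not presented as how cufflinks sizes bubbles internally.

```python
import numpy as np

def scale_sizes(values, min_size=10, max_size=50):
    """Linearly map values onto marker sizes between min_size and max_size."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        # All values are equal: give every point the same mid-range size.
        return np.full(values.shape, (min_size + max_size) / 2.0)
    return min_size + (values - values.min()) / span * (max_size - min_size)

# The smallest value gets the smallest bubble, the largest the biggest.
sizes = scale_sizes([-1.0, 0.0, 1.0])
print(sizes)
```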

In [23]:
## Scatter Matrix Plot

df.scatter_matrix()

## Geographical Plotting

Geographical plotting is usually challenging due to the various formats the data can come in.

In [24]:
import chart_studio.plotly as py
import plotly.graph_objs as go

In [25]:
data = dict(type='choropleth',
           locations=['AZ','CA','NY'],
           locationmode='USA-states',
           colorscale='Portland',
           text=['text1','text2','text3'],
           z=[1.0,2.0,3.0],
           colorbar={'title':'Colorbar title goes here'})

In [26]:
layout = dict(geo={'scope':'usa'})

In [27]:
choromap = go.Figure(data=[data],layout=layout)
iplot(choromap)

In [28]:
agri_df = pd.read_csv('/Users/yossiarviv/Desktop/Datasets/2011_US_AGRI_Exports.csv')
agri_df.head()

Unnamed: 0,code,state,category,total exports,beef,pork,poultry,dairy,fruits fresh,fruits proc,total fruits,veggies fresh,veggies proc,total veggies,corn,wheat,cotton,text
0,AL,Alabama,state,1390.63,34.4,10.6,481.0,4.06,8.0,17.1,25.11,5.5,8.9,14.33,34.9,70.0,317.61,Alabama<br>Beef 34.4 Dairy 4.06<br>Fruits 25.1...
1,AK,Alaska,state,13.31,0.2,0.1,0.0,0.19,0.0,0.0,0.0,0.6,1.0,1.56,0.0,0.0,0.0,Alaska<br>Beef 0.2 Dairy 0.19<br>Fruits 0.0 Ve...
2,AZ,Arizona,state,1463.17,71.3,17.9,0.0,105.48,19.3,41.0,60.27,147.5,239.4,386.91,7.3,48.7,423.95,Arizona<br>Beef 71.3 Dairy 105.48<br>Fruits 60...
3,AR,Arkansas,state,3586.02,53.2,29.4,562.9,3.53,2.2,4.7,6.88,4.4,7.1,11.45,69.5,114.5,665.44,Arkansas<br>Beef 53.2 Dairy 3.53<br>Fruits 6.8...
4,CA,California,state,16472.88,228.7,11.1,225.4,929.95,2791.8,5944.6,8736.4,803.2,1303.5,2106.79,34.6,249.3,1064.95,California<br>Beef 228.7 Dairy 929.95<br>Frui...


In [29]:
data = dict(type='choropleth',
           locations=agri_df['code'],
           locationmode='USA-states',
           colorscale='YlOrRd',
           text=agri_df['text'],
           z=agri_df['total exports'],
           marker=dict(line=dict(color='rgb(255,255,255)',width=2)),
           colorbar={'title':'Millions USD'})

In [30]:
layout = dict(title='2011 US Agriculture Exports by State',
             geo=dict(scope='usa',showlakes=True,lakecolor='rgb(85,173,240)'))

In [31]:
choromap2 = go.Figure(data=[data],layout=layout)
iplot(choromap2)

In [32]:
## Another Example

df = pd.read_csv('/Users/yossiarviv/Desktop/Datasets/2014_World_GDP.csv')
df.head()

Unnamed: 0,COUNTRY,GDP (BILLIONS),CODE
0,Afghanistan,21.71,AFG
1,Albania,13.4,ALB
2,Algeria,227.8,DZA
3,American Samoa,0.75,ASM
4,Andorra,4.8,AND


In [33]:
data = dict(type='choropleth',
           locations=df['CODE'],
           z=df['GDP (BILLIONS)'],
           text=df['COUNTRY'],
           colorscale='reds',
           colorbar={'title':'GDP in Billions USD'})

layout = dict(title='2014 World GDP',
             geo=dict(showframe=False, projection={'type':'equirectangular'}))

In [34]:
choromap3 = go.Figure(data=[data],layout=layout)
iplot(choromap3)