# Creating Plotly Charts Directly from a Pandas Data Frame

This example is based on Plotly's documentation [here](https://plot.ly/python/offline/). You need to install cufflinks in order to make Plotly plots directly from pandas data frames.

In [2]:
import pandas as pd
from plotly.offline import download_plotlyjs, init_notebook_mode, iplot
import numpy as np
import cufflinks as cf
init_notebook_mode()


**A data frame made up of 5 sets of 10 randomly generated numbers.**

In [3]:
df = pd.DataFrame(np.random.rand(10, 5), columns=['A', 'B', 'C', 'D', 'E'])
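
Because `np.random.rand` draws fresh values on every run, the numbers you get will differ from the ones shown in this notebook. A minimal sketch of a reproducible variant, assuming NumPy's `default_rng` generator; the seed `42` and the name `df_seeded` are illustrative choices and are not part of the original notebook:

```python
import numpy as np
import pandas as pd

# Hypothetical seed; any fixed integer makes the draw repeatable.
rng = np.random.default_rng(42)

# Same shape as the notebook's frame: 10 rows, 5 columns, values in [0, 1).
df_seeded = pd.DataFrame(rng.random((10, 5)), columns=['A', 'B', 'C', 'D', 'E'])

# Re-creating the generator with the same seed reproduces the same frame.
df_again = pd.DataFrame(np.random.default_rng(42).random((10, 5)),
                        columns=['A', 'B', 'C', 'D', 'E'])
print(df_seeded.equals(df_again))
```

The rest of the notebook works the same way with a seeded frame; only the plotted values change.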

In [4]:
df

|   | A | B | C | D | E |
|---|---|---|---|---|---|
| 0 | 0.845652 | 0.318261 | 0.547856 | 0.563489 | 0.941024 |
| 1 | 0.68077 | 0.052165 | 0.973934 | 0.660124 | 0.324227 |
| 2 | 0.388201 | 0.907616 | 0.046748 | 0.616809 | 0.422754 |
| 3 | 0.057573 | 0.276261 | 0.245472 | 0.996926 | 0.044496 |
| 4 | 0.361129 | 0.596063 | 0.118536 | 0.762695 | 0.294441 |
| 5 | 0.186998 | 0.619209 | 0.834297 | 0.660311 | 0.898558 |
| 6 | 0.600756 | 0.894294 | 0.815138 | 0.655066 | 0.513308 |
| 7 | 0.125269 | 0.689678 | 0.799743 | 0.477832 | 0.316927 |
| 8 | 0.648589 | 0.660146 | 0.576865 | 0.730969 | 0.428077 |
| 9 | 0.054871 | 0.431081 | 0.928726 | 0.920856 | 0.910254 |


## Box Plot Example

In [5]:
iplot(df.iplot(asFigure=True, kind='box', title='Box Plot Example', dimensions=(600,500)), show_link=False)

## Histogram Example

In [6]:
from bokeh.sampledata.autompg import autompg as df  # sample automotive data

In [7]:
df.head()

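|   | mpg | cyl | displ | hp | weight | accel | yr | origin | name |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 18.0 | 8 | 307.0 | 130 | 3504 | 12.0 | 70 | 1 | chevrolet chevelle malibu |
| 1 | 15.0 | 8 | 350.0 | 165 | 3693 | 11.5 | 70 | 1 | buick skylark 320 |
| 2 | 18.0 | 8 | 318.0 | 150 | 3436 | 11.0 | 70 | 1 | plymouth satellite |
| 3 | 16.0 | 8 | 304.0 | 150 | 3433 | 12.0 | 70 | 1 | amc rebel sst |
| 4 | 17.0 | 8 | 302.0 | 140 | 3449 | 10.5 | 70 | 1 | ford torino |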


In [8]:
iplot(df.mpg.iplot(asFigure=True, kind='histogram', title='MPG Histogram of All Vehicles',
                   dimensions=(600,400)), show_link=False)

**OK, so I got an MPG histogram for all vehicles, but what if I want to make multiple histograms by engine cylinder?**

**Plotly expects each data set or series to be in its own column. In this case, the data isn't set up that way: it is in what we call "long" format, with one row per vehicle and the cylinder count stored in the `cyl` column. We need to convert the data from long format to wide format, which we can do with the pandas `pivot()` method.**

In [10]:
pivoted = df.pivot(columns='cyl', values='mpg')

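Here's what the data frame looks like in wide format. Each row holds a value only in the column for its own cylinder count; every other cell is missing (NaN).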

In [11]:
pivoted.head(20)

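| cyl | 3 | 4 | 5 | 6 | 8 |
|---|---|---|---|---|---|
| 0 | NaN | NaN | NaN | NaN | 18.0 |
| 1 | NaN | NaN | NaN | NaN | 15.0 |
| 2 | NaN | NaN | NaN | NaN | 18.0 |
| 3 | NaN | NaN | NaN | NaN | 16.0 |
| 4 | NaN | NaN | NaN | NaN | 17.0 |
| 5 | NaN | NaN | NaN | NaN | 15.0 |
| 6 | NaN | NaN | NaN | NaN | 14.0 |
| 7 | NaN | NaN | NaN | NaN | 14.0 |
| 8 | NaN | NaN | NaN | NaN | 14.0 |
| 9 | NaN | NaN | NaN | NaN | 15.0 |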
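
The long-to-wide conversion is easier to see on a tiny frame. The sketch below uses made-up values and the hypothetical names `long_df` and `wide_df`; it is not the autompg data, only an illustration of what `pivot(columns=..., values=...)` does when no `index` is given:

```python
import pandas as pd

# Toy "long" data: one row per vehicle, cylinder count stored in a column.
# The values are invented for illustration.
long_df = pd.DataFrame({
    'cyl': [4, 8, 4, 6, 8],
    'mpg': [30.0, 15.0, 28.0, 21.0, 14.0],
})

# Without an explicit index, pivot() keeps the existing row index and
# spreads the 'mpg' values across one column per distinct 'cyl' value.
wide_df = long_df.pivot(columns='cyl', values='mpg')

print(wide_df)

# Each column can now be treated as its own series; dropping the NaNs
# recovers the mpg values for that cylinder count.
print(wide_df[4].dropna().tolist())
```

Every row of `wide_df` has exactly one non-missing cell, which is why the wide table above is mostly NaN. When the histogram is drawn per column, only the values present in each column contribute to that column's histogram.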


In [12]:
iplot(pivoted.iplot(asFigure=True, kind='histogram', title='MPG Histogram by Cylinder',
                    dimensions=(600,400)), show_link=False)

## Line Example

In [13]:
df = pd.DataFrame(np.random.randn(1000, 2), columns=['A', 'B']).cumsum()

In [14]:
df.head()

|   | A | B |
|---|---|---|
| 0 | -1.47385 | -1.800103 |
| 1 | -2.669346 | 0.337395 |
| 2 | -3.595605 | -0.399725 |
| 3 | -4.648469 | -2.077753 |
| 4 | -3.865445 | -1.462799 |


In [15]:
iplot(df.iplot(asFigure=True, kind='line', title='Plotly Line Example', dimensions=(600,400)),
     show_link=False)

## Line Fill Example

In [16]:
from pandas_datareader import data
import pandas as pd
from datetime import datetime

start = datetime(2016, 1, 1)
end = datetime(2016, 4, 18)
df_hmc = data.get_data_yahoo("HMC", start, end)
df_tm = data.get_data_yahoo("TM", start, end)
df_f = data.get_data_yahoo("F", start, end)
df_hymtf = data.get_data_yahoo('HYMTF', start, end)

df = pd.DataFrame({'Honda Motor Co': df_hmc.Open, 'Toyota Motor': df_tm.Open, 
                   'Ford': df_f.Open, 'Hyundai': df_hymtf.Open})

iplot(df.iplot(asFigure=True, kind='line', fill=True, title='Plotly Line w/ Fill Example', dimensions=(600,400)),
     show_link=False)

ConnectionError: HTTPConnectionPool(host='ichart.finance.yahoo.com', port=80): Max retries exceeded with url: /table.csv?s=HMC&a=0&b=1&c=2016&d=3&e=18&f=2016&g=d&ignore=.csv (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff1eb144a58>: Failed to establish a new connection: [Errno -2] Name or service not known',))
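
The cell above failed because the Yahoo host could not be reached, so no chart was produced. To try the fill example without a network connection, one option is a stand-in data frame with the same column names. The sketch below is an assumption of mine, not the notebook's data: the prices are synthetic random walks, and the name `df_synthetic`, the seed, the starting level of 50, and the step size are all illustrative.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the four price series. NOT real market data.
rng = np.random.default_rng(0)  # hypothetical seed

# Weekdays in the same date range the notebook requested
# (freq='B' skips weekends but does not account for market holidays).
dates = pd.date_range('2016-01-01', '2016-04-18', freq='B')

names = ['Honda Motor Co', 'Toyota Motor', 'Ford', 'Hyundai']

# Random daily steps, accumulated into a walk around an arbitrary level of 50.
steps = rng.normal(loc=0.0, scale=0.5, size=(len(dates), len(names)))
df_synthetic = pd.DataFrame(50 + steps.cumsum(axis=0), index=dates, columns=names)

print(df_synthetic.head())
```

Passing `df_synthetic` to the same `iplot(... kind='line', fill=True ...)` call in place of `df` should exercise the fill option, though the resulting chart says nothing about the actual stocks.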

## Bubble Chart Example

In [17]:
import pandas as pd
from plotly.offline import download_plotlyjs, init_notebook_mode, iplot
import numpy as np
import cufflinks as cf
init_notebook_mode()

df = pd.read_csv('http://www.stat.ubc.ca/~jenny/notOcto/STAT545A/examples/gapminder/data/gapminderDataFiveYear.txt', sep='\t')
df2007 = df[df.year==2007]


In [18]:
df2007.head()

|   | country | year | pop | continent | lifeExp | gdpPercap |
|---|---|---|---|---|---|---|
| 11 | Afghanistan | 2007 | 31889923.0 | Asia | 43.828 | 974.580338 |
| 23 | Albania | 2007 | 3600523.0 | Europe | 76.423 | 5937.029526 |
| 35 | Algeria | 2007 | 33333216.0 | Africa | 72.301 | 6223.367465 |
| 47 | Angola | 2007 | 12420476.0 | Africa | 42.731 | 4797.231267 |
| 59 | Argentina | 2007 | 40301927.0 | Americas | 75.32 | 12779.37964 |


In [19]:
# Set categories='<column name>' to the column that the bubble colors should correspond to
iplot(df2007.iplot(asFigure=True, kind='bubble', x='gdpPercap', y='lifeExp', size='pop', text='country',
             categories='country', legend=False, xTitle='GDP per Capita', colors='blue', 
             yTitle='Life Expectancy', title='Plotly Bubble Chart Example', dimensions=(800,600)),
             show_link=False)

## Subplot Example

In [20]:
import pandas as pd
from plotly.offline import download_plotlyjs, init_notebook_mode, iplot
import numpy as np
import cufflinks as cf
init_notebook_mode()

df = pd.DataFrame(np.random.randn(1000, 6), columns=['A','B','C','D','E','F']).cumsum()

iplot(df.iplot(asFigure=True, kind='lines', subplots=True, subplot_titles=True, legend=False,
                    title='Plotly Subplot Example', dimensions=(700,600)), show_link=False)
