___

<a href='https://github.com/ai-vithink'> <img src='https://avatars1.githubusercontent.com/u/41588940?s=200&v=4' /></a>
___

# Plotly and Cufflinks

Plotly is a library that allows you to create interactive plots that you can use in dashboards or websites (you can save them as html files or static images).

## Installation

To call plots directly off a pandas DataFrame, you need to install both plotly and cufflinks. When this notebook was written, these libraries were not available through **conda**, so install them with **pip** at your command line/terminal:

    pip install plotly
    pip install cufflinks

**NOTE: Make sure you only have one installation of Python on your computer when you do this, otherwise the installation may not work.**
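If you are unsure whether the install landed in the Python environment your notebook is using, a quick check like the one below can help. This is a minimal sketch using only the standard library; the helper name `installed_version` is made up for illustration.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report what the current interpreter can see.
for name in ("plotly", "cufflinks"):
    version = installed_version(name)
    print(f"{name}: {version if version else 'not installed'}")
```

Run it inside the notebook itself, so that it reports on the same interpreter the notebook kernel uses.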

## Imports and Set-up

In [None]:
import pandas as pd
import numpy as np
%matplotlib inline

In [None]:
from IPython.display import HTML
HTML('''<script>
code_show_err=false; 
function code_toggle_err() {
 if (code_show_err){
 $('div.output_stderr').hide();
 } else {
 $('div.output_stderr').show();
 }
 code_show_err = !code_show_err
} 
$( document ).ready(code_toggle_err);
</script>
To toggle on/off output_stderr, click <a href="javascript:code_toggle_err()">here</a>.''')
# To hide warnings, which won't change the desired outcome.

In [None]:
%%HTML
<style type="text/css">
/* Draw visible borders (gridlines) around DataFrame cells */
table.dataframe td, table.dataframe th {
    border: 3px black solid !important;
    color: black !important;
}
</style>

In [None]:
import warnings
warnings.filterwarnings("ignore")
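Ignoring every warning globally is convenient in a demo notebook, but it can also hide messages you would want to see. As an alternative, here is a minimal sketch that silences one warning category for a single block of code only. The function `noisy` is a made-up stand-in for any call that emits a warning.

```python
import warnings


def noisy():
    """Hypothetical function that emits a DeprecationWarning."""
    warnings.warn("this call is deprecated", DeprecationWarning)
    return 42


# Warnings are suppressed only inside this block; the previous
# filter settings are restored when the block exits.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=DeprecationWarning)
    result = noisy()

print(result)
```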


In [None]:
from plotly import __version__
print(__version__)  # requires version >= 1.9.0

from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot

In [None]:
import cufflinks as cf

In [None]:
# For Notebooks
init_notebook_mode(connected=True)
# Connects the Plotly JavaScript library to the notebook. Plotly essentially links pandas and Python to an interactive JS library.

In [None]:
cf.go_offline() # Used as a method to allow cufflinks to run offline.

### Fake Data

In [None]:
df = pd.DataFrame(np.random.randn(100,4),columns='A B C D'.split()) 
# 100 rows and 4 columns, and columns are A B C D
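Because the data is random, every run of the notebook produces different plots. If you want the same numbers each time, one option is a seeded generator, as in the sketch below. The seed value and the name `df_seeded` are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

# A seeded generator returns the same "random" numbers on every run.
rng = np.random.default_rng(seed=42)  # 42 is an arbitrary choice

# Same layout as df above: 100 rows, columns A B C D.
df_seeded = pd.DataFrame(rng.standard_normal((100, 4)), columns=list("ABCD"))
print(df_seeded.shape)
```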

In [None]:
df.head()

In [None]:
df2 = pd.DataFrame({'Category':['A','B','C'],'Values':[32,43,50]})

In [None]:
df2

## Using Cufflinks and iplot()

The `kind` argument of `iplot()` selects the type of plot. The kinds include:

* scatter
* bar
* box
* spread
* ratio
* heatmap
* surface
* histogram
* bubble

In [None]:
# For comparison, start with the standard static pandas plot
df.plot()

In [None]:
# Now call iplot to get the interactive version of the same plot.
df.iplot()
# The data is much easier to read and explore because the result is an interactive plotly figure.
# Drag a rectangle over an area to zoom in, and double click to zoom back out.
# You can also hover, pan, download a static png image, and click on A, B, C or D in the legend to show or hide a column.

* Scatter, bar, heatmap, box and ratio are among the kinds of plots we can make with plotly.

## Scatter Plot

In [None]:
df.iplot(kind='scatter',x='A',y = 'B')
# To switch the kind of plot, pass kind='<plot type>'. A scatter plot also needs the x and y columns.

In [None]:
# The plot above looks like a tangle because, by default, plotly connects the points with lines.
# Pass mode='markers' to draw individual markers instead.
df.iplot(kind='scatter',x='A',y = 'B',mode='markers',size=6)
# This plot is interactive too, so you can zoom, pan and hover as before.
# For a scatter plot: pass x and y as column names, set kind='scatter', then set mode='markers' and a marker size.

## Bar Plot

In [None]:
df2.iplot(kind='bar',x= 'Category',y='Values') # Specify x

* Data will not always come with a convenient column of categorical values. In that case, apply a group by or an aggregate function first, so the data is in a form where a bar plot via iplot makes sense.
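As a rough illustration of that idea, the sketch below aggregates a small long-format table down to one row per category before plotting. The `sales` data and the name `totals` are made up for this example.

```python
import pandas as pd

# Hypothetical long-format data: several rows per category.
sales = pd.DataFrame({
    "Category": ["A", "B", "A", "C", "B", "A"],
    "Values": [10, 20, 5, 7, 3, 1],
})

# Sum the values within each category: one row per category afterwards.
totals = sales.groupby("Category", as_index=False)["Values"].sum()
print(totals)

# In a notebook with cufflinks set up as above, this frame suits a bar plot:
# totals.iplot(kind='bar', x='Category', y='Values')
```

After the aggregation, `totals` has the same shape as `df2`: one categorical column and one numeric column.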

In [None]:
df.head()
# Calling a bar plot directly on this is not useful: we would get a meaningless bar for every single data point.

In [None]:
df.iplot(kind='bar')

In [None]:
# The bar plot above is hard to interpret, so we apply an aggregate function before plotting.
# For example, count the number of values in each column:
df.count().iplot(kind='bar')
# Every bar has height 100, because each column holds 100 non-null values.

In [None]:
# We can use other aggregates too, such as sum, which gives the total of all values in each column.
df.sum().iplot(kind='bar')

In [None]:
# Bar plots with iplot become really powerful once you call an aggregate or group by function on the dataframe first.

## Box Plot

In [None]:
# Automatically makes a box plot for each column. To limit the columns, select them first (e.g. df[['A','B']])
# or toggle them in the interactive legend provided by plotly.
df.iplot(kind='box')

## 3d Surface

In [None]:
# Making a new df for 3d surface
df3 = pd.DataFrame({'x':[1,2,3,4,5],'y':[10,20,30,20,10],'z':[500,400,300,200,100]})

In [None]:
df3
# Three columns of values (x, y, z) that can be drawn as a 3d surface plot.

In [None]:
df3.iplot(kind='surface',colorscale='rdylbu')
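The five-row example gives a fairly coarse surface. For a smoother one, you can build a grid of heights with NumPy and wrap it in a DataFrame. The sketch below assumes that the surface kind treats every cell of the DataFrame as a height, with the index and columns supplying the two horizontal axes. The function z = x² + y² and the name `grid` are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

# Coordinates along each horizontal axis.
x = np.linspace(-2, 2, 5)
y = np.linspace(-2, 2, 5)

# Evaluate the height z = x^2 + y^2 at every (x, y) pair on the grid.
xx, yy = np.meshgrid(x, y)
zz = xx**2 + yy**2

# One row per y value, one column per x value, heights as cell values.
grid = pd.DataFrame(zz, index=y, columns=x)
print(grid)

# In a notebook with cufflinks set up as above:
# grid.iplot(kind='surface', colorscale='rdylbu')
```

Increasing the number of points passed to `np.linspace` gives a finer mesh.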

## Histogram

In [None]:
df['A'].iplot(kind='hist',xTitle='X axis',yTitle = 'Y axis',title='Histogram',bins=50,theme='henanigans')

In [None]:
# List all the themes you can apply to your visualisations.
cf.getThemes()

In [None]:
# Passing the entire dataframe gives overlapping histograms of all the columns. Turn them on/off in the legend as you please.
df.iplot(kind='hist',xTitle='X axis',yTitle = 'Y axis',title='Histogram',bins=50,theme='henanigans')
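To see what a histogram counts, it can help to compute the bins by hand. The sketch below uses `np.histogram` on seeded random data. It is a NumPy illustration of binning, not what cufflinks runs internally, and plotly may choose its own bin edges, so the interactive chart can differ in detail.

```python
import numpy as np

# 100 seeded random values, standing in for df['A'].
rng = np.random.default_rng(0)
values = rng.standard_normal(100)

# Split the value range into 50 equal-width bins and count values per bin.
counts, edges = np.histogram(values, bins=50)

print(len(counts))   # number of bins
print(len(edges))    # one more edge than bins
print(counts.sum())  # every value falls into exactly one bin
```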

## Spread

In [None]:
# Spread plots are used a lot for stock data, to compare two stocks. Here we use two columns of random values instead.
df[['A','B']].head()

In [None]:
df[['A','B']].iplot(kind='spread')
# The output has two panels. The upper panel is a line plot comparing the two columns against each other.
# The lower panel is the spread plot, which shows the spread between columns A and B.
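The spread panel can be read as the difference between the two columns at each row. A minimal sketch of that calculation in plain pandas follows, assuming the spread is defined as first column minus second column. The `pair` data is made up for illustration.

```python
import pandas as pd

# Hypothetical values for two series, e.g. two stock prices.
pair = pd.DataFrame({"A": [1.0, 2.5, 3.0], "B": [0.5, 3.0, 1.0]})

# Row-by-row difference: positive where A is above B, negative where it is below.
spread = pair["A"] - pair["B"]
print(spread.tolist())  # [0.5, -0.5, 2.0]
```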

## Bubble Plot

In [None]:
# Very similar to a scatter plot, except that the size of each point depends on another variable.
df.iplot(kind='bubble',x='A',y='B',size='C')
# Typical uses: world GDP, population or happiness-factor comparisons, e.g. comparing the populations of
# countries A and B relative to the population of China.
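Column C contains negative numbers, yet a marker cannot have a negative size, so the size column has to be rescaled before drawing. The sketch below shows one simple way to do that, min-max scaling. It is an illustration only, not cufflinks' internal code; the function `scale_sizes` and its default size range are hypothetical.

```python
import pandas as pd


def scale_sizes(values, smallest=10.0, largest=50.0):
    """Linearly map `values` onto marker sizes between `smallest` and `largest`."""
    low, high = values.min(), values.max()
    if high == low:
        # All values are equal: give every marker the same mid-range size.
        return pd.Series((smallest + largest) / 2, index=values.index)
    return smallest + (values - low) / (high - low) * (largest - smallest)


c = pd.Series([-1.0, 0.0, 1.0])
print(scale_sizes(c).tolist())  # [10.0, 30.0, 50.0]
```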

## Scatter Matrix Plot

In [None]:
# Similar to seaborn's pairplot: it creates a scatter matrix of all the columns. Make sure all columns are numerical.
df.scatter_matrix(theme='solar')
# With a lot of points, a scatter matrix can take a while to load and may even crash your Python kernel.
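One way to keep a scatter matrix responsive on a large dataset is to plot a random sample of the rows instead of all of them. The sketch below shows the sampling step in plain pandas. The frame `big` and the cap `max_points` are made-up values for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical large frame: 100,000 rows of random numbers.
rng = np.random.default_rng(1)
big = pd.DataFrame(rng.standard_normal((100_000, 4)), columns=list("ABCD"))

# Keep at most max_points rows; random_state makes the sample repeatable.
max_points = 1_000
sample = big.sample(n=min(len(big), max_points), random_state=0)
print(len(sample))

# In a notebook with cufflinks set up as above:
# sample.scatter_matrix(theme='solar')
```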

## More Info and Documentation

* [Cufflinks GitHub Page](https://github.com/santosjorge/cufflinks)
    * Check out Chart Gallery, Tutorials, Offline, Pandas Like and Plotly notebooks to find info on Area Plots, Scatter Plots and more.
* Cufflinks can also do technical analysis, which is still in beta. Check the repo for more info. It gives information such as averages and correlation between plots. Don't be intimidated by it if you are not interested in financial analysis.
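To give a feel for the kind of quantities mentioned above, here is a minimal sketch of a moving average and a correlation in plain pandas. It does not use the cufflinks technical analysis features; the `prices` data and the window of 3 are made up for illustration.

```python
import pandas as pd

# Hypothetical price series for two stocks.
prices = pd.DataFrame({
    "A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "B": [2.0, 4.0, 6.0, 8.0, 10.0],
})

# Moving average of A over a 3-row window (the first two rows have no full window).
rolling_mean = prices["A"].rolling(window=3).mean()
print(rolling_mean)

# Pearson correlation between the two series.
correlation = prices["A"].corr(prices["B"])
print(correlation)
```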

**GREAT JOB :)**