## Plotly Tutorial
- Ref : https://towardsdatascience.com/the-next-level-of-data-visualization-in-python-dd6e99039d5e

## Overview
- Open-source library built on plotly.js, which in turn is built on d3.js
- Combines the efficiency of coding in Python with the interactive graphics capabilities of d3.js

## Extension / Wrapper
- cufflinks: a wrapper around plotly designed to work with pandas DataFrames

## Installation
- pip install plotly
- pip install cufflinks

## How to use (Version 3.10)

### Import

In [1]:
import numpy as np
import pandas as pd

In [44]:
import plotly.plotly as py  # plotly 3.x only; moved to chart_studio.plotly in plotly 4
import plotly.figure_factory as ff
import plotly.graph_objs as go
from plotly.offline import iplot, init_notebook_mode
import cufflinks

In [3]:
cufflinks.go_offline(connected=True)
init_notebook_mode(connected=True)

### Single Variable Distributions
- Reading the Parquet file requires pyarrow (conda install -c conda-forge pyarrow)

In [4]:
# Read data
df = pd.read_parquet('medium_data_2019_01_06')
df.head(3)

| Unnamed: 0 | claps | days_since_publication | fans | num_responses | publication | published_date | read_ratio | read_time | reads | started_date | ... | type | views | word_count | claps_per_word | editing_days | `<tag>Education` | `<tag>Data Science` | `<tag>Towards Data Science` | `<tag>Machine Learning` | `<tag>Python` |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 121 | 2 | 574.742788 | 2 | 0 | | 2017-06-10 14:25:00 | 41.98 | 7 | 68 | 2017-06-10 14:24:00 | ... | published | 162 | 1859 | 0.001076 | 0 | 0 | 0 | 0 | 0 | 0 |
| 122 | 18 | 567.424835 | 3 | 0 | | 2017-06-17 22:02:00 | 32.93 | 14 | 54 | 2017-06-17 22:02:00 | ... | published | 164 | 3891 | 0.004626 | 0 | 0 | 0 | 0 | 0 | 0 |
| 123 | 50 | 554.804959 | 19 | 0 | | 2017-06-30 12:55:00 | 20.19 | 42 | 215 | 2017-06-30 12:00:00 | ... | published | 1065 | 12025 | 0.004158 | 0 | 0 | 0 | 0 | 1 | 1 |


In [9]:
# Histogram
df['claps'].iplot(kind='hist', xTitle='claps',
                  yTitle='count', title='Claps Distribution')

In [20]:
# Overlaid histograms
df[['claps', 'fans']].iplot(
    kind='hist',
    histnorm='percent',
    barmode='overlay')
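
With `histnorm='percent'`, each bar shows the share of observations in its bin, not a raw count, which is what makes two columns with different scales comparable when overlaid. The normalization can be sketched with NumPy. The values and the choice of three bins are made up for illustration, and plotly chooses its own bins, so only the arithmetic carries over.

```python
import numpy as np

# Made-up sample; one large value creates a sparse upper bin.
values = np.array([1, 2, 2, 3, 3, 3, 10])

# Raw counts per bin (three equal-width bins between min and max).
counts, edges = np.histogram(values, bins=3)

# Percent normalization: each bin's share of all observations.
percent = 100 * counts / counts.sum()

for left, right, pct in zip(edges[:-1], edges[1:], percent):
    print(f"[{left:.0f}, {right:.0f}): {pct:.1f}%")
```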

In [30]:
# Bar Plot (Data is resampled according to month)
df2 = df[['views','reads','published_date']].\
         set_index('published_date').resample('M').mean()

df2.iplot(kind='bar', xTitle='Date', yTitle='Average',
          title='Monthly Average Views and Reads')
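
The monthly resampling step groups rows by the calendar month of `published_date` and averages each group. A small sketch on made-up numbers shows the aggregation. It uses `groupby` on a monthly period as a stand-in, because newer pandas releases prefer `'ME'` over `'M'` as the month-end alias in `resample`, and the period-based grouping should behave the same across versions. One difference worth knowing: `resample` also emits rows for months with no data, while this grouping does not.

```python
import pandas as pd

# Made-up rows: two articles in January 2019, one in February 2019.
toy = pd.DataFrame(
    {
        'views': [100, 300, 50],
        'reads': [40, 60, 10],
        'published_date': pd.to_datetime(
            ['2019-01-05', '2019-01-20', '2019-02-03']
        ),
    }
).set_index('published_date')

# Group by calendar month and take the mean of each column.
monthly = toy.groupby(toy.index.to_period('M')).mean()
print(monthly)
```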

In [34]:
# Box Plot
df.pivot(columns='publication', values='fans').fillna(0).iplot(
         kind='box',
         yTitle='fans',
         title='Fans Distribution by Publication')
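
The `pivot` call turns each publication into its own column, which is the wide layout the box plot expects. The sketch below uses a made-up three-row frame with hypothetical publications `A` and `B`. It also shows that `fillna(0)` inserts a zero for every row that belongs to a different publication, and those zeros become part of each box's statistics, so it may be worth checking whether leaving the missing values in place gives a more faithful plot.

```python
import pandas as pd

# Made-up data: two rows for publication A, one for B.
toy = pd.DataFrame({'publication': ['A', 'B', 'A'], 'fans': [3, 10, 5]})

# One column per publication; cells for other publications are NaN.
wide = toy.pivot(columns='publication', values='fans')
print(wide)

# After fillna(0) the gaps become zeros that count as observations.
filled = wide.fillna(0)
print(filled)
```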

In [40]:
# Scatterplots
tds = df[df['publication'] == 'Towards Data Science'].set_index('published_date')

tds[['claps', 'fans', 'title']].iplot(
    y='claps', mode='lines+markers', secondary_y = 'fans',
    secondary_y_title='Fans', xTitle='Date', yTitle='Claps',
    text='title', title='Fans and Claps over Time')

In [43]:
# Scatterplots (two variables + one category)
df.iplot(kind='scatter', x='read_time', y='read_ratio',
         categories='publication',
         xTitle='Read Time', yTitle='Reading Percent',
         title='Reading Percent vs Read Time by Publication')
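
The `categories` argument presumably splits the frame by that column and draws one trace per group, so each publication gets its own color and legend entry. A rough hand-written equivalent builds the traces as plain dictionaries, a form plotly also accepts. The data and the publication names `A` and `B` are made up, and this is a sketch of the idea, not the cufflinks implementation.

```python
import pandas as pd

# Made-up data with two hypothetical publications.
toy = pd.DataFrame({
    'publication': ['A', 'B', 'A', 'B'],
    'read_time': [5, 12, 8, 20],
    'read_ratio': [40.0, 22.5, 35.0, 15.0],
})

# One scatter trace per publication.
traces = [
    dict(type='scatter', mode='markers', name=name,
         x=group['read_time'].tolist(),
         y=group['read_ratio'].tolist())
    for name, group in toy.groupby('publication')
]

layout = dict(xaxis=dict(title='Read Time'),
              yaxis=dict(title='Reading Percent'))
figure = dict(data=traces, layout=layout)

# In a notebook with plotly installed: iplot(figure)
```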

In [51]:
# Scatter Matrix
figure = ff.create_scatterplotmatrix(df[['claps', 'publication', 'views', 'read_ratio','word_count']],
                                     diag='histogram', index='publication', height=1000, width=1000)
iplot(figure)

In [55]:
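# Correlation Heatmap
# Restrict to numeric columns; recent pandas versions raise on text columns in corr()
corrs = df.select_dtypes('number').corr()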

figure = ff.create_annotated_heatmap(z=corrs.values,
                                     x=list(corrs.columns),
                                     y=list(corrs.index),
                                     annotation_text=corrs.round(2).values,
                                     showscale=True)

iplot(figure)
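
The heatmap receives a square matrix of pairwise Pearson correlations plus a rounded copy used for the cell annotations. A small sketch on made-up columns shows the shape of both inputs. The values are chosen so that the correlations come out at exactly +1 and -1.

```python
import pandas as pd

# Made-up numeric columns: reads rises with views, word_count falls.
toy = pd.DataFrame({
    'views': [10, 20, 30, 40],
    'reads': [1, 2, 3, 4],
    'word_count': [400, 300, 200, 100],
})

# Square, symmetric matrix with ones on the diagonal.
corrs = toy.corr()

# Rounded copy, in the form passed as annotation_text.
labels = corrs.round(2).values
print(corrs)
```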