# Plotly and Cufflinks

Plotly is a library for creating interactive plots that you can embed in dashboards or websites. Figures can be saved as HTML files or as static images.

# Installation

To follow along, you'll need to install plotly, plus cufflinks, which lets you call plots directly off of a pandas dataframe. Both libraries are available through **pip**. Install them at your command line/terminal using:

    pip install plotly
    pip install cufflinks

In [40]:
# imports for EDA and plotting

import pandas as pd
import numpy as np
import calendar
# chart_studio is a separate package (pip install chart-studio); it is only needed for online (Chart Studio) plotting
import chart_studio.plotly as py
from datetime import date
# Plotly Graph Objects is the lower-level plotting API in Plotly.
# It is more comprehensive than Plotly Express and gives you more control over the appearance and behavior of your charts.
import plotly.graph_objs as go
# Plotly Express is a high-level plotting API that makes it easy to create interactive charts and graphs.
# It's built on top of Plotly Graph Objects, but it provides a simpler interface that is easier to get started with.
import plotly.express as px
# cufflinks connects plotly with pandas so that charts can be created directly from dataframes.
import cufflinks as cf

from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
from plotly import __version__
print(__version__)  # requires version >= 1.9.0

# %matplotlib inline

5.18.0


In [41]:
init_notebook_mode(connected=True)  # for notebooks; connected=True loads plotly.js from a CDN
# For offline use: render cufflinks plots locally
cf.go_offline()

In [42]:
df = pd.read_csv(r'C:\Users\kvjai\OneDrive\Desktop\Beyond_Binary-1\ml_prodinno\data\starbucks.csv')
df.shape
df.head()

Unnamed: 0,Beverage_category,Beverage,Beverage_prep,Calories,Total Fat (g),Trans Fat (g),Saturated Fat (g),Sodium (mg),Total Carbohydrates (g),Cholesterol (mg),Dietary Fibre (g),Sugars (g),Protein (g),Vitamin A (% DV),Vitamin C (% DV),Calcium (% DV),Iron (% DV),Caffeine (mg)
0,Coffee,Brewed Coffee,Short,3,0.1,0.0,0.0,0,5,0,0,0,0.3,0%,0%,0%,0%,175
1,Coffee,Brewed Coffee,Tall,4,0.1,0.0,0.0,0,10,0,0,0,0.5,0%,0%,0%,0%,260
2,Coffee,Brewed Coffee,Grande,5,0.1,0.0,0.0,0,10,0,0,0,1.0,0%,0%,0%,0%,330
3,Coffee,Brewed Coffee,Venti,5,0.1,0.0,0.0,0,10,0,0,0,1.0,0%,0%,2%,0%,410
4,Classic Espresso Drinks,Caffè Latte,Short Nonfat Milk,70,0.1,0.1,0.0,5,75,10,0,9,6.0,10%,0%,20%,0%,75


# Scatter plot

In [43]:
df.iplot(kind='scatter',x='Beverage_category',y='Calories',mode='markers',size=10, colors='purple')

# Bar plot

In [44]:
df.iplot(kind='bar', x='Beverage', y='Calories', colors='blue', title='Calories per Beverage')

# Histogram

In [46]:
px.histogram(df, x='Beverage_prep', color="Beverage_prep").update_xaxes(categoryorder='total ascending')







# Scatter plot with Plotly Express

In [72]:
df = px.data.iris()

print(df.head())

fig = px.scatter(df, x="sepal_width", y="sepal_length", color='petal_length')
fig.show()

   sepal_length  sepal_width  petal_length  petal_width species  species_id
0           5.1          3.5           1.4          0.2  setosa           1
1           4.9          3.0           1.4          0.2  setosa           1
2           4.7          3.2           1.3          0.2  setosa           1
3           4.6          3.1           1.5          0.2  setosa           1
4           5.0          3.6           1.4          0.2  setosa           1


# Box plot

In [75]:
# on the Iris dataset, restricted to its four numeric measurement columns

df[['sepal_length', 'sepal_width', 'petal_length', 'petal_width']].iplot(kind='box')

# 3D plotting

In [27]:
df3 = pd.DataFrame({'x':[1,2,3,4,5],'y':[10,20,30,20,10],'z':[5,4,3,2,1]})
df3.iplot(kind='surface',colorscale='rdylbu')

In [57]:
z_data = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/api_docs/mt_bruno_elevation.csv')

#print(z_data.head(2))

fig = go.Figure(data=[go.Surface(z=z_data.values)])

fig.update_traces(contours_z=dict(show=True, usecolormap=True,
                                  highlightcolor="limegreen", project_z=True))
fig.update_layout(title='Mt Bruno Elevation', autosize=False,
                  scene_camera_eye=dict(x=1.87, y=0.88, z=-0.64),
                  width=700, height=700,
                  margin=dict(l=65, r=50, b=65, t=90)
)

fig.show()

In [76]:
# graphing equations

x = np.outer(np.linspace(-10, 10, 100), np.ones(100)) 
y = x.copy().T 
z = x**2 + y**2
  
fig = go.Figure(data=[go.Surface(x=x, y=y, z=z)]) 
fig.update_layout(title='Graphing equation', autosize=False,
                 width=1000, height=700,)

  
fig.show() 
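
The cell above builds its coordinate grid with `np.outer`. As a small sketch of what those arrays contain, the code below rebuilds the same grid with `np.meshgrid` and checks that the two constructions agree. The variable names are chosen for illustration.

```python
import numpy as np

axis = np.linspace(-10, 10, 100)

# Construction used in the cell above
x_outer = np.outer(axis, np.ones(100))  # x_outer[i, j] == axis[i]
y_outer = x_outer.copy().T              # y_outer[i, j] == axis[j]

# Equivalent grid from np.meshgrid with matrix ("ij") indexing
x_grid, y_grid = np.meshgrid(axis, axis, indexing="ij")

# Paraboloid z = x^2 + y^2: one height per grid point
z = x_grid**2 + y_grid**2

print(np.array_equal(x_outer, x_grid), np.array_equal(y_outer, y_grid), z.shape)
```

Any other function of `x` and `y` can be graphed the same way by changing the expression for `z`.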

In [59]:
# multiple surfaces

z1 = np.array([
    [8.83,8.89,8.81,8.87,8.9,8.87],
    [8.89,8.94,8.85,8.94,8.96,8.92],
    [8.84,8.9,8.82,8.92,8.93,8.91],
    [8.79,8.85,8.79,8.9,8.94,8.92],
    [8.79,8.88,8.81,8.9,8.95,8.92],
    [8.8,8.82,8.78,8.91,8.94,8.92],
    [8.75,8.78,8.77,8.91,8.95,8.92],
    [8.8,8.8,8.77,8.91,8.95,8.94],
    [8.74,8.81,8.76,8.93,8.98,8.99],
    [8.89,8.99,8.92,9.1,9.13,9.11],
    [8.97,8.97,8.91,9.09,9.11,9.11],
    [9.04,9.08,9.05,9.25,9.28,9.27],
    [9,9.01,9,9.2,9.23,9.2],
    [8.99,8.99,8.98,9.18,9.2,9.19],
    [8.93,8.97,8.97,9.18,9.2,9.18]
])

z2 = z1 + 1
z3 = z1 - 1

fig = go.Figure(data=[
    go.Surface(z=z1),
    go.Surface(z=z2, showscale=False, opacity=0.9),
    go.Surface(z=z3, showscale=False, opacity=0.9)

])

fig.update_layout(title='Multiple surfaces', autosize=False,
                 width=1000, height=700,)

fig.show()

# Violin plot
A violin plot is a statistical representation of numerical data. It is similar to a box plot, with the addition of a rotated kernel density plot on each side.

In [70]:
df = pd.read_csv("https://raw.githubusercontent.com/plotly/datasets/master/violin_data.csv")
print(df.head())
fig = go.Figure()

days = ['Thur', 'Fri', 'Sat', 'Sun']

for day in days:
    fig.add_trace(go.Violin(x=df['day'][df['day'] == day],
                            y=df['total_bill'][df['day'] == day],
                            box_visible=True,
                            meanline_visible=True))

fig.show()

   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4
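
The "rotated kernel density plot" in the description is a smoothed estimate of how the values are distributed. The sketch below computes such an estimate with plain numpy for the five `total_bill` values printed above. The Gaussian kernel, the bandwidth of 3.0 and the helper name `gaussian_kde_1d` are assumptions for illustration and are not necessarily what Plotly uses internally.

```python
import numpy as np

def gaussian_kde_1d(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate of `samples` at each point of `grid`."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # One row per grid point, one column per sample
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    return kernels.sum(axis=1) / (len(samples) * bandwidth)

# The five total_bill values shown in df.head()
bills = np.array([16.99, 10.34, 21.01, 23.68, 24.59])
grid = np.linspace(0, 40, 401)
density = gaussian_kde_1d(bills, grid, bandwidth=3.0)  # bandwidth is an assumed value

# A violin mirrors this curve around a vertical axis:
# its half-width at the value grid[i] is proportional to density[i]
area = density.sum() * (grid[1] - grid[0])  # close to 1 for a density
peak = grid[density.argmax()]               # value where the violin is widest
print(round(area, 2), peak)
```

The violin is widest where the data are most concentrated, which is information a box plot alone does not show.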


# Scatter Matrix
A scatterplot matrix is a matrix associated with n numerical arrays (data variables), $X_1,X_2,…,X_n$, of the same length. The cell (i,j) of such a matrix displays the scatter plot of the variable $X_i$ versus $X_j$.

For this, we define another dataset made up of numerical values only.

In [32]:
# a dataframe using numpy with 100 rows and 3 columns
arr = np.random.rand(100, 3)
df_numeric = pd.DataFrame(arr, columns=['A','B','C'])
df_numeric.head()

          A         B         C
0  0.058932  0.643521  0.424026
1  0.728995  0.072305  0.048337
2  0.625352  0.166951  0.881610
3  0.341890  0.515967  0.656847
4  0.151868  0.262432  0.689207


In [33]:
# Scatter matrix
fig = px.scatter_matrix(df_numeric,
    dimensions=["A", "B", "C"],
    color="A")
fig.show()

# Bubble

In [34]:
# again using the numerical dataset
df_numeric.iplot(kind='bubble', x='A', y='B', size='C')

# Box plot

In [35]:
df_numeric.iplot(kind='box')

# Heatmap
A heatmap color-codes the values of a 2D array. How shades map to values depends on the colorscale: with the Magma scale used below, darker cells are lower values and brighter cells are higher values.

In [36]:
#importing graph_objects of Plotly
import plotly.graph_objects as go 

We have a 2D list or array that defines the data to color code (harvest by different farmers in tons/year).
We also need two lists: the names of the farmers and the vegetables they cultivate.

Plotly's graph_objects module contains the Heatmap() trace. It takes x, y and z attributes, whose values can be a list, numpy array or pandas dataframe. z holds the 2D values, while x labels its columns and y labels its rows.

In [77]:
vegetables = [
   "cucumber", 
   "tomato", 
   "lettuce", 
   "asparagus",
   "potato", 
   "wheat", 
   "barley"
]
farmers = [
   "Farmer Joe", 
   "Upland Bros.", 
   "Smith Gardening",
   "Agrifun", 
   "Organiculture", 
   "BioGoods Ltd.", 
   "Cornylee Corp."
]
harvest = np.array(
   [
      [0.8, 2.4, 2.5, 3.9, 0.0, 4.0, 0.0],
      [2.4, 0.0, 4.0, 1.0, 2.7, 0.0, 0.0],
      [1.1, 2.4, 0.8, 4.3, 1.9, 4.4, 0.0],
      [0.6, 0.0, 0.3, 0.0, 3.1, 0.0, 0.0],
      [0.7, 1.7, 0.6, 2.6, 2.2, 6.2, 0.0],
      [1.3, 1.2, 0.0, 0.0, 0.0, 3.2, 5.1],
      [0.1, 2.0, 0.0, 1.4, 0.0, 1.9, 6.3]
   ]
)


Plotting the heatmap

In [85]:
data = go.Heatmap(
   x = vegetables,   # labels for the columns of z
   y = farmers,      # labels for the rows of z
   z = harvest,      # z[i][j] is the value for y[i] and x[j]
   colorscale = 'Magma'
)
fig = go.Figure(data = data)
fig.update_layout(
    autosize=False,
    width=800,
    height=800
)
iplot(fig)