<a href="https://colab.research.google.com/github/eugenebaraka/Interactive-Vizualizations-with-Plotly/blob/main/dataviz.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Interactive Plotting in Python

In [3]:
#importing required packages
import plotly
import bokeh

Choosing which plotting library to use:
- Identify the type of graph and data: for statistical plots, seaborn is direct and usually enough. For complex visuals such as choropleth maps, Plotly is a better fit.
- Identify the target output: for something quick during EDA, Matplotlib or seaborn is enough. For presentations on websites, a more polished, interactive visualization built with Plotly or Bokeh is a better choice.
- Choose your favorite.
- Check development status: if a package isn't updated regularly, its maintainers may no longer be actively working on it.

## Plotly
Plotly is used for interactive data visualizations, which work best in the browser.

Plotly's Chart Studio (chart-studio.plotly.com) can also be used to create and share charts online.

## Plotly Express

Plotly Express is Plotly's high-level interface; its concise, one-call-per-chart style mimics seaborn and Matplotlib.

In [4]:
import numpy as np
import pandas as pd
import plotly.express as px

iris_df = px.data.iris()
iris_df.head()

,sepal_length,sepal_width,petal_length,petal_width,species,species_id
0,5.1,3.5,1.4,0.2,setosa,1
1,4.9,3.0,1.4,0.2,setosa,1
2,4.7,3.2,1.3,0.2,setosa,1
3,4.6,3.1,1.5,0.2,setosa,1
4,5.0,3.6,1.4,0.2,setosa,1


In [6]:
fig = px.scatter(iris_df, x = 'sepal_length', y = 'petal_length')
fig.show()

Unlike the static figures that Matplotlib and seaborn produce by default, figures generated with Plotly are interactive: hovering over a data point in the plot above shows its values.

In [7]:
fig = px.scatter(iris_df, x = 'sepal_length', 
                 y = 'petal_length', title = "my pretty plot", 
                 width = 800, height = 500)
fig.show()

Commonly used Plotly Express functions include:
- px.bar()
- px.line()
- px.scatter()
- px.histogram()
- px.box()

In [13]:
fig = px.scatter(iris_df, x = 'sepal_length', 
                 y = 'petal_length', title = "Relationship between Sepal and Petal Lengths", 
                 color = "species", size = "petal_width",
                 width = 800, height = 500)
fig.show()


In [15]:
fig = px.histogram(iris_df, x = "petal_width", 
                   nbins = 12, title = "Distribution of Petal Width",
                   labels = {"petal_width" : "Petal width"})
fig.show()

In [19]:

fig = px.box(
    iris_df,
    x = 'species', 
    y = 'petal_width',
    labels = {"petal_width" : "Petal width",
              "species": "Species"},
    title = "Distribution of Petal width by species"
)
fig.show()

## Plotly Graph Objects API


In [23]:
import plotly.graph_objects as go

fig = go.Figure()
chart_data = go.Scatter(x = iris_df['sepal_length'], y = iris_df['petal_length'], 
           mode = 'markers')


fig.add_trace(chart_data)
fig.show()

In [25]:
fig.update_layout(width = 800, height = 600,
                  title = 'Sepal vs Petal length',
                  xaxis_title = 'Sepal Length', yaxis_title = 'Petal Length')

fig.show()


In [26]:
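fig = go.Figure()
# one subset per species
setosa = iris_df[iris_df['species'] == 'setosa']
virginica = iris_df[iris_df['species'] == 'virginica']
versicolor = iris_df[iris_df['species'] == 'versicolor']

# one trace per species, so each gets its own color and legend entry
for name, subset in [('setosa', setosa), ('virginica', virginica), ('versicolor', versicolor)]:
    fig.add_trace(go.Scatter(x = subset['sepal_length'], y = subset['petal_length'],
                             mode = 'markers', name = name))

fig.show()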

In [27]:
x = np.random.random(1000)
y = np.random.random(1000)
s = np.round((1/(1+x))*y*30, 0)
c = np.random.randint(1, 10, 1000)

fig = go.Figure()
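# use s for the marker sizes and c for the marker colors
scatter_data = go.Scatter(x = x, y = y, mode = 'markers',
                          marker = dict(size = s, color = c))

# the trace has to be added to the figure before it is shown
fig.add_trace(scatter_data)
fig.show()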

In [28]:
tips_df = px.data.tips()
tips_df.head()

,total_bill,tip,sex,smoker,day,time,size
0,16.99,1.01,Female,No,Sun,Dinner,2
1,10.34,1.66,Male,No,Sun,Dinner,3
2,21.01,3.5,Male,No,Sun,Dinner,3
3,23.68,3.31,Male,No,Sun,Dinner,2
4,24.59,3.61,Female,No,Sun,Dinner,4


In [29]:
tips_df['tip_pct'] = (tips_df['tip']/tips_df['total_bill'])*100

# filter the rows with a boolean mask (wrapping the mask in a list raises a TypeError later)
tips_male = tips_df[tips_df['sex'] == 'Male']
tips_female = tips_df[tips_df['sex'] == 'Female']
fig = go.Figure()

xbins = {'start':0, 'end': 100, 'size':1} #control the histogram scale

male_hist = go.Histogram(x = tips_male['tip_pct'], 
                         name = 'Male', 
                         xbins = xbins, 
                         opacity = 0.5)

female_hist = go.Histogram(x = tips_female['tip_pct'], 
                           name = 'Female', 
                           xbins = xbins, 
                           opacity = 0.5)

fig.add_trace(male_hist)
fig.add_trace(female_hist)

# draw the two histograms on top of each other so the opacity setting has an effect
fig.update_layout(barmode = 'overlay')
fig.show()