# Interactive Visualization with Plotly

![elgif](https://media.giphy.com/media/jR8EDxMbqi1QQ/giphy.gif)

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Importing-libraries" data-toc-modified-id="Importing-libraries-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Importing libraries</a></span></li><li><span><a href="#We-load-data" data-toc-modified-id="We-load-data-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>We load data</a></span></li><li><span><a href="#Bar-Charts" data-toc-modified-id="Bar-Charts-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Bar Charts</a></span></li><li><span><a href="#Clustered-Bar-Chart" data-toc-modified-id="Clustered-Bar-Chart-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Clustered Bar Chart</a></span></li><li><span><a href="#Histograms" data-toc-modified-id="Histograms-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Histograms</a></span></li><li><span><a href="#Distplot" data-toc-modified-id="Distplot-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>Distplot</a></span></li><li><span><a href="#ScatterPlot" data-toc-modified-id="ScatterPlot-7"><span class="toc-item-num">7&nbsp;&nbsp;</span>ScatterPlot</a></span></li><li><span><a href="#LineChart" data-toc-modified-id="LineChart-8"><span class="toc-item-num">8&nbsp;&nbsp;</span>LineChart</a></span></li><li><span><a href="#Boxplot" data-toc-modified-id="Boxplot-9"><span class="toc-item-num">9&nbsp;&nbsp;</span>Boxplot</a></span></li></ul></div>

In [1]:
# Images not rendering? Copy this notebook's URL and paste it into https://nbviewer.jupyter.org/

## Importing libraries

First things first: installation --> [documentation here](https://plotly.com/python/getting-started/)

Plotly Express Data Package: --> [documentation here](https://plotly.com/python-api-reference/generated/plotly.express.data.html)

In [2]:
pip install plotly

Collecting plotly
  Downloading plotly-5.24.1-py3-none-any.whl.metadata (7.3 kB)
Collecting tenacity>=6.2.0 (from plotly)
  Downloading tenacity-9.0.0-py3-none-any.whl.metadata (1.2 kB)
Downloading plotly-5.24.1-py3-none-any.whl (19.1 MB)
Downloading tenacity-9.0.0-py3-none-any.whl (28 kB)
Installing collected packages: tenacity, plotly
Successfully installed plotly-5.24.1 tena

In [3]:
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import numpy as np
import seaborn as sns

⚠️ Potential error: if `fig.show()` complains that `nbformat` is missing or outdated, try `pip install --upgrade nbformat` and restart the kernel.

## We load data

In [4]:
penguins = sns.load_dataset("penguins")
tips = sns.load_dataset("tips")
titanic = pd.read_csv('titanic.csv', index_col=0)
titanic2= sns.load_dataset("titanic")

In [5]:
penguins.head()

Unnamed: 0,species,island,bill_length_mm,bill_depth_mm,flipper_length_mm,body_mass_g,sex
0,Adelie,Torgersen,39.1,18.7,181.0,3750.0,Male
1,Adelie,Torgersen,39.5,17.4,186.0,3800.0,Female
2,Adelie,Torgersen,40.3,18.0,195.0,3250.0,Female
3,Adelie,Torgersen,,,,,
4,Adelie,Torgersen,36.7,19.3,193.0,3450.0,Female


In [6]:
px.data.gapminder()

Unnamed: 0,country,continent,year,lifeExp,pop,gdpPercap,iso_alpha,iso_num
0,Afghanistan,Asia,1952,28.801,8425333,779.445314,AFG,4
1,Afghanistan,Asia,1957,30.332,9240934,820.853030,AFG,4
2,Afghanistan,Asia,1962,31.997,10267083,853.100710,AFG,4
3,Afghanistan,Asia,1967,34.020,11537966,836.197138,AFG,4
4,Afghanistan,Asia,1972,36.088,13079460,739.981106,AFG,4
...,...,...,...,...,...,...,...,...
1699,Zimbabwe,Africa,1987,62.351,9216418,706.157306,ZWE,716
1700,Zimbabwe,Africa,1992,60.377,10704340,693.420786,ZWE,716
1701,Zimbabwe,Africa,1997,46.809,11404948,792.449960,ZWE,716
1702,Zimbabwe,Africa,2002,39.989,11926563,672.038623,ZWE,716


In [7]:
df = px.data.gapminder().query("country == 'Canada'")  # Load a sample dataset bundled with Plotly and keep only Canada (taken from the documentation)

In [8]:
df

Unnamed: 0,country,continent,year,lifeExp,pop,gdpPercap,iso_alpha,iso_num
240,Canada,Americas,1952,68.75,14785584,11367.16112,CAN,124
241,Canada,Americas,1957,69.96,17010154,12489.95006,CAN,124
242,Canada,Americas,1962,71.3,18985849,13462.48555,CAN,124
243,Canada,Americas,1967,72.13,20819767,16076.58803,CAN,124
244,Canada,Americas,1972,72.88,22284500,18970.57086,CAN,124
245,Canada,Americas,1977,74.21,23796400,22090.88306,CAN,124
246,Canada,Americas,1982,75.76,25201900,22898.79214,CAN,124
247,Canada,Americas,1987,76.86,26549700,26626.51503,CAN,124
248,Canada,Americas,1992,77.95,28523502,26342.88426,CAN,124
249,Canada,Americas,1997,78.61,30305843,28954.92589,CAN,124


## Bar Charts
A bar chart draws one rectangular bar per row or category, with the bar's height encoding a value such as a population or a count of observations.

In [9]:
fig = px.bar(df, x="year", y="pop")
fig.show()

In [10]:
help(px.bar)

Help on function bar in module plotly.express._chart_types:

bar(data_frame=None, x=None, y=None, color=None, pattern_shape=None, facet_row=None, facet_col=None, facet_col_wrap=0, facet_row_spacing=None, facet_col_spacing=None, hover_name=None, hover_data=None, custom_data=None, text=None, base=None, error_x=None, error_x_minus=None, error_y=None, error_y_minus=None, animation_frame=None, animation_group=None, category_orders=None, labels=None, color_discrete_sequence=None, color_discrete_map=None, color_continuous_scale=None, pattern_shape_sequence=None, pattern_shape_map=None, range_color=None, color_continuous_midpoint=None, opacity=None, orientation=None, barmode='relative', log_x=False, log_y=False, range_x=None, range_y=None, text_auto=False, title=None, template=None, width=None, height=None) -> plotly.graph_objs._figure.Figure
        In a bar plot, each row of `data_frame` is represented as a rectangular
        mark.

    Parameters
    ----------
    data_frame: DataFrame or

In [11]:
penguins.head()

Unnamed: 0,species,island,bill_length_mm,bill_depth_mm,flipper_length_mm,body_mass_g,sex
0,Adelie,Torgersen,39.1,18.7,181.0,3750.0,Male
1,Adelie,Torgersen,39.5,17.4,186.0,3800.0,Female
2,Adelie,Torgersen,40.3,18.0,195.0,3250.0,Female
3,Adelie,Torgersen,,,,,
4,Adelie,Torgersen,36.7,19.3,193.0,3450.0,Female


In [12]:
penguins.species.value_counts().values

array([152, 124,  68], dtype=int64)

In [13]:
fig = px.bar(penguins, x=penguins.species.value_counts().index, y=penguins.species.value_counts().values)

In [14]:
fig.show()

## Clustered Bar Chart


In [15]:
# Count females and males per species (one row per species, one column per sex)
agrupado = penguins.groupby(["species"])["sex"].value_counts().unstack()
agrupado

species,Female,Male
Adelie,73,73
Chinstrap,34,34
Gentoo,58,61


In [16]:
agrupado.Female.values

array([73, 34, 58], dtype=int64)

In [17]:
penguins.sample()

Unnamed: 0,species,island,bill_length_mm,bill_depth_mm,flipper_length_mm,body_mass_g,sex
158,Chinstrap,Dream,46.1,18.2,178.0,3250.0,Female


In [18]:
penguins.species.unique()

array(['Adelie', 'Chinstrap', 'Gentoo'], dtype=object)

In [19]:
animals = penguins.species.unique()  # NumPy array (not a list) of species names; its order matches the rows of agrupado


fig = go.Figure(data=[
    go.Bar(name="Female", x=animals, y=agrupado.Female),
    go.Bar(name="Male", x=animals, y=agrupado.Male)
])

fig.show()

In [20]:
animals = penguins.species.unique()  # NumPy array (not a list) of species names; its order matches the rows of agrupado

fig = go.Figure(data=[
    go.Bar(name="Female", x=animals, y=agrupado.Female),
    go.Bar(name="Male", x=animals, y=agrupado.Male)
])

# Uncomment the next line to stack the bars instead of drawing them side by side
#fig.update_layout(barmode="stack")
fig.show()

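With `go.Figure`, bars from different traces that share the same x value (here, the species) are drawn side by side by default (`barmode="group"`); `fig.update_layout(barmode="stack")` stacks them instead. `px.bar`, by contrast, defaults to `barmode='relative'`, so rectangles that share an x value are stacked unless you ask otherwise.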

## Histograms

https://plotly.com/python/histograms/

In [21]:
tips.head()

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size
0,16.99,1.01,Female,No,Sun,Dinner,2
1,10.34,1.66,Male,No,Sun,Dinner,3
2,21.01,3.5,Male,No,Sun,Dinner,3
3,23.68,3.31,Male,No,Sun,Dinner,2
4,24.59,3.61,Female,No,Sun,Dinner,4


In [22]:

fig = px.histogram(tips, x="total_bill")  # try adding nbins=15, 20, 50, or 100
# fig.update_layout(bargap=0.1)  # gap between bars; what happens with bargap=1?
fig.show()

In [23]:
fig = px.histogram(titanic, x="Age")
#fig.add_vline(titanic.Age.median(), line_width=3, line_dash="dash", line_color="green")
#fig.add_vline(titanic.Age.mean(), line_width=3, line_dash="dash", line_color="red")
#fig.add_hline(20, line_width=3, line_dash="dash", line_color="red")
fig.show()

## Distplot

In [24]:
tit = titanic.copy()
tit.dropna(inplace=True)

In [25]:
import plotly.figure_factory as ff
hist_data = [tit.Age]
labels = ["Age"]

In [26]:
tit.head()

Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1,C123,S
0,1,"McCarthy, Mr. Timothy J",male,54.0,0,0,17463,51.8625,E46,S
1,3,"Sandstrom, Miss. Marguerite Rut",female,4.0,1,1,PP 9549,16.7,G6,S
1,1,"Bonnell, Miss. Elizabeth",female,58.0,0,0,113783,26.55,C103,S


In [27]:
penguins.dropna(inplace=True)

In [28]:
fig = ff.create_distplot(hist_data, labels)
fig.show()

In [29]:
penguins.head()

Unnamed: 0,species,island,bill_length_mm,bill_depth_mm,flipper_length_mm,body_mass_g,sex
0,Adelie,Torgersen,39.1,18.7,181.0,3750.0,Male
1,Adelie,Torgersen,39.5,17.4,186.0,3800.0,Female
2,Adelie,Torgersen,40.3,18.0,195.0,3250.0,Female
4,Adelie,Torgersen,36.7,19.3,193.0,3450.0,Female
5,Adelie,Torgersen,39.3,20.6,190.0,3650.0,Male


In [30]:
hist_data = [penguins.bill_length_mm, penguins.bill_depth_mm]
group_labels = ["bill_length_mm", "bill_depth_mm"]  # one label per series in hist_data

fig = ff.create_distplot(hist_data, group_labels)
fig.show()

## ScatterPlot

In [31]:
penguins.head()

Unnamed: 0,species,island,bill_length_mm,bill_depth_mm,flipper_length_mm,body_mass_g,sex
0,Adelie,Torgersen,39.1,18.7,181.0,3750.0,Male
1,Adelie,Torgersen,39.5,17.4,186.0,3800.0,Female
2,Adelie,Torgersen,40.3,18.0,195.0,3250.0,Female
4,Adelie,Torgersen,36.7,19.3,193.0,3450.0,Female
5,Adelie,Torgersen,39.3,20.6,190.0,3650.0,Male


In [32]:
fig = px.scatter(penguins, x="flipper_length_mm", y="body_mass_g")
fig.show()

In [33]:
fig = px.scatter(penguins, x="flipper_length_mm", y="body_mass_g")  # try adding color="bill_length_mm" (numeric column, continuous color scale)
fig.show()

In [34]:
fig = px.scatter(penguins, x="flipper_length_mm", y="body_mass_g")  # try adding color="species" (categorical column, one color per species)
fig.show()

In [35]:
fig = px.scatter(penguins, x="body_mass_g", y="flipper_length_mm", color="species")  # try adding size="bill_depth_mm"
fig.show()

In [36]:
fig = px.scatter_matrix( penguins, dimensions=['bill_length_mm','bill_depth_mm','flipper_length_mm','body_mass_g', "species"],
                        width=1000, height=800
) 
fig.show()

In [37]:
fig = px.scatter_matrix( penguins, dimensions=['bill_length_mm','bill_depth_mm','flipper_length_mm','body_mass_g'], 
                        color="species",
                        width=1000, height=800) 
fig.show()

In [38]:
fig = px.scatter_matrix(penguins,
                dimensions=['bill_length_mm','bill_depth_mm','flipper_length_mm','body_mass_g'],
                color="species"
                       )
fig.update_traces(diagonal_visible=False)
fig.show()

In [39]:
penguins.sample()

Unnamed: 0,species,island,bill_length_mm,bill_depth_mm,flipper_length_mm,body_mass_g,sex
84,Adelie,Dream,37.3,17.8,191.0,3350.0,Female


In [40]:
# Define indices corresponding to species categories, using pandas label encoding
index_vals = penguins['species'].astype('category').cat.codes

fig = go.Figure(data=go.Splom(
                dimensions=[dict(label='bill length',
                                 values=penguins['bill_length_mm']),
                            dict(label='bill depth',
                                 values=penguins['bill_depth_mm']),
                            dict(label='flipper length',
                                 values=penguins['flipper_length_mm']),
                            dict(label='body mass',
                                 values=penguins['body_mass_g'])],
                text=penguins['species'],
                showupperhalf=False,
                marker=dict(color=index_vals,
                            showscale=True,
                            line_color='white', line_width=0.5)
                ))


fig.update_layout(
    title='Penguins',
    dragmode='select',
    width=600,
    height=600,
    hovermode='closest',
)

fig.show()
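The `index_vals` line above relies on pandas label encoding: each category is replaced by an integer code, which `go.Splom` then uses as a numeric color. A minimal sketch on a hypothetical four-row series shows what the codes look like:

```python
import pandas as pd

# Hypothetical species column
species = pd.Series(["Adelie", "Gentoo", "Chinstrap", "Adelie"])

as_category = species.astype("category")
codes = as_category.cat.codes  # one integer per row

# Categories are sorted alphabetically, so the code is the position in that order
mapping = dict(enumerate(as_category.cat.categories))

print(codes.tolist())  # [0, 2, 1, 0]
print(mapping)         # {0: 'Adelie', 1: 'Chinstrap', 2: 'Gentoo'}
```

Keeping `mapping` around is handy for reading the color bar, which only shows the integer codes.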

## LineChart

In [41]:
flights = sns.load_dataset("flights")
flights.head()

Unnamed: 0,year,month,passengers
0,1949,Jan,112
1,1949,Feb,118
2,1949,Mar,132
3,1949,Apr,129
4,1949,May,121


In [42]:
feb = flights[flights.month == "Feb"]

In [43]:
fig = px.line(feb, x="year", y="passengers")
fig.show()

In [44]:
fig = px.line(flights, x="year", y="passengers", color="month")
fig.show()

## Boxplot

In [45]:
fig = px.box(titanic, x="Age")
fig.show()

In [46]:
fig = px.box(titanic, x="Pclass", y="Age")
fig.show()

In [47]:
fig = px.box(titanic, x="Pclass", y="Age", points="all")  # points="all" draws every observation next to its box
fig.show()

In [48]:
titanic2= sns.load_dataset("titanic")

In [49]:
fig = px.box(titanic2, x="pclass", y="age", color="survived", points="all", width=1100, height=600)  # points="all" draws every observation next to its box
fig.show()

To change the colors, pass `color_discrete_map`: a dictionary whose keys are the values found in the `color` column and whose values are the colors to use.

In [50]:
fig = px.box(titanic2, x="pclass", y="age", color="survived", color_discrete_map={1: '#19D3F3', 0: 'red'}) 
fig.show()

## BONUS: Sliders

In [51]:
df = px.data.gapminder()
fig = px.scatter(df, x="gdpPercap", y="lifeExp", animation_frame="year",
                 size="pop",
           color="continent", hover_name="country",
           log_x=True, size_max=55, range_x=[100,100000], range_y=[25,90])

# fig["layout"].pop("updatemenus")  # optional: drop the play/stop buttons and keep only the slider
fig.show()