# Interactive Visualization with Plotly

![elgif](https://media.giphy.com/media/jR8EDxMbqi1QQ/giphy.gif)

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Importamos-librerías" data-toc-modified-id="Importamos-librerías-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Importing libraries</a></span></li><li><span><a href="#Cargamos-datitos" data-toc-modified-id="Cargamos-datitos-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Loading the data</a></span></li><li><span><a href="#Gráficos-de-Barras" data-toc-modified-id="Gráficos-de-Barras-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Bar charts</a></span></li><li><span><a href="#Gráfico-de-barras-agrupadas" data-toc-modified-id="Gráfico-de-barras-agrupadas-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Grouped bar charts</a></span></li><li><span><a href="#Histogramas" data-toc-modified-id="Histogramas-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Histograms</a></span></li><li><span><a href="#Distplot" data-toc-modified-id="Distplot-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>Distplot</a></span></li><li><span><a href="#ScatterPlot" data-toc-modified-id="ScatterPlot-7"><span class="toc-item-num">7&nbsp;&nbsp;</span>ScatterPlot</a></span></li><li><span><a href="#LineChart" data-toc-modified-id="LineChart-8"><span class="toc-item-num">8&nbsp;&nbsp;</span>LineChart</a></span></li><li><span><a href="#Boxplot" data-toc-modified-id="Boxplot-9"><span class="toc-item-num">9&nbsp;&nbsp;</span>Boxplot</a></span></li></ul></div>

## Importing libraries

The very first step is to install Plotly: see the [getting-started documentation](https://plotly.com/python/getting-started/).

In [2]:
import plotly.express as px
import seaborn as sns
import pandas as pd
import plotly.graph_objects as go

## Loading the data

In [3]:
penguins = sns.load_dataset("penguins")
tips = sns.load_dataset("tips")
titanic = pd.read_csv('data/titanic.csv', index_col=0)
titanic2= sns.load_dataset("titanic")

In [4]:
penguins.head()

Unnamed: 0,species,island,bill_length_mm,bill_depth_mm,flipper_length_mm,body_mass_g,sex
0,Adelie,Torgersen,39.1,18.7,181.0,3750.0,Male
1,Adelie,Torgersen,39.5,17.4,186.0,3800.0,Female
2,Adelie,Torgersen,40.3,18.0,195.0,3250.0,Female
3,Adelie,Torgersen,,,,,
4,Adelie,Torgersen,36.7,19.3,193.0,3450.0,Female


## Bar charts
Bar charts show a value for each category, such as the count of observations, using bars.

In [5]:
data_canada = px.data.gapminder().query("country == 'Canada'")  # How to load sample data bundled with Plotly (taken from the documentation)

In [6]:
data_canada

Unnamed: 0,country,continent,year,lifeExp,pop,gdpPercap,iso_alpha,iso_num
240,Canada,Americas,1952,68.75,14785584,11367.16112,CAN,124
241,Canada,Americas,1957,69.96,17010154,12489.95006,CAN,124
242,Canada,Americas,1962,71.3,18985849,13462.48555,CAN,124
243,Canada,Americas,1967,72.13,20819767,16076.58803,CAN,124
244,Canada,Americas,1972,72.88,22284500,18970.57086,CAN,124
245,Canada,Americas,1977,74.21,23796400,22090.88306,CAN,124
246,Canada,Americas,1982,75.76,25201900,22898.79214,CAN,124
247,Canada,Americas,1987,76.86,26549700,26626.51503,CAN,124
248,Canada,Americas,1992,77.95,28523502,26342.88426,CAN,124
249,Canada,Americas,1997,78.61,30305843,28954.92589,CAN,124


In [9]:
fig = px.bar(data_canada, x="year", y="pop")
fig.show()

In [10]:
penguins.head()

Unnamed: 0,species,island,bill_length_mm,bill_depth_mm,flipper_length_mm,body_mass_g,sex
0,Adelie,Torgersen,39.1,18.7,181.0,3750.0,Male
1,Adelie,Torgersen,39.5,17.4,186.0,3800.0,Female
2,Adelie,Torgersen,40.3,18.0,195.0,3250.0,Female
3,Adelie,Torgersen,,,,,
4,Adelie,Torgersen,36.7,19.3,193.0,3450.0,Female


In [11]:
penguins.species.value_counts().values

array([152, 124,  68])

In [12]:
counts = penguins.species.value_counts()  # compute the counts only once
fig = px.bar(x=counts.index, y=counts.values)

In [13]:
fig.show()

## Grouped bar charts
The Plotly documentation gives the example below. It uses `fig.update_layout` to update the figure. The category names (here, the animals) are passed to `x` as a list, and `y` receives an array with the quantities we want to display.

In [14]:
animals=['giraffes', 'orangutans', 'monkeys']

fig = go.Figure(data=[
    go.Bar(name='SF Zoo', x=animals, y=[20, 14, 23]),
    go.Bar(name='LA Zoo', x=animals, y=[12, 18, 29])
])
# Change the bar mode
fig.update_layout(barmode='group')
fig.show()

How do we do this when the data is in a DataFrame?

In [15]:
# Group the DataFrame: count each sex within each species
agrupado = penguins.groupby(["species"])["sex"].value_counts().unstack()
agrupado

species,Female,Male
Adelie,73,73
Chinstrap,34,34
Gentoo,58,61


In [17]:
agrupado.Female.values

array([73, 34, 58])

In [18]:
penguins.sample()

Unnamed: 0,species,island,bill_length_mm,bill_depth_mm,flipper_length_mm,body_mass_g,sex
324,Gentoo,Biscoe,47.3,13.8,216.0,4725.0,


In [19]:
penguins.species.unique()

array(['Adelie', 'Chinstrap', 'Gentoo'], dtype=object)

In [20]:
animals = penguins.species.unique()  # array with the species names, in the same order as the rows of agrupado
fig = go.Figure(data=[
    go.Bar(name="Female", x=animals, y=agrupado.Female),
    go.Bar(name="Male", x=animals, y=agrupado.Male)
])
fig.show()

In [21]:
animals = penguins.species.unique()  # array with the species names, in the same order as the rows of agrupado
fig = go.Figure(data=[
    go.Bar(name="Female", x=animals, y=agrupado.Female),
    go.Bar(name="Male", x=animals, y=agrupado.Male)
])
# Change the bar mode
fig.update_layout(barmode="stack")
fig.show()

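With `barmode="stack"`, bars that share the same `x` value (here, the species) are stacked on top of one another. Note that `px.bar` stacks them by default, whereas a `go.Figure` built from `go.Bar` traces groups them by default.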

## Histograms

https://plotly.com/python/histograms/

In [24]:
tips.head()

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size
0,16.99,1.01,Female,No,Sun,Dinner,2
1,10.34,1.66,Male,No,Sun,Dinner,3
2,21.01,3.5,Male,No,Sun,Dinner,3
3,23.68,3.31,Male,No,Sun,Dinner,2
4,24.59,3.61,Female,No,Sun,Dinner,4


In [29]:
fig = px.histogram(tips, x="total_bill")
fig.update_layout(bargap=0.2)
fig.show()

In [36]:
fig = px.histogram(titanic, x="Age")
fig.add_vline(titanic.Age.median(), line_width=2, line_dash="dash", line_color="green")
fig.add_vline(titanic.Age.mean(), line_width=3, line_dash="dash", line_color="red")
fig.add_hline(40, line_width=3, line_dash="dash", line_color="red")
fig.show()

## Distplot

In [37]:
tit = titanic.copy()
tit.dropna(inplace=True)

In [38]:
import plotly.figure_factory as ff
hist_data = [tit.Age]
labels = ["Age"]

In [39]:
tit.head()

Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1,C123,S
0,1,"McCarthy, Mr. Timothy J",male,54.0,0,0,17463,51.8625,E46,S
1,3,"Sandstrom, Miss. Marguerite Rut",female,4.0,1,1,PP 9549,16.7,G6,S
1,1,"Bonnell, Miss. Elizabeth",female,58.0,0,0,113783,26.55,C103,S


In [40]:
penguins.dropna(inplace=True)

In [41]:
fig = ff.create_distplot(hist_data, labels)
fig.show()

In [42]:
penguins.head()

Unnamed: 0,species,island,bill_length_mm,bill_depth_mm,flipper_length_mm,body_mass_g,sex
0,Adelie,Torgersen,39.1,18.7,181.0,3750.0,Male
1,Adelie,Torgersen,39.5,17.4,186.0,3800.0,Female
2,Adelie,Torgersen,40.3,18.0,195.0,3250.0,Female
4,Adelie,Torgersen,36.7,19.3,193.0,3450.0,Female
5,Adelie,Torgersen,39.3,20.6,190.0,3650.0,Male


In [43]:
hist_data = [penguins.bill_length_mm, penguins.bill_depth_mm]
group_labels = ["bill_length_mm","bill_depth_mm"] # name of the dataset

fig = ff.create_distplot(hist_data, group_labels)
fig.show()

## ScatterPlot

In [44]:
penguins.head()

Unnamed: 0,species,island,bill_length_mm,bill_depth_mm,flipper_length_mm,body_mass_g,sex
0,Adelie,Torgersen,39.1,18.7,181.0,3750.0,Male
1,Adelie,Torgersen,39.5,17.4,186.0,3800.0,Female
2,Adelie,Torgersen,40.3,18.0,195.0,3250.0,Female
4,Adelie,Torgersen,36.7,19.3,193.0,3450.0,Female
5,Adelie,Torgersen,39.3,20.6,190.0,3650.0,Male


I'm going to draw a scatter plot to visualize the species, as we did with seaborn yesterday. The NaN values have to be removed first, because they would cause an error; recall that we already dropped them above with `penguins.dropna(inplace=True)`.

In [45]:
fig = px.scatter(penguins, x="flipper_length_mm", y="body_mass_g")
fig.show()

In [46]:
fig = px.scatter(penguins, x="flipper_length_mm", y="body_mass_g", color="bill_length_mm")
fig.show()

In [47]:
fig = px.scatter(penguins, x="flipper_length_mm", y="body_mass_g", color="species")
fig.show()

In [48]:
fig = px.scatter(penguins, x="body_mass_g", y="flipper_length_mm", color="species", size="bill_depth_mm")
fig.show()

In [49]:
fig = px.scatter_matrix(penguins, dimensions=["bill_length_mm", "bill_depth_mm", "flipper_length_mm", "body_mass_g"], color="species", width=1000, height=800)
fig.show()

In [50]:
fig = px.scatter_matrix(penguins, dimensions=["bill_length_mm", "bill_depth_mm", "flipper_length_mm", "body_mass_g"], width=1000, height=800)
fig.show()

In [51]:
fig = px.scatter_matrix(penguins, dimensions=["bill_length_mm", "bill_depth_mm", "flipper_length_mm", "body_mass_g", "species"], width=1000, height=800)
fig.show()

## LineChart

In [52]:
vuelos = sns.load_dataset("flights")
vuelos.head()

Unnamed: 0,year,month,passengers
0,1949,Jan,112
1,1949,Feb,118
2,1949,Mar,132
3,1949,Apr,129
4,1949,May,121


In [53]:
feb = vuelos[vuelos.month == "Feb"]

In [54]:
fig = px.line(feb, x="year", y="passengers")
fig.show()

In [55]:
fig = px.line(vuelos, x="year", y="passengers", color="month")
fig.show()

## Boxplot

In [56]:
fig = px.box(titanic, x="Age")
fig.show()

In [57]:
fig = px.box(titanic, x="Pclass", y="Age")
fig.show()

In [58]:
fig = px.box(titanic, x="Pclass", y="Age", points="all")  # points="all" draws every observation to the left of each box
fig.show()

In [59]:
titanic2= sns.load_dataset("titanic")

In [60]:
fig = px.box(titanic2, x="pclass", y="age", color="survived", points="all", width=1100, height=600)  # points="all" draws every observation to the left of each box
fig.show()

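We change the colors with `color_discrete_map`: each key is a value of the column passed to `color`, and each dictionary value is the color to use for it (if this isn't clear, let me know afterwards).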

In [63]:
fig = px.box(titanic2, x="pclass", y="age", color="survived", color_discrete_map={1: '#19D3F4', 0: 'green'}) 
fig.show()

Plotly charts don't always render well on GitHub.

In your README you can suggest viewing the Jupyter notebooks with:

[nbviewer](https://nbviewer.org/)

Just paste the link to your notebook into the text field and you're done!