# Interactive charts with Python, Pandas, Plotly and Cufflinks
***Cufflinks*** acts as an intermediary between pandas and plotly to produce interactive charts.

## 1. Installing the libraries

In [1]:
%pip install pandas
%pip install plotly
%pip install cufflinks






## 2. Importing the libraries

In [2]:
import pandas as pd
import cufflinks as cf
from IPython.display import display,HTML
import plotly.graph_objects as go
# Configure cufflinks: public sharing, the 'ggplot' theme, and offline mode
cf.set_config_file(sharing='public',theme='ggplot',offline=True)

To list all the available themes:

In [3]:
cf.getThemes()

['ggplot', 'pearl', 'solar', 'space', 'white', 'polar', 'henanigans']

## 3. Reading a dataset
The dataset is the set of data that we read from a CSV file and load into a ***DataFrame***.

In [4]:
pd.read_csv('population_total.csv')

Unnamed: 0,country,year,population
0,China,2020.0,1.439324e+09
1,China,2019.0,1.433784e+09
2,China,2018.0,1.427648e+09
3,China,2017.0,1.421022e+09
4,China,2016.0,1.414049e+09
...,...,...,...
4180,United States,1965.0,1.997337e+08
4181,United States,1960.0,1.867206e+08
4182,United States,1955.0,1.716853e+08
4183,India,1960.0,4.505477e+08
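The `Unnamed: 0` column in the output above usually means that a row index was saved into the CSV file. A minimal sketch, assuming the file has that layout: passing `index_col=0` reads that column as the index instead of as data. The inline CSV text is a hypothetical stand-in for `population_total.csv`, built from a few rows of the output above.

```python
import io

import pandas as pd

# Hypothetical stand-in for population_total.csv, using rows shown in the output above.
csv_text = """,country,year,population
0,China,2020.0,1439324000.0
1,China,2019.0,1433784000.0
4183,India,1960.0,450547700.0
"""

# index_col=0 treats the first (unnamed) column as the row index.
df_sample = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(df_sample.columns.tolist())  # ['country', 'year', 'population']
```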


### 3.1 Formatting/configuring the DataFrame
To work efficiently with the DataFrame, we need to clean and reshape the data.

In [5]:
# Load the CSV and store the DataFrame in a variable
df_population = pd.read_csv('population_total.csv')
# Drop rows with null values
df_population = df_population.dropna()
# Reshape with pivot: one row per year, one column per country
df_population = df_population.pivot(index='year', columns='country', values='population')
# Select the countries we want to include
df_population = df_population[['United States', 'India', 'China', 'Indonesia', 'Brazil']]
df_population

year,United States,India,China,Indonesia,Brazil
1955.0,171685336.0,409880600.0,612241600.0,77273425.0,62533919.0
1960.0,186720571.0,450547700.0,660408100.0,87751068.0,72179226.0
1965.0,199733676.0,499123300.0,724219000.0,100267062.0,83373530.0
1970.0,209513341.0,555189800.0,827601400.0,114793178.0,95113265.0
1975.0,219081251.0,623102900.0,926240900.0,130680727.0,107216205.0
1980.0,229476354.0,698952800.0,1000089000.0,147447836.0,120694009.0
1985.0,240499825.0,784360000.0,1075589000.0,164982451.0,135274080.0
1990.0,252120309.0,873277800.0,1176884000.0,181413402.0,149003223.0
1995.0,265163745.0,963922600.0,1240921000.0,196934260.0,162019896.0
2000.0,281710909.0,1056576000.0,1290551000.0,211513823.0,174790340.0
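The reshaping steps above can be sketched on a small long-format sample. The values are copied from the output above; the row with a missing year is a made-up addition, included only to show what `dropna` removes.

```python
import pandas as pd

# Long format: one row per (country, year) pair.
long_df = pd.DataFrame({
    "country": ["United States", "United States", "India", "India", "China"],
    # The last year is deliberately missing (hypothetical row, for illustration).
    "year": [1955.0, 1960.0, 1955.0, 1960.0, None],
    "population": [171685336.0, 186720571.0, 409880600.0, 450547700.0, 612241600.0],
})

# dropna removes the incomplete row; pivot turns countries into columns.
wide_df = long_df.dropna().pivot(index="year", columns="country", values="population")

# pivot sorts the columns alphabetically, so select them in the desired order.
wide_df = wide_df[["United States", "India"]]
print(wide_df)
```

Note that `pivot` raises a `ValueError` when the same (index, column) pair appears more than once, so duplicated rows need to be removed or aggregated first.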


# Charts

## 1. Line chart with Pandas

In [6]:
df_population.iplot(kind='line')

### 1.1 Customizing the chart

In [7]:
df_population.iplot(kind='line',xTitle='Year',yTitle='Population',title='Year vs Population')

## 2. Bar chart

In [8]:
# Population of the 5 countries in 2020; remember that the index is the year
df_population_2020=df_population[df_population.index.isin([2020])]
df_population_2020=df_population_2020.T
df_population_2020.iplot(
    kind='bar',xTitle='Countries',yTitle='Population',title='Year vs Population',color='red'
    )
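The filter-and-transpose step can be sketched on its own. The stand-in `df` uses the years 1990 and 2000 from the table above, since those values appear in the output. After `.T`, each country becomes a row, which is why the bars are drawn per country.

```python
import pandas as pd

# Stand-in for df_population with two years from the table above.
df = pd.DataFrame(
    {"United States": [252120309.0, 281710909.0], "China": [1176884000.0, 1290551000.0]},
    index=pd.Index([1990.0, 2000.0], name="year"),
)

# Keep only the row for one year, then transpose so countries become the index.
bar_data = df[df.index.isin([2000])].T
print(bar_data)
```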

## 3. Grouped bar chart

In [9]:
df_population_sample=df_population[df_population.index.isin([1980,
                                                             1990,2000,
                                                             2010,2020])]
# The index (x axis) holds the years; each country is drawn as its own bar series
df_population_sample.iplot(
    kind='bar',xTitle='Year',yTitle='Population',title='Year vs Population')
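One detail worth knowing about `isin`: it keeps only the labels that exist in the index and silently skips the rest. A small sketch, using a stand-in `df` that deliberately lacks 2010 and 2020, shows how to check which requested years were not found.

```python
import pandas as pd

# Stand-in for df_population that only contains three of the requested years.
df = pd.DataFrame(
    {"United States": [229476354.0, 252120309.0, 281710909.0]},
    index=pd.Index([1980.0, 1990.0, 2000.0], name="year"),
)

wanted = [1980, 1990, 2000, 2010, 2020]
sample = df[df.index.isin(wanted)]

# Years that were requested but are absent from the index.
missing = sorted(set(wanted) - set(df.index))
print(len(sample), missing)  # 3 [2010, 2020]
```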

## 4. Box plot

In [10]:
df_population['United States'].iplot(kind='box')

In [11]:
df_population.iplot(kind='box')
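A box plot summarizes a column through its median and quartiles. The sketch below computes those statistics with pandas for the United States values shown in the table above. It assumes pandas' default linear interpolation, so the quartiles may differ slightly from the ones plotly draws.

```python
import pandas as pd

# United States column from the table above (1955 to 2000, every 5 years).
us = pd.Series(
    [171685336.0, 186720571.0, 199733676.0, 209513341.0, 219081251.0,
     229476354.0, 240499825.0, 252120309.0, 265163745.0, 281710909.0],
    index=pd.Index([1955.0 + 5 * i for i in range(10)], name="year"),
    name="United States",
)

# First quartile, median and third quartile: the edges and middle line of the box.
summary = us.quantile([0.25, 0.5, 0.75])
print(summary)
print(us.min(), us.max())
```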

## 5. Histogram

In [12]:
df_population['United States'].iplot(kind='hist')

## 6. Multiple histograms

In [13]:
df_population[['United States','Indonesia']].iplot(kind='hist')
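A histogram counts how many values fall into each bin. The same idea can be sketched with `pd.cut` on the United States values from the table above. The choice of 5 bins is an arbitrary assumption for illustration; plotly chooses its own binning.

```python
import pandas as pd

# United States column from the table above.
us = pd.Series(
    [171685336.0, 186720571.0, 199733676.0, 209513341.0, 219081251.0,
     229476354.0, 240499825.0, 252120309.0, 265163745.0, 281710909.0],
    name="United States",
)

# Split the value range into 5 equal-width bins and count the values in each one.
counts = pd.cut(us, bins=5).value_counts(sort=False)
print(counts)
```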

## 7. Pie chart
For this pie chart we reuse the DataFrame with the 2020 population, but the countries must no longer be the index.

In [14]:
df_population_2020=df_population_2020.reset_index()
# Rename the year column so that its label is a string instead of a number
df_population_2020=df_population_2020.rename(columns={2020:'2020'})
df_population_2020.iplot(kind='pie',labels='country',values='2020')
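The preparation steps above, plus the percentages a pie chart displays, can be sketched with pandas alone. This example uses the year 2000 row from the table above, because that is the latest year whose values appear in the output. The `share_pct` column is a name introduced here for illustration.

```python
import pandas as pd

# Year 2000 row from the table above, in the same wide layout as df_population.
wide = pd.DataFrame(
    {
        "United States": [281710909.0],
        "India": [1056576000.0],
        "China": [1290551000.0],
        "Indonesia": [211513823.0],
        "Brazil": [174790340.0],
    },
    index=pd.Index([2000.0], name="year"),
)
wide.columns.name = "country"

# Transpose, move the countries into a regular column, and give the year a string label.
by_country = wide.T.reset_index().rename(columns={2000: "2000"})

# Each slice of the pie is the country's share of the total.
by_country["share_pct"] = by_country["2000"] / by_country["2000"].sum() * 100
print(by_country)
```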

## 8. Scatter plot

In [15]:
df_population.iplot(kind='scatter',mode='markers',xTitle='Year',
                    yTitle='Population',title='Year vs Population')