![BTS](img/Logo-BTS.jpg)

# Session 15: Advanced Visualization

### Juan Luis Cano Rodríguez <juan.cano@bts.tech> - Data Science Foundations (2018-11-16)

Open this notebook in Google Colaboratory: [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Juanlu001/bts-mbds-data-science-foundations/blob/master/sessions/15-Advanced-Visualization.ipynb)

# Assignment 3 - Interactive Plotly

### PELIN GUNDOGDU

## Exercise 1: "Gapminder" interactive visualization

We will reproduce an example similar to this:

from IPython.display import YouTubeVideo
YouTubeVideo("jbkSRLYSojo", width=800, height=600)

1. Load all the datasets in the `data/gapminder` directory, indexing them by `Country`.
2. Create a function that receives a `year` _as an integer_ and returns a new dataframe with `Country` as the index and the columns `Fertility`, `Life expectancy`, `Population` and `Group`.
3. Create a Plotly `FigureWidget` and visualize a scatter plot of `Life expectancy` vs `Fertility`, using the `Population` as bubble size (you will need some scaling) and coloring by `Group`. _Hint: it will be easier to do as many scatters as regions_
4. Decorate the figure with proper X and Y axis labels, a title, a big text showing the year, and a legend (if not present). _Note: The legend might not show the colors_
5. Create a function `update_year` that receives a `year` _as an integer_ and updates the data of the existing figure with the values from the selected year. _Note: The update might not be very efficient_
6. Create a horizontal slider that ranges from the minimum to the maximum year
7. Bind the `update_year` function to changes in the horizontal slider and use it to interactively change the plot

1. Load all the datasets in the `data/gapminder` directory, indexing them by `Country`.

In [1]:
import pandas as pd
import numpy as np
from plotly import graph_objs as go
from ipywidgets import interact

In [2]:
fertility = pd.read_csv("data/gapminder/fertility.csv", index_col="Country")
life_exp = pd.read_csv("data/gapminder/life_expectancy.csv", index_col="Country")
population = pd.read_csv("data/gapminder/population.csv", index_col="Country")
region = pd.read_csv("data/gapminder/regions.csv", index_col="Country")

fertility.head()
life_exp.head()
population.head()
region.head()
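To see what `index_col="Country"` does without the real files, here is a minimal sketch that reads a tiny CSV from memory. The country names and values are invented for illustration and are not taken from the Gapminder files; only the wide layout (one column per year) is assumed to match.

```python
import io

import pandas as pd

# Hypothetical two-row sample in a wide layout: one column per year
sample_csv = io.StringIO(
    "Country,1964,1965\n"
    "Aland,5.1,5.0\n"
    "Borduria,2.9,2.8\n"
)

sample = pd.read_csv(sample_csv, index_col="Country")

# Country names become the row labels, and the year headers stay as strings
index_name = sample.index.name
year_labels = list(sample.columns)
value = sample.loc["Aland", "1964"]
```

Because the year labels are strings, any lookup by an integer year has to convert it with `str(year)` first.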

2. Create a function that receives a `year` _as an integer_ and returns a new dataframe with `Country` as the index and the columns `Fertility`, `Life expectancy`, `Population` and `Group`.

In [3]:
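def country_year(year):
    """Return one row per country with the indicators for the given year."""
    # The CSV column labels are strings, so the integer year is converted first
    col = str(year)
    return pd.DataFrame({
        "Population": population[col],
        "Life expectancy": life_exp[col],
        "Fertility": fertility[col],
        "Group": region["Group"],
    })

# Quick check:
# country_year(1964).head()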
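Building a dataframe from a dictionary of Series aligns the rows on their index, which is why indexing every table by `Country` matters. The sketch below illustrates this with made-up miniature tables (the `_s` names and all values are hypothetical); a country missing from one table ends up with `NaN` in that column instead of raising an error.

```python
import pandas as pd

# Hypothetical miniature versions of the four tables (values are made up)
both = pd.Index(["Aland", "Borduria"], name="Country")
fertility_s = pd.DataFrame({"1964": [5.1, 2.9]}, index=both)
life_exp_s = pd.DataFrame({"1964": [61.0, 70.5]}, index=both)
region_s = pd.DataFrame({"Group": ["north", "south"]}, index=both)
# Borduria is deliberately missing from the population table
population_s = pd.DataFrame(
    {"1964": [1_000_000]}, index=pd.Index(["Aland"], name="Country")
)


def country_year_sketch(year):
    """Combine the sample tables for one year, aligning rows by country."""
    col = str(year)
    return pd.DataFrame({
        "Fertility": fertility_s[col],
        "Life expectancy": life_exp_s[col],
        "Population": population_s[col],
        "Group": region_s["Group"],
    })


snapshot = country_year_sketch(1964)
```

This is also why the plotting code needs something like `fillna` before computing bubble sizes: missing populations are common once tables from different sources are combined.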

3. Create a Plotly `FigureWidget` and visualize a scatter plot of `Life expectancy` vs `Fertility`, using the `Population` as bubble size (you will need some scaling) and coloring by `Group`. _Hint: it will be easier to do as many scatters as regions_
4. Decorate the figure with proper X and Y axis labels, a title, a big text showing the year, and a legend (if not present). _Note: The legend might not show the colors_
5. Create a function `update_year` that receives a `year` _as an integer_ and updates the data of the existing figure with the values from the selected year. _Note: The update might not be very efficient_

In [4]:
# Questions 3 and 4 - create the FigureWidget and label the axes
fig = go.FigureWidget()
fig.layout.xaxis.title = "Fertility"
fig.layout.yaxis.title = "Life expectancy"

In [5]:
# Question 5 - update_year function
def update_year(year):
    # Remove the existing traces and redraw them from scratch (simple, but not efficient)
    fig.data = []
    # Question 4 - the axis labels are set when the figure is created; the title carries the year
    fig.layout.title = "Fertility vs Life expectancy - " + str(year)
    df_year = country_year(year)

    # Question 3 - one scatter trace per group, so each group gets its own color and legend entry
    for group_name, sub_df in df_year.groupby("Group"):
        fig.add_scatter(
            x=sub_df["Fertility"],
            y=sub_df["Life expectancy"],
            mode="markers",
            marker={
                # Square-root scaling keeps the bubble area roughly proportional to population
                "size": np.sqrt(sub_df["Population"].fillna(1)) / 400
            },
            name=group_name,
        )

    return fig
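The bubble scaling above uses a hand-tuned divisor. As an alternative sketch, Plotly markers also accept `sizemode="area"` together with a `sizeref` value, and Plotly's bubble chart documentation suggests computing `sizeref` as `2 * max(size) / desired_max_diameter ** 2`. The helper name `area_marker` and the 60 px default are assumptions made for this example, not part of the assignment.

```python
import numpy as np


def area_marker(population, max_diameter_px=60):
    """Build a marker dict whose bubble area is proportional to population."""
    # Treat missing populations as zero so they do not break the scaling
    sizes = np.nan_to_num(np.asarray(population, dtype=float), nan=0.0)
    # Scaling rule suggested in Plotly's bubble chart documentation
    sizeref = 2.0 * sizes.max() / max_diameter_px ** 2
    return {
        "size": sizes,
        "sizemode": "area",
        "sizeref": sizeref,
        "sizemin": 2,  # keep very small countries visible
    }


marker = area_marker([1_000_000, 250_000, np.nan])
```

The resulting dictionary could be passed as `marker=` to `fig.add_scatter`. To keep bubble sizes comparable between years, `sizeref` would need to be computed once from the largest population across all years rather than per call.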

6. Create a horizontal slider that ranges from the minimum to the maximum year

In [6]:
import ipywidgets as widgets

year_bar = widgets.IntSlider(
    # The column labels are strings such as "1964", so convert them explicitly
    value=int(fertility.columns[0]),
    min=int(fertility.columns[0]),
    max=int(fertility.columns[-1]),
    step=1,
    description='Year:',
    disabled=False,
    continuous_update=True,
    orientation='horizontal',
    readout=True,
    readout_format='d'
)
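Taking the first and last column assumes the year columns are already sorted and that no other columns are present. A slightly more defensive sketch derives the bounds from all numeric-looking labels; `year_bounds` is a hypothetical helper name, and the column labels in the example are made up.

```python
import pandas as pd


def year_bounds(df):
    """Return (first_year, last_year) as integers from string column labels."""
    # Keep only labels that look like years, then compare them as integers
    years = [int(label) for label in df.columns if str(label).isdigit()]
    return min(years), max(years)


# Hypothetical wide table whose columns are out of order
wide = pd.DataFrame(columns=["1965", "2013", "1964"])
first_year, last_year = year_bounds(wide)
```

The two values could then be passed as `min=` and `max=` when creating the slider.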


7. Bind the `update_year` function to changes in the horizontal slider and use it to interactively change the plot

In [7]:
@interact(x=year_bar)
def h(x):
    return update_year(x)
