# IS 445 - Group 5 - Final Project
#### Group Members: Nataly Panczyk, Betty Guerrero, Thomas McShane, Shubham Jain

## Objective

The goal of this project was to create a variety of complementary visualizations using data collected from the 2020 World Happiness Report. This notebook walks through the final visualizations created by our group, with a description of each visualization, the major challenges in creating it, and why we think it is valuable.

## About the Data
Source: https://worldhappiness.report/ed/2021/#appendices-and-data

The first eight World Happiness Reports were produced by the founding trio of co-editors, who assembled in Thimphu in July 2011 pursuant to the Bhutanese Resolution passed by the General Assembly in June 2011. The resolution invited national governments to "give more importance to happiness and well-being in determining how to achieve and measure social and economic development." The overall goal of the organization is to create new metrics for measuring economic progress around the world beyond fiscal quantities. This dataset was particularly interesting to us because it is funded and supported by prominent universities, non-profits, and private companies, and it shows what these institutions can accomplish when they work together for a good cause. The dataset aims to benefit the development of our world for the people who live in it, which is what inspired us to choose it for our final visualization project.

## Comparing Life Expectancies in Different Countries From 2005 to 2020
The first component of this dataset that our group analyzed was life expectancy in each country over time. We created this visualization by first sorting the data by year and then plotting the points in two different ways using ipywidgets. The plot first displays a checkbox that defaults to the static visualization, which shows the life expectancy in every country with a different colored marker (as indicated by the legend) for each year in the dataset, 2005-2020. Unchecking this box makes a new widget appear, a slider that lets the user set the end year of the plotted data. This lets the user step through the years one at a time and more easily identify the countries with the greatest changes in life expectancy. The greatest difficulty we faced in creating this visualization was getting ipywidgets to update our plots when the slider and checkbox were used. We eventually resolved this by using the %matplotlib notebook command instead of %matplotlib inline.
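
As a rough sketch of the per-year grouping described above, the same slices can be obtained with pandas `groupby` instead of tracking row offsets by hand. The rows below are made-up sample data, and the column names `country` and `life_exp` are hypothetical stand-ins for the corresponding columns of the real CSV (only `year` appears by name in our notebook). The helper `rows_by_year` is also just an illustration, not part of our code.

```python
import pandas as pd

# Made-up sample rows; 'country' and 'life_exp' are assumed column names.
sample = pd.DataFrame({
    "country": ["A", "B", "A", "B", "A"],
    "year": [2005, 2005, 2006, 2006, 2007],
    "life_exp": [60.0, 70.0, 60.5, 70.2, 61.0],
})

def rows_by_year(df, end_year):
    """Return {year: sub-DataFrame} for every year strictly before end_year."""
    subset = df[df["year"] < end_year]
    return {int(year): group for year, group in subset.groupby("year")}

# end_year is exclusive, mirroring the slider's end-year convention
groups = rows_by_year(sample, end_year=2007)
for year, group in groups.items():
    print(year, list(group["country"]), list(group["life_exp"]))
```

Each entry of `groups` could then be passed to `ax.plot` with `label=str(year)` to draw one marker series per year.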

In [14]:
# import necessary libraries
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colorbar
import pandas as pd
from ipywidgets import interact, interactive, fixed, interact_manual
import ipywidgets as widgets

%matplotlib notebook

# use pandas to read the csv from the World Happiness Report
url = 'https://raw.githubusercontent.com/npanczyk/npanczyk.github.io/main/happiness.csv'
df = pd.read_csv(url)
# sort the data by year
sorted_df = df.sort_values(by='year')
# make an array of the sorted data
sorted_arr = sorted_df.values
# array of years (in order)
ordered_years = sorted_arr[:,1]
# cumulative row counts: l_list[i] is the number of rows with year <= year_list[i],
# i.e. the index just past the last entry for that year in the sorted data
year_list = np.arange(2005, 2021, 1)
l_list = np.searchsorted(sorted_df['year'].to_numpy(), year_list, side='right')
# rows 0..l-1 are the 2005 (earliest year) entries
l = int(l_list[0])
early_data = sorted_arr[0:l,:]
# repeat this process in reverse to find the 2020 data
sorted2_df = df.sort_values(by='year', ascending=False)
sorted2_arr = sorted2_df.values
# array of years (in reverse order)
reverse_years = sorted2_arr[:,1]
# find the index of the last entry of 2020 (latest year)
l2 = 0
for i in reverse_years:
    if i == 2020:
        l2 += 1
    else:
        break
        
late_data = sorted2_arr[0:l2,:]
late_countries = late_data[:,0]

total_countries = np.unique(sorted_arr[:,0])

# 2005 data to plot
early_countries = early_data[:,0]
early_life_exp = early_data[:,5]


# note: Matplotlib 3.6+ renamed this style to 'seaborn-v0_8-darkgrid'
plt.style.use('seaborn-darkgrid')
fig, ax = plt.subplots(figsize=(25,8))

def make_plot(year):
    # plot one marker series (one color) for the given year
    if year == 2005:
        countries, life_exp = early_countries, early_life_exp
    else:
        # position of this year in year_list; l_list holds cumulative row counts,
        # so the rows for this year run from l_list[stop - 1] up to l_list[stop]
        stop = int(np.where(year_list == year)[0][0])
        data = sorted_arr[int(l_list[stop - 1]):int(l_list[stop]),:]
        countries = data[:,0]
        life_exp = data[:,5]
    ax.plot(countries, life_exp, marker='o', ls='', label=str(year))
    ax.set_title('Life Expectancy by Country from 2005-2020')
    ax.set_xlabel('Country')
    ax.set_ylabel('Life Expectancy at Birth (Years)')
    ax.set_xticks(total_countries)
    ax.set_xticklabels(total_countries,rotation=90)
    ax.legend(loc='upper right')

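def static_plot(Static):
    # start from empty axes so re-checking the box does not draw duplicate series
    ax.cla()
    if Static:
        for i in year_list:
            make_plot(i)
    else:
        # year2 is an exclusive end year, so 2021 includes the 2020 data
        interact(slide, year2=(2007,2021,1))

def slide(year2):
    # redraw every year from 2005 up to (but not including) year2
    ax.cla()
    for i in np.arange(2005,year2,1):
        make_plot(i)

# trailing semicolon keeps the notebook from echoing the returned function
interact(static_plot, Static=True);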


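
The slider is meant to help spot the countries whose life expectancy changed the most. As a complementary sketch, that change can also be computed directly. The data below is made up, the column names `country` and `life_exp` are assumptions rather than the real CSV headers, and `life_exp_change` is a hypothetical helper that is not part of our notebook.

```python
import pandas as pd

# Made-up sample rows; 'country' and 'life_exp' are assumed column names.
sample = pd.DataFrame({
    "country": ["A", "A", "B", "B", "C"],
    "year": [2005, 2020, 2005, 2020, 2020],
    "life_exp": [60.0, 66.0, 70.0, 71.5, 65.0],
})

def life_exp_change(df, start_year, end_year):
    """Change in life expectancy per country between two years, largest first."""
    # One row per country, one column per year; missing years become NaN.
    wide = df.pivot_table(index="country", columns="year", values="life_exp")
    change = wide[end_year] - wide[start_year]
    # Drop countries that lack data for either endpoint.
    return change.dropna().sort_values(ascending=False)

change = life_exp_change(sample, 2005, 2020)
print(change)
```

Countries without a value in either year (country C here) are dropped rather than guessed, which matters for this dataset because not every country is surveyed every year.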