Analysis of Happiness in the World From 2015 to 2019

Alex Xiao and Henrique Corte

# Outline: 
    1. Introduction
        1.1 Motivation
        1.2 Libraries Used
    2. Data Collection
    3. Data Management/Representation
    4. Data Analysis and Visualization
    5. Hypothesis Testing and Machine Learning
    6. Conclusions

# 1. Introduction
    1.1 Motivation:
        Happiness has become an increasingly important part of our lives as quality of life has improved, and people have begun to embrace being happy in the moment. As we seek more of this intangible quality, we have started to categorize "happiness" and break it down into its apparently relevant components in order to understand it better. While there are many things an individual can do to improve their own happiness, such as practicing meditation and gratitude, this tutorial examines larger, nationwide factors that can affect happiness.
        
    1.2 Libraries Used:
        pandas: used to store and inspect data in DataFrames
        collections: used for general data structure manipulation (defaultdict)
        Matplotlib: used to generate plots
        scikit-learn (sklearn): used to analyze data and make predictions
        NumPy: used for the numerical arrays that pandas builds on
        statistics: Python standard-library module for general statistical functions
        SciPy: used for some statistical analysis
        statsmodels: used for additional statistical analysis

In [1]:
import pandas as pd
from collections import defaultdict
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
import numpy as np
import statistics
import scipy
import statsmodels.api as sm

Data source (World Happiness Report dataset on Kaggle): https://www.kaggle.com/unsdsn/world-happiness

In [7]:
# Load one CSV per report year (2015-2019); sep=',' is the pandas default
data_15 = pd.read_csv('2015.csv', sep=',')
data_16 = pd.read_csv('2016.csv', sep=',')
data_17 = pd.read_csv('2017.csv', sep=',')
data_18 = pd.read_csv('2018.csv', sep=',')
data_19 = pd.read_csv('2019.csv', sep=',')
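
The five reads above follow the same pattern, so they could also be written as a loop. The sketch below is one possible variant, not the notebook's own code: the helper name `load_years` and the added `Year` column are assumptions introduced for illustration. To run without the Kaggle files, the demo reads a two-row in-memory CSV whose values are taken from the `data_15` output shown in this notebook.

```python
import io
import pandas as pd

def load_years(sources):
    """Read one CSV per year and tag each row with its report year.

    `sources` maps a year (int) to a file path or file-like object.
    `load_years` and the `Year` column are illustrative names, not part
    of the original notebook.
    """
    frames = {}
    for year, src in sources.items():
        df = pd.read_csv(src)   # comma is already the default separator
        df["Year"] = year       # keeps the year visible if frames are combined later
        frames[year] = df
    return frames

# In-memory stand-in for '2015.csv' (two rows copied from the displayed output)
sample_csv = io.StringIO("Country,Happiness Score\nDenmark,7.526\nTogo,3.303\n")
demo = load_years({2015: sample_csv})
print(demo[2015])
```

With the real files in the working directory, the equivalent call would be `load_years({y: f"{y}.csv" for y in range(2015, 2020)})`, assuming the file names used in the cell above.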

In [6]:
data_15

Unnamed: 0,Country,Region,Happiness Rank,Happiness Score,Lower Confidence Interval,Upper Confidence Interval,Economy (GDP per Capita),Family,Health (Life Expectancy),Freedom,Trust (Government Corruption),Generosity,Dystopia Residual
0,Denmark,Western Europe,1,7.526,7.460,7.592,1.44178,1.16374,0.79504,0.57941,0.44453,0.36171,2.73939
1,Switzerland,Western Europe,2,7.509,7.428,7.590,1.52733,1.14524,0.86303,0.58557,0.41203,0.28083,2.69463
2,Iceland,Western Europe,3,7.501,7.333,7.669,1.42666,1.18326,0.86733,0.56624,0.14975,0.47678,2.83137
3,Norway,Western Europe,4,7.498,7.421,7.575,1.57744,1.12690,0.79579,0.59609,0.35776,0.37895,2.66465
4,Finland,Western Europe,5,7.413,7.351,7.475,1.40598,1.13464,0.81091,0.57104,0.41004,0.25492,2.82596
...,...,...,...,...,...,...,...,...,...,...,...,...,...
152,Benin,Sub-Saharan Africa,153,3.484,3.404,3.564,0.39499,0.10419,0.21028,0.39747,0.06681,0.20180,2.10812
153,Afghanistan,Southern Asia,154,3.360,3.288,3.432,0.38227,0.11037,0.17344,0.16430,0.07112,0.31268,2.14558
154,Togo,Sub-Saharan Africa,155,3.303,3.192,3.414,0.28123,0.00000,0.24811,0.34678,0.11587,0.17517,2.13540
155,Syria,Middle East and Northern Africa,156,3.069,2.936,3.202,0.74719,0.14866,0.62994,0.06912,0.17233,0.48397,0.81789
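
Before any analysis, it can help to check a frame's size, column names, and missing values. The following is a minimal sketch, rebuilt from three rows of the output above so that it runs without the CSV file. The variable names `sample`, `missing`, and `score_summary` are assumptions for illustration; on the real data the same calls would be applied to `data_15`.

```python
import pandas as pd

# Three rows and four columns copied from the displayed output
sample = pd.DataFrame({
    "Country": ["Denmark", "Switzerland", "Syria"],
    "Region": ["Western Europe", "Western Europe",
               "Middle East and Northern Africa"],
    "Happiness Rank": [1, 2, 156],
    "Happiness Score": [7.526, 7.509, 3.069],
})

print(sample.shape)             # (rows, columns)
print(sample.columns.tolist())  # column names, useful when years differ
print(sample.dtypes)            # confirms numeric columns parsed as numbers

missing = sample.isna().sum()   # count of missing values per column
print(missing)

score_summary = sample["Happiness Score"].describe()  # count, mean, min, max, ...
print(score_summary)
```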


63892

Unnamed: 0,arrest,age,race,sex,arrestDate,arrestTime,arrestLocation,incidentOffense,incidentLocation,charge,chargeDescription,district,post,neighborhood,Location 1,lat,long
88,11127208.0,50,B,M,01/01/2011,21:00:00,4500 Marble Hall Rd,24-Towed Vehicle,Winston Av & The Alameda,4 3550,Cds:Possess-Not Marihuana || Poss.Cocaine,NORTHEASTERN,421.0,New Northwood,"(39.3449490442, -76.5971162623)",39.344949,-76.597116
89,11127205.0,42,B,M,01/01/2011,21:00:00,5100 The Alameda St,24-Towed Vehicle,Winston Av & The Alameda,1 0573,Cds: Possession-Marihuana || Poss Marijuana,NORTHERN,524.0,Kenilworth Park,"(39.3490312605, -76.5991216997)",39.349031,-76.599122
91,11127242.0,52,B,M,01/01/2011,22:15:00,6800 Mcclean Blvd,4D-Agg. Asslt.- Hand,1000 Aliceanna St,1 1415,Asslt-Sec Degree || Common Assault,NORTHEASTERN,424.0,Harford-Echodale/Perring Parkway,"(39.3705311607, -76.5670452261)",39.370531,-76.567045
92,11127238.0,56,B,F,01/01/2011,22:16:00,6800 Mcclean Blvd,4D-Agg. Asslt.- Hand,1000 Aliceanna St,1 1415,Asslt-Sec Degree || Common Assault,NORTHEASTERN,424.0,Harford-Echodale/Perring Parkway,"(39.3704721295, -76.5670520131)",39.370472,-76.567052
96,11127209.0,27,U,M,01/01/2011,23:40:00,1700 Ramsay St,87O-Narcotics (Outside),1700 Ramsay St,2 2220,Trespass: Private Property || Cds Violation,SOUTHERN,935.0,New Southwest/Mount Clare,"(39.2832459671, -76.6442837779)",39.283246,-76.644284


array(['79-Other', 'Unknown Offense', '81-Recovered Property',
       '54-Armed Person', '20A-Followup', '4E-Common Assault',
       '87-Narcotics', '4B-Agg. Asslt.- Cut', '75-Destruct. Of Property',
       '55-Disorderly Person', '4C-Agg. Asslt.- Oth.',
       '4D-Agg. Asslt.- Hand', '6D-Larceny- From Auto',
       '5A-Burg. Res. (Force)', '111-Protective Order',
       '24-Towed Vehicle', '87O-Narcotics (Outside)',
       '29-Driving While Intox.', '49-Family Disturbance',
       '6G-Larceny- From Bldg.', '97-Search & Seizure', '7A-Stolen Auto',
       '5D-Burg. Oth. (Force)', '23-Unauthorized Use', '55A-Prostitution',
       '3LF-Robb Bank-Firearm', '6C-Larceny- Shoplifting',
       '4A-Agg. Asslt.- Gun', '109-Loitering', '6E-Larceny- Auto Acc',
       '3AF-Robb Hwy-Firearm', '88-Unfounded Call', '3K-Robb Res. (Ua)',
       '26-Recovered Vehicle', '4F-Assault By Threat',
       '61-Person Wanted On War', '73-False Pretense',
       '3AK-Robb Hwy-Knife', '5C-Burg. Res. (Noforce)',
  