# **WORLD HAPPINESS REPORT**
## **I. Overview**
### 1. What makes people in a country happy?
#### Task Details
* This dataset shows the happiest countries on earth, which is great info when you're looking for your next move. But what if you wanted to create a new country with the goal of having the happiest citizens? What if you're a president looking to improve their country? How would you do that?
* The goal of this task is to find out what factors contribute to happiness. You can join any other data and use any insights you might have that show a strong correlation between the factors you come up with.

#### Expected Submission
* Kaggle Notebooks are expected as the primary means, although I won't discard any other submitted information (e.g. a presentation submitted as a dataset). All solutions should contain the correlation between the factors you discovered and the happiness score of this dataset (I'm looking for the actual number). Any charts explaining this correlation will also help.

#### Evaluation
* This is highly subjective. The best solutions will have a good correlation score, be creative, and explain the work well.

In [None]:
import numpy as np
import pandas as pd
from plotly.offline import init_notebook_mode, iplot, plot
import plotly as py
init_notebook_mode(connected=True)
import plotly.graph_objs as go
from wordcloud import WordCloud
import matplotlib.pyplot as plt

In [None]:
year_2015 = pd.read_csv('/kaggle/input/world-happiness/2015.csv')
year_2016 = pd.read_csv('/kaggle/input/world-happiness/2016.csv')
year_2017 = pd.read_csv('/kaggle/input/world-happiness/2017.csv')
year_2018 = pd.read_csv('/kaggle/input/world-happiness/2018.csv')
year_2019 = pd.read_csv('/kaggle/input/world-happiness/2019.csv')

### **2. Dataset**
#### Context
The World Happiness Report is a landmark survey of the state of global happiness. The first report was published in 2012, the second in 2013, the third in 2015, and the fourth in the 2016 Update. The World Happiness 2017, which ranks 155 countries by their happiness levels, was released at the United Nations at an event celebrating International Day of Happiness on March 20th. The report continues to gain global recognition as governments, organizations and civil society increasingly use happiness indicators to inform their policy-making decisions. Leading experts across fields – economics, psychology, survey analysis, national statistics, health, public policy and more – describe how measurements of well-being can be used effectively to assess the progress of nations. The reports review the state of happiness in the world today and show how the new science of happiness explains personal and national variations in happiness.

#### Content
The happiness scores and rankings use data from the Gallup World Poll. The scores are based on answers to the main life evaluation question asked in the poll. This question, known as the Cantril ladder, asks respondents to think of a ladder with the best possible life for them being a 10 and the worst possible life being a 0 and to rate their own current lives on that scale. The scores are from nationally representative samples for the years 2013-2016 and use the Gallup weights to make the estimates representative. The columns following the happiness score estimate the extent to which each of six factors – economic production, social support, life expectancy, freedom, absence of corruption, and generosity – contribute to making life evaluations higher in each country than they are in Dystopia, a hypothetical country that has values equal to the world’s lowest national averages for each of the six factors. They have no impact on the total score reported for each country, but they do explain why some countries rank higher than others.
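To make the role of the survey weights concrete, here is a minimal sketch of a weighted mean over ladder answers. It only illustrates the general idea of weighting respondents; Gallup's actual weighting procedure is not described in this dataset, and the function name and the sample numbers below are made up for illustration.

```python
import numpy as np

def weighted_ladder_score(answers, weights):
    """Weighted mean of 0-10 Cantril ladder answers (illustrative only)."""
    answers = np.asarray(answers, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if answers.shape != weights.shape:
        raise ValueError("answers and weights must have the same length")
    if np.any((answers < 0) | (answers > 10)):
        raise ValueError("ladder answers must lie between 0 and 10")
    return float(np.average(answers, weights=weights))

# Made-up respondents: the second one counts twice as much as the others.
example_score = weighted_ladder_score([8, 5, 3], [1.0, 2.0, 1.0])
```

With equal weights this reduces to the plain average of the answers.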
#### Inspiration
What countries or regions rank the highest in overall happiness and each of the six factors contributing to happiness? How did country ranks or scores change between the 2015 and 2016 as well as the 2016 and 2017 reports? Did any country experience a significant increase or decrease in happiness?

##### 1. What is Dystopia?
* Dystopia is an imaginary country that has the world’s least-happy people. The purpose in establishing Dystopia is to have a benchmark against which all countries can be favorably compared (no country performs more poorly than Dystopia) in terms of each of the six key variables, thus allowing each sub-bar to be of positive width. The lowest scores observed for the six key variables, therefore, characterize Dystopia. Since life would be very unpleasant in a country with the world’s lowest incomes, lowest life expectancy, lowest generosity, most corruption, least freedom and least social support, it is referred to as “Dystopia,” in contrast to Utopia.

##### 2. What are the residuals?
* The residuals, or unexplained components, differ for each country, reflecting the extent to which the six variables either over- or under-explain average 2014-2016 life evaluations. These residuals have an average value of approximately zero over the whole set of countries. Figure 2.2 shows the average residual for each country when the equation in Table 2.1 is applied to average 2014-2016 data for the six variables in that country. We combine these residuals with the estimate for life evaluations in Dystopia so that the combined bar will always have positive values. As can be seen in Figure 2.2, although some life evaluation residuals are quite large, occasionally exceeding one point on the scale from 0 to 10, they are always much smaller than the calculated value in Dystopia, where the average life is rated at 1.85 on the 0 to 10 scale.

##### 3. What do the columns succeeding the Happiness Score (like Family, Generosity, etc.) describe?

* The following columns: GDP per Capita, Family, Life Expectancy, Freedom, Generosity, Trust Government Corruption describe the extent to which these factors contribute to the happiness evaluation in each country.
* The Dystopia Residual metric is the Dystopia Happiness Score (1.85) plus the residual, i.e. the unexplained value for each country, as stated in the previous answer.
* If you add all these factors up, you get the happiness score, so it might be unreliable to model them to predict Happiness Scores.
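The claim that the factor columns add up to the score can be checked directly. The sketch below is one way to do it; the helper name `components_sum_to_score`, the tolerance, and the tiny demo frame are assumptions for illustration and are not real data.

```python
import numpy as np
import pandas as pd

def components_sum_to_score(df, components, score_col, tol=1e-3):
    """Boolean Series: True where the component columns add up to the score."""
    total = df[components].sum(axis=1)
    return pd.Series(np.isclose(total, df[score_col], atol=tol), index=df.index)

# Made-up frame: the first row adds up, the second does not.
demo = pd.DataFrame({
    'factor_a': [1.2, 0.4],
    'factor_b': [0.8, 0.3],
    'residual': [2.0, 1.5],
    'score':    [4.0, 3.0],
})
demo_check = components_sum_to_score(demo, ['factor_a', 'factor_b', 'residual'], 'score')
```

On the 2015 file, the call would look like `components_sum_to_score(year_2015, cols, 'Happiness Score').all()`, where `cols` lists the six factor columns plus `'Dystopia Residual'`. The exact column names should be taken from `year_2015.columns`.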



In [None]:
year_2015.head()

In [None]:
year_2015.info()

In [None]:
year_2016.head()

In [None]:
year_2016.info()

In [None]:
year_2017.head()

In [None]:
year_2017.info()

In [None]:
year_2018.head()

In [None]:
year_2018['Perceptions of corruption'] = year_2018['Perceptions of corruption'].fillna(0) # replace NaN by 0
year_2018.info()

In [None]:
year_2019.head()

In [None]:
year_2019.info()
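The five files do not share column names (for example `Happiness Score`, `Happiness.Score` and `Score`, or `Country` and `Country or region`). A possible way to compare years more easily is to rename the columns to one spelling. The mapping below is a sketch built only from the column names used in this notebook; `COLUMN_ALIASES` and `harmonize` are hypothetical names, and the mapping would need extending for any other columns.

```python
import pandas as pd

# Spellings seen in this notebook, mapped to the 2018/2019 style names.
COLUMN_ALIASES = {
    'Country or region': 'Country',
    'Happiness Score': 'Score',
    'Happiness.Score': 'Score',
    'Economy (GDP per Capita)': 'GDP per capita',
    'Economy..GDP.per.Capita.': 'GDP per capita',
    'Health (Life Expectancy)': 'Healthy life expectancy',
    'Health..Life.Expectancy.': 'Healthy life expectancy',
    'Freedom': 'Freedom to make life choices',
}

def harmonize(df, year):
    """Return a copy of df with unified column names and a 'Year' column."""
    out = df.rename(columns=COLUMN_ALIASES).copy()
    out['Year'] = year
    return out

# Made-up one-row frames, only to show that differently named columns line up.
demo_2015 = pd.DataFrame({'Country': ['A'], 'Happiness Score': [7.0]})
demo_2018 = pd.DataFrame({'Country or region': ['A'], 'Score': [7.1]})
combined = pd.concat([harmonize(demo_2015, 2015), harmonize(demo_2018, 2018)],
                     ignore_index=True)
```

`Family` (2015 to 2017) and `Social support` (2018 and 2019) are deliberately left unmapped here, since treating them as the same measure is a judgment call.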

## **III. Result**
### **1. Happiness map 2015**

In [None]:
data = dict(type = 'choropleth', 
            colorscale = 'Viridis', 
            marker_line_width=1, 
            locations = year_2015['Country'], 
            locationmode = "country names", 
            z = year_2015['Happiness Score'], 
            text = year_2015['Country'], 
            colorbar = {'title' : 'Happiness Score'}
           )
layout = dict(title = 'Happiness Map 2015',
              geo = dict(projection = {'type':'mercator'}, showocean = False, showlakes = True, showrivers = True, )
             )
choromap = go.Figure(data = [data],layout = layout)
iplot(choromap,validate=False)

#### **Comment:**
* Looking at the Happiness Map 2015, the highest scores belong almost entirely to countries with developed economies. This suggests that the economy plays a big role in people's happiness.

### **2. Line chart of Happiness Score from 2015 to 2019 by country**

In [None]:
# Use the 2015 country order as the shared x-axis. Each file is sorted by that
# year's rank, so scores must be aligned by country name, not by row position.
# Countries missing from a year (or spelled differently) show up as gaps.
countries = year_2015['Country']

def scores_by_country(df, country_col, score_col):
    """Return df's scores reordered to match the 2015 country order."""
    return df.set_index(country_col)[score_col].reindex(countries).values

# Creating trace for 2015
trace_5 = go.Scatter(x = countries,
                    y = year_2015['Happiness Score'],
                    mode = "lines+markers",
                    name = "2015",
                    marker = dict(color = 'red'),
                    text= countries)

# Creating trace for 2016
trace_6 = go.Scatter(x = countries,
                    y = scores_by_country(year_2016, 'Country', 'Happiness Score'),
                    mode = "lines+markers",
                    name = "2016",
                    marker = dict(color = 'green'),
                    text= countries)

# Creating trace for 2017
trace_7 = go.Scatter(x = countries,
                    y = scores_by_country(year_2017, 'Country', 'Happiness.Score'),
                    mode = "lines+markers",
                    name = "2017",
                    marker = dict(color = 'blue'),
                    text= countries)

# Creating trace for 2018
trace_8 = go.Scatter(x = countries,
                    y = scores_by_country(year_2018, 'Country or region', 'Score'),
                    mode = "lines+markers",
                    name = "2018",
                    marker = dict(color = 'yellow'),
                    text= countries)

# Creating trace for 2019
trace_9 = go.Scatter(x = countries,
                    y = scores_by_country(year_2019, 'Country or region', 'Score'),
                    mode = "lines+markers",
                    name = "2019",
                    marker = dict(color = 'black'),
                    text= countries)
data = [trace_5, trace_6, trace_7, trace_8, trace_9]
layout = dict(title = 'Happiness Score from 2015 to 2019',
              xaxis= dict(title= 'Countries',ticklen= 5,zeroline= False),
              yaxis= dict(title= 'Happiness Score',ticklen= 5,zeroline= False),
              hovermode="x unified")

fig = dict(data = data, layout = layout)
iplot(fig)

#### Comment:
* Switzerland has the highest happiness score in 2015 (the top-ranked country differs in other years).
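Rather than reading the top country off the chart, it can be looked up per year. The helper below is a small sketch; `happiest_country` is a hypothetical name and the demo frame is made up.

```python
import pandas as pd

def happiest_country(df, country_col, score_col):
    """Return (country, score) for the row with the highest score."""
    top = df.loc[df[score_col].idxmax()]
    return top[country_col], float(top[score_col])

# Made-up data, only to show the call.
demo = pd.DataFrame({'Country': ['A', 'B', 'C'],
                     'Happiness Score': [5.1, 7.3, 6.0]})
top_country, top_score = happiest_country(demo, 'Country', 'Happiness Score')
```

For the notebook's data the calls would be, for example, `happiest_country(year_2015, 'Country', 'Happiness Score')` and `happiest_country(year_2019, 'Country or region', 'Score')`.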

### **3. The correlation between the other features and Happiness Score**


In [None]:
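# Correlate numeric columns only (country and region names are strings)
year_2015.select_dtypes(include='number').corr(method ='pearson')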

#### **Comment:** 
The features most strongly correlated with the Happiness Score in 2015 (Pearson coefficient > 0.5) are: Economy, Family, Health, Freedom

In [None]:
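# Correlate numeric columns only (country and region names are strings)
year_2016.select_dtypes(include='number').corr(method ='pearson')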

#### **Comment:** 
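The features most strongly correlated with the Happiness Score in 2016 (Pearson coefficient > 0.5) are: Confidence Interval (the bounds of the score estimate, so a high correlation is expected), Economy, Family, Health, Freedom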

In [None]:
# Correlate numeric columns only (country names are strings)
year_2017.select_dtypes(include='number').corr(method ='pearson')

#### **Comment:** 
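The features most strongly correlated with the Happiness Score in 2017 (Pearson coefficient > 0.5) are: Whisker (the bounds of the score estimate, so a high correlation is expected), Economy, Family, Health, Freedom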

In [None]:
# Correlate numeric columns only (country names are strings)
year_2018.select_dtypes(include='number').corr(method ='pearson')

#### **Comment:** 
The features most strongly correlated with the Happiness Score in 2018 (Pearson coefficient > 0.5) are: Economy (GDP per capita), Social support, Healthy life expectancy, Freedom

In [None]:
# Correlate numeric columns only (country names are strings)
year_2019.select_dtypes(include='number').corr(method ='pearson')

#### **Comment:** 
The features most strongly correlated with the Happiness Score in 2019 (Pearson coefficient > 0.5) are: Economy (GDP per capita), Social support, Healthy life expectancy, Freedom
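The lists in the comments above were read off the correlation tables by eye. A small helper can extract them instead; this is a sketch, with `strong_correlates` a hypothetical name and a made-up demo frame. Unlike the comments, it uses the absolute value of the coefficient, so strongly negative columns (such as a rank column) are reported as well.

```python
import pandas as pd

def strong_correlates(df, score_col, threshold=0.5):
    """Pearson correlations with score_col whose absolute value exceeds threshold."""
    corr = df.select_dtypes(include='number').corr(method='pearson')[score_col]
    corr = corr.drop(score_col)              # the score correlates 1.0 with itself
    strong = corr[corr.abs() > threshold]
    return strong.sort_values(ascending=False)

# Made-up data: 'gdp' rises with the score, 'rank' falls, 'noise' is unrelated.
demo = pd.DataFrame({
    'name':  ['A', 'B', 'C', 'D', 'E'],
    'Score': [1.0, 2.0, 3.0, 4.0, 5.0],
    'gdp':   [0.1, 0.2, 0.3, 0.4, 0.5],
    'noise': [3.0, 1.0, 2.0, 1.0, 3.0],
    'rank':  [5, 4, 3, 2, 1],
})
demo_result = strong_correlates(demo, 'Score')
```

On the real data this would be called as `strong_correlates(year_2015, 'Happiness Score')`, `strong_correlates(year_2017, 'Happiness.Score')` or `strong_correlates(year_2019, 'Score')`.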

### **4. Line charts of the features for each year**

In [None]:
colors = ['red', 'green', 'blue', 'yellow', 'black', 'pink']
def line_chart_feature(df, features, year):
    """
        Draw one line per feature of the dataframe (df), by country, for the given year.
        Colors are reused in order if there are more features than colors.
    """
    # The 2018 and 2019 files name the country column differently
    country = 'Country or region' if year in ('2018', '2019') else 'Country'
    traces = []
    for i, feature in enumerate(features):
        trace = go.Scatter(x = df[country],
                            y = df[feature],
                            mode = "lines+markers",
                            name = feature,
                            marker = dict(color = colors[i % len(colors)]),
                            text= df[country])
        traces.append(trace)
    layout = dict(title = 'Happiness score and features in ' + year,
              xaxis= dict(title= 'Countries',ticklen= 5,zeroline= False),
              hovermode="x unified")
    fig = dict(data = traces, layout = layout)
    iplot(fig)

In [None]:
line_chart_feature(year_2015, ['Happiness Score', 'Economy (GDP per Capita)','Family','Health (Life Expectancy)','Freedom'], '2015')

In [None]:
line_chart_feature(year_2016, ['Happiness Score', 'Upper Confidence Interval', 'Economy (GDP per Capita)','Family','Health (Life Expectancy)','Freedom'], '2016')

In [None]:
line_chart_feature(year_2017, ['Happiness.Score', 'Whisker.low','Economy..GDP.per.Capita.','Family', 'Health..Life.Expectancy.', 'Freedom'], '2017')

In [None]:
line_chart_feature(year_2018, ['Score', 'GDP per capita','Social support','Healthy life expectancy', 'Freedom to make life choices'], '2018')

In [None]:
line_chart_feature(year_2019, ['Score', 'GDP per capita','Social support','Healthy life expectancy', 'Freedom to make life choices'], '2019')

## **IV. Conclusion**
From the above visualizations and comments, the features most strongly associated with happiness in a country are:
* Economy
* Family
* Health
* Freedom
* Confidence Interval (2016 only; this is a bound on the score estimate itself rather than an independent factor)
* Social support, in recent years (2018 and 2019)