**The World Happiness Report is a landmark survey of the state of global happiness. The first report was published in 2012, the second in 2013, the third in 2015, and the fourth in the 2016 Update. The World Happiness Report 2017, which ranks 155 countries by their happiness levels, was released at the United Nations at an event celebrating International Day of Happiness on March 20th.**

# Setting up Environment

In [1]:
library(tidyverse)
library(janitor)
library(skimr)
library(here)
library(dplyr)
library(lubridate)
library(ggplot2)
library(sf)
library(rnaturalearth)

# Collecting Data

In [2]:
data_2019 <- read_csv("../input/world-happiness/2019.csv")
data_2018 <- read_csv("../input/world-happiness/2018.csv")
data_2017 <- read_csv("../input/world-happiness/2017.csv")
data_2016 <- read_csv("../input/world-happiness/2016.csv")
data_2015 <- read_csv("../input/world-happiness/2015.csv")

# Wrangling Data and Combining into a Single File

First, inspect the first rows and the column names of each data set.

In [3]:
head(data_2015)
head(data_2016)
head(data_2017)
head(data_2018)
head(data_2019)

Before the data sets can be combined, the `Perceptions of corruption` column in the 2018 data must be converted to double, because it is of type double in the other data sets.

In [4]:
data_2018$`Perceptions of corruption` <- as.double(data_2018$`Perceptions of corruption`)
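
To illustrate what this conversion does, here is a minimal sketch with toy values (the strings below are made up, not taken from the 2018 file). It assumes the column was read as text because of a non-numeric entry; `as.double()` turns any entry it cannot parse into `NA` and emits a warning.

```r
# Toy character vector standing in for a column read as text
# (illustrative values only, not from the real data).
raw <- c("0.34", "0.12", "N/A")

# as.double() parses numeric strings; unparseable entries become NA.
# R also emits a coercion warning, which is silenced here.
converted <- suppressWarnings(as.double(raw))

typeof(converted)   # "double"
is.na(converted)    # FALSE FALSE TRUE
```

Any `NA` introduced this way has to be handled later, for example with `use = "complete.obs"` when computing correlations.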

In [5]:
colnames(data_2019)

Let's add a new column named `Year` to each data set.

In [6]:
data_2015 <- mutate(data_2015, Year = "2015")
data_2016 <- mutate(data_2016, Year = "2016")
data_2017 <- mutate(data_2017, Year = "2017")
data_2018 <- mutate(data_2018, Year = "2018")
data_2019 <- mutate(data_2019, Year = "2019")
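
As an alternative sketch, the year label could also be attached while stacking the data frames, using the `.id` argument of `bind_rows()`. The toy tibbles below are hypothetical stand-ins for the yearly files, and the approach assumes the column names have already been harmonised.

```r
library(dplyr)

# Toy stand-ins for two yearly data sets (values are made up).
toy_2015 <- tibble(Country = c("A", "B"), Score = c(7.1, 5.2))
toy_2016 <- tibble(Country = c("A", "B"), Score = c(7.0, 5.4))

# Naming the list elements lets bind_rows() write each name into the
# column given by .id, so Year is added while the rows are stacked.
toy_all <- bind_rows(list("2015" = toy_2015, "2016" = toy_2016), .id = "Year")

toy_all$Year   # "2015" "2015" "2016" "2016"
```

The resulting `Year` column is of type character, which matches the `mutate()` calls above.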

In [7]:
data_2017 <- rename(data_2017,
                  "Overall rank" = Happiness.Rank,
                  "Country or region" = Country,
                  Score = Happiness.Score,
                  "GDP per capita" = Economy..GDP.per.Capita.,
                  "Social support" = Family,
                  "Healthy life expectancy" = Health..Life.Expectancy.,
                  "Freedom to make life choices" = Freedom,
                  "Perceptions of corruption" = Trust..Government.Corruption.)

data_2016 <- rename(data_2016,
                  "Overall rank" = "Happiness Rank",
                  "Country or region" = Country,
                  Score = "Happiness Score",
                  "GDP per capita" = "Economy (GDP per Capita)",
                  "Social support" = Family,
                  "Healthy life expectancy" = "Health (Life Expectancy)",
                  "Freedom to make life choices" = Freedom,
                  "Perceptions of corruption" = "Trust (Government Corruption)")

data_2015 <- rename(data_2015,
                  "Overall rank" = "Happiness Rank",
                  "Country or region" = Country,
                  Score = "Happiness Score",
                  "GDP per capita" = "Economy (GDP per Capita)",
                  "Social support" = Family,
                  "Healthy life expectancy" = "Health (Life Expectancy)",
                  "Freedom to make life choices" = Freedom,
                  "Perceptions of corruption" = "Trust (Government Corruption)")

In [8]:
colnames(data_2015)
colnames(data_2016)
colnames(data_2017)
colnames(data_2018)
colnames(data_2019)
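
Part of the renaming work comes from the same column being spelled differently across years (for example with spaces in one file and dots in another). A possible shortcut, sketched here on hypothetical toy data, is `janitor::clean_names()`, which is expected to normalise both spellings to the same snake_case name. It does not reconcile genuinely different names such as `Family` and `Social support`, so explicit renames would still be needed for those.

```r
library(dplyr)
library(janitor)

# Toy frames mimicking two spellings of the same column (values are made up).
toy_spaces <- tibble(Country = "A", `Happiness Score` = 7.0)
toy_dots   <- tibble(Country = "A", Happiness.Score = 7.1)

# clean_names() converts column names to snake_case by default.
clean_spaces <- clean_names(toy_spaces)
clean_dots   <- clean_names(toy_dots)

names(clean_spaces)
names(clean_dots)

# With matching names, the frames stack into shared columns.
bind_rows(clean_spaces, clean_dots)
```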

Let's combine all the data into one data frame.

In [9]:
data_2015_to_2019 <- bind_rows(data_2015, data_2016, data_2017, data_2018, data_2019)

In [10]:
View(data_2015_to_2019)

Now let us drop the columns that are not available for every year (such as Region, the standard error, the confidence intervals and the whiskers) from data_2015_to_2019, then preview the combined data with the head function.

In [11]:
data_2015_to_2019 <- data_2015_to_2019 %>%
  select(-c(Region, "Standard Error", Dystopia.Residual, "Lower Confidence Interval",
            "Upper Confidence Interval", Whisker.high, Whisker.low))
head(data_2015_to_2019)
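
`select()` raises an error if a listed column does not exist. A slightly more defensive variant, sketched below on a hypothetical toy tibble, wraps the names in `any_of()`, which silently skips names that are absent.

```r
library(dplyr)

# Toy frame with only some of the columns to be dropped (values are made up).
toy <- tibble(Country = "A", Score = 7.1, Region = "X", Whisker.high = 7.3)

# "Standard Error" is deliberately not present in the toy frame.
drop_cols <- c("Region", "Standard Error", "Whisker.high")

# any_of() ignores missing names, so the same vector can be reused
# on data frames with different column sets.
trimmed <- select(toy, -any_of(drop_cols))

names(trimmed)   # "Country" "Score"
```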

In [12]:
data_2015_to_2019 <- rename(data_2015_to_2019,
                          Country = "Country or region",
                          Happiness_Rank = "Overall rank",
                          Happiness_Score = Score,
                          GDP_per_Capita = "GDP per capita",
                          Social_Support = "Social support",
                          Life_Expectancy = "Healthy life expectancy",
                          Freedom = "Freedom to make life choices",
                          Perceptions_of_Corruption = "Perceptions of corruption")


In [13]:
View(data_2015_to_2019)

In [14]:
data_2015 <- rename(data_2015,
                  Country = "Country or region")
Regions_merge <- data_2015 %>%
  select(Country, Region)

In [15]:
data_2015_to_2019 <- merge(data_2015_to_2019, Regions_merge, by = "Country")
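
Note that `merge()` performs an inner join by default, so any country that is absent from the 2015 region lookup is dropped from the combined data. The sketch below uses hypothetical toy data to show the difference from `left_join()`, and how `anti_join()` can list the rows that would be lost.

```r
library(dplyr)

# Toy data: country "C" has no entry in the region lookup (values are made up).
scores  <- tibble(Country = c("A", "B", "C"), Score = c(7.1, 5.2, 4.8))
regions <- tibble(Country = c("A", "B"), Region = c("X", "Y"))

inner <- merge(scores, regions, by = "Country")       # inner join: drops "C"
left  <- left_join(scores, regions, by = "Country")   # keeps "C" with Region = NA

nrow(inner)   # 2
nrow(left)    # 3

# Rows of `scores` that have no match in `regions`
unmatched <- anti_join(scores, regions, by = "Country")
unmatched$Country   # "C"
```

Whether dropping or keeping unmatched countries is preferable depends on the analysis; the inner join keeps every remaining row fully labelled with a region.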

In [16]:
View(data_2015_to_2019)

# Analyzing data

**Relation between Happiness Score and GDP per Capita**

In [17]:
ggplot(data = data_2015_to_2019, aes(x = Happiness_Score, y = GDP_per_Capita)) +
  geom_point(aes(color = Happiness_Score)) +
  facet_wrap(~Year)

**Relation between Social Support and Happiness Score**

In [18]:
ggplot(data = data_2015_to_2019, aes(x = Happiness_Score, y = Social_Support)) +
  geom_point(aes(color = Happiness_Score)) +
  facet_wrap(~Year)


**Relation between Life Expectancy and Happiness Score**

In [19]:
ggplot(data = data_2015_to_2019, aes(x = Life_Expectancy, y = Happiness_Score)) +
  geom_point(aes(color = Happiness_Score)) +
  facet_wrap(~Year)

**Relation between Perception of Corruption and Happiness Score**

In [20]:
ggplot(data = data_2015_to_2019, aes(x = Happiness_Score, y = Perceptions_of_Corruption)) +
  geom_point(aes(color = Happiness_Score)) +
  facet_wrap(~Year) +
  geom_smooth()

**Relation between Freedom and Happiness Score**

In [21]:
ggplot(data = data_2015_to_2019, aes(x = Freedom, y = Happiness_Score)) +
  geom_point(aes(color = Happiness_Score)) +
  facet_wrap(~Year) +
  geom_smooth(method = 'lm')
    

**Relation between Generosity and Happiness Score**

In [22]:
ggplot(data = data_2015_to_2019, aes(x = Happiness_Score, y = Generosity)) +
  geom_point(aes(color = Happiness_Score)) +
  facet_wrap(~Year)
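
The six plots above share one structure, so they could also be drawn in a single call by reshaping the data to long format. This is only a sketch on hypothetical toy values; the names `Factor` and `Value` are introduced here for illustration, and each factor is placed on the x-axis with the happiness score on the y-axis.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Toy data with two of the factors (values are made up).
toy <- tibble(
  Happiness_Score = c(7.5, 6.1, 4.9, 3.8),
  GDP_per_Capita  = c(1.4, 1.1, 0.7, 0.3),
  Freedom         = c(0.6, 0.5, 0.4, 0.2)
)

# One row per (observation, factor) pair.
long <- pivot_longer(toy, cols = -Happiness_Score,
                     names_to = "Factor", values_to = "Value")

# One panel per factor, each with its own x scale and a linear trend line.
p <- ggplot(long, aes(x = Value, y = Happiness_Score)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  facet_wrap(~Factor, scales = "free_x")
p
```

To keep the per-year split as well, `facet_grid(Year ~ Factor, scales = "free_x")` would be one option when a `Year` column is present.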
    

In [23]:
gdp_cor <- round(cor(data_2015_to_2019$GDP_per_Capita,
                     data_2015_to_2019$Happiness_Score), 2)

ss_cor <- round(cor(data_2015_to_2019$Social_Support,
                    data_2015_to_2019$Happiness_Score), 2)

le_cor <- round(cor(data_2015_to_2019$Life_Expectancy,
                    data_2015_to_2019$Happiness_Score), 2)

fr_cor <- round(cor(data_2015_to_2019$Freedom,
                    data_2015_to_2019$Happiness_Score), 2)

poc_cor <- round(cor(data_2015_to_2019$Perceptions_of_Corruption,
                     data_2015_to_2019$Happiness_Score, use = 'complete.obs'), 2)

gen_cor <- round(cor(data_2015_to_2019$Generosity,
                     data_2015_to_2019$Happiness_Score, method = 'pearson'), 2)

gdp_cor
ss_cor
le_cor
fr_cor
poc_cor
gen_cor
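
The same correlations can be computed and ranked in one pass. The sketch below uses a hypothetical toy data frame (all numbers are made up) and applies `use = "complete.obs"` to every factor, so that a missing value in any column is handled the same way.

```r
# Toy data; Generosity contains one missing value on purpose.
toy <- data.frame(
  Happiness_Score = c(7.5, 6.1, 4.9, 3.8, 5.5),
  GDP_per_Capita  = c(1.4, 1.1, 0.7, 0.3, 0.9),
  Generosity      = c(0.3, 0.1, 0.2, NA, 0.25)
)

factors <- setdiff(names(toy), "Happiness_Score")

# Pearson correlation of each factor with the score; observations with
# a missing value are dropped separately for each factor.
cors <- sapply(factors, function(col) {
  cor(toy[[col]], toy$Happiness_Score, use = "complete.obs")
})

# Named vector, strongest positive correlation first.
round(sort(cors, decreasing = TRUE), 2)
```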

# Conclusion

GDP per capita is the factor most strongly related to the happiness of nations: it has the highest positive correlation with the overall happiness score.

1. GDP per capita has the strongest correlation with happiness [0.8]
2. Life expectancy is second [0.75]
3. Social support is third [0.65]
4. Freedom is fourth [0.55]
5. Perception of corruption is fifth [0.41]
6. Generosity has the weakest correlation [0.14]