# **Overview**

Throughout this assignment, you will perform specific, well-defined tasks that strengthen your understanding of data visualization. We will use the World Happiness Report 2021 for the assignment. Here is some brief context: “The World Happiness Report 2021 focuses on the effects of COVID-19 and how people all over the world have fared. The objective of the report was two-fold, first to focus on the effects of COVID-19 on the structure and quality of people’s lives, and second to describe and evaluate how governments all over the world have dealt with the pandemic.”

As part of the assignment, you will complete the tasks below.


**Author:** Chintoo Kumar
**Contributor:** Chanukya Patnaik

# **Dataset**
Dataset Link: https://raw.githubusercontent.com/dphi-official/Datasets/master/world-happiness-report-2021.csv




**About the dataset:**

It contains data for 149 countries. The dataset comprises 20 attributes that describe each country's World Happiness Report scores for 2021. The happiness study ranks the nations of the world based on questions from the Gallup World Poll (GWP). The results are then correlated with other factors, including GDP and social security.

**Data description:**

* 'Country name': Name of the country.
* 'Regional indicator': Region the country belongs to.
* 'Ladder score': Happiness score, a measure of subjective well-being.
* 'Standard error of ladder score': Standard error of the ladder score estimate.
* 'upperwhisker': Upper bound of the 95% confidence interval for the ladder score.
* 'lowerwhisker': Lower bound of the 95% confidence interval for the ladder score.
* 'Logged GDP per capita': Log of GDP per capita, a measure of the country's economic production.
* 'Social support': Social support (or having someone to count on in times of trouble) is the national average of the binary responses (either 0 or 1) to the GWP question “If you were in trouble, do you have relatives or friends you can count on to help you whenever you need them, or not?”
* 'Healthy life expectancy': Healthy life expectancy at birth, in years.
* 'Freedom to make life choices': National average of the binary responses to the GWP question “Are you satisfied or dissatisfied with your freedom to choose what you do with your life?”
* 'Generosity': Generosity is the residual of regressing the national average of responses to the GWP question “Have you donated money to a charity in the past month?” on GDP per capita.
* 'Perceptions of corruption': National average of the binary responses to the GWP questions on whether corruption is widespread in government and in business. Higher values mean more perceived corruption.
* 'Ladder score in Dystopia': Ladder score of Dystopia, a hypothetical country with the world's lowest national averages for each of the six factors. It serves as a benchmark and is 2.43 for every row in this dataset.
* 'Explained by: Log GDP per capita': Part of the ladder score attributed to logged GDP per capita, relative to Dystopia.
* 'Explained by: Social support': Part of the ladder score attributed to social support, relative to Dystopia.
* 'Explained by: Healthy life expectancy': Part of the ladder score attributed to healthy life expectancy, relative to Dystopia.
* 'Explained by: Freedom to make life choices': Part of the ladder score attributed to perceived freedom to make life choices, relative to Dystopia.
* 'Explained by: Generosity': Part of the ladder score attributed to generosity, relative to Dystopia.
* 'Explained by: Perceptions of corruption': Part of the ladder score attributed to perceptions of corruption, relative to Dystopia.
* 'Dystopia + residual': The Dystopia ladder score plus the residual, that is, the part of the ladder score that the six factors do not explain. For each country, this value and the six 'Explained by' values add up to the ladder score.

# **Task 1: Data loading and Data Analysis**

* Load the data file and name it df1
* Display the first 5 rows of the world-happiness-report-2021
* Display the last 10 observations of the world-happiness-report-2021
* Display a concise summary of the provided data and list 2 observations/inferences that you draw from the result. You can use the info() method for this.
* Display the descriptive statistics of the world-happiness-report-2021
* Are there any missing values in any column of the provided dataset?
* How many unique countries are there in Western Europe?
* Display all the unique countries of Western Europe
* Filter and display the world happiness report score for the country 'India' in the year 2021




In [5]:
# Load the data file into df1
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

df1 = pd.read_csv('https://raw.githubusercontent.com/dphi-official/Datasets/master/world-happiness-report-2021.csv')
df1


Unnamed: 0,Country name,Regional indicator,Ladder score,Standard error of ladder score,upperwhisker,lowerwhisker,Logged GDP per capita,Social support,Healthy life expectancy,Freedom to make life choices,Generosity,Perceptions of corruption,Ladder score in Dystopia,Explained by: Log GDP per capita,Explained by: Social support,Explained by: Healthy life expectancy,Explained by: Freedom to make life choices,Explained by: Generosity,Explained by: Perceptions of corruption,Dystopia + residual
0,Finland,Western Europe,7.842,0.032,7.904,7.780,10.775,0.954,72.000,0.949,-0.098,0.186,2.43,1.446,1.106,0.741,0.691,0.124,0.481,3.253
1,Denmark,Western Europe,7.620,0.035,7.687,7.552,10.933,0.954,72.700,0.946,0.030,0.179,2.43,1.502,1.108,0.763,0.686,0.208,0.485,2.868
2,Switzerland,Western Europe,7.571,0.036,7.643,7.500,11.117,0.942,74.400,0.919,0.025,0.292,2.43,1.566,1.079,0.816,0.653,0.204,0.413,2.839
3,Iceland,Western Europe,7.554,0.059,7.670,7.438,10.878,0.983,73.000,0.955,0.160,0.673,2.43,1.482,1.172,0.772,0.698,0.293,0.170,2.967
4,Netherlands,Western Europe,7.464,0.027,7.518,7.410,10.932,0.942,72.400,0.913,0.175,0.338,2.43,1.501,1.079,0.753,0.647,0.302,0.384,2.798
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
144,Lesotho,Sub-Saharan Africa,3.512,0.120,3.748,3.276,7.926,0.787,48.700,0.715,-0.131,0.915,2.43,0.451,0.731,0.007,0.405,0.103,0.015,1.800
145,Botswana,Sub-Saharan Africa,3.467,0.074,3.611,3.322,9.782,0.784,59.269,0.824,-0.246,0.801,2.43,1.099,0.724,0.340,0.539,0.027,0.088,0.648
146,Rwanda,Sub-Saharan Africa,3.415,0.068,3.548,3.282,7.676,0.552,61.400,0.897,0.061,0.167,2.43,0.364,0.202,0.407,0.627,0.227,0.493,1.095
147,Zimbabwe,Sub-Saharan Africa,3.145,0.058,3.259,3.030,7.943,0.750,56.201,0.677,-0.047,0.821,2.43,0.457,0.649,0.243,0.359,0.157,0.075,1.205


Display the first 5 rows of the world-happiness-report-2021

In [6]:
df1.head()

Unnamed: 0,Country name,Regional indicator,Ladder score,Standard error of ladder score,upperwhisker,lowerwhisker,Logged GDP per capita,Social support,Healthy life expectancy,Freedom to make life choices,Generosity,Perceptions of corruption,Ladder score in Dystopia,Explained by: Log GDP per capita,Explained by: Social support,Explained by: Healthy life expectancy,Explained by: Freedom to make life choices,Explained by: Generosity,Explained by: Perceptions of corruption,Dystopia + residual
0,Finland,Western Europe,7.842,0.032,7.904,7.78,10.775,0.954,72.0,0.949,-0.098,0.186,2.43,1.446,1.106,0.741,0.691,0.124,0.481,3.253
1,Denmark,Western Europe,7.62,0.035,7.687,7.552,10.933,0.954,72.7,0.946,0.03,0.179,2.43,1.502,1.108,0.763,0.686,0.208,0.485,2.868
2,Switzerland,Western Europe,7.571,0.036,7.643,7.5,11.117,0.942,74.4,0.919,0.025,0.292,2.43,1.566,1.079,0.816,0.653,0.204,0.413,2.839
3,Iceland,Western Europe,7.554,0.059,7.67,7.438,10.878,0.983,73.0,0.955,0.16,0.673,2.43,1.482,1.172,0.772,0.698,0.293,0.17,2.967
4,Netherlands,Western Europe,7.464,0.027,7.518,7.41,10.932,0.942,72.4,0.913,0.175,0.338,2.43,1.501,1.079,0.753,0.647,0.302,0.384,2.798


Display the last 10 observations of the world-happiness-report-2021

In [7]:
df1.tail(10)

Unnamed: 0,Country name,Regional indicator,Ladder score,Standard error of ladder score,upperwhisker,lowerwhisker,Logged GDP per capita,Social support,Healthy life expectancy,Freedom to make life choices,Generosity,Perceptions of corruption,Ladder score in Dystopia,Explained by: Log GDP per capita,Explained by: Social support,Explained by: Healthy life expectancy,Explained by: Freedom to make life choices,Explained by: Generosity,Explained by: Perceptions of corruption,Dystopia + residual
139,Burundi,Sub-Saharan Africa,3.775,0.107,3.985,3.565,6.635,0.49,53.4,0.626,-0.024,0.607,2.43,0.0,0.062,0.155,0.298,0.172,0.212,2.876
140,Yemen,Middle East and North Africa,3.658,0.07,3.794,3.521,7.578,0.832,57.122,0.602,-0.147,0.8,2.43,0.329,0.831,0.272,0.268,0.092,0.089,1.776
141,Tanzania,Sub-Saharan Africa,3.623,0.071,3.762,3.485,7.876,0.702,57.999,0.833,0.183,0.577,2.43,0.433,0.54,0.3,0.549,0.307,0.231,1.263
142,Haiti,Latin America and Caribbean,3.615,0.173,3.953,3.276,7.477,0.54,55.7,0.593,0.422,0.721,2.43,0.294,0.173,0.227,0.257,0.463,0.139,2.06
143,Malawi,Sub-Saharan Africa,3.6,0.092,3.781,3.419,6.958,0.537,57.948,0.78,0.038,0.729,2.43,0.113,0.168,0.298,0.484,0.213,0.134,2.19
144,Lesotho,Sub-Saharan Africa,3.512,0.12,3.748,3.276,7.926,0.787,48.7,0.715,-0.131,0.915,2.43,0.451,0.731,0.007,0.405,0.103,0.015,1.8
145,Botswana,Sub-Saharan Africa,3.467,0.074,3.611,3.322,9.782,0.784,59.269,0.824,-0.246,0.801,2.43,1.099,0.724,0.34,0.539,0.027,0.088,0.648
146,Rwanda,Sub-Saharan Africa,3.415,0.068,3.548,3.282,7.676,0.552,61.4,0.897,0.061,0.167,2.43,0.364,0.202,0.407,0.627,0.227,0.493,1.095
147,Zimbabwe,Sub-Saharan Africa,3.145,0.058,3.259,3.03,7.943,0.75,56.201,0.677,-0.047,0.821,2.43,0.457,0.649,0.243,0.359,0.157,0.075,1.205
148,Afghanistan,South Asia,2.523,0.038,2.596,2.449,7.695,0.463,52.493,0.382,-0.102,0.924,2.43,0.37,0.0,0.126,0.0,0.122,0.01,1.895


Display a concise summary of the provided data and list out 2 observations/inferences that you observe from the result. You can use the info() method for this.

In [8]:
df1.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 149 entries, 0 to 148
Data columns (total 20 columns):
 #   Column                                      Non-Null Count  Dtype  
---  ------                                      --------------  -----  
 0   Country name                                149 non-null    object 
 1   Regional indicator                          149 non-null    object 
 2   Ladder score                                149 non-null    float64
 3   Standard error of ladder score              149 non-null    float64
 4   upperwhisker                                149 non-null    float64
 5   lowerwhisker                                149 non-null    float64
 6   Logged GDP per capita                       149 non-null    float64
 7   Social support                              149 non-null    float64
 8   Healthy life expectancy                     149 non-null    float64
 9   Freedom to make life choices                149 non-null    float64
 10  Generosity    
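
The task also asks for two observations from the summary. The `info()` output above reports 149 entries and 20 columns, with two `object` columns among those listed and `float64` for the rest. The sketch below shows one way to derive the same kind of observations programmatically. It uses a small stand-in DataFrame named `sample` (an assumption, built from rows shown earlier) so that it runs without network access; in the notebook you would apply the same calls to `df1`.

```python
import pandas as pd

# Stand-in sample built from three rows shown earlier; in the notebook, use df1 instead.
sample = pd.DataFrame({
    "Country name": ["Finland", "India", "Afghanistan"],
    "Regional indicator": ["Western Europe", "South Asia", "South Asia"],
    "Ladder score": [7.842, 3.819, 2.523],
    "Generosity": [-0.098, 0.089, -0.102],
})

# Observation 1: size of the table (rows, columns).
n_rows, n_cols = sample.shape

# Observation 2: how many columns are numeric versus text.
n_numeric = sample.select_dtypes(include="number").shape[1]
n_text = n_cols - n_numeric

print(f"{n_rows} rows, {n_cols} columns")
print(f"{n_numeric} numeric columns, {n_text} text columns")
```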

Display the descriptive statistics of the world-happiness-report-2021

In [9]:
df1.describe()

Unnamed: 0,Ladder score,Standard error of ladder score,upperwhisker,lowerwhisker,Logged GDP per capita,Social support,Healthy life expectancy,Freedom to make life choices,Generosity,Perceptions of corruption,Ladder score in Dystopia,Explained by: Log GDP per capita,Explained by: Social support,Explained by: Healthy life expectancy,Explained by: Freedom to make life choices,Explained by: Generosity,Explained by: Perceptions of corruption,Dystopia + residual
count,149.0,149.0,149.0,149.0,149.0,149.0,149.0,149.0,149.0,149.0,149.0,149.0,149.0,149.0,149.0,149.0,149.0,149.0
mean,5.532839,0.058752,5.648007,5.417631,9.432208,0.814745,64.992799,0.791597,-0.015134,0.72745,2.43,0.977161,0.793315,0.520161,0.498711,0.178047,0.135141,2.430329
std,1.073924,0.022001,1.05433,1.094879,1.158601,0.114889,6.762043,0.113332,0.150657,0.179226,0.0,0.40474,0.258871,0.213019,0.137888,0.09827,0.114361,0.537645
min,2.523,0.026,2.596,2.449,6.635,0.463,48.478,0.382,-0.288,0.082,2.43,0.0,0.0,0.0,0.0,0.0,0.0,0.648
25%,4.852,0.043,4.991,4.706,8.541,0.75,59.802,0.718,-0.126,0.667,2.43,0.666,0.647,0.357,0.409,0.105,0.06,2.138
50%,5.534,0.054,5.625,5.413,9.569,0.832,66.603,0.804,-0.036,0.781,2.43,1.025,0.832,0.571,0.514,0.164,0.101,2.509
75%,6.255,0.07,6.344,6.128,10.421,0.905,69.6,0.877,0.079,0.845,2.43,1.323,0.996,0.665,0.603,0.239,0.174,2.794
max,7.842,0.173,7.904,7.78,11.647,0.983,76.953,0.97,0.542,0.939,2.43,1.751,1.172,0.897,0.716,0.541,0.547,3.482


Are there any missing values in any column of the provided dataset? No, the dataset has no missing values.
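
The answer above can be checked in code rather than read off the summary. A minimal sketch, assuming a small stand-in DataFrame named `sample` in place of `df1` (the three rows are copied from the outputs above):

```python
import pandas as pd

# Stand-in sample copied from rows shown above; in the notebook, use df1 instead.
sample = pd.DataFrame({
    "Country name": ["Finland", "India", "Afghanistan"],
    "Regional indicator": ["Western Europe", "South Asia", "South Asia"],
    "Ladder score": [7.842, 3.819, 2.523],
})

# Number of missing values in each column.
missing_per_column = sample.isnull().sum()
print(missing_per_column)

# True only if no column contains a missing value.
has_no_missing = not sample.isnull().values.any()
print(has_no_missing)
```

Running `df1.isnull().sum()` on the full dataset gives one count per column, which answers the question for every column at once.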

How many unique countries are there in Western Europe? There are 21 countries.

In [16]:
df1['Regional indicator'].value_counts()

Sub-Saharan Africa                    36
Western Europe                        21
Latin America and Caribbean           20
Middle East and North Africa          17
Central and Eastern Europe            17
Commonwealth of Independent States    12
Southeast Asia                         9
South Asia                             7
East Asia                              6
North America and ANZ                  4
Name: Regional indicator, dtype: int64

Display all the unique countries of Western Europe

In [25]:
result = df1.loc[df1['Regional indicator'] == 'Western Europe']
result

Unnamed: 0,Country name,Regional indicator,Ladder score,Standard error of ladder score,upperwhisker,lowerwhisker,Logged GDP per capita,Social support,Healthy life expectancy,Freedom to make life choices,Generosity,Perceptions of corruption,Ladder score in Dystopia,Explained by: Log GDP per capita,Explained by: Social support,Explained by: Healthy life expectancy,Explained by: Freedom to make life choices,Explained by: Generosity,Explained by: Perceptions of corruption,Dystopia + residual
0,Finland,Western Europe,7.842,0.032,7.904,7.78,10.775,0.954,72.0,0.949,-0.098,0.186,2.43,1.446,1.106,0.741,0.691,0.124,0.481,3.253
1,Denmark,Western Europe,7.62,0.035,7.687,7.552,10.933,0.954,72.7,0.946,0.03,0.179,2.43,1.502,1.108,0.763,0.686,0.208,0.485,2.868
2,Switzerland,Western Europe,7.571,0.036,7.643,7.5,11.117,0.942,74.4,0.919,0.025,0.292,2.43,1.566,1.079,0.816,0.653,0.204,0.413,2.839
3,Iceland,Western Europe,7.554,0.059,7.67,7.438,10.878,0.983,73.0,0.955,0.16,0.673,2.43,1.482,1.172,0.772,0.698,0.293,0.17,2.967
4,Netherlands,Western Europe,7.464,0.027,7.518,7.41,10.932,0.942,72.4,0.913,0.175,0.338,2.43,1.501,1.079,0.753,0.647,0.302,0.384,2.798
5,Norway,Western Europe,7.392,0.035,7.462,7.323,11.053,0.954,73.3,0.96,0.093,0.27,2.43,1.543,1.108,0.782,0.703,0.249,0.427,2.58
6,Sweden,Western Europe,7.363,0.036,7.433,7.293,10.867,0.934,72.7,0.945,0.086,0.237,2.43,1.478,1.062,0.763,0.685,0.244,0.448,2.683
7,Luxembourg,Western Europe,7.324,0.037,7.396,7.252,11.647,0.908,72.6,0.907,-0.034,0.386,2.43,1.751,1.003,0.76,0.639,0.166,0.353,2.653
9,Austria,Western Europe,7.268,0.036,7.337,7.198,10.906,0.934,73.3,0.908,0.042,0.481,2.43,1.492,1.062,0.782,0.64,0.215,0.292,2.784
12,Germany,Western Europe,7.155,0.04,7.232,7.077,10.873,0.903,72.5,0.875,0.011,0.46,2.43,1.48,0.993,0.757,0.6,0.195,0.306,2.824
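
The cell above returns every column for the matching rows. To answer the two Western Europe questions directly, it may be simpler to select only the 'Country name' column and call `unique()` and `nunique()` on it. This is a sketch on a stand-in DataFrame named `sample` (an assumption, with a handful of rows taken from the outputs above), not the full dataset:

```python
import pandas as pd

# Stand-in sample; in the notebook, use df1 instead.
sample = pd.DataFrame({
    "Country name": ["Finland", "Denmark", "Switzerland", "India", "Japan"],
    "Regional indicator": [
        "Western Europe", "Western Europe", "Western Europe",
        "South Asia", "East Asia",
    ],
})

# Boolean mask that is True for rows in the Western Europe region.
is_western_europe = sample["Regional indicator"] == "Western Europe"

# Unique country names in the region, and how many there are.
western_europe_names = sample.loc[is_western_europe, "Country name"].unique().tolist()
n_countries = sample.loc[is_western_europe, "Country name"].nunique()

print(western_europe_names)
print(n_countries)
```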


Filter and display the world happiness report score for the country 'India' in the year 2021

In [29]:
df1[df1['Country name'] == 'India']

Unnamed: 0,Country name,Regional indicator,Ladder score,Standard error of ladder score,upperwhisker,lowerwhisker,Logged GDP per capita,Social support,Healthy life expectancy,Freedom to make life choices,Generosity,Perceptions of corruption,Ladder score in Dystopia,Explained by: Log GDP per capita,Explained by: Social support,Explained by: Healthy life expectancy,Explained by: Freedom to make life choices,Explained by: Generosity,Explained by: Perceptions of corruption,Dystopia + residual
138,India,South Asia,3.819,0.026,3.869,3.769,8.755,0.603,60.633,0.893,0.089,0.774,2.43,0.741,0.316,0.383,0.622,0.246,0.106,1.405
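
The filter above returns the whole row for India. If only the happiness score is wanted, the same condition can be combined with a column selection. A minimal sketch, again on a stand-in DataFrame named `sample` (an assumption) whose values are copied from the outputs above:

```python
import pandas as pd

# Stand-in sample; in the notebook, use df1 instead.
sample = pd.DataFrame({
    "Country name": ["Finland", "India", "Afghanistan"],
    "Ladder score": [7.842, 3.819, 2.523],
})

# Keep only India's row and only the columns needed to report the score.
india_score = sample.loc[
    sample["Country name"] == "India",
    ["Country name", "Ladder score"],
]
print(india_score)
```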


# **Task 2: Visualization of results using the Matplotlib library**

* Display the data of the top 5 East Asian countries based on 'Generosity'
* Build a line plot that shows the variation of 'Ladder score' among the above 5 East Asian countries (selected by generosity)
* Build a plot that shows the variation of 'Ladder score' among the 5 Southeast Asian countries
* Create a DataFrame object, df_2021, with the following countries: 'China', 'Nepal', 'Bangladesh', 'Pakistan', 'Myanmar', 'India', 'Afghanistan'. Now build a scatter plot that shows each of these countries against its 'Logged GDP per capita'

Display the data of the top 5 East Asian countries based on 'Generosity'

In [2]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

df1 = pd.read_csv('https://raw.githubusercontent.com/dphi-official/Datasets/master/world-happiness-report-2021.csv')

result1 = df1.loc[df1['Regional indicator'] == 'East Asia']
result1


In [34]:
result2 = df1[df1['Regional indicator'].str.contains('Asia')]
result2

Unnamed: 0,Country name,Regional indicator,Ladder score,Standard error of ladder score,upperwhisker,lowerwhisker,Logged GDP per capita,Social support,Healthy life expectancy,Freedom to make life choices,Generosity,Perceptions of corruption,Ladder score in Dystopia,Explained by: Log GDP per capita,Explained by: Social support,Explained by: Healthy life expectancy,Explained by: Freedom to make life choices,Explained by: Generosity,Explained by: Perceptions of corruption,Dystopia + residual
23,Taiwan Province of China,East Asia,6.584,0.038,6.659,6.51,10.871,0.898,69.6,0.784,-0.07,0.721,2.43,1.48,0.982,0.665,0.49,0.142,0.139,2.687
31,Singapore,Southeast Asia,6.377,0.043,6.46,6.293,11.488,0.915,76.953,0.927,-0.018,0.082,2.43,1.695,1.019,0.897,0.664,0.176,0.547,1.379
53,Thailand,Southeast Asia,5.985,0.047,6.077,5.893,9.805,0.888,67.401,0.884,0.287,0.895,2.43,1.107,0.957,0.596,0.611,0.375,0.028,2.309
55,Japan,East Asia,5.94,0.04,6.02,5.861,10.611,0.884,75.1,0.796,-0.258,0.638,2.43,1.389,0.949,0.838,0.504,0.02,0.192,2.048
60,Philippines,Southeast Asia,5.88,0.052,5.982,5.778,9.076,0.83,62.0,0.917,-0.097,0.742,2.43,0.853,0.828,0.426,0.651,0.125,0.126,2.872
61,South Korea,East Asia,5.845,0.042,5.928,5.763,10.651,0.799,73.9,0.672,-0.083,0.727,2.43,1.403,0.758,0.801,0.353,0.134,0.135,2.262
69,Mongolia,East Asia,5.677,0.042,5.76,5.595,9.4,0.935,62.5,0.708,0.116,0.856,2.43,0.966,1.065,0.442,0.397,0.263,0.053,2.492
76,Hong Kong S.A.R. of China,East Asia,5.477,0.049,5.573,5.38,11.0,0.836,76.82,0.717,0.067,0.403,2.43,1.525,0.841,0.893,0.408,0.232,0.342,1.236
78,Vietnam,Southeast Asia,5.411,0.039,5.488,5.334,8.973,0.85,68.034,0.94,-0.098,0.796,2.43,0.817,0.873,0.616,0.679,0.124,0.091,2.211
80,Malaysia,Southeast Asia,5.384,0.049,5.48,5.289,10.238,0.817,67.102,0.895,0.125,0.839,2.43,1.259,0.797,0.587,0.624,0.27,0.064,1.784


In [36]:
result2.sort_values('Generosity')

Unnamed: 0,Country name,Regional indicator,Ladder score,Standard error of ladder score,upperwhisker,lowerwhisker,Logged GDP per capita,Social support,Healthy life expectancy,Freedom to make life choices,Generosity,Perceptions of corruption,Ladder score in Dystopia,Explained by: Log GDP per capita,Explained by: Social support,Explained by: Healthy life expectancy,Explained by: Freedom to make life choices,Explained by: Generosity,Explained by: Perceptions of corruption,Dystopia + residual
55,Japan,East Asia,5.94,0.04,6.02,5.861,10.611,0.884,75.1,0.796,-0.258,0.638,2.43,1.389,0.949,0.838,0.504,0.02,0.192,2.048
83,China,East Asia,5.339,0.029,5.397,5.281,9.673,0.811,69.593,0.904,-0.146,0.755,2.43,1.061,0.785,0.665,0.636,0.093,0.117,1.982
148,Afghanistan,South Asia,2.523,0.038,2.596,2.449,7.695,0.463,52.493,0.382,-0.102,0.924,2.43,0.37,0.0,0.126,0.0,0.122,0.01,1.895
78,Vietnam,Southeast Asia,5.411,0.039,5.488,5.334,8.973,0.85,68.034,0.94,-0.098,0.796,2.43,0.817,0.873,0.616,0.679,0.124,0.091,2.211
60,Philippines,Southeast Asia,5.88,0.052,5.982,5.778,9.076,0.83,62.0,0.917,-0.097,0.742,2.43,0.853,0.828,0.426,0.651,0.125,0.126,2.872
61,South Korea,East Asia,5.845,0.042,5.928,5.763,10.651,0.799,73.9,0.672,-0.083,0.727,2.43,1.403,0.758,0.801,0.353,0.134,0.135,2.262
23,Taiwan Province of China,East Asia,6.584,0.038,6.659,6.51,10.871,0.898,69.6,0.784,-0.07,0.721,2.43,1.48,0.982,0.665,0.49,0.142,0.139,2.687
100,Bangladesh,South Asia,5.025,0.046,5.115,4.934,8.454,0.693,64.8,0.877,-0.041,0.682,2.43,0.635,0.52,0.514,0.603,0.161,0.164,2.427
31,Singapore,Southeast Asia,6.377,0.043,6.46,6.293,11.488,0.915,76.953,0.927,-0.018,0.082,2.43,1.695,1.019,0.897,0.664,0.176,0.547,1.379
88,Maldives,South Asia,5.198,0.072,5.339,5.057,9.826,0.913,70.6,0.854,0.024,0.825,2.43,1.115,1.015,0.697,0.575,0.204,0.073,1.52
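
Note that `str.contains('Asia')` also matches 'Southeast Asia' and 'South Asia', and `sort_values('Generosity')` sorts in ascending order, so the output above starts with the least generous countries across all three regions. The sketch below shows one possible way to get the top 5 East Asian countries by 'Generosity' and draw the requested line plot of their 'Ladder score'. The stand-in DataFrame `sample` is an assumption: it holds the six East Asia rows and two Southeast Asia rows from the outputs above, and in the notebook `df1` would take its place.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Stand-in sample with values copied from the outputs above; use df1 in the notebook.
sample = pd.DataFrame({
    "Country name": [
        "Taiwan Province of China", "Japan", "South Korea", "Mongolia",
        "Hong Kong S.A.R. of China", "China", "Singapore", "Thailand",
    ],
    "Regional indicator": [
        "East Asia", "East Asia", "East Asia", "East Asia",
        "East Asia", "East Asia", "Southeast Asia", "Southeast Asia",
    ],
    "Ladder score": [6.584, 5.94, 5.845, 5.677, 5.477, 5.339, 6.377, 5.985],
    "Generosity": [-0.07, -0.258, -0.083, 0.116, 0.067, -0.146, -0.018, 0.287],
})

# Exact match on the region name, so Southeast Asia and South Asia are excluded.
east_asia = sample[sample["Regional indicator"] == "East Asia"]

# The 5 rows with the highest 'Generosity', ordered from highest to lowest.
top5 = east_asia.nlargest(5, "Generosity")
print(top5[["Country name", "Generosity", "Ladder score"]])

# Line plot of 'Ladder score' for those 5 countries.
fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(top5["Country name"].tolist(), top5["Ladder score"].tolist(), marker="o")
ax.set_xlabel("Country")
ax.set_ylabel("Ladder score")
ax.set_title("Ladder score of the top 5 East Asian countries by Generosity")
ax.tick_params(axis="x", rotation=30)
fig.tight_layout()
```

In a notebook the figure is rendered at the end of the cell. `nlargest(5, 'Generosity')` is equivalent here to sorting with `ascending=False` and taking the first five rows.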
