<a href="https://colab.research.google.com/github/OlgaKantarzhy/data-and-python/blob/main/How_happy_is_the_world.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# How happy is the world?

![HappinessImage-Benjamin Scott](https://drive.google.com/uc?id=1wWdXTclLAjPpZUiKkR8XGg0Sl7iD1u8T)  

(Image by Benjamin Scott, [source](https://www.natureindex.com/news-blog/data-visualization-these-are-the-happiest-countries-world-happiness-report-twenty-nineteen))   

The Sustainable Development Solutions Network (SDSN) compiles data from across the world relating to happiness and uses it to rank countries by happiness score.

This is not an exact science but can give food for thought in terms of what factors might have the most impact on a nation's happiness levels.

The data is taken from the Gallup World Poll, so it is not collected directly by SDSN.  
Countries are grouped by region.  

### The factors included are:
*  Economy (measured in GDP per Capita)
*  Family (support systems)
*  Health (measured by Life Expectancy)
*  Freedom (sense of)
*  Trust (Government Corruption)
*  Generosity (charitable inclinations)
*  Dystopia Residual
   *  Dystopia is a theoretical unhappiest country, with the lowest levels in all six of the factors above  
   *  The Residual measure is calculated as the average of the six distances from the lowest

Let's take a look at the data


---
### Open a data set

Open the data set, an Excel file with only one sheet (so `sheet_name` is not necessary), from here: https://github.com/futureCodersSE/working-with-data/blob/main/Happiness-Data/2015.xlsx?raw=true

Interrogate the data (head, tail, iloc) to get to know what it contains.


In [2]:
import pandas as pd
url = "https://github.com/futureCodersSE/working-with-data/blob/main/Happiness-Data/2015.xlsx?raw=true"
happiness = pd.read_excel(url)
happiness.head()

Unnamed: 0,Country,Region,Happiness Rank,Happiness Score,Standard Error,Economy (GDP per Capita),Family,Health (Life Expectancy),Freedom,Trust (Government Corruption),Generosity,Dystopia Residual
0,Switzerland,Western Europe,1,7.587,0.03411,1.39651,1.34951,0.94143,0.66557,0.41978,0.29678,2.51738
1,Iceland,Western Europe,2,7.561,0.04884,1.30232,1.40223,0.94784,0.62877,0.14145,0.4363,2.70201
2,Denmark,Western Europe,3,7.527,0.03328,1.32548,1.36058,0.87464,0.64938,0.48357,0.34139,2.49204
3,Norway,Western Europe,4,7.522,0.0388,1.459,1.33095,0.88521,0.66973,0.36503,0.34699,2.46531
4,Canada,North America,5,7.427,0.03553,1.32629,1.32261,0.90563,0.63297,0.32957,0.45811,2.45176
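
One thing worth checking in the rows above: the seven factor columns appear to add up to the Happiness Score. The sketch below retypes the five rows shown by `head()` by hand, so it runs without downloading the file, and compares the sum of the factors with the reported score. The names `first_rows`, `factor_columns`, `factor_total` and `difference` are made up for this illustration, and the pattern is only checked here for these five rows.

```python
import pandas as pd

# The five rows printed by happiness.head(), typed in by hand
first_rows = pd.DataFrame({
    "Country": ["Switzerland", "Iceland", "Denmark", "Norway", "Canada"],
    "Happiness Score": [7.587, 7.561, 7.527, 7.522, 7.427],
    "Economy (GDP per Capita)": [1.39651, 1.30232, 1.32548, 1.459, 1.32629],
    "Family": [1.34951, 1.40223, 1.36058, 1.33095, 1.32261],
    "Health (Life Expectancy)": [0.94143, 0.94784, 0.87464, 0.88521, 0.90563],
    "Freedom": [0.66557, 0.62877, 0.64938, 0.66973, 0.63297],
    "Trust (Government Corruption)": [0.41978, 0.14145, 0.48357, 0.36503, 0.32957],
    "Generosity": [0.29678, 0.4363, 0.34139, 0.34699, 0.45811],
    "Dystopia Residual": [2.51738, 2.70201, 2.49204, 2.46531, 2.45176],
})

# Every column except the country name and the score itself
factor_columns = first_rows.columns.drop(["Country", "Happiness Score"])

# Add the factors across each row and compare with the reported score
factor_total = first_rows[factor_columns].sum(axis=1)
difference = (factor_total - first_rows["Happiness Score"]).abs()
print(difference.max())  # the largest gap between the two values
```

To try the same check on the full table, `first_rows` could be swapped for `happiness`, with Region, Happiness Rank and Standard Error added to the list of dropped columns.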


In [None]:
happiness.describe()

Unnamed: 0,Happiness Rank,Happiness Score,Standard Error,Economy (GDP per Capita),Family,Health (Life Expectancy),Freedom,Trust (Government Corruption),Generosity,Dystopia Residual
count,158.0,158.0,158.0,158.0,158.0,158.0,158.0,158.0,158.0,158.0
mean,79.493671,5.375734,0.047885,0.846137,0.991046,0.630259,0.428615,0.143422,0.237296,2.098977
std,45.754363,1.14501,0.017146,0.403121,0.272369,0.247078,0.150693,0.120034,0.126685,0.55355
min,1.0,2.839,0.01848,0.0,0.0,0.0,0.0,0.0,0.0,0.32858
25%,40.25,4.526,0.037268,0.545808,0.856823,0.439185,0.32833,0.061675,0.150553,1.75941
50%,79.5,5.2325,0.04394,0.910245,1.02951,0.696705,0.435515,0.10722,0.21613,2.095415
75%,118.75,6.24375,0.0523,1.158448,1.214405,0.811013,0.549092,0.180255,0.309883,2.462415
max,158.0,7.587,0.13693,1.69042,1.40223,1.02525,0.66973,0.55191,0.79588,3.60214


In [None]:
happiness.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 158 entries, 0 to 157
Data columns (total 12 columns):
 #   Column                         Non-Null Count  Dtype  
---  ------                         --------------  -----  
 0   Country                        158 non-null    object 
 1   Region                         158 non-null    object 
 2   Happiness Rank                 158 non-null    int64  
 3   Happiness Score                158 non-null    float64
 4   Standard Error                 158 non-null    float64
 5   Economy (GDP per Capita)       158 non-null    float64
 6   Family                         158 non-null    float64
 7   Health (Life Expectancy)       158 non-null    float64
 8   Freedom                        158 non-null    float64
 9   Trust (Government Corruption)  158 non-null    float64
 10  Generosity                     158 non-null    float64
 11  Dystopia Residual              158 non-null    float64
dtypes: float64(9), int64(1), object(2)
memory usage: 1
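
The cells above use `head`, `describe` and `info`. As a minimal sketch of the other two tools mentioned, `tail` and `iloc`, the example below works on a small hand-typed table (`sample`, a name invented here) holding a few columns of the first five rows, so it runs without the download. On the real data, `happiness` would take the place of `sample`.

```python
import pandas as pd

# A few rows copied from the head() output, with a subset of the columns
sample = pd.DataFrame({
    "Country": ["Switzerland", "Iceland", "Denmark", "Norway", "Canada"],
    "Region": ["Western Europe"] * 4 + ["North America"],
    "Happiness Rank": [1, 2, 3, 4, 5],
    "Happiness Score": [7.587, 7.561, 7.527, 7.522, 7.427],
})

last_two = sample.tail(2)           # the final two rows
third_row = sample.iloc[2]          # one row by position (counting from 0)
block = sample.iloc[1:4, [0, 3]]    # rows 1 to 3, columns 0 and 3 only

print(last_two)
print(third_row["Country"])
print(block)
```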

---
### Sort the data in different ways

The data is currently sorted in order of rank.  To sort the data in the table, run the code below, which identifies the column on which to sort in the brackets.

Then, **try sorting on other columns**. *Note: you must type the column heading in the quotes and exactly as it appears in the table (including capitalisation)*.  To sort on multiple columns, enter a list of column headings in the brackets (e.g. `.sort_values(['Region','Freedom'])`).



In [None]:
sorted_table = happiness.sort_values(['Family'])
sorted_table  # output the table below

Unnamed: 0,Country,Region,Happiness Rank,Happiness Score,Standard Error,Economy (GDP per Capita),Family,Health (Life Expectancy),Freedom,Trust (Government Corruption),Generosity,Dystopia Residual
147,Central African Republic,Sub-Saharan Africa,148,3.678,0.06112,0.07850,0.00000,0.06699,0.48879,0.08289,0.23835,2.72230
157,Togo,Sub-Saharan Africa,158,2.839,0.06727,0.20868,0.13995,0.28443,0.36453,0.10731,0.16681,1.56726
152,Afghanistan,Southern Asia,153,3.575,0.03084,0.31982,0.30285,0.30335,0.23414,0.09719,0.36510,1.95210
154,Benin,Sub-Saharan Africa,155,3.340,0.03656,0.28665,0.35386,0.31910,0.48450,0.08010,0.18260,1.63328
116,India,Southern Asia,117,4.565,0.02043,0.64499,0.38174,0.51529,0.39786,0.08492,0.26475,2.27513
...,...,...,...,...,...,...,...,...,...,...,...,...
43,Uzbekistan,Central and Eastern Europe,44,6.003,0.04361,0.63244,1.34043,0.59772,0.65821,0.30826,0.22837,2.23741
0,Switzerland,Western Europe,1,7.587,0.03411,1.39651,1.34951,0.94143,0.66557,0.41978,0.29678,2.51738
2,Denmark,Western Europe,3,7.527,0.03328,1.32548,1.36058,0.87464,0.64938,0.48357,0.34139,2.49204
17,Ireland,Western Europe,18,6.940,0.03676,1.33596,1.36948,0.89533,0.61777,0.28703,0.45901,1.97570


In [None]:
sorted_table = happiness.sort_values(['Region',"Country"])
sorted_table

Unnamed: 0,Country,Region,Happiness Rank,Happiness Score,Standard Error,Economy (GDP per Capita),Family,Health (Life Expectancy),Freedom,Trust (Government Corruption),Generosity,Dystopia Residual
9,Australia,Australia and New Zealand,10,7.284,0.04083,1.33358,1.30923,0.93156,0.65124,0.35637,0.43562,2.26646
8,New Zealand,Australia and New Zealand,9,7.286,0.03371,1.25018,1.31967,0.90837,0.63938,0.42922,0.47501,2.26425
94,Albania,Central and Eastern Europe,95,4.959,0.05013,0.87867,0.80434,0.81325,0.35733,0.06413,0.14272,1.89894
126,Armenia,Central and Eastern Europe,127,4.350,0.04763,0.76821,0.77711,0.72990,0.19847,0.03900,0.07855,1.75873
79,Azerbaijan,Central and Eastern Europe,80,5.212,0.03363,1.02389,0.93793,0.64045,0.37030,0.16065,0.07799,2.00073
...,...,...,...,...,...,...,...,...,...,...,...,...
87,Portugal,Western Europe,88,5.102,0.04802,1.15991,1.13935,0.87519,0.51469,0.01078,0.13719,1.26462
35,Spain,Western Europe,36,6.329,0.03468,1.23011,1.31379,0.95562,0.45951,0.06398,0.18227,2.12367
7,Sweden,Western Europe,8,7.364,0.03157,1.33171,1.28907,0.91087,0.65980,0.43844,0.36262,2.37119
0,Switzerland,Western Europe,1,7.587,0.03411,1.39651,1.34951,0.94143,0.66557,0.41978,0.29678,2.51738


In [None]:
sorted_table = happiness.sort_values(['Economy (GDP per Capita)'], ascending=False)
sorted_table

Unnamed: 0,Country,Region,Happiness Rank,Happiness Score,Standard Error,Economy (GDP per Capita),Family,Health (Life Expectancy),Freedom,Trust (Government Corruption),Generosity,Dystopia Residual
27,Qatar,Middle East and Northern Africa,28,6.611,0.06257,1.69042,1.07860,0.79733,0.64040,0.52208,0.32573,1.55674
16,Luxembourg,Western Europe,17,6.946,0.03499,1.56391,1.21963,0.91894,0.61583,0.37798,0.28034,1.96961
38,Kuwait,Middle East and Northern Africa,39,6.295,0.04456,1.55422,1.16594,0.72492,0.55499,0.25609,0.16228,1.87634
23,Singapore,Southeastern Asia,24,6.798,0.03780,1.52186,1.02000,1.02525,0.54252,0.49210,0.31105,1.88501
3,Norway,Western Europe,4,7.522,0.03880,1.45900,1.33095,0.88521,0.66973,0.36503,0.34699,2.46531
...,...,...,...,...,...,...,...,...,...,...,...,...
115,Liberia,Sub-Saharan Africa,116,4.571,0.11068,0.07120,0.78968,0.34201,0.28531,0.06232,0.24362,2.77729
143,Niger,Sub-Saharan Africa,144,3.845,0.03602,0.06940,0.77265,0.29707,0.47692,0.15639,0.19387,1.87877
130,Malawi,Sub-Saharan Africa,131,4.292,0.06130,0.01604,0.41134,0.22562,0.43054,0.06977,0.33128,2.80791
156,Burundi,Sub-Saharan Africa,157,2.905,0.08658,0.01530,0.41587,0.22396,0.11850,0.10062,0.19727,1.83302
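
When sorting on several columns, each column can have its own direction: `ascending` also accepts a list with one True/False value per column. The sketch below uses a small hand-typed table (`sample`, an invented name, with rows copied from the outputs above) to sort regions alphabetically and, within each region, put the highest score first.

```python
import pandas as pd

# Five rows copied from the tables above, with a subset of the columns
sample = pd.DataFrame({
    "Country": ["Switzerland", "Norway", "Canada", "Australia", "New Zealand"],
    "Region": ["Western Europe", "Western Europe", "North America",
               "Australia and New Zealand", "Australia and New Zealand"],
    "Happiness Score": [7.587, 7.522, 7.427, 7.284, 7.286],
})

# Region A to Z, then Happiness Score from highest to lowest within a region
by_region_then_score = sample.sort_values(
    ["Region", "Happiness Score"],
    ascending=[True, False],
)
print(by_region_then_score)
```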


---
### Summarising the data

Look at the happiness dataframe.  Create new dataframes from a range of rows, columns, statistical information, etc.

For each dataframe, add a text cell to explain what it is showing

In [None]:
# numeric_only=True keeps the text column (Country) out of the totals
happiness.groupby('Region').sum(numeric_only=True).sort_values(by='Happiness Score', ascending=False)


Region,Happiness Rank,Happiness Score,Standard Error,Economy (GDP per Capita),Family,Health (Life Expectancy),Freedom,Trust (Government Corruption),Generosity,Dystopia Residual
Sub-Saharan Africa,5116,168.112,2.21195,15.21892,32.3634,11.29327,14.63776,4.95511,8.84547,80.7992
Central and Eastern Europe,2291,154.655,1.31102,27.33071,30.53823,20.84444,10.38979,2.51354,4.41565,58.62059
Western Europe,620,140.482,0.79013,27.27051,26.19334,19.09211,11.54845,4.86072,6.34428,45.17489
Latin America and Caribbean,1032,135.183,1.34479,19.28994,24.30385,15.48515,11.03827,2.57778,4.79133,57.6967
Middle East and Northern Africa,1552,108.138,0.92674,21.33947,18.4098,14.11231,7.23502,3.63404,3.80751,39.60017
Southeastern Asia,731,47.857,0.38422,7.10149,8.46421,6.09621,5.01394,1.36148,3.77335,16.04718
Eastern Asia,387,33.757,0.22335,6.91068,6.59656,5.26433,2.77494,0.76617,1.35531,10.08964
Southern Asia,792,32.066,0.22513,3.9234,4.51725,3.78581,2.61336,0.71775,2.39,14.11738
Australia and New Zealand,19,14.57,0.07454,2.58376,2.6289,1.83993,1.29062,0.78559,0.91063,4.53071
North America,20,14.546,0.07392,2.7208,2.56972,1.76742,1.17901,0.48847,0.85916,4.96187


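Sub-Saharan Africa has the largest total Happiness Score, but this reflects the number of countries in the region rather than how happy they are: a sum grows with every country added. The regional averages further down give a fairer comparison, and there Sub-Saharan Africa has the lowest average score (about 4.20) while Australia and New Zealand has the highest (7.285).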

Top-10 countries with the highest Happiness Score

In [4]:
happiness.groupby('Country')['Happiness Score'].sum().sort_values(ascending=False).head(10)

Country
Switzerland    7.587
Iceland        7.561
Denmark        7.527
Norway         7.522
Canada         7.427
Finland        7.406
Netherlands    7.378
Sweden         7.364
New Zealand    7.286
Australia      7.284
Name: Happiness Score, dtype: float64

The 10 countries with the lowest Happiness Score

In [6]:
happiness.groupby('Country')['Happiness Score'].sum().sort_values(ascending=True).head(10)

Country
Togo            2.839
Burundi         2.905
Syria           3.006
Benin           3.340
Rwanda          3.465
Afghanistan     3.575
Burkina Faso    3.587
Ivory Coast     3.655
Guinea          3.656
Chad            3.667
Name: Happiness Score, dtype: float64
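
Assuming each country appears only once in the table (which the 158 rows and ranks 1 to 158 suggest), the `groupby` step is not strictly needed: `nlargest` and `nsmallest` pick out the extreme rows directly and keep the other columns too. A minimal sketch on a hand-typed table (`sample`, an invented name, with values copied from the two lists above):

```python
import pandas as pd

# Three of the highest and three of the lowest scores from the lists above
sample = pd.DataFrame({
    "Country": ["Switzerland", "Iceland", "Denmark", "Togo", "Burundi", "Syria"],
    "Happiness Score": [7.587, 7.561, 7.527, 2.839, 2.905, 3.006],
})

highest = sample.nlargest(2, "Happiness Score")    # two highest scores
lowest = sample.nsmallest(2, "Happiness Score")    # two lowest scores

print(highest[["Country", "Happiness Score"]])
print(lowest[["Country", "Happiness Score"]])
```

On the full data, `happiness.nlargest(10, 'Happiness Score')` would be the equivalent call for the ten highest scores.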

The averages by region:

In [7]:
# numeric_only=True skips the text column (Country), which cannot be averaged
happiness.groupby('Region').mean(numeric_only=True)


Region,Happiness Rank,Happiness Score,Standard Error,Economy (GDP per Capita),Family,Health (Life Expectancy),Freedom,Trust (Government Corruption),Generosity,Dystopia Residual
Australia and New Zealand,9.5,7.285,0.03727,1.29188,1.31445,0.919965,0.64531,0.392795,0.455315,2.265355
Central and Eastern Europe,79.0,5.332931,0.045208,0.942438,1.053042,0.718774,0.358269,0.086674,0.152264,2.0214
Eastern Asia,64.5,5.626167,0.037225,1.15178,1.099427,0.877388,0.46249,0.127695,0.225885,1.681607
Latin America and Caribbean,46.909091,6.144682,0.061127,0.876815,1.10472,0.70387,0.50174,0.117172,0.217788,2.622577
Middle East and Northern Africa,77.6,5.4069,0.046337,1.066974,0.92049,0.705615,0.361751,0.181702,0.190376,1.980008
North America,10.0,7.273,0.03696,1.3604,1.28486,0.88371,0.589505,0.244235,0.42958,2.480935
Southeastern Asia,81.222222,5.317444,0.042691,0.789054,0.940468,0.677357,0.557104,0.151276,0.419261,1.78302
Southern Asia,113.142857,4.580857,0.032161,0.560486,0.645321,0.54083,0.373337,0.102536,0.341429,2.016769
Sub-Saharan Africa,127.9,4.2028,0.055299,0.380473,0.809085,0.282332,0.365944,0.123878,0.221137,2.01998
Western Europe,29.52381,6.689619,0.037625,1.298596,1.247302,0.909148,0.549926,0.231463,0.302109,2.151185
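
Regional totals and regional averages can tell very different stories, because a total depends on how many countries a region contains. As a rough check, the sketch below copies the Happiness Score totals and averages from the two `groupby` outputs in this section and divides one by the other to recover the number of countries per region. The names `regions`, `total`, `average` and `countries` are invented for this illustration.

```python
import pandas as pd

# Happiness Score totals and averages copied from the groupby outputs
regions = pd.DataFrame(
    {
        "total": [168.112, 154.655, 140.482, 135.183, 108.138,
                  47.857, 33.757, 32.066, 14.57, 14.546],
        "average": [4.2028, 5.332931, 6.689619, 6.144682, 5.4069,
                    5.317444, 5.626167, 4.580857, 7.285, 7.273],
    },
    index=[
        "Sub-Saharan Africa", "Central and Eastern Europe", "Western Europe",
        "Latin America and Caribbean", "Middle East and Northern Africa",
        "Southeastern Asia", "Eastern Asia", "Southern Asia",
        "Australia and New Zealand", "North America",
    ],
)

# total = average x number of countries, so the count is total / average
regions["countries"] = (regions["total"] / regions["average"]).round().astype(int)

# Show the regions from highest to lowest average score
print(regions.sort_values("average", ascending=False))
```

With the full dataframe loaded, `happiness.groupby('Region')['Happiness Score'].agg(['count', 'sum', 'mean'])` should give the same three figures in one step.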


In [None]:
happiness[happiness['Health (Life Expectancy)']==happiness['Health (Life Expectancy)'].max()]

Unnamed: 0,Country,Region,Happiness Rank,Happiness Score,Standard Error,Economy (GDP per Capita),Family,Health (Life Expectancy),Freedom,Trust (Government Corruption),Generosity,Dystopia Residual
23,Singapore,Southeastern Asia,24,6.798,0.0378,1.52186,1.02,1.02525,0.54252,0.4921,0.31105,1.88501


The country with the highest Health (Life Expectancy) score is Singapore.

In [None]:
happiness[happiness['Health (Life Expectancy)']==happiness['Health (Life Expectancy)'].min()]

Unnamed: 0,Country,Region,Happiness Rank,Happiness Score,Standard Error,Economy (GDP per Capita),Family,Health (Life Expectancy),Freedom,Trust (Government Corruption),Generosity,Dystopia Residual
122,Sierra Leone,Sub-Saharan Africa,123,4.507,0.07068,0.33024,0.95571,0.0,0.4084,0.08786,0.21488,2.51009


The country with the lowest Health (Life Expectancy) score is Sierra Leone.
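
If only the country name is wanted rather than the whole row, `idxmax` and `idxmin` give the index label of the highest and lowest value, which `.loc` can then look up. A minimal sketch on three hand-typed rows (`sample` and the other variable names are invented here; the values are copied from the outputs above). Note that these methods return only the first match if several rows tie.

```python
import pandas as pd

# Three rows copied from the outputs above, health column only
sample = pd.DataFrame({
    "Country": ["Switzerland", "Singapore", "Sierra Leone"],
    "Health (Life Expectancy)": [0.94143, 1.02525, 0.0],
})

health = sample["Health (Life Expectancy)"]

# idxmax / idxmin return the index label of the highest / lowest value
highest_country = sample.loc[health.idxmax(), "Country"]
lowest_country = sample.loc[health.idxmin(), "Country"]

print(highest_country, lowest_country)
```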

Top-10 countries with the greatest scores for Economy (GDP per Capita)

In [None]:
happiness.groupby('Country')['Economy (GDP per Capita)'].sum().sort_values(ascending=False).head(10)

Country
Qatar                   1.69042
Luxembourg              1.56391
Kuwait                  1.55422
Singapore               1.52186
Norway                  1.45900
United Arab Emirates    1.42727
Switzerland             1.39651
Saudi Arabia            1.39541
United States           1.39451
Hong Kong               1.38604
Name: Economy (GDP per Capita), dtype: float64

---
### Next steps

There are data sets for the years 2015 to 2019 available.  To access and try out other years, change 2015 to the required year in the URL in the first code cell.  Leave the rest exactly as it is.  

Other years may have different column headings and so there will be different data to play with.
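
Rather than editing the URL by hand each time, the year could be passed to a small helper. The sketch below assumes, as described above, that the files for 2015 to 2019 differ only in the year that appears in the URL. `BASE_URL`, `happiness_url` and `load_happiness` are hypothetical names invented for this example and are not part of the original notebook.

```python
import pandas as pd

# URL pattern taken from the first code cell, with the year left as a blank
BASE_URL = ("https://github.com/futureCodersSE/working-with-data/"
            "blob/main/Happiness-Data/{year}.xlsx?raw=true")


def happiness_url(year):
    """Return the download URL for one year's file (hypothetical helper)."""
    if year not in range(2015, 2020):
        raise ValueError("Data sets are available for 2015 to 2019 only")
    return BASE_URL.format(year=year)


def load_happiness(year):
    """Download one year's data set into a dataframe (hypothetical helper)."""
    return pd.read_excel(happiness_url(year))


print(happiness_url(2016))
# To download a year: happiness_2016 = load_happiness(2016)
```

After loading another year, printing `happiness_2016.columns` is a quick way to see which column headings have changed before reusing the earlier cells.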