<a href="https://colab.research.google.com/github/Rusanyuk/Python-For-Data/blob/main/How_happy_is_the_world.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# How happy is the world?

![HappinessImage-Benjamin Scott](https://drive.google.com/uc?id=1wWdXTclLAjPpZUiKkR8XGg0Sl7iD1u8T)  

(Image by Benjamin Scott, [source](https://www.natureindex.com/news-blog/data-visualization-these-are-the-happiest-countries-world-happiness-report-twenty-nineteen))   

The Sustainable Development Solutions Network (SDSN) compiles data relating to happiness from across the world and uses it to rank countries by happiness score.

This is not an exact science but can give food for thought in terms of what factors might have the most impact on a nation's happiness levels.

The data is taken from the Gallup World Poll rather than collected directly by SDSN.  
Countries are grouped by region.  

### The factors included are:
*  Economy (measured in GDP per Capita)
*  Family (support systems)
*  Health (measured by Life Expectancy)
*  Freedom (sense of)
*  Trust (Government Corruption)
*  Generosity (charitable inclinations)
*  Dystopia Residual
    *  Dystopia is a hypothetical country, the unhappiest possible, with the lowest levels in all six of the above factors
    *  The Dystopia Residual is the part of a country's Happiness Score that the six factors do not account for, so in this data the six factor columns plus the Dystopia Residual add up to the Happiness Score
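
As a quick check of how these columns relate, the sketch below rebuilds two rows from the 2016 output shown further down and adds the six factor columns to the Dystopia Residual. The names `sample`, `factor_columns`, `Rebuilt Score` and `Difference` are illustrative and not part of the original notebook; the numbers are copied from the table output.

```python
import pandas as pd

# Two rows copied from the notebook's 2016 output (Denmark and Switzerland).
sample = pd.DataFrame({
    "Country": ["Denmark", "Switzerland"],
    "Happiness Score": [7.526, 7.509],
    "Economy (GDP per Capita)": [1.44178, 1.52733],
    "Family": [1.16374, 1.14524],
    "Health (Life Expectancy)": [0.79504, 0.86303],
    "Freedom": [0.57941, 0.58557],
    "Trust (Government Corruption)": [0.44453, 0.41203],
    "Generosity": [0.36171, 0.28083],
    "Dystopia Residual": [2.73939, 2.69463],
})

# The six factor columns listed above (illustrative variable name).
factor_columns = [
    "Economy (GDP per Capita)",
    "Family",
    "Health (Life Expectancy)",
    "Freedom",
    "Trust (Government Corruption)",
    "Generosity",
]

# Add the six factors row by row, then add the residual.
sample["Rebuilt Score"] = sample[factor_columns].sum(axis=1) + sample["Dystopia Residual"]

# Any gap between the published and rebuilt score should be rounding only.
sample["Difference"] = (sample["Happiness Score"] - sample["Rebuilt Score"]).abs()
print(sample[["Country", "Happiness Score", "Rebuilt Score", "Difference"]])
```

For these two rows the rebuilt score matches the published score to within rounding, which suggests the factor columns are contributions to the score rather than raw measurements.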

Let's take a look at the data.


---
### Open a data set

Open the data set, an Excel file with only one sheet (so `sheet_name` is not necessary), from here: https://github.com/futureCodersSE/working-with-data/blob/main/Happiness-Data/2016.xlsx?raw=true

Interrogate the data (head, tail, iloc) to get to know what it contains.


In [26]:
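import pandas as pd
# The workbook has a single sheet, so sheet_name is not needed
url = "https://github.com/futureCodersSE/working-with-data/blob/main/Happiness-Data/2016.xlsx?raw=true"
happiness = pd.read_excel(url)
display(happiness.head())  # first five rows
display(happiness.tail())  # last five rows
# Ten rows around the middle of the table
middle = len(happiness) // 2
display(happiness.iloc[middle - 5:middle + 5])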

Unnamed: 0,Country,Region,Happiness Rank,Happiness Score,Lower Confidence Interval,Upper Confidence Interval,Economy (GDP per Capita),Family,Health (Life Expectancy),Freedom,Trust (Government Corruption),Generosity,Dystopia Residual
0,Denmark,Western Europe,1,7.526,7.46,7.592,1.44178,1.16374,0.79504,0.57941,0.44453,0.36171,2.73939
1,Switzerland,Western Europe,2,7.509,7.428,7.59,1.52733,1.14524,0.86303,0.58557,0.41203,0.28083,2.69463
2,Iceland,Western Europe,3,7.501,7.333,7.669,1.42666,1.18326,0.86733,0.56624,0.14975,0.47678,2.83137
3,Norway,Western Europe,4,7.498,7.421,7.575,1.57744,1.1269,0.79579,0.59609,0.35776,0.37895,2.66465
4,Finland,Western Europe,5,7.413,7.351,7.475,1.40598,1.13464,0.81091,0.57104,0.41004,0.25492,2.82596


Unnamed: 0,Country,Region,Happiness Rank,Happiness Score,Lower Confidence Interval,Upper Confidence Interval,Economy (GDP per Capita),Family,Health (Life Expectancy),Freedom,Trust (Government Corruption),Generosity,Dystopia Residual
152,Benin,Sub-Saharan Africa,153,3.484,3.404,3.564,0.39499,0.10419,0.21028,0.39747,0.06681,0.2018,2.10812
153,Afghanistan,Southern Asia,154,3.36,3.288,3.432,0.38227,0.11037,0.17344,0.1643,0.07112,0.31268,2.14558
154,Togo,Sub-Saharan Africa,155,3.303,3.192,3.414,0.28123,0.0,0.24811,0.34678,0.11587,0.17517,2.1354
155,Syria,Middle East and Northern Africa,156,3.069,2.936,3.202,0.74719,0.14866,0.62994,0.06912,0.17233,0.48397,0.81789
156,Burundi,Sub-Saharan Africa,157,2.905,2.732,3.078,0.06831,0.23442,0.15747,0.0432,0.09419,0.2029,2.10404


Unnamed: 0,Country,Region,Happiness Rank,Happiness Score,Lower Confidence Interval,Upper Confidence Interval,Economy (GDP per Capita),Family,Health (Life Expectancy),Freedom,Trust (Government Corruption),Generosity,Dystopia Residual
73,Croatia,Central and Eastern Europe,74,5.488,5.402,5.574,1.18649,0.60809,0.70524,0.23907,0.04002,0.18434,2.52462
74,Hong Kong,Eastern Asia,75,5.458,5.362,5.554,1.5107,0.87021,0.95277,0.48079,0.31647,0.40097,0.92614
75,Somalia,Sub-Saharan Africa,76,5.44,5.321,5.559,0.0,0.33613,0.11466,0.56778,0.3118,0.27225,3.83772
76,Kosovo,Central and Eastern Europe,77,5.401,5.308,5.494,0.90145,0.66062,0.54,0.14396,0.06547,0.27992,2.80998
77,Turkey,Middle East and Northern Africa,78,5.389,5.295,5.483,1.16492,0.87717,0.64718,0.23889,0.12348,0.04707,2.29074
78,Indonesia,Southeastern Asia,79,5.314,5.237,5.391,0.95104,0.87625,0.49374,0.39237,0.00322,0.56521,2.03171
79,Jordan,Middle East and Northern Africa,80,5.303,5.187,5.419,0.99673,0.86216,0.60712,0.36023,0.13297,0.14262,2.20142
80,Azerbaijan,Central and Eastern Europe,81,5.291,5.226,5.356,1.12373,0.76042,0.54504,0.35327,0.17914,0.0564,2.2735
81,Philippines,Southeastern Asia,82,5.279,5.16,5.398,0.81217,0.87877,0.47036,0.54854,0.11757,0.21674,2.23484
82,China,Eastern Asia,83,5.245,5.199,5.291,1.0278,0.79381,0.73561,0.44012,0.02745,0.04959,2.17087
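
Beyond `head`, `tail` and `iloc`, a few other pandas attributes and methods help when getting to know a table. The sketch below is a minimal example, assuming a small in-memory sample built from three rows of the output above rather than the full Excel file; `csv_text` and `sample` are illustrative names. It also shows one way to avoid the `Unnamed: 0` column: `index_col=0` tells pandas to use the first column as the row index, and `pd.read_excel` accepts the same argument.

```python
import io
import pandas as pd

# Three rows copied from the output above, trimmed to a few columns.
csv_text = """,Country,Region,Happiness Rank,Happiness Score
0,Denmark,Western Europe,1,7.526
1,Switzerland,Western Europe,2,7.509
2,Iceland,Western Europe,3,7.501
"""

# index_col=0 treats the first (unnamed) column as the row index,
# so no "Unnamed: 0" column appears in the dataframe.
sample = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(sample.shape)             # (number of rows, number of columns)
print(sample.columns.tolist())  # column headings, exactly as typed
print(sample.dtypes)            # data type of each column
print(sample.describe())        # summary statistics for numeric columns
```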


---
### Sort the data in different ways

The data is currently sorted in order of rank.  To sort the table, run the code below, which names the column to sort on inside the brackets.

Then, **try sorting on other columns**. *Note: you must type the column heading in quotes, exactly as it appears in the table (including capitalisation).*  To sort on multiple columns, enter a list of column headings in the brackets, e.g. `.sort_values(['Region','Freedom'])`.



In [27]:
# Sort by a single column (ascending by default)
family_sorted_table = happiness.sort_values(['Family'])
display(family_sorted_table)  # output the table below
health_sorted_table = happiness.sort_values('Health (Life Expectancy)')
display(health_sorted_table)
# Sort by Region first, then by Freedom within each region
multiple_sorted_table = happiness.sort_values(['Region', 'Freedom'])
display(multiple_sorted_table)
# Filter (rather than sort) to show only the row for Ukraine
ukraine = happiness[happiness['Country'] == 'Ukraine']
display(ukraine)

Unnamed: 0,Country,Region,Happiness Rank,Happiness Score,Lower Confidence Interval,Upper Confidence Interval,Economy (GDP per Capita),Family,Health (Life Expectancy),Freedom,Trust (Government Corruption),Generosity,Dystopia Residual
154,Togo,Sub-Saharan Africa,155,3.303,3.192,3.414,0.28123,0.00000,0.24811,0.34678,0.11587,0.17517,2.13540
152,Benin,Sub-Saharan Africa,153,3.484,3.404,3.564,0.39499,0.10419,0.21028,0.39747,0.06681,0.20180,2.10812
153,Afghanistan,Southern Asia,154,3.360,3.288,3.432,0.38227,0.11037,0.17344,0.16430,0.07112,0.31268,2.14558
131,Malawi,Sub-Saharan Africa,132,4.156,4.041,4.271,0.08709,0.14700,0.29364,0.41430,0.07564,0.30968,2.82859
155,Syria,Middle East and Northern Africa,156,3.069,2.936,3.202,0.74719,0.14866,0.62994,0.06912,0.17233,0.48397,0.81789
...,...,...,...,...,...,...,...,...,...,...,...,...,...
18,Ireland,Western Europe,19,6.907,6.836,6.978,1.48341,1.16157,0.81455,0.54008,0.29754,0.44963,2.15988
0,Denmark,Western Europe,1,7.526,7.460,7.592,1.44178,1.16374,0.79504,0.57941,0.44453,0.36171,2.73939
48,Uzbekistan,Central and Eastern Europe,49,5.987,5.896,6.078,0.73591,1.16810,0.50163,0.60848,0.28333,0.34326,2.34638
7,New Zealand,Australia and New Zealand,8,7.334,7.264,7.404,1.36066,1.17278,0.83096,0.58147,0.41904,0.49401,2.47553


Unnamed: 0,Country,Region,Happiness Rank,Happiness Score,Lower Confidence Interval,Upper Confidence Interval,Economy (GDP per Capita),Family,Health (Life Expectancy),Freedom,Trust (Government Corruption),Generosity,Dystopia Residual
110,Sierra Leone,Sub-Saharan Africa,111,4.635,4.505,4.765,0.36485,0.62800,0.00000,0.30685,0.08196,0.23897,3.01402
143,Chad,Sub-Saharan Africa,144,3.763,3.672,3.854,0.42214,0.63178,0.03824,0.12807,0.04952,0.18667,2.30637
138,Ivory Coast,Sub-Saharan Africa,139,3.916,3.826,4.006,0.55507,0.57576,0.04476,0.40663,0.15530,0.20338,1.97478
140,Angola,Sub-Saharan Africa,141,3.866,3.753,3.979,0.84731,0.66366,0.04991,0.00589,0.08434,0.12071,2.09459
102,Nigeria,Sub-Saharan Africa,103,4.875,4.750,5.000,0.75216,0.64498,0.05108,0.27854,0.03050,0.23219,2.88586
...,...,...,...,...,...,...,...,...,...,...,...,...,...
36,Spain,Western Europe,37,6.361,6.288,6.434,1.34253,1.12945,0.87896,0.37545,0.06137,0.17665,2.39663
57,South Korea,Eastern Asia,57,5.835,5.747,5.923,1.35948,0.72194,0.88645,0.25168,0.07716,0.18824,2.35015
52,Japan,Eastern Asia,53,5.921,5.850,5.992,1.38007,1.06054,0.91491,0.46761,0.18985,0.10224,1.80584
21,Singapore,Southeastern Asia,22,6.739,6.674,6.804,1.64555,0.86758,0.94719,0.48770,0.46987,0.32706,1.99375


Unnamed: 0,Country,Region,Happiness Rank,Happiness Score,Lower Confidence Interval,Upper Confidence Interval,Economy (GDP per Capita),Family,Health (Life Expectancy),Freedom,Trust (Government Corruption),Generosity,Dystopia Residual
8,Australia,Australia and New Zealand,9,7.313,7.241,7.385,1.44443,1.10476,0.85120,0.56837,0.32331,0.47407,2.54650
7,New Zealand,Australia and New Zealand,8,7.334,7.264,7.404,1.36066,1.17278,0.83096,0.58147,0.41904,0.49401,2.47553
86,Bosnia and Herzegovina,Central and Eastern Europe,87,5.163,5.063,5.263,0.93383,0.64367,0.70766,0.09511,0.00000,0.29889,2.48406
122,Ukraine,Central and Eastern Europe,123,4.324,4.236,4.412,0.87287,1.01413,0.58628,0.12859,0.01829,0.20363,1.50066
120,Armenia,Central and Eastern Europe,121,4.360,4.266,4.454,0.86086,0.62477,0.64083,0.14037,0.03616,0.07793,1.97864
...,...,...,...,...,...,...,...,...,...,...,...,...,...
4,Finland,Western Europe,5,7.413,7.351,7.475,1.40598,1.13464,0.81091,0.57104,0.41004,0.25492,2.82596
0,Denmark,Western Europe,1,7.526,7.460,7.592,1.44178,1.16374,0.79504,0.57941,0.44453,0.36171,2.73939
9,Sweden,Western Europe,10,7.291,7.227,7.355,1.45181,1.08764,0.83121,0.58218,0.40867,0.38254,2.54734
1,Switzerland,Western Europe,2,7.509,7.428,7.590,1.52733,1.14524,0.86303,0.58557,0.41203,0.28083,2.69463


Unnamed: 0,Country,Region,Happiness Rank,Happiness Score,Lower Confidence Interval,Upper Confidence Interval,Economy (GDP per Capita),Family,Health (Life Expectancy),Freedom,Trust (Government Corruption),Generosity,Dystopia Residual
122,Ukraine,Central and Eastern Europe,123,4.324,4.236,4.412,0.87287,1.01413,0.58628,0.12859,0.01829,0.20363,1.50066
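
The sorts above all run in ascending order. As a minimal sketch of two variations, the example below sorts in descending order and mixes directions across columns using the `ascending` parameter of `sort_values`. It assumes a five-row sample copied from the output above; `sample`, `by_family_desc` and `mixed` are illustrative names.

```python
import pandas as pd

# Five rows copied from the output above, trimmed to four columns.
sample = pd.DataFrame({
    "Country": ["Denmark", "Switzerland", "Ukraine", "Armenia", "Australia"],
    "Region": [
        "Western Europe",
        "Western Europe",
        "Central and Eastern Europe",
        "Central and Eastern Europe",
        "Australia and New Zealand",
    ],
    "Family": [1.16374, 1.14524, 1.01413, 0.62477, 1.10476],
    "Freedom": [0.57941, 0.58557, 0.12859, 0.14037, 0.56837],
})

# Highest Family value first.
by_family_desc = sample.sort_values("Family", ascending=False)
print(by_family_desc)

# Region A to Z, then the highest Freedom first within each region.
# The ascending list lines up with the list of column headings.
mixed = sample.sort_values(["Region", "Freedom"], ascending=[True, False])
print(mixed)
```

`sort_values` returns a new dataframe each time, so `sample` itself keeps its original row order.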


---
### Summarising the data

Look at the happiness dataframe.  Create new dataframes from a range of rows, columns, statistical information, etc.

For each dataframe, add a text cell to explain what it is showing.

This Series shows the mean Happiness Score for each Region, sorted in descending order.

In [28]:
region_happiness_mean = happiness.groupby('Region')['Happiness Score'].mean().sort_values(ascending=False)
display(region_happiness_mean)

Region
Australia and New Zealand          7.323500
North America                      7.254000
Western Europe                     6.685667
Latin America and Caribbean        6.101750
Eastern Asia                       5.624167
Middle East and Northern Africa    5.386053
Central and Eastern Europe         5.370690
Southeastern Asia                  5.338889
Southern Asia                      4.563286
Sub-Saharan Africa                 4.136421
Name: Happiness Score, dtype: float64
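
A single `groupby` can also return several statistics at once by passing a list of function names to `agg`. The sketch below assumes an eight-row sample copied from the earlier output, so its figures will not match the full-data means above; `sample` and `summary` are illustrative names.

```python
import pandas as pd

# Eight rows copied from the earlier output, trimmed to three columns.
sample = pd.DataFrame({
    "Country": ["Denmark", "Switzerland", "Iceland", "Benin",
                "Togo", "Burundi", "Hong Kong", "China"],
    "Region": ["Western Europe", "Western Europe", "Western Europe",
               "Sub-Saharan Africa", "Sub-Saharan Africa", "Sub-Saharan Africa",
               "Eastern Asia", "Eastern Asia"],
    "Happiness Score": [7.526, 7.509, 7.501, 3.484, 3.303, 2.905, 5.458, 5.245],
})

# One row per region, one column per statistic, highest mean first.
summary = (
    sample.groupby("Region")["Happiness Score"]
    .agg(["mean", "median", "min", "max", "count"])
    .sort_values("mean", ascending=False)
)
print(summary)
```

Including `count` alongside the mean is a useful habit: a region with only two countries (as in the full-data top two here) gives a much less stable average than one with dozens.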

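This Series shows the median Trust (Government Corruption) value for each Region, sorted in descending order.

In [29]: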
region_trust_median = happiness.groupby('Region')['Trust (Government Corruption)'].median().sort_values(ascending=False)
display(region_trust_median)

Region
Australia and New Zealand          0.371175
Western Europe                     0.262480
North America                      0.230985
Middle East and Northern Africa    0.123480
Southeastern Asia                  0.115560
Latin America and Caribbean        0.105800
Sub-Saharan Africa                 0.095860
Southern Asia                      0.087220
Eastern Asia                       0.071730
Central and Eastern Europe         0.047620
Name: Trust (Government Corruption), dtype: float64
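
The task in this section also mentions building new dataframes from a range of rows and columns. The sketch below shows three common ways to do that, assuming a five-row sample copied from the middle-of-table output earlier; `sample`, `scores_only`, `richer` and `most_generous` are illustrative names.

```python
import pandas as pd

# Five rows copied from the middle-of-table output, trimmed to four columns.
sample = pd.DataFrame({
    "Country": ["Croatia", "Hong Kong", "Somalia", "Kosovo", "Turkey"],
    "Happiness Score": [5.488, 5.458, 5.440, 5.401, 5.389],
    "Economy (GDP per Capita)": [1.18649, 1.5107, 0.0, 0.90145, 1.16492],
    "Generosity": [0.18434, 0.40097, 0.27225, 0.27992, 0.04707],
})

# Columns only: pass a list of headings.
scores_only = sample[["Country", "Happiness Score"]]
print(scores_only)

# Rows and columns together: a condition picks the rows,
# a list of headings picks the columns.
richer = sample.loc[
    sample["Economy (GDP per Capita)"] > 1.0,
    ["Country", "Economy (GDP per Capita)"],
]
print(richer)

# The two rows with the largest Generosity values.
most_generous = sample.nlargest(2, "Generosity")
print(most_generous)
```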

---
### Next steps

There are data sets for the years 2015 to 2019 available.  To try out another year, change the year in the URL in the first code cell (2016 in this notebook) to the required year.  Leave the rest exactly as it is.  

Other years may have different column headings, so there will be different data to play with.
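
Before reusing the cells above on another year, it can help to compare column headings, since code such as `sort_values('Family')` fails if the heading is missing. The sketch below is one possible approach, not part of the original notebook. `compare_columns` is a hypothetical helper, and the second dataframe uses made-up headings purely for illustration; the real headings for other years should be read from the files themselves.

```python
import pandas as pd


def compare_columns(first, second):
    """Return the headings shared by two dataframes and those unique to each."""
    first_cols = set(first.columns)
    second_cols = set(second.columns)
    return {
        "shared": sorted(first_cols & second_cols),
        "only_first": sorted(first_cols - second_cols),
        "only_second": sorted(second_cols - first_cols),
    }


# Some of the 2016 headings seen in this notebook (no rows needed).
year_2016 = pd.DataFrame(columns=[
    "Country", "Region", "Happiness Rank", "Happiness Score",
    "Lower Confidence Interval", "Upper Confidence Interval",
])

# Hypothetical headings standing in for another year's file.
other_year = pd.DataFrame(columns=[
    "Country", "Happiness Rank", "Happiness Score", "Example Extra Column",
])

result = compare_columns(year_2016, other_year)
for label, headings in result.items():
    print(label, headings)
```

In practice the two arguments would be dataframes loaded with `pd.read_excel` from two different year URLs.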