# Sorting a DataFrame

[Pandas `sort_values()` documentation](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sort_values.html)

### Import Dependencies

In [16]:
import pandas as pd
import os

### Load the data file

In [17]:
csv_path = os.path.join("..", "Resources", "Happiness_2017.csv")

happiness_df = pd.read_csv(csv_path)
happiness_df.head()

Unnamed: 0,Country,Happiness.Rank,Happiness.Score,Whisker.high,Whisker.low,Economy..GDP.per.Capita.,Family,Health..Life.Expectancy.,Freedom,Generosity,Trust..Government.Corruption.,Dystopia.Residual
0,Norway,1,7.537,7.594445,7.479556,1.616463,1.533524,0.796667,0.635423,0.362012,0.315964,2.277027
1,Denmark,2,7.522,7.581728,7.462272,1.482383,1.551122,0.792566,0.626007,0.35528,0.40077,2.313707
2,Iceland,3,7.504,7.62203,7.38597,1.480633,1.610574,0.833552,0.627163,0.47554,0.153527,2.322715
3,Switzerland,4,7.494,7.561772,7.426227,1.56498,1.516912,0.858131,0.620071,0.290549,0.367007,2.276716
4,Finland,5,7.469,7.527542,7.410458,1.443572,1.540247,0.809158,0.617951,0.245483,0.382612,2.430182


### Sorting the DataFrame by the "Freedom" column

**NOTE:** By default, `sort_values()` sorts in ascending order (lowest to highest). It returns a new DataFrame; `happiness_df` itself is not modified.

In [18]:
happiness_df.sort_values("Freedom")

Unnamed: 0,Country,Happiness.Rank,Happiness.Score,Whisker.high,Whisker.low,Economy..GDP.per.Capita.,Family,Health..Life.Expectancy.,Freedom,Generosity,Trust..Government.Corruption.,Dystopia.Residual
139,Angola,140,3.795,3.951642,3.638358,0.858428,1.104412,0.049869,0.000000,0.097926,0.069720,1.614482
129,Sudan,130,4.139,4.345747,3.932253,0.659517,1.214009,0.290921,0.014996,0.182317,0.089848,1.687066
144,Haiti,145,3.603,3.734715,3.471285,0.368610,0.640450,0.277321,0.030370,0.489204,0.099872,1.697168
153,Burundi,154,2.905,3.074690,2.735310,0.091623,0.629794,0.151611,0.059901,0.204435,0.084148,1.683024
151,Syria,152,3.462,3.663669,3.260331,0.777153,0.396103,0.500533,0.081539,0.493664,0.151347,1.061574
...,...,...,...,...,...,...,...,...,...,...,...,...
1,Denmark,2,7.522,7.581728,7.462272,1.482383,1.551122,0.792566,0.626007,0.355280,0.400770,2.313707
2,Iceland,3,7.504,7.622030,7.385970,1.480633,1.610574,0.833552,0.627163,0.475540,0.153527,2.322715
128,Cambodia,129,4.168,4.278518,4.057483,0.601765,1.006238,0.429783,0.633376,0.385923,0.068106,1.042941
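0,Norway,1,7.537,7.594445,7.479556,1.616463,1.533524,0.796667,0.635423,0.362012,0.315964,2.277027
46,Uzbekistan,47,5.971,6.065538,5.876463,0.786441,1.548969,0.498273,0.658249,0.415984,0.246528,1.816914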
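
The default ordering can also be checked without the data file. The following is a minimal sketch on a small made-up DataFrame (the name `toy_df` and its values are hypothetical, not taken from the happiness data); it also shows that the original DataFrame is left untouched.

```python
import pandas as pd

# Hypothetical toy data, not taken from Happiness_2017.csv
toy_df = pd.DataFrame({
    "Country": ["A", "B", "C", "D"],
    "Freedom": [0.6, 0.1, 0.4, 0.3],
})

# Default is ascending=True: lowest Freedom first
ascending_df = toy_df.sort_values("Freedom")

print(ascending_df["Country"].tolist())  # ['B', 'D', 'C', 'A']
print(toy_df["Country"].tolist())        # ['A', 'B', 'C', 'D'] (original unchanged)
```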


### Sort descending

To sort from highest to lowest, pass `ascending=False` to `sort_values()`.

In [19]:
happiness_df.sort_values("Freedom", ascending=False)

Unnamed: 0,Country,Happiness.Rank,Happiness.Score,Whisker.high,Whisker.low,Economy..GDP.per.Capita.,Family,Health..Life.Expectancy.,Freedom,Generosity,Trust..Government.Corruption.,Dystopia.Residual
46,Uzbekistan,47,5.971,6.065538,5.876463,0.786441,1.548969,0.498273,0.658249,0.415984,0.246528,1.816914
0,Norway,1,7.537,7.594445,7.479556,1.616463,1.533524,0.796667,0.635423,0.362012,0.315964,2.277027
128,Cambodia,129,4.168,4.278518,4.057483,0.601765,1.006238,0.429783,0.633376,0.385923,0.068106,1.042941
2,Iceland,3,7.504,7.622030,7.385970,1.480633,1.610574,0.833552,0.627163,0.475540,0.153527,2.322715
1,Denmark,2,7.522,7.581728,7.462272,1.482383,1.551122,0.792566,0.626007,0.355280,0.400770,2.313707
...,...,...,...,...,...,...,...,...,...,...,...,...
151,Syria,152,3.462,3.663669,3.260331,0.777153,0.396103,0.500533,0.081539,0.493664,0.151347,1.061574
153,Burundi,154,2.905,3.074690,2.735310,0.091623,0.629794,0.151611,0.059901,0.204435,0.084148,1.683024
144,Haiti,145,3.603,3.734715,3.471285,0.368610,0.640450,0.277321,0.030370,0.489204,0.099872,1.697168
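129,Sudan,130,4.139,4.345747,3.932253,0.659517,1.214009,0.290921,0.014996,0.182317,0.089848,1.687066
139,Angola,140,3.795,3.951642,3.638358,0.858428,1.104412,0.049869,0.000000,0.097926,0.069720,1.614482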
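
A common follow-up to a descending sort is keeping only the first few rows. As a rough sketch on hypothetical toy data (the `toy_df` and `top_two` names and all values are made up for illustration), chaining `head()` after the sort gives the top entries:

```python
import pandas as pd

# Hypothetical toy data, not taken from Happiness_2017.csv
toy_df = pd.DataFrame({
    "Country": ["A", "B", "C", "D"],
    "Freedom": [0.6, 0.1, 0.4, 0.3],
})

# Highest Freedom first, then keep only the top two rows
top_two = toy_df.sort_values("Freedom", ascending=False).head(2)

print(top_two["Country"].tolist())  # ['A', 'C']
```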


### Sorting based on multiple columns

**NOTE:** When sorting by multiple columns, `sort_values()` prioritizes the columns in the order they appear in the list.

In [20]:
# Rows are ordered by Family first; ties in Family are broken by Generosity.
# ascending=False applies to both columns.
sorted_df = happiness_df.sort_values(['Family', 'Generosity'], ascending=False)
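
Ties are rare in the real data, so the tie-breaking is easier to see on a small example. The sketch below uses hypothetical toy data (the names `toy_df`, `both_desc`, and `mixed` and all values are made up) with deliberate ties in `Family`. It also shows that `ascending` accepts a list with one flag per column, in case the two columns should be sorted in different directions.

```python
import pandas as pd

# Hypothetical toy data with deliberate ties in Family
toy_df = pd.DataFrame({
    "Country": ["A", "B", "C", "D"],
    "Family": [1.5, 1.2, 1.5, 1.2],
    "Generosity": [0.2, 0.4, 0.3, 0.1],
})

# Both columns descending: ties in Family are broken by higher Generosity
both_desc = toy_df.sort_values(["Family", "Generosity"], ascending=False)
print(both_desc["Country"].tolist())  # ['C', 'A', 'B', 'D']

# One direction per column: Family descending, Generosity ascending
mixed = toy_df.sort_values(["Family", "Generosity"], ascending=[False, True])
print(mixed["Country"].tolist())  # ['A', 'C', 'D', 'B']
```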

### Reset the index

After sorting, the rows are in a new order but keep their original index labels, so the row labeled 0 is not necessarily the first row (the one with the highest or lowest value). Resetting the index renumbers the rows from 0 in their new order.

Passing `drop=True` to `reset_index()` discards the old index labels instead of storing them as a new column in the DataFrame. Note that `reset_index()` returns a new DataFrame; `sorted_df` keeps its original labels unless the result is assigned back.

In [21]:
sorted_df.reset_index(drop=True)

Unnamed: 0,Country,Happiness.Rank,Happiness.Score,Whisker.high,Whisker.low,Economy..GDP.per.Capita.,Family,Health..Life.Expectancy.,Freedom,Generosity,Trust..Government.Corruption.,Dystopia.Residual
0,Iceland,3,7.504,7.622030,7.385970,1.480633,1.610574,0.833552,0.627163,0.475540,0.153527,2.322715
1,Ireland,15,6.977,7.043352,6.910649,1.535707,1.558231,0.809783,0.573110,0.427858,0.298388,1.773869
2,Denmark,2,7.522,7.581728,7.462272,1.482383,1.551122,0.792566,0.626007,0.355280,0.400770,2.313707
3,Uzbekistan,47,5.971,6.065538,5.876463,0.786441,1.548969,0.498273,0.658249,0.415984,0.246528,1.816914
4,New Zealand,8,7.314,7.379510,7.248490,1.405706,1.548195,0.816760,0.614062,0.500005,0.382817,2.046456
...,...,...,...,...,...,...,...,...,...,...,...,...
150,Malawi,136,3.970,4.077479,3.862521,0.233442,0.512569,0.315090,0.466915,0.287170,0.072712,2.081786
151,Benin,143,3.657,3.745784,3.568217,0.431085,0.435300,0.209930,0.425963,0.207948,0.060929,1.885631
152,Togo,150,3.495,3.594038,3.395962,0.305445,0.431883,0.247106,0.380426,0.196896,0.095665,1.837229
153,Syria,152,3.462,3.663669,3.260331,0.777153,0.396103,0.500533,0.081539,0.493664,0.151347,1.061574
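
The effect of `drop` is easiest to compare side by side. Here is a minimal sketch on hypothetical toy data (the names `toy_df`, `sorted_toy`, `renumbered`, and `kept` are made up for illustration); it shows the index before and after the reset, and what happens when `drop=True` is left out.

```python
import pandas as pd

# Hypothetical toy data, not taken from Happiness_2017.csv
toy_df = pd.DataFrame({
    "Country": ["A", "B", "C"],
    "Family": [1.2, 1.6, 1.4],
})

sorted_toy = toy_df.sort_values("Family", ascending=False)
print(sorted_toy.index.tolist())  # [1, 2, 0] (original labels, new order)

# drop=True: old labels are discarded and rows are renumbered from 0
renumbered = sorted_toy.reset_index(drop=True)
print(renumbered.index.tolist())  # [0, 1, 2]

# reset_index() returned a new DataFrame; sorted_toy is unchanged
print(sorted_toy.index.tolist())  # [1, 2, 0]

# Without drop=True, the old labels are kept as a column named "index"
kept = sorted_toy.reset_index()
print(kept.columns.tolist())  # ['index', 'Country', 'Family']
```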


In [22]:
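# sorted_df still carries its pre-sort index labels: the reset_index() result above was not assigned
sorted_df[sorted_df['Country'] == 'United States']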

Unnamed: 0,Country,Happiness.Rank,Happiness.Score,Whisker.high,Whisker.low,Economy..GDP.per.Capita.,Family,Health..Life.Expectancy.,Freedom,Generosity,Trust..Government.Corruption.,Dystopia.Residual
13,United States,14,6.993,7.074657,6.911343,1.546259,1.419921,0.774287,0.505741,0.392579,0.135639,2.218113
