# DATA SORTING
---


To sort a DataFrame, use the sort_values() method. Data can be sorted in ascending order (the default) or in descending order.

<https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sort_values.html>

The data in the cars.csv file contains basic information about several luxury cars.

In [1]:
import pandas as pd
cars = pd.read_csv('cars.csv')
cars

Unnamed: 0,Name,PriceUSD,MotorCapacityL
0,Rolls-Royce Phantom,535500,6.75
1,Bentley Continental GT,185000,4.0
2,Mercedes-Maybach S-Class,173100,4.0
3,Ferrari 812 Superfast,335250,6.5
4,Lamborghini Aventador,417800,6.5
5,Aston Martin DB11,208100,4.0
6,Porsche Panamera,86300,3.0
7,Maserati Quattroporte,106000,3.0
8,Lexus LS,80500,3.5
9,Tesla Model S,82990,Dual Electric Motors


To sort the list of cars alphabetically, call sort_values() and pass the name of the column by which the data should be sorted.

In [2]:
cars.sort_values(by='Name')

Unnamed: 0,Name,PriceUSD,MotorCapacityL
5,Aston Martin DB11,208100,4.0
1,Bentley Continental GT,185000,4.0
3,Ferrari 812 Superfast,335250,6.5
4,Lamborghini Aventador,417800,6.5
8,Lexus LS,80500,3.5
7,Maserati Quattroporte,106000,3.0
2,Mercedes-Maybach S-Class,173100,4.0
6,Porsche Panamera,86300,3.0
0,Rolls-Royce Phantom,535500,6.75
9,Tesla Model S,82990,Dual Electric Motors


You can also arrange the data in reverse order, e.g. starting with the most expensive car, by passing ascending=False.

In [3]:
cars.sort_values(by='PriceUSD', ascending=False)

Unnamed: 0,Name,PriceUSD,MotorCapacityL
0,Rolls-Royce Phantom,535500,6.75
4,Lamborghini Aventador,417800,6.5
3,Ferrari 812 Superfast,335250,6.5
5,Aston Martin DB11,208100,4.0
1,Bentley Continental GT,185000,4.0
2,Mercedes-Maybach S-Class,173100,4.0
7,Maserati Quattroporte,106000,3.0
6,Porsche Panamera,86300,3.0
9,Tesla Model S,82990,Dual Electric Motors
8,Lexus LS,80500,3.5
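
The calls above display a sorted copy; they do not reorder cars itself. The sketch below illustrates this on a small, inline subset of the data (the names cars_sample, by_price and by_price_renumbered are made up for this example). It also shows the ignore_index parameter, which is assumed to be available (pandas 1.0 or newer) and renumbers the rows after sorting.

```python
import pandas as pd

# A small subset of the cars data, rebuilt inline so the example runs on its own.
cars_sample = pd.DataFrame(
    {
        "Name": ["Porsche Panamera", "Rolls-Royce Phantom", "Lexus LS"],
        "PriceUSD": [86300, 535500, 80500],
    }
)

# sort_values returns a new DataFrame; cars_sample keeps its original row order.
by_price = cars_sample.sort_values(by="PriceUSD", ascending=False)

# ignore_index=True relabels the sorted rows 0..n-1 instead of keeping the old labels.
by_price_renumbered = cars_sample.sort_values(
    by="PriceUSD", ascending=False, ignore_index=True
)

print(by_price)
print(by_price_renumbered)
```

To keep the sorted order, assign the result back to a variable, for example cars = cars.sort_values(by='PriceUSD').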


### Tasks

The data below contains the exam results and whether the exam was passed.

Name     | Total | Result
---------|-------|---------
Olivier  | 90    | passed
Naomi    | 42    | failed
Olivia   | 71    | passed
Nolan    | 100   | passed
Dylan    | 39    | failed

Follow the instructions below. After completing each point, display the contents of the DataFrame.

* Based on the data in the table, create the content of the DataFrame.

In [4]:
# Alternatively:
# exam_results = pd.DataFrame(
#     {
#         "Name": ["Olivier", "Naomi", "Olivia", "Nolan", "Dylan"],
#         "Total": [90, 42, 71, 100, 39],
#         "Result": ["passed", "failed", "passed", "passed", "failed"],
#     }
# )

exam_results = pd.DataFrame(
    [
        ["Olivier", 90, "passed"],
        ["Naomi", 42, "failed"],
        ["Olivia", 71, "passed"],
        ["Nolan", 100, "passed"],
        ["Dylan", 39, "failed"],
    ],
    columns=["Name", "Total", "Result"],
)

exam_results

Unnamed: 0,Name,Total,Result
0,Olivier,90,passed
1,Naomi,42,failed
2,Olivia,71,passed
3,Nolan,100,passed
4,Dylan,39,failed


* Sort the list of people alphabetically.

In [5]:
exam_results.sort_values("Name")

Unnamed: 0,Name,Total,Result
4,Dylan,39,failed
1,Naomi,42,failed
3,Nolan,100,passed
2,Olivia,71,passed
0,Olivier,90,passed


* Sort the list of people according to their points, starting with the highest value.

In [6]:
exam_results.sort_values("Total", ascending=False)

Unnamed: 0,Name,Total,Result
3,Nolan,100,passed
0,Olivier,90,passed
2,Olivia,71,passed
1,Naomi,42,failed
4,Dylan,39,failed


* Sort the list of people by their score, starting with the people who passed the exam.

In [7]:
exam_results.sort_values(["Result", "Total"], ascending=False)

Unnamed: 0,Name,Total,Result
3,Nolan,100,passed
0,Olivier,90,passed
2,Olivia,71,passed
1,Naomi,42,failed
4,Dylan,39,failed


* Sort the list of people by their score, starting with those who passed the exam and then ordering the names alphabetically. Hint: use the link above to see how to sort data by more than one column.

In [8]:
exam_results.sort_values(["Result", "Name", "Total"], ascending=[False, True, False])

Unnamed: 0,Name,Total,Result
3,Nolan,100,passed
2,Olivia,71,passed
0,Olivier,90,passed
4,Dylan,39,failed
1,Naomi,42,failed
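
When several columns are passed, each later column only breaks ties left by the earlier ones. In the exam data every name is unique, so the third key (Total) never changes the order. The sketch below uses hypothetical data with a repeated name to make the effect of the third key visible; the DataFrame results and its values are invented for illustration.

```python
import pandas as pd

# Hypothetical data with a repeated name, so the third sort key has a tie to break.
results = pd.DataFrame(
    {
        "Name": ["Nolan", "Dylan", "Nolan", "Olivia"],
        "Total": [60, 39, 100, 71],
        "Result": ["passed", "failed", "passed", "passed"],
    }
)

# One ascending flag per column: Result descending ("passed" before "failed"),
# Name ascending, Total descending within rows that share Result and Name.
ordered = results.sort_values(
    ["Result", "Name", "Total"],
    ascending=[False, True, False],
)

print(ordered)
```

Here the two rows named Nolan are ordered by Total (100 before 60), which is the only place the third key matters.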


Add a 'University' column to your exam DataFrame. Assign 'UEK' to the first two people and 'AGH' to the remaining three. Display the updated DataFrame.

In [9]:
exam_results["University"] = ["UEK", "UEK", "AGH", "AGH", "AGH"]
exam_results

Unnamed: 0,Name,Total,Result,University
0,Olivier,90,passed,UEK
1,Naomi,42,failed,UEK
2,Olivia,71,passed,AGH
3,Nolan,100,passed,AGH
4,Dylan,39,failed,AGH


With the modified DataFrame:

* Sort the list by university, and then by name. Display the sorted data.

In [10]:
exam_results.sort_values(["University", "Name"])

Unnamed: 0,Name,Total,Result,University
4,Dylan,39,failed,AGH
3,Nolan,100,passed,AGH
2,Olivia,71,passed,AGH
1,Naomi,42,failed,UEK
0,Olivier,90,passed,UEK


* Display the people's names alphabetically (only one column of data).

In [11]:
exam_results.loc[:, "Name"].sort_values()

4      Dylan
1      Naomi
3      Nolan
2     Olivia
0    Olivier
Name: Name, dtype: object
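
Selecting a single column by its label gives a Series, which is why the output above has no column header. As a minimal sketch (the variable names are chosen for this example), selecting with a list of labels keeps the result as a one-column DataFrame instead; in that case sort_values needs the column name.

```python
import pandas as pd

# Inline copy of the relevant exam data so the example runs on its own.
exam_names = pd.DataFrame(
    {"Name": ["Olivier", "Naomi", "Olivia", "Nolan", "Dylan"]}
)

# A single label selects a Series; Series.sort_values takes no column name.
names_series = exam_names["Name"].sort_values()

# A list of labels selects a DataFrame; sort_values needs the column to sort by.
names_frame = exam_names[["Name"]].sort_values("Name")

print(type(names_series).__name__)
print(type(names_frame).__name__)
```

Both results hold the names in the same alphabetical order; only the container type differs.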

* Display two columns, Total and Name, in that order. Sort the data by the number of points obtained, in descending order.

In [12]:
exam_results.loc[:, ["Total", "Name"]].sort_values("Total", ascending=False)

Unnamed: 0,Total,Name
3,100,Nolan
0,90,Olivier
2,71,Olivia
1,42,Naomi
4,39,Dylan
