# Top Travelers

Write a solution to report the distance traveled by each user.

Return the result table ordered by `travelled_distance` in descending order. If two or more users traveled the same distance, order them by name in ascending order.

The result format is in the following example.

**Example 1:**
```
Input: 
Users table:
+------+-----------+
| id   | name      |
+------+-----------+
| 1    | Alice     |
| 2    | Bob       |
| 3    | Alex      |
| 4    | Donald    |
| 7    | Lee       |
| 13   | Jonathan  |
| 19   | Elvis     |
+------+-----------+

Rides table:
+------+----------+----------+
| id   | user_id  | distance |
+------+----------+----------+
| 1    | 1        | 120      |
| 2    | 2        | 317      |
| 3    | 3        | 222      |
| 4    | 7        | 100      |
| 5    | 13       | 312      |
| 6    | 19       | 50       |
| 7    | 7        | 120      |
| 8    | 19       | 400      |
| 9    | 7        | 230      |
+------+----------+----------+

Output: 
+----------+--------------------+
| name     | travelled_distance |
+----------+--------------------+
| Elvis    | 450                |
| Lee      | 450                |
| Bob      | 317                |
| Jonathan | 312                |
| Alex     | 222                |
| Alice    | 120                |
| Donald   | 0                  |
+----------+--------------------+
```
Explanation: 
Elvis and Lee each traveled 450 miles. Elvis is the top traveler because his name comes before Lee's alphabetically.
Bob, Jonathan, Alex, and Alice each have only one ride, so they are ordered by the distance of that ride.
Donald did not have any rides, so his distance traveled is 0.
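
The ordering rule in the explanation can be cross-checked without pandas. The following is a minimal sketch in plain Python, with the example tables typed in by hand as a dict and a list of tuples (these variable names are illustrative and not part of the assignment). It only confirms the expected output for the example; it is not the required `top_travellers` implementation.

```python
# Example data copied from the Users and Rides tables above.
users = {1: "Alice", 2: "Bob", 3: "Alex", 4: "Donald",
         7: "Lee", 13: "Jonathan", 19: "Elvis"}
rides = [(1, 120), (2, 317), (3, 222), (7, 100), (13, 312),
         (19, 50), (7, 120), (19, 400), (7, 230)]  # (user_id, distance)

# Start every user at 0 so that users without rides (Donald) still appear.
totals = {user_id: 0 for user_id in users}
for user_id, distance in rides:
    totals[user_id] += distance

# Sort by total distance descending, then by name ascending to break ties.
ranking = sorted(
    ((users[user_id], total) for user_id, total in totals.items()),
    key=lambda row: (-row[1], row[0]),
)

for name, total in ranking:
    print(name, total)
```

Negating the distance in the sort key gives a descending order on distance while keeping the name comparison ascending, which is what places Elvis ahead of Lee.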

Once you have implemented your solution, run the code block containing your `top_travellers` function. Be sure that your function [returns](https://www.geeksforgeeks.org/python-return-statement/) a dataframe, and doesn't simply print.

In [35]:
import pandas as pd

# Approach: aggregate the rides per user, then merge the totals onto the users DataFrame.
# Join key: user_id in the rides df corresponds to id in the users df.

users = pd.read_csv("data/users.csv")
rides = pd.read_csv("data/rides.csv")

def top_travellers(users, rides):
    """
    Report the total distance travelled by each user.

    Sums the ride distances per user_id, left-merges the totals onto users so that users
    without rides are kept with a distance of 0, and sorts by travelled_distance
    (descending), then by name (ascending).

    Args:
        users (DataFrame): DataFrame with the id and name of each person
        rides (DataFrame): DataFrame with the user_id and distance travelled per ride

    Returns:
        DataFrame: Columns name and travelled_distance, one row per user, sorted by
        travelled_distance (descending) and then by name (ascending).
    """
    # Total distance per user; select only "distance" so the ride id column is not summed as well
    grouped_rides = rides.groupby("user_id", as_index=False)["distance"].sum()
    # Merge users with aggregated rides data
    merged = pd.merge(users, grouped_rides, how="left", left_on="id", right_on="user_id")[["name", "distance"]]
    # Rename columns
    merged.columns = ["name", "travelled_distance"]
    # Users without rides are NaN after the left merge (which makes the column float); fill with 0 and cast back to int
    merged['travelled_distance'] = merged['travelled_distance'].fillna(0).astype(int)
    # Sort the DataFrame: first by travelled_distance (descending) then by name (ascending)
    sorted_merged = merged.sort_values(by=['travelled_distance', 'name'], ascending=[False, True]).reset_index(drop=True)
        
    return sorted_merged
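
To see why the function needs both a left merge and `fillna(0)`, it can help to look at the intermediate result. The sketch below assumes the CSV files contain the same rows as the example tables and rebuilds them inline, so it runs without `data/users.csv` or `data/rides.csv`.

```python
import pandas as pd

# Inline copies of the example tables (assumed to match the CSV files).
users = pd.DataFrame({
    "id": [1, 2, 3, 4, 7, 13, 19],
    "name": ["Alice", "Bob", "Alex", "Donald", "Lee", "Jonathan", "Elvis"],
})
rides = pd.DataFrame({
    "id": list(range(1, 10)),
    "user_id": [1, 2, 3, 7, 13, 19, 7, 19, 7],
    "distance": [120, 317, 222, 100, 312, 50, 120, 400, 230],
})

# One row per user_id with the summed distance.
totals = rides.groupby("user_id", as_index=False)["distance"].sum()

# A left merge keeps every user, including those with no matching rides.
merged = users.merge(totals, how="left", left_on="id", right_on="user_id")

# Donald (id 4) has no match, so his distance is NaN and the column is float.
print(merged[["name", "distance"]])
print(merged["distance"].dtype)
```

With `how="inner"` Donald's row would be dropped entirely, and without the cast back to `int` the totals would stay as floats such as `450.0`.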

## Test Block

**NOTE: Before running the code block below, please be sure that the function you implemented above runs successfully (i.e. a green check-mark appears when running your code)**

Run the following block of code to test your `top_travellers` function. If every test case evaluates to `True`, your code is functioning correctly!

In [36]:
from tests import test_top

users = pd.read_csv("data/users.csv")
rides = pd.read_csv("data/rides.csv")

output = top_travellers(users, rides)
result1 = test_top.test1(output)

print("Output:")
print(output)
print("Test case 1:", result1)

Output:
       name  travelled_distance
0     Elvis                 450
1       Lee                 450
2       Bob                 317
3  Jonathan                 312
4      Alex                 222
5     Alice                 120
6    Donald                   0
Test case 1: False
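
When a test reports `False` even though the printed table looks right, the difference is often something that printing hides, such as a dtype, an index, or a column name. What `test_top.test1` actually checks is not shown in this notebook, so the following is only a hedged sketch of one way to look for such a difference using `pandas.testing.assert_frame_equal`. The helper `describe_mismatch` and the `expected` frame (typed in from the example output) are hypothetical names introduced for illustration.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def describe_mismatch(actual, expected):
    """Return None if the frames match, else pandas' description of the first difference."""
    try:
        assert_frame_equal(actual, expected)
    except AssertionError as err:
        return str(err)
    return None


# Expected result, typed in from the example output table (an assumption about the test).
expected = pd.DataFrame({
    "name": ["Elvis", "Lee", "Bob", "Jonathan", "Alex", "Alice", "Donald"],
    "travelled_distance": [450, 450, 317, 312, 222, 120, 0],
})

# A deliberately altered copy, used only to show what a report looks like.
# The dtype change is illustrative and is not a claim about why test case 1 failed.
altered = expected.astype({"travelled_distance": "float64"})

print(describe_mismatch(expected.copy(), expected))
print(describe_mismatch(altered, expected))
```

In the notebook, calling `describe_mismatch(output, expected)` would show the first difference between the function's result and the example table, if there is one.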
