# Sorting

## Introduction

### Tap tap tap... You can be guaranteed that you will be required to sort on every exam in the course.

It's common for programmers to want to sort groups of data. For example, we might want to sort a list of employees by their start date. Python provides a built-in function for doing this sort of work: `sorted()` (see the [documentation](https://docs.python.org/3/library/functions.html#sorted) for more details).

- Python has an entire tutorial dedicated to [Sorting](https://docs.python.org/3/howto/sorting.html).
  - Many of the examples, and even some of the text in this notebook, are taken directly from this tutorial.

## Simple Sorting

A simple ascending sort is very easy: just call the `sorted()` function. It returns a new sorted list:

In [None]:
my_tuple = (5, 2, 3, 1, 4)
sorted(my_tuple)

- You can pass any iterable to `sorted()`.
- Notice that `sorted()` returns a `list`.
  - If you need a different type, you'll need to convert the result to that type (for example, with `tuple()`).
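As a small sketch of the two points above, here `sorted()` is given a string, a set, and a dictionary, and the result is converted back to a tuple. The variable name `sorted_tuple` is just for illustration.

```python
# Any iterable works as input; the result is always a new list.
print(sorted("python"))            # a string yields a list of characters
print(sorted({3, 1, 2}))           # a set
print(sorted({"b": 2, "a": 1}))    # a dict yields its keys, sorted

# Convert the result if another type is needed.
my_tuple = (5, 2, 3, 1, 4)
sorted_tuple = tuple(sorted(my_tuple))
print(sorted_tuple)                # (1, 2, 3, 4, 5)
```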

If we want, we can pass `reverse=True` to return the items in descending order:

In [None]:
my_tuple = (5, 2, 3, 1, 4)
sorted(my_tuple, reverse=True)

## Key Functions

By default, Python will simply use the `<` operator to compare values. So, when determining order of integers, `sorted()` will evaluate the expression `a < b` for various values and use the results to order the items.
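The same `<` comparison applies to other types too. The sketch below uses made-up sample values to show how strings and tuples are ordered by default, and what happens when the items cannot be compared with each other.

```python
# Strings are compared character by character using their code points,
# so uppercase letters sort before lowercase letters.
words = ["banana", "Cherry", "apple"]
default_words = sorted(words)
print(default_words)    # ['Cherry', 'apple', 'banana']

# Tuples are compared element by element, left to right.
pairs = [(2, "b"), (1, "z"), (2, "a")]
default_pairs = sorted(pairs)
print(default_pairs)    # [(1, 'z'), (2, 'a'), (2, 'b')]

# Items that do not support < with each other cannot be sorted together.
mixed_error = None
try:
    sorted([3, "two", 1])
except TypeError as err:
    mixed_error = err
    print("TypeError:", err)
```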

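However, we can pass a `key` function that is called on each item *before* this comparison is made; `sorted()` then compares the returned values instead of the items themselves. This gives us a lot of power to order our iterables however we like.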

**For example, we can sort a list of tuples by the third element of each tuple.**

In [None]:
# Define the list of tuples
# each tuple has name, grade, credits earned
student_tuples = [
    ('john', 'A', 15),
    ('jane', 'B', 12),
    ('dave', 'B', 10),
]

# Method 1: Create the function traditionally --------------------------
def sorting_func(tup):
    return tup[2]

# Return the sorted tuples
sorted(student_tuples, key=sorting_func)

### Tap tap tap, you want to know this one very well!!

In [None]:
# Method 2: Use a lambda function --------------------------------------

# sort in ascending order
print(sorted(student_tuples, key=lambda tup: tup[2]))

# sort in descending order
print(sorted(student_tuples, key=lambda tup: -tup[2]))

In [None]:
# sort by name, in descending order

# this one throws an error. Why?
# sorted(student_tuples, key=lambda tup: -tup[0])

# this one is correct
# sorted(student_tuples, key=lambda tup: tup[0], reverse=True)

#### If you are asked to sort strings in descending order, you must use the `reverse=True` parameter.
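Here is a runnable sketch of both attempts side by side. The failing call is wrapped in `try`/`except` so the cell keeps running; the name `by_name_desc` is just for illustration.

```python
student_tuples = [
    ('john', 'A', 15),
    ('jane', 'B', 12),
    ('dave', 'B', 10),
]

# Negation is not defined for strings, so the key function itself fails.
try:
    sorted(student_tuples, key=lambda tup: -tup[0])
except TypeError as err:
    print("TypeError:", err)

# reverse=True flips the order without negating anything.
by_name_desc = sorted(student_tuples, key=lambda tup: tup[0], reverse=True)
print(by_name_desc)
```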

### What if we want to sort by a specific index, and break any ties using a different index?

### Tap tap tap...

#### For example, sort the `student_tuples` list first by grade ascending and break any ties by credits earned in descending order.

In [None]:
sorted(student_tuples, key=lambda tup: (tup[1], -tup[2]))

**As above, note that the `-` trick only works for numeric (integer and float) values.**

If we were to try a descending sort with `-` on one of the string elements, Python would raise a `TypeError`.

In [None]:
# uncomment to see the error
# sorted(student_tuples, key=lambda tup: (tup[1], -tup[0]))

In [None]:
# you have to use the "reverse" keyword, but it reverses the whole sort (grade AND name)
# sorted(student_tuples, key=lambda tup: (tup[1], tup[0]), reverse=True)

### On an exam, you will not be asked to sort by multiple elements where the element in descending order is non-numeric.
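For the curious, one way to handle that case outside of an exam is to sort in two passes. This relies on Python's sort being stable (items with equal keys keep their relative order), as described in the Sorting tutorial linked in the Introduction. The sketch below sorts by grade ascending and breaks ties by name descending. The student `amy` is hypothetical, added only so the tie-break is visible, and the variable names are illustrative.

```python
student_tuples = [
    ('john', 'A', 15),
    ('jane', 'B', 12),
    ('dave', 'B', 10),
    ('amy', 'A', 9),    # hypothetical extra row, not in the original data
]

# Pass 1: sort by the tie-breaker (name, descending).
by_name_desc = sorted(student_tuples, key=lambda tup: tup[0], reverse=True)

# Pass 2: sort by the main key (grade, ascending). Because the sort is
# stable, students with the same grade keep their order from pass 1.
by_grade_then_name = sorted(by_name_desc, key=lambda tup: tup[1])
print(by_grade_then_name)
```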

## A more complex example, using `sorted()`

### Recall this complex data structure, from the nested data notebook.

In [None]:
my_family = [
  { "family_name": "Tunnicliffe",
    "num_people": 4,
    "local": True,
    "city": "Bethpage, NY",
    "date_established": 2014,
    "names": ["Diane", "Steve", "Dylan", "Landon"],
    "number_of_children": 2,
    "children": [
      {
        "name": "Dylan",
        "age": 5,
        "favorite_color": "black",
        "nickname": "Dillybeans",
        "loves": "Super Mario",
      },
      {
        "name": "Landon",
        "age": 2,
        "favorite_color": "blue",
        "nickname": "Landybean",
        "loves": "trucks",
      }
    ]
  },
  { "family_name": "Agulnick",
    "num_people": 5,
    "local": False,
    "city": "Newton, MA",
    "date_established": 1987,
    "names": ["Ellen", "Mark", "Diane", "Joshua", "Allison"],
    "number_of_children": 3,
    "children": [
      {
        "name": "Diane",
        "age": 31,
        "favorite_color": "pink",
        "nickname": "Dini",
        "loves": "unicorns",
      },
      {
        "name": "Joshua",
        "age": 28,
        "favorite_color": "red",
        "nickname": "Joshie",
        "loves": "trains",
      },
      {
        "name": "Allison",
        "age": 26,
        "favorite_color": "purple",
        "nickname": "Alli",
        "loves": "candy",
      }
    ]
  }
]

As before, we can go to Python Tutor to visualize the data.

https://pythontutor.com/python-debugger.html#mode=edit

### Find the Oldest and Youngest child

#### This question uses sorting and a lambda function, in conjunction with other programming logic.

#### The code here represents what might be required to solve a (simple or easy) 2-point question on an exam.

### Requirement:

Return a tuple with two string elements.

The first element is the name of the oldest child.

The second element is the name of the youngest child.

In [None]:
'''
define the variables required, set to empty values
    oldest child
    youngest child
    list of children
    
loop over the families in the input list
    for each child in the family
        append the child dictionary to the list
        
sort the list by child age, from oldest to youngest

the first child in the list will be the oldest, so assign their name to the oldest child variable

the last child in the list will be the youngest, so assign their name to the youngest child variable

create and return the tuple

'''

In [None]:
def oldest_youngest(my_family):
    
    #### YOUR CODE HERE
#     oldest_child = None
#     youngest_child = None
#     children = []

#     for unit in my_family:
#         for child in unit['children']:
#             children.append(child)
            
#     print(children)

#     sorted_children = (sorted(children, key = lambda child: child['age'], reverse = True))

#     oldest_child = sorted_children[0]['name']
#     youngest_child = sorted_children[-1]['name']
    
#     return (oldest_child, youngest_child)

    pass   # placeholder

In [None]:
# raises a TypeError until the function above is written (it returns None for now)
oldest, youngest = oldest_youngest(my_family)

print(f"The oldest child is {oldest}. The youngest child is {youngest}.")
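To check your work, here is one possible sketch that follows the steps in the outline above. The function name `oldest_youngest_sketch` and the trimmed-down `sample_families` list are assumptions made for this example; the list has the same shape as `my_family` but keeps only the fields the function needs.

```python
def oldest_youngest_sketch(families):
    # Collect every child dictionary from every family into one list.
    children = []
    for family in families:
        for child in family["children"]:
            children.append(child)

    # Sort from oldest to youngest; the input list is left untouched.
    by_age_desc = sorted(children, key=lambda child: child["age"], reverse=True)

    # First entry is the oldest child, last entry is the youngest.
    return (by_age_desc[0]["name"], by_age_desc[-1]["name"])


# Hypothetical, trimmed-down data in the same shape as my_family.
sample_families = [
    {"family_name": "Tunnicliffe",
     "children": [{"name": "Dylan", "age": 5}, {"name": "Landon", "age": 2}]},
    {"family_name": "Agulnick",
     "children": [{"name": "Diane", "age": 31}, {"name": "Joshua", "age": 28}]},
]

print(oldest_youngest_sketch(sample_families))   # ('Diane', 'Landon')
```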

## Final Thoughts: Sorting in Place

Using `sorted()` **does not change the original iterable.** It simply returns a new list.

However, Python lists have a method, `list.sort()`, which **does** change the original list. This means you will be modifying your original data!

- **Do not use this unless you know *for a fact* that you will not need the original list.**

In [None]:
first_list = [2, 7, 3, 9, 10, 1]
second_list = first_list.copy()

# Are these the *same* list, or are they different?
print("First list ID:", id(first_list))
print("Second list ID:", id(second_list))

if id(first_list) == id(second_list):
    print("Wait, these are the same list!")
else:
    print("OK, modifying the first list won't impact the second list.")

In [None]:
first_list_sorted = sorted(first_list)
print("What does our sorted first_list look like?", first_list_sorted)
print("What does our original first_list look like?", first_list)

In [None]:
second_list_output = second_list.sort()
print("What does the return value of the .sort() method look like?", second_list_output)
print("What does our original second_list look like?", second_list)

### Tap tap tap, this is important!!

We can see from the output above that the `.sort()` method changes the original list and returns `None`. Keep this in mind whenever you need to sort something.

For this reason, we generally recommend using the `sorted()` function and assigning the result to a **NEW VARIABLE**.

**This is one of the scenarios that we typically see on exams: a test case fails because the student has modified an input variable.**

**The student modified the input using `sort()` when they should have created a new variable using `sorted()`.**
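A minimal sketch of that scenario is shown below. Both functions return the same answer, but only one leaves the caller's list alone. The function names `top_score_mutating` and `top_score_safe` are hypothetical, chosen for this example.

```python
def top_score_mutating(scores):
    # .sort() reorders the caller's list in place.
    scores.sort(reverse=True)
    return scores[0]


def top_score_safe(scores):
    # sorted() builds a new list and leaves the argument alone.
    ranked = sorted(scores, reverse=True)
    return ranked[0]


original = [2, 7, 3, 9, 10, 1]

safe_input = original.copy()
top_score_safe(safe_input)
print(safe_input == original)       # True: the input is unchanged

mutated_input = original.copy()
top_score_mutating(mutated_input)
print(mutated_input == original)    # False: the input was reordered
```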

## What are your questions on sorting?