# More on Lists, Tuples and Sorting
## CSE231 - Bonus Reading

Welcome back! Last time, we talked about quirky things in Python for Exam 1, but now we're gonna talk about something that's a bit more fun: data structures! Specifically, lists and tuples (we'll be talking about two more, sets and dictionaries, fairly soon). Theoretically, you could code anything without a data structure, it would just be *a lot* more work. You probably realized that on Lab05, where we had to find maxes and mins by iterating over a multitude of values and comparing each with the current max or min.

All programming languages come with data structures to make it so you don't have to pull your hair out doing this. What you were doing in Lab05 is actually similar to what's happening behind the scenes when you're running a function like `max()` or `min()`. You'll be learning a lot more about behind-the-scenes algorithms in CSE331 if you're a computer science/engineering major. 

Alright, so let's finally talk about lists, tuples and sorting -- the main subject of today's notebook. With respect to lists, we'll be talking about list comprehension and some more things you can do with `.sort()`/`sorted()`. With tuples, we'll be talking about this concept in Python called "unpacking". These are small things you can easily live without in Python, so it's not really something we cover much in-class or in the lecture videos. But, I just wanted to offer this reading as an aside for anyone that's super interested in programming. 

Another reason we don't talk about this stuff is that these things are mostly Python-only: features of the Python language that are *usually* not present in other languages. We're trying to get you guys to be *general* programmers, people that can think algorithmically and potentially program in *any* language. Python is, by far, my favorite language and has so many useful features, as you'll see. So in case you're also really into Python, here's some extra stuff you can do with the language!

________________________________

## List Comprehension

### Some Real Talk (you don't have to read through this part)

"What's the point of list comprehension?", you might ask. To which I would respond, "there is no point.. other than to clean up your code". That's right, there is *pretty much* no other reason. The thing about this though, is that clean-looking code is subjective. Some (insane) people like their code smushed together, and some (sensible) people space things out. You might prefer to initialize an empty list and append to it with a loop, which is completely okay. But the reason I'm showing you this is so that you know what it's doing when you encounter *someone else* doing it. Dr. Enbody loves having list comprehension on the exams because *he knows it's confusing*, and I believe it's really only talked about in the book -- the book that a majority of students don't buy. Evil? Possibly.

So for those of you who don't want to read like I did when I took the class, here's how list comprehension works lmao. 

So remember when I said "there is *pretty much* no other reason" to use list comprehension? There is a *very* small argument for why you should use list comprehension other than just aesthetic, and that's runtime. 

Below, I'm going to be calculating the runtime of two of the same list creations, one with list comprehension and one with `.append()`. Don't worry about the module code I'm using, my main point is having to do with the runtime output you'll see. 

In [1]:
import timeit

ti = timeit.default_timer()

L = []
for i in range(2**20):    # The .append() methodology
    L.append(i)

tf = timeit.default_timer()

print("Runtime (s):", tf - ti)

Runtime (s): 0.333204300000034


In [2]:
import timeit

ti = timeit.default_timer()

L = [i for i in range(2**20)]    # The list comprehension methodology

tf = timeit.default_timer()

print("Runtime (s):", tf - ti)

Runtime (s): 0.20493220000025758


The list comprehension method is *slightly* faster.
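A single timing like the ones above can be noisy, since whatever else your computer is doing at that moment affects it. Here's a rough sketch of a steadier comparison using `timeit.repeat()`, which runs each version several times. The function names and the smaller size `N` are my own choices for illustration, and the exact numbers will differ from machine to machine.

```python
import timeit

N = 2**16    # smaller than the 2**20 above so the repeated runs finish quickly

def build_with_append():
    L = []
    for i in range(N):    # the .append() methodology
        L.append(i)
    return L

def build_with_comprehension():
    return [i for i in range(N)]    # the list comprehension methodology

# Each trial calls the function `number` times, and we run `repeat` trials.
# Taking the minimum of the trials reduces noise from other programs.
append_time = min(timeit.repeat(build_with_append, number=10, repeat=3))
comprehension_time = min(timeit.repeat(build_with_comprehension, number=10, repeat=3))

print("append (s):       ", append_time)
print("comprehension (s):", comprehension_time)
```

Both functions build the exact same list, so any difference you see is down to how the list gets built.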

Now you might say, "Wow, that's a very insignificant amount of time difference. Why should I care?". Well, imagine you're working on a project that has to initialize a *ton* of these. Typical large-scale Python projects will have upwards of 20 files all working together, each having to initialize lists with different values, all to different lengths, etc. This time adds up, and it has potential to add up *a lot* in the end.

When you're working for a big tech company, *there are going to be hundreds of files working together.* You're going to be dealing with a *huge* codebase, a database, logs, users that don't know how to use the damn website, etc. You're probably going to want to minimize runtime as much as you can, even if in small amounts. Plus, you might impress your lead developer along the way. You're *especially* going to want less runtime if you're trying to run your program over, and over, and over, attempting to fix bugs, find bugs, catch mistakes, etc.

The synopsis is, not only can list comprehension make your code look cleaner to rational people, but it can maybe, just maybe save you some extra time in the end.

<img src="img1.png" width="600"/>
<img src="img2.png" width="600"/>

For pretty much everything in this course, you're okay to not use list comprehension. When you're working for a big tech company however, it *might* be in your best interest to use it as much as you can. It really depends on the scope of the project, how big your lists need to be, and how many times you're going to be creating such big lists.

Alright, so let's go over some examples of list comprehension and talk about how they work. We'll start off simple and work our way up. We'll also be tackling a small coding problem that we can apply list comprehension to.

### Demonstration

In [3]:
my_list = [ i for i in range(5) ]

print(my_list)

[0, 1, 2, 3, 4]


Strange, isn't it? We're so used to using the colon at the end of the for-loop, and having `i` before creating the loop just feels... off. I personally like to think of it like:

```
[ make `i` an element of this list for `i` in range(5) ]
```

Note that we express the element that is to be added to the list before the loop. You will *always* want to have it expressed in that order.

```
[ {element expression} for {element} in {loop-structure} ]
```

You'll see later on that we can expand upon this basic layout.

The `i` here, like using a for-loop traditionally, is still just a variable, meaning that we can perform any action we want on it. This is where the power of list comprehension really shines.

In [4]:
my_list = [ i**2 for i in range(5) ]

print(my_list)

[0, 1, 4, 9, 16]


Now, in my head, I think:

```
[ make `i**2` an element of this list for `i` in range(5) ]
```

Going back to the layout we had from earlier, the `i**2` is the "element expression" in this context. 

```
[ {element expression} for {element} in {loop-structure} ]
```

If you're mathematically inclined, it might be helpful to think about it like plugging a number into a function and storing each number we get as output. So in this instance, we're plugging in every integer number in the domain $[0,5)$ into a function, $f(i) = i^2$, and storing each output into our list. 

Of course we don't just have to think about it in respect to doing arithmetic, we can use list comprehension to do other programming things.

Let's say I have a list of numbers stored as strings. Obviously I can't do math with these things, so what's a good way of converting all of the values I have in `my_list` to ints?

In [5]:
my_list = ["1", "2", "3", "4", "5"]

my_list = [ int(str_num) for str_num in my_list ]

print(my_list)

[1, 2, 3, 4, 5]


Ah! This is a really quick and easy way to convert massive amounts of elements to a different type while maintaining order. In the process, we also overwrote `my_list` to be our new and improved one with ints only, something that would have taken an extra line if we were to use the `.append()` methodology.

Let's expand upon this. Let's say that our list contains string-numbers again but we somehow managed to get `None` values strewn about the list. Let's say we also don't want to count zeroes in our new list.

In [6]:
my_list = ["0", "1", "2", None, None, "3", "4", None, "5", "0"]

my_list = [ int(i) for i in my_list if i != None and i != "0" ]

print(my_list)

[1, 2, 3, 4, 5]


We can accomplish this using conditional statements within the comprehension! It's certainly strange to read, so let's break it into parts:

```
[ 

int(i)    # convert i to int

for i in my_list    # iterate through my_list
  
if i != None and i != "0"    # only append i under these conditions
  
]
```

or if you want to think about it like this:

```
[ convert i to int if i isn't None and i isn't "0", for each i in my_list ]
```

Going back to our fundamental layout from earlier, we can expand upon it to reflect this. 

```
[ {element expression} for {element} in {loop-structure} {append conditional} ]
```

I'm going to refer to conditional statements after the "loop-structure" as the "append conditional" because *this* conditional statement is what determines if the element gets appended to the list. The append conditional is evaluated, *and then* the element expression. So, equivalently, the code we have above is doing this:

In [7]:
my_list = ["0", "1", "2", None, None, "3", "4", None, "5", "0"]

empty_list = []

for i in my_list:
    
    if i != None and i != "0":    # {append conditional}
        empty_list.append( int(i) )    # {element expression}

my_list = empty_list

print(my_list)

[1, 2, 3, 4, 5]


### Example Problem

Let's do another example but in the context of a coding exercise. This one is a boolean logic classic that almost all programming classes do. We made a function in Lab04 that did this logic for us, you'll probably remember it.

__________________________________

From [Wikipedia](https://en.wikipedia.org/wiki/Leap_year): "Every year that is exactly divisible by four is a leap year, except for years that are exactly divisible by 100, but these centurial years are leap years if they are exactly divisible by 400."

**Problem**: Write a function that will return all of the leap years between a given start and end year. 

___________________________________

Since the topic is list comprehension, you can probably already imagine what I'm going to do here. Using list comprehension makes this problem look trivial.

Let's break down the leap year categorization criteria into a set of logical statements: 

-- Years that are exactly divisible by 4 are leap years, except

-- Years that are exactly divisible by 100 are not leap years, except

-- Years that are exactly divisible by 400 are leap years.

It's important to note that a year being exactly divisible by 400 is the only logical statement that *does not* have an exception, and so this might be a good starting point for our boolean operation. 

In order for a leap year to happen in all other circumstances, it must be exactly divisible by 4, _but **not**_ exactly divisible by 100. The word "but" in natural language indicates the use of a boolean conjunction, the `and`. Python evaluates `and` before `or`, so the `and` part below is grouped together without needing extra parentheses. The resulting boolean statement should then be:

```
(year % 400 == 0) or (year % 4 == 0) and not (year % 100 == 0)
```

or again, if you want to put it in terms of mathematics, we define a set such that `x` is evenly divisible by 400, or divisible by 4 and not 100.

$$ \{\, x \mid (x \bmod 400 = 0) \lor ((x \bmod 4 = 0) \land \lnot (x \bmod 100 = 0)) \,\} $$

So, all we need to do then, is create a function that takes a start and end year as arguments, let that determine the bounds, and run our logic for every year within that range.

In [8]:
# I'm going to make the range inclusive on the right bound just cause
# I think the function should. I never specified whether I wanted inclusive
# or exclusive ranges in the problem, so I just thought "eh why not" 

def get_leap_years(start_year, end_year):
    return [year for year in range(start_year, end_year + 1) if (year % 400 == 0) or (year % 4 == 0) and not (year % 100 == 0)]

ret = get_leap_years(1900, 2020)

print(ret)

[1904, 1908, 1912, 1916, 1920, 1924, 1928, 1932, 1936, 1940, 1944, 1948, 1952, 1956, 1960, 1964, 1968, 1972, 1976, 1980, 1984, 1988, 1992, 1996, 2000, 2004, 2008, 2012, 2016, 2020]


Really, the hardest part about the leap year problem is figuring out the logic. But I wanted to highlight the usage of list comprehension in the context of a problem, and so this one seemed fitting. We made what would normally be 4-5 lines into a single-line piece of beautiful, fast and lightweight code.

By the way, if you're not familiar with the mathematical syntax above, you'll be learning that in CSE260!
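If you want to sanity-check the leap year logic, one option is to compare it against `calendar.isleap()` from Python's standard library. This is just a quick check I'm adding for illustration; the function is rewritten across several lines here only so it's easier to read.

```python
import calendar

def get_leap_years(start_year, end_year):
    # Same logic as before: `and` is evaluated before `or`
    return [
        year
        for year in range(start_year, end_year + 1)
        if (year % 400 == 0) or (year % 4 == 0) and not (year % 100 == 0)
    ]

# Build the same list using the standard library's leap year test
expected = [year for year in range(1900, 2021) if calendar.isleap(year)]

print(get_leap_years(1900, 2020) == expected)    # True
```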

### Nested if and elif/else

For most purposes, you're likely going to be sticking with a single if-statement and a big boolean expression. Enbody likely won't be going this far in-depth with list comprehension, but I thought I would just include this in case you want to go deeper. 

If I wanted to check if a condition is `True`, but only if a preceding condition is `True`, then this would warrant the use of a nested conditional statement. But how can you express that within a list comprehension?

In [9]:
my_list = [x for x in range(100) if (x % 2 == 0) if (x % 5 == 0)]    # Like this!

print(my_list)

[0, 10, 20, 30, 40, 50, 60, 70, 80, 90]


The conditional within the comprehension is equivalent to saying:

```
if x % 2 == 0:
    if x % 5 == 0:
        (...)
```
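Since the second `if` only gets checked when the first one passes, chaining two `if`s like this gives the same result as joining the conditions with `and`. A quick sketch to convince yourself (the variable names `nested` and `combined` are just for this example):

```python
nested = [x for x in range(100) if (x % 2 == 0) if (x % 5 == 0)]
combined = [x for x in range(100) if (x % 2 == 0) and (x % 5 == 0)]

print(nested == combined)    # True
```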

Additionally, we can even add elifs and elses to the mix. But let's get a bit fancier, remember our layout from before? 

```
[ {element expression} for {element} in {loop-structure} {append conditional} ]
```

I called that first portion of the template "element expression" because it's exactly that -- an expression. Meaning you can do more than just apply functions and do math with the element, you can make it conditional as well.  

In [10]:
my_list = ["Even" if (i % 2 == 0) else "Odd" for i in range(10)]

print(my_list)

['Even', 'Odd', 'Even', 'Odd', 'Even', 'Odd', 'Even', 'Odd', 'Even', 'Odd']


So above, "Even" is appended to the list `if i % 2 == 0`, else "Odd" is appended, where `i` ranges from 0 to 9. The `.append()` equivalent would be something akin to:

```
my_list = []

for i in range(10):
    if (i % 2 == 0):
        my_list.append("Even")
    else:
        my_list.append("Odd")
```

We made the "element expression" an "element conditional expression". You can of course complicate things further by adding an append conditional if you wanted.
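Here's a small sketch of what combining the two might look like. The condition `i % 3 == 0` is an arbitrary one I picked for the example: the append conditional at the end decides *whether* something gets appended, and the conditional expression at the front decides *what* gets appended.

```python
my_list = ["Even" if (i % 2 == 0) else "Odd" for i in range(10) if i % 3 == 0]

print(my_list)    # ['Even', 'Odd', 'Even', 'Odd']  (for i = 0, 3, 6, 9)

# The .append() equivalent
loop_list = []

for i in range(10):
    if i % 3 == 0:    # {append conditional}
        if i % 2 == 0:    # {element conditional expression}
            loop_list.append("Even")
        else:
            loop_list.append("Odd")

print(loop_list == my_list)    # True
```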

For right now, that's about as far as I'm going to discuss list comprehension. There is one more thing you can do in list comprehension, which is nested loops. Dr. Enbody likely won't have anything on the exams about nested loops in list comprehensions due to its complexity, so I wouldn't worry. If you're interested in learning more about list comprehension, there are tons of online resources you can find on it.

## Tuple Unpacking

We've been working with functions a lot now. So, you're probably aware that we can return 2, 3, maybe even 5 values at once from a function like this:

```
return var1, var2, var3
```

Most programming languages don't have this capability. Why can Python do this but other programming languages can't? Well, Python has a feature called *unpacking*, and it has to do with tuples and the way that the `,` operator works in Python.

So let's start simple with some syntax you might not have seen being used before:

In [11]:
my_pack = 1, 2, 3, 4, 5

print(my_pack)

(1, 2, 3, 4, 5)


Immediately, you'll probably notice that the values we listed take the form of a tuple. Even without the parentheses, Python assumes you're creating a tuple when you list out a bunch of comma-separated values on the right-hand side of an assignment. It's similar to declaring a `list`, but if you want to explicitly tell Python you want a list, you have to wrap the values in square brackets.

So, going back to our multi-value function return, 

```
return var1, var2, var3
```

you'll probably notice that this is doing the exact same thing! Let's create an example:

In [12]:
# I'm going to make a function that simulates vector addition.
# We'll make it take three component values as arguments. 

# With the way vector addition works, we want to add 1 to each 
# component-wise, making it an ideal function to demonstrate multi-value returns.

def vector_increment( i, j, k ):
    return i + 1, j + 1, k + 1

return_values = vector_increment( 1, 2, 3 )

# since we have one variable on the left-hand side (LHS) of the function call,
# all of the return values get stored into a tuple as you'll see
print(return_values)

(2, 3, 4)


You'll probably also remember that we've traditionally had a corresponding number of variables on the left-hand side of a function call to store multi-value function returns. What you're doing when you write an expression like that, is you're *unpacking* the tuple, the feature I introduced at the beginning. To demonstrate, we'll continue using the `vector_increment()` function.

In [13]:
new_i, new_j, new_k = vector_increment( 1, 2, 3 )

print("i: {}, j: {}, k: {}".format(new_i, new_j, new_k))

i: 2, j: 3, k: 4


And so now you'll probably come to the conclusion that, in the same manner, this means we can write expressions like this:

In [14]:
a, b, c, d = (1, 2, 3, 4)

print("a: {}, b: {}, c: {}, d: {}".format(a, b, c, d))


# ...and like this! Remember from earlier when we talked about Python not needing parentheses for tuple declaration?
e, f, g, h = 5, 6, 7, 8

print("e: {}, f: {}, g: {}, h: {}".format(e, f, g, h))

a: 1, b: 2, c: 3, d: 4
e: 5, f: 6, g: 7, h: 8


Hopefully you can see the connections, here. This also means that, equivalently, a function return could be wrapped in parentheses to even further drive home the idea that multi-value returns are secretly tuples.

```
return (var1, var2, var3)
```

Unnecessary, but it's important to know this if you're ever debugging function returns (It also might show up on Enbody's exams). So, quickly going back to what I said at the beginning, "Python can return multiple variables at once where most programming languages can't", is actually false. What's actually happening is you're returning one thing -- a tuple of those values. Python simply just has a feature where you can split those values into 3 separate "returns". 
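If you'd like to see this for yourself, here's a quick sketch that checks what a multi-value return actually is, reusing the `vector_increment()` function from earlier:

```python
def vector_increment(i, j, k):
    return i + 1, j + 1, k + 1

result = vector_increment(1, 2, 3)

print(type(result))                 # <class 'tuple'>
print(isinstance(result, tuple))    # True
print(len(result))                  # 3
```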

Here's an interesting question then, what would happen if you had two variables on the LHS of a 3-value function return?

In [15]:
var1, var2 = vector_increment(1, 2, 3)

ValueError: too many values to unpack (expected 2)

It might come as no surprise that you get an error. It says here that there are "too many values to unpack (expected 2)", which might sound a bit confusing. The "(expected 2)" refers to the two variables, `var1` and `var2`: Python is attempting to extract two values from the function call but is getting three instead.

As you can imagine, the logical conclusion of all of this is that you can either unpack *all* of the values into separate variables for a multi-value return, *or* you can store all of the values into *a single* variable that will create a tuple of all of the multi-value returns.

Is there a standard to this? Any sort of industry expectation? Surprisingly there isn't, at least not a strong one. When you create a tuple of multi-value returns, you would then obviously need to call the values you want by index, so like:

```
return_tuple = function(x)

return1 = return_tuple[0]
return2 = return_tuple[1]
(...)
```

If you were to call strictly by index, then there can definitely be some ambiguity. I would say that it's better to unpack and give all of the returns a name, but I think it can be pretty lenient depending on what your function does and what kind of associations you've made with it.
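To compare the two styles side by side, here's a small sketch. The `min_max()` function and the `scores` list are made up for this example; the point is only how the return values get accessed.

```python
def min_max(values):    # hypothetical helper, just for illustration
    return min(values), max(values)

scores = [71, 88, 94, 65]

# Option 1: keep the tuple and call values by index (indices start at 0)
result = min_max(scores)
print(result[0], result[1])    # 65 94

# Option 2: unpack and give each return value a name
lowest, highest = min_max(scores)
print(lowest, highest)    # 65 94
```

With the second option, you don't have to remember which index holds the min and which holds the max.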

## Sort Functions

One of the biggest and most contested topics in computer science is sorting algorithms. We, fortunately, will not be creating one ourselves (You'll have to do that in CSE331!). Instead, we'll be talking about the sort functions already given to us in Python and what we can do with them, which does require that we talk a little bit about what's going on behind the scenes. 

Sorting is immensely useful to us humans. It's a way of quickly determining what's at the top and what's at the bottom. Computers on the other hand, don't give a shit. They can easily scan through a huge array of values and determine what's at the top and bottom using comparisons and search methods. Sorting the array first, however, might save some runtime in future calls. The computation necessary for determining the ranking for *every* item in an array is hugely complicated, and there are *hundreds* of algorithms that have been developed over the years for it. [Here's a fun little video](https://www.youtube.com/watch?v=kPRA0W1kECg) that visualizes a few of them.

The sorting algorithm used when you call `.sort()` or `sorted()` in Python is called [Timsort](https://en.wikipedia.org/wiki/Timsort) (not in the video). You don't need to know anything about it other than that it's a general-purpose sorting algorithm that can sort by almost anything you give it, as long as Python knows how to compare the values.

Both `.sort()` and `sorted()` have a parameter called `key`, enabling you to specify *how* you want your array sorted. This is a hugely powerful tool when you don't simply want something categorized by alphanumeric order, or if you want to sort by something unique within the elements inside your array. If you had a list of lists, for example, you could sort by the length of each inner-list. If you had a list of strings, you could sort by the alphabetical order of the last character in each string. There are an infinite number of things you can do with `key`, and so it's super helpful to know how it works. 

We're going to start off simple again, and work our way up. We're mainly going to focus on lists that contain other container-types, because this is usually where `key` comes in handy. 

In [16]:
my_matrix = [
    [1, 2, 3, 4],
    [5, 1, 6, 8],
    [6 ,8, 1, 4],
    [0, 1, 4, 3],
    [9, 5, 4, 7]
]

print(my_matrix)

[[1, 2, 3, 4], [5, 1, 6, 8], [6, 8, 1, 4], [0, 1, 4, 3], [9, 5, 4, 7]]


So here, we have a list of lists. Each simply contains 4 integers, and we notice that it's out of order. When we call `.sort()` normally, it compares the inner-lists element by element, starting with the first element of each (later elements only matter when the earlier ones tie).

In [17]:
my_matrix.sort()

print(my_matrix)

[[0, 1, 4, 3], [1, 2, 3, 4], [5, 1, 6, 8], [6, 8, 1, 4], [9, 5, 4, 7]]


You'll see that the first element of each inner-list is in numeric order, 0, 1, 5, 6, 9. But what if instead, we wanted to sort the matrix in numeric order by the last element in each inner-list? 

This is where the `key` comes in. We need some way to tell Python that, for each inner-list in our matrix, determine the ranking order given by the last element in each inner-list. Well, the way we do this is by giving `key` *a function* that tells `.sort()` which element in the inner-list to look at when determining the order.  

Stepping back a bit, the list of lists we created is called `my_matrix`. Importantly, each element in `my_matrix` is another, smaller list. Then, in each of these smaller lists, we have integer values. If we want to tell Python to sort by the last element in each smaller list, then that would be in terms of `{smaller list}[-1]`. We take the -1st index of whatever smaller list we're currently looking at during the sorting. Without a specified `key`, Python seemingly defaults to `{smaller list}[0]`, right? Since it's ordering the smaller lists by whatever the first element in each one is.
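One small aside on that "seemingly": by default, Python compares lists element by element, so the first elements decide the order unless they tie, in which case the next elements are compared. A short sketch of that (the list `ties` is made up for illustration):

```python
print([0, 9, 9] < [1, 0, 0])    # True, decided by the first elements (0 < 1)
print([1, 2, 3] < [1, 5, 0])    # True, the first elements tie so the second ones decide (2 < 5)

ties = [[1, 9], [1, 2], [0, 5]]
ties.sort()

print(ties)    # [[0, 5], [1, 2], [1, 9]]
```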

So, let's create *a function* that takes in some `list` element, and spits back the -1st index of the list. 

In [18]:
def sort_by_last( li ):    # `li` will be a type(li) == list, the function can be named whatever
    return li[-1]

Now that we have a function that extracts the last element of some given `list` element, we can give this to the `key` parameter of `.sort()` and it'll work out everything else for us!

In [19]:
my_matrix.sort(key=sort_by_last)    # `key` takes the name of a function

print(my_matrix)

[[0, 1, 4, 3], [1, 2, 3, 4], [6, 8, 1, 4], [9, 5, 4, 7], [5, 1, 6, 8]]


As you can see, the first element of each list is now out of order, but the *last element* of each list _**is**_ in order! Pretty cool stuff! 

The syntax for this is pretty strange as you might have noticed: we just passed a function as an argument to another function? How does that work? Notice that there are no parentheses after `sort_by_last`, so we're not calling it ourselves. We're handing the function itself to `.sort()`, which then calls it on each element for us. In Python, functions are objects that can be passed around like any other value. It's somewhat advanced Python that I'll talk about more near the end of the course, since we haven't learned everything there is to know about functions yet.

For right now, just know that `key` takes the name of a function whose behaviour is catered for the sorting of an array. More on this in a bit.
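If the idea of handing a function's name to another function still feels odd, here's a rough sketch of what's going on. The `apply_to_each()` helper is something I made up for illustration; it is not how `.sort()` is actually written, it just shows a function receiving another function and calling it.

```python
def sort_by_last(li):
    return li[-1]

print(sort_by_last([1, 2, 3]))    # 3, parentheses *call* the function

key_function = sort_by_last       # no parentheses: this just gives the function another name
print(key_function([4, 5, 6]))    # 6

def apply_to_each(func, items):   # hypothetical helper that takes a function as an argument
    return [func(item) for item in items]

print(apply_to_each(sort_by_last, [[1, 2], [3, 4]]))    # [2, 4]
```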

Let's do some more examples! What if we had a list of strings, and we wanted to sort not by alphabetical order, but by the length of the string? Well you could do that in a similar fashion! We're going to create a function that takes some variable of type `str`, and have it return the length of that string.

In [20]:
my_matrix = [    # Most common US surnames, just for fun
    "Smith",
    "Johnson",
    "Williams",
    "Brown",
    "Jones",
    "Garcia",
    "Miller",
    "Davis",
    "Rodriguez",
    "Martinez"
]

def sort_by_len( string ):
    return len(string)

my_matrix.sort( key=sort_by_len )    # Longest surnames will be at the back

print(my_matrix)

['Smith', 'Brown', 'Jones', 'Davis', 'Garcia', 'Miller', 'Johnson', 'Williams', 'Martinez', 'Rodriguez']


Notice that we're kind of creating a middle-man here. We created a function that uses another function to determine the length of a string. But the `len()` function works on strings in the first place, right? So what we could say instead, is...

In [21]:
my_matrix.sort( key=len )

print(my_matrix)

['Smith', 'Brown', 'Jones', 'Davis', 'Garcia', 'Miller', 'Johnson', 'Williams', 'Martinez', 'Rodriguez']


Ta-da! The point I wanted to drive home here is that the function you supply to `key` can be any function, as long as it works with whatever type of element you have inside your list, and returns a value that the sort function can use to determine the ranking of each item. This of course means that you can't give `key` a function like `print()`, since `print()` always returns `None`, which isn't a value that would be useful to organize data.
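Earlier I mentioned sorting strings by their last character and sorting a list of lists by the length of each inner-list. Here's a short sketch of both; the `last_char()` function and the sample lists are made up for this example.

```python
surnames = ["Smith", "Johnson", "Williams", "Brown", "Jones"]

def last_char(string):
    return string[-1]

# Sorted by the last letter of each name: h, n, n, s, s
print(sorted(surnames, key=last_char))
# ['Smith', 'Johnson', 'Brown', 'Williams', 'Jones']

ragged = [[1, 2, 3], [4], [5, 6]]

# Sorted by how many elements each inner-list has
print(sorted(ragged, key=len))
# [[4], [5, 6], [1, 2, 3]]
```

Names that tie (like "Johnson" and "Brown", which both end in "n") stay in the order they originally appeared in.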

### `lambda` Functions

If you've spent any time looking up how to sort an array by a certain item in Python, you've likely come across people using `.sort()` and `sorted()` with the `key` parameter using some complicated mess that consisted of the `lambda` keyword. They usually look like this:

```
sorted(my_array, key=lambda x : x**2)
```

This is called a "lambda function", or sometimes referred to as an "anonymous function". 

You probably noticed that the functions we've been using in our custom sorts from before were two lines. The function declaration, and then a return statement with some expression. In a lot of ways, having these small two-line functions strewn about your code sucks. They're really only used for sorting, and so they take up an unnecessary amount of space in your code. This is where lambda functions come in handy, because they allow us to create usable functions in just a single line and are able to be put inside of other expressions if needed. They're sometimes referred to as "anonymous functions" because, most of the time, you won't be giving them a name. They're simply for taking input and returning output quickly.

Let's take the example we had from the beginning and see if we can transform it into a lambda function. Lambda functions will always have this basic layout:

```
lambda {arguments} : {return expression}
```

So here's what we had from earlier:

In [22]:
my_matrix = [
    [1, 2, 3, 4],
    [5, 1, 6, 8],
    [6 ,8, 1, 4],
    [0, 1, 4, 3],
    [9, 5, 4, 7]
]

def sort_by_last( li ):    # `li` will be a type(li) == list, the function can be named whatever
    return li[-1]

my_matrix.sort( key=sort_by_last )

print(my_matrix)

[[0, 1, 4, 3], [1, 2, 3, 4], [6, 8, 1, 4], [9, 5, 4, 7], [5, 1, 6, 8]]


But now, what would this look like as a lambda function?

In [23]:
my_matrix = [
    [1, 2, 3, 4],
    [5, 1, 6, 8],
    [6 ,8, 1, 4],
    [0, 1, 4, 3],
    [9, 5, 4, 7]
]

my_matrix.sort( key=lambda li: li[-1] )

print(my_matrix)

[[0, 1, 4, 3], [1, 2, 3, 4], [6, 8, 1, 4], [9, 5, 4, 7], [5, 1, 6, 8]]


Look how clean that is! No function declaration needed. Of course, if you needed your sort function in many places, it might be a good idea to actually give it an official declaration and return, but for most purposes, a lambda function works really well in the context of `.sort()` and `sorted()`. 

So what's going on up above? Well let's take both variations we have of the function and compare them.

Lambda function:
```
lambda li : li[-1]
```

Declarative function:
```
def sort_by_last( li ):
    return li[-1]
```

Hopefully it's fairly obvious, but in the lambda function, we have `li` as a parameter, followed by a `:`, and then whatever return expression we need from the parameter. That `li[-1]` will be the expression evaluated at return time for the lambda function. This, of course, is mirrored in the declarative function version we made. We're taking in a parameter, calling it `li`, and returning `li[-1]` just as we're doing in the lambda function. The only difference is that lambda functions don't have names, whereas declarative functions are given one (hence the name, "declarative"). We use the `lambda` keyword before the rest of our function expression to signal to Python that we're creating a lambda function, where we use `def` to signal the use of a declarative function.

Something important that I should address is that lambda functions can only evaluate *one* expression, but can still return multiple objects (as a tuple, like we've talked about). So you can't have what would be two separate lines in a declarative function simulated in a lambda function. As an example, here's a declarative function that necessitates two subsequent lines:

```
def function(x):
    print(x)
    return x*2
```

There is no clean way to emulate this behaviour in a lambda function. You would have to choose between returning `print(x)` or `x*2`.

```
lambda x : print(x)    # Good
```

```
lambda x : x*2    # Good
```

```
lambda x : print(x) x*2    # SyntaxError: invalid syntax, there is no way to run two different lines
```
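The "one expression, but it can be a tuple" part is handy for sorting. As a sketch (the list and variable names here are made up), a lambda that returns a tuple lets you sort by one thing and break ties with another, since tuples are compared element by element:

```python
surnames = ["Smith", "Johnson", "Brown", "Garcia", "Davis", "Miller"]

# Sort by length first, then alphabetically among names of the same length
by_len_then_alpha = sorted(surnames, key=lambda s: (len(s), s))

print(by_len_then_alpha)
# ['Brown', 'Davis', 'Smith', 'Garcia', 'Miller', 'Johnson']
```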

Is there any way to give a lambda function a name? You might expect the answer to be "no" since they're sometimes referred to as "anonymous", but *you can* actually give them names and have them persist in your code space like variables and declarative functions.

In [24]:
double = lambda x : x * 2    # Multiplies an input number by 2

result = double(5)    # called like a declarative function

print(result)

10


### Example Problem

Let's do another example just to show the power that sort functions have. 

_________________________

From [Wikipedia](https://en.wikipedia.org/wiki/Gross_national_income): The gross national income (GNI), previously known as gross national product (GNP), is the total domestic and foreign output claimed by residents of a country, consisting of gross domestic product (GDP), plus factor incomes earned by foreign residents, minus income earned in the domestic economy by nonresidents.

**Problem:** Let's say you have a list of countries and data on their GNI per capita at purchasing power parity (PPP) and population. Write a function that can sort the data by country name, GNI PPP, and population size.

Data set is given below.

_________________________

In [25]:
# Data is laid out in order:
# ( {country name}, {GNI PPP}, {population} )

# GNI PPP stats from World Bank, last updated in 2018
# Population data from United Nations Department of Economic and Social Affairs, last updated in 2019

dataset = [
    ("Qatar", 124410, 2832067),
    ("Singapore", 94670, 5804337),
    ("Kuwait", 84250, 4207083),
    ("Brunei", 82180, 433285),
    ("United Arab Emirates", 75440, 9770529),
    ("Luxembourg", 72200, 615729),
    ("Switzerland", 68820, 8591365),
    ("Norway", 68310, 5378857),
    ("Ireland", 67050, 339031),
    ("United States", 63690, 329064917),
    ("Netherlands", 56890, 17097130),
    ("Denmark", 56410, 5771876),
    ("Saudi Arabia", 55840, 34268528),
    ("Austria", 55300, 8955102),
    ("Iceland", 55190, 339031)
]

Now, traditionally, this would be a hugely daunting task. If we didn't know that we could sort by a certain item within each tuple, the naive approach might be to iterate through the list and make comparisons throughout whilst appending to another list to set the order. This would quickly get needlessly complicated, so let's instead apply what we just learned.

From our dataset, we know that the country name is the 0th index, the GNI PPP is the 1st, and the population is the 2nd in each inner-tuple. All we need to do then, is give our function some encoded meaning. To explain, let's have a parameter called `sortby`, and have it take a string. If we receive `"country"`, we sort by the 0th index, else if we receive `"gnippp"`, we sort by the 1st, and so on. 

I'm going to use `sorted()` here, so as to not edit `dataset` directly. `sorted()` returns a new, sorted list and leaves the original untouched, whereas `.sort()` sorts the original list in place (and returns `None`).
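If that difference feels abstract, here's a quick sketch with a throwaway list (the names `numbers`, `new_list` and `result` are just for illustration):

```python
numbers = [3, 1, 2]

new_list = sorted(numbers)    # builds a new sorted list
print(numbers)     # [3, 1, 2]  (original is unchanged)
print(new_list)    # [1, 2, 3]

result = numbers.sort()       # sorts the original list in place
print(numbers)     # [1, 2, 3]
print(result)      # None  (.sort() doesn't return the list)
```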

In [26]:
def sort_data(dataset, sortby): 
    sortby = sortby.lower()    # just so we can ignore capital letters
    
    if sortby == "country":
        sorting_index = 0
    elif sortby == "gnippp":
        sorting_index = 1
    elif sortby == "population":
        sorting_index = 2
    else:
        sorting_index = 0    # We'll just create an else-case that defaults to sort by country
    
    # each value within the dataset is a tuple, and so the key within
    # each comparison will be the `sorting_index` of that tuple.
    # 0 = country, 1 = gni ppp, 2 = population
    out_dataset = sorted(dataset, key=lambda tup: tup[sorting_index])
    
    return out_dataset


sorted_data = sort_data( dataset, "country" )    # Sorting alphabetically by country name

sorted_data

[('Austria', 55300, 8955102),
 ('Brunei', 82180, 433285),
 ('Denmark', 56410, 5771876),
 ('Iceland', 55190, 339031),
 ('Ireland', 67050, 339031),
 ('Kuwait', 84250, 4207083),
 ('Luxembourg', 72200, 615729),
 ('Netherlands', 56890, 17097130),
 ('Norway', 68310, 5378857),
 ('Qatar', 124410, 2832067),
 ('Saudi Arabia', 55840, 34268528),
 ('Singapore', 94670, 5804337),
 ('Switzerland', 68820, 8591365),
 ('United Arab Emirates', 75440, 9770529),
 ('United States', 63690, 329064917)]

In [27]:
sorted_data = sort_data( dataset, "gnippp" )    # Sorting by GNI PPP (highest at the bottom)

# You might notice that it's in the reverse order of the original
# declaration lmao. I copied straight from an already-organized 
# GNI PPP list, the population size was something I decided to add after

sorted_data

[('Iceland', 55190, 339031),
 ('Austria', 55300, 8955102),
 ('Saudi Arabia', 55840, 34268528),
 ('Denmark', 56410, 5771876),
 ('Netherlands', 56890, 17097130),
 ('United States', 63690, 329064917),
 ('Ireland', 67050, 339031),
 ('Norway', 68310, 5378857),
 ('Switzerland', 68820, 8591365),
 ('Luxembourg', 72200, 615729),
 ('United Arab Emirates', 75440, 9770529),
 ('Brunei', 82180, 433285),
 ('Kuwait', 84250, 4207083),
 ('Singapore', 94670, 5804337),
 ('Qatar', 124410, 2832067)]

In [28]:
sorted_data = sort_data( dataset, "population" )    # Sorting by population (highest at the bottom)

sorted_data

[('Ireland', 67050, 339031),
 ('Iceland', 55190, 339031),
 ('Brunei', 82180, 433285),
 ('Luxembourg', 72200, 615729),
 ('Qatar', 124410, 2832067),
 ('Kuwait', 84250, 4207083),
 ('Norway', 68310, 5378857),
 ('Denmark', 56410, 5771876),
 ('Singapore', 94670, 5804337),
 ('Switzerland', 68820, 8591365),
 ('Austria', 55300, 8955102),
 ('United Arab Emirates', 75440, 9770529),
 ('Netherlands', 56890, 17097130),
 ('Saudi Arabia', 55840, 34268528),
 ('United States', 63690, 329064917)]
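
You might have spotted that Ireland and Iceland share the same population value in this dataset. Python's sort is stable, which means items that tie on the key keep the order they had in the original list, and that's why Ireland still comes first. As a small sketch (the `rows` list and the variable names are just for illustration), if you wanted ties broken alphabetically, the lambda can return a tuple so the second item gets compared whenever the first is equal:

```python
# Three rows copied from the dataset above: (country, GNI PPP, population)
rows = [
    ("Ireland", 67050, 339031),
    ("Iceland", 55190, 339031),
    ("Brunei", 82180, 433285),
]

# Sorting on population alone: the tie keeps the original order
by_pop = sorted(rows, key=lambda tup: tup[2])
print([tup[0] for tup in by_pop])    # ['Ireland', 'Iceland', 'Brunei']

# Sorting on (population, name): ties are broken by country name
by_pop_then_name = sorted(rows, key=lambda tup: (tup[2], tup[0]))
print([tup[0] for tup in by_pop_then_name])    # ['Iceland', 'Ireland', 'Brunei']
```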

It works! You might think that the function is a bit hard-code-y, but in the real world, you'll pretty often have to create functions catered to certain datasets because the way people create datasets varies *drastically*. Trust me, it's a pain in the ass. There are a ton of Python modules that can do this kind of stuff for you easily, but if you wanted to use raw Python, this would be the way. 

Knowing how to use sort functions is _**huge**_. Dr. Enbody will often have data to sift through for many projects, and they usually have you organize data or find the top/bottom values. If you ever need a top/bottom 5, sort the list and take a slice with `[0:5]` or `[-5:]`. It's super trivial stuff if you know the sorting techniques.
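
To make that concrete, here's a minimal sketch using a handful of rows copied from the dataset above. The `sample` name and the choice of 3 instead of 5 are my own, just to keep it short. `reverse=True` is an optional argument to `sorted()` that flips the order, in case you want the highest values first.

```python
# A few rows from the dataset: (country, GNI PPP, population)
sample = [
    ("Qatar", 124410, 2832067),
    ("Singapore", 94670, 5804337),
    ("Kuwait", 84250, 4207083),
    ("Brunei", 82180, 433285),
    ("Norway", 68310, 5378857),
    ("Denmark", 56410, 5771876),
]

# Ascending by population (lowest first)
by_population = sorted(sample, key=lambda tup: tup[2])

bottom_3 = by_population[0:3]    # three smallest populations
top_3 = by_population[-3:]       # three largest populations

# Same top 3, but with the largest first
top_3_desc = sorted(sample, key=lambda tup: tup[2], reverse=True)[0:3]

print([tup[0] for tup in bottom_3])      # ['Brunei', 'Qatar', 'Kuwait']
print([tup[0] for tup in top_3])         # ['Norway', 'Denmark', 'Singapore']
print([tup[0] for tup in top_3_desc])    # ['Singapore', 'Denmark', 'Norway']
```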

_______________

## Some Wrap-Up Real Talk

If you've made it this far, you're probably in computer science or computer engineering. Let me just say a few things to y'all.

**Don't let this class ruin your enthusiasm for programming if you were super interested before but are no longer interested due to the course projects and homework.**

You've probably encountered tons of people saying they didn't want to do computer science after taking this course. My best friend from high school was one of them.

But the reason this class gives you so much homework is so that you can get *the practice*. That's really how you get good at anything. By the time this class is over, you'll hopefully be able to learn new libraries, languages and tools on your own. We cannot teach you all of them because *more are being created every day*. I would highly, highly recommend finding what you think is enjoyable and applying your programming knowledge to it. Python is great because if you want to develop a program that does *anything*, there is probably a library for it. If you want to make an AI, model a physics problem, hack your best friend, predict when the global warming apocalypse will happen, *anything*, you can almost always find a library catered for the task. Learn the libraries you're interested in, and you'll find the true reason why you wanted to know how to program in the first place.

I would imagine that if you're in computer science/engineering right now, you've probably done a little coding in high school. Perhaps you grew up around computers all your life, maybe you got into making Minecraft mods, maybe you just thought it was cool. Whatever reason you have for being in this class, just know that it gets easier from here. Learning *how* to program is the biggest hurdle of this major; the rest comes naturally. To be in this major, however, *you have to always be willing to learn*. This field changes _fast_. Python only became the most popular programming language (over Java) within the last 15 years. The people that aren't willing to learn the new tools become dinosaurs, and they're no longer useful to the market. You have to keep up with this stuff, you have to be ready, and you have to find the fun in programming for yourself.

If you're not finding the fun in programming, I would say you're playing a risky game being in computer science/engineering. I would tread lightly, but as I've said, look towards what interests you *about* programming and hopefully that can bring you the motivation to continue with the major. Don't come out of college doing something you never enjoyed.