# Tuples

We won't spend a lot of time on tuples, because they are _very_ similar to lists. 
- they can hold anything
- they are arbitrary length
- they are indexed (0, 1, 2, ..., length_of_tuple - 1)

The biggest difference is that tuples are immutable: once a tuple is created, its elements cannot be reassigned.

In [None]:
from pprint import pprint  # enables "pretty printing"

In [None]:
list_of_people = ['Albert Einstein', 'Marie Curie', 'Ada Lovelace']
tuple_of_people = ('Albert Einstein', 'Marie Curie', 'Ada Lovelace')

In [None]:
list_of_people

In [None]:
tuple_of_people

In [None]:
list_of_people[0] = 'Emmy Noether'
list_of_people

In [None]:
# do you think this errors out?
tuple_of_people[0]

In [None]:
# do you think this errors out?
tuple_of_people[0] + ' was a physicist'

In [None]:
# do you think this errors out?
tuple_of_people[0] = 'Emmy Noether'

Mostly I bring up tuples because we will see them in various places from built-in functions.
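
As a small sketch of where tuples show up, here are two built-ins that hand them back: `divmod` and `dict.items()`. The `capitals` dictionary is a made-up example, not data used elsewhere in this notebook. The last part repeats the immutability check from above inside a `try`/`except`, so the cell runs to the end.

```python
# divmod returns a (quotient, remainder) tuple
result = divmod(17, 5)
print(result)  # (3, 2)

# Tuples can be unpacked into separate variables
quotient, remainder = result

# dict.items() yields (key, value) tuples
capitals = {'WA': 'Olympia', 'OR': 'Salem'}  # made-up example dictionary
pairs = list(capitals.items())
print(pairs)  # [('WA', 'Olympia'), ('OR', 'Salem')]

# Tuples are immutable: item assignment raises a TypeError
try:
    result[0] = 99
    assignment_worked = True
except TypeError:
    assignment_worked = False
print(assignment_worked)  # False
```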

## Complex sorting

Suppose we collect a lot of information about the different states. We decide that we are going to store:
> abbreviation, population, area (sq miles), capital name, state name

for each state.

In [None]:
state_info = [
    ['CA', 39_560_000, 163_694, 'Sacramento', 'California'],
    ['WA', 7_536_000, 71_297, 'Olympia', 'Washington'],
    ['TX', 28_995_881, 268_596, 'Austin', 'Texas'],
    ['OR', 4_190_713, 98_378, 'Salem', 'Oregon'],
    ['WY', 578_759, 97_914, 'Cheyenne', 'Wyoming'],
    ['IL', 12_671_821, 57_914, 'Springfield', 'Illinois'],
    ['HI', 1_420_491, 10_931, 'Honolulu', 'Hawaii'],
    ['AK', 710_249, 663_268, 'Juneau', 'Alaska'],
    ['NY', 19_453_561, 54_555, 'Albany', 'New York'],
]

In [None]:
state_info

Let's say we wanted to sort the _values_ by population. We could use our normal trick:
1. We know lists are sorted by the first element first
2. Make a new list `[population, all_the_data]` for each element
3. Sort that
4. Then extract `all_the_data`

Let's see this in action:

In [None]:
# This is step 2
list_to_sort = [ [data[1], data] for data in state_info]

# This isn't strictly necessary, it is just here to help us visualize
# what is happening.
for element in list_to_sort:
    print(element)

In [None]:
# Now do step 3 and use the default sorting
sorted_list = sorted(list_to_sort)

for element in sorted_list:
    print(element)
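
The default sorting used here relies on how Python compares lists: element by element, starting at index 0, with later elements only consulted on a tie. Here is a minimal sketch of that rule. The `small_example` list is hypothetical toy data.

```python
# The first elements differ, so they decide the comparison
first_decides = [578_759, 'WY'] < [710_249, 'AK']
print(first_decides)  # True, because 578_759 < 710_249

# The second element only matters when the first elements tie
tie_broken_by_second = [5, 'b'] < [5, 'a']
print(tie_broken_by_second)  # False, because 'b' > 'a'

# sorted uses exactly these comparisons
small_example = [[3, 'c'], [1, 'z'], [2, 'a']]  # hypothetical toy data
sorted_example = sorted(small_example)
print(sorted_example)  # [[1, 'z'], [2, 'a'], [3, 'c']]
```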

Okay, but we still have the population repeated twice (once in the data structure, and once at the beginning, where we are just using it to sort). Now that the elements are in order, let's extract just the data.

In [None]:
sorted_data = [element[1] for element in sorted_list]

# Note that pretty printing is a way of skipping the 
# loop. We could also print the same way we printed in the last step
pprint(sorted_data)

## Exercise

Sort by area instead (i.e. we should have a sorted list of just the values at the end, but it should be sorted from smallest area to largest area of a state).

## Population density

The population density is defined as the number of people per square mile. So if we have a value 
```
element = [abbr, population, area, capital, state]
```
we would find the population density with
```
pop_density = element[1] / element[2]
```

Sort the values of `state_info` by _population density_.

In [None]:
def get_population_density(element):
    """For an element of the form [abbr, population, area, capital, state] returns the population density"""
    return element[1] / element[2]

# Example of usage
get_population_density(state_info[0])

In [None]:
## For you: sort the states by population density

## Generalized sorting function

So we have seen how we "set things up" to use sorted. Our algorithm for "sort by `x`" is
1. Get the values
2. Make a new list of the form `[x, all_the_data]`
3. Sort this new list (which sorts by `x` by default), call it `sorted_list`
4. Extract just the `all_the_data` piece from `sorted_list`

We have to repeat a lot of code each time. The reason is "get `x`" can be complicated. It could be
- get the population (i.e. grab element with index 1)
- get the area (i.e. grab element with index 2)
- get the density (i.e. grab the ratio `element[1] / element[2]`)
- get the abbreviation (i.e. grab `element[0]`)
The rule we should always have is that the thing we want to sort by should always be a function of the value we are trying to sort.

Let's write our population sorting code in one cell:

In [None]:
# Note the only thing that changes when we sort by something else is the first entry: [data[1], ...]
list_to_sort = [ [data[1], data] for data in state_info]
sorted_list = sorted(list_to_sort)
sorted_data = [element[1] for element in sorted_list]


# use "pretty printing" instead
pprint(sorted_data)

How would we generalize this? Well, we might write a function `get_population(element)`, which might seem like a step backward:

In [None]:
def get_population(element):
    return element[1]

In [None]:
# Note the only thing that changes when we sort by something else is the first entry: [get_population(data), ...]
list_to_sort = [ [get_population(data), data] for data in state_info]
sorted_list = sorted(list_to_sort)
sorted_data = [element[1] for element in sorted_list]


# use "pretty printing" instead
pprint(sorted_data)

The one thing that is a little nicer about the code above is that it is clear what is happening in the first line of this cell. `data[1]` is a little weird, but `get_population(data)` expresses our intent better. It also becomes clear how to use this pattern to sort by area, or population density:

In [None]:
# Note the only thing that changes when we sort by something else is the first entry: [get_population_density(data), ...]
list_to_sort = [ [get_population_density(data), data] for data in state_info]
sorted_list = sorted(list_to_sort)
sorted_data = [element[1] for element in sorted_list]


# use "pretty printing" instead
pprint(sorted_data)

That is still a lot of repeated code for changing just the thing we are sorting by. Let's write a function!

In [None]:
def fancy_sort(thing_to_sort, sort_by_func):
    """thing_to_sort is the list of things to sort
       sort_by_func is a FUNCTION
       
       We apply the function to each element of the list thing_to_sort
       and sort the elements by the output of the function
    """
    list_to_sort = [[sort_by_func(data), data] for data in thing_to_sort]
    sorted_list = sorted(list_to_sort)
    sorted_data = [element[1] for element in sorted_list]
    return sorted_data

In [None]:
# now let's use it:
pprint(fancy_sort(state_info, get_population))

We are passing `get_population` as a function, we are NOT calling it. Look at the difference between the following two cells

In [None]:
# not calling the function
get_population

In [None]:
# Calling the function (note the parens)
get_population(state_info[0])

The first cell refers to the function itself without running it. This is the form we pass to our sorting function, where it becomes the `sort_by_func`.

Inside `fancy_sort` we _call_ the function on line 8 (when making `list_to_sort`). If we wanted a different sorting (e.g. sorting by population density) we can do that easily:

In [None]:
# now let's use it:
pprint(fancy_sort(state_info, get_population_density))

## Exercise

1. Write a function, `get_area(element)`, and then use `fancy_sort` to sort by area
2. Write a function, `get_state_name(element)`, and then use `fancy_sort` to sort the states alphabetically

## First-class functions

The fact that we can pass functions the same way we pass variables leads to the phrase
> functions are first-class objects

If you are familiar with other programming languages (particularly in the C/C++ family), you may know that passing functions there takes extra machinery, such as function pointers. Here is another example of how we can use functions as variables:

In [None]:
def add_one(x):
    return x+1

def square(x):
    return x*x

def listify(x):
    return (x, x*x, x*x*x)

In [None]:
# square the numbers from 0 to 9
[square(x) for x in range(10)]

In [None]:
# add one to the numbers from 0 to 9
[add_one(x) for x in range(10)]

In [None]:
# listify the numbers from 0 to 9
[listify(x) for x in range(10)]

So far, we have just called a function "normally". Now let's store the function in a variable, and then call whichever one we chose later:

In [None]:
# pick a function
my_choice = square

[my_choice(x) for x in range(0, 10)]
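
Because functions behave like any other value, we can also keep several of them in a list and loop over them. The sketch below redefines `add_one` and `square` so that it runs on its own. The `choices` and `results` names are only for illustration.

```python
def add_one(x):
    return x + 1

def square(x):
    return x * x

# Functions can live in a list, just like numbers or strings
choices = [add_one, square]

results = {}
for func in choices:
    # func.__name__ is the name the function was defined with
    results[func.__name__] = [func(x) for x in range(5)]

print(results)
# {'add_one': [1, 2, 3, 4, 5], 'square': [0, 1, 4, 9, 16]}
```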

## lambda functions

Let's go back to our `fancy_sort` example. One criticism is that, to sort by population, we needed to define a function to get the population:

```python
def get_population(element):
    return element[1]

fancy_sort(state_info, get_population)
```

The first couple of lines might seem like overkill to _just_ get the element at index 1 from a list, especially if we never get the population ever again. A lambda function allows us to define a function in place. The syntax is

```python
f = lambda arg1, arg2, arg3, ... : <expression>

# This is the same as 
def f(arg1, arg2, arg3, ....):
    return <expression>
```

A lambda function evaluates a (one-line!) expression and returns it.

In [None]:
# Here is our square function
def square(x):
    return x*x

square(3)

In [None]:
# Here we can make a "square" function using lambda 
square2 = lambda x: x*x

square2(3)

In [None]:
# Here is the lambda function version of "get_population"
get_population = lambda element: element[1]

get_population(state_info[0])

In [None]:
# Write a lambda function for "get_area"


In [None]:
# Write a lambda function for "get_population_density"



In [None]:
# Here is a fancy lambda function
# It uses a conditional expression, so it returns either 'yes' or 'no'
am_i_rich = lambda balance: 'yes' if balance > 1_000_000 else 'no'

am_i_rich(50_000)

### Using lambda functions in practice

You would almost never use lambda functions the way I have written them above. Lambda functions are sometimes called "anonymous functions", because the typical use case is a function that we use once and throw away (i.e. we never assign it a variable name). If we write 

```python
square = lambda x: x*x
```

we may as well define it properly:

```python
def square(x):
    return x*x
```
Proper functions are easier to read, they can support docstrings, and they have slightly nicer technical properties (scoping and meta-programming properties). Where lambda functions shine is in one-off, quick functions:

In [None]:
get_population(state_info[0])

In [None]:
# fancy_sort using "get_population"
fancy_sort(state_info, get_population)

In [None]:
# fancy_sort using a throw-away function
fancy_sort(state_info, lambda element:element[1])
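
To make the difference between a proper function and a named lambda a little more concrete, here is a small sketch comparing what Python records about each. It only illustrates the docstring and naming points. `square2` is the same illustrative name used earlier.

```python
def square(x):
    """Return x squared."""
    return x * x

square2 = lambda x: x * x

# A def remembers its own name; a lambda is always called '<lambda>'
print(square.__name__)   # square
print(square2.__name__)  # <lambda>

# A def can carry a docstring; a lambda has none
print(square.__doc__)    # Return x squared.
print(square2.__doc__)   # None
```

The generic `<lambda>` name is also what shows up in error messages, which is one reason named lambdas are harder to debug.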

## Introduction to the `key` argument in `sorted`

It turns out that `fancy_sort` works the same way as `sorted` if we use the `key` argument:

In [None]:
sorted(state_info, key=lambda element:element[1])

i.e. `key` is the "official" name of what we called `sort_by_func`!
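
One small difference is worth knowing about, and it only shows up when two elements tie on the thing we sort by. The sketch below uses hypothetical `records` with a tied `score`, and repeats `fancy_sort` so the cell runs on its own. With `key`, `sorted` only ever compares the keys and leaves tied elements in their original order. Our `fancy_sort` falls back to comparing the elements themselves, which fails for dictionaries.

```python
def fancy_sort(thing_to_sort, sort_by_func):
    list_to_sort = [[sort_by_func(data), data] for data in thing_to_sort]
    sorted_list = sorted(list_to_sort)
    return [element[1] for element in sorted_list]

def get_score(record):
    return record['score']

# Hypothetical data with a tie: both records have the same score
records = [{'name': 'b', 'score': 1}, {'name': 'a', 'score': 1}]

# sorted with key= compares only the keys; tied items keep their original order
by_key = sorted(records, key=get_score)
print(by_key)

# fancy_sort compares [1, {...}] with [1, {...}]; after the tie on 1 it tries
# to compare the dictionaries, and dictionaries do not support "<"
try:
    fancy_sort(records, get_score)
    fancy_sort_failed = False
except TypeError:
    fancy_sort_failed = True
print(fancy_sort_failed)  # True
```

The state populations above are all different, so this never came up in our examples.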

## Exercise

Here are the states as a list of dictionaries. It is a little nicer, as each field is named (we don't have to remember if population or area comes first)

In [None]:
better_state_info = [{'abb': element[0], 'pop': element[1], 
                      'area': element[2], 'capital': element[3], 
                      'state': element[4]} for element in state_info]

In [None]:
better_state_info

Sort the `better_state_info` by population. You can use either `fancy_sort` or `sorted`

## Fancy functions: zip and enumerate

Let's talk about three functions:
- zip
- enumerate
- range

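They return lazy objects that produce their values one at a time (`zip` and `enumerate` return _iterators_; `range` returns a closely related range object). We can step through them almost like lists, but they are slightly different. We are going to cast them to lists, because we don't want to focus on the difference between _iterators_ and _lists_.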


It is not something to spend a huge amount of time on for your first Python course. The only reason I bring it up at all is otherwise you will wonder what these iterators are -- the answer for now is "things to cast to a list"!

In [None]:
range(10)

In [None]:
list(range(10))

Range is the easiest one of these to understand, and we have used it a few times already. 

`zip` takes two lists and matches them up, one pair at a time:

In [None]:
alter_ego = ['Natasha Romanoff', 'Diana Prince', 'Jean Grey', 'Clark Kent', 'Bruce Wayne', 'Peter Parker']
superhero = ['Black Widow', 'Wonder Woman', 'Phoenix', 'Superman', 'Batman', 'Spider-Man']

In [None]:
# this is a weird iterator
zip(alter_ego, superhero)

In [None]:
# Note this gives us a list of tuples, where we "match up" elements of the lists
list(zip(alter_ego, superhero))
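
Here is one concrete way an iterator is "slightly different" from a list: a `zip` object can only be stepped through once. This is a minimal sketch using two made-up lists, `letters` and `numbers`.

```python
letters = ['a', 'b', 'c']  # made-up example lists
numbers = [1, 2, 3]

pairs = zip(letters, numbers)

first_pass = list(pairs)
print(first_pass)   # [('a', 1), ('b', 2), ('c', 3)]

# The iterator is now used up, so a second pass gives nothing
second_pass = list(pairs)
print(second_pass)  # []

# A range object is more forgiving: it can be stepped through repeatedly
r = range(3)
range_twice = (list(r), list(r))
print(range_twice)  # ([0, 1, 2], [0, 1, 2])
```

Casting to a list right away, as we do in this notebook, avoids the surprise.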

In [None]:
# Let's reveal all the alter egos of the heroes by stepping through both lists together:
for index in range(len(alter_ego)):
    normal_name = alter_ego[index]
    superhero_name = superhero[index]
    print(f"{superhero_name}'s secret identity is {normal_name}")

In [None]:
# Let's use zip (note that a for loop can step through an iterator directly, we don't need an explicit cast)
for alter_and_super in zip(alter_ego, superhero):
    normal_name, superhero_name = alter_and_super
    print(f"{superhero_name}'s secret identity is {normal_name}")

In [None]:
# We can also use unpacking:
for normal_name, superhero_name in zip(alter_ego, superhero):
    print(f"{superhero_name}'s secret identity is {normal_name}")
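
`enumerate` follows the same pattern as `zip`, except that it pairs each element with its index. Here is a minimal sketch using a shortened copy of the `superhero` list. The `lines` and `numbered` names are only for illustration.

```python
superhero = ['Black Widow', 'Wonder Woman', 'Superman']  # shortened copy of the list above

# Cast to a list to see the (index, value) tuples
indexed = list(enumerate(superhero))
print(indexed)  # [(0, 'Black Widow'), (1, 'Wonder Woman'), (2, 'Superman')]

# Unpacking works just like it did with zip
lines = []
for index, name in enumerate(superhero):
    lines.append(f"{index}: {name}")
print(lines)

# The optional start argument changes the first index
numbered = [f"{position}. {name}" for position, name in enumerate(superhero, start=1)]
print(numbered)
```

This replaces the `for index in range(len(...))` pattern whenever we need both the index and the value.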

## An application to making better states

We used the following list comprehension to make the `better_state_info`. There are a lot of repeated elements here

In [None]:
better_state_info = [{'abb': element[0], 'pop': element[1], 
                      'area': element[2], 'capital': element[3], 
                      'state': element[4]} for element in state_info]

Here is a way using zip:

In [None]:
fields = ('abb', 'pop', 'area', 'capital', 'state')
better_state_info = [{key: value for key, value in zip(fields, element)} for element in state_info]

In [None]:
better_state_info
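
A slightly shorter variant is also possible, because `dict` accepts an iterable of (key, value) pairs, which is exactly what `zip` produces. This sketch uses a two-state sample, `sample_states`, so that it runs on its own.

```python
fields = ('abb', 'pop', 'area', 'capital', 'state')

# Two rows copied from state_info so the cell is self-contained
sample_states = [
    ['WA', 7_536_000, 71_297, 'Olympia', 'Washington'],
    ['OR', 4_190_713, 98_378, 'Salem', 'Oregon'],
]

# dict(zip(...)) builds the same dictionary as the inner comprehension above
better_sample = [dict(zip(fields, element)) for element in sample_states]
print(better_sample[0])

# Check against the explicit dictionary comprehension
same_as_comprehension = better_sample == [
    {key: value for key, value in zip(fields, element)} for element in sample_states
]
print(same_as_comprehension)  # True
```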

## An application to our digit program

Remember when we wanted to check how many digits were in the same place in the two different numbers? We did the following

In [None]:
def count_number_in_same_place(code, guess):
    code = f'{code:04d}'
    guess = f'{guess:04d}'
    if len(code) != len(guess):
        raise ValueError("numbers are not the same length!")
    num_matches = 0
    for index in range(len(code)):
        if code[index] == guess[index]:
            num_matches += 1
    return num_matches

In [None]:
count_number_in_same_place(1245, 3251)

Can you rewrite this using zip?