# DSCI 511: Data acquisition and pre-processing<br>Chapter 2: Working with Python's data types
## Exercises
Note: numberings refer to the main notes.

#### 2.1.1.6 Exercise: trace of a matrix

The trace of a (square) matrix (list of lists) is the sum of its diagonal elements.
Loop over the matrix's rows and add up the diagonal values for the trace. We'll be working with the simple 3 x 3 matrix:
```
[[1, 2, 3],   
 [4, 5, 6],  
 [7, 8, 9]]
```

#### Discussion: using `enumerate()` to manage indices
Here, it's very helpful to know the index of each element (both row and column). To gather these in our loop, we can unpack tuples from the `enumerate()` function, which lazily provides (index, value) tuples as it is iterated over. When the row and column indices are equal, we have a diagonal element and add it onto `trace`'s running tally.
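A minimal sketch of this approach, using the 3 x 3 matrix from the exercise. The names `matrix`, `i`, `j`, `row`, and `value` are illustrative choices and may differ from the original solution.

```python
matrix = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]

trace = 0
for i, row in enumerate(matrix):       # i is the row index
    for j, value in enumerate(row):    # j is the column index
        if i == j:                     # equal indices mark a diagonal element
            trace += value

print(trace)  # 1 + 5 + 9 = 15
```

Since only the diagonal is needed, the inner loop could also be skipped: `sum(row[i] for i, row in enumerate(matrix))` gives the same result.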

#### 2.1.1.8 Exercise: Sorting lists

You might have noticed that the list `y` contains the first 14 numbers in the Fibonacci sequence, i.e., each number after the first two is the sum of the previous two numbers.

Now, use a `for` loop to calculate the next 10 Fibonacci numbers and append them to list `y`. 

#### Discussion: Reverse indexing and slicing
We can easily get 10 passes through a loop by iterating over `range(10)` (__Note__: we don't actually use `i`!). However, accessing the last two elements of the current running list could be tricky if we had to, e.g., check how long the list was and pull out values by explicit index. Instead, we can _slice_ out the last two elements using negative indexing. The `-2` index refers to the second from last element, and by leaving a blank on the other side of the `:` slice operator we're telling Python to gather the elements from that (`-2`) point through the list's end. From there, `sum()` nicely handles the summation on the list slice, and we can then use the `.append()` method.
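The loop described above might look like the following sketch. The starting values of `y` are an assumption here (the sequence is taken to begin 1, 1); the list in the main notes may start differently.

```python
# Assumed starting point: the first 14 Fibonacci numbers, beginning 1, 1
y = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377]

for i in range(10):          # i is never used; it only counts the passes
    y.append(sum(y[-2:]))    # y[-2:] slices out the last two elements

print(len(y))   # 24
print(y[-2:])   # [28657, 46368]
```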

#### 2.1.1.11 Exercise: Flattening a matrix into a list
Using either a nested comprehension or a nested for loop, convert the matrix below into a single list of values. Discuss how they become ordered, i.e., by column or row.
#### Discussion: unwinding nested objects with comprehensions
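This syntax can be pretty confusing the first time you see it. As discussed, a comprehension places the expression before the iteration, which is the reverse of a loop. However, the order of nesting remains the same: the outer `for` clause still comes first! Because the outer iteration is over rows, the values come out ordered row by row.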
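To compare the two forms side by side, here is a small sketch. It assumes the matrix is the 3 x 3 example from __Section 2.1.1.6__; the names `flat_loop` and `flat_comp` are hypothetical.

```python
matrix = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]

# Nested loop: outer loop over rows, inner loop over each row's values
flat_loop = []
for row in matrix:
    for value in row:
        flat_loop.append(value)

# Nested comprehension: the expression moves to the front,
# but the `for` clauses keep the same outer-to-inner order
flat_comp = [value for row in matrix for value in row]

print(flat_comp)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```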

#### 2.1.2.3 Exercise: tuple unpacking and list transformation
Use a list comprehension to create a list of all the author names from the given list of tuples containing title, author, and publication year.

#### Discussion: pulling out a column
Since both lists and tuples are ordered sequences, iteration over `books` in the comprehension `[book[1] for book in books]` provides access to each row/tuple of data. Our expression in the comprehension can be thought of as 'get the `1`th column', using ordered indexing to get the author.
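As a sketch, with a short stand-in for `books` (the actual list in the main notes is not reproduced here, so these three entries are illustrative only):

```python
# Stand-in data: (title, author, publication year)
books = [
    ("Frankenstein", "Mary Shelley", 1818),
    ("Dune", "Frank Herbert", 1965),
    ("Neuromancer", "William Gibson", 1984),
]

# Index 1 of each tuple is the author
authors = [book[1] for book in books]

# The same result, unpacking each tuple into named parts
authors_unpacked = [author for title, author, year in books]

print(authors)  # ['Mary Shelley', 'Frank Herbert', 'William Gibson']
```

The unpacked version names each column, which can be easier to read than a bare index.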

#### 2.1.4.2 Exercise: functions review
Write your own basic version of the built-in `range()` function. It should take only one argument, the number before which the range should stop, and it should return a list containing all integers from 0 up to (but not including) that end point.

#### Discussion:  `return` is final, i.e., ends a function
The prompt asks for a list as output because a function's `return` is final: once it executes, the function ends. This means we can't return each integer from inside of the while loop, but instead have to instantiate a list object (`stop_ints`), load it up with the integers, and `return` it at the end.
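One way this could be written is sketched below. The name `stop_ints` comes from the discussion above; the parameter name `stop` and the counter `stop_int` are assumptions about the original solution.

```python
def my_range(stop):
    """Return a list of the integers 0, 1, ..., stop - 1."""
    stop_ints = []              # list that collects the output
    stop_int = 0                # current integer (assumed name)
    while stop_int < stop:
        stop_ints.append(stop_int)
        stop_int += 1
    return stop_ints            # a single, final return after the loop

print(my_range(5))  # [0, 1, 2, 3, 4]
```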

#### 2.1.4.4 Exercise: generator functions
Rewrite `my_range()` as a generator function and print out the first 25 integers using this generator function.

#### Discussion: `yield` is incremental
Generators are functions whose output is produced lazily and can be iterated over. To make one, instead of ending the function with final output using `return`, each piece of output can be `yield`ed as it is computed. Notice below how the `yield` statement is now being applied to `stop_int` inside of the loop, and that no list object is being managed or `return`ed. Notice also that calling the generator function without iterating over it only gives back a reference to a generator object; no values have been computed yet. To receive the full output of the generator, it can be iterated over or coerced into a list.
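A sketch of the generator version follows. As before, the parameter name `stop` and the exact loop structure are assumptions; `gen` and `first_25` are hypothetical names.

```python
def my_range(stop):
    """Yield the integers 0, 1, ..., stop - 1 one at a time."""
    stop_int = 0
    while stop_int < stop:
        yield stop_int      # hand back one value, then pause here
        stop_int += 1

gen = my_range(25)
print(gen)                  # a generator object; no integers computed yet

first_25 = list(gen)        # coercing to a list runs the generator to the end
print(first_25)
```

Once a generator has been consumed it is exhausted, so a second `list(gen)` returns an empty list; call `my_range(25)` again for a fresh one.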

#### 2.1.4.6 Exercise: lambda function data transformation for a sort
Sort this list of books in descending order of year of publication (i.e. the more recently published books should come first) using a lambda function with `sorted()`.

#### Discussion: the `lambda` function relies upon an implied schema
The sorting key tells `sorted()` to sort by the year, i.e., the third column. It assumes that every element (tuple) in the list being sorted has a `2`th element:
- `key = lambda x: x[2]`

This might seem strange at first, but the `x` is a dummy variable that just tells the `sorted()` function which aspect of a piece of data (in the list) to sort by. It says 'given a tuple `x`, sort by the `2`th element, `x[2]`'.
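A short sketch of the sort, again using an illustrative stand-in for `books` rather than the list from the main notes; `by_year_desc` is a hypothetical name.

```python
# Stand-in data: (title, author, publication year)
books = [
    ("Dune", "Frank Herbert", 1965),
    ("Frankenstein", "Mary Shelley", 1818),
    ("Neuromancer", "William Gibson", 1984),
]

# key picks out the year; reverse=True puts the most recent first
by_year_desc = sorted(books, key=lambda x: x[2], reverse=True)

for title, author, year in by_year_desc:
    print(year, title)
```

Note that `sorted()` returns a new list and leaves `books` itself unchanged.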

#### 2.2.2.4 Exercise: interacting with dictionaries
Write code that uses a loop to iterate over `books_dict` and create three separate lists of titles, authors, and years, populated by the correct values.

#### Discussion: dictionary iteration defaults to keys
Iterating over a dictionary provides its keys by default, not its values. However, since we know that each book has a `"title"`, we can get away with pulling the actual value for a book by the string key directly.
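The sketch below shows one way the loop could look. The structure of `books_dict` is an assumption (an outer dictionary whose values are per-book dictionaries with `"title"`, `"author"`, and `"year"` keys), and the outer keys such as `"book_1"` are hypothetical.

```python
# Assumed structure: outer key -> dictionary describing one book
books_dict = {
    "book_1": {"title": "Frankenstein", "author": "Mary Shelley", "year": 1818},
    "book_2": {"title": "Dune", "author": "Frank Herbert", "year": 1965},
}

titles, authors, years = [], [], []
for key in books_dict:              # iteration yields the keys
    book = books_dict[key]          # look up the value for this key
    titles.append(book["title"])    # pull each field by its string key
    authors.append(book["author"])
    years.append(book["year"])

print(titles)   # ['Frankenstein', 'Dune']
```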

#### 2.2.2.6 Exercise: Counting with a base Python dictionary
Say we have a list of results of soccer games. There are three possibilities: win (W), loss (L), and draw (D). Count the number of wins, losses, and draws using the `counts` dictionary.

#### Discussion: infullness implementation
The implementation below utilizes the infullness approach. If the boolean value of `result in counts` is `True`, we modify (increment) the value present in `counts[result]`; if the boolean value is `False`, we get the value of `counts[result]` started with direct assignment to `1`.
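A minimal sketch of this pattern is given here. The `results` list is made-up example data, and `counts` is assumed to start out empty.

```python
results = ["W", "L", "W", "D", "W", "L", "D", "W"]   # made-up game results
counts = {}                                          # assumed to start empty

for result in results:
    if result in counts:        # key already present: increment
        counts[result] += 1
    else:                       # first sighting: start the count at 1
        counts[result] = 1

print(counts)  # {'W': 4, 'L': 2, 'D': 2}
```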

#### 2.2.2.7 Exercise: sorting a dictionary's keys by value
As discussed, default iteration on a dictionary is by key. This means a default `sorted()` call on a dictionary will return a sorted list of the dictionary's keys. Use this, together with the `key=lambda x: expression(x)` syntax, to sort the keys (game result types) of the `counts` dictionary from the exercise in __Section 2.2.2.6__ by value, from high to low (`reverse = True`). Then use a loop to print out the top 2 most common game result types and their counts.

#### Discussion: sorting a dictionary by values
Since `sorted()` will operate on the keys, we can provide a lambda function that assumes each key has a corresponding value in `counts`. In other words, the lambda function
- `lambda x: counts[x]`

says, 'given a key, `x` (input), sort by its value in `counts[x]` (output)'.
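Put together, the sort and the top-2 loop might look like this sketch. The tallies in `counts` are hypothetical values chosen to avoid ties, and `ranked` is an illustrative name.

```python
counts = {"W": 5, "L": 2, "D": 3}   # hypothetical tallies

# sorted() iterates over the keys; the lambda maps each key to its count
ranked = sorted(counts, key=lambda x: counts[x], reverse=True)

for result in ranked[:2]:           # slice off the top 2 keys
    print(result, counts[result])
# W 5
# D 3
```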

#### 2.2.2.9 Exercise: counters
Count the results and calculate a point tally, where a win equals 3 points, a draw equals 1 point, and a loss equals 0 points.

#### Discussion: A solution using weights
Here, we used a dictionary of weights. Technically, we could have just started with the `count` `Counter()` already constructed in __Section 2.2.2.7__ and used some `if`/`else` statements inside of the loop to check the different result types (e.g., `"W"`) by hand and increment `points` by the appropriate value, but the use of `weights` lets us think ahead about 'metadata' and how it can be used to support a task.
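A sketch of the weighted tally is shown below. The `results` list is made-up example data; the point values in `weights` come from the exercise statement, and the names `count`, `weights`, and `points` follow the discussion above.

```python
from collections import Counter

results = ["W", "L", "W", "D", "W", "L", "D", "W"]   # made-up game results
weights = {"W": 3, "D": 1, "L": 0}                   # points per result type

count = Counter(results)        # tallies each result type

points = 0
for result in count:            # iterate over the result types (keys)
    points += weights[result] * count[result]

print(points)  # 4*3 + 2*1 + 2*0 = 14
```

Changing the scoring rules then only requires editing `weights`, not the loop.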