<a href="https://colab.research.google.com/github/lewyingshi/module2_lectures/blob/master/2_2_list_comprehensions.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# List Comprehensions

## Objectives

1. Understand the list comprehension syntax
2. Demonstrate list processing with comprehensions
3. Use list comprehensions in probability simulations

## List Comprehension

* Expression for constructing a list
* Returns a new list
* Reads like math
    * Set builder notation

In [None]:
mylist = [1,2,3,4,5]
yourlist = [item ** 2 for item in mylist]
yourlist

[1, 4, 9, 16, 25]
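The comprehension above is shorthand for an explicit loop that appends to a list. A minimal sketch of the equivalent loop, where `loop_result` is a hypothetical name introduced only for the comparison:

```python
mylist = [1, 2, 3, 4, 5]

# Explicit loop: start with an empty list and append each squared item
loop_result = []
for item in mylist:
    loop_result.append(item ** 2)

# Comprehension: the same list built in a single expression
yourlist = [item ** 2 for item in mylist]

print(loop_result == yourlist)
```

Both forms build a new list and leave `mylist` unchanged.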

## Building a Comprehension


    
<img src="https://github.com/wsu-stat489/USCOTS2017_workshop/blob/master/img/listComprehensions.gif?raw=true">

## Building a List Comprehension

1. Begin with an empty shell

    `L = [   for    in       ]`

2. Insert the input sequence

    `L = [    for     in range(10)]`

3. Give the elements a name

    `L = [    for num in range(10)]`

4. Write the output expression using that name

    `L = [num + 2 for num in range(10)]`

In [None]:
L = [num + 2 for num in range(10)]
L

[2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

## Adding an optional filter

* The if portion is optional
* Syntax: `if boolean_cond`
    * After input sequence
* Keeps only the values for which the condition is `True`

In [None]:
L = [num + 2 for num in range(10) if num % 2 == 1]
L

[3, 5, 7, 9, 11]
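One way to read the filtered comprehension is as a loop with an `if` inside it. The sketch below uses the hypothetical name `L_loop` for the loop version. It also shows that the filter tests the input `num`, not the output `num + 2`.

```python
# Explicit loop equivalent to the filtered comprehension
L_loop = []
for num in range(10):
    if num % 2 == 1:            # the filter: keep only odd inputs
        L_loop.append(num + 2)  # the expression runs only on kept inputs

L = [num + 2 for num in range(10) if num % 2 == 1]
print(L_loop == L)
```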

### <font color='red'> Exercise 1 </font>

Write a list comprehension that contains the squares of the numbers between 0 and 20 (inclusive).  **Hint:** Start with a `range`.

In [None]:
mylist = [item ** 2 for item in range(21)]
mylist

[0,
 1,
 4,
 9,
 16,
 25,
 36,
 49,
 64,
 81,
 100,
 121,
 144,
 169,
 196,
 225,
 256,
 289,
 324,
 361,
 400]

### <font color='red'> Exercise 2 </font>

Write a list comprehension that contains all perfect squares less than 555.  **Hint:** Use a large range and a filter.

In [None]:
ps = [num ** 2 for num in range(100) if num ** 2 < 555]
ps

[0,
 1,
 4,
 9,
 16,
 25,
 36,
 49,
 64,
 81,
 100,
 121,
 144,
 169,
 196,
 225,
 256,
 289,
 324,
 361,
 400,
 441,
 484,
 529]

## Splitting and processing a string

* Use `split` to cut a string into a list of strings
* Use a comprehension to process the list.

In [None]:
quote = "Don't judge each day by the harvest you reap but by the seeds that you plant."
quote.split(" ")

["Don't",
 'judge',
 'each',
 'day',
 'by',
 'the',
 'harvest',
 'you',
 'reap',
 'but',
 'by',
 'the',
 'seeds',
 'that',
 'you',
 'plant.']

In [None]:
[len(word) for word in quote.split(" ")]

[5, 5, 4, 3, 2, 3, 7, 3, 4, 3, 2, 3, 5, 4, 3, 6]
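Note that the last length above counts the trailing period in `'plant.'`. As a sketch of one possible cleanup, the example below strips punctuation from the ends of each word before measuring it. The names `words`, `clean_words`, and `lengths` are illustrative.

```python
import string

quote = "Don't judge each day by the harvest you reap but by the seeds that you plant."

# split() with no argument splits on any run of whitespace
words = quote.split()

# Strip leading/trailing punctuation; the apostrophe inside "Don't" is kept
clean_words = [word.strip(string.punctuation) for word in words]
lengths = [len(word) for word in clean_words]

print(clean_words[-1], lengths[-1])
```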

### <font color='red'> Exercise 3 </font>

Write a list comprehension that grabs the last two letters of each word.  **Hint:** Use the slice operation!

In [49]:
wordList = ["hello", "handsome"]
# A negative start index counts from the end of the string
lastTwo = [s[-2:] for s in wordList]
lastTwo

['lo', 'me']

### <font color='red'> Exercise 4 </font>

Write a list comprehension that contains all words that have at least 4 characters. 

In [None]:
string = "Hello, were you kidding when you say you are extremely sick?"
words = string.split(" ")
long_words = [word for word in words if len(word) >= 4]
long_words

['Hello,', 'were', 'kidding', 'when', 'extremely', 'sick?']

### Comprehensions work on any input sequence

In [None]:
# On string - gives list of characters
[ch for ch in "Todd Iverson"]

['T', 'o', 'd', 'd', ' ', 'I', 'v', 'e', 'r', 's', 'o', 'n']

In [None]:
# On tuple - converts to list
[item for item in (1,2,3)]

[1, 2, 3]

In [None]:
# On a lazy iterator - enumerate yields (index, item) tuples on demand
[tup for tup in enumerate(["a", "b", "c"])]

[(0, 'a'), (1, 'b'), (2, 'c')]
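The same pattern applies to other iterables. A short sketch, using a made-up dictionary `ages` as example data:

```python
ages = {"ann": 31, "bob": 27}  # hypothetical example data

# Iterating over a dict yields its keys
names = [name for name in ages]

# dict.items() yields (key, value) tuples
pairs = [pair for pair in ages.items()]

# A generator expression is also a valid input
doubled = [n * 2 for n in (x + 1 for x in range(3))]

print(names, pairs, doubled)
```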

## Writing Clean Code
### Using helper functions

* **Clean Code Rule 1:** Use helper functions to hide complexity

In [None]:
# Original comprehension
[x**(1/3) for x in range(5) if x %2 == 1]

[1.0, 1.4422495703074083]

## Which is easier to read?

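In [None]:
# Original comprehension
[x**(1/3) for x in range(5) if x % 2 == 1]
# With helper functions; raises NameError until is_odd and cube_root are defined
[cube_root(x) for x in range(5) if is_odd(x)]

NameError: ignored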

## Cleaning up list comprehensions

1. Make a function for 
    1. the expression,
    2. the sequence, and 
    3. the predicate
2. Refactor by replacing code with function calls.

In [None]:
# Helper functions
is_odd = lambda x: x % 2 == 1
cube_root = lambda x: x**(1/3)

In [None]:
# Refactored comprehension
[cube_root(x) for x in range(5) if is_odd(x)]
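The steps above also mention a function for the sequence. A minimal sketch of all three helpers together follows. The helper `small_integers` is a hypothetical name, not part of the original notebook.

```python
def is_odd(x):
    """Predicate: True for odd integers."""
    return x % 2 == 1

def cube_root(x):
    """Expression: real cube root of a non-negative number."""
    return x ** (1 / 3)

def small_integers():
    """Sequence: the inputs 0 through 4 (hypothetical helper)."""
    return range(5)

result = [cube_root(x) for x in small_integers() if is_odd(x)]
print(result)
```

With descriptive names, the comprehension reads close to a sentence: the cube root of each small integer that is odd.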

## Unpacking multiple items

* Tuple unpacking assigns a name to each item
* Can be used in a comprehension on a sequence of tuples

In [None]:
a, b, c = (1, 2, 3)
b

2

In [None]:
l = ['a', 'b', 'c']
m = [1, 2, 3]
# Without unpacking
[item for item in zip(l, m)]

[('a', 1), ('b', 2), ('c', 3)]

In [None]:
# With unpacking
[s + str(num) for s, num in zip(l, m)]

['a1', 'b2', 'c3']
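Unpacking works the same way with `enumerate`, which yields `(index, item)` tuples. A small sketch, with `letters` and `labeled` as illustrative names:

```python
letters = ['a', 'b', 'c']

# Each (index, item) tuple is unpacked into idx and letter
labeled = [str(idx) + letter for idx, letter in enumerate(letters)]
print(labeled)
```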

### <font color='red'> Exercise 5 </font>

You probably learned about the normal and standard normal distributions in your introductory statistics class.  In this activity, we will build up functions to simulate these distributions using a uniform random number generator.

#### Step 1 -- Transform two uniform random variates into a standard normal random variate.

We will use the [Box-Muller transformation](https://en.wikipedia.org/wiki/Box%E2%80%93Muller_transform) to turn two uniform random variates (generated with `random.random`) into a single standard normal variate.  **Hint:** In particular, you want to implement the formula for $Z_0$, where $U_1$ and $U_2$ are two numbers generated using `random.random`.
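For reference, the standard Box-Muller formula for $Z_0$ is:

$$ Z_0 = \sqrt{-2 \ln U_1}\,\cos(2\pi U_2) $$

Here $\ln$ is the natural logarithm (`math.log` in Python), and $U_1$ must be strictly positive so that the logarithm is defined.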

**Task:** Write a function called `std_norm_variate` that generates a single standard normal variate.

#### Step 2 -- Create a function that generates many standard normal variates.

Now use a list comprehension to create a function called `std_norm_variates` that has one argument `n` and returns a list of `n` simulated standard normal variates.
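One possible sketch of Steps 1 and 2, offered as an illustration rather than the only solution. It assumes the $Z_0$ form of the Box-Muller transformation. The seed value and the sample size are arbitrary choices made for the demonstration.

```python
import math
import random

def std_norm_variate():
    """One standard normal variate via the Box-Muller formula for Z_0."""
    # random.random() returns a value in [0, 1); 1 - u lies in (0, 1],
    # which avoids taking log(0).
    u1 = 1.0 - random.random()
    u2 = random.random()
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)

def std_norm_variates(n):
    """A list of n standard normal variates, built with a comprehension."""
    return [std_norm_variate() for _ in range(n)]

random.seed(489)  # arbitrary seed, only to make the run repeatable
sample = std_norm_variates(10_000)
sample_mean = sum(sample) / len(sample)
sample_var = sum((z - sample_mean) ** 2 for z in sample) / (len(sample) - 1)
print(round(sample_mean, 2), round(sample_var, 2))
```

The sample mean and variance should land near 0 and 1, which gives a quick sanity check on the simulation.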

#### Step 3 -- Transform a standard normal variate into a given normal variate.

Suppose that `Z` is a standard normal variate and we want to simulate $X \sim \text{Norm}(\mu, \sigma)$.  It can be shown that applying the following transformation will result in the desired distribution.

$$ X = \mu + Z\sigma $$

Write a function called `norm_variate` that takes `mean` and `sd` as arguments and returns a normal variate from a distribution with this mean and standard deviation.

#### Step 4 -- Create a function that generates many normal variates.

Now use a list comprehension to create a function called `norm_variates` that has arguments `n`, `mean`, and `sd` and returns a list of `n` simulated normal variates with that mean and standard deviation.
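A possible sketch of Steps 3 and 4, again as an illustration rather than the definitive answer. The name `norm_variates` and its `(n, mean, sd)` signature are assumptions chosen to parallel `std_norm_variates`. The seed and the parameters 170 and 10 are made-up values for the demonstration.

```python
import math
import random

def std_norm_variate():
    """One standard normal variate via the Box-Muller formula for Z_0."""
    u1 = 1.0 - random.random()  # in (0, 1], so log(u1) is defined
    u2 = random.random()
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)

def norm_variate(mean, sd):
    """One normal variate using X = mu + Z * sigma."""
    return mean + std_norm_variate() * sd

def norm_variates(n, mean, sd):
    """A list of n normal variates (assumed name and signature)."""
    return [norm_variate(mean, sd) for _ in range(n)]

random.seed(2017)  # arbitrary seed, only to make the run repeatable
xs = norm_variates(10_000, 170, 10)  # hypothetical parameters
xs_mean = sum(xs) / len(xs)
xs_sd = math.sqrt(sum((x - xs_mean) ** 2 for x in xs) / (len(xs) - 1))
print(round(xs_mean, 1), round(xs_sd, 1))
```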