## What a list comprehension looks like

    [<expression with item_var> for <item_var> in <sequence>]

* Takes any **iterable** (list, dictionary, set, tuple, generator, string, file, ...) as input
* **Transforms** each element of that iterable
* Evaluates to **a list** of transformed elements

List comprehensions can also filter; we'll see that in a later slide.

In [1]:
# List comprehension: map every element to its tenfold
input_list = [10, 20, 30, 40]
[x * 10 for x in input_list]

[100, 200, 300, 400]
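For comparison, here is a minimal sketch of the explicit loop that does the same work as the comprehension above. The names `output_list` and `same_result` are introduced here for illustration only.

```python
input_list = [10, 20, 30, 40]

# Explicit loop: build the result one element at a time
output_list = []
for x in input_list:
    output_list.append(x * 10)

# The list comprehension builds an equal list in a single expression
same_result = output_list == [x * 10 for x in input_list]
```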

## Why list comprehensions

Programming consists of **data transformation** and **side effects**

Most data transformation on collections takes one of 3 forms:

1. Transform each element of the collection: `map(function, sequence)`, list comprehension
2. Keep each element that meets a condition: `filter(function, sequence)`, list comprehension
3. Summarize a collection: `functools.reduce()`, statistics, lots of other functions

List comprehensions are a compact way to write the two most common collection transformations
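The three forms can be put side by side in a short sketch. The sample data and variable names are made up for illustration. Note that in Python 3, `reduce` must be imported from `functools`.

```python
from functools import reduce  # reduce() is not a builtin in Python 3

numbers = [1, 2, 3, 4]  # hypothetical sample data

# 1. Transform each element
doubled_map = list(map(lambda x: x * 2, numbers))
doubled_comp = [x * 2 for x in numbers]

# 2. Keep each element that meets a condition
even_filter = list(filter(lambda x: x % 2 == 0, numbers))
even_comp = [x for x in numbers if x % 2 == 0]

# 3. Summarize the collection into a single value
total = reduce(lambda acc, x: acc + x, numbers, 0)
```

For forms 1 and 2, the comprehension gives the same result as `map()` or `filter()` wrapped in `list()`.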

## An expression is anything that can be reduced to a value

    [<expression with item_var> for <item_var> in <sequence>]

Expressions only contain identifiers, literals and operators

* function calls are fine
* method calls are fine
* indexing is fine

Basically, avoid function/class definitions and flow control.

* No `def ...:`
* No `if ...:`
* No `return ...`
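An `if ...:` statement is not allowed, but Python's conditional expression (`a if condition else b`) is an expression, so it can appear in a comprehension. A small sketch with made-up data:

```python
values = [10, 20, 30, 40]

# Conditional expression: evaluates to 'small' or 'large' for each x
labels = ['small' if x < 30 else 'large' for x in values]
```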

## An expression is anything that can be reduced to a value

In [4]:
def maybe_plus_one(x):
    if x < 30:
        return x + 1

# List comprehension: call maybe_plus_one on every x (it returns None when x >= 30)
[maybe_plus_one(x) for x in [10, 20, 30, 40]]

[11, 21, None, None]

In [5]:
# Split string on whitespace to get a list of names
names = 'James James Morrison Morrison Weather George du Pree'.split()

# List comprehension: map every word to its first two characters
[name[0:2] for name in names]

['Ja', 'Ja', 'Mo', 'Mo', 'We', 'Ge', 'du', 'Pr']

In [6]:
# Map every character to its uppercase, and add an underscore
[char.upper() + '_' for char in 'Tom']

['T_', 'O_', 'M_']

### List comprehensions allow filtering

Add `if` plus a filtering expression at the end

In [7]:
names = ['Tom', 'Dick', 'Harry']
[name for name in names]

['Tom', 'Dick', 'Harry']

In [8]:
[name for name in names if len(name) <= 4]

['Tom', 'Dick']
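Filtering and transforming can be combined in one comprehension. In this sketch (the name `short_upper` is just for illustration), the `if` condition is checked for each item, and the leading expression is only evaluated for the items that pass:

```python
names = ['Tom', 'Dick', 'Harry']

# Keep names of at most 4 characters, then uppercase the ones that remain
short_upper = [name.upper() for name in names if len(name) <= 4]
```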

## Dictionary comprehensions exist, too

This is a dictionary literal:

In [9]:
{'Joan': 8, 'Pippa': 12}

{'Joan': 8, 'Pippa': 12}

This is a dict comprehension:

In [10]:
{name: len(name) for name in names if len(name) <= 4}

{'Tom': 3, 'Dick': 4}
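A dict comprehension can also read from an existing dictionary. A small sketch, assuming the dictionary literal shown earlier is bound to a name (`ages`, chosen here for illustration), that swaps keys and values:

```python
ages = {'Joan': 8, 'Pippa': 12}

# .items() yields (key, value) pairs; unpack them and swap their roles
name_by_age = {age: name for name, age in ages.items()}
```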

## Generator expressions: lazy lists

### What is a generator?

* A lazy sequence of values: the next value is only computed when you ask for it
* You can iterate over the entire sequence with `for value in the_generator`
* You can manually get the next value with `next(the_generator)`

### What is a generator good for?

* When you don't want to have the entire list in memory at once (can save CPU too, if you stop before the end)
* When you only need to pass over the generator's items once, in order
* When you don't need random access: `mygenerator[3]` will not work
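One way to see the laziness is to record when values actually get computed. This sketch uses a generator expression (the parenthesized form covered later in these slides). `square_and_record` and `computed` are hypothetical helpers introduced only for this illustration.

```python
computed = []  # records which inputs have actually been processed

def square_and_record(x):
    computed.append(x)
    return x * x

lazy_squares = (square_and_record(x) for x in [1, 2, 3])

before = list(computed)      # nothing computed yet: creating the generator does no work
first = next(lazy_squares)   # computes only the first value
after_one = list(computed)   # only the first input has been processed
rest = list(lazy_squares)    # computes the remaining values
```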

## Life cycle of a generator

(1) Generator function  
(2) Generator or generator expression  
(3) Consumption  
(4) Empty generator

### How do I consume a generator?

* `list(mygenerator)` consumes all values, and returns a list of all values
  * Useful when you get a generator, but want a list
* `next(mygenerator)` consumes one value, and returns that value
  * You'll rarely use this
* `for item in mygenerator:` consumes and uses items one by one
  * Repeatedly calls `next()`
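As a rough sketch of what a `for` loop does with a generator: it calls `next()` until `StopIteration` is raised, then stops quietly. (A real `for` loop first calls `iter()` on the object; for a generator that returns the generator itself.) The names below are for illustration only.

```python
gen = (char for char in 'abc')

collected = []
while True:
    try:
        item = next(gen)       # ask for the next value
    except StopIteration:      # the generator is empty: leave the loop
        break
    collected.append(item)     # the "loop body"
```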

### 1. Generator function

In [11]:
def countdown(n):
    while 0 < n:
        yield n
        n = n - 1
    yield 'Liftoff!'

### 2. Generator or generator expression

In [12]:
print(countdown(4))
print( (char for char in 'asdf') )

<generator object countdown at 0x7f55183bdbf8>
<generator object <genexpr> at 0x7f55183bdbf8>


### 3. Consumption

In [13]:
list(countdown(4))

[4, 3, 2, 1, 'Liftoff!']

In [14]:
next( (char for char in 'asdf') )

'a'

### 4. Empty generator

In [15]:
c = countdown(4)
list(c)
next(c)

StopIteration: 
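An empty generator stays empty. A short sketch of two ways to deal with one without hitting `StopIteration`: `list()` returns an empty list, and `next()` accepts a default as its second argument. It reuses `countdown` from the earlier slide; the other names are for illustration.

```python
def countdown(n):
    while 0 < n:
        yield n
        n = n - 1
    yield 'Liftoff!'

c = countdown(2)
all_values = list(c)      # consumes every value
leftover = list(c)        # already empty, so this is an empty list
fallback = next(c, None)  # default is returned instead of raising StopIteration
```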

## Generator expressions

A generator expression is like a list comprehension, but wrapped in round parentheses instead of square brackets

In [17]:
input_list = [1, 2, 3, 4]
output_list = [x for x in input_list]
output_list

[1, 2, 3, 4]

In [18]:
input_list = [1, 2, 3, 4]
generator = (x for x in input_list)
generator

<generator object <genexpr> at 0x7f55183bda98>
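Generator expressions are often passed straight to a function that consumes them, such as `sum()`. When the generator expression is the only argument, the extra pair of parentheses can be left out. A minimal sketch (the variable names are for illustration):

```python
input_list = [1, 2, 3, 4]

# No intermediate list is built: sum() pulls the squares one at a time
total_of_squares = sum(x * x for x in input_list)

# Same thing with the parentheses written out
same_total = sum((x * x for x in input_list))
```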

## `map(function, sequence)` is equivalent to a generator expression

In [21]:
my_generator = map(lambda x: x + 1, [1, 2, 3, 4])
my_list = list(my_generator)
my_list

[2, 3, 4, 5]
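To make the equivalence concrete, this sketch builds the same lazy sequence both ways and compares the results. In Python 3, `map()` returns a lazy iterator rather than a list, so both objects are consumed with `list()`. The names are for illustration.

```python
numbers = [1, 2, 3, 4]

from_map = map(lambda x: x + 1, numbers)   # lazy map object
from_genexp = (x + 1 for x in numbers)     # lazy generator expression

list_from_map = list(from_map)
list_from_genexp = list(from_genexp)
```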

## A pipeline of list comprehensions

We are going to parse this log file:

In [22]:
# The fields of my logfile: id, name, and payment
my_logfile = """\
12,Joan,12.50
13,Carl,1
14,Pippa,30\
"""

## A pipeline of list comprehensions

Notice how the data flows from list comprehension to list comprehension, to be filtered and transformed at each step.

In [23]:
lines = my_logfile.split('\n')
rows = [line.split(',') for line in lines]
rows_with_numbers = [(int(r[0]), r[1], float(r[2])) for r in rows]
paid_at_least_10 = [row for row in rows_with_numbers if 10 <= row[2]]

print('lines:', lines)
print('rows:', rows)
print('rows_with_numbers:', rows_with_numbers)
print('paid_at_least_10:', paid_at_least_10)

lines: ['12,Joan,12.50', '13,Carl,1', '14,Pippa,30']
rows: [['12', 'Joan', '12.50'], ['13', 'Carl', '1'], ['14', 'Pippa', '30']]
rows_with_numbers: [(12, 'Joan', 12.5), (13, 'Carl', 1.0), (14, 'Pippa', 30.0)]
paid_at_least_10: [(12, 'Joan', 12.5), (14, 'Pippa', 30.0)]


### A pipeline of generator expressions

Replace list comprehensions with generator expressions: free laziness!

In [24]:
lines = my_logfile.split('\n')
rows = (line.split(',') for line in lines)
rows_with_numbers = ((int(row[0]), row[1], float(row[2])) for row in rows)
paid_at_least_10 = (row for row in rows_with_numbers if 10 <= row[2])

print(paid_at_least_10)

# list() starts consuming the generators, and starts the data flowing
list(paid_at_least_10)

<generator object <genexpr> at 0x7f55183f02b0>


[(12, 'Joan', 12.5), (14, 'Pippa', 30.0)]
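Because each step only needs an iterable, the same pipeline can be wrapped in a function and fed lines one at a time, for example from an open file. The sketch below is one possible arrangement, not the only one: `paid_at_least` is a hypothetical helper, and `io.StringIO` stands in for a real file so the example is self-contained.

```python
import io

def paid_at_least(lines, minimum):
    """Lazily yield (id, name, payment) rows whose payment is >= minimum."""
    rows = (line.strip().split(',') for line in lines)
    rows_with_numbers = ((int(r[0]), r[1], float(r[2])) for r in rows)
    return (row for row in rows_with_numbers if minimum <= row[2])

# Iterating over a file-like object yields one line at a time
fake_file = io.StringIO("12,Joan,12.50\n13,Carl,1\n14,Pippa,30\n")

# Nothing is read or parsed until list() starts consuming the pipeline
result = list(paid_at_least(fake_file, 10))
```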