In [1]:
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

## List Comprehension

List comprehensions have the basic form:

* `[expression for item in iterable]`

With conditional logic:

* `new_list = [expression for item in iterable (if conditional)]`

Placing the conditional at the end is considered a filter clause, which filters out of the result items for which the test is not true. In other words, using `if` at the end skips an iterable’s items for which the `if` clause is not true.
We could also apply conditional logic in the comprehension expression itself, using a conditional expression (which requires an `else` branch):

* `new_list = [(value_if_true if conditional else value_if_false) for item in iterable (if conditional)]`
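
To make the difference between the two placements concrete, here is a minimal sketch. The list `nums` and the result names are illustrative and not part of the original notebook. A conditional in the expression position must be a full `if ... else` conditional expression, while the trailing `if` is only a filter.

```python
nums = [1, 2, 3, 4, 5, 6]  # illustrative data

# Filter clause: `if` at the end drops items that fail the test
evens = [n for n in nums if n % 2 == 0]

# Conditional expression: `if ... else` at the front keeps every item
# but chooses which value to emit for each one
labels = ['even' if n % 2 == 0 else 'odd' for n in nums]

# Both together: the filter runs first, then the emitted value is chosen
mixed = [n * 10 if n % 2 == 0 else n for n in nums if n > 2]

print(evens)   # [2, 4, 6]
print(labels)  # ['odd', 'even', 'odd', 'even', 'odd', 'even']
print(mixed)   # [3, 40, 5, 60]
```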

For nested `for`, we can use the following syntax:

* `new_list = [inner expression for outer in outer_iterable for inner in inner_iterable]`

This is equivalent to the following nested for loop:

In [2]:
new_list = []
for outer in 'abc':
    for inner in 'def':
        new_list.append(outer + inner)
new_list

['ad', 'ae', 'af', 'bd', 'be', 'bf', 'cd', 'ce', 'cf']

In [3]:
new_list_comp = [outer + inner for outer in 'abc' for inner in 'def']
new_list_comp

['ad', 'ae', 'af', 'bd', 'be', 'bf', 'cd', 'ce', 'cf']

We can code any number of nested for loops in a list comprehension, and each may have an optional associated if test to act as a filter. The general structure of list comprehensions is as follows:

<p align="center">
  <img width="600" height="250" src="images/nested_list_comp.png">
</p>

## List Comprehension vs For loop

### Example 1

In [4]:
res = [x + y for x in [0, 1, 2] for y in [100, 200, 300]]
res

[100, 200, 300, 101, 201, 301, 102, 202, 302]

The mapping for the nested list comprehension is as follows:

* expression is x + y

* target1 is x and target2 is y

* iterable1 is `[0, 1, 2]` and iterable2 is `[100, 200, 300]`

This is equivalent to the following nested for loop:

In [5]:
res = []
for x in [0, 1, 2]:
    for y in [100, 200, 300]:
        res.append(x + y)
res

[100, 200, 300, 101, 201, 301, 102, 202, 302]

### Example 2

Although list comprehensions construct list results, they can iterate over any sequence or other iterable type. For instance, `str`:

In [6]:
res = [x + y for x in 'spam' for y in 'SPAM']
print(res, end = ' ')

['sS', 'sP', 'sA', 'sM', 'pS', 'pP', 'pA', 'pM', 'aS', 'aP', 'aA', 'aM', 'mS', 'mP', 'mA', 'mM'] 

In [7]:
res = []
for x in 'spam':
    for y in 'SPAM':
        res.append(x + y)
print(res, end = ' ')

['sS', 'sP', 'sA', 'sM', 'pS', 'pP', 'pA', 'pM', 'aS', 'aP', 'aA', 'aM', 'mS', 'mP', 'mA', 'mM'] 

### Example 3

In [8]:
res_list_comp = [x + y + z for x in 'spam' for y in 'SPAM' for z in 'yang']
print(res_list_comp, end = ' ')

['sSy', 'sSa', 'sSn', 'sSg', 'sPy', 'sPa', 'sPn', 'sPg', 'sAy', 'sAa', 'sAn', 'sAg', 'sMy', 'sMa', 'sMn', 'sMg', 'pSy', 'pSa', 'pSn', 'pSg', 'pPy', 'pPa', 'pPn', 'pPg', 'pAy', 'pAa', 'pAn', 'pAg', 'pMy', 'pMa', 'pMn', 'pMg', 'aSy', 'aSa', 'aSn', 'aSg', 'aPy', 'aPa', 'aPn', 'aPg', 'aAy', 'aAa', 'aAn', 'aAg', 'aMy', 'aMa', 'aMn', 'aMg', 'mSy', 'mSa', 'mSn', 'mSg', 'mPy', 'mPa', 'mPn', 'mPg', 'mAy', 'mAa', 'mAn', 'mAg', 'mMy', 'mMa', 'mMn', 'mMg'] 

In [9]:
res_loop = []
for x in 'spam':
    for y in 'SPAM':
        for z in 'yang':
            res_loop.append(x + y + z)
print(res_loop, end = ' ')

['sSy', 'sSa', 'sSn', 'sSg', 'sPy', 'sPa', 'sPn', 'sPg', 'sAy', 'sAa', 'sAn', 'sAg', 'sMy', 'sMa', 'sMn', 'sMg', 'pSy', 'pSa', 'pSn', 'pSg', 'pPy', 'pPa', 'pPn', 'pPg', 'pAy', 'pAa', 'pAn', 'pAg', 'pMy', 'pMa', 'pMn', 'pMg', 'aSy', 'aSa', 'aSn', 'aSg', 'aPy', 'aPa', 'aPn', 'aPg', 'aAy', 'aAa', 'aAn', 'aAg', 'aMy', 'aMa', 'aMn', 'aMg', 'mSy', 'mSa', 'mSn', 'mSg', 'mPy', 'mPa', 'mPn', 'mPg', 'mAy', 'mAa', 'mAn', 'mAg', 'mMy', 'mMa', 'mMn', 'mMg'] 

In [10]:
res_loop == res_list_comp

True

### Example 4 (with if statements)

In [11]:
res_list_comp = [x + y + z for x in 'spam' if x not in ['s', 'm'] 
                           for y in 'SpAm9df32d' if y.isdigit()
                           for z in 'YaNgdD' if z.isupper()]
print(res_list_comp, end = ' ')

['p9Y', 'p9N', 'p9D', 'p3Y', 'p3N', 'p3D', 'p2Y', 'p2N', 'p2D', 'a9Y', 'a9N', 'a9D', 'a3Y', 'a3N', 'a3D', 'a2Y', 'a2N', 'a2D'] 

In [12]:
res_loop = []
for x in 'spam':
    if x not in ['s', 'm']:
        for y in 'SpAm9df32d':
            if y.isdigit():
                for z in 'YaNgdD':
                    if z.isupper():
                        res_loop.append(x + y + z)
print(res_loop, end = ' ')

['p9Y', 'p9N', 'p9D', 'p3Y', 'p3N', 'p3D', 'p2Y', 'p2N', 'p2D', 'a9Y', 'a9N', 'a9D', 'a3Y', 'a3N', 'a3D', 'a2Y', 'a2N', 'a2D'] 

In [13]:
res_loop == res_list_comp

True

### Example 5 (numeric)

In [14]:
res_list_comp = [(x, y) for x in range(5) if x % 2 == 0 for y in range(5) if y % 2 == 1]
print(res_list_comp, end = ' ')

[(0, 1), (0, 3), (2, 1), (2, 3), (4, 1), (4, 3)] 

In [15]:
res_loop = []
for x in (0, 1, 2, 3, 4):
    if x % 2 == 0:
        for y in (0, 1, 2, 3, 4):
            if y % 2 == 1:
                res_loop.append((x, y))
print(res_loop, end = ' ')

[(0, 1), (0, 3), (2, 1), (2, 3), (4, 1), (4, 3)] 

In [16]:
res_loop == res_list_comp

True

## List Comprehension and Matrices

Matrices (multidimensional arrays) are often coded as nested lists:

In [17]:
M = [[1, 2, 3], 
     [4, 5, 6], 
     [7, 8, 9]]
N = [[2, 2, 2], 
     [3, 3, 9], 
     [4, 4, 4]]

In [18]:
# Row 3
M[2]
# Second row third column
N[1][2]

[7, 8, 9]

9

### Getting columns and rows using list comprehension

Processing matrices coded as above essentially means iterating over the list elements (which represent the rows in a matrix). 

In [19]:
# Since each row is a nested sublist in M
[row for row in M]

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

In [20]:
[row for row in N]

[[2, 2, 2], [3, 3, 9], [4, 4, 4]]

In [21]:
# Columns
[row[0] for row in M]
[row[1] for row in M]
[row[2] for row in M]

[1, 4, 7]

[2, 5, 8]

[3, 6, 9]

In [22]:
[row[0] for row in N]
[row[1] for row in N]
[row[2] for row in N]

[2, 3, 4]

[2, 3, 4]

[2, 9, 4]

### Using offset to get columns

In [23]:
[M[row][0] for row in range(len(M))]
[M[row][1] for row in range(len(M))]
[M[row][2] for row in range(len(M))]

[1, 4, 7]

[2, 5, 8]

[3, 6, 9]
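
The per-column comprehensions above can be folded into one nested comprehension that collects every column, which amounts to transposing the matrix. A minimal sketch, assuming all rows have the same length (the names `columns` and `columns_zip` are illustrative):

```python
M = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

# Outer comprehension walks the column offsets, inner one walks the rows
columns = [[row[j] for row in M] for j in range(len(M[0]))]
print(columns)  # [[1, 4, 7], [2, 5, 8], [3, 6, 9]]

# zip(*M) groups the j-th items of every row, giving the same columns as tuples
columns_zip = [list(col) for col in zip(*M)]
print(columns_zip == columns)  # True
```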

### Pulling out diagonal

In [24]:
# For square matrices
[M[i][i] for i in range(len(M))]
[N[i][i] for i in range(len(N))]

[1, 5, 9]

[2, 3, 4]

In [25]:
# Or, less efficiently but more explicitly, we could use zip
[M[i][j] for i, j in zip(range(len(M)), range(len(M)))]
[N[i][j] for i, j in zip(range(len(N)), range(len(N)))]

[1, 5, 9]

[2, 3, 4]

### Off-diagonal

In [26]:
M = [[64, 2, 3, 23, 0], 
     [4, 5, 6, 21, 8], 
     [7, 8, 9, 2, 42],
     [9, 3, 28, 90, 4],
     [72, 54, 2, 2, 3]]
# Forward 
[M[i][len(M) - 1 - i] for i in range(len(M))]
# Or backward
[M[i][- 1 - i] for i in range(len(M))]

[0, 21, 9, 3, 72]

[0, 21, 9, 3, 72]

In the forward approach, we take the length of the square matrix (5) and subtract 1 to obtain the offset of the last element of the first row (offset 0), since Python uses 0-based indexing. When $i$ becomes 1, we get the second-to-last element of the second row (offset 1), and so on. In the backward approach, we begin at offset -1, which is the last element, and decrease the offset by 1 each time we increase the row index $i$.
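
The two indexing schemes can be laid side by side. The following sketch lists the forward and backward column offsets used for each row and checks that they address the same position. The names `n`, `offsets`, and `same` are illustrative.

```python
M = [[64, 2, 3, 23, 0],
     [4, 5, 6, 21, 8],
     [7, 8, 9, 2, 42],
     [9, 3, 28, 90, 4],
     [72, 54, 2, 2, 3]]
n = len(M)

# (row offset, forward column offset, backward column offset) for each row
offsets = [(i, n - 1 - i, -1 - i) for i in range(n)]
print(offsets)
# [(0, 4, -1), (1, 3, -2), (2, 2, -3), (3, 1, -4), (4, 0, -5)]

# bwd % n converts a negative offset to its non-negative equivalent,
# so the two schemes pick the same column in every row
same = all(fwd == bwd % n for _, fwd, bwd in offsets)
print(same)  # True
```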

### Update matrix in place

We can use offset assignment to update a matrix in place:

In [27]:
# A 7 x 5 matrix
M = [[64, 2, 3, 23, 0], 
     [4, 5, 6, 21, 8], 
     [7, 8, 9, 2, 42],
     [9, 3, 28, 90, 4],
     [72, 54, 2, 2, 3],
     [23, 33, 41, 9, 3],
     [0, 3, 99, 4, 10]]
N = M
# Using nested for loop
for i in range(len(M)):
    # Only update odd rows (0 based indexing so 0, 2, 4, 6 are actually odd rows)
    if i % 2 == 0:
        for j in range(len(M[i])):
            # Only update even columns (1, 3, 5, 7, etc.)
            if j % 2 == 1:
                N[i][j] += 5
N

[[64, 7, 3, 28, 0],
 [4, 5, 6, 21, 8],
 [7, 13, 9, 7, 42],
 [9, 3, 28, 90, 4],
 [72, 59, 2, 7, 3],
 [23, 33, 41, 9, 3],
 [0, 8, 99, 9, 10]]

In [28]:
# M is updated too, since N references the same object
N is M

True
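
If the original matrix needs to be preserved, one option is to build an independent copy with a list comprehension before updating. A minimal sketch on a small illustrative matrix (the names `alias` and `row_copy` are not from the original):

```python
M = [[1, 2], [3, 4]]  # small illustrative matrix

alias = M                         # second name for the same outer list
row_copy = [row[:] for row in M]  # new outer list holding a copy of each row

alias[0][0] += 5     # visible through M, since alias is M
row_copy[1][1] += 5  # not visible through M

print(alias is M)     # True
print(row_copy is M)  # False
print(M)              # [[6, 2], [3, 4]]
print(row_copy)       # [[1, 2], [3, 9]]
```

Copying each row with `row[:]` matters here: copying only the outer list would still share the inner row lists with `M`.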

With a list comprehension, we can achieve a similar effect, but technically we are creating a new list:

In [29]:
[[col - 5 if j % 2 == 1 and i % 2 == 0 else col for j, col in enumerate(row)]
                                                 for i, row in enumerate(M)]

[[64, 2, 3, 23, 0],
 [4, 5, 6, 21, 8],
 [7, 8, 9, 2, 42],
 [9, 3, 28, 90, 4],
 [72, 54, 2, 2, 3],
 [23, 33, 41, 9, 3],
 [0, 3, 99, 4, 10]]

As can be seen, we obtain the original matrix back. The expression does the following:

* Each `row` in `M` is a sublist representing a row of the matrix, and `enumerate(M)` pairs it with its offset `i`

* Each `col` in `row` is an entry in that row, and `enumerate(row)` pairs it with its offset `j`

* We subtract 5 from the value represented by the variable `col` if two conditions are true:

    - `j % 2 == 1` means that the positional offset of `col` in `row` must be 1 or 3, i.e. the even columns
    
    - `i % 2 == 0` means that the positional offset of `row` in `M` must be 0, 2, 4, or 6, i.e. the odd rows
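
A minimal sketch of how `list.index` and `enumerate` differ when a row contains a repeated value: `list.index` reports the offset of the first equal item, while `enumerate` reports each item's own offset. The row below is the third row of the updated matrix, and the result names are illustrative.

```python
row = [7, 13, 9, 7, 42]  # the value 7 appears at offsets 0 and 3

# list.index returns the offset of the first equal value
by_index = [row.index(col) for col in row]
print(by_index)  # [0, 1, 2, 0, 4]

# enumerate yields the actual offset of every item
by_enumerate = [j for j, col in enumerate(row)]
print(by_enumerate)  # [0, 1, 2, 3, 4]

# Subtracting 5 at odd offsets: only the enumerate version reaches offset 3
undo_index = [col - 5 if row.index(col) % 2 == 1 else col for col in row]
undo_enum = [col - 5 if j % 2 == 1 else col for j, col in enumerate(row)]
print(undo_index)  # [7, 8, 9, 7, 42]
print(undo_enum)   # [7, 8, 9, 2, 42]
```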

### Matrix multiplication

In [30]:
# M is 3 x 2
M = [[2, 4],
     [9, 9],
     [7, 3]]
# N is 2 x 3
N = [[9, 7, 5],
     [8, 6, 3]]
# Resultant matrix is 3 x 3
MN = [[0 for j in range(len(N[1]))] for i in range(len(M))]
MN

[[0, 0, 0], [0, 0, 0], [0, 0, 0]]

Using for loop:

In [31]:
# Each i is an offset representing row index in M (three rows from 0 to 2)
for i in range(len(M)):
    # Each j is an offset representing column index in N (three columns in N from 0 to 2)
    for j in range(len(N[0])):
        # Each k is an offset representing column index in M and row index in N (two values from 0 to 1)
        for k in range(len(N)):
            # M is 3 x 2 and N is 2 x 3
            # Dot product
            MN[i][j] += M[i][k] * N[k][j]

In [32]:
MN

[[50, 38, 22], [153, 117, 72], [87, 67, 44]]

To understand, note the following:

* For M (3 x 2), the first offset must be less than 3 and the second offset less than 2, so they are `i` and `k`

* For N (2 x 3), the first offset must be less than 2 and the second less than 3, so they are `k` and `j`

* The resultant matrix MN is indexed by `i` (row of M) and `j` (column of N) since matrix multiplication is row times column
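
The three points above correspond to the usual definition of the matrix product, written here with offsets starting at 0 to match the code:

$$
(MN)_{ij} = \sum_{k=0}^{1} M_{ik} \, N_{kj}, \qquad i, j \in \{0, 1, 2\}
$$

As a worked entry, the top-left element is

$$
(MN)_{00} = M_{00} N_{00} + M_{01} N_{10} = 2 \cdot 9 + 4 \cdot 8 = 50
$$

which agrees with `MN[0][0]` in the output above.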

The brute force approach above can be replaced by a list comprehension:

In [33]:
# List comprehension (no pre-filled result matrix is needed)
MN_list_comp = [[sum([x * y for (x, y) in zip(row, col)]) for col in zip(*N)] for row in M]
MN_list_comp

[[50, 38, 22], [153, 117, 72], [87, 67, 44]]

To understand:

* Each `row in M` is a list representing a particular row in matrix M

* Each `col in zip(*N)` is a tuple representing a particular column in N

To understand `zip(*N)`:

In [34]:
list(zip(*N))

[(9, 8), (7, 6), (5, 3)]

Then:

* Since there are two elements in each `row in M` (two columns in each row) and two elements in each tuple `col in zip(*N)` (two rows in each column), `zip(row, col)` pairs them up one to one.

* Then each x in a particular row in M is multiplied by the corresponding y in a particular column in N, and the products are collected by the list comprehension `[x * y for (x, y) in zip(row, col)]`

* This list is then summed using the `sum` function, which gives the dot product of that row of M and that column of N
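
The steps above can be traced for a single entry. The sketch below computes only the top-left element of the product, with the intermediate names (`pairs`, `products`, `entry`) introduced purely for illustration:

```python
M = [[2, 4],
     [9, 9],
     [7, 3]]
N = [[9, 7, 5],
     [8, 6, 3]]

row = M[0]              # first row of M: [2, 4]
col = list(zip(*N))[0]  # first column of N: (9, 8)

pairs = list(zip(row, col))           # [(2, 9), (4, 8)]
products = [x * y for x, y in pairs]  # [18, 32]
entry = sum(products)                 # 50, the top-left entry of MN

print(pairs, products, entry)
```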

In [35]:
# Lastly, using numpy functions
import numpy as np
np.matmul(M, N)
np.dot(M, N)

array([[ 50,  38,  22],
       [153, 117,  72],
       [ 87,  67,  44]])

array([[ 50,  38,  22],
       [153, 117,  72],
       [ 87,  67,  44]])

## Scope

An important aspect of list comprehensions is that their loop variables are local to the comprehension in Python 3.x. In contrast, `for` loop statements never localize their variables to the statement block in any Python version. In other words, list comprehension variables do not clash with bindings of the same name in the enclosing scope:

In [36]:
X = 99
for X in range(5):
    pass
# Binding from X to the integer object 99 in the global environment is not retained
X

4

In [37]:
Y = 99
[Y for Y in range(5)]

[0, 1, 2, 3, 4]

In [38]:
# Binding is retained
Y

99
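
The same behavior holds inside functions. The following sketch, assuming Python 3.x, wraps both cases in small helper functions. The function names are hypothetical and chosen only for this illustration.

```python
def loop_leaks():
    i = 'outer'
    for i in range(3):
        pass
    return i  # the loop rebinds i in the enclosing function scope


def comprehension_does_not_leak():
    i = 'outer'
    squares = [i * i for i in range(3)]
    return i, squares  # i is untouched by the comprehension


def target_is_undefined_afterwards():
    [k for k in range(3)]
    try:
        k  # k exists only inside the comprehension
        return False
    except NameError:
        return True


print(loop_leaks())                      # 2
print(comprehension_does_not_leak())     # ('outer', [0, 1, 4])
print(target_is_undefined_afterwards())  # True
```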