# Python Concepts for Data Science: List Comprehension

![List Comprehension](list_comprehension.png) Credit: [Buggy Programmer](http://buggyprogrammer.com)

Lists are one of the most used data structures in Python because they let us store data in an easy-to-handle way. The elements of a list can be explicitly written out one after another by the programmer, but as the number of elements grows, this task becomes tedious. One might need an automated list generation technique, hence the concept of list comprehension: a simple and powerful way of creating a list from any existing iterable object.

In this post, I will first describe the different ways to write list comprehensions, with examples, and then state when to avoid them, since every concept, however powerful, has its own limitations. I will also include the equivalent code using **for loops** and compare the two by execution time.

## 1. Simple list comprehension

### General syntax of list comprehension
A list comprehension is composed of the following items enclosed in square brackets:
- An output expression
- A collection (any iterable) to loop over
- An optional condition
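
As a minimal sketch of how these parts fit together, the snippet below labels each one in a single comprehension. The list `numbers` and the name `doubled_evens` are hypothetical and chosen only for illustration.

```python
# Hypothetical sample data, used only to illustrate the syntax
numbers = [1, 2, 3, 4, 5, 6]

# [ output   for item in collection   if condition ]
doubled_evens = [n * 2 for n in numbers if n % 2 == 0]

print(doubled_evens)  # [4, 8, 12]
```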

A simple list comprehension doesn't contain a condition. Let's create one, along with its equivalent `for` loop, and measure the runtime of each. So that the execution times are large enough to compare, we will work with long lists and print only the first 10 elements of each.

In [30]:
import time

# number of iterations
n_iter = 100000

# list comprehension
start = time.time()
x = [i for i in range(n_iter)]
end = time.time()
print(f'x = {x[:10]}')
print(f'Execution time of list comprehension : {(end - start):.2f} seconds\n')

# For loop
start = time.time()
x = []
for i in range(n_iter):
    x.append(i)
end = time.time()
print(f'x = {x[:10]}')
print(f'Execution time of for loop : {(end - start):.2f} seconds')

x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Execution time of list comprehension : 0.01 seconds

x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Execution time of for loop : 0.05 seconds


In [84]:
# list comprehension
start = time.time()
x = [i**2 for i in range(n_iter)]
end = time.time()
print(f'x = {x[:10]}')
print(f'Execution time of list comprehension : {(end - start):.2f} seconds\n')

# For loop
start = time.time()
x = []
for i in range(n_iter):
    x.append(i**2)
end = time.time()
print(f'x = {x[:10]}')
print(f'Execution time of for loop : {(end - start):.2f} seconds')

x = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
Execution time of list comprehension : 0.08 seconds

x = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
Execution time of for loop : 0.11 seconds
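
A single `time.time()` measurement can be noisy, so the numbers above may vary from run to run. One possible way to get steadier figures is the standard-library `timeit` module, which repeats each snippet several times. The sketch below is an assumption about how one might set this up, not the method used for the timings above; the function names `with_comprehension` and `with_for_loop` are hypothetical.

```python
import timeit

n_iter = 100000  # same list length as in the cells above

def with_comprehension():
    return [i**2 for i in range(n_iter)]

def with_for_loop():
    x = []
    for i in range(n_iter):
        x.append(i**2)
    return x

n_calls = 10
# Do 3 repeats of 10 calls each and keep the fastest repeat
t_comp = min(timeit.repeat(with_comprehension, number=n_calls, repeat=3))
t_loop = min(timeit.repeat(with_for_loop, number=n_calls, repeat=3))

print(f'List comprehension : {t_comp / n_calls:.4f} seconds per call')
print(f'For loop           : {t_loop / n_calls:.4f} seconds per call')
```

The exact values depend on the machine and Python version, so no expected output is shown here.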


## 2. List Comprehension with *If statement*
### a. Single *If statement*

In the example below, we create a list of all the non-negative integers less than `n_iter` that are divisible by `3`.

In [86]:
# list comprehension with if statement
start = time.time()
x = [i for i in range(n_iter) if i%3 == 0]
end = time.time()
print(f'x = {x[:10]}')
print(f'Execution time of list comprehension : {(end - start):.2f} seconds\n')

# Equivalent for loop 
start = time.time()
x = []
for i in range(n_iter):
    if i%3 == 0:
        x.append(i)
end = time.time()
print(f'x = {x[:10]}')
print(f'Execution time of for loop : {(end - start):.2f} seconds')

x = [0, 3, 6, 9, 12, 15, 18, 21, 24, 27]
Execution time of list comprehension : 0.03 seconds

x = [0, 3, 6, 9, 12, 15, 18, 21, 24, 27]
Execution time of for loop : 0.05 seconds


### b. Nested *If statement*

Here the elements of the list must meet two different conditions at the same time. The elements of our list must not only be divisible by `3`, but also by `7`.

In [33]:
# list comprehension with nested if statements
start = time.time()
x = [i for i in range(n_iter) if i%3 == 0 if i%7 == 0]
end = time.time()
print(f'x = {x[:10]}')
print(f'Execution time of list comprehension : {(end - start):.2f} seconds\n')

# Equivalent for loop 
start = time.time()
x = []
for i in range(n_iter):
    if i%3 == 0:
        if i%7 == 0:
            x.append(i)
end = time.time()
print(f'x = {x[:10]}')
print(f'Execution time of for loop : {(end - start):.2f} seconds')

x = [0, 21, 42, 63, 84, 105, 126, 147, 168, 189]
Execution time of list comprehension : 0.03 seconds

x = [0, 21, 42, 63, 84, 105, 126, 147, 168, 189]
Execution time of for loop : 0.05 seconds
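
Two `if` clauses written back to back behave like a single condition joined with `and`. The short check below is a sketch of that equivalence; the names `chained`, `combined` and `by_21` are hypothetical.

```python
n_iter = 100000

# Two chained if clauses, as in the cell above
chained = [i for i in range(n_iter) if i % 3 == 0 if i % 7 == 0]

# The same filter written as one condition with `and`
combined = [i for i in range(n_iter) if i % 3 == 0 and i % 7 == 0]

# 3 and 7 share no common factor, so this is divisibility by 21
by_21 = [i for i in range(n_iter) if i % 21 == 0]

print(chained == combined == by_21)  # True
print(combined[:5])                  # [0, 21, 42, 63, 84]
```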


## 3. List comprehension with *if...else* statement
### a. Single *if...else* statement


In [39]:
c = [x if x > 5 else x**2 for x in range(n_iter)]
c[:10]

[0, 1, 4, 9, 16, 25, 6, 7, 8, 9]

In [87]:
# Equivalent to
c = []
for x in range(n_iter):
    if x > 5:
        c.append(x)
    else: 
        c.append(x**2)
c[:10]

[0, 1, 4, 9, 16, 25, 6, 7, 8, 9]
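
Note that the position of the `if` matters. An `if...else` placed before the `for` is a conditional expression that chooses the output for every element, while an `if` placed after the `for` filters elements out. The sketch below contrasts the two on a small range; the variable names are hypothetical.

```python
values = range(10)

# if...else before `for`: picks the output, every element is kept
transformed = [x if x > 5 else x**2 for x in values]

# if after `for`: filters, only elements passing the test are kept
filtered = [x for x in values if x > 5]

# Both together: keep even numbers, then pick the output for each
mixed = [x if x > 5 else x**2 for x in values if x % 2 == 0]

print(transformed)  # [0, 1, 4, 9, 16, 25, 6, 7, 8, 9]
print(filtered)     # [6, 7, 8, 9]
print(mixed)        # [0, 4, 16, 6, 8]
```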

### b. Multiple *if...else* statement

In the example below we create a list comprehension `divisors` with three chained `if...else` expressions that works as follows:

For each element in `multiples`, append
- `'two'` if the element is divisible by `2` but not by `3`
- `'three'` if the element is divisible by `3` but not by `2`
- `'both'` if the element is divisible by both
- `'neither'` if the element is divisible by neither `2` nor `3`.

In [88]:
multiples = [0, 54, 86, 1, 5, 9, 2, 45, 6, 75, 23, 14, 5, 65, 81, 60]
divisors = ['two' if (x%2==0 and x%3!=0) else "three" if (x%3==0 and x%2!=0) else 'both' if (x%3==0 and x%2==0)  else 'neither' for x in multiples]
print(divisors)

['both', 'both', 'two', 'neither', 'neither', 'three', 'two', 'three', 'both', 'three', 'neither', 'two', 'neither', 'neither', 'three', 'both']


In [89]:
# Equivalent
divisors = []
for x in multiples:
    if x%2 == 0 and x%3 != 0:
        divisors.append('two')
    elif x%3 == 0 and x%2 != 0:
        divisors.append("three")
    elif x%3 == 0 and x%2 == 0:
        divisors.append("both")
    else: 
        divisors.append('neither')

print(divisors)

['both', 'both', 'two', 'neither', 'neither', 'three', 'two', 'three', 'both', 'three', 'neither', 'two', 'neither', 'neither', 'three', 'both']
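
When the chain of `if...else` expressions gets this long, one option is to move the decision into a small function and keep the comprehension short. The helper `classify` below is hypothetical and only sketches that idea; it is meant to produce the same labels as the cells above.

```python
def classify(x):
    """Hypothetical helper: label x by its divisibility by 2 and 3."""
    by_two = x % 2 == 0
    by_three = x % 3 == 0
    if by_two and by_three:
        return 'both'
    if by_two:
        return 'two'
    if by_three:
        return 'three'
    return 'neither'

multiples = [0, 54, 86, 1, 5, 9, 2, 45, 6, 75, 23, 14, 5, 65, 81, 60]
divisors = [classify(x) for x in multiples]
print(divisors)
```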


## 4. List comprehension with nested *for loops*

In the example below, we create a list `target` containing every pair `(x, y)` with `x` taken from `src1` and `y` taken from `src2`.

In [90]:
src1 = [i for i in range(3)]
src2 = [j for j in range(2)]
target = [(x,y) for x in src1 for y in src2]
print(target)

[(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1)]


In [91]:
# Equivalent
target = []
for x in src1:
    for y in src2:
        target.append((x,y))
print(target)

[(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1)]
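
In a comprehension with several `for` clauses, the leftmost `for` is the outer loop and the rightmost is the inner one. The standard-library function `itertools.product` yields pairs in the same order, so it can serve as a cross-check. The sketch below assumes the same two source lists; the names `nested` and `via_product` are hypothetical.

```python
from itertools import product

src1 = list(range(3))
src2 = list(range(2))

# Leftmost `for` is the outer loop, rightmost is the inner loop
nested = [(x, y) for x in src1 for y in src2]

# product() iterates its last argument fastest, like the inner loop
via_product = list(product(src1, src2))

print(nested == via_product)  # True
```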
