# List comprehensions 

List comprehensions in Python are concise syntactic constructs. They build a new list by applying an expression to each
element of an iterable, optionally filtering the elements along the way. The following section explains and
demonstrates their use, together with the related dictionary, set, and generator forms.

## 1) List comprehensions 

A list comprehension creates a new list by applying an expression to each element of an iterable. The most basic
form is:

[<expression> for <element> in <iterable>]

There's also an optional 'if' condition: 

[<expression> for <element> in <iterable> if <condition>]

Each <element> in the <iterable> is plugged into the <expression> if the (optional) <condition> evaluates to true.
All results are returned at once in the new list. Unlike generator expressions, which are evaluated lazily, a list
comprehension consumes the entire iterable immediately, using memory proportional to its length.

To create a list of squared integers: 

In [1]:
squares = [x * x for x in (1, 2, 3, 4)]

The for clause sets x to each value of (1, 2, 3, 4) in turn. The result of the expression x * x is appended
to an internal list, which is assigned to the variable squares once the iteration completes.

Apart from being faster, the list comprehension above is roughly equivalent to the following for loop:

In [2]:
squares = []
for x in (1, 2, 3, 4):
    squares.append(x * x)

In [3]:
squares

[1, 4, 9, 16]

The expression applied to each element can be as complex as needed: 

In [4]:
# Get a list of uppercase characters from a string
[s.upper() for s in "Hello World"]
# ['H', 'E', 'L', 'L', 'O', ' ', 'W', 'O', 'R', 'L', 'D']
# Strip off any commas from the end of strings in a list
[w.strip(",") for w in ["these,", "words,,", "mostly", "have,commas,"]]
# ['these', 'words', 'mostly', 'have,commas']
# Organize letters in words more reasonably - in an alphabetical order
sentence = "Beautiful is better than ugly"
["".join(sorted(word, key=lambda x: x.lower())) for word in sentence.split()]
# ['aBefiltuu', 'is', 'beertt', 'ahnt', 'gluy']

['aBefiltuu', 'is', 'beertt', 'ahnt', 'gluy']

##### else

else can be used in a list comprehension, but be careful with the syntax. The if/else must come before the for clause, not after it:

In [5]:
[x for x in 'apple' if x in 'aeiou' else '*']

SyntaxError: invalid syntax (3610056486.py, line 1)

In [6]:
[x if x in "aeiou" else "*" for x in "apple"]

['a', '*', '*', '*', 'e']

Note that this uses a different language construct, the conditional expression, which is not itself part of the comprehension syntax. The if that follows the for ... in clause, by contrast, is part of the comprehension syntax and is used to filter elements from the source iterable.
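
To make the distinction concrete, the following sketch runs both forms over the same string. The variable names are illustrative only and do not appear elsewhere in this section.

```python
word = "apple"

# Filtering 'if' (after the for clause): non-matching items are dropped.
vowels_only = [ch for ch in word if ch in "aeiou"]

# Conditional expression (before the for clause): every item is kept,
# but the value that is stored depends on the condition.
masked = [ch if ch in "aeiou" else "*" for ch in word]

print(vowels_only)  # ['a', 'e']
print(masked)       # ['a', '*', '*', '*', 'e']
```

The filtered list is shorter than the input, while the list built with the conditional expression always has the same length as the input.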

##### Double Iteration

The order of a double iteration [... for x in ... for y in ...] can look either natural or counter-intuitive. The rule of thumb is to follow the equivalent nested for loops:

In [7]:
def foo(i): 
    return i, i+0.5 
for i in range(3):
    for x in foo(i): 
        print(str(x)) 


0
0.5
1
1.5
2
2.5


Written as a single comprehension, this becomes:

In [8]:
[str(x) for i in range(3) for x in foo(i)]

['0', '0.5', '1', '1.5', '2', '2.5']


##### In-place Mutation and Other Side Effects 

Before using a list comprehension, understand the difference between functions called for their side effects (mutating, or in-place, functions), which usually return None, and functions that return an interesting value.

Many functions (especially pure functions) simply take an object and return some object. An in-place function
modifies the existing object, which is called a side effect. Other examples of side effects include input and output
operations such as printing.

list.sort() sorts a list in-place (meaning that it modifies the original list) and returns the value None. Therefore, it
won't work as expected in a list comprehension:

In [9]:
[x.sort() for x in [[2,1], [4,3], [0,1]]] 

[None, None, None]

Instead, sorted() returns a sorted list rather than sorting in-place: 

In [10]:
[sorted(x) for x in [[2,1], [4,3], [0,1]]]

[[1, 2], [3, 4], [0, 1]]

Using comprehensions for side effects, such as I/O or in-place functions, is possible, yet a for loop is usually more
readable. For example, this works in Python 3:

In [11]:
[print(x) for x in (1,2,3)]

1
2
3


[None, None, None]

Instead use: 


In [12]:
for x in (1,2,3): 
    print(x)

1
2
3


In some situations, functions with side effects are suitable for a list comprehension. random.randrange() has the side
effect of changing the state of the random number generator, but it also returns an interesting value. Similarly,
next() can be called on an iterator.

The following random value generator is not pure, yet it makes sense in a comprehension because every evaluation of
the expression draws a new value from the generator:

In [14]:
from random import randrange 
[randrange(1,7) for _ in range(10)]

[1, 5, 4, 4, 3, 6, 5, 3, 3, 2]

In [15]:
[randrange(1,20) for _ in range(100)]

[14,
 3,
 8,
 4,
 17,
 9,
 8,
 11,
 14,
 4,
 19,
 1,
 3,
 8,
 1,
 2,
 10,
 14,
 7,
 19,
 17,
 14,
 15,
 16,
 1,
 18,
 6,
 8,
 9,
 8,
 14,
 1,
 19,
 8,
 19,
 7,
 8,
 12,
 2,
 8,
 16,
 13,
 14,
 11,
 2,
 19,
 16,
 3,
 17,
 17,
 4,
 4,
 2,
 17,
 5,
 11,
 16,
 5,
 4,
 8,
 4,
 19,
 4,
 10,
 5,
 3,
 8,
 5,
 10,
 18,
 18,
 1,
 16,
 9,
 8,
 10,
 14,
 6,
 8,
 11,
 10,
 14,
 2,
 11,
 6,
 12,
 17,
 19,
 7,
 4,
 12,
 9,
 13,
 5,
 9,
 17,
 19,
 2,
 8,
 8]
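
The remark about next() can be sketched in the same way. The example below is a minimal illustration with made-up names: each evaluation of the expression advances a shared iterator, which is a side effect, and also returns a useful value.

```python
numbers = iter(range(10))

# Each evaluation of next(numbers) advances the shared iterator,
# so the comprehension consumes its first three values.
first_three = [next(numbers) for _ in range(3)]

# Whatever has not been consumed yet is still available afterwards.
remaining = list(numbers)

print(first_three)  # [0, 1, 2]
print(remaining)    # [3, 4, 5, 6, 7, 8, 9]
```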

##### Whitespace in list comprehensions 

More complicated list comprehensions can reach an undesired length, or become less readable. Although less common in examples, it is possible to break a list comprehension into multiple lines like so: 

In [16]:
[
    x for x in 'foo' if x not in 'bar' 
]

['f', 'o', 'o']

## 2) Conditional List comprehensions 

Given a list comprehension, you can append one or more if conditions to filter values.

[<expression> for <element> in <iterable> if <condition>]

For each <element> in <iterable>, if <condition> evaluates to True, add <expression> (usually a function of
<element>) to the returned list.

For example, this can be used to extract only even numbers from a sequence of integers: 

In [17]:
[x for x in range(10) if x%2 == 0] 

[0, 2, 4, 6, 8]

The above code is equivalent to: 

In [18]:
even_numbers = []

In [19]:
for x in range(10):
    if x%2 == 0: 
        even_numbers.append(x) 


In [20]:
print(even_numbers) 

[0, 2, 4, 6, 8]


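Also, a conditional list comprehension of the form [e for x in y if c] (where e and c are expressions in terms of
x) is equivalent to list(map(lambda x: e, filter(lambda x: c, y))). The filter is applied to the original elements first, and the expression is then mapped over the elements that remain.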

Despite producing the same result, the comprehension is typically faster than the filter/map version (the difference
can approach 2x), largely because it avoids calling a lambda for every element.
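
As a quick check of that equivalence, the sketch below picks a concrete expression (x * 2) and condition (x % 2 == 0); both choices are arbitrary. It also shows why the order of filter and map matters.

```python
y = range(10)

comprehension = [x * 2 for x in y if x % 2 == 0]

# Filter the original elements first, then map the ones that remain.
functional = list(map(lambda x: x * 2, filter(lambda x: x % 2 == 0, y)))

# Mapping first would make the filter test the doubled values instead,
# and every doubled value is even, so nothing is filtered out.
wrong_order = list(filter(lambda x: x % 2 == 0, map(lambda x: x * 2, y)))

print(comprehension)  # [0, 4, 8, 12, 16]
print(functional)     # [0, 4, 8, 12, 16]
print(wrong_order)    # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
```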

Note that this is quite different from the ... if ... else ... conditional expression (sometimes known as a
ternary expression) that you can use for the <expression> part of the list comprehension. Consider the following
example:

In [21]:
[x if x % 2 == 0 else None for x in range(10)]

[0, None, 2, None, 4, None, 6, None, 8, None]

Here the conditional expression isn't a filter, but rather an operator determining the value to be used for the list
items:

<value-if-condition-is-true> if <condition> else <value-if-condition-is-false>

This becomes more obvious if you combine it with other operators:

In [22]:
[2 * (x if x % 2 == 0 else -1) + 1 for x in range(10)]

[1, -1, 5, -1, 9, -1, 13, -1, 17, -1]

The above code is equivalent to: 


In [23]:
numbers = []
for x in range(10):
    if x % 2 == 0:
        temp = x
    else:
        temp = -1
    numbers.append(2 * temp + 1)
print(numbers)

[1, -1, 5, -1, 9, -1, 13, -1, 17, -1]


One can combine ternary expressions and if conditions. The ternary operator works on the filtered result:

In [24]:
[x if x > 2 else '*' for x in range(10) if x%2 == 0] 

['*', '*', 4, 6, 8]

The same result cannot be achieved with the ternary operator alone, because it never removes elements:

In [25]:
[x if (x > 2 and x%2 == 0) else '*' for x in range(10)]

['*', '*', '*', '*', 4, '*', 6, '*', 8, '*']

See also: Filters, which often provide a sufficient alternative to conditional list comprehensions.

## 3) Avoid repetitive and expensive operations using conditional clause 

Consider the below list comprehension: 


In [26]:
import time

def f(x):
    time.sleep(.1)  # simulate an expensive operation
    return x**2

[f(x) for x in range(1000) if f(x) > 10]

[16, 25, 36, 49, 64, 81, 100, 121, 144, 169, ...]

This calls f(x) up to twice for each of the 1,000 values of x: once to check the if condition and, when the condition
holds, once more to generate the value. If f(x) is a particularly expensive operation, this can have significant performance implications.
Worse, if calling f() has side effects, it can have surprising results.
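
The repeated evaluation can be observed by recording each call. The sketch below is illustrative only: it swaps the slow f for a fast stand-in that logs its arguments, and it uses a smaller range so that it runs instantly.

```python
call_log = []

def f(x):
    """Stand-in for an expensive function; records every invocation."""
    call_log.append(x)
    return x ** 2

result = [f(x) for x in range(10) if f(x) > 10]

print(result)         # [16, 25, 36, 49, 64, 81]
print(len(call_log))  # 16: ten calls for the filter plus six for the kept values
```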

Instead, you should evaluate the expensive operation only once for each value of x by generating an intermediate
iterable (generator expression) as follows:

In [28]:
[v for v in (f(x) for x in range(1000)) if v > 10]

[16, 25, 36, 49, 64, 81, 100, 121, 144, 169, ...]

Or, using the builtin map equivalent:

In [29]:
[v for v in map(f, range(1000)) if v > 10]

[16, 25, 36, 49, 64, 81, 100, 121, 144, 169, ...]

Another approach, which can result in more readable code, is to put the partial result (v in the previous example) in a
single-element iterable (such as a list or a tuple) and then iterate over it. Since v is the only element in the iterable,
we get a reference to the output of the slow function, computed only once:

In [30]:
[v for x in range(1000) for v in [f(x)] if v > 10]

[16, 25, 36, 49, 64, 81, 100, 121, 144, 169, ...]

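However, in practice, the logic of code can be more complicated and it's important to keep it readable. In general, a
separate generator function is recommended over a complex one-liner. In the following example, is_prime is assumed to
be defined elsewhere; without it, the call raises a NameError: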

In [31]:
def process_prime_numbers(iterable): 
    for x in iterable: 
        if is_prime(x):
            yield f(x) 

In [32]:
[x for x in process_prime_numbers(range(1000)) if x > 10]

NameError: name 'is_prime' is not defined
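
A runnable variant of the generator function might look like the sketch below. The is_prime helper is hypothetical (a simple trial-division test written for this illustration), and f is a fast stand-in for the slow function used earlier.

```python
def is_prime(n):
    """Hypothetical helper: trial-division primality test."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

def f(x):
    """Stand-in for the expensive function from the text."""
    return x ** 2

def process_prime_numbers(iterable):
    # f is evaluated exactly once for each prime in the input.
    for x in iterable:
        if is_prime(x):
            yield f(x)

result = [v for v in process_prime_numbers(range(20)) if v > 10]
print(result)  # [25, 49, 121, 169, 289, 361]
```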

Another way to prevent computing f(x) multiple times is to use the @functools.lru_cache() decorator (Python 3.2+)
on f. Since the output of f for the input x has already been computed once, the second invocation in the original list comprehension is about as fast as a dictionary lookup. This approach uses
memoization to improve efficiency, with a result comparable to using generator expressions.
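
A minimal sketch of the memoization approach follows. The call log is added only to make the effect visible, and f is again a fast stand-in. Note that lru_cache requires hashable arguments and is appropriate only when f has no side effects that must happen on every call.

```python
from functools import lru_cache

call_log = []

@lru_cache(maxsize=None)
def f(x):
    """Stand-in for an expensive function; records real (uncached) executions."""
    call_log.append(x)
    return x ** 2

# f(x) appears twice, but the second lookup is served from the cache.
result = [f(x) for x in range(10) if f(x) > 10]

print(result)         # [16, 25, 36, 49, 64, 81]
print(len(call_log))  # 10: the body of f ran once per distinct argument
```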

Say you have to flatten a list of lists:

In [33]:
l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]

One possible method uses reduce:

In [35]:
from functools import reduce

In [37]:
reduce(lambda x, y: x+y, l)

[1, 2, 3, 4, 5, 6, 7, 8, 9]

However, a list comprehension provides the best time complexity:

[item for sublist in l for item in sublist]

The shortcuts based on + (including the implied use in sum) are, of necessity, O(L^2) when there are L sublists -- as
the intermediate result list keeps getting longer, at each step a new intermediate result list object gets allocated,
and all the items in the previous intermediate result must be copied over (as well as a few new ones added at the
end). So (for simplicity and without actual loss of generality) say you have L sublists of I items each: the first I items
are copied back and forth L-1 times, the second I items L-2 times, and so on; total number of copies is I times the
sum of x for x from 1 to L excluded, i.e., I * (L**2)/2.

The list comprehension just generates one list, once, and copies each item over (from its original place of residence
to the result list) also exactly once.
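
The approaches discussed above can be compared side by side. This sketch also includes itertools.chain.from_iterable, which the text does not mention, as another linear-time option.

```python
from itertools import chain

l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]

# One result list, each item copied once.
flat_comprehension = [item for sublist in l for item in sublist]

# Repeated list concatenation: a new intermediate list at every step.
flat_sum = sum(l, [])

# Lazily chains the sublists; also copies each item only once.
flat_chain = list(chain.from_iterable(l))

print(flat_comprehension)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

All three produce the same list; they differ only in how much copying happens along the way.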

## 4) Dictionary Comprehensions 

A dictionary comprehension is similar to a list comprehension except that it produces a dictionary object instead of
a list.

A basic example: 

In [38]:
{x: x*x for x in (1,2,3,4)} 

{1: 1, 2: 4, 3: 9, 4: 16}

which is just another way of writing: 

In [39]:
dict((x, x*x) for x in (1,2,3,4))

{1: 1, 2: 4, 3: 9, 4: 16}

As with a list comprehension, we can use a conditional statement inside the dict comprehension to produce only
the dict elements meeting some criterion.

In [40]:
{name: len(name) for name in ("Stack", "Overflow", "Exchange") if len(name) > 6}

{'Overflow': 8, 'Exchange': 8}

Or, rewritten using a generator expression.

In [43]:
dict(
    (name, len(name))
    for name in ("Stack", "Overflow", "Exchange")
    if len(name) > 6
)

{'Overflow': 8, 'Exchange': 8}

### 4.1) Starting with a dictionary and using dictionary comprehension as a key-value pair filter

In [44]:
initial_dict = {'x': 1, 'y': 2} 

In [45]:
{key: value for key, value in initial_dict.items() if key == 'x'} 

{'x': 1}

### 4.2) Switching key and value of dictionary (invert dictionary)

If you have a dict containing simple hashable values (duplicate values may have unexpected results): 

In [1]:
my_dict = {1: 'a', 2: 'b', 3: 'c'} 

and you wanted to swap the keys and values you can take several approaches depending on your coding style: 

- swapped = {v: k for k,v in my_dict.items()}

- swapped = dict((v, k) for k, v in my_dict.items())

- swapped = dict(zip(my_dict.values(), my_dict))

- swapped = dict(zip(my_dict.values(), my_dict.keys())) 

- swapped = dict(map(reversed, my_dict.items()))

print(swapped)

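In Python 3, zip and map already return lazy iterators, so none of these approaches builds an intermediate list. (In Python 2, itertools.izip and itertools.imap played that role for large dictionaries.)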
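
The caveat about duplicate values can be illustrated with a small made-up dictionary. The grouping loop at the end is one possible workaround, not the only one.

```python
my_dict = {1: 'a', 2: 'b', 3: 'a'}

# Keys that share a value collide: the last one seen wins.
swapped = {v: k for k, v in my_dict.items()}
print(swapped)  # {'a': 3, 'b': 2}

# One way to keep every key: collect them in a list per value.
grouped = {}
for k, v in my_dict.items():
    grouped.setdefault(v, []).append(k)
print(grouped)  # {'a': [1, 3], 'b': [2]}
```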

### 4.3) Merging Dictionaries 

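Combine dictionaries, optionally overriding old values, with a dictionary comprehension that uses nested for clauses. Where a key appears in more than one dictionary, the value from the later dictionary wins.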

In [2]:
dict1 = {'w': 1, 'x': 1} 
dict2 = {'x': 2, 'y': 2, 'z': 2} 

In [3]:
{k:v for d in [dict1, dict2] for k, v in d.items()}

{'w': 1, 'x': 2, 'y': 2, 'z': 2}

The same merge can be written with dictionary unpacking (Python 3.5+):

In [4]:
{**dict1, **dict2}

{'w': 1, 'x': 2, 'y': 2, 'z': 2}

Note: dictionary comprehensions were added in Python 3.0 and backported to 2.7+, unlike list comprehensions,
which were added in 2.0. Versions < 2.7 can use generator expressions and the dict() builtin to simulate the
behavior of dictionary comprehensions.

## 5) List comprehensions with nested loops 

List Comprehensions can use nested for loops. You can code any number of nested for loops within a list
comprehension, and each for loop may have an optional associated if test. When doing so, the order of the for constructs is the same order as when writing a series of nested for statements. The general structure of list
comprehensions looks like this:

[expression for target1 in iterable1 [if condition1] for target2 in iterable2 [if condition2] for target3 in iterable3 [if condition3]]

For example, the following code flattens a list of lists using multiple for statements:

In [5]:
data = [[1,2], [3,4], [5,6]] 

In [6]:
output = [] 

In [7]:
for each_list in data:
    for element in each_list:
        output.append(element) 

In [8]:
output

[1, 2, 3, 4, 5, 6]

can be equivalently written as a list comprehension with multiple for constructs:

In [9]:
data = [[1,2], [3,4], [5,6]] 

In [10]:
output = [element for each_list in data for element in each_list]

In [11]:
print(output) 

[1, 2, 3, 4, 5, 6]


In both the expanded form and the list comprehension, the outer loop (first for statement) comes first.
In addition to being more compact, the nested comprehension is also significantly faster.

In [12]:
data = [[1,2], [3,4], [5,6]] 

In [13]:
def f(): 
    output = [] 
    for each_list in data: 
        for element in each_list: 
            output.append(element) 
    return output 


The overhead for the function call above is about 140ns. 
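
One way to measure the difference yourself is the standard timeit module. The sketch below wraps both versions in functions (with_loops and with_comprehension are names chosen for this illustration) so that both pay the same call overhead. No timings are quoted here because they depend on the machine and Python version.

```python
from timeit import timeit

data = [[1, 2], [3, 4], [5, 6]]

def with_loops():
    output = []
    for each_list in data:
        for element in each_list:
            output.append(element)
    return output

def with_comprehension():
    return [element for each_list in data for element in each_list]

# timeit accepts a callable and returns the total time in seconds.
loop_time = timeit(with_loops, number=100_000)
comp_time = timeit(with_comprehension, number=100_000)
print(f"loops: {loop_time:.3f}s, comprehension: {comp_time:.3f}s")
```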

Inline ifs are nested similarly, and may occur in any position after the first for: 

In [14]:
data = [[1], [2,3], [4,5]]

In [15]:
output = [element for each_list in data if len(each_list) == 2 for element in each_list if element != 5]

In [16]:
print(output) 

[2, 3, 4]


For the sake of readability, however, you should consider using traditional for loops. This is especially true when
nesting is more than 2 levels deep, and/or the logic of the comprehension is too complex. A list comprehension with several nested loops is error-prone and can give unexpected results.
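
For comparison, the comprehension with two inline ifs can be expanded into explicit loops as sketched below. Each if attaches to the for clause that precedes it.

```python
data = [[1], [2, 3], [4, 5]]

output = []
for each_list in data:
    if len(each_list) == 2:       # first 'if': filters the outer loop
        for element in each_list:
            if element != 5:      # second 'if': filters the inner loop
                output.append(element)

print(output)  # [2, 3, 4]
```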

## 6) Generator Expressions 

Generator expressions are very similar to list comprehensions. The main difference is that a generator expression does not create the full set of results at once; it creates a generator object, which can then be iterated over.


For instance, see the difference in the following code: 

In [17]:
[x**2 for x in range(10)] 

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [18]:
(x**2 for x in range(10)) 

<generator object <genexpr> at 0x0000016EC8EDFC60>

These are two very different objects: 

- The list comprehension returns a list object, whereas the generator expression returns a generator.
- Generator objects cannot be indexed; use the next function to get items in order.

Note: In Python 3, range produces its values lazily (it is a lazy sequence object rather than a generator), so neither
example builds an intermediate list of inputs. In Python 2, range returned a full list, and xrange was the lazy
equivalent.

In [19]:
g = (x**2 for x in range(10))

In [20]:
list(g)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [21]:
g

<generator object <genexpr> at 0x0000016EC8EDCC70>

In [22]:
print(g[0])

TypeError: 'generator' object is not subscriptable

In [23]:
g.next()

AttributeError: 'generator' object has no attribute 'next'

Note: In Python 3, use next(g) instead of g.next(), and range instead of xrange; Iterator.next() and xrange() do not exist in Python 3. Because list(g) above already exhausted g, calling next(g) at this point would raise StopIteration.
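
The following sketch shows next() on a fresh generator, and what happens once the generator is exhausted. The variable names are illustrative.

```python
g = (x**2 for x in range(3))

first = next(g)           # 0
second = next(g)          # 1
rest = list(g)            # [4]: only the values not yet consumed
again = list(g)           # []: the generator is now exhausted
fallback = next(g, None)  # None: a default avoids StopIteration

print(first, second, rest, again, fallback)  # 0 1 [4] [] None
```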

Both objects can nevertheless be iterated over in the same way:

In [24]:
for i in [x**2 for x in range(10)]: 
    print(i) 

0
1
4
9
16
25
36
49
64
81


In [25]:
for i in (x**2 for x in range(10)): 
    print(i) 

0
1
4
9
16
25
36
49
64
81


### 6.1) Use cases 

Generator expressions are lazily evaluated, which means that they generate and return each value only when the generator is iterated. This is useful when iterating through large datasets, avoiding the need to create a duplicate of the dataset in memory: 

for square in (x**2 for x in range(1000000)):
    pass  # do something with square
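
The memory difference can be made visible with sys.getsizeof, as in the rough sketch below. getsizeof reports only the size of the container object itself, not of the integers it refers to, so the figures are indicative rather than exact.

```python
import sys

squares_list = [x**2 for x in range(100_000)]
squares_gen = (x**2 for x in range(100_000))

# The list grows with the input; the generator object stays small
# because it produces one value at a time.
list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(list_size, gen_size)

# Both yield the same values when consumed.
same_total = sum(squares_gen) == sum(squares_list)
print(same_total)  # True
```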

Another common use case is to avoid iterating over an entire iterable if doing so is not necessary. In this example,
an item is retrieved from a remote API with each iteration of get_objects(). Thousands of objects may exist, they must
be retrieved one by one, and we only need to know whether an object matching a pattern exists. By using a generator
expression, we can stop retrieving objects as soon as we encounter one matching the pattern.

In [26]:
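def get_objects():
    """Yield objects from the API one at a time."""
    while True:
        yield get_next_item()  # placeholder for the remote API call

In [27]:
def object_matches_pattern(obj):
    # placeholder: perform a potentially complex check and return a bool
    return matches_pattern

In [28]:
def right_item_exists():
    matches = (object_matches_pattern(each) for each in get_objects())
    for matched in matches:
        if matched:
            return True  # stop here; no further objects are retrieved
    return False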
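
Since the functions above rely on placeholders, here is a self-contained sketch of the same idea. Everything in it is hypothetical: get_objects yields from a fixed tuple instead of a remote API, and the pattern check is a simple prefix test. The built-in any() stops consuming the generator expression at the first match.

```python
fetched = []

def get_objects():
    """Hypothetical stand-in for a remote API: yields objects one at a time."""
    for obj in ("alpha", "beta", "gamma", "delta"):
        fetched.append(obj)  # record what was actually retrieved
        yield obj

def object_matches_pattern(obj):
    """Hypothetical pattern check."""
    return obj.startswith("b")

def right_item_exists():
    # any() returns as soon as one element is true, so later objects
    # are never requested from get_objects().
    return any(object_matches_pattern(each) for each in get_objects())

found = right_item_exists()
print(found)    # True
print(fetched)  # ['alpha', 'beta']
```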

## 7) Set Comprehensions

Set comprehension is similar to list and dictionary comprehension, but it produces a set, which is an unordered collection of unique elements. 

In [29]:
{x for x in range(5)}

{0, 1, 2, 3, 4}

In [30]:
{x for x in range(1, 11) if x % 2 == 0}

{2, 4, 6, 8, 10}

In [32]:
text = "When in the Course of human events it becomes necessary for one people..."

In [33]:
{ch.lower() for ch in text if ch.isalpha()}

{'a',
 'b',
 'c',
 'e',
 'f',
 'h',
 'i',
 'l',
 'm',
 'n',
 'o',
 'p',
 'r',
 's',
 't',
 'u',
 'v',
 'w',
 'y'}


Keep in mind that sets are unordered. This means that the order of the results in the set may differ from the one presented in the above examples.

Note: Set comprehensions are available since Python 2.7, unlike list comprehensions, which were added in 2.0. In Python 2.4 to 2.6, the set() function can be used with a generator expression to produce the same result:

In [34]:
set(x for x in range(5)) 

{0, 1, 2, 3, 4}

## 8) Refactoring filter and map to list comprehensions 

The filter and map functions can often be replaced by list comprehensions. Guido van Rossum describes this
well in an open letter from 2005:

filter(P, S) is almost always written clearer as [x for x in S if P(x)], and this has the huge
advantage that the most common usages involve predicates that are comparisons, e.g. x==42, and
defining a lambda for that just requires much more effort for the reader (plus the lambda is slower than
the list comprehension). Even more so for map(F, S) which becomes [F(x) for x in S]. Of course, in
many cases you'd be able to use generator expressions instead.

The following lines of code are considered "not pythonic" and will trigger warnings in many Python linters.

In [35]:
filter(lambda x: x%2 == 0, range(10)) 

<filter at 0x16ec8f77370>

In [36]:
map(lambda x: 2*x, range(10)) 

<map at 0x16ec8f673a0>

In [37]:
from functools import reduce
reduce(lambda x, y: x+y, range(10))

45

Taking what we have learned from the previous quote, we can break down these filter and map expressions into their equivalent list comprehensions, also removing the lambda functions from each - making the code more readable in the process. 

In [2]:
#Filter: 
# P(x) = x%2 == 0 
# S = range(10) 
[x for x in range(10) if x%2 == 0] 

[0, 2, 4, 6, 8]

In [3]:
#Map 
#F(x) = 2*x 
# S = range(10) 
[2*x for x in range(10)]

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

Readability becomes even more apparent when chaining functions, where the result of one map or filter call is passed as input to the next. In simple cases, the chain can be replaced with a single list comprehension. Further, we can easily tell from the list comprehension what the outcome of the process is, whereas reasoning about the chained map and filter calls carries more cognitive load.

In [5]:
#Map & filter 
filtered = filter(lambda x: x%2 == 0, range(10))

In [7]:
results = map(lambda x: 2*x, filtered)

In [8]:
results

<map at 0x16f7dd28f70>

In [9]:
# List comprehension 
results = [2*x for x in range(10) if x%2 == 0]

In [10]:
results

[0, 4, 8, 12, 16]

##### Refactoring - Quick Reference 

- Map

list(map(F, S)) == [F(x) for x in S]

- Filter

list(filter(P, S)) == [x for x in S if P(x)]

Where F and P are functions which respectively transform input values and return a bool. 
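
The quick reference can be checked with concrete stand-ins for F, P, and S (chosen arbitrarily here). In Python 3, map and filter return lazy iterators, so they must be wrapped in list() before comparing them with a list.

```python
def F(x):
    return 2 * x

def P(x):
    return x % 2 == 0

S = range(10)

map_matches = list(map(F, S)) == [F(x) for x in S]
filter_matches = list(filter(P, S)) == [x for x in S if P(x)]
raw_map_matches = map(F, S) == [F(x) for x in S]

print(map_matches)      # True
print(filter_matches)   # True
print(raw_map_matches)  # False: a map object is not a list
```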

## 9) Comprehensions involving tuples 

The for clause of a list comprehension can specify more than one variable: 

In [11]:
[x+y for x,y in [(1,2), (3,4), (5,6)]]

[3, 7, 11]

In [12]:
[x+y for x,y in zip([1,2,3], [4,5,6])]

[5, 7, 9]

This is just like regular for loops: 

In [13]:
for x, y in [(1,2), (3,4), (5,6)]: 
    print(x+y)

3
7
11


Note however, if the expression that begins the comprehension is a tuple then it must be parenthesized: 

In [16]:
[x,y for x, y in [(1,2), (3,4), (5,6)]]

SyntaxError: did you forget parentheses around the comprehension target? (3432540293.py, line 1)

In [18]:
[(x,y) for x, y in [(1,2), (3,4), (5,6)]]

[(1, 2), (3, 4), (5, 6)]

## 10) Counting Occurrences Using Comprehension 

When we want to count the number of items in an iterable that meet some condition, we can use a comprehension to produce idiomatic syntax:

In [19]:
print(sum(1 for x in range(1000) if x%2 == 0 and '9' in str(x)))

95


The basic concept can be summarized as: 

1. Iterate over the elements in range(1000)

2. Concatenate all the needed if conditions. 

3. Use 1 as expression to return a 1 for each item that meets the conditions. 

4. Sum up all the 1s to determine number of items that meet the conditions. 

Note: Here we are not collecting the 1s in a list (note the absence of square brackets); we are passing them directly to the sum function, which adds them up. This is called a generator expression, which is similar to a comprehension.
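
For comparison, the sketch below counts the same items in three ways. The second and third variants are alternatives not covered in the text above; which one reads best is a matter of taste.

```python
# Generator expression yielding a 1 per matching item.
count_ones = sum(1 for x in range(1000) if x % 2 == 0 and '9' in str(x))

# Summing the booleans directly: True counts as 1, False as 0.
count_bools = sum(x % 2 == 0 and '9' in str(x) for x in range(1000))

# Building a list works too, but allocates memory for every match.
count_len = len([x for x in range(1000) if x % 2 == 0 and '9' in str(x)])

print(count_ones, count_bools, count_len)  # 95 95 95
```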

## 11) Changing Types in a List 

Quantitative data is often read in as strings that must be converted to numeric types before processing. The types of all list items can be converted with either a list comprehension or the map() function.

In [20]:
# Convert a list of strings to integers.
items = ["1", "2", "3", "4"]
[int(item) for item in items]
# Out: [1, 2, 3, 4]
# Convert a list of strings to floats.
# In Python 3, map() returns a lazy iterator, so wrap it in list().
items = ["1", "2", "3", "4"]
list(map(float, items))
# Out: [1.0, 2.0, 3.0, 4.0]

[1.0, 2.0, 3.0, 4.0]

## 12) Nested list Comprehensions

Nested list comprehensions, unlike list comprehensions with nested loops, are list comprehensions within a list comprehension. The initial expression can be any arbitrary expression, including another list comprehension.

In [21]:
[x+y for x in [1,2,3] for y in [4,5,6]] 

[5, 6, 7, 6, 7, 8, 7, 8, 9]

In [22]:
[[x+y for x in [1,2,3]] for y in [4,5,6]] 

[[5, 6, 7], [6, 7, 8], [7, 8, 9]]

The nested example is equivalent to: 

In [23]:
i = [] 
for y in [4,5,6]:
    temp = [] 
    for x in [1,2,3]: 
        temp.append(x+y) 
    i.append(temp) 

One example where a nested comprehension can be used is to transpose a matrix:

In [25]:
matrix = [[1,2,3], 
          [4,5,6], 
          [7,8,9]] 
[[row[i] for row in matrix] for i in range(len(matrix[0]))]

[[1, 4, 7], [2, 5, 8], [3, 6, 9]]
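
For a non-square matrix, the column index must range over the row length rather than the number of rows. The sketch below uses a 2x3 matrix chosen for illustration and also shows zip(*matrix), a common alternative that is not part of the text above.

```python
matrix = [[1, 2, 3],
          [4, 5, 6]]

# Index over columns, so use the length of a row.
transposed = [[row[i] for row in matrix] for i in range(len(matrix[0]))]

# zip(*matrix) groups the i-th items of every row; it yields tuples,
# so each one is converted back to a list.
transposed_zip = [list(column) for column in zip(*matrix)]

print(transposed)      # [[1, 4], [2, 5], [3, 6]]
print(transposed_zip)  # [[1, 4], [2, 5], [3, 6]]
```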

Like nested for loops, there is no limit to how deep comprehensions can be nested. 

In [26]:
[[[i + j + k for k in "cd"] for j in "ab"] for i in "12"]

[[['1ac', '1ad'], ['1bc', '1bd']], [['2ac', '2ad'], ['2bc', '2bd']]]

## 13) Iterate two or more lists simultaneously within a list comprehension

To iterate over two or more lists simultaneously within a list comprehension, use zip():

In [27]:
list_1 = [1, 2, 3, 4]

In [28]:
list_2 = ["a", "b", "c", "d"]

In [29]:
list_3 = ["6", "7", "8", "9"]

In [30]:
[(i, j) for i, j in zip(list_1, list_2)]

[(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd')]

In [31]:
[(i, j, k) for i, j, k in zip(list_1, list_2, list_3)]

[(1, 'a', '6'), (2, 'b', '7'), (3, 'c', '8'), (4, 'd', '9')]
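
One detail worth knowing is that zip() stops at the shortest input. The sketch below uses two made-up lists of different lengths and shows itertools.zip_longest as an option when the extra items should be kept.

```python
from itertools import zip_longest

short = [1, 2, 3]
long = ["a", "b", "c", "d"]

# zip() silently drops the unmatched item 'd'.
truncated = [(i, j) for i, j in zip(short, long)]

# zip_longest() pads the shorter input with fillvalue instead.
padded = [(i, j) for i, j in zip_longest(short, long, fillvalue=None)]

print(truncated)  # [(1, 'a'), (2, 'b'), (3, 'c')]
print(padded)     # [(1, 'a'), (2, 'b'), (3, 'c'), (None, 'd')]
```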