# List, Set, and Dict Comprehensions

List comprehensions are one of the most-loved Python language features. They let you form a new list by filtering the elements of a collection and transforming the elements that pass the filter, all in one concise expression. They take the basic form:

```
[expr for val in collection if condition]
```

This is equivalent to the following for loop:

```python
result = []
for val in collection:
    if condition:
        result.append(expr)
```

The filter condition can be omitted, leaving only the expression. For example, given a list of strings, we could keep only the strings longer than two characters and convert them to uppercase like this:

In [1]:
strings = ['a', 'as', 'bat', 'car', 'dove', 'python']

[x.upper() for x in strings if len(x) > 2]

['BAT', 'CAR', 'DOVE', 'PYTHON']
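
To illustrate the case where the filter condition is omitted, here is a minimal sketch using the same `strings` list. The name `upper_all` is only illustrative:

```python
strings = ['a', 'as', 'bat', 'car', 'dove', 'python']

# No `if` clause: every element is transformed and kept
upper_all = [x.upper() for x in strings]
print(upper_all)  # ['A', 'AS', 'BAT', 'CAR', 'DOVE', 'PYTHON']
```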

Set and dict comprehensions are a natural extension, producing sets and dicts in an idiomatically similar way instead of lists. A dict comprehension looks like this:

```
dict_comp = {key-expr : value-expr for value in collection if condition}
```

A set comprehension looks like the equivalent list comprehension except with curly braces instead of square brackets:

```
set_comp = {expr for value in collection if condition}
```

Like list comprehensions, set and dict comprehensions are mostly conveniences, but they can similarly make code easier to both write and read. Consider the list of strings from before. Suppose we wanted a set containing just the lengths of the strings contained in the collection; we could easily compute this using a set comprehension:

In [2]:
unique_lengths = {len(x) for x in strings}
unique_lengths

{1, 2, 3, 4, 6}

We could also express this more functionally using the `map` function, introduced shortly:

In [3]:
set(map(len, strings))

{1, 2, 3, 4, 6}

As a simple dict comprehension example, we could create a lookup map of these strings to their locations in the list:

In [4]:
loc_mapping = {val : index for index, val in enumerate(strings)}
loc_mapping

{'a': 0, 'as': 1, 'bat': 2, 'car': 3, 'dove': 4, 'python': 5}
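
A dict comprehension can also carry a filter condition, and it is worth remembering that dict keys are unique. The sketch below assumes the same `strings` list; the names `length_by_word` and `word_by_length` are hypothetical and chosen only for illustration:

```python
strings = ['a', 'as', 'bat', 'car', 'dove', 'python']

# Keep only strings longer than two characters, mapping each to its length
length_by_word = {s: len(s) for s in strings if len(s) > 2}
print(length_by_word)  # {'bat': 3, 'car': 3, 'dove': 4, 'python': 6}

# When two elements produce the same key, the later value replaces the earlier
# one: 'bat' and 'car' both have length 3, so only 'car' survives
word_by_length = {len(s): s for s in strings}
print(word_by_length)  # {1: 'a', 2: 'as', 3: 'car', 4: 'dove', 6: 'python'}
```

If every value must be kept, the key expression needs to be unique per element, as it is in `loc_mapping`.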

## Nested List Comprehensions

Suppose we have a list of lists containing some English and Spanish names:

In [5]:
all_data = [['John', 'Emily', 'Michael', 'Mary', 'Steven'], ['Maria', 'Juan', 'Javier', 'Natalia', 'Pilar']]

You might have gotten these names from a couple of files and decided to organize them by language. Now, suppose we wanted to get a single list containing all names with two or more e’s in them. We could certainly do this with a simple for loop:

```python
names_of_interest = []
for names in all_data:
    enough_es = [name for name in names if name.count('e') >= 2]
    names_of_interest.extend(enough_es)
```

You can actually wrap this whole operation up in a single nested list comprehension, which will look like:

In [6]:
result = [name for names in all_data for name in names if name.count('e') >= 2]
result

['Steven']

At first, nested list comprehensions are a bit hard to wrap your head around. The for parts of the list comprehension are arranged according to the order of nesting, and any filter condition is put at the end as before. Here is another example where we “flatten” a list of tuples of integers into a simple list of integers:

In [7]:
some_tuples = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
flattened = [x for tup in some_tuples for x in tup]
flattened

[1, 2, 3, 4, 5, 6, 7, 8, 9]

Keep in mind that the order of the for expressions would be the same if you wrote a nested for loop instead of a list comprehension:

```python
flattened = []
for tup in some_tuples:
    for x in tup:
        flattened.append(x)
```

You can have arbitrarily many levels of nesting, though if you have more than two or three levels of nesting you should probably start to question whether this makes sense from a code readability standpoint. It’s important to distinguish the syntax just shown from a list comprehension inside a list comprehension, which is also perfectly valid:

In [8]:
[[x for x in tup] for tup in some_tuples]

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

This produces a list of lists, rather than a flattened list of all of the inner elements.

# Generators

Having a consistent way to iterate over sequences, like objects in a list or lines in a file, is an important Python feature. This is accomplished by means of the iterator protocol, a generic way to make objects iterable. For example, iterating over a dict yields the dict keys:

In [9]:
some_dict = {'a': 1, 'b': 2, 'c': 3}
for key in some_dict:
    print(key)

a
b
c


When you write `for key in some_dict`, the Python interpreter first attempts to create an iterator out of `some_dict`:

In [10]:
dict_iterator = iter(some_dict)
dict_iterator

<dict_keyiterator at 0x7fb0fda06130>

An iterator is any object that will yield objects to the Python interpreter when used in a context like a `for` loop. Most functions and methods expecting a list or list-like object will also accept any iterable object. This includes built-in functions such as `min`, `max`, and `sum`, and type constructors like `list` and `tuple`:

In [11]:
list(dict_iterator)

['a', 'b', 'c']
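
The iterator protocol can also be exercised by hand with the built-in `next` function, which is roughly what a `for` loop does behind the scenes. The following is a minimal sketch; the variable names (`it`, `first`, `rest`, `leftover`, `exhausted`) are illustrative only:

```python
some_dict = {'a': 1, 'b': 2, 'c': 3}

it = iter(some_dict)
first = next(it)     # 'a': the first key
rest = list(it)      # ['b', 'c']: list() consumes whatever remains
leftover = list(it)  # []: the iterator is now exhausted

# Asking an exhausted iterator for another element raises StopIteration,
# which is the signal a for loop uses to stop
exhausted = False
try:
    next(it)
except StopIteration:
    exhausted = True
```

Note that an iterator can be consumed only once; to iterate again, call `iter(some_dict)` to obtain a fresh one.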

A generator is a concise way to construct a new iterable object. Whereas a normal function executes and returns a single result, a generator produces a sequence of results lazily, pausing after each one until the next one is requested. To create a generator, use the `yield` keyword instead of `return` in a function:

In [12]:
def squares(n=10):
    print('Generating squares from 1 to {0}'.format(n ** 2))
    for i in range(1, n + 1):
        yield i ** 2

When you actually call the generator, no code is immediately executed:

In [13]:
gen = squares()
gen

<generator object squares at 0x7fb0fd571660>

It is not until you request elements from the generator that it begins executing its code:

In [14]:
for x in gen:
    print(x, end=' ')

Generating squares from 1 to 100
1 4 9 16 25 36 49 64 81 100 
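
The pausing behavior is easier to see when elements are requested one at a time with `next`. This sketch redefines `squares` so that it is self-contained and calls it with `n=3` to keep the output short; the variable names are illustrative:

```python
def squares(n=10):
    print('Generating squares from 1 to {0}'.format(n ** 2))
    for i in range(1, n + 1):
        yield i ** 2

gen = squares(3)       # nothing is printed yet
first = next(gen)      # prints the message, runs to the first yield, gives 1
second = next(gen)     # resumes right after the yield, gives 4
remaining = list(gen)  # [9]: consumes the rest of the sequence
```

The message is printed only once, at the first `next` call, because the function body starts running only when the first element is requested.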

## Generator Expressions

An even more concise way to make a generator is by using a generator expression. This is a generator analogue to list, dict, and set comprehensions; to create one, enclose what would otherwise be a list comprehension within parentheses instead of brackets:

In [15]:
gen = (x ** 2 for x in range(100))
gen

<generator object <genexpr> at 0x7fb0fd571120>

This is completely equivalent to the following more verbose generator:

In [16]:
def _make_gen():
    for x in range(100):
        yield x ** 2
gen = _make_gen()

Generator expressions can be used instead of list comprehensions as function arguments in many cases:

In [17]:
sum(x ** 2 for x in range(100))

328350

In [18]:
dict((i, i ** 2) for i in range(5))

{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
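
One practical difference from a list comprehension is that a generator expression can be consumed only once. A small sketch of that behavior follows, with illustrative variable names:

```python
squares_gen = (x ** 2 for x in range(5))
total = sum(squares_gen)        # 0 + 1 + 4 + 9 + 16 = 30
total_again = sum(squares_gen)  # 0: the generator is already exhausted

squares_list = [x ** 2 for x in range(5)]
list_total = sum(squares_list)        # 30
list_total_again = sum(squares_list)  # 30: a list can be iterated repeatedly
```

A generator expression avoids building the intermediate list in memory, so it is a reasonable default when the values are needed only once, as in the `sum` call above.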

## itertools Module

The standard library `itertools` module has a collection of generators for many common data algorithms. For example, `groupby` takes any sequence and a function, grouping consecutive elements in the sequence by the return value of the function. Here’s an example:

In [19]:
import itertools
first_letter = lambda x: x[0]
names = ['Alan', 'Adam', 'Wes', 'Will', 'Albert', 'Steven']
for letter, group in itertools.groupby(names, first_letter):
    print(letter, list(group)) # group is an iterator

A ['Alan', 'Adam']
W ['Wes', 'Will']
A ['Albert']
S ['Steven']
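
Because `groupby` only groups consecutive elements, `'A'` appears twice in the output above. If one group per key is wanted, a common approach is to sort by the same key function first. This is a minimal sketch; the name `grouped` is hypothetical:

```python
import itertools

names = ['Alan', 'Adam', 'Wes', 'Will', 'Albert', 'Steven']
first_letter = lambda x: x[0]

# Sorting by the key function makes equal keys adjacent before grouping
sorted_names = sorted(names, key=first_letter)
grouped = {
    letter: list(group)
    for letter, group in itertools.groupby(sorted_names, first_letter)
}
print(grouped)
# {'A': ['Alan', 'Adam', 'Albert'], 'S': ['Steven'], 'W': ['Wes', 'Will']}
```

Since `sorted` is stable, names that share a first letter keep their original relative order.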


See the table below for a few other `itertools` functions I’ve frequently found helpful. You may like to check out the official Python documentation for more on this useful standard library module.

*Table. Some useful itertools functions*

|Function|Description|
|:---|:---|
|`combinations(iterable, k)`|Generates a sequence of all possible k-tuples of elements in the iterable, ignoring order and without replacement (see also the companion function combinations_with_replacement)|
|`permutations(iterable, k)`|Generates a sequence of all possible k-tuples of elements in the iterable, respecting order|
|`groupby(iterable[, keyfunc])`|Generates (key, sub-iterator) for each run of consecutive elements that share the same key|
|`product(*iterables, repeat=1)`|Generates the Cartesian product of the input iterables as tuples, similar to a nested for loop|
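
As a brief illustration of the functions in the table, the following sketch applies them to small inputs. The input values and variable names are chosen only for illustration:

```python
import itertools

letters = ['a', 'b', 'c']

# Pairs ignoring order, without replacement
combos = list(itertools.combinations(letters, 2))
# [('a', 'b'), ('a', 'c'), ('b', 'c')]

# Pairs respecting order
perms = list(itertools.permutations(letters, 2))
# [('a', 'b'), ('a', 'c'), ('b', 'a'), ('b', 'c'), ('c', 'a'), ('c', 'b')]

# Cartesian product of two iterables, like a nested for loop
pairs = list(itertools.product('ab', [1, 2]))
# [('a', 1), ('a', 2), ('b', 1), ('b', 2)]

# repeat=2 takes the product of an iterable with itself
bits = list(itertools.product([0, 1], repeat=2))
# [(0, 0), (0, 1), (1, 0), (1, 1)]
```

Each of these functions returns a lazy iterator, so `list` is used here only to display the results.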