In [11]:
import math
import random

## List comprehensions

List comprehensions allow us to _transform_ an iterable's values through another function or _filter_ those values, returning a list.

First, let's look at how we have done this in other ways. In these first two examples, we create a new empty list, then iterate over the original values using a `for` loop and append each transformed item to the new list.

In [3]:
names = ['Rowan', 'Oz', 'Shannon', 'Meredith']
lowercase_names = []
for name in names:
    lowercase_names.append(name.lower())
print(lowercase_names)

['rowan', 'oz', 'shannon', 'meredith']


In [2]:
output = []
for x in range(10):
    output.append(pow(2, x))
print(output)

[1, 2, 4, 8, 16, 32, 64, 128, 256, 512]


List comprehensions can be used to transform items, similar to how `map()` works in JavaScript.

It follows this basic pattern:

`new_list = [<expression> for <each_item> in <a_list>]`

_expression_ in that pattern just refers to some operation or function.
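
For readers who know `map` from JavaScript, here is a minimal sketch of the same idea using Python's built-in `map()` next to the comprehension form. The variable names `mapped` and `comprehended` are only for illustration.

```python
names = ["Rowan", "Oz", "Shannon", "Meredith"]

# Transform with the built-in map(). map() returns a lazy iterator,
# so we wrap it in list() to get a list back.
mapped = list(map(str.lower, names))

# The same transformation written as a list comprehension.
comprehended = [name.lower() for name in names]

print(mapped == comprehended)  # True
```

Both forms produce the same list; the comprehension tends to read more naturally when the expression is more than a single function call.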

Let's look at the examples from above using list comprehensions:

In [4]:
# Lowercase names
[name.lower() for name in names]

['rowan', 'oz', 'shannon', 'meredith']

In [5]:
# Powers of 2
[pow(2, x) for x in range(10)]

[1, 2, 4, 8, 16, 32, 64, 128, 256, 512]

In [None]:
# 5 random numbers
[random.random() for _ in range(5)]

**How did `_` work above?** `_` is an ordinary variable name. By convention, we use it when we need a loop variable but don't care about its value.

## Comprehension parts

Every comprehension is made up of the following parts:

1. collection
2. iteration
3. selection (optional)

Let's look at the previous ones for examples:

```py
[
 pow(2, x)           # collection
 for x in range(10)  # iteration
]
```

```py
[
 random.random()    # collection
 for _ in range(5)  # iteration
]
```

*Iteration* is straightforward and not really that different from the `for` loops you've been using. It iterates over a sequence.

*Collection* is the value that will be collected into the new list.

What's selection?

```py
[
 pow(2, x)           # collection
 for x in range(10)  # iteration
 if x % 2 == 0       # selection
]
```

*Selection* filters what you use from iteration. In this case, only even numbers will be used. We iterate over the entire range, but only collect when the value `x` is even.

In [None]:
[
 pow(2, x)           # collection
 for x in range(10)  # iteration
 if x % 2 == 0       # selection
]

The pattern for a list comprehension with a condition is:

`[<do_this_operation> for <item> in <a_list> if <condition>]`
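
One way to read that pattern is to expand it back into a loop. This is a sketch of the even-powers example written both ways; `even_powers_loop` is a name made up for the comparison.

```python
# Comprehension with a selection clause.
even_powers = [pow(2, x) for x in range(10) if x % 2 == 0]

# The equivalent loop: the `if` guards the append.
even_powers_loop = []
for x in range(10):
    if x % 2 == 0:
        even_powers_loop.append(pow(2, x))

print(even_powers)  # [1, 4, 16, 64, 256]
```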

In [15]:
# All squares in the first 1000 numbers.
squares = [x
           for x in range(1000) 
           if math.sqrt(x).is_integer()]
print(squares)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225, 256, 289, 324, 361, 400, 441, 484, 529, 576, 625, 676, 729, 784, 841, 900, 961]


In [9]:
# Filter a list

def remove_from_list(a_list, item_to_remove):
    return [item for item in a_list if item != item_to_remove]

remove_from_list(['MI', 'AK', 'SC', 'AK', 'DE'], 'AK')

['MI', 'SC', 'DE']

In [16]:
word = "MAGNITUDINAL"
current_guesses = ["G", "E", "T", "A"]

def display_letter(letter, guesses):
    """
    Conditionally display a letter. If the letter is already in
    the list `guesses`, then return it. Otherwise, return "_".
    """
    if letter in guesses:
        return letter
    else:
        return "_"

# We want to call the display_letter function on each letter of the word.
[display_letter(letter, current_guesses) for letter in word]

['_', 'A', 'G', '_', '_', 'T', '_', '_', '_', '_', 'A', '_']

In [11]:
# Compare to not using a list comprehension

word = "MAGNITUDINAL"
current_guesses = ["G", "E", "T", "A"]

def display_letter(letter, guesses):
    """
    Conditionally display a letter. If the letter is already in
    the list `guesses`, then return it. Otherwise, return "_".
    """
    if letter in guesses:
        return letter
    else:
        return "_"
    
output = []
for letter in word:
    output.append(display_letter(letter, current_guesses))
print(output)

['_', 'A', 'G', '_', '_', 'T', '_', '_', '_', '_', 'A', '_']


In [12]:
word = "MAGNITUDE"
guesses = ["G", "E", "T"]

[letter if letter in guesses else "_" 
 for letter in word]

['_', '_', 'G', '_', '_', 'T', '_', '_', 'E']

The following example uses the [string method `join()`](https://docs.python.org/3/library/stdtypes.html#str.join) to print the letters separated by spaces.

In [13]:
def print_word(word, guesses):
    output_letters = [display_letter(letter, guesses) 
                      for letter in word]
    print(" ".join(output_letters))
    
print_word(word, guesses)

_ _ G _ _ T _ _ E


In [14]:
# word = "MAGNITUDE"
# guesses = ["G", "E", "T"]
word = "NECESSITY"
guesses = ["E", "T", "S", "N"]

[letter
 for letter in word
 if letter in guesses]

['N', 'E', 'E', 'S', 'S', 'T']
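
The last few cells put `if` in two different places, and the position changes the meaning. The sketch below contrasts them on the same word; the names `masked` and `matched` are only for illustration.

```python
word = "MAGNITUDE"
guesses = ["G", "E", "T"]

# Conditional expression in the collection stage: every letter yields
# one item, so the result is as long as the word.
masked = [letter if letter in guesses else "_" for letter in word]

# `if` in the selection stage: letters that fail the test are skipped,
# so the result can be shorter than the word.
matched = [letter for letter in word if letter in guesses]

print(len(word), len(masked), len(matched))  # 9 9 3
```

A conditional expression before `for` needs an `else`, because it must always produce a value. The `if` after `for` takes no `else`, because it only decides whether to collect.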

In [17]:
with open("students.txt") as students_file:
    print([student.rstrip()
           for student in students_file
           if student.startswith("C")])

['Craig Brunengraber', 'Chinh Le', 'Christian Medlin']


In [18]:
words = ["cool", "indubitably", "Tehran", 
         "pineapple", "axolotl", "hamburger", "squat"]

[
    word                                 # collection
    for word in words                    # iteration
    if 6 <= len(word) <= 8               # selection
]

['Tehran', 'axolotl']

### Advanced list comprehensions

List comprehensions can be nested. You can have a comprehension inside the collection or iteration stages of another comprehension. There's no reason you couldn't use one inside the selection stage, although I've never seen it.

Once you start nesting list comprehensions, or any kind of loop, things can get pretty complicated and hard to read (and hard to debug). If you feel compelled to nest list comprehensions, ask yourself if there might be a simpler way. Sometimes writing less concise code is more readable and more maintainable.

In [32]:
# Roll 6 dice, keep all 4 and above.

random.seed(0)
rolls = [random.randint(1, 6)   # collection
         for _ in range(6)]     # iteration
print(rolls)
[die 
 for die in rolls
 if die >= 4]

[4, 4, 1, 3, 5, 4]


[4, 4, 5, 4]

In [23]:
# Roll 6 dice, keep all 4 and above. -- using nested list comprehensions

[die 
 for die in [random.randint(1,6)    # Iteration for the outer comprehension, collection for the inner comprehension 
             for _ in range(6)] 
 if die >= 4]

[4, 4, 5]

In [33]:
# Transpose rows and columns using nested list comprehensions.
matrix = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]

[[row[i]                         # All of this is collection for the outer list comprehension
  for row in matrix]             # This is collection for outer and iteration for inner
 for i in range(len(matrix[0]))] # Outer iteration

[[1, 4, 7], [2, 5, 8], [3, 6, 9]]
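
As an example of asking whether there is a simpler way, here is a sketch of the same transpose using the built-in `zip()` with argument unpacking instead of a nested comprehension. The name `transposed` is only for illustration.

```python
matrix = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]

# zip(*matrix) pairs up the first items of every row, then the second
# items, and so on. It yields tuples, so convert each one to a list.
transposed = [list(column) for column in zip(*matrix)]

print(transposed)  # [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
```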

The iteration stage of the comprehension can iterate over multiple sequences.

In [34]:
# Get a cartesian product of multiple iterables.
max_x = 5
max_y = 5

all_coordinates = [(x, y)
                   for x in range(max_x + 1) 
                   for y in range(max_y + 1)]
print(all_coordinates)

[(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (1, 0), (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (2, 0), (2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (3, 0), (3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (4, 0), (4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (5, 0), (5, 1), (5, 2), (5, 3), (5, 4), (5, 5)]
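
The order of the `for` clauses matters: the first clause behaves like the outer loop and the second like the inner loop. Here is a small sketch with shorter ranges, compared against written-out nested loops and against `itertools.product` from the standard library. The names `pairs`, `pairs_loop`, and `pairs_product` are only for illustration.

```python
from itertools import product

# Two `for` clauses: the first is the outer loop, the second the inner.
pairs = [(x, y)
         for x in range(2)
         for y in range(3)]

# The equivalent nested loops, written out.
pairs_loop = []
for x in range(2):
    for y in range(3):
        pairs_loop.append((x, y))

# itertools.product yields the same tuples in the same order.
pairs_product = list(product(range(2), range(3)))

print(pairs)  # [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
```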


In [35]:
# All student pairings
students = ["Blake", "Justice", "Kai", "Rowan"]
possible_pairings = [(s1, s2) 
                     for s1 in students 
                     for s2 in students 
                     if s1 != s2]
print(possible_pairings)

[('Blake', 'Justice'), ('Blake', 'Kai'), ('Blake', 'Rowan'), ('Justice', 'Blake'), ('Justice', 'Kai'), ('Justice', 'Rowan'), ('Kai', 'Blake'), ('Kai', 'Justice'), ('Kai', 'Rowan'), ('Rowan', 'Blake'), ('Rowan', 'Justice'), ('Rowan', 'Kai')]


In [40]:
students = ["Blake", "Justice", "Kai", "Rowan"]
possible_pairings = []
for s1 in students:
    for s2 in students:
        if s1 != s2:
            possible_pairings.append((s1, s2))
print(possible_pairings)

[('Blake', 'Justice'), ('Blake', 'Kai'), ('Blake', 'Rowan'), ('Justice', 'Blake'), ('Justice', 'Kai'), ('Justice', 'Rowan'), ('Kai', 'Blake'), ('Kai', 'Justice'), ('Kai', 'Rowan'), ('Rowan', 'Blake'), ('Rowan', 'Justice'), ('Rowan', 'Kai')]


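This isn't exactly what I want: every pairing appears twice, once in each order, such as `('Blake', 'Justice')` and `('Justice', 'Blake')`. We'll come back to it.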

## Dictionary comprehensions

Dictionary comprehensions work like list comprehensions, but create dictionaries. You use curly braces on the outside and a colon to separate the key and value.
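
As a minimal sketch of that syntax, here is a small dictionary built first with a loop and then with a comprehension. The word list and the names `lengths_loop` and `lengths` are made up for the example.

```python
words = ["cool", "axolotl", "squat"]

# Build the dictionary with a loop.
lengths_loop = {}
for word in words:
    lengths_loop[word] = len(word)

# Build the same dictionary with a comprehension: key, colon, value.
lengths = {word: len(word) for word in words}

print(lengths)  # {'cool': 4, 'axolotl': 7, 'squat': 5}
```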

In [None]:
# Get a mapping of letters to their Unicode code points.

{letter: ord(letter) for letter in "abcdef"}

In [None]:
# Get a mapping of letters to their frequency.

sentence = "hello there pardner"
{letter: sentence.count(letter) 
 for letter in sentence 
 if letter != " "}

In [2]:
# Map students to their grades.

students = ["Marion", "Sawyer", "Hayden"]
test_scores = [[87, 91, 79], [92, 90, 85], [90, 93, 82], [88, 92, 95]]

{student: [test[idx] for test in test_scores] 
 for (idx, student) in enumerate(students)}

{'Marion': [87, 92, 90, 88],
 'Sawyer': [91, 90, 93, 92],
 'Hayden': [79, 85, 82, 95]}

In [None]:
# What days are we open?

open_hours = {"Sunday": [900, 1730], 
              "Monday": [], 
              "Tuesday": [900, 2130], 
              "Wednesday": [900, 2130]}
{day_of_week: times for day_of_week, times in open_hours.items() if len(times) == 2}

## Set comprehensions

Sets are another type of collection we haven't discussed. They are _unordered_ collections of unique items, not sequences, so they can't be indexed. Each item must be _hashable_, which in practice means it can't be mutable, so lists and dictionaries are out. Numbers, strings, and tuples of hashable items are in. Amazingly, sets are also out, as they're mutable, so no sets of sets!

There's a built-in type called `frozenset` that gives you an immutable set, so you can nest those.
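
Here is a small sketch of those rules in action. The names `points`, `list_error`, and `nested` are only for illustration.

```python
# Tuples of hashable items are hashable, so they can go in a set.
points = {(0, 0), (0, 1), (0, 0)}
print(len(points))  # 2

# Lists are mutable and unhashable, so this raises TypeError.
try:
    {[0, 0], [0, 1]}
    list_error = None
except TypeError as error:
    list_error = str(error)
print(list_error)

# A frozenset is immutable and hashable, so it can be nested.
nested = {frozenset({1, 2}), frozenset({2, 1}), frozenset({3})}
print(len(nested))  # 2
```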

In [None]:
# There can be only one (1).
{1, 2, 3, 4, 1}

In [None]:
# Unique letters
{letter for letter in "howdy there pardner" if letter != " "}

In [None]:
# set() accepts any iterable directly; note that the space is kept here.
set("howdy there pardner")

Let's solve that problem of getting unique student pairings now.

In [None]:
# All student pairings
students = ["Blake", "Justice", "Kai", "Rowan", "Marion", "Hunter"]
possible_pairings = {frozenset([s1, s2]) 
                     for s1 in students 
                     for s2 in students 
                     if s1 != s2}
print([set(pair) for pair in possible_pairings])
print(len(possible_pairings))

Why did we use `frozenset()`?

In [6]:
type({1, 2})

set

In [7]:
type(frozenset([1, 2]))

frozenset
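
To answer the question, here is a sketch with a shorter student list. It shows that a plain set can't be an item of another set, and that two frozensets with the same members are equal whatever order they were built in. The names `set_error` and `pairings` are only for illustration.

```python
students = ["Blake", "Justice", "Kai", "Rowan"]

# A plain set cannot be an item of another set.
try:
    {{"Blake", "Kai"}}
    set_error = None
except TypeError as error:
    set_error = str(error)
print(set_error)

# frozensets compare equal regardless of item order, so the outer set
# keeps only one copy of each pairing.
print(frozenset(["Blake", "Kai"]) == frozenset(["Kai", "Blake"]))  # True

pairings = {frozenset([s1, s2])
            for s1 in students
            for s2 in students
            if s1 != s2}
print(len(pairings))  # 6
```

With four students there are six unordered pairings, half of the twelve ordered tuples the list comprehension produced earlier.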