# Python List and Dictionary Comprehensions

---

## Learning Objectives

- Create list comprehensions 
- Create dictionary comprehensions 
- Use conditional logic (`if`/`else`) within list and dictionary comprehensions
- Use `zip()` and `enumerate()` within list and dictionary comprehensions
- Use nested list and dictionary comprehensions 

<h1>Lesson Guide<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Python-List-and-Dictionary-Comprehensions" data-toc-modified-id="Python-List-and-Dictionary-Comprehensions-1">Python List and Dictionary Comprehensions</a></span><ul class="toc-item"><li><span><a href="#Learning-Objectives" data-toc-modified-id="Learning-Objectives-1.1">Learning Objectives</a></span></li><li><span><a href="#Introduction:-List-Comprehensions" data-toc-modified-id="Introduction:-List-Comprehensions-1.2">Introduction: List Comprehensions</a></span><ul class="toc-item"><li><span><a href="#What-are-list-comprehensions?" data-toc-modified-id="What-are-list-comprehensions?-1.2.1">What are list comprehensions?</a></span></li><li><span><a href="#Conditional-logic-in-list-comprehensions" data-toc-modified-id="Conditional-logic-in-list-comprehensions-1.2.2">Conditional logic in list comprehensions</a></span></li><li><span><a href="#Zip-and-enumerate" data-toc-modified-id="Zip-and-enumerate-1.2.3">Zip and enumerate</a></span></li><li><span><a href="#Nested-List-Comprehensions" data-toc-modified-id="Nested-List-Comprehensions-1.2.4">Nested List Comprehensions</a></span><ul class="toc-item"><li><span><a href="#A-more-complicated-example" data-toc-modified-id="A-more-complicated-example-1.2.4.1">A more complicated example</a></span></li></ul></li></ul></li><li><span><a href="#Dictionary-Comprehensions" data-toc-modified-id="Dictionary-Comprehensions-1.3">Dictionary Comprehensions</a></span></li><li><span><a href="#Conclusion" data-toc-modified-id="Conclusion-1.4">Conclusion</a></span></li></ul></li></ul></div>

## Introduction: List Comprehensions

Python list comprehensions are a simple and powerful syntax that, once mastered, allows for fast, efficient, and intuitive manipulation of lists and other iterable data types.

Though list comprehensions may seem confusing at first, they are easy to get used to and once understood make otherwise complex code readable and concise.

List comprehensions are essentially replacements for iteration control statements. I will explain why this is the case below, and give the non-list-comprehension alternative code to help you understand what they are doing (and make it clear why they are so much better!).

### What are list comprehensions?

List comprehensions are expressions that perform some kind of operation on each element of a list and collect the results in a new list. Let's start with a simple list of numbers:

In [1]:
numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Imagine that we want to add 1 to every element of the list. We could do this a couple of ways without the use of list comprehensions. We could use a for loop:

In [2]:
nums_plus_one = []

for num in numbers:
    nums_plus_one.append(num+1)

nums_plus_one

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

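We could also use Python's built-in `map`, with either a named function or a lambda. `map` applies a function to each element of an iterable and, in Python 3, returns an iterator, so we wrap the result in `list()` to see the values: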

In [3]:
def plus_one(x):
    return x+1

In [4]:
nums_plus_one = map(plus_one, numbers)
list(nums_plus_one)

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

In [5]:
nums_plus_one = map(lambda x: x+1, numbers)   # same as plus_one, written inline as a lambda
list(nums_plus_one)

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
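One detail worth seeing in action: in Python 3, `map` produces a lazy iterator rather than a list, so it can only be consumed once. The following is a minimal sketch of that behavior; the names `mapped`, `first_pass`, and `second_pass` are illustrative only.

```python
numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# map() does no work yet; it returns an iterator.
mapped = map(lambda x: x + 1, numbers)

first_pass = list(mapped)    # consumes the iterator
second_pass = list(mapped)   # nothing is left to consume

print(first_pass)
print(second_pass)
```

A list comprehension, by contrast, builds the full list immediately, so the result can be reused as often as needed.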

These solutions each have pros and cons. The for loop is readable and explicit but verbose; `map` with a lambda is concise but arcane if you aren't familiar with how `map` and `lambda` work. Luckily, list comprehensions combine the best of both worlds:

In [6]:
nums_plus_one = [x+1 for x in numbers]
nums_plus_one

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Let's go over how that works in more granular detail.

- Like the map statement, nums_plus_one is assigned on the left as a new variable.
- List comprehensions return a list, and the internal statement is wrapped in the list brackets: [...]
- Within the brackets these elements are similar to a for loop:
  1. The **operation per element** comes first: `x+1`
  2. Next is the **for loop variable assignment**: `for x`
  3. Last comes the **list of elements to iterate over**: `in numbers`
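To make the three parts concrete, here is a small sketch that builds the same result once with a loop and once with a comprehension. The list `temps_c` and the Celsius-to-Fahrenheit conversion are made-up illustrations, not part of the lesson's data.

```python
# Hypothetical data for illustration only.
temps_c = [0, 10, 20, 30]

# Explicit loop: create an empty list, then append one result per element.
temps_f_loop = []
for t in temps_c:
    temps_f_loop.append(t * 9 / 5 + 32)

# Comprehension: operation first, then the loop variable, then the iterable.
temps_f_comp = [t * 9 / 5 + 32 for t in temps_c]

print(temps_f_comp)
print(temps_f_loop == temps_f_comp)
```

Both versions produce the same list; the comprehension simply folds the empty-list setup and the `append` call into one expression.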

**Check:** Use a for loop and a list comprehension to multiply each element in `numbers` by 2 and then subtract 1.

In [7]:
num_cal = [x*2-1 for x in numbers]
print(num_cal)

[-1, 1, 3, 5, 7, 9, 11, 13, 15, 17]


In [8]:
type([x*2-1 for x in numbers])  

list

In [9]:
[x*2-1 for x in numbers]

[-1, 1, 3, 5, 7, 9, 11, 13, 15, 17]

### Conditional logic in list comprehensions

List comprehensions can be extended to cover more of the functionality of a for loop than just an operation over elements. For example we might use an if-statement to filter a list.

In [10]:
n = [1, 2, 7, 21, 3, 1, 62, 3, 34, 12, 73, 44, 12, 11, 9]
n_bin = []
n_mean = sum(n)/len(n)
print(n_mean)

19.666666666666668


In [11]:
larger_than_mean = []
for x in n:
    if x > n_mean:
        larger_than_mean.append(x)
larger_than_mean

[21, 62, 34, 73, 44]

In [12]:
[x+1 for x in n if x<n_mean]

[2, 3, 8, 4, 2, 4, 13, 13, 12, 10]

In [13]:
[x for x in n if x > n_mean]  #list comprehension

[21, 62, 34, 73, 44]

We can also use if-else statements.
Let's say we wanted to "binarize" a variable based on whether the elements are greater or less than the mean over all elements. The for loop could look something like this:

In [14]:
for x in n:
    if x >= n_mean:        # if element x of n is at least the mean n_mean,
        n_bin.append(1)    # append a 1 to n_bin (defined earlier as an empty list)
    else:
        n_bin.append(0)    # otherwise append a 0 to n_bin
n_bin

[0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0]

But that's pretty verbose. A list comprehension can do the same thing much more concisely:

In [15]:
n_bin = [1 if x >= n_mean else 0 for x in n]    
n_bin

[0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0]

In [16]:
["y"+"1" if x<= n_mean else 0 for x in n ]    # practice: the two branches may return different types

['y1', 'y1', 'y1', 0, 'y1', 'y1', 0, 'y1', 0, 'y1', 0, 0, 'y1', 'y1', 'y1']

In [17]:
[1 if x<=n_mean else 0 for x in n]

[1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1]
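The position of the `if` matters. A trailing `if` filters elements out, while an `if`/`else` placed before the `for` is a conditional expression that chooses a value for every element. Here is a small sketch of the difference, using a made-up list `scores` and a made-up `threshold`:

```python
scores = [55, 82, 47, 91, 68]  # hypothetical data
threshold = 60

# Trailing `if`: keeps only the elements that pass the test.
passing = [s for s in scores if s >= threshold]

# Leading `if`/`else`: keeps every element, but picks a value for each one.
labels = ["pass" if s >= threshold else "fail" for s in scores]

# Both together: filter first, then choose a value for each remaining element.
grades = ["A" if s >= 90 else "B" for s in scores if s >= threshold]

print(passing)
print(labels)
print(grades)
```

Note that a trailing `if` cannot take an `else`: filtering either keeps an element or drops it.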

A conditional expression can also swap values. This turns 1s into 0s and vice versa in a list:

In [18]:
bin_or_none = [0 if x == 1 else 1 for x in n_bin]
bin_or_none

[1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1]

We can even chain conditionals to take care of more cases. Here 1 becomes 0, 0 becomes 1, and any other value becomes `None`:

In [19]:
n = [0, 1, 0, 1, 2, 3, 5, 2, 1, 0]

In [20]:
bin_or_none = [0 if x == 1 else 1 if x == 0 else None for x in n]   # three cases (1, 0, anything else) need two if/else pairs
bin_or_none

[1, 0, 1, 0, None, None, None, None, 0, 1]

In [21]:
[0 if x==0 else 1 if x==1 else 2 if x==2 else None for x in n]    # four cases need three if/else pairs

[0, 1, 0, 1, 2, None, None, 2, 1, 0]
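Chained conditional expressions become hard to read quickly. One possible alternative, sketched here with a hypothetical helper function `recode` and a hypothetical lookup dictionary `swap`, is to move the branching out of the comprehension and keep the comprehension itself simple:

```python
n = [0, 1, 0, 1, 2, 3, 5, 2, 1, 0]

def recode(x):
    """Return x unchanged if it is 0, 1, or 2; otherwise return None."""
    if x in (0, 1, 2):
        return x
    return None

# Same result as the four-case chained conditional.
via_function = [recode(x) for x in n]

# dict.get() returns None for missing keys, which reproduces the
# "swap 0 and 1, everything else becomes None" comprehension.
swap = {0: 1, 1: 0}
via_lookup = [swap.get(x) for x in n]

print(via_function)
print(via_lookup)
```

Whether this is clearer than a chained conditional is a judgment call; it tends to pay off once there are more than two or three cases.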

### Zip and enumerate

We can also do operations on multiple lists. In that regard, the built-in functions **zip** and **enumerate** come in handy in combination with list comprehensions. First let's go over what each of the functions does.

**zip** steps through two or more lists in parallel, pairing up the elements at each position:

In [22]:
a = ['a', 'b', 'c', 'd']
z = ['z', 'y', 'x', 'w']

zip(a, z)

<zip at 0x1095a72c8>

In [45]:
a = ['a', 'b', 'c', 'd']
z = ['z', 'y', 'x', 'w']
b = ["r", "t", "d", "w"]      # practice: zip accepts more than two lists
list(zip(a, z, b))

[('a', 'z', 'r'), ('b', 'y', 't'), ('c', 'x', 'd'), ('d', 'w', 'w')]

In [49]:
a = ['a', 'b', 'c', 'd']
z = ['z', 'y', 'x', 'w']
b = ["r", "t","d", "w"]
c = ['z', 'y', 'x', 'w']
dict(zip(a, zip(z, b, c)))      # one way to combine several lists: keys from a, tuples of values from z, b, c


{'a': ('z', 'r', 'z'),
 'b': ('y', 't', 'y'),
 'c': ('x', 'd', 'x'),
 'd': ('w', 'w', 'w')}

Zip returns an iterator. We can transform it into a list or loop over the individual elements.

In [23]:
list(zip(a, z))

[('a', 'z'), ('b', 'y'), ('c', 'x'), ('d', 'w')]

In [24]:
dict(zip(a, z))

{'a': 'z', 'b': 'y', 'c': 'x', 'd': 'w'}
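Two properties of `zip` are easy to miss: it stops at the end of the shortest input, and, being an iterator, it can only be consumed once. A minimal sketch, with made-up lists `letters` and `nums`:

```python
letters = ['a', 'b', 'c', 'd']
nums = [1, 2]  # deliberately shorter than letters

pairs = zip(letters, nums)

first = list(pairs)    # stops after the shorter list runs out
second = list(pairs)   # the iterator is already exhausted

print(first)
print(second)
```

If the pairs are needed more than once, store the result of `list(zip(...))` in a variable instead of the `zip` object itself.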

In [54]:
zipped = []
for item in zip(a, z):
    zipped.append((item[0], item[1])) # each item from zip(a, z) is a tuple such as ('a', 'z')
zipped

[('a', 'z'), ('b', 'y'), ('c', 'x'), ('d', 'w')]

**Check:** Do this as a list comprehension.

In [26]:
[item for item in zip(a, z)]

[('a', 'z'), ('b', 'y'), ('c', 'x'), ('d', 'w')]

In [27]:
[list(item) for item in zip(a, z)]    # convert each tuple to a list

[['a', 'z'], ['b', 'y'], ['c', 'x'], ['d', 'w']]
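Combining `zip` with tuple unpacking in the `for` clause gives an element-wise operation over two lists. The lists `prices` and `quantities` below are hypothetical and serve only as an illustration:

```python
prices = [2.5, 4.0, 1.25]   # hypothetical unit prices
quantities = [4, 1, 8]      # hypothetical quantities

# Unpack each (price, quantity) tuple directly in the for clause.
totals = [p * q for p, q in zip(prices, quantities)]

print(totals)
```

Unpacking into `p, q` avoids indexing with `item[0]` and `item[1]` and makes the operation easier to read.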

**enumerate** keeps track of the index of each element of a list:

In [28]:
a = ['a', 'b', 'c', 'd']

Note that with enumerate the index is returned first and the element second.

In [29]:
list(enumerate(a))    # pair each element with its index

[(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd')]

In [30]:
enumerated = []
for item in enumerate(a):
    enumerated.append((item[0], item[1]))
enumerated

[(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd')]

**Check:** Do this as a list comprehension.

In [31]:
enumerated = [(item[0],item[1]) for item in enumerate(a)]
enumerated

[(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd')]

**Check:** Extract only the strings at even positions.

In [32]:
enumerated = [(item[0],item[1]) for item in enumerate(a)][::2]
enumerated

[(0, 'a'), (2, 'c')]

In [33]:
enumerated = [item[1] for item in enumerate(a) if item[0]%2 ==0]
enumerated

['a', 'c']
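The same results can be written with tuple unpacking, which avoids `item[0]` and `item[1]`. The sketch below also uses the optional `start` argument of `enumerate`; the names `even_items` and `numbered` are illustrative.

```python
a = ['a', 'b', 'c', 'd']

# Unpack (index, element) directly in the for clause.
even_items = [item for i, item in enumerate(a) if i % 2 == 0]

# enumerate accepts a start value, here counting from 1 instead of 0.
numbered = [f"{i}. {item}" for i, item in enumerate(a, start=1)]

print(even_items)
print(numbered)
```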

### Nested List Comprehensions

As some of you may have suspected by now, we can embed list comprehensions within other list comprehensions for even more power.

For example, let's say we want the square and the square root for every non-negative element in a list. We could first filter the list and then return the desired values.

In [34]:
n = [0, 1, 50, -23, -1, 75, -3]

n_filtered = [x for x in n if x >= 0]
[(x**2, x**0.5) for x in n_filtered]

[(0, 0.0), (1, 1.0), (2500, 7.0710678118654755), (5625, 8.660254037844387)]

We can do the same in one go.

In [35]:
[(x**2, x**0.5) for x in [y for y in n if y >= 0]]

[(0, 0.0), (1, 1.0), (2500, 7.0710678118654755), (5625, 8.660254037844387)]
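In this particular case the inner comprehension only filters, so a trailing `if` in a single comprehension gives the same result without nesting. A quick sketch comparing the two forms (the names `nested` and `single` are illustrative):

```python
n = [0, 1, 50, -23, -1, 75, -3]

# Nested form: the inner comprehension filters, the outer one transforms.
nested = [(x**2, x**0.5) for x in [y for y in n if y >= 0]]

# Single form: filter and transform in one comprehension.
single = [(x**2, x**0.5) for x in n if x >= 0]

print(single == nested)
```

Nesting is still needed when the inner step does more than filter, for example when it produces a list of lists.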

#### A more complicated example

Here's a list comprehension that returns syllables (defined here as a consonant followed by a vowel) in a flattened list:

In [36]:
import string
vowels = 'aeiou'
alphabet = string.ascii_lowercase
print(alphabet)

abcdefghijklmnopqrstuvwxyz


In [37]:
syllables = [s for syls in [[c+v for v in vowels]
                            for c in [x for x in alphabet if x not in vowels]] for s in syls]
syllables

['ba',
 'be',
 'bi',
 'bo',
 'bu',
 'ca',
 'ce',
 'ci',
 'co',
 'cu',
 'da',
 'de',
 'di',
 'do',
 'du',
 'fa',
 'fe',
 'fi',
 'fo',
 'fu',
 'ga',
 'ge',
 'gi',
 'go',
 'gu',
 'ha',
 'he',
 'hi',
 'ho',
 'hu',
 'ja',
 'je',
 'ji',
 'jo',
 'ju',
 'ka',
 'ke',
 'ki',
 'ko',
 'ku',
 'la',
 'le',
 'li',
 'lo',
 'lu',
 'ma',
 'me',
 'mi',
 'mo',
 'mu',
 'na',
 'ne',
 'ni',
 'no',
 'nu',
 'pa',
 'pe',
 'pi',
 'po',
 'pu',
 'qa',
 'qe',
 'qi',
 'qo',
 'qu',
 'ra',
 're',
 'ri',
 'ro',
 'ru',
 'sa',
 'se',
 'si',
 'so',
 'su',
 'ta',
 'te',
 'ti',
 'to',
 'tu',
 'va',
 've',
 'vi',
 'vo',
 'vu',
 'wa',
 'we',
 'wi',
 'wo',
 'wu',
 'xa',
 'xe',
 'xi',
 'xo',
 'xu',
 'ya',
 'ye',
 'yi',
 'yo',
 'yu',
 'za',
 'ze',
 'zi',
 'zo',
 'zu']

**Check:** Explain to your neighbor how that works.

This is a complicated list comprehension with nested for loops, and it brings up one of the more confusing aspects of list comprehensions. To understand it, let's first write out the comprehension more explicitly:

In [38]:
# simple list comprehension to get non-vowel letters:
consonants = [x for x in alphabet if x not in vowels]

# get all the syllables for each consonant + vowel pair in nested consonant-syllable lists:
syllables = [[c + v for v in vowels] for c in consonants]
syllables

[['ba', 'be', 'bi', 'bo', 'bu'],
 ['ca', 'ce', 'ci', 'co', 'cu'],
 ['da', 'de', 'di', 'do', 'du'],
 ['fa', 'fe', 'fi', 'fo', 'fu'],
 ['ga', 'ge', 'gi', 'go', 'gu'],
 ['ha', 'he', 'hi', 'ho', 'hu'],
 ['ja', 'je', 'ji', 'jo', 'ju'],
 ['ka', 'ke', 'ki', 'ko', 'ku'],
 ['la', 'le', 'li', 'lo', 'lu'],
 ['ma', 'me', 'mi', 'mo', 'mu'],
 ['na', 'ne', 'ni', 'no', 'nu'],
 ['pa', 'pe', 'pi', 'po', 'pu'],
 ['qa', 'qe', 'qi', 'qo', 'qu'],
 ['ra', 're', 'ri', 'ro', 'ru'],
 ['sa', 'se', 'si', 'so', 'su'],
 ['ta', 'te', 'ti', 'to', 'tu'],
 ['va', 've', 'vi', 'vo', 'vu'],
 ['wa', 'we', 'wi', 'wo', 'wu'],
 ['xa', 'xe', 'xi', 'xo', 'xu'],
 ['ya', 'ye', 'yi', 'yo', 'yu'],
 ['za', 'ze', 'zi', 'zo', 'zu']]

In [39]:
[
    s
    for syls in syllables
    for s in syls
][:10]

['ba', 'be', 'bi', 'bo', 'bu', 'ca', 'ce', 'ci', 'co', 'cu']

The trick here is that the `for` clauses in a nested list comprehension appear in the same order as they would in standard nested for loops; only the retrieved element moves to the front!

In [40]:
flat_syllables = []
for syls in syllables:
    for s in syls:
        flat_syllables.append(s)
flat_syllables[:10]

['ba', 'be', 'bi', 'bo', 'bu', 'ca', 'ce', 'ci', 'co', 'cu']
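Because the `for` clauses follow loop order, the flat list of syllables can also be built directly, without the intermediate list of lists. A sketch under the same definitions of vowels and consonants (the name `flat` is illustrative):

```python
import string

vowels = 'aeiou'
consonants = [x for x in string.ascii_lowercase if x not in vowels]

# Outer loop over consonants first, inner loop over vowels second,
# exactly as the equivalent nested for loops would be written.
flat = [c + v for c in consonants for v in vowels]

print(flat[:10])
print(len(flat))
```

With 21 consonants and 5 vowels, the result contains 105 syllables.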

## Dictionary Comprehensions

Comprehensions are not limited to lists. You can also use comprehensions to create dictionaries with key:value pairs.

Below, for example, we create a dictionary that uses each string as a key and the list of integer code points of its characters as the value (the **ord** function returns the integer value of a character).

In [1]:
keys = ['dog', 'cat', 'bird', 'horse']

animal_dict = {k: [ord(c) for c in k] for k in keys}
animal_dict

{'dog': [100, 111, 103],
 'cat': [99, 97, 116],
 'bird': [98, 105, 114, 100],
 'horse': [104, 111, 114, 115, 101]}

This can be particularly useful for creating pandas dataframes.

In [42]:
import pandas as pd

column_names = ['height', 'weight', 'is_male']
values = [[62, 54, 60, 50], [180, 120, 200, 100], [True, False, True, False]]

records = pd.DataFrame({col: vals for col, vals in zip(column_names, values)})
records

   height  weight  is_male
0      62     180     True
1      54     120    False
2      60     200     True
3      50     100    False


**Check:** Use a dictionary comprehension to subset the `animal_dict` on those entries which contain the letter "o" in their key.

In [43]:
# Keep only the entries of animal_dict whose key contains the letter "o".
{k: v for k, v in animal_dict.items() if "o" in k}

{'dog': [100, 111, 103], 'horse': [104, 111, 114, 115, 101]}

In [44]:
{k: [ord(c) for c in k] for k in keys if "o" in k}

{'dog': [100, 111, 103], 'horse': [104, 111, 114, 115, 101]}
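The tools from the list comprehension sections carry over to dictionary comprehensions. The sketch below shows `enumerate`, an `if`/`else` on the value, and `zip`; the list `legs` and the dictionary names are hypothetical and used only for illustration.

```python
keys = ['dog', 'cat', 'bird', 'horse']

# enumerate: map each key to its position in the list.
positions = {k: i for i, k in enumerate(keys)}

# if/else on the value: label each key by the length of the word.
sizes = {k: 'long' if len(k) > 3 else 'short' for k in keys}

# zip: pair each key with a value from a second (hypothetical) list.
legs = [4, 4, 2, 4]
leg_count = {k: n for k, n in zip(keys, legs)}

print(positions)
print(sizes)
print(leg_count)
```

When no transformation or filtering is needed, `dict(zip(keys, legs))` is a shorter way to write the last comprehension.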

## Conclusion

List and dictionary comprehensions are extremely useful for writing compact code. Usually, one can skip the step of creating an empty list or dictionary to append to. However, comprehensions with multiple nested levels are best avoided, because readability degrades quickly.