# Chapter 3. Built-In Data Structures, Functions, and Files

This chapter discusses capabilities built into the Python language that will be used ubiquitously throughout the book. While add-on libraries like pandas and NumPy add advanced computational functionality for larger datasets, they are designed to be used together with Python’s built-in data manipulation tools.

We’ll start with Python’s workhorse data structures: tuples, lists, dictionaries, and sets. Then, we’ll discuss creating your own reusable Python functions. Finally, we’ll look at the mechanics of Python file objects and interacting with your local hard drive.

# 3.1 Data Structures and Sequences

Python’s data structures are simple but powerful. Mastering their use is a critical part of becoming a proficient Python programmer. We start with tuple, list, and dictionary, which are some of the most frequently used built-in types.

## Tuple

A tuple is a fixed-length, immutable sequence of Python objects which, once assigned, cannot be changed. The easiest way to create one is with a comma-separated sequence of values wrapped in parentheses:

In [8]:
tup = (4, 5, 6)

In [10]:
tup

(4, 5, 6)

In many contexts, the parentheses can be omitted, so here we could also have written:

In [13]:
tup = 4, 5, 6

In [15]:
tup

(4, 5, 6)

You can convert any sequence or iterator to a tuple by invoking tuple:

In [18]:
tuple([4, 0, 2])

(4, 0, 2)

In [20]:
tup = tuple('string')

In [22]:
tup

('s', 't', 'r', 'i', 'n', 'g')

Elements can be accessed with square brackets [] as with most other sequence types. 

As in C, C++, Java, and many other languages, sequences are <b>0-indexed</b> in Python:

In [25]:
tup[0]

's'

When you’re defining tuples within more complicated expressions, it’s often necessary to enclose the values in parentheses, as in this example of creating a tuple of tuples:

In [28]:
nested_tup = (4, 5, 6), (7, 8)

In [30]:
nested_tup

((4, 5, 6), (7, 8))

In [32]:
nested_tup[0]

(4, 5, 6)

In [34]:
nested_tup[1]

(7, 8)

While the objects stored in a tuple may be mutable themselves, once the tuple is created it’s not possible to modify which object is stored in each slot:



In [37]:
tup = tuple(['foo', [1, 2], True])

In [39]:
tup[2] = False

TypeError: 'tuple' object does not support item assignment

If an object inside a tuple is mutable, such as a list, you can modify it in place:



In [42]:
tup[1].append(3)

In [44]:
tup

('foo', [1, 2, 3], True)

You can concatenate tuples using the + operator to produce longer tuples:

In [47]:
(4, None, 'foo') + (6, 0) + ('bar',)

(4, None, 'foo', 6, 0, 'bar')

Multiplying a tuple by an integer, as with lists, has the effect of concatenating that many copies of the tuple:

In [50]:
('foo', 'bar') * 4

('foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'bar')

<b>Note that the objects themselves are not copied, only the references to them.</b>
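To make this note concrete, here is a minimal sketch showing that multiplying a tuple repeats references to the same objects rather than copying them. The names `inner` and `repeated` are illustrative only and do not appear elsewhere in this chapter:

```python
# A one-element tuple holding a mutable list, repeated three times
inner = [1, 2]
repeated = (inner,) * 3

# Every slot refers to the very same list object
same_object = all(item is inner for item in repeated)
print(same_object)   # True

# Mutating the list through one slot is visible through every slot
repeated[0].append(3)
print(repeated)      # ([1, 2, 3], [1, 2, 3], [1, 2, 3])
```

This only matters when the repeated elements are mutable; for numbers and strings, shared references are indistinguishable from copies.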

## Unpacking tuples

If you try to assign to a tuple-like expression of variables, Python will attempt to unpack the value on the righthand side of the equals sign:

In [55]:
tup = (4, 5, 6)

In [57]:
a, b, c = tup

In [59]:
b

5

Even sequences with nested tuples can be unpacked:

In [62]:
tup = 4, 5, (6, 7)

In [64]:
a, b, (c, d) = tup

In [66]:
d

7

Using this functionality you can easily swap variable names, a task that in many languages might look like:

In [69]:
tmp = a
a = b
b = tmp

But, in Python, the swap can be done like this:

In [72]:
a, b = 1, 2

In [74]:
a

1

In [76]:
b

2

In [78]:
b, a = a, b

In [80]:
a

2

In [82]:
b

1

A common use of variable unpacking is iterating over sequences of tuples or lists:



In [85]:
seq = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

In [87]:
for a, b, c in seq:
    print(f'a={a}, b={b}, c={c}')

a=1, b=2, c=3
a=4, b=5, c=6
a=7, b=8, c=9


Another common use is returning multiple values from a function. I’ll cover this in more detail later.

There are some situations where you may want to “pluck” a few elements from the beginning of a tuple. There is a special syntax that can do this, <b>*</b>rest, which is also used in function signatures to capture an arbitrarily long list of positional arguments:

In [91]:
values = 1, 2, 3, 4, 5

In [93]:
a, b, *rest = values

In [95]:
a

1

In [97]:
b

2

In [99]:
rest

[3, 4, 5]

This rest bit is sometimes something you want to discard; there is nothing special about the rest name. As a matter of convention, many Python programmers will use the underscore (_) for unwanted variables:

In [102]:
a, b, *_ = values
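The starred name does not have to come last. As a small sketch (the names `first`, `middle`, `last`, `head`, and `tail` are chosen here for illustration), it can sit in the middle of the targets, and it is always bound to a list, which may be empty:

```python
values = 1, 2, 3, 4, 5

# The starred target may appear anywhere, but only once per assignment
first, *middle, last = values
print(first, middle, last)   # 1 [2, 3, 4] 5

# With nothing left over, the starred target is an empty list
head, *tail = (10,)
print(head, tail)            # 10 []
```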

## Tuple methods

Since the size and contents of a tuple cannot be modified, it is very light on instance methods. A particularly useful one (also available on lists) is <b>count</b>, which counts the number of occurrences of a value:

In [106]:
a = (1, 2, 2, 2, 3, 4, 2)

In [108]:
a.count(2)

4

## List

In contrast with tuples, lists are variable length and their contents can be modified in place. Lists are mutable. You can define them using square brackets [] or using the list type function:

In [112]:
a_list = [2, 3, 7, None]

In [114]:
tup = ("foo", "bar", "baz")

In [116]:
b_list = list(tup)

In [118]:
b_list

['foo', 'bar', 'baz']

In [120]:
b_list[1] = "peekaboo"

In [122]:
b_list

['foo', 'peekaboo', 'baz']

Lists and tuples are semantically similar (though tuples cannot be modified) and can be used interchangeably in many functions.

The <b>list</b> built-in function is frequently used in data processing as a way to materialize an iterator or generator expression:

In [125]:
gen = range(10)

In [127]:
gen

range(0, 10)

In [129]:
list(gen)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

## Adding and removing elements

Elements can be appended to the end of the list with the append method:



In [133]:
b_list.append("dwarf")

In [135]:
b_list

['foo', 'peekaboo', 'baz', 'dwarf']

Using <b>insert</b> you can insert an element at a specific location in the list:

In [138]:
b_list.insert(1, "red")

In [140]:
b_list

['foo', 'red', 'peekaboo', 'baz', 'dwarf']

The insertion index is normally between 0 and the length of the list, inclusive; an index past the end of the list simply appends the element.



<blockquote>
Warning

insert is computationally expensive compared with append, because references to subsequent elements have to be shifted internally to make room for the new element. If you need to insert elements at both the beginning and end of a sequence, you may wish to explore collections.deque, a double-ended queue, which is optimized for this purpose and found in the Python Standard Library.
</blockquote>
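As a rough illustration of the warning above, the following sketch uses `collections.deque` to add and remove elements at both ends. The variable names are hypothetical, and the example does not attempt to measure the performance difference:

```python
from collections import deque

queue = deque(["b", "c"])
queue.appendleft("a")   # add at the front without shifting other elements
queue.append("d")       # add at the back, as with a list
print(queue)            # deque(['a', 'b', 'c', 'd'])

first = queue.popleft()  # remove from the front
last = queue.pop()       # remove from the back
print(first, last, list(queue))   # a d ['b', 'c']
```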

The inverse operation to <b>insert</b> is <b>pop</b>, which removes and returns an element at a particular index:

In [145]:
b_list.pop(2)

'peekaboo'

In [147]:
b_list

['foo', 'red', 'baz', 'dwarf']

Elements can be removed by value with <b>remove</b>, which locates the first such value and removes it from the list:

In [150]:
b_list.append("foo")


In [152]:
b_list

['foo', 'red', 'baz', 'dwarf', 'foo']

In [154]:
b_list.remove("foo")

In [156]:
b_list

['red', 'baz', 'dwarf', 'foo']

If performance is not a concern, by using <b>append</b> and <b>remove</b>, you can use a Python list as a set-like data structure (although Python has actual set objects, discussed later).

Check if a list contains a value using the <b>in</b> keyword:

In [160]:
"dwarf" in b_list

True

The keyword <b>not</b> can be used to negate <b>in</b>:



In [163]:
"dwarf" not in b_list

False

Checking whether a list contains a value is a lot slower than doing so with dictionaries and sets (to be introduced shortly), as Python makes a linear scan across the values of the list, whereas it can check the others (based on hash tables) in constant time.
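If you need many membership checks against the same values, one option is to build a set once and test against it. This is only a sketch: `lookup`, `candidates`, and `found` are illustrative names, and for a handful of elements the difference is negligible:

```python
b_list = ["red", "baz", "dwarf", "foo"]

# Build the set once; each later membership test is a hash lookup
lookup = set(b_list)

candidates = ["dwarf", "elf", "red"]
found = []
for name in candidates:
    if name in lookup:
        found.append(name)

print(found)   # ['dwarf', 'red']
```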

## Concatenating and combining lists


Similar to tuples, adding two lists together with <b>+</b> concatenates them:



In [168]:
[4, None, "foo"] + [7, 8, (2, 3)]

[4, None, 'foo', 7, 8, (2, 3)]

If you have a list already defined, you can append multiple elements to it using the <b>extend</b> method:



In [171]:
x = [4, None, "foo"]

In [173]:
x.extend([7, 8, (2, 3)])

In [175]:
x

[4, None, 'foo', 7, 8, (2, 3)]

Note that list concatenation by addition is a comparatively expensive operation, since a new list must be created and the objects copied over. Using extend to append elements to an existing list, especially if you are building up a large list, is usually preferable. Thus, calling extend inside a loop is faster than the concatenative alternative of repeatedly reassigning the result of <b>+</b>.
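The two approaches being compared can be sketched as follows. The names `list_of_lists`, `everything`, and `everything_concat` are assumptions made for this example, and the data is deliberately tiny, so it shows the shape of the code rather than the timing difference:

```python
list_of_lists = [[1, 2], [3, 4], [5]]

# Preferred: grow a single list in place
everything = []
for chunk in list_of_lists:
    everything.extend(chunk)

# Concatenative alternative: builds a brand-new list on every iteration
everything_concat = []
for chunk in list_of_lists:
    everything_concat = everything_concat + chunk

print(everything)                        # [1, 2, 3, 4, 5]
print(everything == everything_concat)   # True
```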

## Sorting


You can sort a list in place (without creating a new object) by calling its sort method:



In [27]:
a = [7, 2, 5, 1, 3]

In [29]:
a.sort()

In [31]:
a

[1, 2, 3, 5, 7]

<b>sort</b> has a few options that will occasionally come in handy. 

One is the ability to pass a secondary <i>sort key</i>—that is, a function that produces a value to use to sort the objects. For example, we could sort a collection of strings by their lengths:

In [36]:
b = ["saw", "small", "He", "foxes", "six"]

In [38]:
b.sort(key=len)

In [40]:
b

['He', 'saw', 'six', 'small', 'foxes']

Soon, we’ll look at the sorted function, which can produce a sorted copy of a general sequence.

## Slicing


You can select sections of most sequence types by using slice notation, which in its basic form consists of start:stop passed to the indexing operator <b>[]</b>:

In [45]:
seq = [7, 2, 3, 7, 5, 6, 0, 1]

In [47]:
seq[1:5]

[2, 3, 7, 5]

Slices can also be assigned with a sequence:

In [50]:
seq[3:5] = [6, 3]

In [52]:
seq

[7, 2, 3, 6, 3, 6, 0, 1]

While the element at the start index is included, the stop index is not included, so that the number of elements in the result is stop - start.

Either the start or stop can be omitted, in which case they default to the start of the sequence and the end of the sequence, respectively:

In [56]:
seq[:5]

[7, 2, 3, 6, 3]

In [58]:
seq[3:]

[6, 3, 6, 0, 1]

Negative indices slice the sequence relative to the end:

In [61]:
seq[-4:]

[3, 6, 0, 1]

In [63]:
seq[-6:-2]

[3, 6, 3, 6]

Slicing semantics takes a bit of getting used to, especially if you’re coming from R or MATLAB. See Figure 3-1 for a helpful illustration of slicing with positive and negative integers. In the figure, the indices are shown at the “bin edges” to help show where the slice selections start and stop using positive or negative indices.

![](pda3_0301.png)

A step can also be used after a second colon to, say, take every other element:

In [68]:
seq[::2]

[7, 3, 3, 0]

A clever use of this is to pass -1, which has the useful effect of reversing a list or tuple:



In [227]:
seq[::-1]

[1, 0, 6, 3, 6, 3, 2, 7]

## Dictionary

The dictionary or <b>dict</b> may be the most important built-in Python data structure. 

In other programming languages, dictionaries are sometimes called hash maps or associative arrays. A dictionary stores a collection of key-value pairs, where key and value are Python objects. Each key is associated with a value so that a value can be conveniently retrieved, inserted, modified, or deleted given a particular key. 

One approach for creating a dictionary is to use curly braces <b>{}</b> and colons to separate keys and values:

In [231]:
empty_dict = {}

In [233]:
d1 = {"a": "some value", "b": [1, 2, 3, 4]}

In [235]:
d1

{'a': 'some value', 'b': [1, 2, 3, 4]}

You can access, insert, or set elements using the same syntax as for accessing elements of a list or tuple:

In [238]:
d1[7] = "an integer"

In [240]:
d1

{'a': 'some value', 'b': [1, 2, 3, 4], 7: 'an integer'}

In [242]:
d1["b"]

[1, 2, 3, 4]

You can check if a dictionary contains a key using the same syntax used for checking whether a list or tuple contains a value:

In [245]:
"b" in d1

True

You can delete values using either the <b>del</b> keyword or the <b>pop</b> method (which simultaneously returns the value and deletes the key):

In [248]:
d1[5] = "some value"

In [250]:
d1

{'a': 'some value', 'b': [1, 2, 3, 4], 7: 'an integer', 5: 'some value'}

In [252]:
d1["dummy"] = "another value"

In [254]:
d1

{'a': 'some value',
 'b': [1, 2, 3, 4],
 7: 'an integer',
 5: 'some value',
 'dummy': 'another value'}

In [256]:
del d1[5]

In [258]:
d1

{'a': 'some value',
 'b': [1, 2, 3, 4],
 7: 'an integer',
 'dummy': 'another value'}

In [260]:
ret = d1.pop("dummy")

In [262]:
ret

'another value'

In [264]:
d1

{'a': 'some value', 'b': [1, 2, 3, 4], 7: 'an integer'}

The <b>keys</b> and <b>values</b> methods give you iterators of the dictionary’s keys and values, respectively.

The order of the keys depends on the order of their insertion, and these functions output the keys and values in the same respective order:

In [267]:
list(d1.keys())

['a', 'b', 7]

In [269]:
list(d1.values())

['some value', [1, 2, 3, 4], 'an integer']

If you need to iterate over both the keys and values, you can use the <b>items</b> method to iterate over the keys and values as 2-tuples:

In [272]:
list(d1.items())

[('a', 'some value'), ('b', [1, 2, 3, 4]), (7, 'an integer')]

You can merge one dictionary into another using the <b>update</b> method:

In [275]:
d1.update({"b": "foo", "c": 12})

In [277]:
d1

{'a': 'some value', 'b': 'foo', 7: 'an integer', 'c': 12}

The update method changes dictionaries in place, so any existing keys in the data passed to update will have their old values discarded.

## Creating dictionaries from sequences

You will occasionally end up with two sequences that you want to pair up element-wise in a dictionary. As a first cut, you might loop over the paired elements and assign each value to its key one at a time.
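A first-cut version of that pairing might look like the sketch below. The names `key_list` and `value_list` and their contents are placeholders invented for this example:

```python
key_list = ["a", "b", "c"]
value_list = [1, 2, 3]

# Pair the sequences element-wise and fill the dictionary one key at a time
mapping = {}
for key, value in zip(key_list, value_list):
    mapping[key] = value

print(mapping)   # {'a': 1, 'b': 2, 'c': 3}
```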

Since a dictionary is essentially a collection of 2-tuples, the dict function accepts a list of 2-tuples:

In [79]:
tuples = zip(range(5), reversed(range(5)))

In [81]:
tuples

<zip at 0x32bbe5640>

In [83]:
mapping = dict(tuples)

In [85]:
mapping

{0: 4, 1: 3, 2: 2, 3: 1, 4: 0}

Later we’ll talk about dictionary comprehensions, which are another way to construct dictionaries.

## Default values

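It’s common to have logic that checks whether a key is present in a dictionary, uses the associated value if it is, and falls back to a default value otherwise.

Thus, the dictionary methods <b>get</b> and <b>pop</b> can take a default value to be returned, so that such an if-else block can be written as a single method call.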

<b>get</b> by default will return <b>None</b> if the key is not present, while <b>pop</b> will raise an exception.
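The following sketch puts the long form and the <b>get</b> form side by side, and also shows the differing behavior of <b>get</b> and <b>pop</b> when no default is given. The names `some_dict`, `key`, and `default_value` are hypothetical:

```python
some_dict = {"a": 1, "b": 2}
key = "z"
default_value = 0

# Long form: check for the key, then fall back to a default
if key in some_dict:
    value = some_dict[key]
else:
    value = default_value

# Equivalent single call
value_via_get = some_dict.get(key, default_value)
print(value, value_via_get)    # 0 0

# Without a default, get returns None for a missing key
missing = some_dict.get(key)
print(missing)                 # None

# pop with a default returns it; pop without one raises KeyError
popped = some_dict.pop(key, default_value)
try:
    some_dict.pop(key)
    raised = False
except KeyError:
    raised = True
print(popped, raised)          # 0 True
```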

With setting values, it may be that the values in a dictionary are another kind of collection, like a list. 

For example, you could imagine categorizing a list of words by their first letters as a dictionary of lists:

In [93]:
words = ["apple", "bat", "bar", "atom", "book"]

In [95]:
by_letter = {}

In [97]:
for word in words:
    letter = word[0]
    if letter not in by_letter:
        by_letter[letter] = [word]
    else:
        by_letter[letter].append(word)

In [99]:
by_letter

{'a': ['apple', 'atom'], 'b': ['bat', 'bar', 'book']}

The <b>setdefault</b> dictionary method can be used to simplify this workflow. 

The preceding for loop can be rewritten as:

In [102]:
by_letter = {}

In [104]:
for word in words:
    letter = word[0]
    by_letter.setdefault(letter, []).append(word)

In [106]:
by_letter

{'a': ['apple', 'atom'], 'b': ['bat', 'bar', 'book']}

The built-in <b>collections</b> module has a useful class, <b>defaultdict</b>, which makes this even easier. 

To create one, you pass a type or function for generating the default value for each slot in the dictionary:

In [109]:
from collections import defaultdict

In [111]:
by_letter = defaultdict(list)

In [113]:
for word in words:
    by_letter[word[0]].append(word)

In [115]:
by_letter

defaultdict(list, {'a': ['apple', 'atom'], 'b': ['bat', 'bar', 'book']})

## Valid dictionary key types

While the values of a dictionary can be any Python object, the <b>keys</b> generally have to be <b>immutable</b> objects like scalar types (int, float, string) or tuples (all the objects in the tuple need to be immutable, too). 

The technical term here is hashability. You can check whether an object is hashable (can be used as a key in a dictionary) with the hash function:

In [326]:
hash("string")

8433698764441582378

In [328]:
hash((1, 2, (2, 3)))

-9209053662355515447

In [330]:
hash((1, 2, [2, 3])) # fails because lists are mutable

TypeError: unhashable type: 'list'

The hash values you see will generally differ from the ones shown here: they depend on the Python version you are using, and string hashes are additionally randomized for each interpreter session by default.
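Since an unhashable object makes hash raise a TypeError, the check can be wrapped in a small helper. The function `is_hashable` below is a hypothetical convenience written for this example, not part of the standard library:

```python
def is_hashable(obj):
    """Return True if obj can be used as a dictionary key."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

print(is_hashable("string"))          # True
print(is_hashable((1, 2, (2, 3))))    # True
print(is_hashable((1, 2, [2, 3])))    # False
```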

To use a list as a key, one option is to convert it to a tuple, which can be hashed as long as its elements also can be:

In [334]:
d = {}

In [336]:
d[tuple([1, 2, 3])] = 5

In [338]:
d

{(1, 2, 3): 5}

## Set

A <b>set</b> is an unordered collection of unique elements. 

A set can be created in two ways: via the set function or via a set literal with curly braces <b>{}</b>:

In [342]:
set([2, 2, 2, 1, 3, 3])

{1, 2, 3}

In [344]:
{2, 2, 2, 1, 3, 3}

{1, 2, 3}

Sets support mathematical set operations like <b>union</b>, <b>intersection</b>, <b>difference</b>, and <b>symmetric difference</b>. 

Consider these two example sets:

In [121]:
a = {1, 2, 3, 4, 5}

In [123]:
b = {3, 4, 5, 6, 7, 8}

The union of these two sets is the set of distinct elements occurring in either set. This can be computed with either the <b>union</b> method or the <b>|</b> binary operator:

In [126]:
a.union(b)

{1, 2, 3, 4, 5, 6, 7, 8}

In [128]:
a | b

{1, 2, 3, 4, 5, 6, 7, 8}

The <b>intersection</b> contains the elements occurring in both sets. The <b>&</b> operator or the <b>intersection</b> method can be used:

In [131]:
a.intersection(b)

{3, 4, 5}

In [133]:
a & b

{3, 4, 5}
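The difference and symmetric difference mentioned earlier work the same way. This short sketch reuses the sets a and b from above; the result names are illustrative, and the results are sorted before printing only so that the displayed order is predictable:

```python
a = {1, 2, 3, 4, 5}
b = {3, 4, 5, 6, 7, 8}

# Elements of a that are not in b
only_in_a = a - b                # same as a.difference(b)
print(sorted(only_in_a))         # [1, 2]

# Elements in exactly one of the two sets
in_one_only = a ^ b              # same as a.symmetric_difference(b)
print(sorted(in_one_only))       # [1, 2, 6, 7, 8]
```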

See Table 3-1 for a list of commonly used set methods.

## Table 3-1. Python set operations

|Function|Alternative syntax|Description|
|---|---|---|
|a.add(x)|N/A|Add element x to set a|
|a.clear()|N/A|Reset set a to an empty state, discarding all of its elements|
|a.remove(x)|N/A|Remove element x from set a|
|a.pop()|N/A|Remove an arbitrary element from set a, raising KeyError if the set is empty|
|a.union(b)|a \| b|All of the unique elements in a and b|
|a.update(b)|a \|= b|Set the contents of a to be the union of the elements in a and b|
|a.intersection(b)|a & b|All of the elements in both a and b|
|a.intersection_update(b)|a &= b|Set the contents of a to be the intersection of the elements in a and b|
|a.difference(b)|a - b|The elements in a that are not in b|
|a.difference_update(b)|a -= b|Set a to the elements in a that are not in b|
|a.symmetric_difference(b)|a ^ b|All of the elements in either a or b but not both|
|a.symmetric_difference_update(b)|a ^= b|Set a to contain the elements in either a or b but not both|
|a.issubset(b)|a <= b|True if the elements of a are all contained in b|
|a.issuperset(b)|a >= b|True if the elements of b are all contained in a|
|a.isdisjoint(b)|N/A|True if a and b have no elements in common|

<blockquote>
Note

If you pass an input that is not a set to methods like union and intersection, Python will convert the input to a set before executing the operation. When using the binary operators, both objects must already be sets.
</blockquote>
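The distinction in the note can be seen in a short sketch. The variable names here are illustrative, and results are sorted before printing so the displayed order is predictable:

```python
a = {1, 2, 3}

# The methods accept any iterable and convert it for you
merged = a.union([3, 4])
common = a.intersection((2, 3, 9))
print(sorted(merged))    # [1, 2, 3, 4]
print(sorted(common))    # [2, 3]

# The operator form requires a set on both sides
try:
    a | [3, 4]
    operator_failed = False
except TypeError:
    operator_failed = True
print(operator_failed)   # True
```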

All of the logical set operations have <i>in-place</i> counterparts, which enable you to replace the contents of the set on the left side of the operation with the result. For very large sets, this may be more efficient:

In [139]:
c = a.copy()

In [141]:
c |= b

In [143]:
c

{1, 2, 3, 4, 5, 6, 7, 8}

In [145]:
d = a.copy()

In [147]:
d &= b

In [149]:
d

{3, 4, 5}

Like dictionary keys, set elements generally must be immutable, and they must be hashable (which means that calling hash on a value does not raise an exception). In order to store list-like elements (or other mutable sequences) in a set, you can convert them to tuples:

In [152]:
my_data = [1, 2, 3, 4]

In [154]:
my_set = {tuple(my_data)}

In [156]:
my_set

{(1, 2, 3, 4)}

You can also check if a set is a <b>subset</b> of (is contained in) or a <b>superset</b> of (contains all elements of) another set:

In [159]:
a_set = {1, 2, 3, 4, 5}

In [161]:
{1, 2, 3}.issubset(a_set)

True

In [163]:
a_set.issuperset({1, 2, 3})

True

Sets are equal if and only if their contents are equal:

In [392]:
{1, 2, 3} == {3, 2, 1}

True

## Built-In Sequence Functions


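Python has a handful of useful sequence functions that you should familiarize yourself with and use at any opportunity.

## enumerate

It’s common when iterating over a sequence to want to keep track of the index of the current item. A do-it-yourself approach would look like: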

In [None]:
index = 0
for value in collection:
   # do something with value
   index += 1

Since this is so common, Python has a built-in function, <b>enumerate</b>, which returns a sequence of (i, value) tuples:

In [None]:
for index, value in enumerate(collection):
   # do something with value
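Here is a concrete version of the pattern, as a sketch with a made-up `collection`. It also uses the optional `start` argument of enumerate, which begins counting from a value other than 0:

```python
collection = ["foo", "bar", "baz"]

# Record the position of each value while iterating
positions = {}
for index, value in enumerate(collection):
    positions[value] = index
print(positions)   # {'foo': 0, 'bar': 1, 'baz': 2}

# Count from 1 instead of 0, e.g., for human-readable numbering
numbered = []
for number, value in enumerate(collection, start=1):
    numbered.append(f"{number}. {value}")
print(numbered)    # ['1. foo', '2. bar', '3. baz']
```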

## sorted


The <b>sorted</b> function returns a new sorted list from the elements of any sequence:

In [169]:
sorted([7, 1, 2, 6, 0, 3, 2])

[0, 1, 2, 2, 3, 6, 7]

In [171]:
sorted("horse race")

[' ', 'a', 'c', 'e', 'e', 'h', 'o', 'r', 'r', 's']

The sorted function accepts the same arguments as the sort method on lists.
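For instance, the `key` and `reverse` arguments work with sorted just as they do with sort. This sketch reuses the strings from the sort example under the illustrative name `words`, and shows that the input list is left untouched:

```python
words = ["saw", "small", "He", "foxes", "six"]

# Longest strings first; ties keep their original relative order
by_length_desc = sorted(words, key=len, reverse=True)
print(by_length_desc)     # ['small', 'foxes', 'saw', 'six', 'He']

# Case-insensitive alphabetical order
alphabetical = sorted(words, key=str.lower)
print(alphabetical)       # ['foxes', 'He', 'saw', 'six', 'small']

# sorted returns a new list; the original is unchanged
print(words)              # ['saw', 'small', 'He', 'foxes', 'six']
```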

## zip

<b>zip</b> “pairs” up the elements of a number of lists, tuples, or other sequences, producing an iterator of tuples that you can materialize with list:

In [175]:
seq1 = ["foo", "bar", "baz"]

In [177]:
seq2 = ["one", "two", "three"]

In [179]:
zipped = zip(seq1, seq2)

In [181]:
list(zipped)

[('foo', 'one'), ('bar', 'two'), ('baz', 'three')]

<b>zip</b> can take an arbitrary number of sequences, and the number of elements it produces is determined by the shortest sequence:

In [184]:
seq3 = [False, True]

In [186]:
list(zip(seq1, seq2, seq3))

[('foo', 'one', False), ('bar', 'two', True)]

A common use of <b>zip</b> is simultaneously iterating over multiple sequences, possibly also combined with <b>enumerate</b>:

In [189]:
for index, (a, b) in enumerate(zip(seq1, seq2)):
    print(f"{index}: {a}, {b}")

0: foo, one
1: bar, two
2: baz, three


## reversed


<b>reversed</b> iterates over the elements of a sequence in reverse order:

In [192]:
list(reversed(range(10)))

[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

<blockquote>
Keep in mind that reversed returns a lazy iterator (like the generators to be discussed in some more detail later), so it does not create the reversed sequence until materialized (e.g., with list or a for loop).
</blockquote>
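One practical consequence, sketched below with illustrative names, is that the object returned by reversed can be consumed only once. If you need the reversed values more than once, materialize them into a list first:

```python
rev = reversed([1, 2, 3])

# The first pass consumes the iterator
first_pass = list(rev)
print(first_pass)    # [3, 2, 1]

# A second pass finds nothing left
second_pass = list(rev)
print(second_pass)   # []
```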

## List, Set, and Dictionary (and Generator) Comprehensions

List <b>comprehensions</b> are a convenient and widely used Python language feature. They allow you to concisely form a new list by filtering the elements of a collection and transforming the elements that pass the filter, all in one concise expression.

They take the basic form:

In [None]:
[expr for value in collection if condition]

This is equivalent to the following for loop:

In [None]:
result = []
for value in collection:
    if condition:
        result.append(expr)

The filter condition can be omitted, leaving only the expression. 

For example, given a list of strings, we could filter out strings with length 2 or less and convert them to uppercase like this:

In [198]:
strings = ["a", "as", "bat", "car", "dove", "python"]

In [200]:
[x.upper() for x in strings if len(x) > 2]

['BAT', 'CAR', 'DOVE', 'PYTHON']

Set and dictionary comprehensions are a natural extension, producing sets and dictionaries in an idiomatically similar way instead of lists.

A dictionary comprehension looks like this:

In [None]:
dict_comp = {key-expr: value-expr for value in collection
             if condition}

A set comprehension looks like the equivalent list comprehension except with curly braces instead of square brackets:

In [None]:
set_comp = {expr for value in collection if condition}

Like list comprehensions, set and dictionary comprehensions are mostly conveniences, but they similarly can make code both easier to write and read. 

Consider the list of strings from before. Suppose we wanted a set containing just the lengths of the strings contained in the collection; we could easily compute this using a set comprehension:

In [451]:
unique_lengths = {len(x) for x in strings}

In [453]:
unique_lengths

{1, 2, 3, 4, 6}

We could also express this more functionally using the <b>map</b> function, introduced shortly:

In [204]:
set(map(len, strings))

{1, 2, 3, 4, 6}

As a simple dictionary comprehension example, we could create a lookup map of these strings for their locations in the list:

In [459]:
loc_mapping = {value: index for index, value in enumerate(strings)}

In [461]:
loc_mapping

{'a': 0, 'as': 1, 'bat': 2, 'car': 3, 'dove': 4, 'python': 5}

## Nested list comprehensions

Suppose we have a list of lists containing some English and Spanish names:

In [206]:
all_data = [["John", "Emily", "Michael", "Mary", "Steven"],
    ["Maria", "Juan", "Javier", "Natalia", "Pilar"]]

Suppose we wanted to get a single list containing all names with two or more a’s in them. 

We could certainly do this with a simple for loop:

In [209]:
names_of_interest = []

In [211]:
for names in all_data:
    enough_as = [name for name in names if name.count("a") >= 2]
    names_of_interest.extend(enough_as)

In [214]:
names_of_interest

['Maria', 'Natalia']

You can actually wrap this whole operation up in a single nested list comprehension, which will look like:

In [217]:
result = [name for names in all_data for name in names
    if name.count("a") >= 2]

In [219]:
result

['Maria', 'Natalia']

At first, nested list comprehensions are a bit hard to wrap your head around. 

The <b>for</b> parts of the list comprehension are arranged according to the order of nesting, and any filter condition is put at the end as before. 

Here is another example where we “flatten” a list of tuples of integers into a simple list of integers:

In [222]:
some_tuples = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

In [224]:
flattened = [x for tup in some_tuples for x in tup]

In [226]:
flattened

[1, 2, 3, 4, 5, 6, 7, 8, 9]

Keep in mind that the order of the for expressions would be the same if you wrote a nested for loop instead of a list comprehension:

In [229]:
flattened = []

for tup in some_tuples:
    for x in tup:
        flattened.append(x)

You can have arbitrarily many levels of nesting, though if you have more than two or three levels of nesting, you should probably start to question whether this makes sense from a code readability standpoint. 

It’s important to distinguish the syntax just shown from a list comprehension inside a list comprehension, which is also perfectly valid:

In [231]:
[[x for x in tup] for tup in some_tuples]

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

This produces a list of lists, rather than a flattened list of all of the inner elements.