Collections (sets, sequences, vectors, matrices, etc.) are central to mathematics, so it is worth learning in more detail how they work in Python.

Python has two basic collection types, namely *sequence* and *map*:

- sequence types represent ordered sequences: ``list``, ``tuple``, ``range`` (discussed below), ``str`` (always immutable in Python), ``bytes`` (*immutable* sequence of bytes), ``bytearray`` (*mutable* sequence of bytes)
- map types represent sets of key-value pairs: ``dict``

Although ``dict`` is the only built-in map type, you are free to create your own. The types within each category share a set of common operations (methods).
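
As a minimal sketch of what creating your own map type could look like, the example below subclasses ``collections.abc.Mapping`` from the standard library. The class name ``SquareMap`` and its behaviour are hypothetical and chosen only for illustration. Defining ``__getitem__``, ``__iter__`` and ``__len__`` is enough; the base class derives ``__contains__``, ``get``, ``keys``, ``values`` and ``items`` from them.

```python
from collections.abc import Mapping


class SquareMap(Mapping):
    """Hypothetical read-only map sending each key n in 0..size-1 to n * n."""

    def __init__(self, size):
        self._size = size

    def __getitem__(self, key):
        # Only the integers 0, 1, ..., size - 1 are valid keys
        if isinstance(key, int) and 0 <= key < self._size:
            return key * key
        raise KeyError(key)

    def __iter__(self):
        # Iterating over a map yields its keys
        return iter(range(self._size))

    def __len__(self):
        return self._size


squares = SquareMap(4)
print(squares[3])             # 9
print(len(squares))           # 4
print(2 in squares)           # True
print(list(squares.items()))  # [(0, 0), (1, 1), (2, 4), (3, 9)]
print(squares.get(10, -1))    # -1
```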

# Sequence types

Sequence types generally feature the following methods:

- ``__add__``, ``__mul__`` (sequence concatenation and repetition respectively; a new object is returned)
- ``__getitem__``, ``__setitem__``, ``__delitem__`` (see examples below)
- ``__contains__``, ``__len__`` (see examples below)
- ``index``, ``count`` (both search for a particular value in the sequence: ``index`` returns the first index where the value is found, or raises a ``ValueError`` if it is not; ``count`` returns the number of times the value occurs)
- ``remove`` (removes the first matching entry)
- ``append`` (appends an element), ``extend`` ("appends" a sequence, but does so "in place", unlike ``__add__``)
- ``insert`` (inserts an element before the given index, e.g. ``list_name.insert(index, element)``)
- ``pop`` (removes and returns the last value in the sequence, or the value at the given index)
- ``reverse``, ``sort`` (reversing and sorting a sequence "in place")

Note, however, that some of the above methods mutate the object on which they are invoked (and hence cannot be used with the immutable sequence types) and some return a new object:

In [1]:
x = [1, 2, 3, 4, 5]
x.__add__([6, 7]) # x + [6, 7]
print(x)
x.extend([6, 7])
print(x)

'Hello'.extend(', World!') # AttributeError: str is immutable, so it has no extend method

[1, 2, 3, 4, 5]
[1, 2, 3, 4, 5, 6, 7]


AttributeError: 'str' object has no attribute 'extend'

We will discuss some of the above methods in class, but know that having an exhaustive knowledge of every method on every built-in type is not necessary. After all, in the real world, knowing what a particular method does is only a Google search away.

As usual, remember that methods whose names are surrounded by double underscores are generally not meant to be called directly; each has an equivalent operator, statement, or built-in function:

In [None]:
x = [1, 2, 3, 4, 5]

x[2] = 'a' # x.__setitem__(2, 'a')
print(x[2]) # x.__getitem__(2)
print(6 in x) # x.__contains__(6)
del x[2] # x.__delitem__(2) (del is a statement, not an
         # operator)
print(len(x)) # x.__len__()

You may wish to experiment with some of the above methods here:

In [9]:
x = [1, 2, 3, 4, 5]
y = x.pop()
print(x)
print(y)

[1, 2, 3, 4]
5
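
As a further starting point for experimenting, the sketch below exercises several of the list methods described earlier in this section. The variable names are arbitrary, and the comments show the state of ``x`` after each step.

```python
x = [3, 1, 4, 1, 5]

first = x.index(1)      # 1: position of the first occurrence of the value 1
howmany = x.count(1)    # 2: the value 1 occurs twice
x.insert(2, 'a')        # x is now [3, 1, 'a', 4, 1, 5]
x.remove(1)             # drops only the first 1: [3, 'a', 4, 1, 5]
popped = x.pop(1)       # removes and returns the item at index 1 ('a')
result = x.sort()       # sorts in place and returns None
print(first, howmany, popped, result)  # 1 2 a None
print(x)                # [1, 3, 4, 5]

try:
    x.index(99)         # 99 is not in the list
except ValueError as err:
    print('ValueError:', err)
```

Note that ``sort`` returns ``None`` rather than the sorted list, which is a common source of bugs when writing ``x = x.sort()``.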


# Map types

Map types (``dict``) share the following methods:

- ``__getitem__``, ``__setitem__``, ``__delitem__``
- ``__contains__`` (searches keys), ``__len__``
- ``clear`` (removes all items)
- ``fromkeys`` (class method, e.g. ``dict.fromkeys(['a', 'b', 'c'], 1)`` returns ``{'a': 1, 'b': 1, 'c': 1}``)
- ``get`` (returns a default value if the key is not found, e.g. ``{'a': 1, 'b': 2}.get('c', 7)``)
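- ``keys``, ``values``, ``items`` (each returns a *view* of the keys, values, or key-value pairs; pass the view to ``list`` to obtain a list, e.g. ``list({'a': 1, 'b': 2}.items())``)
- ``pop`` (removes the given key and returns its value), ``popitem`` (removes and returns a key-value pair)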

You may wish to experiment with some of the above methods here:

In [7]:
print(list({'a': 1, 'b': 2}.items()))

x = {'a': 1, 'b': 2, 'c': 3}
y = x.pop('b')
print(x)
print(y)

print({'a': 1, 'b': 2}.get('c', 7))
{'a': 1, 'b': 2}['c']

[('a', 1), ('b', 2)]
{'a': 1, 'c': 3}
2
7


KeyError: 'c'
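
The remaining methods in the list can be tried in the same way. The sketch below is illustrative only (the variable names are arbitrary); it shows that ``keys`` returns a live view rather than a copy, that ``in`` searches keys rather than values, and that ``popitem`` removes the most recently inserted pair (guaranteed from Python 3.7 onwards).

```python
d = dict.fromkeys(['a', 'b', 'c'], 1)
print(d)                  # {'a': 1, 'b': 1, 'c': 1}

keys = d.keys()           # a view onto d, not a snapshot
d['z'] = 26
keys_after = list(keys)   # the view reflects the new key
print(keys_after)         # ['a', 'b', 'c', 'z']

print('z' in d)           # True  (searches keys)
print(26 in d)            # False (values are not searched)

last = d.popitem()        # removes and returns the last inserted pair
print(last)               # ('z', 26)

d.clear()
print(len(d))             # 0
```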

# Ranges

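A ``range`` is a sequence type we have not previously discussed. Like a ``tuple``, it is immutable, but it only ever holds integers and computes them on demand from its start, stop and step values, so the elements do not need to be stored in memory: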

In [1]:
range(6, 30) # 6, 7,..., 29
range(10) # 0, 1,..., 9
range(-6, -2) # -6, -5,..., -3

y = range(1000000, 2000000) # no need to store all these values in memory
print(y[50])

1000050


We can also pass a "step" to the range constructor:

In [None]:
range(4, 14, 3) # 4, 7, 10, 13
range(5, -5, -2) # 5, 3, 1, -1, -3
range(-26, -48, 5) # empty (not infinite)
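
One way to check the comments above is to convert each range to a list. The sketch below does this and also shows that a range supports the usual read-only sequence operations (``len``, ``in``, indexing) without building the full list; the variable names are arbitrary.

```python
r = range(4, 14, 3)
print(list(r))    # [4, 7, 10, 13]
print(len(r))     # 4
print(10 in r)    # True
print(11 in r)    # False
print(r[-1])      # 13

down = list(range(5, -5, -2))
empty = list(range(-26, -48, 5))
print(down)       # [5, 3, 1, -1, -3]
print(empty)      # []

big = range(1000000, 2000000)
print(len(big))   # 1000000, computed without storing the elements
```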

# Iterators

An *iterator* is an object which wraps a collection type and provides a convenient method of getting the next item:

In [3]:
z = ['a', 'b', 'c']
i = z.__iter__()
j = z.__iter__()

print(i.__next__())
print(i.__next__())

print(j.__next__())

print(i.__next__())
print(i.__next__())

a
b
a
c


StopIteration: 

Some points to keep in mind when using iterators:

- If you create two different iterators over the same object, each keeps track of its own position independently.
- Iterators over maps return keys. For ``dict``, the keys are returned in insertion order (guaranteed from Python 3.7 onwards).
- As usual, we should generally not invoke double-underscore methods directly; instead, we can use the built-in functions ``iter`` and ``next``, for example:

In [None]:
z = ['a', 'b', 'c']
i = iter(z)
print(next(i))
print(next(i))
print(next(i))

- Once a call to ``__next__`` raises the ``StopIteration`` exception, so will all subsequent calls, regardless of whether more items are added to the collection.
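
The points above can be checked with a short sketch. It relies on the optional second argument of the built-in ``next``, which is returned instead of raising ``StopIteration``; the variable names and the ``'done'`` marker are arbitrary choices.

```python
z = ['a', 'b']
i = iter(z)
first, second = next(i), next(i)
print(first, second)          # a b

after_end = next(i, 'done')   # iterator exhausted: default returned
z.append('c')                 # grow the list after exhaustion
still_done = next(i, 'done')  # the exhausted iterator does not resume
print(after_end, still_done)  # done done

# Iterating over a dict yields its keys, in insertion order
keys = list(iter({'b': 1, 'a': 2}))
print(keys)                   # ['b', 'a']
```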

# The for-in loop

Recall that Python does not have a "regular" for loop of the type found in most popular programming languages. Instead, it has a for-in loop of the following form:

In [None]:
for target in iterable: # iterable is an iterator or an object which
                        # implements __iter__
    # body

For example:

In [4]:
for v in [1, 2, 'apple']: # note that "in" is not the __contains__ operator
    print(v)

1
2
apple
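
To connect the for-in loop with iterators, the loop above behaves roughly like the following ``while`` loop. This is a sketch of the idea rather than a description of how the interpreter implements it; the names ``it`` and ``seen`` are introduced only for illustration.

```python
items = [1, 2, 'apple']
seen = []

it = iter(items)          # the loop first asks the iterable for an iterator
while True:
    try:
        v = next(it)      # then repeatedly fetches the next item
    except StopIteration:
        break             # and stops once the iterator is exhausted
    print(v)              # loop body
    seen.append(v)
```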


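*Warning:* in the above code, the variable ``v`` is not local to the loop. It lives in the scope which encloses the loop, so the loop overwrites any existing ``v``, and ``v`` keeps its last value after the loop ends, for example: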

In [5]:
v = 50

for v in [1, 2, 'apple']:
    print(v)
    
print(v)

1
2
apple
apple


# List comprehensions

Frequently, we want to apply the same operation to all the items of a list, for example:

In [6]:
orig = [2, 6, -1, 13]
new = []
for x in orig:
    new.append(x * 2)
    
print(new)

[4, 12, -2, 26]


The above can be accomplished in a more readable manner using Python's *list comprehensions* (which closely resemble set-builder notation in mathematics):

In [None]:
new = [x * 2 for x in orig]

Compare this to mathematical set notation:
$$
    \{2x \mid x \in S\}.
$$

Note that, in the above, the variable ``x`` is local to the list comprehension, so we could even write something like:

In [7]:
x = [2, 6, -1, 13]
new = [x * 2 for x in x] # legal code, but confusing

print(new)

[4, 12, -2, 26]
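
The difference between the scoping of a list comprehension and that of a for-in loop can be seen side by side in the sketch below (Python 3 behaviour; the name ``doubled`` and the string ``'unchanged'`` are arbitrary).

```python
orig = [2, 6, -1, 13]
x = 'unchanged'

doubled = [x * 2 for x in orig]  # this x is local to the comprehension
after_comprehension = x
print(after_comprehension)       # unchanged

for x in orig:                   # this x is the enclosing scope's x
    pass
after_loop = x
print(after_loop)                # 13
```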
