### Container Types

Many data types in Python are **container** types - that is, **collections** of other objects.

These collections allow us to use multiple objects without needing a symbol for each one.

Some collections are **mutable** - we can modify the collection by adding, removing and replacing elements.

Some collections are **immutable** - we **cannot** add, remove or replace elements (but the contained elements may, themselves, be mutable objects).

#### Lists

A `list` is a special type of **collection** that contains varying numbers of elements, where the **order** of the elements **matters**.

This means we have a concept of first element, second element, 10th element, last element, and so on.

(As we'll see later, we also have collections where positional order does **not** matter. Think of a bag of marbles: there is no particular order to them, you just put something in and take something out, and the order in which that happens is neither guaranteed nor important.)

Collections where **positional order** matters are called **sequence types**.

A `list` is a sequence type.

We can create a list object by using a literal with **square brackets** (think of an array in languages such as Java):

In [1]:
a = [1, 2, 3]
print(id(a), type(a), a)

4453183432 <class 'list'> [1, 2, 3]


Sequence types use zero-based indexing (i.e. the first element's index is `0`, the second's is `1`, etc). Sequences are also finite, and therefore have a **length**, i.e. the number of elements in the sequence.

In [2]:
a[0]

1

In [3]:
a[1]

2

In [4]:
a[2]

3

In [5]:
len(a)

3

Since sequences use zero-based indexing, the index of the **last** element of any sequence `a` is `len(a) - 1`:

In [6]:
len(a)

3

In [7]:
a[len(a)-1]

3
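
As a small sketch of the same idea, the `len(a) - 1` rule applies to any sequence type, not just lists. The variable names below are made up for illustration; tuples and strings are covered later in this section:

```python
# The last valid index is always len(seq) - 1, whatever the sequence type.
numbers = [1, 2, 3]        # a list
point = (10, 20, 30)       # a tuple
word = 'abc'               # a string

last_items = []
for seq in (numbers, point, word):
    last_index = len(seq) - 1
    last_items.append(seq[last_index])

print(last_items)  # [3, 30, 'c']
```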

If we try to request an element from a sequence using an index beyond the last index, we will get an `IndexError` exception:

In [8]:
a[3]

IndexError: list index out of range
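
If an out-of-range index is a possibility, one option is to catch the exception. This is only a sketch; the fallback value `None` is an arbitrary choice made for the illustration:

```python
a = [1, 2, 3]

try:
    item = a[3]          # valid indexes are 0, 1 and 2
except IndexError as ex:
    item = None          # arbitrary fallback chosen for this example
    print('IndexError:', ex)

print(item)  # None
```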

Lists are **mutable** sequence types and are also dynamically sized. This means we can add, replace or remove elements from a list.

In [11]:
a = [1, 2, 3]
print(id(a), type(a), a)

4453320392 <class 'list'> [1, 2, 3]


In [12]:
a[0] = 100
print(id(a), type(a), a)

4453320392 <class 'list'> [100, 2, 3]


In [13]:
del a[0]
print(id(a), type(a), a)

4453320392 <class 'list'> [2, 3]


In [14]:
a.append(100)
print(id(a), type(a), a)

4453320392 <class 'list'> [2, 3, 100]


Notice how the memory address of our list did **not** change as we modified it - it remained the same object, but its contents (its **state**) changed (was **mutated**).
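
To contrast mutating a list with creating a new one, here is a small sketch. The variable names are illustrative only:

```python
a = [1, 2, 3]
original_id = id(a)

a.append(4)                       # mutates the existing list object
same_after_append = id(a) == original_id

a = a + [5]                       # builds a NEW list and rebinds the name a
same_after_concat = id(a) == original_id

print(same_after_append, same_after_concat)  # True False
print(a)                                     # [1, 2, 3, 4, 5]
```

Mutation keeps the same object, while concatenation with `+` produces a different object that happens to be bound to the same name.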

We can create an **empty** list using a literal:

In [15]:
a = []
print(type(a), len(a), a)

<class 'list'> 0 []


#### Tuples

For now we can think of tuples as **immutable** lists: they have the same basic functionality as lists, except that they cannot be modified once created.

We can create a tuple object using a literal with parentheses `()`:

In [16]:
a = (10, 20, 30)
print(id(a), type(a), a)

4453294728 <class 'tuple'> (10, 20, 30)


Just like a list, we can recover elements from a tuple using indexes (zero-based):

In [17]:
a[0]

10

In [18]:
len(a)

3

But tuples are **immutable**:

In [19]:
a[0] = 200

TypeError: 'tuple' object does not support item assignment

In [20]:
del a[0]

TypeError: 'tuple' object doesn't support item deletion
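
Immutability applies to the tuple itself, not necessarily to the objects it contains. As a small illustration (the names are invented for this example), a tuple holding a list still allows that inner list to be mutated:

```python
inner = [1, 2]
t = (inner, 'x')

t[0].append(3)        # mutates the list, not the tuple
print(t)              # ([1, 2, 3], 'x')

try:
    t[0] = [9, 9]     # replacing an element of the tuple is not allowed
except TypeError as ex:
    print('TypeError:', ex)
```

The tuple still refers to the same two objects as before; only the state of the list changed.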

We can create an empty tuple using a literal:

In [21]:
a = ()
print(type(a), len(a), a)

<class 'tuple'> 0 ()


We have to be a bit careful though when creating a tuple with a single element.

Since `()` is also used to enclose expressions, writing this does not create a tuple!

In [22]:
a = (10)
print(type(a), a)

<class 'int'> 10


In Python, it is the commas, not the parentheses, that actually define a tuple literal.

In [23]:
a = 10, 20, 30
print(type(a), a)

<class 'tuple'> (10, 20, 30)


We use the parentheses in situations where there may be ambiguity, or where we clearly need to delimit the tuples, for example in a list of tuples:

In [24]:
a = [(1, 2), (3, 4)]

In [25]:
a

[(1, 2), (3, 4)]

So to create a tuple with a single element we can write:

In [26]:
a = 1,
print(type(a), a)

<class 'tuple'> (1,)


Or to be more explicit:

In [27]:
a = (1,)
print(type(a), a)

<class 'tuple'> (1,)


#### Strings

Strings are actually sequence types too - they are simply a sequence of characters, where the order of the characters matters.

String objects can be created using a literal with **either** single `'` or double `"` quotes:

In [28]:
a = 'hello world'
b = "hello world"
print(id(a), type(a), a)
print(id(b), type(b), b)

4453319088 <class 'str'> hello world
4452300528 <class 'str'> hello world


This is useful if your string literal needs to contain a quote (single or double) character.

For example:

In [29]:
a = "O'Hare"
print(a)

O'Hare


or:

In [30]:
a = 'The book "Fluent Python" is a great Python book!'
print(a)

The book "Fluent Python" is a great Python book!


If you need both single and double quotes, then you will need to use an **escape** sequence:

In [31]:
a = 'I read the book "Fluent Python" while waiting in O\'Hare'
print(a)

I read the book "Fluent Python" while waiting in O'Hare


Two other very useful escape sequences are `\n` (newline) and `\t` (tab):

In [32]:
a = "I'm a lumberjack\tand I'm OK\nI sleep all night\tand I work all day."
print(a)

I'm a lumberjack	and I'm OK
I sleep all night	and I work all day.


Strings are therefore sequence types, and just like any sequence type, they are indexable (zero-based index):

In [33]:
a = 'abcdefg'

In [34]:
a[0]

'a'

In [35]:
a[1]

'b'

They have a length:

In [36]:
len(a)

7

But just like tuples, strings are **immutable**. So once a string has been created you cannot modify the contents of that string. You can extract portions of it and form a new string, but you cannot modify the string object itself.

In [37]:
a

'abcdefg'

In [38]:
a[0] = 'z'

TypeError: 'str' object does not support item assignment

In [39]:
del a[0]

TypeError: 'str' object doesn't support item deletion
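
To "change" a string we build a new one instead. A minimal sketch, using slicing (`a[1:]` means everything from index 1 onwards) to extract a portion of the original:

```python
a = 'abcdefg'
b = 'z' + a[1:]       # new string: 'z' followed by everything after a[0]

print(a)              # abcdefg  (the original is untouched)
print(b)              # zbcdefg
print(a is b)         # False - two distinct string objects
```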

#### Sets

A `set` is another collection type in Python. But just as in mathematics, a set does not have any implicit order among the elements it contains. Also, just like in mathematics, elements in a set are guaranteed to be unique.

You can create a `set` using braces `{}`:

In [40]:
a = {1, 'a', 10.5}
print(id(a), type(a), a)

4428410952 <class 'set'> {1, 10.5, 'a'}


As you can see the order in which the set elements were printed did not match the order in which they were defined in the literal. This is **really important** to remember - order of elements in sets is **not** guaranteed (at least as of Python 3.7).

So sets are collection types, but are **not** sequence types:

In [41]:
a[0]

TypeError: 'set' object does not support indexing

They **are** mutable though:

In [42]:
print(id(a), type(a), a)
a.add(False)
print(id(a), type(a), a)

4428410952 <class 'set'> {1, 10.5, 'a'}
4428410952 <class 'set'> {False, 1, 10.5, 'a'}


In [43]:
a.remove('a')
print(id(a), type(a), a)

4428410952 <class 'set'> {False, 1, 10.5}


Also, since elements in sets must be unique, trying to add the same element to the set again will simply be **ignored** by Python:

In [44]:
a = {1, 2, 3}
print(id(a), type(a), a)

4449014504 <class 'set'> {1, 2, 3}


In [45]:
a.add(1)
print(id(a), type(a), a)

4449014504 <class 'set'> {1, 2, 3}


In fact if we create a set using this literal:

In [46]:
a = {1, 1, 2, 2, 3, 3}
print(a)

{1, 2, 3}


you can see that the duplicates were removed.

This can actually be very useful when trying to find all the **unique** elements of a sequence type (which does allow for repeated elements).

For example, to find all the unique elements in a list:

In [47]:
l = [1, 1, 2, 2, 3, 3]
print(l)

[1, 1, 2, 2, 3, 3]


In [48]:
s = set(l)
print(s)

{1, 2, 3}


It works the same way with a tuple, as well as strings:

In [49]:
l = "I'm a lumberjack and I'm OK"
print(len(l))

27


In [50]:
s = set(l)
print(len(s), s)

17 {'j', 'K', 'm', ' ', "'", 'b', 'c', 'r', 'd', 'a', 'l', 'n', 'k', 'I', 'u', 'e', 'O'}


Which means that to quickly count the number of unique characters in a string, we can do this:

In [51]:
len(set(l))

17
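
Because a set does not preserve the order of the original sequence, one way to get the unique elements back in a predictable order is to sort them. A sketch, assuming the elements can be compared with each other:

```python
l = [3, 1, 2, 3, 1, 2]

unique = set(l)                  # duplicates dropped, order not guaranteed
unique_sorted = sorted(unique)   # a new list, in ascending order

print(len(unique))               # 3
print(unique_sorted)             # [1, 2, 3]
```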

Since sets are not indexable, how do we *get* an element from a set? Think of it as a bag of marbles where you blindly reach in and pop one marble from the bag at a time.

In fact, that's exactly what we have for sets. The `pop()` method removes one item from the set and returns it:

In [52]:
a = {1, 'a', 10.5}

In [53]:
popped = a.pop()
print(popped, a)

1 {10.5, 'a'}


In [54]:
popped = a.pop()
print(popped, a)

10.5 {'a'}


In [55]:
popped = a.pop()
print(popped, a)

a set()


At this point we have an **empty** set, and trying to pop another element will result in an exception:

In [56]:
a.pop()

KeyError: 'pop from an empty set'

Notice the exception: `KeyError`. Why this weird exception that uses **key**? We'll see why next.
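
One way to avoid popping from an empty set is to check whether it still has elements first. A sketch, relying on the fact that an empty set evaluates to `False` in a condition; the order in which the elements come out is arbitrary:

```python
a = {1, 'a', 10.5}
popped_items = []

while a:                         # stops once the set is empty
    popped_items.append(a.pop())

print(len(popped_items), a)      # 3 set()
```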

Note that we cannot create an empty set using a literal notation (we'll see why in a minute).

Instead we have to use the `set()` function with no arguments:

In [57]:
a = set()
print(type(a), len(a), a)

<class 'set'> 0 set()


#### Dictionaries

Dictionaries are also collections of objects. But instead of containing a collection of single objects, they contain collections of **pairs** of elements. The first item of the pair is called the **key**, and the second is called the **value**.

One restriction is that each **key** in a dictionary must be **unique** - just like sets.

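Theoretically the order of keys in a dictionary is, just as with sets, not guaranteed. (In practice this is no longer true: starting with Python 3.7, dictionaries are guaranteed to preserve insertion order, and CPython 3.6 already did so as an implementation detail - but we'll look at that later.)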

In fact, sets and dictionaries are related (they are built using the same basic structural ideas) - think of sets as dictionaries that contain keys only (no associated values).

Dictionaries can be created using literals with braces (just like sets!), `{}`, except that the elements in the dictionary literal are no longer just single elements, but pairs of elements separated by a colon `:`, in `key:value` fashion:

In [58]:
d = {'a': 1, 'b': 2, 'c': 3}
print(id(d), type(d), d)

4453314352 <class 'dict'> {'a': 1, 'b': 2, 'c': 3}


Dictionaries, like sets, are containers, but not sequence types - there is no concept of a first or second element, and so on. We still have the concept of number of elements (length), just not positional ordering.

In [59]:
len(d)

3

Just as with sets, elements (pairs now) can be **popped** from the dictionary using the `popitem()` method:

In [60]:
popped = d.popitem()
print(popped, type(popped), d)

('c', 3) <class 'tuple'> {'a': 1, 'b': 2}


Notice how the popped item returns the key/value pair as a `tuple`.
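
As an aside, in Python 3.7 and later `popitem()` removes pairs in last-in, first-out order, so the most recently inserted pair comes out first. A small sketch:

```python
d = {'a': 1, 'b': 2, 'c': 3}

first = d.popitem()      # ('c', 3) - the last pair inserted
second = d.popitem()     # ('b', 2)

print(first, second, d)  # ('c', 3) ('b', 2) {'a': 1}
```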

Dictionaries also support a `pop()` method, but in this case we can specify the item to pop by specifying the **key**:

In [61]:
print(d)

{'a': 1, 'b': 2}


In [62]:
popped = d.pop('a')
print(popped, type(popped), d)

1 <class 'int'> {'b': 2}


Notice how the `pop()` method returns **just the value**. This makes sense: we already know what the **key** is, since we specified it.

At this point our dictionary has a single item left, so one more `popitem()` call empties it:

In [63]:
d.popitem()

('b', 2)

If we call `popitem()` again on the now-empty dictionary, we get a `KeyError` exception - the same exception we got when we tried to `pop()` an item from an empty set.

The same thing will happen if we try to `pop` a non-existent key:

In [64]:
d = {'a': 1, 'b': 2}

In [65]:
d.pop('c')

KeyError: 'c'

We can also retrieve the value associated with a key without removing the element from the dictionary - in fact this is probably more frequently used. Just like we used square brackets `[]` with sequence types, specifying an index number, here we also use `[]` but specify a key instead of an index number:

In [66]:
d = {'a': 1, 'b': 2}

In [67]:
value = d['a']
print(value, d)

1 {'a': 1, 'b': 2}


As you can see our dictionary was not mutated, and we recovered the value associated to the key we specified.

If we try to recover the value for a non-existent key, we get, as you probably guessed, a `KeyError` exception:

In [68]:
d['z']

KeyError: 'z'

Sometimes we want to specify a default value to return if the key does not exist. We can do so by using the `get()` method instead.
By default the `get()` method will return the `None` object if the key is not found in the dictionary:

In [69]:
value = d.get('z')
print(type(value), value)

<class 'NoneType'> None


But we can specify our own default value if we prefer:

In [70]:
value = d.get('z', 'N/A')
print(value, type(value))

N/A <class 'str'>


Dictionaries can also be mutated, removing, replacing or adding elements. Of course, with replacement, we really mean replacing the associated value, while adding or deleting elements applies to the key/value pair as a whole.

Replacement is really easy:

In [71]:
d = {'a': 1, 'b': 2}
print(id(d), d)

4453400864 {'a': 1, 'b': 2}


In [72]:
d['a'] = 3.14
print(id(d), d)

4453400864 {'a': 3.14, 'b': 2}


Notice how the memory address of `d` did not change, but the value associated to the key `a` **did** change.

To add a key/value pair to the dictionary, we simply assign a value to a non-existent key:

In [73]:
d['z'] = 'new item'

In [74]:
print(id(d), d)

4453400864 {'a': 3.14, 'b': 2, 'z': 'new item'}


Once again, notice how the memory address of `d` did not change, but the collection was mutated (its contents changed).
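
Combining item assignment with `get()` and a default gives a common counting pattern. The sketch below counts how often each character appears in a string; the variable names are just for illustration:

```python
text = 'abracadabra'
counts = {}

for ch in text:
    # Start from 0 when the character has not been seen yet
    counts[ch] = counts.get(ch, 0) + 1

print(counts)  # {'a': 5, 'b': 2, 'r': 2, 'c': 1, 'd': 1}
```

The first time a character is seen its key does not exist yet, so `get()` returns `0` and the assignment adds the key; on later occurrences the assignment replaces the existing value.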

We can delete an item using the `pop` method we saw earlier, but that method returns the value of the item that was removed. Sometimes we want to simply delete the item without caring about the value. We can do this using `del` just as we saw with lists:

In [75]:
print(id(d), d)
del d['z']
print(id(d), d)

4453400864 {'a': 3.14, 'b': 2, 'z': 'new item'}
4453400864 {'a': 3.14, 'b': 2}


As a side note, just as `get` lets us specify a default if the key is not found, the `pop` method also allows us to specify a default return value if the key does not exist. (Unlike `get`, however, `pop` does not fall back to `None` when no default is given - we just get an exception.)

In [76]:
print(id(d), d)
value = d.pop('z', 'Not found')
print(value, d)

4453400864 {'a': 3.14, 'b': 2}
Not found {'a': 3.14, 'b': 2}
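
The difference between the two forms of `pop` can be sketched side by side. The `raised` flag is only there to record what happened in this example:

```python
d = {'a': 3.14, 'b': 2}

with_default = d.pop('z', 'Not found')   # key missing, default returned

try:
    d.pop('z')                           # key missing, no default given
    raised = False
except KeyError:
    raised = True

print(with_default, raised, d)  # Not found True {'a': 3.14, 'b': 2}
```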


We can create an empty dictionary using literals:

In [77]:
d = {}
print(type(d), len(d), d)

<class 'dict'> 0 {}


You should now understand why we cannot use `{}` to create an empty set - this would actually create an empty `dict`.
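
A quick check of the two empty containers makes the difference concrete (the variable names are illustrative):

```python
empty_dict = {}        # braces with nothing inside give a dict
empty_set = set()      # an empty set needs the set() function

print(type(empty_dict).__name__)   # dict
print(type(empty_set).__name__)    # set
print(len(empty_dict), len(empty_set))  # 0 0
```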