# Python compound types

Python also comes with [built-in types](https://docs.python.org/3/library/stdtypes.html) for groups of objects. These are the ones that will be mentioned in this tutorial:

 * list
 * tuple
 * set
 * dict
 
Any of the above types can hold several other objects, which do not need to be of the same type (e.g. a list can hold an integer in its first position and a string in its second position). Let's have a deeper look at them:

## Sequence types: lists and tuples

A sequence is an **ordered group of objects**. As the elements of a sequence keep their order, they are accessible by index. It is also possible to retrieve the length of the sequence (the number of elements that it contains), and to check whether an element is present in it.

In [71]:
print("This is an empty list:" , [], type([]))
print("This is an empty tuple:", (), type(()))
my_list = [1, 2, 3]
print("my_list:", my_list)
print("Is 1 in my_list?", 1 in my_list)
print("Isn't 3 in my list?", 3 not in my_list)
print("length of my_list:", len(my_list))

This is an empty list: [] <class 'list'>
This is an empty tuple: () <class 'tuple'>
my_list: [1, 2, 3]
Is 1 in my_list? True
Isn't 3 in my list? False
length of my_list: 3


The main difference between lists and tuples is **mutability**. Lists are mutable, but tuples are immutable:

In [72]:
# Lists are mutable
my_list = [1, 2, 3]
print("my_list before modification:", my_list)
my_list[0] = "uno"
print("my_list after modification:", my_list)

# Tuples are immutable
my_tuple = (1, 2, 3)
try:
    my_tuple[0] = "uno"
except TypeError as type_error:
    print(type_error)

my_list before modification: [1, 2, 3]
my_list after modification: ['uno', 2, 3]
'tuple' object does not support item assignment


If you know that the elements of your sequence will not change, you should use a tuple for two reasons:

 * It expresses your intent: this sequence is not meant to change
 * It may bring small performance benefits: Python can optimize when it knows that the data will not change
 
**Watch out!** Although tuples are usually written with parentheses, it is the comma that creates a tuple, so building single-element tuples can be tricky, especially when that single element is itself a sequence:

In [73]:
print("this is a tuple of six single-character strings:", tuple("banana"))
print("this is a tuple with a single element:", tuple(("banana",)))

this is a tuple of six single-character strings: ('b', 'a', 'n', 'a', 'n', 'a')
this is a tuple with a single element: ('banana',)
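
The same pitfall shows up with literals. As a minimal sketch (the variable names are illustrative only), parentheses alone just group an expression, while the trailing comma is what builds the tuple:

```python
# Parentheses alone do not create a tuple; the trailing comma does.
not_a_tuple = ("banana")       # just a string wrapped in parentheses
single = ("banana",)           # a one-element tuple
also_single = "banana",        # the parentheses are optional

print(type(not_a_tuple))       # <class 'str'>
print(type(single))            # <class 'tuple'>
print(single == also_single)   # True
print(len(single))             # 1
```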


### Sequence operations

Since lists and tuples are both sequences, they support the same set of operations, except for those that modify the sequence, which only lists allow. Some of the common operations are:

 * **len**: retrieve length of the sequence
 * **in**: check if an element is part of the sequence
 * **index**: know the position of an element in the sequence
 * **+**: concatenation of sequences
 * **S\*N**: repetition, that is, the concatenation of _N_ copies of the sequence _S_
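
A short sketch of the operations listed above, applied to both a list and a tuple (the names `letters` and `numbers` are made up for this example):

```python
# The same operations work on lists and tuples alike.
letters = ["a", "b", "c", "b"]
numbers = (1, 2, 3)

print(len(letters))          # 4
print("b" in letters)        # True
print(letters.index("b"))    # 1 (position of the first match)
print(letters + ["d"])       # ['a', 'b', 'c', 'b', 'd']
print(numbers + (4,))        # (1, 2, 3, 4)
print(numbers * 2)           # (1, 2, 3, 1, 2, 3)

# index raises ValueError when the element is not present
try:
    numbers.index(99)
except ValueError as value_error:
    print("ValueError:", value_error)
```

Note that `+` and `*` return new sequences; neither `letters` nor `numbers` is modified.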

### Sequence slicing

As previously mentioned, sequences are accessible by index. We can use an index to retrieve an element of the sequence, using either positive or negative indices:

In [74]:
my_list = ["one", 2, "three", ["nested_four"], 5]
# Accessing by positive index: 0 to (len-1)
print(my_list[0])
print(my_list[4])
# Accessing by negative index: (-len) to -1
print(my_list[-5])
print(my_list[-1])

one
5
one
5


Python also supports a powerful mechanism to retrieve sub-sequences from a sequence: _slicing_. This works by specifying a range of indices separated by a colon, and Python will retrieve the sequence from the first index, up to (but not including) the second index. Let's see an example:

In [75]:
my_list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print("my_list is:", my_list)
print("First five elements of my_list:", my_list[0:5])
print("Last five elements of my_list:", my_list[5:10])

# When working with the first or last elements, we don't need to specify them
print("First five elements of my_list:", my_list[:5])
print("Last five elements of my_list:", my_list[5:])

# We can get arbitrary elements as first/last
print("Intermediate seven elements:", my_list[-8:9])
print("Get the whole list:", my_list[:])

my_list is: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
First five elements of my_list: [0, 1, 2, 3, 4]
Last five elements of my_list: [5, 6, 7, 8, 9]
First five elements of my_list: [0, 1, 2, 3, 4]
Last five elements of my_list: [5, 6, 7, 8, 9]
Intermediate seven elements: [2, 3, 4, 5, 6, 7, 8]
Get the whole list: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


Slicing is even more powerful, as it is possible to specify a third parameter: the _step_. Imagine that we want to get only the even numbers from _my\_list_:

In [76]:
my_list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print("my_list is:", my_list)
print("even numbers:", my_list[0:10:2])

# If we consider the whole sequence, we can skip the indexes
print("even numbers again:", my_list[::2])

# No need to always use the whole sequence
print("odd numbers in the last half:", my_list[5::2])

my_list is: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
even numbers: [0, 2, 4, 6, 8]
even numbers again: [0, 2, 4, 6, 8]
odd numbers in the last half: [5, 7, 9]
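
The step can also be negative, in which case the slice walks the sequence backwards. A small sketch, reusing the same `my_list` (the other names are illustrative):

```python
my_list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# A step of -1 over the whole sequence returns a reversed copy
reversed_list = my_list[::-1]
print(reversed_list)     # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

# From index 9 down to (but not including) index 0, two at a time
odd_descending = my_list[9:0:-2]
print(odd_descending)    # [9, 7, 5, 3, 1]

# Slicing always builds a new list; the original is untouched
print(my_list)           # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```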


### Strings are sequences too!

Maybe you remember the definition of the _str_ object: _an immutable sequence of Unicode code points_. Strings are just a restricted form of sequence, which only allows one data type for its elements. Due to their widespread use, they also have a different way of being printed.

However, this means that you can use any of the sequence operators with strings:

In [77]:
my_str = "banana"
print("ana in banana?:", "ana" in my_str)
print("length of banana:", len(my_str))
print("slicing:", my_str[::2])

ana in banana?: True
length of banana: 6
slicing: bnn
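
Because strings are immutable, they behave like tuples rather than lists when it comes to modification. A minimal sketch (variable names are illustrative):

```python
my_str = "banana"

# Item assignment fails, just as it does for tuples
try:
    my_str[0] = "B"
except TypeError as type_error:
    print(type_error)

# The remaining sequence operations return new strings or values
first_n = my_str.index("n")
print(first_n)         # 2
print(my_str + "!")    # banana!
print(my_str * 2)      # bananabanana
print(my_str)          # banana (unchanged)
```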


### Other sequences

There are other sequence types, such as _range_ or _bytes_. For more information check the [Python documentation](https://docs.python.org/3/library/stdtypes.html#sequence-types-list-tuple-range).
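
As a brief illustration, both types support the sequence operations seen so far (the values below are arbitrary):

```python
# range: an immutable sequence of numbers, computed on demand
numbers = range(0, 10, 2)        # 0, 2, 4, 6, 8
print(len(numbers))              # 5
print(4 in numbers)              # True
print(numbers[1])                # 2
print(list(numbers[1:3]))        # [2, 4]

# bytes: an immutable sequence of integers between 0 and 255
data = bytes([72, 105])
print(data)                      # b'Hi'
print(data[0])                   # 72
print(len(data))                 # 2
```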

## Sets and frozensets

A set is a **mutable**, **unordered** group of **unique** elements, which means that adding the same element to a set twice leaves only one copy of it in the set.

Being an unordered collection, sets do not record element position or order of insertion. Accordingly, sets do not support indexing, slicing, or other sequence-like behavior.

A frozenset is an **immutable** set: once created, elements cannot be added to it or removed from it.

In [78]:
my_set = {"one", "two", "three", "four", "one"}
print("my_set:", my_set)
my_set.add("two")
print("my_set after adding 'two':", my_set)
my_set.add("five")
print("my_set after adding 'five':", my_set)

# Frozenset cannot be changed
my_frozenset = frozenset(["uno", "dos", "tres", "cuatro", "uno"])
try:
    my_frozenset.add("cinco")
except AttributeError as attribute_error:
    print(attribute_error)

my_set: {'four', 'one', 'two', 'three'}
my_set after adding 'two': {'four', 'one', 'two', 'three'}
my_set after adding 'five': {'four', 'one', 'five', 'two', 'three'}
'frozenset' object has no attribute 'add'
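
The lack of sequence-like behavior mentioned above can be checked directly. A minimal sketch (names are illustrative), which also shows a common use of sets, removing duplicates from a list:

```python
my_set = {"one", "two", "three"}

# Sets have no positions, so indexing raises TypeError
try:
    my_set[0]
except TypeError as type_error:
    print(type_error)

# Membership tests and len still work
print("one" in my_set)     # True
print(len(my_set))         # 3

# Building a set from a list drops the duplicates;
# sorted() is used here only to get a predictable order
unique_values = set([1, 2, 2, 3, 3, 3])
print(sorted(unique_values))   # [1, 2, 3]
```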


### Hashable elements

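How do sets know when two elements are the same? An object that is to be inserted into a set must be **[hashable](https://docs.python.org/3/glossary.html#term-hashable)**: the set uses the hash value, together with an equality comparison, to detect duplicates. Immutable built-in types (int, float, str, frozenset) are all hashable; a tuple is hashable as long as all of its elements are.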

As you probably guessed already, to make a set of sets, the inner ones need to be converted to frozensets.

In [79]:
try:
    my_set = {"one", ["a", "list", "is", "not", "hashable"]}
except TypeError as type_error:
    print(type_error)
    
my_set = {"one", ("tuples", "are", "hashable")}
print(my_set)


unhashable type: 'list'
{'one', ('tuples', 'are', 'hashable')}
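
The set-of-sets case mentioned above can be sketched in the same way (the name `nested` is illustrative):

```python
# A plain set is mutable, hence unhashable, so it cannot be an element
try:
    nested = {{1, 2}, {3, 4}}
except TypeError as type_error:
    print(type_error)          # unhashable type: 'set'

# frozensets are hashable, so they can be elements of another set
nested = {frozenset({1, 2}), frozenset({3, 4}), frozenset({2, 1})}
print(len(nested))                     # 2 (the first and third are equal)
print(frozenset({1, 2}) in nested)     # True
```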


### Set operations

Sets are quite convenient for [mathematical set operations](https://docs.python.org/3/library/stdtypes.html#set-types-set-frozenset). Here are some of the most relevant ones. Note that they return a new set without modifying the original ones.

 * **union**: Get the elements present in any of the sets
 * **intersection**: Get the elements present in both sets
 * **difference**: Get the elements of one set not present in the other one
 * **symmetric_difference**: Get the elements present in one of the sets, but not in both
 * **issubset**: Check if the set is contained in the other set
 * **issuperset**: Check if the set contains the other set

In [80]:
a = {1, 2, 3, 4, 5, 6}
b = {2, 4, 6, 8}
print("a.union(b):", a.union(b))
print("a.intersection(b):", a.intersection(b))
print("a.difference(b):", a.difference(b))
print("a.symmetric_difference(b):", a.symmetric_difference(b))
print("a.issubset(b):", a.issubset(b))
print("a.issuperset(b):", a.issuperset(b))
b.remove(8)
print("a.issuperset(b):", a.issuperset(b))

# These operations support multiple parameters, and even other types:
b.add(8)
c = [6, 7, 9]
d = tuple((0,))
print("a.union(b, c, d):", a.union(b, c, d))

a.union(b): {1, 2, 3, 4, 5, 6, 8}
a.intersection(b): {2, 4, 6}
a.difference(b): {1, 3, 5}
a.symmetric_difference(b): {1, 3, 5, 8}
a.issubset(b): False
a.issuperset(b): False
a.issuperset(b): True
a.union(b, c, d): {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
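
Most of these methods also have an operator form. The sketch below compares the two forms, and contrasts them with `update`, which modifies the set in place. Unlike the methods, the operators require both operands to be sets:

```python
a = {1, 2, 3, 4, 5, 6}
b = {2, 4, 6, 8}

# Operator equivalents of the methods used above
print((a | b) == a.union(b))                   # True
print((a & b) == a.intersection(b))            # True
print((a - b) == a.difference(b))              # True
print((a ^ b) == a.symmetric_difference(b))    # True
print((a <= b) == a.issubset(b))               # True
print((a >= b) == a.issuperset(b))             # True

# None of the above modified the original sets
print(sorted(a))    # [1, 2, 3, 4, 5, 6]

# Operators do not accept other iterables such as lists
try:
    a | [7, 9]
except TypeError as type_error:
    print(type_error)

# update() is the in-place counterpart of union()
a.update(b)
print(sorted(a))    # [1, 2, 3, 4, 5, 6, 8]
```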


## Mapping types: dictionaries

A mapping type maps [hashable](https://docs.python.org/3/glossary.html#term-hashable) types to arbitrary objects. These are typically named _key_ and _value_.

Like the rest of the compound types mentioned here, dictionaries are not restricted to a single type: both keys and values can be of different types. Like sets, dictionaries do not support slicing or other sequence-like operations. Before Python 3.7 they did not record the order of insertion either (which is why the outputs below may list keys in a different order than they were inserted); from Python 3.7 onwards, dictionaries preserve insertion order.

Dictionaries support a special form of indexing, which instead of an integer uses a _key_.

In [81]:
my_dict = {1: "one", 2: "two", 3: "three", "four": 4}
print("my_dict:", my_dict)

# To add an element we can just assign it
my_dict["five"] = 5
print("my_dict after adding key 'five':", my_dict)

# Retrieving an element by key
print('my_dict["four"]:', my_dict["four"])

# Removing an element
del my_dict[1]
print("my_dict after removing key 1:", my_dict)

my_dict: {'four': 4, 1: 'one', 2: 'two', 3: 'three'}
my_dict after adding key 'five': {'four': 4, 1: 'one', 2: 'two', 3: 'three', 'five': 5}
my_dict["four"]: 4
my_dict after removing key 1: {'four': 4, 2: 'two', 3: 'three', 'five': 5}


### Dictionary operations

Check the [Python documentation](https://docs.python.org/3/library/stdtypes.html#mapping-types-dict) for a complete list of the methods supported by dictionaries. Here are some of the most common ones:

 * **in**: Check if an element is a key in the dictionary
 * **keys**: Retrieve a view of the keys
 * **values**: Retrieve a view of the values
 * **items**: Retrieve a view of the key-value tuples
 * **pop**: Remove a key from the dictionary, and return its value
 * **clear**: Remove all data from the dictionary
 * **get**: Retrieve an element from the dictionary, returning a default value instead of raising an exception when the key is missing

The _keys()_ and _values()_ methods return their elements in the dictionary's iteration order, which is not sorted (and, before Python 3.7, is not the insertion order either). However, the two are guaranteed to correspond as long as the dictionary is not modified in between: the _N_-th element of _keys()_ is the key for the _N_-th element of _values()_.

Accessing the dictionary with a key that does not exist raises a _KeyError_ exception. This can be easily prevented with the _in_ operator or the _get_ method, depending on whether we need to retrieve the value or not. The _get_ method accepts a default value to return when the key is not found (_None_ by default).

In [82]:
my_dict = {1: "one", 2: "two", 3: "three"}
print("3 in my_dict?", 3 in my_dict)
print("three in my_dict?", "three" in my_dict)
print("4 in my_dict?", 4 in my_dict)
try:
    print("accessing by key - 4:", my_dict[4])
except KeyError as key_error:
    print("Got KeyError for key:", key_error)
    
print("accessing with get (no default) - 4:", my_dict.get(4))
print("accessing with get (default = 'four') - 4:", my_dict.get(4, "four"))

3 in my_dict? True
three in my_dict? False
4 in my_dict? False
Got KeyError for key: 4
accessing with get (no default) - 4: None
accessing with get (default = 'four') - 4: four
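
The remaining methods from the list above (_keys_, _values_, _items_, _pop_ and _clear_) can be sketched as follows. The printed key order assumes Python 3.7 or later, where insertion order is preserved:

```python
my_dict = {1: "one", 2: "two", 3: "three"}

# keys() and values() return views; list() turns them into lists
keys = list(my_dict.keys())
values = list(my_dict.values())
print(keys)      # [1, 2, 3]
print(values)    # ['one', 'two', 'three']

# The N-th key corresponds to the N-th value
print(list(zip(keys, values)) == list(my_dict.items()))   # True

# pop removes a key and returns its value
removed = my_dict.pop(2)
print(removed)                        # two
print(2 in my_dict)                   # False

# Like get, pop accepts a default to avoid KeyError
fallback = my_dict.pop(99, "missing")
print(fallback)                       # missing

# clear empties the dictionary
my_dict.clear()
print(len(my_dict))                   # 0
```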


### Iterating over dictionaries

Python supports using more than one iteration variable, which is especially useful when combined with dictionaries:

In [83]:
my_dict = {1: "one", 2: "two", 3: "three", 4: "four"}
for key, value in my_dict.items():
    print("key:", key, "value:", value)

key: 1 value: one
key: 2 value: two
key: 3 value: three
key: 4 value: four
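
Using several iteration variables is not specific to dictionaries: it works with any sequence of tuples. The sketch below (with made-up data) also shows that iterating over a dictionary directly yields its keys, and that `sorted` can be used when a specific order is needed:

```python
# Each tuple is unpacked into the two iteration variables
pairs = [(1, "one"), (2, "two")]
for number, name in pairs:
    print(number, name)

my_dict = {"b": 2, "a": 1}

# Iterating over the dictionary itself yields the keys
keys_seen = [key for key in my_dict]
print(sorted(keys_seen))     # ['a', 'b']

# sorted() gives a predictable order regardless of how the dict stores them
sorted_items = [(key, my_dict[key]) for key in sorted(my_dict)]
print(sorted_items)          # [('a', 1), ('b', 2)]
```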
