*Authors:* 

# Lesson 4: Data structures

*Goals*: Storing and managing information in complex data structures

So far we have been dealing with variables that either save single values, such as integers or floating point numbers, or individual strings of characters. In this lesson, we will be covering more complex and powerful ways of storing multiple values in single variables.

We will be covering:
- Tuples
- Lists
- Sets
- Dictionaries
- Arrays
- Nested data structures
- Converting data structures

## Tuples

Tuples are the most basic data structure in Python. In principle, they can be understood as a group of multiple values that are stored in one common variable. 

Let us see this with an example. If one wants to save and print several colors, without any complex data structures, it would look something like this: 

In [None]:
# Storing four colors in individual variables
color1 = 'red'
color2 = 'blue'
color3 = 'green'
color4 = 'yellow'

# Printing the four individual variables
print(color1, color2, color3, color4)

Using a tuple we can store all of these values in a single variable. Syntactically, tuples are denoted by round brackets `()` and individual items are separated by commas.

In [None]:
# Storing four colors in one tuple
colors = ('red', 'blue', 'green', 'yellow')

# Printing the tuple
print(colors)

The length or size of data structures can be obtained using `len()`:

In [None]:
len(colors)

One can access individual tuple **items** (also called **elements**) by using an index enclosed in square brackets `[]`. This returns the value at the index position that is given in the brackets:

In [None]:
# Storing four colors in one tuple
colors = ('red', 'blue', 'green', 'yellow')

# Print the first item in the tuple
print(colors[0])

# Print the last item in the tuple
print(colors[3])

If one wants to access the last item of a tuple, there are two options:
1. We can 'count in reverse' and use the index -1. This can be seen as starting from the first item at index 0, and then taking one step back, which loops to the end of the tuple.
2. We can use the `len()` function to determine the length of the tuple, and then access the index at that length minus 1. The -1 is important to account for the fact that we start counting indices at 0.

In [None]:
# Storing four colors in one tuple
colors = ('red', 'blue', 'green', 'yellow')

# Print the last item in the tuple
print(colors[-1])

# Print the last item in the tuple
print(colors[len(colors)-1])
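
The relation between the two options can be sketched for every position, not only the last one. The following minimal example assumes the same `colors` tuple as above; the variable name `negative_i` is made up for this illustration.

```python
colors = ('red', 'blue', 'green', 'yellow')

# For every regular index i, the negative index i - len(colors)
# points at the same item
for i in range(len(colors)):
    negative_i = i - len(colors)
    print(i, negative_i, colors[i], colors[negative_i])
```

Each printed line shows the same color twice, once reached by counting from the front and once by counting from the back.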

One important feature of tuples is that they are **ordered**, which means that they have a fixed ordering of the elements stored inside them, based on the order of the elements in parentheses. 

To iterate over a tuple, we can use the `for` loops introduced in the previous lesson. There are three approaches to this:
1. Directly iterate over the elements stored in the tuple. This is very intuitive but somewhat limited. 
2. Iterate over indices and then access the elements stored at those indices. This requires us to iterate over the numbers from 0 to the length of the tuple minus 1. This is how many other programming languages handle iteration through a data structure, so it may be familiar if you are experienced in non-Python languages.
3. Iterate over both indices and elements of a tuple by using the `enumerate` command. 


In [None]:
# Storing four colors in one tuple
colors = ('red', 'blue', 'green', 'yellow')

# Iterating over the elements
for color in colors:
    print(color)

# Iterating over the index
# The len() function returns the length of the tuple
for i in range(0, len(colors)):
    # we now use square brackets to access the item at the index i
    print(i, colors[i])

# The enumerate command yields one tuple per item, consisting of the index and the element: (index, element)
for enumerate_return in enumerate(colors):
    print(enumerate_return)

# We can also directly unpack the tuple returned by enumerate, by assigning it to a tuple structure:
for (i_color, color) in enumerate(colors):
    print(i_color, color)

Another feature of tuples to keep in mind is that they are **immutable**. This means that individual elements of a tuple cannot be modified. The entire tuple, however, can be overwritten, just like any other Python variable.

In [None]:
# Storing four colors in one tuple
colors = ('red', 'blue', 'green', 'yellow')

# Attempting to edit the first item (this does not work)
colors[0] = 'purple'

In [None]:
# Storing four colors in one tuple
colors = ('red', 'blue', 'green', 'yellow')

# Overwriting the entire tuple (this works)
colors = ('purple', 'blue', 'green', 'yellow')
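
To look at the failing case without stopping the notebook, the error can be caught. This is only a sketch: it assumes the `try`/`except` syntax, which is used again in the interactive part of this lesson, and shows that the tuple stays unchanged after the failed assignment.

```python
colors = ('red', 'blue', 'green', 'yellow')

# Item assignment on a tuple raises a TypeError
try:
    colors[0] = 'purple'
except TypeError as error:
    print('Could not modify the tuple:', error)

# The tuple still holds its original items
print(colors)
```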

Finally, tuples have no restrictions on what kind of elements are stored in them. For example, they **allow for duplicate items**, as well as mixing and matching between different variable types, e.g. integers and strings.

In [None]:
# Storing duplicates
colors = ('red', 'red', 'red', 'red')
print(colors)

# Mixing variable types
colors = ('red', 1, 0.5, -124)
print(colors)

## Lists

Lists are likely the most commonly used Python data structures, as they offer a flexible extension to the rather rigid structure of tuples. 

Syntactically, lists are denoted by square brackets `[]` and individual items are separated by commas. Accessing list **elements**/**items** is possible through the same syntax (index enclosed in brackets) we used for tuples. 

Lists, just like tuples, have a fixed **order** and we can iterate over them with the same methods as for tuples. They also allow for arbitrary items, including **duplicates**. 

In [None]:
# Storing four colors in one list
color_list = ['red', 'blue', 'green', 'yellow']

# Printing the list
print(color_list)

# Print the first element in the list
print(color_list[0])

# Print the last element in the list
print(color_list[3])

The main feature that sets lists apart from tuples is that they are **mutable**. This allows for modifying individual list elements, as well as adding more elements to an existing list. 

In [None]:
# Storing four colors in one list
color_list = ['red', 'blue', 'green', 'yellow']

# Attempting to edit the first element (this does work)
color_list[0] = 'purple'

print(color_list)

In [None]:
# Storing three colors in two lists each
color_list1 = ['red', 'blue', 'green']
color_list2 = ['purple', 'orange', 'yellow']

# We can combine multiple lists using the `+` operator
print(color_list1+color_list2)

In [None]:
# Storing four colors in one list
color_list = ['red', 'blue', 'green', 'yellow']

# Adding elements to a list is done using the append() function.
color_list.append('purple')

print(color_list)

In [None]:
# List can be empty on creation
color_list = []

# Empty list can still be appended to
color_list.append('red')
color_list.append('blue')
color_list.append('green')
color_list.append('yellow')

print(color_list)

In [None]:
# Storing four colors in one list
color_list = ['red', 'blue', 'green', 'yellow']

# We can also delete elements from a list using the remove command:
color_list.remove('yellow')
print(color_list)

In [None]:
# Note that if a list has multiple identical elements, remove will only remove the first instance.
color_list = ['red', 'yellow', 'blue', 'green', 'yellow']

color_list.remove('yellow')
print(color_list)

In [None]:
# remove() deletes a specific value from a list. If you instead need to remove the value at a given position,
# you can combine this with the item access syntax.
color_list = ['red', 'blue', 'green', 'yellow']
color_list.remove(color_list[0])
print(color_list)

In [None]:
# but be careful: also here only the first match will be removed!
color_list = ['red', 'yellow', 'blue', 'green', 'yellow']
color_list.remove(color_list[4])
# element 1 got removed, not element 4!
print(color_list)

In [None]:
color_list = ['red', 'blue', 'green', 'yellow']

# Alternatively, we can use the del statement to remove items by index
del color_list[0]
print(color_list)
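
A further option for removing by position is the list method `pop()`, which removes the item at the given index and also returns it. The short sketch below reuses the list with the duplicate `'yellow'` from above; the variable name `removed_color` is made up for this illustration.

```python
color_list = ['red', 'yellow', 'blue', 'green', 'yellow']

# pop(4) removes the item at index 4 and returns it
removed_color = color_list.pop(4)

print(removed_color)
print(color_list)
```

Unlike `remove(color_list[4])`, this removes the item at index 4 itself, so the first `'yellow'` stays in the list.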

There are useful functions, such as `sum` to sum over all elements of a list (see further below for the function `list`):

In [None]:
many_numbers_list = list(range(101))
print(many_numbers_list)

sum(many_numbers_list)

## Sets

Sets are Python data structures specifically designed for mathematical set operations. Sets are denoted by curly brackets `{}` and individual **elements** (also called **members**) are separated by commas. 

In [None]:
# Defining a set of 3 numbers
number_set = {1, 2, 3}
print(number_set)

Sets are **unordered**, meaning the order in which the elements are displayed does not depend on the order in which they were written at creation, and it should not be relied upon. Additionally, it means we cannot access a specific element via an index.

In [None]:
# Defining a set of 3 numbers
number_set = {1, 2, 3}
print(number_set)

# Different order on creation results in the same set
number_set = {3, 1, 2}
print(number_set)

In [None]:
# Defining a set of 3 numbers
number_set = {1, 2, 3}

# This does not work
print(number_set[0])

Sets **do not allow for duplicates** and ignore any duplicate values. 

Elements in a set cannot be changed, but since elements can be added or removed, sets are **mutable**.

In [None]:
# Defining a set of 3 numbers
number_set = {1, 2, 2, 2, 2, 2, 3}
# Additional 2's are ignored
print(number_set)

number_set.add(4)
print(number_set)

The main advantage of sets is that they allow for mathematical set operations, such as the union `|` of two sets, the intersection `&`, the difference `-`, or the symmetric difference `^` between two sets.

In [None]:
set1 = {1, 2, 3}
set2 = {3, 4, 5}

print('Union: ', set1 | set2)
print('Intersection: ', set1 & set2)
print('Difference 1-2: ', set1 - set2)
print('Difference 2-1: ', set2 - set1)
print('Sym. Difference: ', set1 ^ set2)
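
Each of these operators also has a named method on the set itself, which some readers find easier to read. As a brief sketch using the same two sets:

```python
set1 = {1, 2, 3}
set2 = {3, 4, 5}

# Each method gives the same result as the corresponding operator
print('Union: ', set1.union(set2))
print('Intersection: ', set1.intersection(set2))
print('Difference 1-2: ', set1.difference(set2))
print('Sym. Difference: ', set1.symmetric_difference(set2))
```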

Further, they allow for several useful check operations, such as checking if two sets are disjoint or if they are sub- or supersets of each other.

In [None]:
set1 = {1, 2, 3}
set2 = {4, 5}
set3 = {1, 2, 3, 4, 5}

print('Disjoint 1,2: ', set1.isdisjoint(set2))
print('Disjoint 1,3: ', set1.isdisjoint(set3))
print('Disjoint 2,3: ', set2.isdisjoint(set3))
print('Subset 2,3: ', set2.issubset(set3))
print('Superset 3,1: ', set3.issuperset(set1))

## Intermission Task
The set and the list below contain the same numbers. For both the set and the list, determine if they contain any of the following numbers: 6, 53, 79

In [None]:
set_task =  {346, 228, 138, 390, 459, 20, 78, 574, 478, 404, 194, 8, 18, 340, 40, 534, 386, 92, 423, 592, 550, 569, 539, 216, 440, 497, 565, 361, 405, 384, 575, 324, 376, 586, 453, 217, 57, 332, 66, 21, 316, 291, 287, 280, 373, 429, 27, 67, 377, 276, 274, 81, 277, 93, 470, 285, 10, 599, 240, 543, 581, 442, 584, 416, 259, 30, 139, 275, 399, 445, 264, 41, 516, 579, 149, 507, 128, 29, 248, 320, 568, 198, 170, 9, 498, 177, 266, 342, 298, 495, 411, 483, 343, 215, 129, 224, 253, 90, 245, 151, 496, 355, 430, 412, 222, 519, 64, 98, 490, 94, 385, 160, 509, 417, 59, 167, 520, 290, 207, 301, 426, 370, 84, 525, 446, 146, 100, 152, 53, 469, 15, 89, 472, 554, 68, 86, 273, 460, 322, 556, 488, 558, 300, 481, 365, 173, 34, 124, 133, 256, 199, 175, 510, 62, 521, 315, 137, 252, 297, 573, 164, 379, 158, 190, 576, 312, 544, 413, 351, 515, 120, 587, 462, 435, 530, 328, 202, 293, 140, 272, 220, 333, 511, 583, 126, 148, 594, 123, 425, 325, 467, 598, 279, 205, 302, 347, 4, 43, 421, 212, 487, 110, 171, 154, 258, 95, 589, 107, 227, 112, 352, 538, 103, 559, 56, 482, 477, 588, 268, 187, 165, 233, 473, 118, 174, 480, 567, 105, 155, 196, 142, 82, 181, 73, 294, 241, 317, 461, 61, 49, 382, 33, 522, 63, 549, 38, 87, 360, 193, 447}
list_task = [346, 228, 138, 390, 459, 20, 78, 574, 478, 404, 194, 8, 18, 340, 40, 534, 386, 92, 423, 592, 550, 569, 539, 216, 440, 497, 565, 361, 405, 384, 575, 324, 376, 586, 453, 217, 57, 332, 66, 21, 316, 291, 287, 280, 373, 429, 27, 67, 377, 276, 274, 81, 277, 93, 470, 285, 10, 599, 240, 543, 581, 442, 584, 416, 259, 30, 139, 275, 399, 445, 264, 41, 516, 579, 149, 507, 128, 29, 248, 320, 568, 198, 170, 9, 498, 177, 266, 342, 298, 495, 411, 483, 343, 215, 129, 224, 253, 90, 245, 151, 496, 355, 430, 412, 222, 519, 64, 98, 490, 94, 385, 160, 509, 417, 59, 167, 520, 290, 207, 301, 426, 370, 84, 525, 446, 146, 100, 152, 53, 469, 15, 89, 472, 554, 68, 86, 273, 460, 322, 556, 488, 558, 300, 481, 365, 173, 34, 124, 133, 256, 199, 175, 510, 62, 521, 315, 137, 252, 297, 573, 164, 379, 158, 190, 576, 312, 544, 413, 351, 515, 120, 587, 462, 435, 530, 328, 202, 293, 140, 272, 220, 333, 511, 583, 126, 148, 594, 123, 425, 325, 467, 598, 279, 205, 302, 347, 4, 43, 421, 212, 487, 110, 171, 154, 258, 95, 589, 107, 227, 112, 352, 538, 103, 559, 56, 482, 477, 588, 268, 187, 165, 233, 473, 118, 174, 480, 567, 105, 155, 196, 142, 82, 181, 73, 294, 241, 317, 461, 61, 49, 382, 33, 522, 63, 549, 38, 87, 360, 193, 447]
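
As a hint, one possible approach is the `in` operator, which checks whether a value is contained in a data structure. The sketch below uses small made-up data (`small_set` and `small_list` are hypothetical names), so the task itself is still left to you.

```python
small_set = {4, 8, 15, 16, 23, 42}
small_list = [4, 8, 15, 16, 23, 42]

# `in` returns True if the value is contained, otherwise False
for number in (8, 9):
    print(number, 'in set:', number in small_set)
    print(number, 'in list:', number in small_list)
```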


## Dictionaries

The final native Python data structure we will cover is the dictionary. Unlike tuples, lists, and sets, dictionaries do not store individual values but combinations of **keys** and **values**.

Dictionaries are denoted by curly brackets `{}` containing `key : value` pairs. Each pair forms an **item**. The items are separated by commas. 

In [None]:
# Defining a dictionary that describes a physical object in 1D space
# Each line corresponds to a different physics property that we save
object_dict = {
    'mass': 500,
    'position': 0.0,
    'velocity': 5.0,
}

print(object_dict)

Individual items can be accessed through square brackets. However, instead of accessing them based on their positions, items are accessed using their respective keys. Nevertheless, dictionaries are still **ordered** (since Python 3.7).

In [None]:
object_dict = {
    'mass': 500,
    'position': 0.0,
    'velocity': 5.0,
}
# This does not work
print(object_dict[0])

In [None]:
object_dict = {
    'mass': 500,
    'position': 0.0,
    'velocity': 5.0,
}
# This does work
print(object_dict['mass'])

Dictionaries are **mutable**, allowing us to modify the values stored under specific keys.

In [None]:
object_dict = {
    'mass': 500,
    'position': 0.0,
    'velocity': 5.0,
}

# Modifying the mass value
object_dict['mass'] = 6000

# This change is applied to the dictionary
print(object_dict)

Assigning to a non-existent key does not fail. Instead, it adds the key and value as a new item to the dictionary.

In [None]:
object_dict = {
    'mass': 500,
    'position': 0.0,
    'velocity': 5.0,
}

# Adding an acceleration value
object_dict['acceleration'] = 6.0

# This addition can be seen in the dictionary
print(object_dict)

Finally, dictionaries allow for duplicate values but **not duplicate keys**.

In [None]:
# Multiple identical values work
object_dict = {
    'mass': 500,
    'position': 500,
    'velocity': 500,
}
print(object_dict)

# Multiple identical keys are overwritten by the latest one
object_dict = {
    'mass': 500,
    'mass': 0.0,
    'mass': 5.0,
}
print(object_dict)

We can use a `for` loop to iterate over the keys of a dictionary and then use those keys to access the respective values.

In [None]:
object_dict = {
    'mass': 500,
    'position': 0.0,
    'velocity': 5.0,
}

# Iterate over keys in the dictionary
for key in object_dict:
    # Retrieve values using keys and print both
    print(key, object_dict[key])
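
An alternative sketch uses the dictionary method `items()`, which yields `(key, value)` tuples. These can be unpacked directly in the loop, in the same way as the tuples returned by `enumerate` earlier in this lesson.

```python
object_dict = {
    'mass': 500,
    'position': 0.0,
    'velocity': 5.0,
}

# Iterate over keys and values at the same time
for key, value in object_dict.items():
    print(key, value)
```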

This covers the 4 native Python data structures. Here is a quick summary of their properties:

| Data structure | Syntax | Ordered | Mutable | Allow Duplicates |
| --- | --- | --- | --- | --- |
| Tuple | (value, value) | Yes | No | Yes |
| List | [value, value] | Yes | Yes | Yes |
| Set | {value, value} | No | Yes | No |
| Dictionary | {key:value, key:value} | Yes | Yes | Keys: No, values: Yes |


## Arrays

Arrays are a data structure that stores a **predefined number** of elements of **the same datatype**, i.e., the type and number of elements have to be decided when the array is created. Although arrays are extremely common in languages such as C/C++, Python does not have native array support. Arrays are implemented in the NumPy package.

We will cover the details of NumPy in a later lesson, so for now we will only take a brief look at how arrays can be used. We need to import the NumPy package, similar to how we previously imported the math package.

In [None]:
import numpy as np

We can now use NumPy to create an array. Several features, such as element access via index and iteration, work for arrays just like for lists. 

In [None]:
# Creating a numpy array containing colors
color_array = np.array(['purple', 'orange', 'yellow'])

# Printing the whole array
print(color_array)

In [None]:
# Printing the first element
print(color_array[0])

In [None]:
# Iterating over an array (more on this in lesson 9)
for item in color_array:
    print(item)

The elements of an array have a fixed and uniform data type:

In [None]:
array = np.array([1, 2, 3])
array.dtype

In [None]:
array = np.array([1.0, 2, 3])
array.dtype

In [None]:
# all numbers are converted to floats
array

In [None]:
array = np.array([1, 'string', 3.0])
array

Arrays have many advantages, which will be covered in more detail in lesson 9. One important point to note, however, is that because of their fixed length, elements cannot be added to an existing array. This means any operation that does attempt to append elements to an array has to actually create a whole new array, which can be slow.

In [None]:
import time  # timing package

# Measures the time to append 10 000 elements to an array
start_time_array = time.time()
color_array = np.array(['purple', 'orange', 'yellow'])
for i in range(10_000):
    color_array = np.append(color_array, 'red')
end_time_array = time.time()

# Measures the time to append 10 000 elements to a list
start_time_list = time.time()
color_list = ['purple', 'orange', 'yellow']
for i in range(10_000):
    color_list.append('red')
end_time_list = time.time()

print('Append time for array: ', end_time_array-start_time_array, 's')
print('Append time for list: ',  end_time_list-start_time_list, 's')

Since you need to specify the size of an array when creating it, you can use a pre-filled array with all elements being zero or one (the values can be changed later):

In [None]:
np.zeros(15)

In [None]:
np.ones(12)

If you create an empty array, the values are not initialized and contain arbitrary values, depending on what was written before in that memory location:

In [None]:
np.empty(10)

## Nested data structures

Python allows for nearly anything to be stored in a data structure, and this includes other data structures. This allows us to have nested structures, for example, a tuple containing tuples, a list containing lists, and so on. Items in such nested structures can be accessed using *multiple* item access syntax, where the first one affects the outer-most structure. 

In [None]:
# Tuple of tuples
nested_tuple = (
    (1, 2, 3),
    (4, 5, 6),
    (7, 8, 9),
)
# Accessing the first element of the second tuple:
print(nested_tuple[1][0])

# List of lists
nested_list = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
# Accessing the first element of the last list:
print(nested_list[-1][0])

# Dictionary of dictionaries:
nested_dict = {
    'subdict_a': {'key_a1': 1, 'key_a2': 2},
    'subdict_b': {'key_b1': 3, 'key_b2': 4},
}
# Accessing the first item of the first dict:
print(nested_dict['subdict_a']['key_a1'])

In [None]:
# It is also possible to mix and match, for example here is a dictionary containing a list, tuple, and set
nested_dict = {
    'list': [4, 5, 6],
    'tuple': (5, 7, 9, 11),
    'set': {1, 2, 3, 4, 5},
}

# Accessing the first element of the list:
print(nested_dict['list'][0])
# Accessing the last element of the tuple:
print(nested_dict['tuple'][-1])
# Accessing the set
print(nested_dict['set'])


The `len()` function only returns the size of the outermost structure.

In [None]:
nested_list = [
    [" 1", " 2", " 3", " 4"],
    [" 5", " 6", " 7", " 8"],
    [" 9", "10", "11", "12"],
]
len(nested_list)

In [None]:
len(nested_list[0])
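
To count all items in a nested structure, the lengths of the inner structures can be added up. This is a minimal sketch, assuming the same `nested_list` as above; the variable name `total_items` is made up for this illustration.

```python
nested_list = [
    [" 1", " 2", " 3", " 4"],
    [" 5", " 6", " 7", " 8"],
    [" 9", "10", "11", "12"],
]

# Add up the lengths of all inner lists
total_items = 0
for row in nested_list:
    total_items += len(row)

print(total_items)
```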

Also, if you loop over a data structure, the loop will iterate over the outermost level:

In [None]:
for row in nested_list:
    print(row)

**Exercise:** Construct a loop that produces this output:
```
 1  2  3  4 
 5  6  7  8 
 9 10 11 12 
```
*Hint: if you want to print something without a newline use `print(..., end="")`*

In [None]:
# BEGIN-LIVE
for row in nested_list:
    for item in row:
        print(item, end=' ')
    print()
# END-LIVE

## Converting data structures

Throughout this lesson, we have covered several upsides and downsides of the different data structures. To make the best use of these properties, it can be useful to convert one structure into another. In Python, this can be done easily using **casting**.

Tuples, lists, sets, and arrays can all be directly cast into one another, depending on what type of structure is currently needed. Dictionaries can be cast as well, although with slightly more work.


In [None]:
# Creating an initial tuple
number_tuple = (1, 2, 4, 6, 3, 5)
print(number_tuple)

# Cast to list, using the list() function
number_list = list(number_tuple)
print(number_list)

# Cast to set, using the set() function. Note that a set does not keep the original order of the list
number_set = set(number_list)
print(number_set)

In [None]:
# Cast back to a tuple, using the tuple() function. Note that this keeps the reordering of the set
number_tuple = tuple(number_set)
print(number_tuple)
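
One common use of casting is removing duplicates from a list by converting it to a set and back. The sketch below uses made-up data; because a set does not keep the original order, it also uses the built-in `sorted()` function, which is not covered in this lesson, to get a defined order again.

```python
numbers_with_duplicates = [3, 1, 2, 3, 3, 1]

# Casting to a set drops the duplicates, casting back gives a list again
unique_numbers = list(set(numbers_with_duplicates))
print(unique_numbers)

# sorted() returns a new list in ascending order
print(sorted(unique_numbers))
```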

In [None]:
dictionary = {'key_1': 1, 'key_2': 2, 'key_3': 3}

# Turn the dictionary into a list of (key, value) tuples
print(list(dictionary.items()))

# Turn dictionary keys into a list
print(list(dictionary.keys()))

# Turn dictionary values into a list
print(list(dictionary.values()))
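
The opposite direction can be sketched with the built-in `dict()` function, which accepts a list of `(key, value)` tuples. If keys and values are stored in two separate lists, the built-in `zip()` function (not covered in this lesson) can pair them up first. The variable names below are chosen for illustration only.

```python
# Cast a list of (key, value) tuples to a dictionary
pairs = [('key_1', 1), ('key_2', 2), ('key_3', 3)]
dictionary = dict(pairs)
print(dictionary)

# Pair up two lists with zip() and cast the result to a dictionary
keys = ['mass', 'position', 'velocity']
values = [500, 0.0, 5.0]
object_dict = dict(zip(keys, values))
print(object_dict)
```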

In [None]:
# Cast a NumPy array of 200 random integers between 0 and 99 to a list
print(list(np.random.choice(100, 200)))

## End of part 1

This is the end of the part you should read at home. Everything below this cell will be the topic of the next exercise session, and you don't need to look at it now.

## Interactive part

### 1. Bees and pollen

There are some bees:
- Maja flies with an average speed of 20 km/h, her favourite flowers have a distance of 1 km, she spends 10 minutes at the flowers to collect the pollen, she can collect and transport 1 pu (pollen unit, fictitious unit), she needs 2 minutes to deposit the pollen at home.
- Willi flies with an average speed of 15 km/h, his favourite flowers have a distance of 0.4 km, he spends 15 minutes at the flowers to collect the pollen, he can collect and transport 1.2 pu, he needs 4 minutes to deposit the pollen at home.
- Kassandra flies with an average speed of 30 km/h, her favourite flowers have a distance of 1.2 km, she spends 8 minutes at the flowers to collect the pollen, she can collect and transport 1.1 pu, she needs 1 minute to deposit the pollen at home.
- Helene flies with an average speed of 25 km/h, her favourite flowers have a distance of 1 km, she spends 10 minutes at the flowers to collect the pollen, she can collect and transport 1.5 pu, she needs 2 minutes to deposit the pollen at home.

**Task a) :** Create a variable for each bee and assign a datastructure of your choice to the variable to save all the data from the text above.

In [None]:
# BEGIN-LIVE
maja = {'name': 'Maja', 'speed': 20, 'distance': 1.0, 'collection_time': 10, 'capacity': 1.0, 'deposit_time': 2}
willi = {'name': 'Willi', 'speed': 15, 'distance': 0.4, 'collection_time': 15, 'capacity': 1.2, 'deposit_time': 4}
kassandra = {'name': 'Kassandra', 'speed': 30, 'distance': 1.2, 'collection_time': 8, 'capacity': 1.1, 'deposit_time': 1}
helene = {'name': 'Helene', 'speed': 25, 'distance': 1.0, 'collection_time': 10, 'capacity': 1.5, 'deposit_time': 2}
# END-LIVE

**Task b) :** Maybe it is useful for future tasks to put the variables defined above in another datastructure of your choice.

In [None]:
# BEGIN-LIVE
bees = [maja, willi, kassandra, helene]
# END-LIVE

**Task c) :** Calculate for each bee how long the entire process of collecting pollen and depositing it at home takes (the bee starts at home). Add this information to each bee (insert it into the datastructure).

In [None]:
# BEGIN-LIVE
def pollen_process_time(bee):
    flight_time = 2 * bee['distance'] / bee['speed']  # the factor 2 is because the bee has to fly forth and back
    coll_time = bee['collection_time'] / 60  # in hours
    dep_time = bee['deposit_time'] / 60  # in hours
    return flight_time + coll_time + dep_time

for bee in bees:
    bee['total_time'] = pollen_process_time(bee)
# END-LIVE

In [None]:
bees[0]

**Task d) :** How much pollen can all the bees collect together within 8 hours (a typical working day)? 

Note: Only complete processes of collecting pollen count, i.e., the bee will stay at home if there is not enough time left to finish another process before the working day ends.

In [None]:
# BEGIN-LIVE
pollen_amount = 0
for bee in bees:
    n_processes = 8 // bee['total_time']  # the number of completed processes that can be done during 8 hours
    pollen_amount_bee = n_processes * bee['capacity']
    print('The Bee', bee['name'], 'collected', pollen_amount_bee, 'pu within 8 hours')
    pollen_amount += pollen_amount_bee

print('The bees collected', pollen_amount, 'pu within 8 hours')
# END-LIVE

### 2. Data structures and hashability
In this notebook, dictionaries were introduced. A dictionary uses a hash algorithm to efficiently look up values associated with keys.
If some input datastructure produces an output with a hash algorithm, the input is called **hashable**.
The hash is of fixed length, regardless of the input.

Anything can be used as a key within a dictionary, as long as it is hashable.

**Task:**
We learned about tuples, lists, sets, and arrays. You also know about floats and ints. Which of these objects are hashable? You can find out by using Python's builtin `hash` function.  

In [None]:
# BEGIN-LIVE
# Strings are hashable
print('Hash of string \'10\' is:', hash('10'))

# Integers are hashable
print('Hash of integer 10 is:', hash(10))

# Floats are hashable
print('Hash of float 10.0 is:', hash(10.0))
print('Hash of float 10.5 is:', hash(10.5))

# Tuples are hashable
print('Hash of tuple (10, 20, 30) is:', hash((10, 20, 30)))
print('Hash of tuple (30, 20, 10) is:', hash((30, 20, 10)))

# Lists are not hashable, since they are mutable (elements can be changed or added)
try:
    print('Hash of list [10, 20, 30] is:', hash([10, 20, 30]))
except Exception as error:
    print(f'Lists are not hashable, the error is: {error}')

# Sets are not hashable, since they are mutable (elements can be added)
try:
    print('Hash of set {10, 20, 30} is:', hash({10, 20, 30}))
except Exception as error:
    print(f'Sets are not hashable, the error is: {error}')

import numpy as np
# Numpy arrays are not hashable, since they are mutable (elements can be changed)
try:
    print('Hash of numpy array [10, 20, 30] is:', hash(np.array([10, 20, 30])))
except Exception as error:
    print(f'Numpy arrays are not hashable, the error is: {error}')
# END-LIVE
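
A practical consequence can be sketched as follows: tuples can serve as dictionary keys, while lists cannot. The dictionary `grid_values` and its contents are made up for this illustration.

```python
# Tuples are hashable, so they can be used as dictionary keys
grid_values = {(0, 0): 'origin', (1, 0): 'right', (0, 1): 'up'}
print(grid_values[(1, 0)])

# Lists are not hashable, so using one as a key raises a TypeError
try:
    grid_values[[1, 1]] = 'diagonal'
except TypeError as error:
    print('Lists cannot be used as keys:', error)
```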

### 3. Iterating mutable objects
Mutable objects can be changed while iterating over them and this has some implications one needs to understand when using loops.
In the following we will see this kind of unexpected behavior with `lists`. 

**Task:** 
Write two `for` loops that loop over `my_list` while removing or adding elements. 

In the first loop, print your loop variable and then use `pop(0)` to remove the first element of the current list. Which values would you expect to get printed?

In the second loop, append a new element of your choice and print both the loop variable and `my_list` in each iteration. How many iterations do you expect the loop to run? 

In [None]:
my_list = list(range(1,11)) # first ten positive integers

# BEGIN-LIVE
for item in my_list:
    print(item)
    my_list.pop(0)
# END-LIVE

In [None]:
my_list = [1, 2, 3]

# BEGIN-LIVE
max_len = 10

for i, item in enumerate(my_list, len(my_list) + 1): # start index set by length of original list
    my_list.append(i)
    print(item, my_list)
    if len(my_list) > max_len:
        print(f'Stopping the loop, since maximal length {max_len} is reached')
        break
# END-LIVE

Bonus Question: How could you still modify your loop object without running into this kind of behavior? 

In [None]:
# BEGIN-LIVE
# Bonus Question: Make a copy and only modify the copy
my_list = [1, 2, 3]

copy_of_my_list = my_list.copy()
for i, item in enumerate(my_list, len(my_list) + 1):
    copy_of_my_list.append(i)
    
print(my_list)
print(copy_of_my_list)
# END-LIVE
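
The same idea also works the other way around: iterate over a copy and modify the original. The sketch below is one possible variant, not the only solution, and removes all even numbers from a list without skipping elements.

```python
my_list = list(range(1, 11))  # first ten positive integers

# Loop over a copy, so removing items from my_list does not affect the iteration
for item in my_list.copy():
    if item % 2 == 0:
        my_list.remove(item)

print(my_list)
```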

### 4. Typical errors with lists, tuples, sets and dictionaries

Below you will find code snippets that are NOT FUNCTIONING as intended.

**Task:** First execute the following cells. Then carefully read the error messages they produce.
Discuss these issues with your peers or instructor and apply fixes to resolve them.

In [None]:
# IndexError
my_list = [1, 2, 3]
my_list[3]

In [None]:
my_tuple = (1, 2, 3)
my_tuple[3]

In [None]:
my_list = [1, 2, 3]
my_list = my_list + '4'

In [None]:
my_list = [1, 2, 3]
my_list.remove(4)

In [None]:
my_tuple = (1, 2, 3)
my_tuple[0] = 10

In [None]:
my_first_set = {1, 2, 3}
my_second_set = {4, 5, 6}

my_first_set + my_second_set

In [None]:
my_dictionary = {'key_1': 1, 'key_2': 2, 'key_3': 3}
my_dictionary['key_not_there']