<br>

# Module 2 - Python basics : container types <a id='0'></a>
--------------------------

<div class="alert alert-block alert-info">
<b>Note:</b> this notebook contains <b>Additional Material</b> sections (in blue boxes, like this one) that will be skipped during the class due to time constraints. If you are going through this notebook on your own, feel free to
read these sections or skip them depending on your interest.
</div>

### Table of Contents <a id='toc'></a>

* [**Introduction: container types**](#14)  
* [**Strings**](#15)  
&nbsp;&nbsp;&nbsp;&nbsp;[Length of a string](#16)  
&nbsp;&nbsp;&nbsp;&nbsp;[String concatenation](#17)  
&nbsp;&nbsp;&nbsp;&nbsp;[String slicing](#18)  
&nbsp;&nbsp;&nbsp;&nbsp;[Micro exercise 1](#19)  
* [**Lists**](#20)  
&nbsp;&nbsp;&nbsp;&nbsp;[List slicing](#21)  
&nbsp;&nbsp;&nbsp;&nbsp;[Mutability - an important difference between lists and tuples](#22)  
&nbsp;&nbsp;&nbsp;&nbsp;[Manipulating lists: adding and removing elements](#23)  
&nbsp;&nbsp;&nbsp;&nbsp;[From list to string, and back again ...](#24)  
&nbsp;&nbsp;&nbsp;&nbsp;[Micro Exercise 2](#25)  
* [**Dictionaries**](#26)  
* [**Exercises 2.1 - 2.3**](#27)
* [**Additional Material**](#28)  
&nbsp;&nbsp;&nbsp;&nbsp;[Mutability of objects in Python](#29)  
&nbsp;&nbsp;&nbsp;&nbsp;[A solution: explicit deep copy](#30)   
&nbsp;&nbsp;&nbsp;&nbsp;[Tuple: additional information](#tuple) 


<br>

## Python documentation and learning resources

* **Official [python documentation](https://www.python.org/doc/)**: this is the official python documentation. It also contains some tutorials.
* **[Alternative documentation](https://www.w3schools.com/python/python_reference.asp)**: reference for built-in functions and types (easier to read, and sometimes more complete than the `help()` function). 


<br>
<br>

## Introduction: container types <a id='14'></a>

In this notebook, we will get to know some of the basic Python "container" built-in types.  
**Container types** are objects (types) that contain other objects:

* **`str`**: string - a sequence of characters.
* **`list`**: a **mutable** list of objects (mutable = can be modified after it was created).
* **`tuple`**: an **immutable** list of objects (immutable = cannot be modified after it was created).
* **`dict`**: dictionary - a collection of associated `key:value` pairs.

Container objects share some common characteristics, such as:
* They have a dedicated **`[]`** operator that lets the user access one - or several - of the objects they contain.
* The number of objects a container has (its length) can be accessed using the **`len()`** function.
* Container objects are **iterables**: one can iterate over them using e.g. a `for` loop (see Notebook 3 of this course).

**Important:** in python (unlike e.g. in R), **indexing is zero-based**. This means that the first element of a container type object is accessed with `object[0]`, and not `object[1]`.
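
The shared characteristics listed above can be sketched with one small object of each container type. The variable names and values below are purely illustrative:

```python
# One small example of each container type (illustrative values).
a_str = "spam"
a_list = ["spam", "eggs", "ham"]
a_tuple = ("spam", "eggs", "ham")
a_dict = {"spam": 2, "eggs": 5}

# The [] operator: the first element is at index 0 (dict values are accessed by key).
print(a_str[0])         # -> s
print(a_list[0])        # -> spam
print(a_tuple[0])       # -> spam
print(a_dict["spam"])   # -> 2

# len() works on every container type.
print(len(a_str), len(a_list), len(a_tuple), len(a_dict))   # -> 4 3 3 2

# Containers are iterables, so they can be converted to a list of their elements.
print(list(a_str))      # -> ['s', 'p', 'a', 'm']
print(list(a_dict))     # -> ['spam', 'eggs']  (iterating over a dict yields its keys)
```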

<br>

## Strings <a id='15'></a>

In Python, the **`str`** (string) type is a **sequence of characters** that can be used to represent text of any length.

* Strings can be declared using either **single `'`** or **double `"`** quotes. 

In [None]:
gene_seq = "ATGCGACTGATCGATCGATCGATCGATGATCGATCGATCGATGCTAGCTAC"
name = 'Sir Lancelot of Camelot'

print(gene_seq)
print(name)

<br>

In memory, a string variable can be represented as a sequence (container), where each element is a letter (character):

<img src="img/var_str1.png" alt="representation of a string variable in memory" style="width:250px;" />

<br>

Each element is associated with an **index**, starting at 0.

<img src="img/var_str2_with_indexes.png" alt="representation of a string variable in memory with indexes" style="width:250px;" />


In [None]:
my_str = "text"
print("The second element of the string is:", my_str[1])

* **Triple quotes** can be used to define **multi-line strings**.

In [None]:
long_string = """Let me tell you something, my lad. 
When you’re walking home tonight and some great 
homicidal maniac comes after you with a bunch 
of loganberries, don’t come crying to me!\n"""
print(long_string)

* **Accented and special characters** are possible in strings.

In [None]:
my_quote = "Gracieux : « aimez-vous à ce point les oiseaux \
que paternellement vous vous préoccupâtes \
de tendre ce perchoir à leurs petites pattes ? »"

print(my_quote)

* Inserting **Tab** and **new line** characters:
  * **`\t`** = tab
  * **`\n`** = new line

In [None]:
print('Hello\tWorld')  # \t : tabulation
print('Hello\nWorld')  # \n : newline

<br>

<div class="alert alert-block alert-success">

**Question:** why do the following 2 lines print exactly the same text?

</div>    

In [None]:
print('Hello World')
print('Hello World\n', end="")

<br>

* **Combining** single and double quotes.

In [None]:
quote_in_quote_1 = "Let me tell you 'something', my lad"
quote_in_quote_2 = 'Let me tell you "something", my lad'
print(quote_in_quote_1)
print(quote_in_quote_2)

# Note: quotes can also be escaped, but it is a bit less readable than using different quote types.
quote_in_quote_3 = "Let me tell you \"something\", my lad"
print(quote_in_quote_3)

<br>

### Length of a string <a id='16'></a>
The **`len()`** function can be used on a string to return its length:

In [None]:
name = "Sir Lancelot of Camelot"
print("The number of characters in the string '", name, "' is: ", len(name), sep='')

<br>

### String concatenation <a id='17'></a>
* Strings can be concatenated with the **`+`** operator.
* Strings can be "multiplied" (i.e. repeatedly concatenated) with the **`*`** operator.

In [None]:
print("dead" + "-" + "parrot") 
print("spam" * 5)
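
A common stumbling block with `+` is that both operands must be strings. The sketch below (with an illustrative `count` variable) shows the error raised when mixing a string and a number, and one way around it using `str()`:

```python
count = 3

# A str can only be concatenated with another str: mixing types raises a TypeError.
try:
    message = "spam" + count
except TypeError as error:
    print("TypeError:", error)

# Converting the number to a string with str() makes the concatenation work.
message = "spam" + str(count)
print(message)   # -> spam3
```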

<div class="alert alert-block alert-info">

#### Additional material: f-strings

Python [f-strings (formatted string literals)](https://docs.python.org/3/reference/lexical_analysis.html#f-strings) make it easy to create strings that combine one or more variables with some hard-coded characters.

The syntax is simply to:
* Prefix the string with the letter `f`, as in `f"This is an f-string"`.
* Inside an f-string, variable content can be accessed using curly braces, as in
  `f"This is a {variable_name}"`.  
  Here, `{variable_name}` will expand to the content of the variable `variable_name`.

**Examples:**

```python
# Example 1:
first_name = "Alice"
last_name = "Smith"

full_name = f"{first_name} {last_name}"   # Same as: full_name = first_name + " " + last_name
print(f"Her full name is {full_name}.")   # -> Her full name is Alice Smith.

# Example 2:
animal = "cat"
container = "bag"

print(f"The {animal} is out of the {container}!")  # -> The cat is out of the bag!
```
    
</div>

<br>

### String slicing <a id='18'></a>

Because strings are a type of sequence (a sequence of characters), the different characters of a string can be accessed using the **`[]` operator**, with the index of the desired element(s).  

<img src="img/var_str3_with_indexes_and_access.png" alt="representation of a string variable in memory with indexes and access to an element" style="width:250px;" />

<br>

* Remember that in python, **the index of the first element is `[0]`**.
* Negative indices will access characters starting from the end of the string. E.g. `[-1]` returns the
  last character in the string.
  
<img src="img/var_str3_with_indexes_and_reverse_access.png" alt="representation of a string variable in memory with indexes and access to an element using reverse indexing" style="width:400px;" />
  

In [None]:
my_string = "And now, something completely different."

print("The first element of this string is:", my_string[0] )  # 0 is the index of the 1st element of the string.
print("The 5th element of this string is:", my_string[4] )    # 5th element of the string.
print("The last element of this string is:", my_string[-1] )  # -1 is the index of the last element of the string.

<br>

Indices can also be used to retrieve several elements at once: this is called a **slice operation** or **slicing**:
* The general syntax of slicing is `[start index: end index (excluded): step]`
* The end index position is **excluded from the slice**.
* The **default step value is 1**. It can be omitted (and usually is).
* If the start index is omitted, the slicing is implicitly done from the beginning of the string. `string[:10]`
* If the end index is omitted, the slicing is implicitly done until the end of the string. `string[10:]`

In [None]:
my_string = "And now, something completely different."

print(my_string)
print(my_string[0:5])   # Slice operation: get all elements from index 0 (included) to index 5 (excluded)
print(my_string[:5])    # Implicitly slices from the beginning of the string up to (but not including) index 5.
print(my_string[5:])    # Implicitly slices until the end of the string.
print(my_string[5::2])  # Keep every second letter, starting from index 5 to the end of the string.

**Tip:** you can reverse a sequence (such as a string) by using the **`[::-1]`** slicing operation.

In [None]:
print(my_string[::-1])  # Goes through the string from end to start -> reverses the string !

<br>

<div class="alert alert-block alert-success">

### Micro Exercise 1 <a id='19'></a>
* Create a string variable containing your name.
* Extract the last 3 letters from it using slicing.

</div>

<br>
<br>

[Back to ToC](#toc)

## Lists <a id='20'></a>
------------------------

Lists are **sequence type** objects that can contain any type of elements (other objects).
* **Lists** are declared by surrounding a comma separated list of objects with **`[]`**.  
  Example: `["This", "is", "a list", "with", 6, "items"]`

<br>

#### Creating a list

In [None]:
# Create a new list.
my_list = ["item 0", "item 1", "item 2"]

print("List content:", my_list)
print("List length :", len(my_list))

In [None]:
# Create a new list, here populated with integer numbers.
another_list = [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

print("List content:", another_list)
print("List length :", len(another_list))

* List items can be of **heterogeneous type**.
* List **items can be of any type**: therefore it's possible to nest lists within other lists.

In [None]:
nested_list = [1, 2, "spam", "eggs", 5.2, [2, "spam"]]

print("List content:", nested_list)
print("List length :", len(nested_list))

<br>

#### Creating an empty list

* Both **`[]`** and **`list()`** can be used to create an empty list.  
* Generally `[]` is considered the more *pythonic* way of creating empty lists.

In [None]:
empty_list = []
empty_list = list()

print("List content        :", empty_list)
print("List length         :", len(empty_list))
print("Type of 'empty_list':", type(empty_list))

<br>

#### Creating lists from iterables (e.g. sequences such as lists, tuples, or a range)

* Objects that are *iterables* can be converted to lists using the `list()` function.

In [None]:
# Creating a list from a "range" type object.

numbers = list(range(21))
print(numbers)

> What are **`range`** objects ?  
> `range` objects are sequences of **integer numbers**, e.g. `0, 1, 2, 3, 4, ...`.
>
> By default, a call to `range(x)` creates a sequence of integers from `0` to `x`, `x` excluded.
> * `range(10)`    -> `0, 1, 2, 3, 4, 5, 6, 7, 8, 9`
>
> Start and end values for the range can also be passed:
> * `range(3, 7)` -> `3, 4, 5, 6`
>
> A custom increment "step" can be passed as a 3rd argument.
> * `range(0, 22, 3)` -> `0, 3, 6, 9, 12, 15, 18, 21`
> * `range(10, 0, -1)` -> `10, 9, 8, 7, 6, 5, 4, 3, 2, 1`

<br>
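
The `range` calls described above can be checked by converting each `range` object to a list. This is a minimal sketch; the variable names are illustrative:

```python
# A range is lazy: converting it to a list shows the numbers it contains.
first_ten = list(range(10))
print(first_ten)               # -> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# Start and end values (the end value is excluded).
print(list(range(3, 7)))       # -> [3, 4, 5, 6]

# A custom step, passed as 3rd argument.
every_third = list(range(0, 22, 3))
print(every_third)             # -> [0, 3, 6, 9, 12, 15, 18, 21]

# A negative step counts downwards.
countdown = list(range(10, 0, -1))
print(countdown)               # -> [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]

# Like other sequences, a range has a length.
print(len(range(0, 22, 3)))    # -> 8
```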

[Back to ToC](#toc)

### Accessing values: list slicing <a id='21'></a>

* Accessing an element (or a range of elements) in a list is done using the **`[]`** operator.
* The **`[]` operator** works in much the same way as with strings, and allows
  **accessing individual objects** from a list, or **slicing** it.

<br>

<img src="img/var_list1.png" alt="representation of a list variable in memory with indices" style="width:250px;" />

<br>

* As with strings, remember that the **end position index is excluded** from the slicing.

In [None]:
my_list = [1, 2, "spam", "eggs", 5.2, [2, "spam"]]

print("First element  :", my_list[0])   # Get the 1st item of the list.
print("Elements 3 to 6:", my_list[2:])  # Get all elements from index 2 (i.e. the 3rd element) to the end of the list.

In [None]:
# Accessing the content of a nested list.
my_list[5][1]

<br>

* If we try to access an index that does not exist in the list, an **`IndexError`** is raised.

```python
    my_list = [1, 2, "spam"]
    my_list[3]

    ---------------------------------------------------------------------------
    IndexError                                Traceback (most recent call last)
    Input In [26], in <module>
          1 my_list = [1, 2, "spam"]
    ----> 2 my_list[3]

    IndexError: list index out of range
```

<br>

[Back to ToC](#toc)

<div class="alert alert-block alert-info">

### Additional material: Tuples <a id='22'></a>



Tuples are very similar to lists in that they are also **sequence type** objects that can contain any type of element (other objects).  
* **Tuples** are declared using the syntax **`(value1, value2, ...)`**.
* Values in **tuples** can be accessed and sliced in the same way as lists are.
* The main difference between lists and tuples is that **the values in a tuple cannot be changed**
  once the tuple has been created. This means that we cannot add/remove values from a tuple, nor can
  we modify a value inside it.
  
  
**Creating tuples** is done similarly to list, but using **`()`** instead of `[]`:

```python
tuple_1 = ("spam", "eggs", "coconuts")   # Create a tuple with 3 elements.
tuple_2 = ()                             # Create an empty tuple.
tuple_3 = tuple(range(10))               # Create a tuple from an iterable.

```

* ⚠️ When creating a tuple with a single element, a `,` must be added after the element: `tuple_2 = ("spam",)`
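
The effect of the trailing comma can be verified with `type()`. In this short sketch the variable names are illustrative:

```python
not_a_tuple = ("spam")     # Parentheses alone do not create a tuple: this is a str.
single_tuple = ("spam",)   # The trailing comma makes it a tuple with a single element.

print(type(not_a_tuple))    # -> <class 'str'>
print(type(single_tuple))   # -> <class 'tuple'>

# The lengths differ too: 4 characters vs. 1 element.
print(len(not_a_tuple), len(single_tuple))   # -> 4 1
```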

<br>

**Accessing tuple elements** is done with the `[]` operator, just like for lists or strings. Indexing also starts at `0`.
  
```python
tuple_1 = ("spam", "eggs", "coconuts")

tuple_1[0]  # 1st element of the tuple.
tuple_1[1]  # 2nd element of the tuple.
```



</div>

<br>

<div class="alert alert-block alert-info">

### Additional material: extracting non-consecutive values from a list <a id='22'></a>

While slicing works well to extract a single or consecutive elements from a list, creating a subset of non-consecutive
elements is somewhat trickier, but is possible using **list comprehension**.

```python
shopping_list = ["eggs", "ham", "spam", "coconuts", "shrubbery"]

# Extract the 3rd and last elements:
[shopping_list[i] for i in (2, -1)]  # -> ['spam', 'shrubbery']
```

<br>

However, if this is something that you have to do a lot, then the `list` type is probably not the optimal solution for
that particular problem, and it might be better to use a different data structure such as a **numpy array** (see
the optional module on numpy).


</div>

<br>
<br>

[Back to ToC](#toc)

### Manipulating lists: adding and removing elements <a id='23'></a>

Remember the `help()` function ? Let's use it to gain a better understanding of the `list` type:

In [None]:
help(list)

That's a lot of information... let's go through it one element at a time:

* First we learn that `list` is a class (i.e. a specific type of object).
  Calling `list()` thus creates an instance of type `list`.
  > class list(object)
  
* The help page then tells us that lists are:
  > built-in mutable sequence

  and describes the behavior of `list()` if no argument is given (creates an empty list).
  > If no argument is given, the constructor creates a new empty list.  
  > The argument must be an iterable if specified.
  
* The help then gives all methods available for class `list`, under `Methods defined here:`
    * **Methods** are functions that can be called on objects of the class they belong to.
      They often enable basic manipulation of objects of that type.  
    * Methods are called using the syntax **`object.method(...)`**.
    * When listed in the help, the first argument of methods is always **`self`**, but this argument does
      not need to be passed when calling the method.
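    * Methods that start with **`_`** are **private methods**. They are not meant to be directly called
      by the end user (this is a convention but is not enforced by Python). Methods whose name starts and
      ends with **`__`**, such as `__len__`, are **special methods** that Python calls implicitly
      (e.g. `len(my_list)` calls `my_list.__len__()`), so they are usually not called directly either.
      > *Note:* the double underscore **`__`** is called a **dunder** (double underscore).  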
    * The **`/` symbol** found in some method signatures indicates that **all arguments present before the `/`
      are positional-only arguments, even if they have a default value**. They have to be passed in the correct
      order, and cannot be passed with their argument name, i.e. `argument_name=value` - 
      [more details here](https://www.python.org/dev/peps/pep-0570).
    * The **`*` symbol** found in some method signatures indicates that **all arguments after the `*` are
      keyword-only arguments**. In other words, no positional arguments are allowed after the `*`, and all
      arguments passed after the `*` must always be passed as `argument_name=value` -
      [more details here](https://peps.python.org/pep-3102).
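
The meaning of `/` and `*` in method signatures can be illustrated with two `list` methods: `append(self, object, /)` and `sort(self, /, *, key=None, reverse=False)`. The sketch below uses an illustrative `numbers` list; the exact error messages depend on the Python version, so they are printed rather than quoted:

```python
numbers = [3, 1, 2]

# append(self, object, /): "object" is before the "/", so it is positional-only.
numbers.append(4)                # OK: argument passed by position.
try:
    numbers.append(object=5)     # Not allowed: argument passed by name.
except TypeError as error:
    print("TypeError:", error)

# sort(self, /, *, key=None, reverse=False): "reverse" is after the "*", so it is keyword-only.
numbers.sort(reverse=True)       # OK: argument passed by name.
print(numbers)                   # -> [4, 3, 2, 1]
try:
    numbers.sort(True)           # Not allowed: argument passed by position.
except TypeError as error:
    print("TypeError:", error)
```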

<br>

Let's focus on 3 methods of the `list` class:
 * **`append(self, object, /)`**: adds an object - given as argument - at the end of the list.
 * **`extend(self, iterable, /)`**: concatenates (extends) the list with the items from the iterable passed as argument.
 * **`insert(self, index, object, /)`**: inserts an object - given as the 2nd argument - before 
   the index given as the 1st argument.

<br>
<br>

**Adding a single element to a list: `.append()` vs `.extend()`**

In [None]:
my_list = [1 , 2 , 5]
print("Initially, my_list is:", my_list)

* Calling the **`.append()`** method adds the specified element at the end of the list.

In [None]:
my_list.append("ham") 
print(my_list)

In [None]:
my_list.append("eggs")
print(my_list)

<br>

* Trying to add a string to a list with **`.extend()`** can lead to unexpected results.

In [None]:
my_list = [1 , 2 , 5]
my_list.extend("eggs")
print(my_list)

<br>
<br>

**Adding multiple values to a list: `.append()` vs. `.extend()` vs. list concatenation**
* **`.append(value)`** adds the specified value at the end of the list as a single value.
* When trying to add multiple values to a list, it might not have the desired effect.

In [None]:
my_list = [1, 2, 3]
my_list.append(["spam", "eggs"])

print("List content:", my_list)
print("List length: ", len(my_list), "\n")

* **`.extend(iterable)`** adds the values given in the specified **iterable** (e.g. a list, tuple, generator) to
  the list.
* This is the method you want to use to add multiple values to a list.

In [None]:
my_list = [1, 2, 3]
my_list.extend(["spam", "eggs"])

print(my_list)

In [None]:
# Extending a list with a "range" object (iterable).
my_list = [1, 2, 3]
my_list.extend(range(4, 11))

print(my_list)

<br>

* ✨ **Tip**: If both objects are of type `list`, they can be concatenated with the **`+`** and **`+=`** operators. A list can also be repeated with the **`*`** operator (e.g. `my_list * 3`).

In [None]:
my_list = [1, 2, 3]
my_list = my_list + ["spam", "eggs"]
print(my_list)

# += is a shortcut for extending a list.
my_list += ["more spam", "more eggs"]   # Same result as: my_list = my_list + ["more spam", "more eggs"]

print(my_list)

> ⚠️ *Warning* : list concatenation with the `+` operator only works if both objects are of type `list`.
> 
>  ```python
>  test_list = [1, 2, 3]
>  test_list.extend((4, 5, 6))   # ✅ this works !
>  test_list + (1, 2, 3)         # ❌ this FAILS: TypeError: can only concatenate list (not "tuple") to list
>  ```
>

In [None]:
# Create a new list by concatenating two lists.
list_one = [ "hello" , 1159 ]
list_two = list_one + [10.1, "45", 7]
print(list_one)
print(list_two)

In [None]:
# Extend a list with the += operator.
list_one += ["spam", "eggs"] 
print(list_one)

In [None]:
# Concatenate the list multiple times with multiplication.
menu = ["spam", "eggs"] * 3 
print(menu)

<br>

<div class="alert alert-block alert-info">

#### Additional Material: `insert()` method

* Adding an element at a **specific position in the list** can be done with the **`insert()`** method.
* In this example, we add an element in second position of `my_list`.
* Remember that Python indices start with 0, so inserting before position 1 puts 
  the new object in second position in `my_list` (and not in the first).

</div>

In [None]:
my_list = [1, 2, 3]
my_list.insert(1 , "beans") 
print("list after insert:", my_list)

<br>
<br>

### Deleting elements in a list

* `list_object.pop(x)`: **deletes** the element at position `x` **and returns it**.
  If no arguments are passed to `.pop()`, the last element is removed by default.
* `del list_object[x]`: **deletes** the element at position `x`. A slice can also be deleted, e.g. `del list_object[0:3]`.

✨ **Tip:** using **`.pop()`** is generally considered to be more *pythonic* than using `del`.

<br>

**Example:** deleting an item with the `.pop()` method:

In [None]:
a_list = list(range(21))
print("Original list:\n", a_list, "\n")

# By default, the last element of the list is removed by pop().
removed = a_list.pop()
print("Removed the last element:\n", a_list)
print("The element removed by pop is:", removed, end="\n\n")

# To remove an element at a specific index, the index value must be passed to the .pop() method:
removed = a_list.pop(0)
print("Removed the first element:\n", a_list)
print("The element removed by pop is:", removed)

<br>

**Example:** deleting with `del`:

In [None]:
a_list = list(range(21))
print("Original list:\n", a_list, "\n")

# Delete the last element from the list.
del a_list[-1]
print("Deleted the last element:\n", a_list, "\n")

# Delete all elements in positions 0 to 9. The element in position 10 is not deleted.
del a_list[0:10]
print("Deleted elements 0-9:\n", a_list)

<br>

[Back to ToC](#toc)

### From list to string, and back again ... <a id='24'></a>
Since strings are **iterables** (they are sequences of characters), they can be converted to a list of characters using the **`list()`** function:

In [None]:
quote = "Drop your panties Sir William, I cannot wait till lunchtime."
individual_chars = list(quote)

print(individual_chars)

As can be seen above, the default behavior is that **each letter of the string becomes an element in the list**.

However, often we prefer to create a list that contains each word of the string. For this we use the **`.split()`** method of string:
* The `.split()` method is very useful when reading formatted text files.
* By default, it splits on white space: spaces, tabs, newlines.
* It accepts an optional **`sep`** argument that allows separation of fields using the specified character:
  look up `help(str.split)` for details.

In [None]:
quote = "Drop your panties Sir William, I cannot wait till lunchtime."
words = quote.split()

print(words)

<br>
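
The optional `sep` argument of `.split()` can be sketched as follows. The tab-separated `record` and the comma-separated `csv_line` are made-up examples of the kind of formatted text one might read from a file:

```python
# A hypothetical tab-separated record, as could be found in a formatted text file.
record = "BRCA1\t17\t43044295"
fields = record.split("\t")
print(fields)        # -> ['BRCA1', '17', '43044295']

# With an explicit separator, two consecutive separators produce an empty string.
csv_line = "spam,eggs,,ham"
items = csv_line.split(",")
print(items)         # -> ['spam', 'eggs', '', 'ham']

# Without argument, any run of white space counts as a single separator.
print("spam   eggs\tham".split())   # -> ['spam', 'eggs', 'ham']
```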

**To convert a list to a string**, the **`.join()`** method can be used - it can be seen as the inverse of `.split()`.  
Somewhat counter-intuitively, the `.join()` method applies to strings, and takes a list as argument:

In [None]:
# Here, the join method is called on the separator string, and takes the list "words" as argument.
quote = " ".join(words) 

print(quote)
print(type(quote))

In [None]:
# One can use a more exotic separator - in fact, any string can be used as separator.
quote = "_SEP_".join(words) 
print(quote)

* ✨ **Tip:** use an empty separator `""` to join characters into a word.

In [None]:
a_string = "".join(['to','ba','c','co','ni','st']) 
print(a_string)

<br>

<div class="alert alert-block alert-success">

### Micro Exercise 2 <a id='25'></a>

* Create a list containing all integers from `0` to `3` (included).
* Add two numbers at the end of the list.
* Use a slicing operation to select the fourth element of the list.
* Use a slicing operation to select the last element of the list.
* **Additional tasks (if you have the time):**
  * What is the difference between `list.pop()` and `list.remove()`?  
    Try to find-out empirically using the list `[6, 5, 5, 4, 3, 2, 1]` and
    running `.pop(5)` and `.remove(5)` on the list.
  * Why does `print(my_list.append("something"))` print "None" ?

</div>

<br>
<br>
<br>

### Returning a value vs. in-place modification

As you might have noticed, methods sometimes return a value, and sometimes modify an object "in place", meaning that the object on which the method is called is itself modified.

**Example:** the `.upper()` method of a string object returns a value (a modified copy of the original string). It does not modify the original string.

In [None]:
quote = "You must cut down the mightiest tree in the forest with a herring!"
return_value = quote.upper()

print("The return value is:", return_value)
print("The original object:", quote)        # The original object has not been modified.

<br>

**Example:** The `.append()` method of a list **modifies the list in place**, and returns the value `None`.


In [None]:
more_quotes = [
    "What… is your quest?",
    "To seek the Holy Grail",
]
return_value = more_quotes.append("Well, how did you become king then?")

print("The return value is:", return_value)
print("The original object:", more_quotes)

<br>

When calling a method that modifies an object in place and returns `None`, we generally do not store the return value in a variable.  
Appending a value to a list would thus be written as:

In [None]:
more_quotes.append("Well, how did you become king then?")
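
Mixing up the two behaviors is a frequent source of bugs. The sketch below, with illustrative variable names, shows the two classic pitfalls: re-assigning the return value of an in-place method, and forgetting to store the value returned by a method that does not modify the original object.

```python
menu = ["spam", "eggs"]

# Pitfall 1: .append() modifies the list in place and returns None,
# so this assignment replaces the list with None.
menu = menu.append("ham")
print(menu)            # -> None

# Pitfall 2: .upper() returns a new string and leaves the original untouched.
shout = "ni!"
shout.upper()          # The returned value is discarded: 'shout' is unchanged.
print(shout)           # -> ni!

shout = shout.upper()  # Store the returned value to keep it.
print(shout)           # -> NI!
```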

<br>
<br>
<br>

[Back to ToC](#toc)

## Dictionaries <a id='26'></a>
---------------

Dictionaries, or **`dict`**, are containers that associate a **key** to a **value**, just like a real world dictionary associates a word to its definition.
* Dictionaries are instantiated with the `{key:value}` or `dict(key=value)` syntax.

    ```python
    color_code = {"blue": 23, "green": 45, "red": 8}
    ```
    <br>
  
* **Keys must be unique** in the dictionary, and must be an immutable object (typically a `str`).
* **Values** can appear multiple times.
* The `[]` operator is used to **select objects from the dictionary**, but **using their key** instead
  of their index. E.g. `color_code[0]` will raise a **`KeyError`**, unless
  there is a key value of `0` in the dict (which is not the case in our example).
  
    ```python
    color_code["blue"]   # returns 23
    color_code["red"]    # returns 8
    ```
    <br>

  
* Dictionaries are **mutable** objects: `key:value` pairs can be added and removed, values can be modified.

**Examples:**

* **Create a dictionary with values** in it.

In [None]:
student_age = {
    "Anne": 26 , 
    "Victor": 31,
}

# Alternatively:
student_age = dict(Anne=26, Victor=31)

print(student_age)

<br>

* **Retrieve values** associated with keys.

In [None]:
print("The age of Anne is  :", student_age["Anne"])
print("The age of Victor is:", student_age["Victor"])

<br>

* **Trying to access an element of the `dict` by index is not possible**. It raises a **`KeyError`**, because
  python is trying to find the key `0` in the dictionary and it does not exist.

In [None]:
# Trying to access an element of the dict by index -> KeyError
student_age[0]
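
To avoid a `KeyError`, the presence of a key can be tested before accessing it. A minimal sketch, using the `in` operator and the `.get()` method of `dict` (which returns `None`, or a default value of our choice, when the key is missing):

```python
student_age = {"Anne": 26, "Victor": 31}

# The "in" operator tests whether a key is present in the dict.
print("Anne" in student_age)   # -> True
print(0 in student_age)        # -> False

# .get() returns the value associated with a key, or None if the key is missing.
print(student_age.get("Anne"))   # -> 26
print(student_age.get(0))        # -> None

# A default value can be passed as 2nd argument.
print(student_age.get(0, "unknown"))   # -> unknown
```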

* **Adding additional `key:value` pairs** to a dictionary, or modifying an existing key is as easy as:

In [None]:
student_age["Eleonore"] = 5
print(student_age)

<br>

* **Modifying the value of an existing key** of a dictionary.

In [None]:
student_age["Eleonore"] = 25
print(student_age)

student_age["Eleonore"] += 1  # Shortcut for: student_age["Eleonore"] = student_age["Eleonore"] + 1
print(student_age)

<br>

* **Create an empty dictionary**, then **add values** to it.
  * Empty dictionaries can be created with either **`{}`** or **`dict()`**.
    Using `{}` is considered more *pythonic*.
  * To **add a value to a `dict`**, we simply assign a value to a new key.

In [None]:
# Create an empty dict.
student_age = dict()
student_age = {}
print(student_age)

In [None]:
# Add new key:value pairs to the dict.
student_age["Anne"] = 26
print(student_age)

student_age["Victor"] = 31
print(student_age)

<br>

We are not restricted to a particular type for keys, nor for values. We can e.g. make a `dict` of lists or `dict` of `dict`.
* In practice, it's best to use dictionaries for storing **homogeneous values** (i.e. you probably don't want
  to store unrelated things in different keys).

In [None]:
student_age[0] = "zero"                             # Key is an integer number.
student_age["group_1"] = [23, 25, 28]               # Value is a list.
student_age["group_2"] = {"bob": 26, "alice": 27}   # Value is a dict.

print(student_age)
print("Bob's age is:", student_age["group_2"]["bob"])

* Mutable values **cannot be used as keys!**

In [None]:
# Not allowed to use `[1,2]` as a key of the dict, because a list is a mutable object.
student_age[[1,2]] = "shrubbery"
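
When a sequence is needed as a key, the immutable `tuple` can be used instead of a `list`. A short sketch with a hypothetical `coordinates` dict (the exact error message may vary between Python versions, so it is printed rather than quoted):

```python
coordinates = {}

# A tuple is immutable, so it can be used as a key.
coordinates[(1, 2)] = "shrubbery"
print(coordinates[(1, 2)])   # -> shrubbery

# A list is mutable, so using it as a key raises a TypeError.
try:
    coordinates[[1, 2]] = "shrubbery"
except TypeError as error:
    print("TypeError:", error)

print(len(coordinates))      # -> 1
```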

<br>

### Removing items from a dictionary

**Removing an item from a dictionary** is similar to deleting items from a list:

 * **`dict.pop(key)`**: deletes the specified `key` from the dictionary and returns its value.
 * **`del dict[key]`**: deletes the specified `key` (and its associated value) from the dictionary.

In [None]:
# Create a new dictionary.
student_age = {
    "Anne": 26,
    "Victor": 31,
    "Eleonore": 25,
}
print('dictionary:', student_age)

In [None]:
# Delete values from the dictionary.
del student_age["Victor"]
removed_value = student_age.pop("Anne")

print('Dictionary after value removal:', student_age)
print("Value we removed with 'pop'   :", removed_value)

<br>
<br>

## Exercises 2.1 - 2.3 <a id='27'></a>
--------------------------

* Exercises are found in a separate Jupyter Notebook.
* If you have time, feel free to try the **additional exercises**.

<br>

<br>

<br>

[Back to ToC](#toc)

<div class="alert alert-block alert-info">

# Additional Material <a id='28'></a>
-------------------------------------

</div>

<br>

### Mutability of objects in Python <a id='29'></a>

All objects in Python are either **mutable** or **immutable**. This is an important notion that newcomers to Python need to be aware of, as ignoring it can lead to serious bugs in our code.

What do we mean by *mutable*? We learnt earlier that **everything in Python is an object** and every variable holds an instance of an object. Once its type is set at runtime it can never change. A list is always a list, an integer is always an integer. However its value can be modified if it is mutable.

A mutable object can be changed/modified after it is created, and an immutable object can’t.

| Class   | Mutable |
| ------- |:-------:|
| `bool`  | no |
| `int`   | no |
| `float` | no |
| `str`   | no |
| `list`  | yes |
| `tuple` | no |
| `dict`  | yes |

Mutability does not have much practical importance for simple types, but it does for container types.  
Let's see this with some examples:

In [None]:
a_str = "Python"
a_list = ["P", "y", "t", "h", "o", "n"]
a_tuple = ("P", "y", "t", "h", "o", "n")
a_dict = {0: "P", 1: "y", 2: "t", 3: "h", 4: "o", 5: "n"}

Let's try to modify an element (an individual char) in a string: it raises a **`TypeError`** because a string is an **immutable type**.

In [None]:
# Let's try to change "P" into "p"
print(a_str[0])
a_str[0] = "p"

Let's try to modify an element in a list: this is possible, because a list is a **mutable type**.

In [None]:
print(a_list)
a_list[0] = "p"
print(a_list)

However, the *immutable* cousin of `list`, the `tuple`, does not allow assignment:

In [None]:
print(a_tuple[0])
a_tuple[0] = "p"

<br>

Dictionaries are mutable, their values can be modified:

In [None]:
print(a_dict)
a_dict[0] = "p"
print(a_dict)

In [None]:
my_dict = {"str": a_str, "list": a_list}
another_dict = my_dict
print(another_dict["list"])

In [None]:
# Let's now modify my_dict...
my_dict["list"][0] = "P"

# ... and see what happens to both dictionaries.
print("my_dict:", my_dict)
print("another_dict:", another_dict)

Although we never explicitly modified `another_dict`, it was also changed. This is because the assignment `another_dict = my_dict` does not create a copy: both names refer to the same `dict` object, and its key **'list'** refers to the same `list` object as `a_list`. Since a list is mutable, modifying it is visible through every variable that refers to it. Let's illustrate this with a final example.

In [None]:
a_list[0] = "Z"
print("my_dict:", my_dict)
print("another_dict:", another_dict)
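
To tell whether two names refer to the same object or merely to equal content, the `is` and `==` operators can be compared. The following is a small sketch with hypothetical dictionaries (`original`, `alias` and `lookalike` are illustrative names):

```python
# Hypothetical dictionaries, used for illustration only.
original = {"list": ["P", "y"]}
alias = original                  # no copy: a second name for the same dict
lookalike = {"list": ["P", "y"]}  # a separate dict with equal content

print(alias is original)          # same object
print(lookalike == original)      # equal values
print(lookalike is original)      # but not the same object

original["list"][0] = "Z"
print(alias)                      # follows the change
print(lookalike)                  # unaffected
```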

**To summarize:**

* An object in Python can either be mutable or immutable.
* A simple way to check is to try modifying an element of an object: item assignment on an immutable type raises a `TypeError`.
* `str` and `tuple` are immutable.
* `list` and `dict` are mutable.
* We need to pay attention when we modify mutable objects that are referenced by multiple variables!

In [None]:
my_dict["str"] = "Zython"

print("my_dict:", my_dict)
print("another_dict:", another_dict)

In [None]:
a_third_dict = my_dict.copy()
my_dict["str"] = "Kython"
my_dict["list"][0] = "K"

print("my_dict:", my_dict)
print("third_dict:", a_third_dict)

<br>

[Back to ToC](#toc)

### A solution: explicit deep copy <a id='30'></a>

The difference between shallow and deep copying is only relevant for compound objects (objects that contain other objects, like lists, dicts, or class instances):

* A **shallow copy** constructs a new compound object and then (to the extent possible) inserts references into
  it to the objects found in the original.
* A **deep copy** constructs a new compound object and then, recursively, inserts copies into it of the objects found
  in the original.
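
The difference between the two kinds of copy can be sketched with the standard `copy` module. The dictionary below is hypothetical data chosen for illustration:

```python
import copy

# Hypothetical data, for illustration only.
original = {"str": "Python", "list": ["P", "y"]}
shallow = copy.copy(original)       # for a dict, same effect as original.copy()
deep = copy.deepcopy(original)

# The shallow copy shares the nested list with the original; the deep copy does not.
print(shallow["list"] is original["list"])
print(deep["list"] is original["list"])

original["list"][0] = "K"
print(shallow["list"])   # sees the change
print(deep["list"])      # keeps the old content
```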

In [None]:
import copy

a_third_dict = copy.deepcopy(my_dict)   # <- explicit deep copy
my_dict["str"] = "Back to Python"
my_dict["list"][0] = "P"
print("my_dict:", my_dict)
print("a_third_dict:", a_third_dict)

### Copying immutable values

* When a variable of an immutable type (e.g. `str`, `int`, `float`, `tuple`, ...) is assigned to another variable,
  both variables point to the same object in memory (no copy of the value is made).
* However, when the value of one of the variables is changed (i.e. it is re-assigned a new value), that variable
  then points to a different memory location. This is expected, since objects of immutable types cannot have
  their value modified.


In [None]:
a = 3
b = a
print("Are 'a' and 'b' pointing to the same object in memory:", a is b)
print("Memory locations of the 2 objects:", id(a), id(b), sep="\n")
b += 1
print("After having modified 'b':")
print("Are 'a' and 'b' pointing to the same object in memory:", a is b)
print("Memory locations of the 2 objects:", id(a), id(b), sep="\n", end="\n\n")

#### Python memory management: interned vs non-interned values

> In CPython (the reference implementation of Python), integer values from -5 to 256 are **"interned"**, which means that they are created once
> and then re-used over the entire runtime of the python program/session.

```py
    a = 256
    b = 256
    print("Are 'a' and 'b' pointing to the same object in memory?:", a is b)
    print("Memory locations of the 2 objects:", id(a), id(b), sep="\n", end="\n\n")
```
```text
    Are 'a' and 'b' pointing to the same object in memory?: True
    Memory locations of the 2 objects:
    9801248
    9801248
```

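> The integer value of 257 on the other hand, is not "interned": this means that if the
> value 257 is assigned independently to two different variables, they will generally not point
> to the same memory location. The value of 257 is thus duplicated in memory. This is an
> implementation detail: depending on how the code is run (e.g. as a script rather than
> interactively), Python may still re-use a single object.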
```py
    a = 257
    b = 257
    print("Are 'a' and 'b' pointing to the same object in memory:", a is b)
    print("memory locations of the 2 objects:", id(a), id(b), sep="\n", end="\n\n")
```
```text
    Are 'a' and 'b' pointing to the same object in memory: False
    memory locations of the 2 objects:
    139648697282640
    139648697282736
```

> Literal strings are also "interned": when a variable is assigned a literal
> string, python will first check whether such a string already exists somewhere
> in memory, and if yes, the variable is pointed to the already existing string
> in memory.
```py
    a = "shrubbery"
    b = "shrubbery"
    print("Are 'a' and 'b' pointing to the same object in memory:", a is b)
    print("memory locations of the 2 objects:", id(a), id(b), sep="\n", end="\n\n")
```
```text
    Are 'a' and 'b' pointing to the same object in memory: True
    memory locations of the 2 objects:
    139649051058800
    139649051058800
```

> Non-literal strings on the other hand are not interned:
```py
    a = str(23)
    b = str(23)
    print("Are 'a' and 'b' pointing to the same object in memory:", a is b)
    print("memory locations of the 2 objects:", id(a), id(b), sep="\n", end="\n\n")
```
```text
    Are 'a' and 'b' pointing to the same object in memory: False
    memory locations of the 2 objects:
    139649050161200
    139649050161968
```

> Because at this point the string "shrubbery" has already been created once before,
> re-assigning it to another variable will still point to the same memory location as
> it did earlier when it was assigned to "a" and "b".
```py
    c = "shrubbery"
    print("memory location of 'c' is:", id(c), end="\n\n")
```
```text
    memory location of 'c' is: 139649051058800
```
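
Because interning is an implementation detail, values are best compared with `==` rather than `is`. When sharing a single string object is really wanted, the standard library offers `sys.intern()`. A minimal sketch, re-using the non-literal strings from the example above:

```python
import sys

a = str(23)
b = str(23)
print(a == b)    # True: the values are equal, whether or not the objects are shared

# sys.intern() returns the interned version of a string, so both names
# end up referring to one shared object.
a = sys.intern(a)
b = sys.intern(b)
print(a is b)
```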


[Back to ToC](#toc)



### Tuples <a id='tuple'></a>


#### Create a tuple
 * **Important:** if a tuple contains a single element, then the last (and only) element of the tuple must be
   followed by a comma.
 * If the tuple contains multiple elements, then this final comma is not necessary (but allowed).
     ```py
     a_tuple = (value, )    # Correct syntax.
     a_tuple = (value)      # This will NOT create a tuple, but a regular value.
     ```

In [None]:
# Create a tuple of 3 elements.
tuple_1 = ("spam", "eggs", "coconuts")

print(tuple_1)

In [None]:
# Create a tuple of 1 element.
tuple_1 = ("spam", )

print("Content of variable:", tuple_1)
print("Length of tuple is :", len(tuple_1))
print("type of variable   :", type(tuple_1))

In [None]:
# Create a tuple from a list.
list_1 = ["a", "sequence", "of", "strings"]
tuple_1 = tuple(list_1)

print(tuple_1)

<br>

**Warning:** the following does not create a tuple, but a variable of type string!

In [None]:
tuple_1 = ("spam")
print("Content of variable:", tuple_1)
print("Length of variable :", len(tuple_1))
print("type of variable   :", type(tuple_1))

<br>

#### Creating empty tuples
* Empty tuples can be created with `()` or `tuple()`.
* Note that because tuples cannot be changed after they are created, it is not possible to add elements
  to an empty tuple.

In [None]:
tuple_1 = ()
tuple_1 = tuple()

print("Content:", tuple_1, " Type:", type(tuple_1), " Length:", len(tuple_1))
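
Although elements cannot be added to an existing tuple, a new tuple can be built from it by concatenation. A short sketch (the names `empty` and `extended` are illustrative):

```python
empty = ()
extended = empty + ("spam",)   # concatenation returns a NEW tuple
print(empty)                   # the empty tuple is unchanged
print(extended)
print(empty is extended)       # two distinct objects
```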

<br>

### When to use `list` or `tuple`? Mutability - an important difference between lists and tuples 

* A `list` is **mutable**: it can be extended, reduced, and its elements can be changed.
* A `tuple` is **immutable**: its length is fixed and its elements cannot be changed.

<br>

**Use tuples** when:
  * You need to store a sequence of objects that will not change in your program (fixed length).
  * You want to be sure that a sequence of objects will not be accidentally modified - a
    sort of **write-protection**.
  * You want to save a little memory: tuples are slightly more memory efficient than lists
    (the exact sizes depend on the Python version).
      ```py
        import sys
        print(sys.getsizeof((1, 2, 3, 4, 5)))  # -> 80 bytes
        print(sys.getsizeof([1, 2, 3, 4, 5]))  # -> 96 bytes.
      ```

<br>

**Use lists** when:
  * You need to store a sequence of objects that will be modified over time.
  * You need to have a sequence that can be grown (add elements) or shrunk (remove elements).
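
The two use cases can be combined, as in the following sketch (hypothetical data): a list is grown and shrunk while it is being built, then frozen into a tuple to protect it from accidental modification.

```python
# Hypothetical example data.
shopping = ["spam", "eggs"]
shopping.append("ham")      # grow in place
shopping.remove("spam")     # shrink in place
print(shopping)

# Freezing the current content as a tuple gives a "write-protected" snapshot.
snapshot = tuple(shopping)
try:
    snapshot[0] = "coconuts"
except TypeError as error:
    print("Cannot modify the tuple:", error)
```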

<br>

For more details about object mutability in python, see the **Additional Material** section at the end of this notebook.

<br>

**Example:** because lists are mutable, we can modify an element in a list (or add/remove an element from a list).

In [None]:
# Create a new list.
sandwich_ingredients = ["spam", "ham", 3, "eggs"]

# We now modify the 4th element of the list:
sandwich_ingredients[3] = "and spam"

print(sandwich_ingredients)

Trying to do the same modification on a tuple raises a **`TypeError`**:

In [None]:
sandwich_ingredients = ("spam", "ham", 3, "eggs")
sandwich_ingredients[3] = "and spam"

<br>

What can be done however, is to **assign a new tuple to the same variable** - this will *look* like we have modified a tuple, but in fact we have created a new tuple object and assigned it to our variable.

In [None]:
sandwich_ingredients = ("spam", "ham", 3, "eggs")
print(sandwich_ingredients)

# We do not modify an existing tuple: we create a new one.
sandwich_ingredients = ("spam", "ham", 3, "and spam")
print(sandwich_ingredients)
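
To confirm that re-assignment creates a new object, we can keep a second name on the original tuple. This is a sketch with illustrative names (`ingredients`, `first_version`); the new tuple is built here from a slice of the old one:

```python
ingredients = ("spam", "ham", 3, "eggs")
first_version = ingredients          # keep a second name on the original tuple

# Build a new tuple from a slice of the old one and re-bind the name to it.
ingredients = ingredients[:3] + ("and spam",)

print(first_version)                 # the original tuple still exists, unchanged
print(ingredients)
print(first_version is ingredients)  # two distinct objects
```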


<br>

### Additional info: tuples referencing mutable values

* We just saw that **tuples are immutable**... but let's consider the following:

In [None]:
my_tuple = ("a", "b", [1, 2, 3])
print("The tuple looks like this:", my_tuple)

my_tuple[2][2] = "Did I just change an immutable tuple?"
print("The tuple looks like this:", my_tuple)

In the above, it *looks* like we have modified a tuple! But in fact, we have only modified a list that happens
to be referenced by the tuple.

Nested lists or tuples can be visualized like this:

<img src="img/var_list_nested.png" alt="representation of a nested list variable in memory" style="width:450px;" />

* So `my_list` does not really contain the list and string themselves, but only **pointers to these objects**.
* Changing the content of a list does not change its pointer, and therefore in the code example above
  the tuple has in fact not been modified: it is still pointing at the same list. The same behavior
  happens if we store a dictionary in a tuple.

* This behavior can be visualized using this [interactive code visualizer](https://pythontutor.com/render.html#code=my_tuple%20%3D%20%28%22a%22,%20%22b%22,%20%5B1,%202,%203%5D%29%0Aprint%28%22The%20tuple%20looks%20like%20this%3A%22,%20my_tuple%29%0A%0Amy_tuple%5B2%5D%5B2%5D%20%3D%20%22did%20I%20just%20change%20an%20immutable%20tuple%3F%22%0Aprint%28%22The%20tuple%20looks%20like%20this%3A%22,%20my_tuple%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false)
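
The same point can be checked in code. In this sketch, `inner_before` is an illustrative name for the list referenced by the tuple:

```python
my_tuple = ("a", "b", [1, 2, 3])
inner_before = my_tuple[2]          # the list object referenced by the tuple

my_tuple[2].append(4)               # modifies the list, not the tuple
print(my_tuple)
print(my_tuple[2] is inner_before)  # the tuple still references the same list

# Replacing the reference itself is what the tuple forbids.
try:
    my_tuple[2] = [9, 9, 9]
except TypeError as error:
    print("TypeError:", error)
```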

<br>

For reference, accessing the nested elements looks like this:

<img src="img/var_list_nested_access.png" alt="representation of a nested list variable in memory with access" style="width:450px;" />

<br>

### Copy of list content vs. copy of pointer (copying mutable values)

When assigning a variable to another variable (as done below when assigning `l1` to `l2`), we are not duplicating the content of the existing variable: instead, we only create **a new pointer** to the content of the original variable.
* The example below can also be visualized on [the interactive code visualizer](https://pythontutor.com/render.html#code=l1%20%3D%20%5B1,%202,%203%5D%0Al2%20%3D%20l1%0Al2%5B0%5D%20%3D%20-1%0A%0Aprint%28%22l2%20is%3A%22,%20l2%29%0Aprint%28%22l1%20is%3A%22,%20l1%29%0A%0Adel%20l1%0Adel%20l2%0A%0Al1%20%3D%20%5B1,%202,%203%5D%0Al2%20%3D%20l1.copy%28%29%0Al2%5B0%5D%20%3D%20-1%0A%0Aprint%28%22l2%20is%3A%22,%20l2%29%0Aprint%28%22l1%20is%3A%22,%20l1%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false)

In [None]:
l1 = [1, 2, 3]
l2 = l1
print("The content of list 1:", l1)
print("The content of list 2:", l2)

# By modifying the underlying list to which both l1 and l2 are pointing, both l1 and l2
# return the modified value.
l2[0] = -1
print("The content of list 1:", l1)
print("The content of list 2:", l2)

print("\nAre both lists pointing to the same memory location?", l1 is l2)
print("Memory locations of the 2 lists:", id(l1), id(l2), sep="\n")

To make a copy of the actual list content, we can use the **`copy()`** method of the list (this creates a shallow copy).

In [None]:
l1 = [1, 2, 3]
l2 = l1.copy()
print(l1)
print(l2)

# Now l1 and l2 are pointing to different memory locations.
l2[0] = -1
print(l1)
print(l2)

print("\nAre both lists pointing to the same memory location?", l1 is l2)
print("Memory locations of the 2 lists:", id(l1), id(l2), sep="\n")
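
Note that `copy()` only duplicates the outer list. The sketch below uses a hypothetical nested list to show that `l1.copy()`, `list(l1)` and `l1[:]` all behave this way: each creates a new outer list whose elements are still shared with the original.

```python
l1 = [[1, 2], [3, 4]]     # hypothetical nested list

# Three equivalent ways to make a shallow copy of a list.
copies = [l1.copy(), list(l1), l1[:]]

for c in copies:
    print(c is l1, c[0] is l1[0])   # new outer list, shared inner list

l1[0][0] = -1             # modify an inner list through l1
print(copies[0])          # the change shows up in the shallow copy
```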

<br>

### Benchmarking: looping speed of tuples vs lists
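* As can be tested below, there is no noticeable speed difference between looping over a `list` and a `tuple`.
* Looping directly over a `range` object is faster, probably because it skips the step where all elements of
  the sequence must first be created and stored in memory. Strictly speaking, a `range` is a lazy sequence
  rather than a generator, but like a generator it produces its values on demand.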

In [None]:
# Functions that do nothing but loop through a list, tuple or range.

loop_replicates = 1000000

def loop_range():
    for x in range(loop_replicates):
        pass

def loop_tuple():
    for x in tuple(range(loop_replicates)):
        pass

def loop_list():
    for x in list(range(loop_replicates)):
        pass

# Compare the speed between tuples, lists and ranges.
%timeit loop_tuple()
%timeit loop_list()
%timeit loop_range()
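
The `%timeit` magic only works in IPython/Jupyter, and the functions above also time the construction of the tuple and the list. As a possible variant (a sketch, not the notebook's benchmark), the standard `timeit` module can be used with containers built beforehand, so that only the looping itself is measured. The size `n_items` and the number of repetitions are arbitrary choices:

```python
import timeit

n_items = 100_000                 # hypothetical size, smaller than in the cell above
as_tuple = tuple(range(n_items))  # containers are built once, outside the timed code
as_list = list(range(n_items))
as_range = range(n_items)

def loop(sequence):
    """Do nothing but iterate over the given sequence."""
    for x in sequence:
        pass

for name, sequence in [("tuple", as_tuple), ("list", as_list), ("range", as_range)]:
    seconds = timeit.timeit(lambda: loop(sequence), number=20)
    print(f"{name:>5}: {seconds / 20 * 1000:.3f} ms per loop")
```

Timings vary between machines and Python versions, so the numbers are best read as a relative comparison.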