# SLU04 - Data Structures - Learning notebook 2

In this notebook, we will be covering the following:   
 
- Mutability and immutability
- `zip` function
- 'Splat' `*` operator

---

## 5. An introduction to mutability & immutability

### 5.1. Mutability concept

![Mutants.png](data/mutants.png)

⚠️ **Warning:** Before we begin, please understand that mutability (and, obviously, immutability) is a **huge topic in Python** and in every other programming language that shares the same [paradigm](https://en.wikipedia.org/wiki/Programming_paradigm), because it goes all the way to the core of accessing and setting "values" associated with "entities". Nonetheless, it is an important concept to introduce at this stage, since it deeply affects what you can and can't do with the data structures you've just learned, as well as with the data types you learned in SLU 02.

📝 **Note:** If you got scared after clicking the previous _paradigm_ link 😱, run back! We just like to leave things contextualized so you can return to them when you wish to expand your programming concepts and, of course, when you're feeling particularly courageous 💪.

**Mutability** is a key concept in Python that defines **whether an entity's references (or "values") can (or cannot) be changed**.  

> 📝 **Note:** If you look for information about this subject you may find references to the **_ability to change the internal state of an entity_**. This is the formal concept, but we like to keep things simple here 😇 (At least for the time being...😈) and will just call it references or "values".

This characteristic applies to the fundamental data types (some of which you studied in SLU 02) and to data structures (some of which are addressed in the current SLU).  
The first thing to remember is that the mutable or immutable character of an entity **depends exclusively on its `type`**.

### 5.2. A word of warning about variables

The mutability concept **doesn't apply to variables!** If you ever hear anyone saying something like "this variable is mutable" (or "immutable"), just kill him on the spot! We're joking of course 😂... Since obviously, no one would ever say something like that. 😲

If you remember SLU 02, variables are just _names_ that hold references (i.e. "point") to memory addresses where actual entities exist. This is very easy to demonstrate:

In [1]:
favorite_fruit = 'cherry'

print(id('cherry'))
print(id(favorite_fruit))

139964347386416
139964347386416


As you can see, the `id` of the variable `favorite_fruit` is the `id` of the entity it represents (i.e. points to), in this case the string `'cherry'`. You may think that you're mutating the variable when you assign something else to it, but that is completely wrong: there is no such thing, since the **mutability concept doesn't apply to variables** (it can't be said often enough)!

In [2]:
favorite_fruit = 'abacaxi'

print(id('abacaxi'))
print(id(favorite_fruit))

139964347425584
139964347425584


What Python did in the previous cell was simply drop the reference that `favorite_fruit` held to the string `'cherry'` and make the same name hold a reference to a different string, `'abacaxi'`. The string `'cherry'` itself was never modified. There is no relation whatsoever between the "first" `favorite_fruit` and the "second" besides the name, which is why it may seem that you've mutated something; the truth is that, after the second assignment, the name simply no longer points to `'cherry'`.

![image.png](data/highlander.png)
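A small sketch, using a hypothetical second name `backup_name` that is not part of the notebook, suggesting that assigning something else to `favorite_fruit` never touches the string `'cherry'` itself:

```python
favorite_fruit = 'cherry'
backup_name = favorite_fruit      # a second name pointing to the same string

favorite_fruit = 'abacaxi'        # the name now points to another string

print(backup_name)                     # cherry
print(favorite_fruit)                  # abacaxi
print(backup_name is favorite_fruit)   # False
```

If the assignment had mutated the string, `backup_name` would show `'abacaxi'` too. It doesn't, because only the name was re-pointed.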

### 5.3. Mutable and immutable types

#### 5.3.1. Fundamental datatypes

**All fundamental datatypes** (integers, floats, strings, booleans, `None`, ...) **are immutable.** This is extremely easy to understand just by realizing that:
- 0 cannot be 1, 
- 0.0 cannot be 0.1, 
- 'dog' cannot be 'cat', 
- `True` cannot be `False`, 
- `None` cannot be "something", 
- ...
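As a minimal sketch of what this means for a string (the variable names are illustrative only): trying to change a character in place fails, while methods that seem to "change" the value actually return a new string.

```python
word = 'dog'

# Strings are immutable: item assignment is rejected with a TypeError.
try:
    word[0] = 'c'
    in_place_change_worked = True
except TypeError:
    in_place_change_worked = False

# Methods such as replace() return a brand-new string instead.
new_word = word.replace('d', 'c')

print(in_place_change_worked)  # False
print(word)                    # dog
print(new_word)                # cog
```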

#### 5.3.2. Data structures

Unlike fundamental datatypes, **data structures have both mutable and immutable types**.

The data structures addressed in this SLU are divided as follows:

**Immutable types:**

- Tuples

**Mutable types:**

- Lists
- Dictionaries
- Sets
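One way to see the difference, sketched here with made-up variable names: a list can grow in place and stays the same entity, whereas "adding" to a tuple builds a new tuple and leaves the old one untouched.

```python
ingredients_list = ['bacon']
same_list = ingredients_list           # second name for the same list
ingredients_list.append('shrimp')      # in-place change

ingredients_tuple = ('bacon',)
old_tuple = ingredients_tuple          # second name for the same tuple
ingredients_tuple += ('shrimp',)       # builds a new tuple and re-points the name

print(same_list is ingredients_list)   # True
print(same_list)                       # ['bacon', 'shrimp']
print(old_tuple is ingredients_tuple)  # False
print(old_tuple)                       # ('bacon',)
```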

### 5.4. Practical example

You already knew this from Learning notebook 1. But you probably didn't think about the implications of each data structure's mutability character in more complex situations.

Let's get our pizza back! Suppose you want to order a pizza. Since you're taking this prep course, you leverage the situation to practice your data structures skills, and you write your pizza as follows:

In [3]:
base_ingredients = {'base': 'bread', 
                    'base_ingredient_1': 'cheese', 
                    'base_ingredient_2': 'tomato'}  # dictionary
extra_ingredient_1 = ['bacon']                      # list
extra_ingredient_2 = {'shrimp'}                     # set

> 📝 **Note:** This is not a convenient way to organize this particular information, but you just want to practice, so you make a "soup" with all the data structures you've learned.

Since you're the controlling type, you also "store" each ingredient (or group of ingredients) `id` in a variable.

In [4]:
your_base_ingredients_id = id(base_ingredients)
your_extra_ingredient_1_id = id(extra_ingredient_1)
your_extra_ingredient_2_id = id(extra_ingredient_2)

Naturally, you choose a tuple to hold your ingredients "list": it is immutable, and you want to make sure your order doesn't change.

In [5]:
pizza = base_ingredients, extra_ingredient_1, extra_ingredient_2
your_pizza_id = id(pizza)
pizza

({'base': 'bread',
  'base_ingredient_1': 'cheese',
  'base_ingredient_2': 'tomato'},
 ['bacon'],
 {'shrimp'})

In the restaurant, your order is taken by a "Python curious" who gave up on last year's prep course during the set-up but likes to google a lot... So, he does the following to your code:  
(Don't focus on the details of the following code. Just understand that the Python curious played around with some random methods and assignments on the ingredients' data structures.)

In [6]:
base_ingredients.pop('base_ingredient_1')
base_ingredients['base_ingredient_1'] = \
    base_ingredients.pop('base_ingredient_2')
extra_ingredient_1[0] = None
extra_ingredient_2.remove('shrimp')
extra_ingredient_2.add('tomato')

Just in case, he also takes note of the `id`s to send them back with the order for you to check.

In [7]:
rest_pizza_id = id(pizza)
rest_base_ingredients_id = id(base_ingredients)
rest_extra_ingredient_1_id = id(extra_ingredient_1)
rest_extra_ingredient_2_id = id(extra_ingredient_2)

You get your delivery, sit down on your couch, put on your favorite series, and enjoy every bit of your delicious... hot extra tomato sandwich with None?!?! 😮🤬

In [8]:
pizza

({'base': 'bread', 'base_ingredient_1': 'tomato'}, [None], {'tomato'})

Well, it's not necessarily the end of the world, especially if you like tomato 😂, (and of course, None 🤔), but let's get to the point.

Did someone swap your "pizza"?

In [9]:
your_pizza_id == rest_pizza_id

True

The variable `your_pizza_id` is equal to the variable `rest_pizza_id`, which means your pizza wasn't swapped. So, were the data structures that hold the ingredients switched?

In [10]:
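# Print each comparison; a notebook cell only displays its last expression
print(your_base_ingredients_id == rest_base_ingredients_id)
print(your_extra_ingredient_1_id == rest_extra_ingredient_1_id)
print(your_extra_ingredient_2_id == rest_extra_ingredient_2_id)

True
True
True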

The variables that hold your ingredients' `id`s are equal to the variables that hold the restaurant's ingredients' `id`s, so the data structures that hold the ingredients weren't switched. (Phew, tuples are still immutable and the Earth is still "round" 🎉).

So, what's the lesson to be learned here?

Similarly to variables, data structures also hold references to the entities they point to. This means that the "pizza" tuple holds references to the three data structures, and each of those data structures holds references to the ingredients it represents. Since they are mutable, the references they hold were changed in the restaurant, meaning the ingredients inside them were changed, but without changing either their own references (i.e. `id`s) or the tuple's reference (`id`) itself.

In other words, despite these changes, when you check the `id`s of the modified structures (the dictionary, list, and set), you find that they match the `id`s of your original order. The tuple `pizza` is immutable, so it still holds references to the very same data structures. However, the contents of those mutable data structures have been altered.

The bottom line is that **immutability prevents the references in an entity from being changed**, which doesn't mean that references inside those references (and so on...) can't be changed **if they're held by mutable data structures**. 🤓
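The same idea in a smaller sketch, with a hypothetical `order` tuple that holds one string and one list: the tuple refuses to swap the references it holds, but the list it points to can still be changed.

```python
order = ('bread', ['bacon'])

# Replacing a reference held by the tuple is not allowed.
try:
    order[1] = ['olives']
    tuple_changed = True
except TypeError:
    tuple_changed = False

# Mutating the list that the tuple points to is allowed.
order[1].append('olives')

print(tuple_changed)  # False
print(order)          # ('bread', ['bacon', 'olives'])
```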

The next day, you've already read the previous lines and got smarter, so you made your order as follows.  
(Pizza every day?? And I thought I was the "pizza monster"!)

<img src="data/pizza_cookie_monster.jpg" width="500">

In [11]:
ingredients = ('bread', 'cheese', 'tomato', 'bacon', 'shrimp')
pizza = {'ingredients': ingredients}

At the pizza store, the Python curious worked his magic:  
(Don't worry about the `try`-`except` statements... He wrote them, but he doesn't know what they mean either... Just understand that the code tries to change the references in your `ingredients` tuple.)

In [12]:
try:
    ingredients[:] = ('bread', 'olives', 'chicken', 'salt', 'fish')
except TypeError:  # tuples don't support item assignment
    print('😮')

😮


Pizza arrives and:

In [13]:
print(pizza, '✨')

{'ingredients': ('bread', 'cheese', 'tomato', 'bacon', 'shrimp')} ✨


Although your pizza was ordered in a mutable data structure (a dictionary), the ingredients "list" is a tuple, which means its contents can't be changed in place.

Unfortunately, we won't be able to answer your final and most difficult question which probably is: "Who in the world eats bacon and shrimp pizza???" Perhaps AI can someday solve that one for us 😉

### 5.5. Specific methods of mutable data structures

Learning notebook 1 and the Extra notebook present some methods that are specific to mutable data structures. Some examples are:

- Lists:
  - `append`
  - `extend`
  - `insert`
  - `remove`
  - `pop`
  - `clear`
  - `reverse`
  - `sort`

- Dictionaries:
  - `clear`
  - `update`
  - `pop`
  - `popitem`
  - `setdefault`
  - `del <dictionary>['key']` (it's not an actual method (it's a statement) but it invokes a method that performs the operation)
  - `<dictionary>['key'] = value` (it's not an actual method (it's assignment by key syntax) but it invokes a method that performs the operation)

- Sets:
  - `add`
  - `update`
  - `remove`
  - `discard`
  - `pop`
  - `clear`

After our pizza journey, you now understand the big difference between these methods and other data structures' methods: **These methods actually change the data structures leveraging their mutable essence**.
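To make the contrast concrete, here is a short sketch comparing the list method `sort`, which changes the list in place, with the built-in function `sorted`, which leaves the list alone and returns a new one (the variable names are illustrative):

```python
numbers = [3, 1, 2]

new_list = sorted(numbers)     # built-in function: returns a new list
print(numbers)                 # [3, 1, 2]
print(new_list)                # [1, 2, 3]

result = numbers.sort()        # list method: sorts in place and returns None
print(numbers)                 # [1, 2, 3]
print(result)                  # None
```

A common slip is writing `numbers = numbers.sort()`, which leaves `numbers` pointing to `None`.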

#### 5.5.1 Copying mutable data structures

If you need to apply one of the previous methods but also need to keep your original data structure, you might think that assigning it to a new variable and performing the necessary operations on that variable solves your problem. But... it doesn't.

In [1]:
my_list = ['This is', 'my list!']
your_list = my_list  # Not a copy: just a second name for the same list
your_list[1] = 'your list!'

print(my_list)
print(your_list)

['This is', 'your list!']
['This is', 'your list!']


The intended operation was performed, but `my_list` was also changed! 🤦‍♂️ This happens because the assignment didn't copy anything: it only made `your_list` point to the same entity as `my_list`, so `your_list` "acquires" `my_list`'s `id`. When we change `your_list`, we are changing the entity with that `id`, which is also the entity `my_list` points to. The final result is that we are changing the same thing.

In [2]:
id(my_list) == id(your_list)

True

So, back to our problem: if you need to apply one of the previous methods but also need to keep your original data structure, **you need to create a shallow copy of the original data structure first!**

In [3]:
my_list = ['This is', 'my list!']
your_list = my_list.copy()  # Shallow copy
your_list[1] = 'your list!'

print(my_list)
print(your_list)

['This is', 'my list!']
['This is', 'your list!']


A shallow copy creates a new data structure (i.e. an entity holding the same references as the original data structure but with a different `id`), which is an independent entity that can be manipulated without changing the original data structure.

In [4]:
id(my_list) == id(your_list)

False
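Because a shallow copy holds the same references as the original, mutable items *inside* it are still shared. The sketch below assumes a hypothetical nested list and uses the standard library's `copy.deepcopy` for comparison; it is an illustration, not something the exercises require.

```python
import copy

my_lists = [['This is'], ['my list!']]

shallow = my_lists.copy()          # new outer list, same inner lists
deep = copy.deepcopy(my_lists)     # new outer list and new inner lists

my_lists[1][0] = 'changed!'        # mutate one of the inner lists

print(shallow)  # [['This is'], ['changed!']]
print(deep)     # [['This is'], ['my list!']]
```

For flat lists of immutable items, like the strings in the example above, the shallow copy is all you need.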

---

## 6. The `zip` function and the 'splat' `*` operator

### 6.1. The _zip_ concept

The `zip` built-in function iterates over several iterables in parallel.
`zip` itself doesn't return anything directly "printable" (it actually returns something called an 'iterator', but don't worry about that for the time being...) because its job is to iterate through its arguments, not to create or change other entities. It can, however, easily be combined with collections to produce a readable output.

Its practical result is getting the items at corresponding positions from all the collections simultaneously, similarly (but not quite) to a zipper. This means gathering the first element of each iterable in a tuple, then the second elements in another tuple, then the third ones...

<img src="data/zipper.png" width="350">


If you consider two tuples (they could be lists, dictionaries, or any other iterables...) representing the _left_ and _right teeth collections_, you can zip them, similar to an actual zipper, where the _slider_ is the `zip` function and the _pull tab_ is, well... the Python interpreter! You give the instruction and it runs it! 

In [14]:
left_teeth_collection  = (1, 3, 5, 7, 9)
right_teeth_collection = (2, 4, 6, 8, 10)

chain = zip(left_teeth_collection, right_teeth_collection)
chain_list = list(chain)

print(chain_list)

[(1, 2), (3, 4), (5, 6), (7, 8), (9, 10)]


Also similar to an actual zipper, which needs the tape to give it a context, note that we used the `list` function to get the zipped structure. If you use `zip` by itself, you'll just get that (previously mentioned) 'iterator' entity, kind of like a zipper floating around in the vastness of space...

In [15]:
print(chain, "|", type(chain))

<zip object at 0x7f4bfc196000> | <class 'zip'>


One interesting feature is that `zip` is 'lazy'. No, you don't need to scream at it to get it to work; it just means that it only processes the elements when the result is actually iterated over. Not all iteration-related features in Python do this... Lazy features are memory efficient since they only take up as much memory as they need at each instant.
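One practical consequence of this laziness, sketched with illustrative names: a `zip` result can only be consumed once. After `list` has pulled every tuple out of it, a second pass finds nothing left.

```python
letters = ('a', 'b', 'c')
numbers = (1, 2, 3)

pairs = zip(letters, numbers)   # nothing has been paired yet

first_pass = list(pairs)        # consuming the iterator builds the tuples
second_pass = list(pairs)       # the iterator is now exhausted

print(first_pass)   # [('a', 1), ('b', 2), ('c', 3)]
print(second_pass)  # []
```

If you need the pairs more than once, keep the list rather than the `zip` object.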

Another thing you crazy algebra lovers might have noticed in the first example is that `zip` performs something similar to transposing matrices, meaning it turns "rows" into "columns" and "columns" into "rows". It's way too soon to talk about this subject (don't worry, you'll get enough 🤓), but it can't hurt to get a picture of it. If you look carefully at the first example, you started with:

```python
(1, 3, 5, 7, 9)
(2, 4, 6, 8, 10)
```

And you ended up with:

```python
[(1, 2),
(3, 4),
(5, 6),
(7, 8),
(9, 10)]
```

Meaning `zip` turned the 2 original tuples (2 "rows") with 5 elements each (5 "columns") into a list with 5 tuples (5 "rows") with 2 elements each (2 "columns").

Now, can you guess what happens if we zip these 5 tuples? Abstract yourself from the number of teeth collections (there are 5 now) and apply the same concept. The items in each tuple's first position are zipped, then the items in each tuple's second position are zipped, and we're done, since each tuple now has only 2 items.

### 6.2. The '_splat_' `*` operator

Note that the 2 original tuples were independent, while the 5 tuples you got are inside a list (you already know why). This means that, in order to zip them, we need to **unpack the list items first**, so each tuple is passed on its own. We can do this with the **splat `*` (asterisk) operator**.

In [16]:
print(*chain_list)

(1, 2) (3, 4) (5, 6) (7, 8) (9, 10)


Note that the splat `*` operator performs something closer to an action, rather than returning an entity itself, so the unpacked items need something to "grab on to" for it to work. In the previous cell we used the `print` function, but you can also use "anything else" that accepts several arguments, like... the `zip` function! 
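As a sketch of that "anything else", the hypothetical function `describe_tooth_pair` below (not part of the notebook) expects two separate arguments. The `*` operator lets you hand it a tuple and have the items spread across those arguments.

```python
def describe_tooth_pair(left, right):
    """Hypothetical helper: expects exactly two positional arguments."""
    return f'left={left}, right={right}'

pair = (1, 2)

# The two calls below are equivalent: * unpacks the tuple into arguments.
print(describe_tooth_pair(pair[0], pair[1]))  # left=1, right=2
print(describe_tooth_pair(*pair))             # left=1, right=2
```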

Now that we know how to unpack collection items, we can apply the `zip` function just the same, by **combining the `zip` function with the 'splat' `*` operator**.

In [17]:
unzipped_chain = zip(*chain_list)
unzipped_chain

<zip at 0x7f4bfc197dc0>

Same as before, we need something (like a `list`) to get a readable output.

In [18]:
unzipped_chain_list = list(unzipped_chain)
unzipped_chain_list

[(1, 3, 5, 7, 9), (2, 4, 6, 8, 10)]

And _voilà_! Our left and right teeth collections are back! 

In [19]:
print(*unzipped_chain_list)

(1, 3, 5, 7, 9) (2, 4, 6, 8, 10)


### 6.3. `zip` practical applications

One useful thing you can do with the `zip` function is to create a dictionary by zipping two lists (or tuples):

In [None]:
keys = ['key_1', 'key_2', 'key_3']
values = ['value_1', 'value_2', 'value_3']

my_dict = dict(zip(keys, values))
print(my_dict)

{'key_1': 'value_1', 'key_2': 'value_2', 'key_3': 'value_3'}
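The trip also works in the other direction. As a small sketch (the names `keys_back` and `values_back` are made up for illustration), combining `zip` with the `*` operator on the dictionary's items splits it back into its keys and its values:

```python
keys = ['key_1', 'key_2', 'key_3']
values = ['value_1', 'value_2', 'value_3']
my_dict = dict(zip(keys, values))

# items() yields (key, value) pairs; zip(*...) regroups them by position.
keys_back, values_back = zip(*my_dict.items())

print(keys_back)    # ('key_1', 'key_2', 'key_3')
print(values_back)  # ('value_1', 'value_2', 'value_3')
```

Note that the results come back as tuples, not lists.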


### 6.4. Iterables' length issues

Iterables passed to `zip` can have different lengths. This can be on purpose, or a bug in the code. In this situation `zip` will just silently stop the iteration when the shortest iterable is exhausted.

In [20]:
broken_left_teeth_collection  = (1, 3, 5)
right_teeth_collection = (2, 4, 6, 8, 10)

chain = zip(broken_left_teeth_collection, right_teeth_collection)
chain_list = list(chain)

print(chain_list)

[(1, 2), (3, 4), (5, 6)]


So, if your iterables should have the same length, **always pass the `strict=True` keyword argument to the `zip` function** (available since Python 3.10) to avoid undetected data corruption.  
The following example detects when the collections are of unequal size and raises the associated `ValueError`.

In [21]:
chain = zip(broken_left_teeth_collection, right_teeth_collection, strict=True)
chain_list = list(chain)

ValueError: zip() argument 2 is longer than argument 1

There are other ways the `zip` function can handle iterables of unequal size, but as sensible and merciful human beings, we think you deserve a peaceful rest before you shine in the exercise notebook. 😉

---

Ready for the exercise notebook? Fire it up! 🚀