# SLU04 - Data Structures

In this notebook we will be covering the following:   
 
- Data Structures - Tuples, Lists, Dictionaries and Sets
- Creation of a Data Structure
- Length of a Data Structure
- Immutability
- Checking if elements belong to a Data Structure
- Replace, Append and Delete Operations in Data Structures
- Conversion between Data Structures

A [Data Structure](https://en.wikipedia.org/wiki/Data_structure) is a collection of data that enables data organization, management, and storage. In this notebook, we will learn four different types of data structures in Python: tuples, lists, dictionaries and sets.

## 1 Data Structure - Tuple <a name="1"></a>

# <center> (🐙, 🐷, 🐬, 🐞, 🐈, 🙉, 🐸, 🐓) </center>

### 1.1 Definition and Tuple Creation <a name="1.1"></a>

#### 1.1.1 Definition <a name="1.1.1"></a>

A tuple is a Python data structure that holds a collection of data and cannot be changed once created. It is ordered and accepts duplicated elements. The elements of a tuple are written between brackets `()` and separated by commas `,`. A tuple can have as many elements as we want, and the elements can be of different types.

Let's create our first tuple with 5 elements.

In [1]:
this_tuple = (0, 1, 2, 3, 4)
this_tuple

(0, 1, 2, 3, 4)

---

#### 1.1.2 Function `type()` and `isinstance()` <a name="1.1.2"></a>

We have two ways of confirming that a variable is of a certain type: the __`type()`__ function and the __`isinstance()`__ function.   
- The __`type()`__ function receives a variable as input and returns its type.
- The __`isinstance()`__ function receives a variable and a type as inputs. It returns `True` if the variable is of that type and `False` otherwise.

Let's start by checking whether the variable `this_tuple` is a tuple, using the function __`type()`__.

In [2]:
this_tuple

(0, 1, 2, 3, 4)

In [3]:
type(this_tuple) #This function returns the variable type of this_tuple

tuple

The output of the cell above is `tuple`, which confirms that our variable `this_tuple` is a tuple.

Let's check again if the variable type is a tuple using __`isinstance()`__ function. If the output is `True` that means that our input variable matches the input type.

In [4]:
isinstance(this_tuple, tuple) 

True

The output of the cell above is `True`. Because the input type was a tuple, we have our confirmation that the input variable matches the input type.

__Note__: We will use these two functions throughout the notebook, also for lists and dicts, not just for tuples.
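
As a small complement, `isinstance()` also accepts a tuple of types as its second argument and returns `True` if the variable matches any of them. The sketch below uses hypothetical variables (`some_tuple`, `some_text`) that are not part of this notebook.

```python
# Hypothetical variables, used only for illustration
some_tuple = (0, 1, 2)
some_text = "book"

# The second argument can be a tuple of types: the check passes if any type matches
tuple_is_tuple_or_list = isinstance(some_tuple, (tuple, list))
text_is_tuple_or_list = isinstance(some_text, (tuple, list))

# type() returns the exact type, which can be compared with `is`
tuple_has_exact_type = type(some_tuple) is tuple

print(tuple_is_tuple_or_list, text_is_tuple_or_list, tuple_has_exact_type)
```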

---

#### 1.1.3 Tuples creation <a name="1.1.3"></a>

It is possible to create a tuple with elements of multiple types.

In [5]:
#types (float, str and boolean)
this_tuple = (1.5, "hello", True, "book")
this_tuple

(1.5, 'hello', True, 'book')

---

Tuples can have duplicated elements. Note that elements do not get reordered.

In [6]:
#a tuple with duplicated values
this_tuple = (1, 1, 3, 7, 7, 7, 5)
this_tuple

(1, 1, 3, 7, 7, 7, 5)

---

Is it possible to have an empty tuple?

In [7]:
this_tuple = ()
type(this_tuple)

tuple

The answer is yes.

We can also create an empty tuple using the function __`tuple()`__.

In [8]:
this_tuple = tuple()
type(this_tuple)

tuple

---

It is possible to create a tuple with more than one element without writing brackets.

In [9]:
this_tuple = 1.5, "hello", True, "book"
isinstance(this_tuple, tuple)

True

Let's see what happens if we try to create one tuple with one element without brackets.

In [10]:
this_tuple = "book"
isinstance(this_tuple, tuple)

False

So, what is the type of the variable `this_tuple` in this case?

In [11]:
type(this_tuple)

str

As you might have guessed, the variable above is a string and not a tuple. Let's now check if it works with brackets. 

In [12]:
this_tuple = ("book")
this_tuple, type(this_tuple)

('book', str)

We can see that brackets alone do not create a tuple when there is just one element inside them: the result is still a string.

But if we add a comma after the value, it is assumed as a tuple, even if we don't have brackets.

In [13]:
this_tuple = ("book",)
isinstance(this_tuple, tuple)

True

In [14]:
this_tuple = "book", 
isinstance(this_tuple, tuple)

True

---

As a last example, here is a tuple of tuples.

In [15]:
this_tuple = ((1,2), (3,4), (4,5))
isinstance(this_tuple, tuple)

True

---

### 1.2 Tuple length and accessing specific elements <a name="1.2"></a>

Now, let's see how we can know the length of a tuple and access certain elements of it.

#### 1.2.1 Length <a name="1.2.1"></a>

We can use the function __`len()`__ in order to check how many elements a tuple has.

In [16]:
this_tuple = (0, 1, 2, 3, 4)
len(this_tuple) 

5

On the output of the cell above, it is possible to confirm with function __`len()`__ that `this_tuple` has 5 elements inside.

In [17]:
this_tuple = 1,
type(this_tuple), len(this_tuple)

(tuple, 1)

And one element on the tuple above.

In [18]:
this_tuple = ((1,2), (3,4), (4,5))
len(this_tuple)

3

Again, with function __`len()`__, we can confirm that the variable `this_tuple` on the cell above has 3 elements inside (each element is a tuple like we saw on <a href="#1.1.3">section 1.1.3</a>). 

---

#### 1.2.2 Indexing <a name="1.2.2"></a>

What can we do if we want to get a specific element of a tuple? 

### <center>fruit = ("banana", "apple", "tomato", "watermelon", "pineapple")</center>   
# <center> &downarrow; </center>  
### <center>"tomato"</center>  

Given the tuple above, what can we do if we want to extract just the `"tomato"` element?   
We must make use of __indexing__.

__Positive Indexing__ is done from left to right, starts with zero for the first position of the tuple and goes until the last position, which is indexed with the value (size of the tuple - 1).

__Negative Indexing__ is done from right to left, starts with -1 for the last element of the tuple and goes until -(size of the tuple) for the first element. 

<img src="data/tuples-in-python-with-examples.png" width="600" height="150" >

This image and additional materials can be found [here](https://www.faceprep.in/python/tuples-in-python/).

---

Extracting `"tomato"` from the tuple `fruit` using __positive indexing__:

In [19]:
fruit = ("banana", "apple", "tomato", "watermelon", "pineapple")

In [20]:
fruit[2]

'tomato'

---

Extracting `"tomato"` from the tuple `fruit` using __negative indexing__:

In [21]:
fruit[-3]

'tomato'

---

Now we want to extract the last element from tuple `fruit`. 

Using positive indexing:

In [22]:
fruit[4]

'pineapple'

Using positive indexing but making use of the function __`len()`__:   

__Note__: This might be useful when we don't know the size of the tuple.

In [23]:
fruit[len(fruit)-1] 

'pineapple'

Using negative indexing:

In [24]:
fruit[-1] 

'pineapple'
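
Indexes outside the valid range raise an `IndexError`. The sketch below reuses the `fruit` tuple and catches the error so that the cell keeps running. `try`/`except` is not covered in this notebook and `error_message` is a hypothetical name, both used here only for illustration.

```python
fruit = ("banana", "apple", "tomato", "watermelon", "pineapple")

# Valid positive indexes go from 0 to len(fruit) - 1,
# valid negative indexes go from -1 to -len(fruit)
try:
    fruit[5]
    error_message = None
except IndexError as error:
    error_message = str(error)

print(error_message)
```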

---

What if we want an element of a tuple that is inside another tuple?

In [25]:
fruit_by_color = (("strawberry", "cherry"), ("green apple", "kiwi", "pear"), ("mango", "papaya"))

We want to extract just the element `"pear"` from the tuple `fruit_by_color`. What can we do?   
We first need to know what is the position of the tuple where `"pear"` is. It is in position 1 (positive indexing).    
Let's extract position 1 to see if it matches the tuple that we want.

In [26]:
green_fruit = fruit_by_color[1]
green_fruit

('green apple', 'kiwi', 'pear')

From the cell above we can see that position 1 corresponds to the green fruits in `fruit_by_color` tuple, the tuple that has element `"pear"` inside.   
In this extracted tuple `green_fruit`, we can see that `"pear"` is in position 2.

In [27]:
green_fruit[2]

'pear'

Now let's go to our main tuple, `fruit_by_color`, and extract `"pear"` at once by indexing it twice. First time to extract `green_fruit` and second time to extract `"pear"`.    

Successive indexing should be written from left to right.

In [28]:
fruit_by_color[1][2]

'pear'

#### 1.2.3 Slicing <a name="1.2.3"></a>

If we want to extract more than one element, what can we do?   

### <center>fruit = ("banana", "apple", "tomato", "watermelon", "pineapple")</center>   
# <center> &downarrow; </center>  
### <center>("tomato", "watermelon", "pineapple")</center> 

The answer is __slicing__.

In [29]:
fruit = ("banana", "apple", "tomato", "watermelon", "pineapple")

In order to slice a tuple, we need to know the starting position of the slice that we want and the stop position.   

### <center>Tuple[start:stop:step]</center>

To slice a tuple, start, stop and step positions should be between square brackets `[]` and separated by colon `:`.

- __start=n__: the beginning index of the slice; it includes the element at this index unless it is the same as stop; its default value is 0, i.e. the first index. If it's negative, it means to start n items from the end.

- __stop=n__: the ending index of the slice; it __excludes__ the element at this index; it defaults to the length of the sequence being sliced, that is, up to and including the last element. If it's negative, it is counted from the end, so (with a positive step) `stop=-n` leaves out the last n items.

- __step=n__: the amount by which the index increases; its default value is 1. If it's negative, you're slicing over the iterable in reverse.
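
A minimal sketch of the three parameters, using a hypothetical tuple `digits` that is not part of this notebook:

```python
digits = (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)

first_three = digits[0:3]       # start is included, stop is excluded
from_index_seven = digits[7:]   # omitted stop: slice until the end
every_third = digits[::3]       # omitted start and stop, step of 3
last_two = digits[-2:]          # negative start: count from the end
reversed_digits = digits[::-1]  # negative step: walk from right to left

print(first_three, from_index_seven, every_third, last_two, reversed_digits)
```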



Let's see the following example:

<img src="data/slicing.jpeg" width="600" height="150" >

In [30]:
monty_python = ("M", "o", "n", "t", "y"," ", "P", "y", "t", "h", "o", "n")
monty_python

('M', 'o', 'n', 't', 'y', ' ', 'P', 'y', 't', 'h', 'o', 'n')

Let's extract `Monty` and `Pyth` from the tuple above.

In [31]:
#start=-12
#stop=-7
#step=1
monty_python[-12:-7]

('M', 'o', 'n', 't', 'y')

In [32]:
#start=6
#stop=10
#step=1
monty_python[6:10]

('P', 'y', 't', 'h')

---

For the tuple `fruit`, in order to slice it and extract `tomato`, `watermelon` and `pineapple`, we know that the first position is 2 and the last is 4. Because the stop index is excluded, we need to add 1 to the last position, so `start=2` and `stop=5`.

In [33]:
fruit

('banana', 'apple', 'tomato', 'watermelon', 'pineapple')

In [34]:
fruit[2:5]

('tomato', 'watermelon', 'pineapple')

Because `pineapple` is the last element of our tuple, we can slice it from index 2 until the end. This can be done by omitting the stop index.

In [35]:
fruit[2:]

('tomato', 'watermelon', 'pineapple')

---

We can also add a __step__ to our slicing. 

If the step is 2, this means that we will start slicing on the  starting index and each next element will be on the position of the previous index + 2 until we reach the stop index. When we don't explicitly write the step, Python assumes the default value of 1.

With `step=1`:

In [36]:
fruit[2:5]

('tomato', 'watermelon', 'pineapple')

In [37]:
fruit[2:5:1]

('tomato', 'watermelon', 'pineapple')

From the two cells above, we can confirm that, when we don't write the step, it is assumed as 1.

With `step=2`:

In [38]:
fruit[0:5:2]

('banana', 'tomato', 'pineapple')

In the cell above we have every other element from the tuple `fruit`, starting on the first position and until the last.

Because we are starting on the first position and finishing in the last one, we can also write in the following way:

In [39]:
fruit[::2]

('banana', 'tomato', 'pineapple')

---

Let's now try slicing with __negative indexing__.

In [40]:
fruit

('banana', 'apple', 'tomato', 'watermelon', 'pineapple')

Let's slice the tuple from right to left, starting on negative index -2 (`"watermelon"`) and finishing on negative index -4 (`"apple"`). Because the stop index is excluded, the stop must be -5.

If we want to slice from right to left, we need to write our step as -1.

In [41]:
fruit[-2:-5:-1]

('watermelon', 'tomato', 'apple')

---

We can also mix positive indexes with negative steps and vice-versa.

In [42]:
fruit[5:2:-1]

('pineapple', 'watermelon')

In [43]:
fruit[-5:-2:1]

('banana', 'apple', 'tomato')
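
Note that index 5 does not exist in `fruit`, yet `fruit[5:2:-1]` does not fail. Unlike indexing, slicing tolerates out-of-range bounds: they are clipped to the ends of the sequence. A short sketch with the same tuple (the variable names are only illustrative):

```python
fruit = ("banana", "apple", "tomato", "watermelon", "pineapple")

clipped_stop = fruit[2:100]    # a stop beyond the end is treated as the end
clipped_start = fruit[-100:2]  # a start before the beginning is treated as the beginning
empty_slice = fruit[10:20]     # a slice entirely out of range is empty, not an error

print(clipped_stop, clipped_start, empty_slice)
```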

---

### 1.3 Membership test<a name="1.3"></a>

Sometimes, you may want to check if a certain element belongs to a tuple or if a specific element matches a value.  
In order to do that, we can use the keywords __`in`__ or __`not in`__. The returned value is `True` if the condition is verified and `False` otherwise.


In [44]:
fruit

('banana', 'apple', 'tomato', 'watermelon', 'pineapple')

In [45]:
"banana" in fruit

True

In [46]:
"banana" not in fruit

False

In [47]:
"onion" not in fruit

True
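
When a tuple contains other tuples, `in` only compares against the top-level elements. A sketch using the `fruit_by_color` tuple defined earlier (the result names are illustrative):

```python
fruit_by_color = (("strawberry", "cherry"), ("green apple", "kiwi", "pear"), ("mango", "papaya"))

# The top-level elements are tuples, so a string is not found at this level
pear_at_top_level = "pear" in fruit_by_color

# Testing the inner tuple directly finds the string
pear_in_green = "pear" in fruit_by_color[1]

# A whole inner tuple can be matched at the top level
pair_at_top_level = ("mango", "papaya") in fruit_by_color

print(pear_at_top_level, pear_in_green, pair_at_top_level)
```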

---

We may also want to verify if a specific element inside a tuple matches a certain value. We can make use of __`==`__ and __`!=`__ operators. We'll see these operators extensively on SLU06.

In [48]:
fruit[0] == "banana"

True

In [49]:
fruit[0] != "tomato"

True

### 1.4 Immutability <a name="1.4"></a>

In [50]:
fruit

('banana', 'apple', 'tomato', 'watermelon', 'pineapple')

Tuples are __immutable__, which means that they cannot be changed after creation. If we try to assign a new value to an element of an existing tuple, Python raises an error.

In [51]:
fruit[0]

'banana'

Let's try to replace `"banana"` with `"lemon"` and see what happens.

In [52]:
fruit[0] = "lemon"

TypeError: 'tuple' object does not support item assignment

As expected, the output of the last cell was an error. We were not able to replace bananas with lemons on our tuple. Tuples are like monkeys, don't try to steal their bananas. 

<img src="data/monkey.gif" width="600" height="150" >
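
To confirm that a failed assignment leaves the tuple untouched, the sketch below catches the `TypeError`. `try`/`except` is outside the scope of this notebook and `assignment_failed` is a hypothetical name.

```python
fruit = ("banana", "apple", "tomato", "watermelon", "pineapple")

try:
    fruit[0] = "lemon"  # item assignment is not supported by tuples
    assignment_failed = False
except TypeError:
    assignment_failed = True

# The first element is still "banana"
print(assignment_failed, fruit[0])
```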

### 1.5 Adding two tuples <a name="1.5"></a>

We can create a third tuple by adding two distinct tuples.

Instead of trying to replace values, we can create a new tuple by adding a tuple of one element, `("lemons",)`, to the tuple `fruit` without its first element.

In [53]:
fruit = ("banana", "apple", "tomato", "watermelon", "pineapple")
fruit

('banana', 'apple', 'tomato', 'watermelon', 'pineapple')

In [54]:
fruit_without_bananas = ("lemons",) + fruit[1:]
fruit_without_bananas

('lemons', 'apple', 'tomato', 'watermelon', 'pineapple')

---

### 1.6 Further Reading <a name="1.6"></a>

[Programiz: Python Tuple](https://www.programiz.com/python-programming/tuple)   
[GeeksForGeeks: Tuples in Python](https://www.geeksforgeeks.org/tuples-in-python/)   
[W3schools: Python Tuples](https://www.w3schools.com/python/python_tuples.asp)

---

## 2 Data Structure - List <a name="2"></a>

# <center> [🐙, 🐷, 🐬, 🐞, 🐈, 🙉, 🐸, 🐓] </center>

### 2.1 Definition and List Creation <a name="2.1"></a>

#### 2.1.1 Definition <a name="2.1.1"></a>

A list is also a collection of data. Lists can be changed after being created. Lists also accept duplicated values. After creation, the elements are kept in the same position until explicitly changed. The elements on a list are denoted using square brackets `[]` and are separated by commas `,`. Lists can have multiple types of data inside them. 

Let's create our first list with 5 elements.

In [55]:
this_list = [0, 1, 2, 3, 4]
this_list

[0, 1, 2, 3, 4]

In [56]:
type(this_list)

list

---

#### 2.1.2 Tuples vs Lists <a name="2.1.2"></a>

Besides the notation, the main difference between tuples and lists is that lists are changeable. This means that, contrary to tuples, we can change, append or delete elements on a list after creation.    

Regarding length checking, indexing, slicing and membership testing, these are done in the same way for lists as they are done for tuples.

Functions __`len()`__, __`type()`__ and __`isinstance()`__ can also be used with lists.

---

#### 2.1.3 List creation <a name="2.1.3"></a>

Let's see an example of a list with multiple types of variables inside.

In [57]:
this_list = [1.5, "hello", True, "book"]
this_list

[1.5, 'hello', True, 'book']

---

Let's verify if we can have duplicated values in a list and if the order is maintained.

In [58]:
this_list = [1, 1, 1, 10, 10, 5, 5, 1]
this_list

[1, 1, 1, 10, 10, 5, 5, 1]

As we can see from the cell above, order and duplicated values are maintained.

---

We can also have a list of lists.

In [59]:
this_list = [[1,2], [4,5,6], [4]]
type(this_list)

list

---

A list with one element:

In [60]:
this_list = [1]
type(this_list)

list

---

An empty list:

In [61]:
this_list = []
type(this_list), len(this_list)

(list, 0)

An empty list can also be created with the function __`list()`__.

In [62]:
this_list = list()
type(this_list), len(this_list) 

(list, 0)

---

We can create lists with __list comprehension__.

In [63]:
this_list = [i for i in range(0, 10)]
this_list

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

List comprehension is a way to create a list using a for loop under the hood. This will be clearer by the time that we learn about __for__ loops. 
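
As a rough preview, the comprehension above builds the same list as an explicit loop. The sketch below uses illustrative variable names and the method `append()`, which is presented in section 2.4.2.

```python
# Explicit loop: start with an empty list and append one element at a time
loop_list = []
for i in range(0, 10):
    loop_list.append(i)

# Equivalent list comprehension
comprehension_list = [i for i in range(0, 10)]

# A comprehension can also transform each element, here by squaring it
squares = [i ** 2 for i in range(0, 5)]

print(loop_list == comprehension_list, squares)
```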

---

### 2.2 List length and accessing specific elements <a name="2.2"></a>

The rules for checking the size of a list, indexing and slicing are the same for tuples and lists.

#### 2.2.1 Length <a name="2.2.1"></a>

In [64]:
this_list = [0,1,2,3,4]
len(this_list)

5

---

#### 2.2.2 Indexing <a name="2.2.2"></a>

Using __positive indexing__, let's try to extract `"tomato"` from the following list:

In [65]:
fruit = ["banana", "apple", "tomato", "watermelon", "pineapple"]

In [66]:
#tomato is on the positive index 2
fruit[2]

'tomato'

---

Let's now extract the last element of the list `fruit` using __positive indexing__.

In [67]:
fruit[len(fruit)-1]

'pineapple'

---

Given the following list of lists, let's try to extract `"pear"` using __positive indexing__.

In [68]:
fruit_by_color = [["strawberry", "cherry"], ["green apple", "kiwi", "pear"], ["mango", "papaya"]]

`"pear"` is inside the second list that is on positive index 1. Inside the second list, `"pear"` is on positive index 2. 

When we want to index a list inside a list, first we should index the main list and then the list inside.

In [69]:
fruit_by_color[1][2]

'pear'

---

Using __negative indexing__, let's try to extract `"mango"` from `fruit_by_color` list.

`"mango"` is on position -1 of the main list and on the position -2 of the secondary list.

In [70]:
fruit_by_color[-1][-2]

'mango'

---

Now mixing positive indexing with negative indexing in order to extract `"cherry"`.

`"cherry"` is on positive index 0 of the main list and -1 (last element of `["strawberry", "cherry"]` ) of the secondary list. 

In [71]:
fruit_by_color[0][-1]

'cherry'

---

#### 2.2.3 Slicing <a name="2.2.3"></a>

As said previously, the rules for slicing are the same for lists as the ones for tuples.

You can find some good examples of slicing with lists (with illustrations!), slicing with steps, positive and negative indexing on this [link](https://www.learnbyexample.org/python-list-slicing/). __Check it out__!

Let's check a few examples of __positive index__ slicing.

In [72]:
fruit

['banana', 'apple', 'tomato', 'watermelon', 'pineapple']

Let's extract the elements `"tomato"` and `"watermelon"`.

In [73]:
fruit[2:4:1]

['tomato', 'watermelon']

We can also omit the step in the cell above, because 1 is its default value.

In [74]:
fruit[2:4]

['tomato', 'watermelon']

---

Let's now extract the same values but using __negative indexing__ with a positive step of 1.

In [75]:
fruit[-3:-1:1]

['tomato', 'watermelon']

It is the same as writing:

In [76]:
fruit[-3:-1]

['tomato', 'watermelon']

---

Finally, let's slice the entire list `fruit` backwards with negative indexing, excluding the `"banana"`.   
Like a reverse + excluding the element `"banana"`.

In [77]:
fruit

['banana', 'apple', 'tomato', 'watermelon', 'pineapple']

In [78]:
fruit[:-5:-1]

['pineapple', 'watermelon', 'tomato', 'apple']

---

We are done with the slicing. Pizza for everyone!

<img src="data/pizza_2.gif" width="300" height="150" >

Hope we didn't make you hungry...

---

### 2.3 Membership test <a name="2.3"></a>

The methodology to check if an element belongs to a list is the same as the one we've learned for tuples.

In [79]:
fruit

['banana', 'apple', 'tomato', 'watermelon', 'pineapple']

Let's confirm that `fruit` list has `"pineapple"`.

In [80]:
"pineapple" in fruit

True

Let's check if the element `"pizza"` is in the list `fruit`.

In [81]:
"pizza" in fruit

False

Yup, pizza is not a fruit, even if it has `"pineapple"` on it.

Finally, let's check that the element at index 2 is `"tomato"`.

In [82]:
fruit[2] == "tomato"

True

---

### 2.4 Replace, Append and Delete <a name="2.4"></a>

Unlike tuples, lists are mutable. In this way, we are able to replace, delete and append values.

#### 2.4.1 Replace <a name="2.4.1"></a>

In [83]:
fruit

['banana', 'apple', 'tomato', 'watermelon', 'pineapple']

We want to replace `"watermelon"` with `"onion"`. `"watermelon"` is at index 3, so we need to index the list at this position and assign it the new value, `"onion"`.

In [84]:
fruit[3] = "onion"
fruit

['banana', 'apple', 'tomato', 'onion', 'pineapple']

From the cell above, you can see that `"watermelon"` was replaced by `"onion"`.

We will also replace `"pineapple"` with `"bacon"` using negative indexing.

In [85]:
fruit[-1] = "bacon"
fruit

['banana', 'apple', 'tomato', 'onion', 'bacon']

---

#### 2.4.2 Append <a name="2.4.2"></a>

Now, we will append the element `"bread"` to our list fruit.   

In [86]:
fruit.append("bread")
fruit

['banana', 'apple', 'tomato', 'onion', 'bacon', 'bread']

The method __`append()`__ added the element `"bread"` to the end of the list `fruit`.
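
Note that `append()` always adds its argument as a single element, even when that argument is itself a list. To add every element of another list, the list method `extend()` can be used instead. A sketch with hypothetical lists (`toppings`, `more_toppings`):

```python
toppings = ["tomato", "onion"]
toppings.append(["bacon", "bread"])       # the whole list becomes one nested element

more_toppings = ["tomato", "onion"]
more_toppings.extend(["bacon", "bread"])  # each element is added separately

print(toppings)
print(more_toppings)
```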

---

#### 2.4.3 Delete <a name="2.4.3"></a>

If we want to delete an element from the list, we can use __`del`__. Let's start by removing `"apple"` element on the index 1.

In [87]:
del fruit[1]
fruit

['banana', 'tomato', 'onion', 'bacon', 'bread']

We can also delete elements using the __`remove()`__ method, which deletes the first element that matches the given value. Let's remove the element `"banana"` with this method.

In [88]:
fruit.remove("banana")
fruit

['tomato', 'onion', 'bacon', 'bread']

As you might have guessed, we have transformed fruit into pizza!!! 🎉 🎉 🎉

<img src="data/happy.gif" width="300" height="150" >

---

### 2.5 Other Methods <a name="2.5"></a>

#### 2.5.1 __`count()`__

The method __`count()`__ can be used in order to count how many times a specific element appears in a list.

In [89]:
pizza = ["margherita", "napoletana", "carbonara", "romana", "napoletana", "gorgonzola", "calzone", "napoletana", "romana"]
pizza

['margherita',
 'napoletana',
 'carbonara',
 'romana',
 'napoletana',
 'gorgonzola',
 'calzone',
 'napoletana',
 'romana']

Let's check how many times `"napoletana"` appears in the `pizza` list.

In [90]:
pizza.count("napoletana")

3

---

#### 2.5.2 __`index()`__

This method returns the index of the first element that matches the input element.

In [91]:
pizza = ["margherita", "napoletana", "carbonara", "romana", "napoletana", "gorgonzola", "calzone", "napoletana", "romana"]
pizza

['margherita',
 'napoletana',
 'carbonara',
 'romana',
 'napoletana',
 'gorgonzola',
 'calzone',
 'napoletana',
 'romana']

In [92]:
pizza.index("napoletana")

1
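
If the element is not in the list, `index()` raises a `ValueError`. One way to avoid the error is to run a membership test first. The sketch below uses a shortened list, a hypothetical variable `position`, and an `if`/`else` statement, which is not covered in this notebook.

```python
pizza = ["margherita", "napoletana", "carbonara", "napoletana"]

# Only call index() when the element is present
if "hawaiian" in pizza:
    position = pizza.index("hawaiian")
else:
    position = None

# index() also accepts an optional start position for the search
second_napoletana = pizza.index("napoletana", 2)

print(position, second_napoletana)
```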

---

#### 2.5.3 __`sort()`__

In order to sort the list, we can make use of the method __`sort()`__.

In [93]:
pizza

['margherita',
 'napoletana',
 'carbonara',
 'romana',
 'napoletana',
 'gorgonzola',
 'calzone',
 'napoletana',
 'romana']

In [94]:
pizza.sort()
pizza

['calzone',
 'carbonara',
 'gorgonzola',
 'margherita',
 'napoletana',
 'napoletana',
 'napoletana',
 'romana',
 'romana']

Elements in `pizza` are now in alphabetical order.

Let's apply the same method to a list of numbers.

In [95]:
pizza_price_usa = [8.37, 8.59,  5.99,  7.75, 8.75, 8.95, 8.99, 6.99, 7.99, 8.0]

In [96]:
pizza_price_usa.sort()
pizza_price_usa

[5.99, 6.99, 7.75, 7.99, 8.0, 8.37, 8.59, 8.75, 8.95, 8.99]

__Random Information__: Those are the average prices for pizza, in the cities with the cheapest pizza in the USA.  [Forbes: The Pizza Price Index](https://www.forbes.com/sites/priceonomics/2017/09/26/the-pizza-price-index/#7b2826be6553)
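
`sort()` changes the list in place and returns `None`. When the original order must be kept, the built-in function `sorted()` returns a new sorted list instead, and both accept `reverse=True` for descending order. A sketch with a shortened, illustrative price list:

```python
prices = [8.37, 5.99, 7.75]

ascending_copy = sorted(prices)                 # new list, `prices` keeps its order
descending_copy = sorted(prices, reverse=True)  # new list in descending order
order_before_sort = list(prices)                # copy taken before sorting in place

result_of_sort = prices.sort()                  # sorts in place and returns None

print(order_before_sort, prices, result_of_sort)
```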

### 2.6 Converting a tuple into a list and vice-versa <a name="2.6"></a>

Sometimes we may need to convert a tuple into a list and vice-versa.   

Let's start by trying to convert a tuple into a list.

In [97]:
this_tuple =("i", "am", "a", "tuple")
type(this_tuple)

tuple

The variable above is a tuple, as we can see from the output of the function __`type()`__. If we give this variable as an input of __`list()`__, our tuple will be converted to a list.

In [98]:
this_tuple = list(this_tuple)
this_tuple

['i', 'am', 'a', 'tuple']

In [99]:
type(this_tuple)

list

From the output of the last cell, we can confirm that `this_tuple` is now a list.

---

Let's now convert a list into a tuple. For that we will use the function __`tuple()`__. 

In [100]:
this_list = ["i", "am", "a", "list"]

In [101]:
type(this_list)

list

In order to convert our list into a tuple, we pass the list as the input of the function __`tuple()`__.

In [102]:
this_list = tuple(this_list)
this_list

('i', 'am', 'a', 'list')

In [103]:
type(this_list)

tuple

We have just converted our list into a tuple.
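
Because tuples are immutable, one possible workaround for "changing" a tuple is to convert it to a list, modify the list, and convert it back. A minimal sketch (the variable names are illustrative):

```python
fruit = ("banana", "apple", "tomato")

fruit_as_list = list(fruit)       # mutable copy of the tuple
fruit_as_list[0] = "lemon"        # replace the first element
new_fruit = tuple(fruit_as_list)  # back to an immutable tuple

# The original tuple is unchanged
print(fruit, new_fruit)
```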

---

### 2.7 List Operations<a name="2.7"></a>

We can perform operations between lists.

Let's see what happens if we try to add two lists. 

In [104]:
pizza = ["margherita", "napoletana", "carbonara", "romana", "gorgonzola", "calzone"]
pizza

['margherita', 'napoletana', 'carbonara', 'romana', 'gorgonzola', 'calzone']

In [105]:
other_pizzas = ["quattro stagioni", "Frutti di Mare", "quattro formaggi"]
other_pizzas

['quattro stagioni', 'Frutti di Mare', 'quattro formaggi']

In [106]:
all_pizzas = pizza + other_pizzas
all_pizzas

['margherita',
 'napoletana',
 'carbonara',
 'romana',
 'gorgonzola',
 'calzone',
 'quattro stagioni',
 'Frutti di Mare',
 'quattro formaggi']

The result of adding two lists is a third list with all the elements of both lists. The first elements are the elements on list `pizza` followed by `other_pizzas`.

---

We can also multiply a list by a number.

In [107]:
other_pizzas

['quattro stagioni', 'Frutti di Mare', 'quattro formaggi']

In [108]:
other_pizzas_5_x = other_pizzas * 5
other_pizzas_5_x

['quattro stagioni',
 'Frutti di Mare',
 'quattro formaggi',
 'quattro stagioni',
 'Frutti di Mare',
 'quattro formaggi',
 'quattro stagioni',
 'Frutti di Mare',
 'quattro formaggi',
 'quattro stagioni',
 'Frutti di Mare',
 'quattro formaggi',
 'quattro stagioni',
 'Frutti di Mare',
 'quattro formaggi']

In [109]:
len(other_pizzas), len(other_pizzas_5_x)

(3, 15)

Now each element on list `other_pizzas` appears 5 times on list `other_pizzas_5_x`.

What if we have a list of numbers?

In [110]:
numbers = [1, 2, 3, 4]
numbers

[1, 2, 3, 4]

In [111]:
numbers_x4 = numbers * 4
numbers_x4

[1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4]

The result is the same, the elements are repeated.
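
In other words, multiplying a list repeats it; it does not multiply each number. If the goal is to multiply every element, one option is a list comprehension, as in this sketch:

```python
numbers = [1, 2, 3, 4]

repeated = numbers * 2              # repetition of the whole list
doubled = [n * 2 for n in numbers]  # element-wise multiplication

print(repeated)
print(doubled)
```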

### 2.8 Further Reading<a name="2.8"></a>

[Python docs: Datastructures](https://docs.python.org/3/tutorial/datastructures.html)    
[Programiz: Python List](https://www.programiz.com/python-programming/list)   
[GeeksForGeeks: Python Lists](https://www.geeksforgeeks.org/python-list/)   
[TutorialsPoint: Python Lists ](https://www.tutorialspoint.com/python/python_lists.htm)   
[RealPython: Python Lists and Tuples](https://realpython.com/python-lists-tuples/)     
[W3schools: Python Lists](https://www.w3schools.com/python/python_lists.asp)  
[Datacamp: Python List questions](https://www.datacamp.com/community/tutorials/18-most-common-python-list-questions-learn-python)

## 3 Data Structure - Dictionary <a name="3"></a>

# <center> {"octopus": 🐙, "pork": 🐷, "cat": 🐈, "monkey": 🙉} </center>

### 3.1 Definition and Dictionary Creation<a name="3.1"></a>

#### 3.1.1 Definition

A dictionary is a collection of key-value pairs. It doesn't support duplicated keys. Each unique key is mapped to a value. Values can be of any type, while keys must be hashable (for example strings, numbers or tuples of such values).   

Since Python 3.7, dictionaries keep the order in which the pairs were inserted. However, elements are accessed by key and not by position, so you cannot search for an element by its position.   

They are also mutable. It means that we can append, delete and update key-value pairs on a dictionary after its creation.

In a dictionary, the key-value correspondence should be done with a colon `:`, and consecutive pairs should be separated by commas `,`. All pairs should be between curly brackets `{}`.

Let's create our first dictionary.

In [112]:
#keys are fruits and values are the corresponding colors
this_dict = {"strawberry": "red", "pear":"green", "mango": "yellow", "banana": "yellow"}
this_dict

{'strawberry': 'red', 'pear': 'green', 'mango': 'yellow', 'banana': 'yellow'}

#### 3.1.2 Dictionary creation

Let's start by creating an empty dictionary.

In [113]:
this_dict = {}
len(this_dict), type(this_dict)

(0, dict)

An empty dictionary can also be created by using the function __`dict()`__.

In [114]:
this_dict = dict()
len(this_dict), type(this_dict)

(0, dict)

---

What happens if we have duplicated keys? Let's create a dictionary with duplicated keys to find out.

In [115]:
this_dict = {"strawberry": "red", "pear": "green", "mango": "red", "banana": "yellow", "mango": "yellow"}
this_dict

{'strawberry': 'red', 'pear': 'green', 'mango': 'yellow', 'banana': 'yellow'}

It is __not possible__ to have __duplicated keys__ in a Python dictionary. The key-value pair keeps the last value assigned to a key.

---

Let's now create a dictionary with values of different types.

In [116]:
strawberry = {"type": "fruit", "name": "strawberry", "color": "red", "price": 2.5, "in_stock": True}
strawberry

{'type': 'fruit',
 'name': 'strawberry',
 'color': 'red',
 'price': 2.5,
 'in_stock': True}

In [117]:
type(strawberry)

dict

---

Let's try to create a dictionary with keys that are not strings to see what happens.

In [118]:
strawberry = {"type": "fruit", "name": "strawberry", True: "in_stock", 2.5: "in_stock"}
strawberry

{'type': 'fruit', 'name': 'strawberry', True: 'in_stock', 2.5: 'in_stock'}

In [119]:
type(strawberry)

dict

It works. However, keep in mind that the keys of a dictionary should share a common meaning, so that it is easy to search for elements in it.
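
Not every type can be a key, though: keys must be hashable, which in practice means immutable built-in types such as strings, numbers and tuples of such values. A list cannot be a key. The sketch below uses hypothetical data (`stock`) and a `try`/`except` block, which is not covered in this notebook.

```python
# A tuple of strings is a valid key
stock = {("strawberry", "red"): 10}

try:
    invalid = {["strawberry", "red"]: 10}  # a list is mutable, so it is not hashable
    list_key_allowed = True
except TypeError:
    list_key_allowed = False

print(stock[("strawberry", "red")], list_key_allowed)
```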

---

Having dictionaries of dictionaries is very common when we want details about a key. 

Let's create a dictionary with the groceries that someone bought in March 2020. Each key is the name of the product bought and the value corresponds to the details about the product purchased: type of product, price per unit and quantity purchased.

In [120]:
groceries = {
             "bread": {"type": "grains", "price_per_unit": 2, "quantity_purchased": 1},
             "onions": {"type": "vegetables", "price_per_unit": 0.5, "quantity_purchased": 2},
             "rice": {"type": "grains" , "price_per_unit": 1, "quantity_purchased": 2},
             "toilet paper": {"type": "others", "price_per_unit": 50, "quantity_purchased":1000},
             "spinaches": {"type": "vegetables" , "price_per_unit": 1.5, "quantity_purchased": 1}
            }
groceries

{'bread': {'type': 'grains', 'price_per_unit': 2, 'quantity_purchased': 1},
 'onions': {'type': 'vegetables',
  'price_per_unit': 0.5,
  'quantity_purchased': 2},
 'rice': {'type': 'grains', 'price_per_unit': 1, 'quantity_purchased': 2},
 'toilet paper': {'type': 'others',
  'price_per_unit': 50,
  'quantity_purchased': 1000},
 'spinaches': {'type': 'vegetables',
  'price_per_unit': 1.5,
  'quantity_purchased': 1}}

In [121]:
type(groceries)

dict

As we can see, it is possible to have a dictionary whose values are other dictionaries.

---

### 3.2 Accessing keys and values on a dictionary<a name="3.2"></a>

Sometimes we may need to check the value assigned to a certain key. We can do that in two ways: using square brackets `[]` with the key inside; or using the __`get()`__ method also with the key inside.

- __Square brackets:__

In [122]:
groceries

{'bread': {'type': 'grains', 'price_per_unit': 2, 'quantity_purchased': 1},
 'onions': {'type': 'vegetables',
  'price_per_unit': 0.5,
  'quantity_purchased': 2},
 'rice': {'type': 'grains', 'price_per_unit': 1, 'quantity_purchased': 2},
 'toilet paper': {'type': 'others',
  'price_per_unit': 50,
  'quantity_purchased': 1000},
 'spinaches': {'type': 'vegetables',
  'price_per_unit': 1.5,
  'quantity_purchased': 1}}

In [123]:
toilet_paper = groceries["toilet paper"]
toilet_paper

{'type': 'others', 'price_per_unit': 50, 'quantity_purchased': 1000}

Let's now try to search by index on the dictionary to see what happens.

In [124]:
toilet_paper = groceries[1]
toilet_paper

KeyError: 1

We have an error in the cell above. Dictionaries are accessed by key, not by position, so `groceries[1]` looks for a key equal to `1`, which does not exist in our dictionary.
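Positional access is not part of the dictionary interface, but the keys can be copied into a list, which does support indexing. Here is a minimal sketch with a shortened `groceries` (the names `product_names` and `second_product` are illustrative), assuming Python 3.7 or later, where dictionaries keep insertion order:

```python
# Shortened copy of the groceries dictionary
groceries = {
    "bread": {"type": "grains"},
    "onions": {"type": "vegetables"},
    "rice": {"type": "grains"},
}

# list() on a dictionary returns its keys in insertion order (Python 3.7+)
product_names = list(groceries)
print(product_names)  # ['bread', 'onions', 'rice']

# Index the list of keys, then use that key on the dictionary
second_product = product_names[1]
print(second_product)             # onions
print(groceries[second_product])  # {'type': 'vegetables'}
```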

- __`get()` method:__

In [125]:
toilet_paper = groceries.get("toilet paper")
toilet_paper

{'type': 'others', 'price_per_unit': 50, 'quantity_purchased': 1000}

From the cells above we can see that both methods return the same output. And yes, the person who bought these groceries got a lot of very expensive toilet paper...

<img src="data/toilet_paper.gif" width="500" height="150" >
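Square brackets and `get()` behave differently when the key does not exist. The sketch below uses a hypothetical key `"brand"`, which is not part of the original dictionary, to show the difference.

```python
toilet_paper = {"type": "others", "price_per_unit": 50, "quantity_purchased": 1000}

# get() returns None when the key is missing
print(toilet_paper.get("brand"))  # None

# An optional second argument replaces None as the fallback value
print(toilet_paper.get("brand", "unknown"))  # unknown

# Square brackets raise a KeyError for a missing key
try:
    toilet_paper["brand"]
except KeyError as error:
    print("KeyError:", error)  # KeyError: 'brand'
```

As a rule of thumb, `get()` is convenient when a missing key is an expected situation, while square brackets make a missing key fail loudly.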

---

We can also use the methods __`keys()`__ and __`values()`__ in order to get all keys in a dictionary and all values, respectively.

In [126]:
toilet_paper

{'type': 'others', 'price_per_unit': 50, 'quantity_purchased': 1000}

Let's extract the keys from the dict `toilet_paper`.

In [127]:
toilet_paper.keys()

dict_keys(['type', 'price_per_unit', 'quantity_purchased'])

And now all of the values.

In [128]:
toilet_paper.values()

dict_values(['others', 50, 1000])

---

The method __`items()`__ returns a view object (`dict_items`) containing one tuple per key-value pair. The first element of each tuple is the key, and the second element is the corresponding value. This method is particularly useful when we want to iterate over a dictionary.

In [129]:
toilet_paper.items()

dict_items([('type', 'others'), ('price_per_unit', 50), ('quantity_purchased', 1000)])
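The result of `items()` can be unpacked into a key and a value inside a `for` loop, or converted with `list()` when indexing is needed. Here is a minimal sketch, assuming a basic `for` loop; the name `pairs` is illustrative.

```python
toilet_paper = {"type": "others", "price_per_unit": 50, "quantity_purchased": 1000}

# Each item is a (key, value) tuple, unpacked here into two variables
for key, value in toilet_paper.items():
    print(key, "->", value)
# type -> others
# price_per_unit -> 50
# quantity_purchased -> 1000

# Converting to a list allows access by position
pairs = list(toilet_paper.items())
print(pairs[0])  # ('type', 'others')
```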

---

### 3.3 Membership test<a name="3.3"></a>

In order to test if a key belongs to a dictionary, we can use the __`in`__ or __`not in`__ notation.

In [130]:
toilet_paper

{'type': 'others', 'price_per_unit': 50, 'quantity_purchased': 1000}

Let's check if `"quantity_purchased"` key is in the dictionary `toilet_paper`.

In [131]:
"quantity_purchased" in toilet_paper

True

---

If we want to confirm that a key maps to a given value, we first look up the value for that key (see <a href="#3.2">section 3.2</a>) and then compare it with the expected value using __`==`__ or __`!=`__.

Let's see if `"quantity_purchased"` matches the value `1000` in the dictionary `toilet_paper`.

In [132]:
toilet_paper["quantity_purchased"] == 1000

True

---

If we want to check if a value belongs to a dictionary, we can combine the __`in`__ notation with the value extraction that we've learned in <a href="#3.2">section 3.2</a>.

In [133]:
1000 in toilet_paper.values()

True
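Note that applying `in` directly to a dictionary only looks at its keys. The short sketch below contrasts the different membership tests; the key `"brand"` is a hypothetical one that does not exist in the dictionary.

```python
toilet_paper = {"type": "others", "price_per_unit": 50, "quantity_purchased": 1000}

# "others" is a value, not a key, so a plain membership test fails
print("others" in toilet_paper)           # False
print("others" in toilet_paper.values())  # True

# not in is the negated test on the keys
print("brand" not in toilet_paper)        # True

# A (key, value) tuple can be tested against items()
print(("type", "others") in toilet_paper.items())  # True
```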

---

### 3.4 Replace, Append and Delete<a name="3.4"></a>

Let's start by learning how to replace values in a key-value pair.

We can assign a new value to an already existing key using the notation `my_dict[key] = new_value`.

In [134]:
toilet_paper

{'type': 'others', 'price_per_unit': 50, 'quantity_purchased': 1000}

In [135]:
toilet_paper["price_per_unit"] = 2
toilet_paper["quantity_purchased"] = 1
toilet_paper

{'type': 'others', 'price_per_unit': 2, 'quantity_purchased': 1}

---

We can also add new key-value pairs to our dictionary, using the __`update()`__ method or using the notation `my_dict[new_key] = new_value`.

In [136]:
toilet_paper.update({"characteristics": ["soft", "double"]})
toilet_paper

{'type': 'others',
 'price_per_unit': 2,
 'quantity_purchased': 1,
 'characteristics': ['soft', 'double']}

In [137]:
toilet_paper["rating"] = 4
toilet_paper

{'type': 'others',
 'price_per_unit': 2,
 'quantity_purchased': 1,
 'characteristics': ['soft', 'double'],
 'rating': 4}

---

If we want to delete a key-value pair, we can use the __`del`__ notation or the __`pop()`__ method.

In [138]:
del toilet_paper["rating"]
toilet_paper

{'type': 'others',
 'price_per_unit': 2,
 'quantity_purchased': 1,
 'characteristics': ['soft', 'double']}

In [139]:
toilet_paper.pop("type")
toilet_paper

{'price_per_unit': 2,
 'quantity_purchased': 1,
 'characteristics': ['soft', 'double']}
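One practical difference between the two is that `pop()` returns the value it removes, and it accepts an optional default for missing keys. This is a minimal sketch on a small copy of the dictionary; `removed_rating` is an illustrative name.

```python
toilet_paper = {"price_per_unit": 2, "quantity_purchased": 1, "rating": 4}

# pop() removes the pair and hands back the value
removed_rating = toilet_paper.pop("rating")
print(removed_rating)  # 4
print(toilet_paper)    # {'price_per_unit': 2, 'quantity_purchased': 1}

# With a second argument, pop() returns it instead of failing on a missing key
print(toilet_paper.pop("rating", None))  # None

# del raises a KeyError when the key is not there
try:
    del toilet_paper["rating"]
except KeyError as error:
    print("KeyError:", error)  # KeyError: 'rating'
```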

---

### 3.5 Further Reading <a name="3.5"></a>

[Python Docs: Datastructures-Dictionaries](https://docs.python.org/3/tutorial/datastructures.html#dictionaries)   
[Realpython: Python dicts](https://realpython.com/python-dicts/)   
[Programiz: Dictionary](https://www.programiz.com/python-programming/dictionary)   
[W3schools: Python Dictionaries](https://www.w3schools.com/python/python_dictionaries.asp)   
[Pythonlikeyoumeanit: DataStructures II Dictionaries](https://www.pythonlikeyoumeanit.com/Module2_EssentialsOfPython/DataStructures_II_Dictionaries.html)

---

## 4 Data Structure - Set <a name="4"></a>

# <center> {🐙, 🐷, 🐬, 🐞, 🐈, 🙉, 🐸, 🐓} </center>

### 4.1 Definition and Set Creation<a name="4.1"></a>

#### 4.1.1 Definition <a name="4.1.1"></a>

A set is the fourth built-in collection data type in Python. Sets are **unindexed** and **unordered**. They do **not** allow duplicate values.
They are written using curly brackets `{}`. Sets can hold elements of mixed types, as long as every element is immutable.

In [140]:
planets = {"Mercury", "Venus", "Earth", "Mars", "Jupiter", "Saturn", "Uranus", "Neptune", "Neptune"}
planets

{'Earth', 'Jupiter', 'Mars', 'Mercury', 'Neptune', 'Saturn', 'Uranus', 'Venus'}

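You may have noticed that the set is shown in a different order than the one we wrote. Sets are unordered, so the original order is not kept. The notebook happens to display the values alphabetically here, but that is only how the output is shown and should not be relied upon.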

#### 4.1.2 Set creation <a name="4.1.2"></a>

An empty set can be created using the function __`set()`__. 

**Note:** Remember from <a href="#2.1.1">section 2.1.1</a> that using empty curly brackets `{}` will create an empty dictionary and not a set.

In [141]:
this_set = set()
len(this_set), type(this_set)

(0, set)

---

What happens if we have duplicated items?

In [142]:
planets = {"Mercury", "Venus", "Earth", "Mars", "Jupiter", "Saturn", "Uranus", "Neptune", "Neptune"}
planets

{'Earth', 'Jupiter', 'Mars', 'Mercury', 'Neptune', 'Saturn', 'Uranus', 'Venus'}

As you may have noticed, we've written `"Neptune"` twice, but it only appeared once in the set.

It is __not possible__ to have __duplicated items__ in a Python set.

---

Let's now create a set with values of different types.

In [143]:
earth = {True, "Earth", 3}
print(earth)
print( type(earth) )

{3, True, 'Earth'}
<class 'set'>


In [144]:
mercury = {True, "Mercury", [0,1,2]}
mercury

TypeError: unhashable type: 'list'

It is possible to have a set with mixed data types such as bool, string, number or tuple. But if you try to include a mutable type like a list, a dictionary or another set, Python raises a `TypeError`, as in the cell above.

Even though sets allow different data types, it is better to avoid mixing them and to keep the data in your set consistent.
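The failing `mercury` cell can be made to work by swapping the list for a tuple. The sketch below also uses `frozenset`, the built-in immutable counterpart of `set`, to nest one set inside another. The moon names and the variable `inner` are illustrative additions, not part of the original notebook.

```python
# A tuple is immutable, so it is accepted as a set element
mercury = {True, "Mercury", (0, 1, 2)}
print((0, 1, 2) in mercury)  # True

# A regular set cannot be nested, but a frozenset can
inner = frozenset({"Phobos", "Deimos"})
mars = {"Mars", inner}
print(len(mars))  # 2

# One reason to avoid mixing types: True and 1 compare equal,
# so a set treats them as the same element
print(len({True, 1}))  # 1
```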

---

### 4.2 Accessing values in a set <a name="4.2"></a>

As previously said, sets are unindexed and unordered, meaning you cannot access a specific element by its position. Indexing and slicing are therefore not permitted.

In [145]:
# Indexing does not work with sets
planets[0]

TypeError: 'set' object is not subscriptable

The previous cell returned an error, as expected.

### 4.3 Set size and Membership testing <a name="4.3"></a>

As in the other data structures, you can use the function __`len()`__ to get the number of elements in your set.

In [146]:
len(planets)

8

To check if an element exists in a set, you can again use the __`in`__ and __`not in`__ keywords. 

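**Tip:** Sets are implemented as hash tables, so checking whether an element is in a set takes roughly the same time regardless of its size. A list has to be scanned element by element, which makes the same check slower on large lists.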

In [147]:
"Pluto" in planets

False

As you can see, `"Pluto"` does not belong in the set `planets`. 

---

### 4.4 Conversion between lists and sets <a name="4.4"></a>

You can convert a list to a set and vice-versa, simply using the __`set()`__ or __`list()`__ functions. 

In [148]:
planets_list = list(planets)
print(planets_list)
print( type(planets_list) )

['Venus', 'Earth', 'Uranus', 'Neptune', 'Mercury', 'Mars', 'Saturn', 'Jupiter']
<class 'list'>


In [149]:
shopping_list = ["pen", "pencil", "paper", "pencil", "pen"]
shopping_set = set(shopping_list)
print(shopping_set)
print( type(shopping_set) )

{'paper', 'pen', 'pencil'}
<class 'set'>


When converting a list with repeated items to a set, we get the unique values in the list. 

**Useful case**: if you have a big list of items and you want to check for duplicates, you can convert the list to a set and compare the sizes of the two data structures.

In [150]:
len(shopping_list) == len(shopping_set)

False

As you can see, the previous cell returned False, since the list has duplicated elements.
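Comparing sizes tells us that duplicates exist, but not which items are repeated. One possible way to find them, sketched here with illustrative names (`seen`, `duplicates`, `has_duplicates`), is to keep a set of the items already seen. It relies on the `add()` method, which is described in <a href="#4.5">section 4.5</a>.

```python
shopping_list = ["pen", "pencil", "paper", "pencil", "pen"]

# Size comparison: True means at least one item is repeated
has_duplicates = len(shopping_list) != len(set(shopping_list))
print(has_duplicates)  # True

seen = set()        # items encountered so far
duplicates = set()  # items encountered more than once
for item in shopping_list:
    if item in seen:
        duplicates.add(item)
    else:
        seen.add(item)

# sorted() is used only to print in a predictable order
print(sorted(duplicates))  # ['pen', 'pencil']
```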

---

### 4.5 Modify sets <a name="4.5"></a>

We cannot access specific elements of a set by indexing or slicing. But we can modify sets by adding new elements or removing them.

To add a single element, we can use the __`add()`__ method, or if we want to add multiple elements, we can use the __`update()`__ method. This last method takes lists, strings, tuples or other sets of elements to add.

-  __`add()`__

In [151]:
terrestrial_planets = {"Mercury", "Venus", "Earth"}
terrestrial_planets

{'Earth', 'Mercury', 'Venus'}

In [152]:
terrestrial_planets.add("Mars")
terrestrial_planets

{'Earth', 'Mars', 'Mercury', 'Venus'}

---

-  __`update()`__

In [153]:
giant_planets = {"Jupiter", "Saturn"}
giant_planets

{'Jupiter', 'Saturn'}

In [154]:
giant_planets.update(["Neptune", "Uranus"])
giant_planets

{'Jupiter', 'Neptune', 'Saturn', 'Uranus'}
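Since `update()` iterates over its argument, passing a string adds each character separately, whereas `add()` inserts the string as a single element. The sketch below uses small illustrative sets (`letters`, `words`) to show the difference, and also shows that `update()` accepts several iterables at once.

```python
# update() iterates over the string and adds one character at a time
letters = {"a"}
letters.update("bc")
print(sorted(letters))  # ['a', 'b', 'c']

# add() inserts the whole string as one element
words = {"a"}
words.add("bc")
print(sorted(words))  # ['a', 'bc']

# update() accepts more than one iterable, here a list and a set
giant_planets = {"Jupiter", "Saturn"}
giant_planets.update(["Neptune"], {"Uranus"})
print(sorted(giant_planets))  # ['Jupiter', 'Neptune', 'Saturn', 'Uranus']
```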

---

To remove an element from a set, we can use the methods __`discard()`__ or __`remove()`__.

The only difference between them is that __`remove()`__ raises a `KeyError` if the element does not exist in the set, while __`discard()`__ does nothing in that case.

In [155]:
planets = {"Mercury", "Venus", "Earth", "Mars", "Jupiter", "Saturn", "Uranus", "Neptune", "Pluto"}
planets

{'Earth',
 'Jupiter',
 'Mars',
 'Mercury',
 'Neptune',
 'Pluto',
 'Saturn',
 'Uranus',
 'Venus'}

In [156]:
planets.remove("Pluto")
planets

{'Earth', 'Jupiter', 'Mars', 'Mercury', 'Neptune', 'Saturn', 'Uranus', 'Venus'}
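The difference between the two methods only shows up when the element is missing. Here is a minimal sketch on a small illustrative set:

```python
planets = {"Earth", "Mars"}

# discard() silently ignores an element that is not in the set
planets.discard("Pluto")
print(sorted(planets))  # ['Earth', 'Mars']

# remove() raises a KeyError for the same situation
try:
    planets.remove("Pluto")
except KeyError as error:
    print("KeyError:", error)  # KeyError: 'Pluto'
```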

---

### 4.6  Operations with sets <a name="4.6"></a>

You can compare the elements in two sets by checking their intersection or difference. For this you have the methods __`intersection()`__ (or the operator __`&`__) and __`difference()`__ (or the operator __`-`__).

You may also need to combine the elements of two sets. For this, you can use the method __`union()`__ or the operator __`|`__.

Let's start by checking which `terrestrial_planets` are also `blue_planets`.

In [157]:
terrestrial_planets = {"Earth", "Mars", "Mercury", "Venus"}
terrestrial_planets

{'Earth', 'Mars', 'Mercury', 'Venus'}

In [158]:
blue_planets = {"Earth", "Uranus", "Neptune"}
blue_planets

{'Earth', 'Neptune', 'Uranus'}

- __`intersection()`__

In [159]:
# You can also write terrestrial_planets & blue_planets
terrestrial_planets.intersection(blue_planets)

{'Earth'}

Only `"Earth"` is a terrestrial blue planet. 

---

- __`difference()`__

Now let's find out which terrestrial planets are not blue, using both notations for the difference operation.

In [160]:
terrestrial_planets.difference(blue_planets)

{'Mars', 'Mercury', 'Venus'}

In [161]:
# This is the same as the previous cell
terrestrial_planets - blue_planets

{'Mars', 'Mercury', 'Venus'}

---

- __`union()`__

Let's bring together the `terrestrial_planets` and `giant_planets` into one set.

In [162]:
all_planets = terrestrial_planets | giant_planets
all_planets

{'Earth', 'Jupiter', 'Mars', 'Mercury', 'Neptune', 'Saturn', 'Uranus', 'Venus'}

This is the same as writing:

In [163]:
all_planets = terrestrial_planets.union(giant_planets)
all_planets

{'Earth', 'Jupiter', 'Mars', 'Mercury', 'Neptune', 'Saturn', 'Uranus', 'Venus'}
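The methods and the operators are not fully interchangeable: the methods accept any iterable, while the operators require both sides to be sets. The sketch below illustrates this with a list named `blue_list` (an illustrative name), and also shows that the result of a difference depends on the order of the operands.

```python
terrestrial_planets = {"Earth", "Mars", "Mercury", "Venus"}
blue_list = ["Earth", "Uranus", "Neptune"]  # a list, not a set

# The method accepts a list directly
print(terrestrial_planets.intersection(blue_list))  # {'Earth'}

# The operator requires a set on both sides
try:
    terrestrial_planets & blue_list
except TypeError as error:
    print("TypeError:", error)

# Difference is not symmetric: swapping the operands changes the result
blue_planets = set(blue_list)
print(sorted(terrestrial_planets - blue_planets))  # ['Mars', 'Mercury', 'Venus']
print(sorted(blue_planets - terrestrial_planets))  # ['Neptune', 'Uranus']
```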

---

### 4.7 Further Reading <a name="4.7"></a>

[Python Docs: Datastructures-Sets](https://docs.python.org/3/tutorial/datastructures.html#sets)   
[Realpython: Python sets](https://realpython.com/python-sets/)   
[Programiz: Set](https://www.programiz.com/python-programming/set)   
[W3schools: Python Sets](https://www.w3schools.com/python/python_sets.asp)   

---