# Data Structures

Data structures are central to programming: they store collections of values for later processing. In Python, a data structure is an object like any other value, so you assign it to a variable in the same way. The four main built-in data structures are:

- Lists (`list`)
- Tuples (`tuple`)
- Dictionaries (`dict`)
- Sets (`set`)

## Lists

Lists are ordered sequences of objects. They are written with square brackets `[]`, with the elements separated by commas.

In [1]:
buttons = ["morning", "before lunch", "after lunch", "end of day"]

A value is accessed by its index. Be careful: indexes always start at 0. This can be confusing at first, but you will get used to it.

In [2]:
print(buttons[0])

morning


We can specify that we only want the first two elements of the list.

In [3]:
print(buttons[:2])

['morning', 'before lunch']


We can also get a reversed copy of the list.

In [4]:
print(buttons[::-1])

['end of day', 'after lunch', 'before lunch', 'morning']
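
The slices used so far all follow the pattern `list[start:stop:step]`, where each part is optional. As a small sketch (the variable names `first_two`, `middle`, `every_other` and `reversed_copy` are only illustrative), here is how the three parts behave. Note that a slice returns a new list and leaves the original untouched.

```python
# Same list as in the cells above
buttons = ["morning", "before lunch", "after lunch", "end of day"]

first_two = buttons[:2]        # start omitted: begin at index 0, stop before index 2
middle = buttons[1:3]          # indexes 1 and 2 (the stop index is excluded)
every_other = buttons[::2]     # step of 2: indexes 0 and 2
reversed_copy = buttons[::-1]  # step of -1: walk the list backwards

print(first_two)
print(middle)
print(every_other)
print(reversed_copy)
print(buttons)  # the original list is unchanged
```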


You can also add a new element to a list:

In [5]:
buttons.append("before dinner")
print(buttons)

['morning', 'before lunch', 'after lunch', 'end of day', 'before dinner']


A list can hold any type of data, including other lists.

In [6]:
var = "Hello, I am a variable that contains String values."
other_list = [193.45, "Hello", ["Inception list", "Two elements"], 89, var]
print(other_list)

[193.45, 'Hello', ['Inception list', 'Two elements'], 89, 'Hello, I am a variable that contains String values.']


You can also merge two lists.

In [7]:
buttons = ["morning", "before lunch", "after lunch", "end of day"]
extra_buttons = ["before dinner", "after dinner", "time to sleep!"]
buttons.extend(extra_buttons)
print(buttons)

# OR
buttons = ["morning", "before lunch", "after lunch", "end of day"]
extra_buttons = ["before dinner", "after dinner", "time to sleep!"]

all_buttons = buttons + extra_buttons
print(all_buttons)

['morning', 'before lunch', 'after lunch', 'end of day', 'before dinner', 'after dinner', 'time to sleep!']
['morning', 'before lunch', 'after lunch', 'end of day', 'before dinner', 'after dinner', 'time to sleep!']
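
The two ways of merging differ in one respect that is easy to miss: `extend` changes the existing list in place, while `+` builds a new list and leaves both operands alone. A minimal sketch, reusing the same two lists:

```python
buttons = ["morning", "before lunch", "after lunch", "end of day"]
extra_buttons = ["before dinner", "after dinner", "time to sleep!"]

# `+` creates a new list; `buttons` still has its 4 elements afterwards
all_buttons = buttons + extra_buttons
print(len(buttons), len(all_buttons))

# `extend` modifies `buttons` itself and returns None
result = buttons.extend(extra_buttons)
print(len(buttons), result)

# Both approaches end up with the same content
print(buttons == all_buttons)
```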


#### Exercise

1. By doing some research, try to print the following elements from the list buttons:

- The last button of the day
- The two last buttons of the day

2. Then add a `"Wake up"` button at the start of the list and print the new first two buttons of the day.

In [46]:
print(buttons[-1])            # the last button of the day
print(buttons[-2:])           # the last two buttons of the day
buttons.insert(0, "Wake up")  # add a button before the start of the day
print(buttons[:2])            # the new first two buttons

end of day
['after lunch', 'end of day']
['Wake up', 'morning']


## Tuples

Python offers a type of data called `tuple`. It is similar to a list but cannot be modified: you cannot add anything to it and you cannot change the value of its elements. You can use tuples when you want to be sure that the data is not modified by mistake within a program.

They are declared with parentheses `()`.

In [65]:
buttons_tuple = ("morning", "before lunch", "after lunch", "end of day")
buttons_list = ["morning", "before lunch", "after lunch", "end of day"]

### `tuple` vs. `list`
When we try to modify the content of a `tuple` object, we get an error. The same operation works without any problem on a list.

In [9]:
buttons_tuple.append("before dinner")

AttributeError: 'tuple' object has no attribute 'append'

In [10]:
buttons_list.append("before dinner")
print(buttons_list)

['morning', 'before lunch', 'after lunch', 'end of day', 'before dinner']
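
The missing `append` method is not the only protection. As a small sketch, assigning to an existing position in a tuple fails as well, while reading from it works exactly like reading from a list. The variable `error_message` is only there to show the text of the error.

```python
buttons_tuple = ("morning", "before lunch", "after lunch", "end of day")

try:
    # Item assignment is rejected because tuples are immutable
    buttons_tuple[0] = "wake up"
except TypeError as error:
    error_message = str(error)
    print("TypeError:", error_message)

# Reading by index, slicing and len() behave as they do for a list
print(buttons_tuple[0])
print(buttons_tuple[:2])
print(len(buttons_tuple))
```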


Also, if we assign a `list` to a second variable, both names refer to the same list object, so a modification made through one name is visible through the other.

In [47]:
buttons_list_copy = buttons_list
buttons_list_copy.append("after dinner")

print(buttons_list_copy)
print(buttons_list)

['morning', 'before lunch', 'after lunch', 'end of day', 'before dinner', 'after dinner']
['morning', 'before lunch', 'after lunch', 'end of day', 'before dinner', 'after dinner']
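
If you want a second list that can change independently, you need a real copy. A minimal sketch using the list method `copy()` (the names `alias` and `independent_copy` are illustrative):

```python
buttons_list = ["morning", "before lunch", "after lunch", "end of day"]

alias = buttons_list                    # second name for the same list object
independent_copy = buttons_list.copy()  # new list with the same elements

alias.append("before dinner")             # also visible through buttons_list
independent_copy.append("after dinner")   # only affects the copy

print(alias is buttons_list)             # same object
print(independent_copy is buttons_list)  # different object
print(buttons_list)
print(independent_copy)
```

Note that `copy()` is a shallow copy: nested lists inside the list are still shared between the original and the copy.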


So if you need to process a collection that will not change, use a tuple. On the other hand, if you know that the collection will have to change, use a list.

**Question:** Regarding the buttons from MyBecode, is it more relevant to use a `list` or a `tuple`? Why?

## Sets

A set is an unordered collection of unique elements in Python. You can define one in two ways:

```python
# Creating an empty set
empty_set = set()

# Creating a set with elements
my_set = set([2, 3, 4, 1, 5])
```

or

```python
# Creating a set with elements
my_set = {2, 3, 4, 1, 5}
```

Since a set is not ordered, its elements have no position, so there is no `append` method. You can still `add` elements to the set:


In [91]:
my_set = set([2, 4, 3, 1, 5])
my_set.add("X")
print(my_set)

{1, 2, 3, 4, 5, 'X'}
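
Because the elements of a set are unique, adding a value that is already present changes nothing, and converting a list to a set drops its duplicates. A short sketch with a made-up guest list:

```python
# Hypothetical list with repeated names
guests = ["Louis", "Senay", "Louis", "Sofie", "Senay"]

unique_guests = set(guests)  # duplicates are dropped
print(len(guests))
print(len(unique_guests))

unique_guests.add("Louis")   # already present: the set does not grow
print(len(unique_guests))
```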


### `set` vs. `list`

When you have unique elements and you don't care about the order, a set is a better choice than a list because checking whether an element is present is much faster. This is due to the way each data structure looks up information.

Imagine that you have a list of guests for a party (and you know it by heart). When a guest arrives, you have two ways of checking whether they are actually invited:

- Go through the list of guests, name by name, until you find the name. This is how a **list** works.
- Trust your memory to tell you directly whether the person was invited or not. This is how a **set** works.

This is related to computational complexity: the time a list needs grows with its length, while the time a set needs stays roughly constant. You do not need to remember the details now, but keep the example in mind.
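
To get a feel for the difference, here is a rough sketch that times the same membership test on a list and on a set using the standard `timeit` module. The collection size, the use of integers in place of guest names and the number of repetitions are arbitrary assumptions, and the exact timings depend on your machine.

```python
import timeit

# Integers stand in for guest names; 100,000 is an arbitrary size
guest_list = list(range(100_000))
guest_set = set(guest_list)

missing_guest = -1  # not invited: the list has to be scanned to the end

# Run each membership test 100 times and measure the total time in seconds
list_time = timeit.timeit(lambda: missing_guest in guest_list, number=100)
set_time = timeit.timeit(lambda: missing_guest in guest_set, number=100)

print(f"list: {list_time:.6f} s")
print(f"set:  {set_time:.6f} s")
```

Both tests give the same answer. Only the time needed to reach it differs.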

### Operations with sets

Sets can be used to perform mathematical (logical) operations. You can compute the union, intersection or difference of two sets.

Imagine that you have two groups of friends: one from Becode and another from your sport club. Some of them can be in both groups!

In [74]:
becode = {"Louis", "Basile", "Mohammed", "Justine", "Senay", "Sofie"}
sport = {"Remco", "Eden", "Mohammed", "Senay"}

# Union: include all friends from both groups
print("Union:", becode.union(sport))

# Intersection: include only friends that are in both groups
print("Intersection:", becode.intersection(sport))

# Difference: friends from Becode that are not in the sport group
print("Difference:", becode.difference(sport))

# Symmetric difference (friends who are exclusive to one group)
print("Symmetric difference:", becode.symmetric_difference(sport))

Union: {'Louis', 'Sofie', 'Senay', 'Justine', 'Basile', 'Eden', 'Mohammed', 'Remco'}
Intersection: {'Senay', 'Mohammed'}
Difference: {'Basile', 'Louis', 'Justine', 'Sofie'}
Symmetric difference: {'Louis', 'Sofie', 'Justine', 'Basile', 'Eden', 'Remco'}
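
Each of these methods also has an operator form, which you will often meet in other people's code. A minimal sketch checking that both spellings give the same result:

```python
becode = {"Louis", "Basile", "Mohammed", "Justine", "Senay", "Sofie"}
sport = {"Remco", "Eden", "Mohammed", "Senay"}

# | is union, & is intersection, - is difference, ^ is symmetric difference
print((becode | sport) == becode.union(sport))
print((becode & sport) == becode.intersection(sport))
print((becode - sport) == becode.difference(sport))
print((becode ^ sport) == becode.symmetric_difference(sport))
```

One difference worth knowing: the operators require both sides to be sets, while the methods accept any iterable, such as a list.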


## Dictionaries

A dictionary in Python is like a real-life dictionary or a phone book. It helps you store and organize information in pairs: a **unique** keyword (like a word in a dictionary or a person's name in a phone book) and its related information (like a definition in a dictionary or a phone number in a phone book). In Python, we call these unique keywords "keys" and their related information "values." Dictionaries allow you to quickly find the value associated with a specific key.

From a syntactic point of view, a dictionary is declared with a pair of braces. An empty dictionary is therefore written as `{}`. You will probably have noticed that sets use braces too. The content is different, though: a dictionary stores key-value pairs, while a set stores single elements. Because `{}` always means an empty dictionary, an empty set has to be written as `set()`.
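
Since both structures use braces, it is worth checking what Python actually creates in each case. A quick sketch using the built-in `type()` function:

```python
empty_braces = {}          # empty braces always give a dictionary
empty_set = set()          # the only way to write an empty set
some_set = {1, 2, 3}       # braces with single elements: a set
some_dict = {"apple": 0.5} # braces with key: value pairs: a dictionary

print(type(empty_braces))
print(type(empty_set))
print(type(some_set))
print(type(some_dict))
```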

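Unlike lists, where indexes are integer positions, dictionaries are indexed by keys. Keys are very often strings, but any hashable (immutable) object, such as a number or a tuple, can be used.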
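
String keys are by far the most common, but they are not the only option. As a small sketch with made-up data (the `lookup` dictionary and its contents are purely illustrative), numbers and tuples work as keys too, while a list is rejected because it can be modified:

```python
# Hypothetical example data
lookup = {
    "apple": 0.5,           # string key
    42: "an integer key",   # integer key
    (0, 1): "a tuple key",  # tuple key
}

print(lookup[42])
print(lookup[(0, 1)])

try:
    # A list is mutable, so Python refuses to use it as a key
    lookup[[0, 1]] = "a list key"
except TypeError as error:
    error_message = str(error)
    print("TypeError:", error_message)
```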

Since the dictionary type is a modifiable type, we can start by creating an empty dictionary and then fill it in little by little:

In [102]:
price_catalog = {}
price_catalog["apple"] = 0.5
price_catalog["strawberry"] = 0.8
price_catalog["kiwi"] = 1.2

print(price_catalog)

{'apple': 0.5, 'strawberry': 0.8, 'kiwi': 1.2}


As you can see in the line above, a dictionary appears as a series of elements separated by commas (all enclosed between two braces). Each of these elements consists of a pair of objects: an index and a value, separated by a colon.

In a dictionary, indexes are called keys, so elements can be called key-value pairs. Since Python 3.7, a dictionary keeps its elements in the order in which they were inserted, which is why the output above matches the order in which we added them. Even so, we never extract a value from a dictionary using a numerical position. Instead, we use the keys:

In [14]:
print(price_catalog["apple"])
print(price_catalog["kiwi"])

0.5
1.2


Here, "apple" and "kiwi" are the keys, and the prices are the values.
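
When you want to look at every pair rather than a single key, a dictionary offers the methods `keys()`, `values()` and `items()`. A minimal sketch using the same catalog (the `total` variable is only illustrative):

```python
price_catalog = {"apple": 0.5, "strawberry": 0.8, "kiwi": 1.2}

print(list(price_catalog.keys()))    # all the keys
print(list(price_catalog.values()))  # all the values

# items() yields (key, value) pairs, unpacked here into two variables
total = 0
for fruit, price in price_catalog.items():
    print(fruit, "costs", price)
    total += price

print(round(total, 2))
```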

Unlike lists, it is not necessary to use a particular method to add new elements to a dictionary: simply create a new key-value pair.

In [16]:
price_catalog["lemon"] = 0.4
print(price_catalog)

{'apple': 0.5, 'strawberry': 0.8, 'kiwi': 1.2, 'lemon': 0.4}


You can also update the value of a key.

In [17]:
price_catalog["apple"] = 0.6
print(price_catalog)

{'apple': 0.6, 'strawberry': 0.8, 'kiwi': 1.2, 'lemon': 0.4}


It is also possible to remove a key-value pair from the dictionary by using the keyword `del`.

In [15]:
del price_catalog["apple"]
print(price_catalog)

{'strawberry': 0.8, 'kiwi': 1.2, 'lemon': 0.4}


One last thing to know: if you try to get a key that is not in the dictionary, Python raises a `KeyError`.

In [18]:
print(price_catalog["orange"])

KeyError: 'orange'
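
There are two common ways to avoid this error. As a sketch, using a small catalog defined just for this example: test for the key with `in` before reading it, or use the `get` method, which returns `None` (or a default value of your choice) when the key is missing.

```python
price_catalog = {"strawberry": 0.8, "kiwi": 1.2}

# Option 1: check that the key exists before reading it
if "orange" in price_catalog:
    print(price_catalog["orange"])
else:
    print("orange is not in the catalog")

# Option 2: get() never raises KeyError
missing_price = price_catalog.get("orange")        # None when the key is absent
default_price = price_catalog.get("orange", 0.0)   # explicit fallback value
kiwi_price = price_catalog.get("kiwi")             # existing key: its value

print(missing_price, default_price, kiwi_price)
```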

## Exercises

1. Create a list of 10 famous artists and print the list.
2. Add a famous artist to the list of artists and print the updated list.
3. Order the list alphabetically and print the first three artists.
4. Create a tuple containing three famous artworks and print the tuple.
5. Create a set of art styles and print the set.
6. Add an art style to an existing set of art styles and print the updated set.
7. Create a second set of art styles and print the common elements.
8. Create a dictionary with famous artists as keys and their best-known artwork as values.
9. Add a new artist and their most famous artwork to the existing dictionary and print it.
10. Access the most famous artwork of a specific artist in the dictionary and print it.
11. Remove an artist and their artwork from the dictionary and print the updated dictionary.

In [105]:
# A list of famous artists
artists = ["Michael Jackson", "Bruno Mars", "Elvis Presley", "Justin Timberlake", "Eminem", "Ed Sheeran", "Sia", "Chris Brown", "Timbaland", "Russ", "Linkin Park"]
print(artists)

# Add an artist, then print the list in alphabetical order
artists.append("Stormzy")
print(sorted(artists))

# The first three artists of the sorted list
print(sorted(artists)[:3])

# A tuple of three famous artworks
artwork_tuple = ("Thriller", "Doo-Wops & Hooligans", "From Elvis in Memphis")
print(artwork_tuple)

# A set (album titles stand in for art styles here)
artstyle_set = {"Breezy", "Shock Value", "CHOMP 2", "FutureSex/LoveSounds", "The Marshall Mathers LP"}
print(artstyle_set)

# Add an element to the set
artstyle_set.add("+")
print(artstyle_set)

# A second set and the elements common to both (none here, so the result is empty)
artstyle_set2 = {"This Is Acting", "Hybrid Theory"}
print(artstyle_set.intersection(artstyle_set2))

# A dictionary mapping each artist to their best-known artwork
artworks = {}
artworks["Michael Jackson"] = "Thriller"
artworks["Justin Timberlake"] = "FutureSex/LoveSounds"
print(artworks)

# Access the artwork of a specific artist
print(artworks["Michael Jackson"])

# Remove an artist and print the updated dictionary
del artworks["Justin Timberlake"]
print(artworks)

['Michael Jackson', 'Bruno Mars', 'Elvis Presley', 'Justin Timberlake', 'Eminem', 'Ed Sheeran', 'Sia', 'Chris Brown', 'Timbaland', 'Russ', 'Linkin Park']
['Bruno Mars', 'Chris Brown', 'Ed Sheeran', 'Elvis Presley', 'Eminem', 'Justin Timberlake', 'Linkin Park', 'Michael Jackson', 'Russ', 'Sia', 'Stormzy', 'Timbaland']
['Bruno Mars', 'Chris Brown', 'Ed Sheeran']
('Thriller', 'Doo-Wops & Hooligans', 'From Elvis in Memphis')
{'FutureSex/LoveSounds', 'Breezy', 'The Marshall Mathers LP', 'CHOMP 2', 'Shock Value'}
{'FutureSex/LoveSounds', 'Breezy', 'The Marshall Mathers LP', 'CHOMP 2', '+', 'Shock Value'}
set()
{'Michael Jackson': 'Thriller', 'Justin Timberlake': 'FutureSex/LoveSounds'}
Thriller
{'Michael Jackson': 'Thriller'}
