# Session 5: sets and dictionaries

## Sets

Sets are mutable, unordered collections of unique items. They're defined in a similar way to lists and tuples, but this time with curly brackets, as in `{1, 2, 3}`, or with `set()`.

```Python
my_set = {1, 2, 3}
```
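One detail worth checking when using curly brackets: an empty pair `{}` does not create a set. A minimal sketch (the variable names are only illustrative):

```python
empty_dict = {}     # empty curly brackets create a dictionary, not a set
empty_set = set()   # set() is the way to create an empty set

print(type(empty_dict))  # <class 'dict'>
print(type(empty_set))   # <class 'set'>
```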

The principal characteristic of sets is that they can only contain *unique* items.

In [1]:
set1 = {1, 1, 2, 2, 3, 3}

set1

{1, 2, 3}

In [2]:
set2 = set([1, 1, 2, 2, 3, 3])

set2

{1, 2, 3}

Say we want to check how many unique characters there are in a string:
* "lava" has 3 unique characters: "l", "a", "v"
* "savannah" has 5 unique characters: "s", "a", "v", "n", "h"

If we want to get the *unique characters* in a string we can use a for loop:

In [3]:
str_to_check = "savannah"

# this is a for loop; we will learn about loops later
list_of_letters = []
for letter in str_to_check:
    if letter not in list_of_letters:
        list_of_letters.append(letter)

list_of_letters


['s', 'a', 'v', 'n', 'h']

Or we can use `set()` directly on the string:

In [4]:
set("savannah")

{'a', 'h', 'n', 's', 'v'}

Or we can use `set()` on a list containing all the characters in the string:

In [5]:
list_from_string = list(str_to_check)
set_from_list_from_string = set(list_from_string)
set_from_list_from_string

{'a', 'h', 'n', 's', 'v'}
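To answer the original question (how many unique characters a string has), one option is to take the length of the set. A small sketch, with illustrative variable names:

```python
# len() of a set gives the number of unique items
n_lava = len(set("lava"))
n_savannah = len(set("savannah"))

print(n_lava)      # 3
print(n_savannah)  # 5

# sets have no defined order; sorted() returns the unique items as an ordered list
unique_sorted = sorted(set("savannah"))
print(unique_sorted)  # ['a', 'h', 'n', 's', 'v']
```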

### Basic set theory

Now that we know how to build containers that store unique items, we can apply some set theory to them.

Say we have two sets: `A` and `B`

* Union of `A` and `B`: items appearing in either. Use `|` or `A.union(B)`
* Intersection of `A` and `B`: items appearing in both. Use `&` or `A.intersection(B)`
* Difference of `A` and `B`: items in `A` but not in `B`. Use `-` or `A.difference(B)`
* Symmetric difference of `A` and `B`: items in only `A` or only `B`, but not in both. Use `^` or `A.symmetric_difference(B)`

In [7]:
setA = {2, 3, 5, 7}
setB = {1, 3, 5, 7, 9}
setC = {8, 6}

In [8]:
# union: items appearing in either
setA | setB

{1, 2, 3, 5, 7, 9}

In [9]:
# intersection: items appearing in both
setA & setB 

{3, 5, 7}

In [10]:
# difference: items in setA but not in setB
setA - setB

{2}

In [11]:
# symmetric difference: items appearing in only one set
setA ^ setB

{1, 2, 9}
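The method forms listed earlier give the same results as the operators. The sketch below repeats the four operations with methods, using the same `setA`, `setB` and `setC`; the names on the left-hand side are only illustrative.

```python
setA = {2, 3, 5, 7}
setB = {1, 3, 5, 7, 9}
setC = {8, 6}

union_ab = setA.union(setB)                 # same as setA | setB
inter_ab = setA.intersection(setB)          # same as setA & setB
diff_ab = setA.difference(setB)             # same as setA - setB
sym_ab = setA.symmetric_difference(setB)    # same as setA ^ setB

# setA and setC share no items, so their intersection is the empty set
inter_ac = setA & setC
print(inter_ac)  # set()

# the methods accept any iterable (here a list); the operators need two sets
union_with_list = setA.union([10, 11])
```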

### Culinary example for Set Theory! 

For a set of 3 dishes and their ingredients, let's find similarities and differences

In [12]:
hummus = {"chickpeas", "olive oil", "garlic", "tahini", "lemon juice", "salt"}
cocido = {"chickpeas", "olive oil", "meat", "chicken", "veggies", "salt"}
allioli = {"olive oil", "garlic", "salt"}

In [13]:
# what's common to the three dishes?

hummus & cocido & allioli

{'olive oil', 'salt'}

In [14]:
# what's in hummus that's not in allioli
hummus - allioli

{'chickpeas', 'lemon juice', 'tahini'}

In [16]:
# which ingredients appear in only one of cocido and hummus?

cocido ^ hummus

{'chicken', 'garlic', 'lemon juice', 'meat', 'tahini', 'veggies'}

In [17]:
# if we want to prepare hummus and allioli, which ingredients should we buy?

hummus | allioli

{'chickpeas', 'garlic', 'lemon juice', 'olive oil', 'salt', 'tahini'}

### Updating sets

Sets are mutable, but they have no positions or keys to assign to. Instead, we can add items to a set by using `update()`. This will add the new items *in place*.

In [18]:
cocido.update(allioli) # in-place union, same as cocido |= allioli

cocido

{'chicken', 'chickpeas', 'garlic', 'meat', 'olive oil', 'salt', 'veggies'}
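Besides `update()`, sets have methods for adding or removing a single item in place. The sketch below contrasts them with `|`, which builds a new set and leaves the original untouched. The extra ingredient `"egg"` is a made-up addition for illustration only.

```python
allioli = {"olive oil", "garlic", "salt"}

# | returns a new set; allioli itself does not change
allioli_with_egg = allioli | {"egg"}   # "egg" is a hypothetical ingredient
print(len(allioli))  # 3

# add() inserts one item in place
allioli.add("lemon juice")

# discard() removes one item in place and does nothing if the item is missing
allioli.discard("garlic")
allioli.discard("butter")   # not in the set: no error

print(sorted(allioli))  # ['lemon juice', 'olive oil', 'salt']
```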

## Dictionaries

Dictionaries are a very flexible way of mapping keys to values. They work as a "super container": their values can be other containers.

We define dictionaries with `dict()` or with curly brackets `{}`, like sets, **but** we specify keys and values.

Dictionaries are mutable and, since Python 3.7, they preserve the order in which items were inserted.

Let's create a dictionary using `{}`:

In [19]:
my_dict = {
    "key1": "a",
    "key2": "b",
    "key3": "c",
}

my_dict

{'key1': 'a', 'key2': 'b', 'key3': 'c'}

Once the dictionary is created, we can add new pairs of keys and values by using square brackets `[]` and the assignment operator `=`.

Keys are unique, though: a dictionary can't hold duplicate keys. If we assign to a key that already exists, its value will be overwritten.

In [20]:
my_dict["key4"] = 'd'

my_dict

{'key1': 'a', 'key2': 'b', 'key3': 'c', 'key4': 'd'}

In [21]:
my_dict["key4"] = 'e'

my_dict

{'key1': 'a', 'key2': 'b', 'key3': 'c', 'key4': 'e'}

We can nest dictionaries, lists, sets, and tuples inside dictionaries.

In [23]:
dict_ids = {
    1: {
        "name": "dani",
        "email": "asdfasdf@qdsfasdf.com"
        },
    2: {
        "name": "pepe",
        "email": "asdfaqsdfasdfsdf@qdsfasdf.com"
        },
}

print(dict_ids)

{1: {'name': 'dani', 'email': 'asdfasdf@qdsfasdf.com'}, 2: {'name': 'pepe', 'email': 'asdfaqsdfasdfsdf@qdsfasdf.com'}}
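To reach a value inside a nested dictionary, square brackets can be chained, one pair per level. A short sketch using the same `dict_ids`; the replacement address is a placeholder, not real data.

```python
dict_ids = {
    1: {"name": "dani", "email": "asdfasdf@qdsfasdf.com"},
    2: {"name": "pepe", "email": "asdfaqsdfasdfsdf@qdsfasdf.com"},
}

# first [] selects the inner dictionary, second [] selects a value inside it
name_1 = dict_ids[1]["name"]
print(name_1)  # dani

# the same chaining works for assignment (placeholder address)
dict_ids[2]["email"] = "pepe@example.com"
print(dict_ids[2]["email"])  # pepe@example.com
```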


We can extract the values associated with a key using square brackets `[]` or the method `get()`

In [24]:
ages_dict = {
    "dani": 36,
    "churro": 11,
}

ages_dict["churro"]

11

In [25]:
ages_dict.get('dani')

36

The difference between the two is that if the key doesn't exist, `[]` will raise a `KeyError`, while `get()` will return `None` (or a default value, if we pass one as a second argument).

In [8]:
ages_dict['plant']

KeyError: 'plant'

In [26]:
ages_dict.get('plant')

In [27]:
age_plant = ages_dict.get('plant')

type(age_plant)

NoneType
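`get()` also takes an optional second argument, returned when the key is missing. A minimal sketch (the default of `0` is an arbitrary choice for illustration):

```python
ages_dict = {"dani": 36, "churro": 11}

# the second argument is the value returned when the key is missing
age_plant = ages_dict.get("plant", 0)
age_dani = ages_dict.get("dani", 0)

print(age_plant)  # 0
print(age_dani)   # 36

# get() only reads: the missing key is not added to the dictionary
print("plant" in ages_dict)  # False
```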

In [28]:
# update() adds several key-value pairs at once

new_additions = {
    'chevin': 99,
    'maureen': 99 
}

ages_dict.update(new_additions)

ages_dict

{'dani': 36, 'churro': 11, 'chevin': 99, 'maureen': 99}

We can also build a dictionary using `dict()` and passing to it a sequence of key-value pairs, like a list of tuples:

In [29]:
another_dict = dict([("dani", 35), ("churro", 10)])

another_dict

{'dani': 35, 'churro': 10}

We can also create a dictionary using the functions we learned in the previous session, `zip()` and `enumerate()`, since both produce pairs of items:

In [12]:
dict(
    zip(
        ['a', 'b', 'c'],
        [1, 2, 3]
    )
)

{'a': 1, 'b': 2, 'c': 3}

In [13]:
dict(
    enumerate(
        ["a", "b", "c"]
    )
)

{0: 'a', 1: 'b', 2: 'c'}

In [31]:
dict(
    enumerate(
        ["a", "b", "c"],
        start=1,
    )
)

{1: 'a', 2: 'b', 3: 'c'}

We can extract the keys and values as `(key, value)` tuples by using the method `items()` on a dictionary. It returns a view object, which we can convert to a list of tuples with `list()`.

This will be very useful when looping through dictionaries.

In [32]:
ages_dict.items()

dict_items([('dani', 36), ('churro', 11), ('chevin', 99), ('maureen', 99)])
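As a preview of looping over a dictionary, each tuple from `items()` can be unpacked into two variables. A sketch using the same `ages_dict` (the names `name`, `age` and `descriptions` are illustrative):

```python
ages_dict = {"dani": 36, "churro": 11, "chevin": 99, "maureen": 99}

descriptions = []
# each item is a (key, value) tuple, unpacked here into name and age
for name, age in ages_dict.items():
    descriptions.append(name + " is " + str(age))

print(descriptions)
# ['dani is 36', 'churro is 11', 'chevin is 99', 'maureen is 99']
```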

## Exercises

* Create a dictionary containing the letters of the alphabet as keys, and their position in the alphabet as values
* Use the dictionary to calculate the sum of the positions of the letters in the word "hello"
* Represent a train as a dictionary, knowing that it is made up of the following parts:
  * A train contains 4 wagons
  * A wagon contains 10 rows
  * A row contains 4 seats

In [42]:
# zip() pairs each letter with its position (1 to 26)
alphabet = dict(zip('abcdefghijklmnopqrstuvwxyz', range(1, 27)))

print(alphabet)

{'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 6, 'g': 7, 'h': 8, 'i': 9, 'j': 10, 'k': 11, 'l': 12, 'm': 13, 'n': 14, 'o': 15, 'p': 16, 'q': 17, 'r': 18, 's': 19, 't': 20, 'u': 21, 'v': 22, 'w': 23, 'x': 24, 'y': 25, 'z': 26}


In [43]:
word = 'hello'
count = 0

for letter in word:
    num = alphabet[letter]
    count += num

print(count)

52
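For the train exercise, one possible representation (among many) is a dictionary of wagons, each holding a dictionary of rows, each holding a list of seats. This is only a sketch: the numbering from 1 and the seat labels `"A"` to `"D"` are assumptions, since the exercise doesn't specify them.

```python
train = {}
for wagon in range(1, 5):           # 4 wagons, numbered 1 to 4 (assumed numbering)
    train[wagon] = {}
    for row in range(1, 11):        # 10 rows per wagon
        # 4 seats per row; the labels are hypothetical
        train[wagon][row] = ["A", "B", "C", "D"]

# seat "C" in row 7 of wagon 2 is the third seat in that row's list
seat = train[2][7][2]
print(seat)  # C

total_seats = 0
for rows in train.values():
    for seats in rows.values():
        total_seats += len(seats)
print(total_seats)  # 160
```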


### Keys and values

Dictionaries, as mentioned, have `keys` and `values`. We can access those independently using the methods `.keys()` and `.values()` on `dict` objects

In [15]:
ages_dict

{'dani': 36, 'churro': 11, 'chevin': 99, 'maureen': 99}

In [16]:
list(ages_dict.keys())

['dani', 'churro', 'chevin', 'maureen']

In [17]:
list(ages_dict.values())

[36, 11, 99, 99]

In [18]:
vs = ages_dict.values()

sum(vs)

245

### Updating dictionaries

Dictionaries, like sets, have an `update()` method that adds new key-value pairs to the dictionary in place.

In [24]:
# the usual way, 1 by 1
ages_dict["car"] = 10

ages_dict

{'dani': 36, 'churro': 11, 'chevin': 99, 'maureen': 99, 'car': 10}

In [26]:
# let's add two more entries to our ages dictionary: pepe (33) and plant (2)
ages_dict.update(
    {"pepe": 33, "plant": 2}
)
ages_dict

{'dani': 36,
 'churro': 11,
 'chevin': 99,
 'maureen': 99,
 'car': 10,
 'pepe': 33,
 'plant': 2}

In general, individual items are added with this pattern:

```Python
dictionary[new_key] = new_value
```

### Sorting a dictionary

Dictionaries have no method for sorting themselves, but we can use a few steps to sort them the way we want. To sort by key:
1. Use `dict.items()` and `list()` to convert the dictionary into a list of tuples
2. Sort the list with `sorted()`, which orders tuples by their first element (the key)
3. Convert the sorted list of tuples into a dictionary again


In [30]:
new_dict_items = ages_dict.items() # 1

list_new_dict_items = list(new_dict_items) # 1

sorted_list = sorted(list_new_dict_items) # 2

dict(sorted_list) # 3

{'car': 10,
 'chevin': 99,
 'churro': 11,
 'dani': 36,
 'maureen': 99,
 'pepe': 33,
 'plant': 2}

In [31]:
# Sort ages_dict by age (the second element of each item)
sorted_people_dict = dict(sorted(ages_dict.items(), key=lambda item: item[1]))

sorted_people_dict

{'plant': 2,
 'car': 10,
 'churro': 11,
 'pepe': 33,
 'dani': 36,
 'chevin': 99,
 'maureen': 99}

In [32]:
# Sort ages_dict by age (the second element of each item), in descending order
sorted_people_dict = dict(sorted(ages_dict.items(), key=lambda item: -item[1]))

sorted_people_dict

{'chevin': 99,
 'maureen': 99,
 'dani': 36,
 'pepe': 33,
 'churro': 11,
 'car': 10,
 'plant': 2}
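Negating the key only works for numeric values. An alternative that should give the same result here is the `reverse` argument of `sorted()`. A sketch using the same `ages_dict` (the name `by_age_desc` is illustrative):

```python
ages_dict = {
    "dani": 36, "churro": 11, "chevin": 99, "maureen": 99,
    "car": 10, "pepe": 33, "plant": 2,
}

# reverse=True sorts from largest to smallest without negating the key
by_age_desc = dict(
    sorted(ages_dict.items(), key=lambda item: item[1], reverse=True)
)

# items with equal ages (chevin, maureen) keep their original relative order
print(list(by_age_desc))
# ['chevin', 'maureen', 'dani', 'pepe', 'churro', 'car', 'plant']
```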