# Session 4: Data Structures II - Dictionaries and Sets

## Sets

Sets are mutable, unordered collections of unique items. They're defined in a similar way to lists and tuples, but with curly brackets `{}` or with `set()`. Note that empty brackets `{}` create an empty dictionary, so an empty set must be created with `set()`.

```Python
my_set = {1, 2, 3}
```

The principal characteristic of sets is that they can only contain *unique* items.

In [4]:
set1 = {1, 1, 2, 2, 3, 3}

set1

{1, 2, 3}

In [5]:
set2 = set([1, 1, 2, 2, 3, 3])

set2

{1, 2, 3}

Say we want to check how many unique characters there are in a string:
* "lava" has 3 unique characters: "l", "a", "v"
* "savannah" has 5 unique characters: "s", "a", "v", "n", "h"

If we want to get the *unique characters* in a string we can use a for loop:

In [None]:
str_to_check = "savannah"

# this uses a for loop, which we will learn about later
list_of_letters = []
for letter in str_to_check:
    if letter not in list_of_letters:
        list_of_letters.append(letter)

list_of_letters

Or we can use `set()` directly on the string:

In [None]:
word = "savannah"

unique_letters = set(word)

unique_letters

{'a', 'h', 'n', 's', 'v'}

In [6]:
set(list("savannah"))

{'a', 'h', 'n', 's', 'v'}

Or, step by step, we can first build a list with all the characters in the string and then call `set()` on it:

In [None]:
list_from_string = list(str_to_check)
set_from_list_from_string = set(list_from_string)
set_from_list_from_string
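
To answer the original question (how many unique characters a string has), one option is to take the length of the set. This is a minimal sketch; the helper name `count_unique_chars` is made up for this example.

```python
def count_unique_chars(text):
    """Return the number of distinct characters in text."""
    # set() accepts any iterable, including a string
    return len(set(text))


print(count_unique_chars("lava"))      # 3
print(count_unique_chars("savannah"))  # 5
```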

### Basic set theory

Now that we know how to build containers that store unique items, we can apply some set theory to them.

Say we have two sets: `A` and `B`

* Union of `A` and `B`: items appearing in either. Use `|` or `A.union(B)`
* Intersection of `A` and `B`: items appearing in both. Use `&` or `A.intersection(B)`
* Difference of `A` and `B`: items in `A` but not in `B`. Use `-` or `A.difference(B)`
* Symmetric difference of `A` and `B`: items in only `A` or only `B`, but not in both. Use `^` or `A.symmetric_difference(B)`

In [2]:
setA = {2, 3, 5, 7}
setB = {1, 3, 5, 7, 9}
setC = {8, 6}

In [5]:
setA = {2, 3, 5.0, 7}
setB = {1, 3, 5, 7, 9}

# union: items appearing in either
setA | setB  # A or B

{1, 2, 3, 5.0, 7, 9}

In [7]:
s = set([1, 2, 3])

s[0]

TypeError: 'set' object is not subscriptable

In [9]:
# intersection: items appearing in both
setA = {2, 3, 5, 7}
setB = {1, 3, 5, 7, 9}

setA & setB

{3, 5, 7}

In [None]:
setA = {2, 3, 5, 7}
setB = {1, 3, 5, 7, 9}

# difference: items in setA but not in setB
setA - setB

{2}

In [10]:
list(setA)

[2, 3, 5, 7]

In [13]:
# symmetric difference: items appearing in only one set
setA = {2, 3, 5, 7}
setB = {1, 3, 5, 7, 9}

setA ^ setB

{1, 2, 9}
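
As a quick recap, each operator has an equivalent method. One difference worth knowing: the methods accept any iterable (a list, for example), while the operators need a set on both sides. A small sketch using the same `setA` and `setB`; the extra values `10` and `11` are arbitrary.

```python
setA = {2, 3, 5, 7}
setB = {1, 3, 5, 7, 9}

# each operator gives the same result as its method
print((setA | setB) == setA.union(setB))                 # True
print((setA & setB) == setA.intersection(setB))          # True
print((setA - setB) == setA.difference(setB))            # True
print((setA ^ setB) == setA.symmetric_difference(setB))  # True

# methods accept any iterable, here a list
with_list = setA.union([10, 11])
print(sorted(with_list))  # [2, 3, 5, 7, 10, 11]

# operators only work between sets
try:
    setA | [10, 11]
except TypeError as error:
    print("TypeError:", error)
```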

In [14]:
[1, 2, 3}

SyntaxError: closing parenthesis '}' does not match opening parenthesis '['

### Culinary example for Set Theory! 

For a set of 3 dishes and their ingredients, let's find similarities and differences

In [16]:
hummus = {"chickpeas", "olive oil", "garlic", "tahini", "lemon juice", "salt"}
cocido = {"chickpeas", "olive oil", "meat", "chicken", "veggies", "salt"}
allioli = {"olive oil", "garlic", "salt"}

In [17]:
# what's common to the three dishes?

hummus & cocido & allioli

{'olive oil', 'salt'}

In [18]:
# what's in hummus that's not in allioli
hummus - allioli

{'chickpeas', 'lemon juice', 'tahini'}

In [19]:
# what are the differential ingredients in cocido and hummus?

cocido ^ hummus

{'chicken', 'garlic', 'lemon juice', 'meat', 'tahini', 'veggies'}

In [20]:
# if we want to prepare hummus, allioli and cocido, which ingredients should we buy?

hummus | allioli | cocido

{'chicken',
 'chickpeas',
 'garlic',
 'lemon juice',
 'meat',
 'olive oil',
 'salt',
 'tahini',
 'veggies'}

### Updating sets

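Sets are mutable, but since they have no order we can't replace an item by position. Instead, we add a single item with `add()`, add several items with `update()`, and delete an item with `remove()`. All of these modify the set *in place*.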

In [23]:
cocido.update(allioli)  # in-place union: cocido now also contains allioli's items

cocido

{'chicken', 'chickpeas', 'garlic', 'meat', 'olive oil', 'salt', 'veggies'}

In [26]:
my_set = {1, 2, 3}

my_set.add(4)

my_set

{1, 2, 3, 4}

In [27]:
my_set = {"a", "b"}

my_set.update({"name", "asdfqsd"})

my_set

{'a', 'asdfqsd', 'b', 'name'}

In [28]:
my_set.remove("a")

In [29]:
my_set

{'asdfqsd', 'b', 'name'}

In [None]:
my_set.update({"a"})

my_set

{'a', 'asdfqsd', 'b', 'name'}
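
To summarise the methods used above: `add()` inserts a single item, `update()` inserts every item of an iterable, and `remove()` deletes an item but raises a `KeyError` if it is missing. `discard()`, which is not used earlier in this session, deletes an item only if it is present. A short sketch with made-up values:

```python
colors = {"red", "green"}

colors.add("blue")               # one item
colors.add("red")                # already present, nothing changes
colors.update(["black", "red"])  # several items from any iterable

colors.remove("green")           # would raise KeyError if "green" were missing
colors.discard("purple")         # no error even though "purple" is missing

print(sorted(colors))  # ['black', 'blue', 'red']
```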

## Dictionaries

Dictionaries are a very flexible way of mapping keys to values. They are a kind of super container: their values can be other containers, such as lists or other dictionaries.

We define dictionaries with `dict()` or with curly brackets `{}`, like sets, **but** we specify keys and values.

Dictionaries are mutable and, since Python 3.7, they preserve the order in which items were inserted.

Let's create a few dictionaries using `{}`:

In [None]:
super_dict = {"key_list": [1, 2, 3], "key_dict": {"a": 1, "b": 2}}

super_dict

{'key_list': [1, 2, 3], 'key_dict': {'a': 1, 'b': 2}}

In [None]:
list_users = [{"username": "dani", "age": 37}, {"username": "luis", "age": 27}]

list_users

[{'username': 'dani', 'age': 37}, {'username': 'luis', 'age': 27}]

In [None]:
list_users[0]["username"]

'dani'

In [None]:
purchases_dict = {"purchases": [10, 12, 45, 67, 34], "date": ["2025-05-03"]}

purchases_dict

{'purchases': [10, 12, 45, 67, 34], 'date': ['2025-05-03']}

In [33]:
my_dict = {
    "key1": "a",
    "key2": "b",
    "key3": "c",
}

my_dict

{'key1': 'a', 'key2': 'b', 'key3': 'c'}

Once the dictionary is created, we can add new key-value pairs using square brackets `[]` and the assignment operator `=`.

However, a dictionary can't hold duplicate keys. If we assign to a key that already exists, its value is overwritten.

In [34]:
my_dict["key3"]

'c'

In [None]:
me = {
    "name": "dani",
    "age": 37,
}

me[1]

KeyError: 1

In [None]:
super_dict = {"keys": ["a", "b", "c"], "values": [1, 2, 3]}

In [39]:
my_dict["key4"] = "d"

my_dict

{'key1': 'a', 'key2': 'b', 'key3': 'c', 'key4': 'd'}

We can nest dictionaries, lists, sets, and tuples inside dictionaries.

In [41]:
dict_ids = {
    1: {"name": "dani", "email": "asdfasdf@qdsfasdf.com"},
    2: {"name": "pepe", "email": "asdfaqsdfasdfsdf@qdsfasdf.com"},
}

In [None]:
dict_ids[1]["email"]

'asdfasdf@qdsfasdf.com'

We can extract the values associated with a key using square brackets `[]` or the method `get()`

In [44]:
ages_dict = {
    "dani": 37,
    "churro": 12,
}

ages_dict.get("diego")

In [45]:
ages_dict["diego"]

KeyError: 'diego'

In [None]:
ages_dict.get("jorge")

The difference between the two is that if the key doesn't exist, `[]` will raise an error, while `get()` will return `None`.

In [None]:
ages_dict["plant"]

In [None]:
ages_dict.get("plant")

In [None]:
age_plant = ages_dict.get("plant")

type(age_plant)
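
`get()` also accepts a second argument: a default value that is returned instead of `None` when the key is missing. A minimal sketch reusing `ages_dict`; the default of `0` is an arbitrary choice for this example.

```python
ages_dict = {
    "dani": 37,
    "churro": 12,
}

print(ages_dict.get("dani"))      # 37
print(ages_dict.get("plant"))     # None
print(ages_dict.get("plant", 0))  # 0

# `in` checks whether a key exists without raising an error
print("plant" in ages_dict)  # False
```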

In [2]:
ages_dict = {
    "dani": 37,
    "churro": 12,
}

ages_dict["diego"] = 22

ages_dict

{'dani': 37, 'churro': 12, 'diego': 22}

In [None]:
# update

ages_dict = {
    "dani": 37,
    "churro": 12,
}

new_additions = {"aaa": 99, "bbb": 99}
new_additions_2 = {"aaaa": 99, "bbba": 99}

# update() works in place and returns None, so calls can't be chained
ages_dict.update(new_additions)

ages_dict

{'dani': 37, 'churro': 12, 'aaa': 99, 'bbb': 99}

In [5]:
ages_dict.update(new_additions_2)

ages_dict

{'dani': 37, 'churro': 12, 'aaa': 99, 'bbb': 99, 'aaaa': 99, 'bbba': 99}

In [None]:
new_additions = {"aaa": 99, "bbb": 99}

new_additions["ccc"] = 98
new_additions["ccc"] = 98  # assigning the same key again overwrites it, no duplicate is created

new_additions

{'aaa': 99, 'bbb': 99, 'ccc': 98}

In [None]:
new_additions.update({"ccc": 98})

We can also build a dictionary using `dict()` and passing to it a sequence of key-value pairs, like a list of tuples:

In [6]:
another_dict = dict([("dani", 35), ("churro", 12)])

another_dict

{'dani': 35, 'churro': 12}

another_dict.keys(), another_dict.values()

In [8]:
list(zip(["a", "b", "c"], [1, 2, 3]))

[('a', 1), ('b', 2), ('c', 3)]

In [9]:
dict(zip(["a", "b", "c"], [1, 2, 3]))  # zip(keys, values)

{'a': 1, 'b': 2, 'c': 3}

In [10]:
dict(enumerate(["a", "b", "c"]))

{0: 'a', 1: 'b', 2: 'c'}

In [11]:
another_dict

{'dani': 35, 'churro': 12}

In [54]:
print(another_dict.keys())
print(another_dict.values())

dict_keys(['dani', 'churro'])
dict_values([35, 12])


In [14]:
{"a": 1, "a": 2}

{'a': 2}

In [13]:
ks = another_dict.keys()
vs = another_dict.values()

dict(zip(vs, ks))

{35: 'dani', 12: 'churro'}

In [15]:
another_dict

{'dani': 35, 'churro': 12}

The method `items()` returns the keys and values together as a view of `(key, value)` tuples, which we can convert to a list with `list()`.

This will be very useful when looping through dictionaries.

In [None]:
list_tup = another_dict.items()

list_tup

dict_items([('dani', 35), ('churro', 12)])
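
For example, a `for` loop can unpack each pair returned by `items()` into two variables. A small sketch using the same `another_dict`; the variable names `name` and `age` are just illustrative.

```python
another_dict = {"dani": 35, "churro": 12}

lines = []
for name, age in another_dict.items():
    # each item is a (key, value) tuple, unpacked into name and age
    lines.append(f"{name} is {age} years old")

print(lines)  # ['dani is 35 years old', 'churro is 12 years old']
```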

## Exercises

* Create a dictionary containing the letters of the alphabet as keys, and their position in the alphabet as values
* Use the dictionary to calculate the sum of the positions of the letters in the word "hello"
* Represent a train as a dictionary, knowing that it contains the following parts
  * A train contains 4 cars
  * A car contains 10 rows
  * A row contains 4 seats

### Keys and values

Dictionaries, as mentioned, have `keys` and `values`. We can access those independently using the methods `.keys()` and `.values()` on `dict` objects

In [70]:
ages_dict

{'dani': 37, 'churro': 12, 'aaa': 99, 'bbb': 99}

In [71]:
keys = list(ages_dict.keys())

In [73]:
values = list(ages_dict.values())

In [None]:
# swap keys and values with zip; the repeated value 99 ends up as a single key

dictionary = dict(zip(values, keys))

dictionary

{37: 'dani', 12: 'churro', 99: 'bbb'}

In [None]:
vs = ages_dict.values()

sum(vs)

### Updating dictionaries

Dictionaries, like sets, have an `update()` method, which adds several key-value pairs at once.

In [None]:
# the usual way, 1 by 1
ages_dict["car"] = 10

ages_dict

In [None]:
# let's add two more entries to our ages dictionary: pepe (33 y.o.) and plant (2 y.o.)
ages_dict.update({"pepe": 33, "plant": 2})
ages_dict

In [None]:
# a value can itself be a dictionary
ages_dict['dani'] = {
    'a': 1,
    'b': 2,
}

ages_dict['dani']['b'] = 10

ages_dict

In [None]:
ages_dict["dani"] = 9

ages_dict

Or we can add individual items this way:

```Python
dictionary[new_key] = new_value
```
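
`update()` also overwrites the value of any key that already exists, and it accepts a list of key-value tuples as well as a dictionary. A minimal sketch with made-up entries:

```python
ages = {"dani": 37, "churro": 12}

# "churro" already exists, so its value is replaced; "pepe" is new
ages.update({"churro": 13, "pepe": 33})

# a list of (key, value) tuples works too
ages.update([("plant", 2)])

print(ages)  # {'dani': 37, 'churro': 13, 'pepe': 33, 'plant': 2}
```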

### Sorting a dictionary

Dictionaries have no `sort()` method, but we can build a sorted copy in a few steps:
1. Use `dict.items()` to get the key-value pairs
2. Convert them into a list of tuples
3. Sort the list; by default, tuples are compared by their first element (the key)
4. Convert the sorted list of tuples into a dictionary again


In [76]:
sorted(ages_dict)

['aaa', 'bbb', 'churro', 'dani']

In [None]:
new_dict_items = ages_dict.items()  # 1: view of (key, value) pairs

print(new_dict_items)

list_new_dict_items = list(new_dict_items)  # 2: list of tuples

print(list_new_dict_items)

sorted_list = sorted(list_new_dict_items)  # 3: sorted() returns a new list

print(list_new_dict_items)  # the original list is unchanged
print(sorted_list)

sorted_ages_dict = dict(sorted_list)  # 4: back to a dictionary

print(sorted_ages_dict)

In [80]:
ages_dict

{'dani': 37, 'churro': 12, 'aaa': 99, 'bbb': 99}

In [85]:
# Sort ages_dict by age (the second element of each item), in descending order
sorted_people_dict = dict(sorted(ages_dict.items(), key=lambda x: x[1], reverse=True))

sorted_people_dict

{'aaa': 99, 'bbb': 99, 'dani': 37, 'churro': 12}

In [None]:
# Same descending sort by age, this time negating the value instead of using reverse=True
sorted_people_dict = dict(sorted(ages_dict.items(), key=lambda item: -item[1]))

sorted_people_dict
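
When several values are tied (like `aaa` and `bbb` above), we may want a second criterion. One possible approach, sketched below, is to return a tuple from the key function: sort by age in descending order, then by name alphabetically. The helper name `sort_by_age_then_name` is hypothetical, and the example assumes all values are numbers.

```python
def sort_by_age_then_name(ages):
    """Return a new dict sorted by age (descending), ties broken by name (ascending)."""
    # negating the age reverses the numeric order while names stay ascending
    return dict(sorted(ages.items(), key=lambda item: (-item[1], item[0])))


ages_dict = {"dani": 37, "churro": 12, "bbb": 99, "aaa": 99}

print(sort_by_age_then_name(ages_dict))
# {'aaa': 99, 'bbb': 99, 'dani': 37, 'churro': 12}
```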