# Session 6: sets and dictionaries

## Sets

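Sets are unordered collections of unique items. They're defined in a similar way to lists and tuples, but with curly brackets, like `{1, 2, 3}`, or with `set()`. Note that empty curly brackets, `{}`, create an empty dictionary, so an empty set has to be created with `set()`.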

```Python
my_set = {1, 2, 3}
```
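As a quick check of the difference between empty curly brackets and `set()`, here is a minimal sketch (the names `empty_dict` and `empty_set` are just chosen for illustration):

```python
empty_dict = {}     # empty curly brackets create a dictionary, not a set
empty_set = set()   # this is how an empty set is created

print(type(empty_dict))  # <class 'dict'>
print(type(empty_set))   # <class 'set'>
```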

The principal characteristic of sets is that they can only contain *unique* items.

In [None]:
set1 = {1, 1, 2, 2, 3, 3}

set1

In [None]:
set2 = set([1, 1, 2, 2, 3, 3])

set2

Say we want to check how many unique characters there are in a string:
* "lava" has 3 unique characters: "l", "a", "v"
* "savannah" has 5 unique characters: "s", "a", "v", "n", "h"

If we want to get the *unique characters* in a string we can use a for loop:

In [None]:
str_to_check = "savannah"

unique_chars = []
for character in str_to_check:
    if character in unique_chars:
        continue
    else:
        unique_chars.append(character)
        
unique_chars

Or we can use `set()` directly on the string, since a string is already a sequence of characters:

In [None]:
set(str_to_check)
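To answer the original question (how many unique characters?), one option is to take the length of the set. A minimal sketch, where `n_unique` and `n_unique_lava` are names made up for this example:

```python
str_to_check = "savannah"

# a set keeps one copy of each character, so its length is the unique count
n_unique = len(set(str_to_check))
print(n_unique)  # 5

n_unique_lava = len(set("lava"))
print(n_unique_lava)  # 3
```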

### Basic set theory

Since we have learned how to build containers that store unique items, we can use some Set Theory on them.

Say we have two sets: `A` and `B`

* Union of `A` and `B`: items appearing in either. Use `A | B` or `A.union(B)`
* Intersection of `A` and `B`: items appearing in both. Use `A & B` or `A.intersection(B)`
* Difference of `A` and `B`: items in `A` but not in `B`. Use `A - B` or `A.difference(B)`
* Symmetric difference of `A` and `B`: items in only `A` or only `B`, but not in both. Use `A ^ B` or `A.symmetric_difference(B)`

In [None]:
setA = {2, 3, 5, 7}
setB = {1, 3, 5, 7, 9}

In [None]:
# union: items appearing in either
setA | setB

In [None]:
# intersection: items appearing in both
setA & setB

In [None]:
# difference: items in setA but not in setB
setA - setB

In [None]:
# symmetric difference: items appearing in only one set
setA ^ setB
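The same four operations can also be written with the method names. A small sketch using the same two sets; the result variable names are chosen just for this example:

```python
setA = {2, 3, 5, 7}
setB = {1, 3, 5, 7, 9}

union_ab = setA.union(setB)                 # same as setA | setB
inter_ab = setA.intersection(setB)          # same as setA & setB
diff_ab = setA.difference(setB)             # same as setA - setB
sym_ab = setA.symmetric_difference(setB)    # same as setA ^ setB

# the methods also accept other iterables, such as a list,
# while the operators need a set on both sides
union_with_list = setA.union([11, 13])
```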

### Culinary example for Set Theory! 

For a set of 3 dishes and their ingredients, let's find similarities and differences:

In [None]:
hummus = {"chickpeas", "olive oil", "garlic", "tahini", "lemon juice", "salt"}
cocido = {"chickpeas", "olive oil", "meat", "chicken", "veggies", "salt"}
allioli = {"olive oil", "garlic", "salt"}

In [None]:
# what's common to the three dishes?

hummus & cocido & allioli

In [None]:
# what's in hummus that's not in allioli
hummus - allioli

In [None]:
# which ingredients appear in only one of cocido and hummus?

cocido ^ hummus

In [None]:
# if we want to prepare hummus and allioli, which ingredients should we buy?

hummus | allioli

### Updating sets

Sets are mutable: we can add the items of another collection to a set with `update()`, which modifies the set *in place*.

In [None]:
cocido.update(allioli)

cocido
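Besides `update()`, sets have methods for adding or removing a single item in place. A minimal sketch; the set `sauce` and the extra ingredient `"egg"` are made up for this example:

```python
sauce = {"olive oil", "garlic", "salt"}

sauce.add("egg")         # add a single item in place
sauce.add("salt")        # already present, so nothing changes
sauce.discard("egg")     # remove an item
sauce.discard("pepper")  # discard() does not raise an error if the item is missing

print(sauce)
```

`remove()` works like `discard()`, but it raises a `KeyError` when the item is not in the set.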

## Dictionaries

Dictionaries are a very flexible way of mapping keys to values. They can also hold other containers, such as lists or other dictionaries, as values.

We define dictionaries with `dict()` or with curly brackets `{}`, like sets, **but** we specify both keys and values.

Let's create a dictionary with people's names and ages using `{}`:

In [None]:
people_dict = {
    "dani": 33,
    "jean": 30,
    "pablo": 25,
    "hannah": 75
}

people_dict

We can also build a dictionary using `dict()` and passing to it a sequence of key-value pairs, like a list of tuples:

In [None]:
another_dict = dict([("dani", 33), ("jean", 30)])

another_dict

### Keys and values

Dictionaries, as mentioned, have `keys` and `values`. We can access those independently using the `.keys()` and `.values()` methods on `dict` objects.

In [None]:
ks = people_dict.keys()

ks

In [None]:
vs = people_dict.values()

vs
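To get the value assigned to a single key, we can use square brackets or the `get()` method. A minimal sketch, using a dictionary called `ages` that is made up for this example:

```python
ages = {"dani": 33, "jean": 30, "pablo": 25, "hannah": 75}

print(ages["dani"])           # 33
print(ages.get("jean"))       # 30
print(ages.get("nobody"))     # None: get() does not fail on a missing key
print(ages.get("nobody", 0))  # 0: we can pass our own default value
print("pablo" in ages)        # True: `in` checks the keys
```

With square brackets, a missing key raises a `KeyError`, so `get()` is handy when we're not sure the key exists.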

Another way to build dictionaries is to use `zip` with two containers, one for the keys and one for the values:

```Python
new_dict = dict(zip(keys, values))
```

In [None]:
words = ["hola", "adios", "bye"]  # keys
len_words = [4, 5, 3]  # values

words_dict = dict(zip(words, len_words))

words_dict

### Updating dictionaries

Dictionaries, like sets, have an `update()` method:

In [None]:
# let's add another entry to our people-age dictionary: paco, 45 years old
people_dict.update({"paco": 45})

people_dict

Or we can add individual items this way:

```Python
dictionary[new_key] = new_value
```

In [None]:
people_dict["rose"] = 22

people_dict
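Keys are unique, so both ways of adding items will overwrite the value when the key already exists. A small sketch with a made-up dictionary `ages`:

```python
ages = {"dani": 33, "jean": 30}

# "jean" already exists, so its value is replaced; "paco" is new, so it's added
ages.update({"jean": 31, "paco": 45})

# the same happens with square brackets
ages["dani"] = 34

print(ages)  # {'dani': 34, 'jean': 31, 'paco': 45}
```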

### Looping through dictionaries

When looping through lists, tuples or sets, we can just write:

```Python
for item in my_list:
    something_with_item
    
for item in my_tuple:
    something_with_item
    
for item in my_set:
    something_with_item
```

But if we do the same with a dictionary, we're going to loop only through the keys:

In [None]:
for item in people_dict:
    print(item)

So in order to loop over *both* the keys and their assigned values, we call the `items()` method on our dict. `items()` returns a view of the dictionary's key-value pairs, each one a tuple:

```Python
my_dict.items() 
# dict_items([(key_1, value_1), (key_2, value_2), ..., (key_n, value_n)])
```

And now we can loop through each pair, unpacking it into a key and a value:


```Python
for key, value in my_dict.items():
    something_with_key_and_value
```

In [None]:
# printing name and age in the same string for each key-value pair
for name, age in people_dict.items():
    print(f"{name} is {age} years old")
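If we only need the values, we can loop over `values()` instead, and if we want an actual list of pairs we can pass `items()` to `list()`. A minimal sketch; `ages`, `pairs` and `total_age` are names made up for this example:

```python
ages = {"dani": 33, "jean": 30, "pablo": 25}

# turn the key-value pairs into a regular list of tuples
pairs = list(ages.items())
print(pairs)  # [('dani', 33), ('jean', 30), ('pablo', 25)]

# loop only through the values to add them up
total_age = 0
for age in ages.values():
    total_age += age

print(total_age)  # 88
```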

### Nested dictionaries

Just like lists, we can nest dictionaries:

In [None]:
nested_dict = {
    "person_1": {"name": "daniel", "age": 33},
    "person_2": {"name": "pepe", "age": 30}
}

nested_dict

In [None]:
nested_dict["person_1"]

In [None]:
nested_dict["person_2"]["age"]
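Looping through a nested dictionary combines `items()` with key access on the inner dictionaries. A small sketch with the same structure as above; `people`, `person_id`, `info` and `names` are names chosen for this example:

```python
people = {
    "person_1": {"name": "daniel", "age": 33},
    "person_2": {"name": "pepe", "age": 30},
}

names = []
for person_id, info in people.items():
    # info is the inner dictionary, so we can access its keys as usual
    names.append(info["name"])
    print(f"{person_id}: {info['name']} is {info['age']} years old")

print(names)  # ['daniel', 'pepe']
```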

### Sorting a dictionary

Dictionaries are not sorted by key or by value. Since Python 3.7 they keep their items in insertion order, and their fast lookups come from hashing the keys, not from any ordering.

But we can use some tricks to sort them according to what we want:
1. Get the key-value pairs with `dict.items()`
2. Specify the sorting criterion with the `key` argument and a function, for example a `lambda`
3. Sort the pairs with `sorted()`
4. Build a new dict from the sorted pairs and save it

In [None]:
# Sort people_dict according to the names (first element of each key-value pair)
sorted_people_dict = dict(sorted(people_dict.items(), key=lambda item: item[0]))

sorted_people_dict

In [None]:
# Sort people_dict according to the age (second element of each key-value pair)
sorted_people_dict = dict(sorted(people_dict.items(), key=lambda item: item[1]))

sorted_people_dict

In [None]:
# Sort people_dict according to the age (second element of each key-value pair), descending order
# reverse=True also works for values that can't be negated, like strings
sorted_people_dict = dict(sorted(people_dict.items(), key=lambda item: item[1], reverse=True))

sorted_people_dict
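If we only want the key with the largest or smallest value, we don't need to sort the whole dictionary: `max()` and `min()` accept the same kind of `key` function. A minimal sketch with a made-up dictionary `ages`:

```python
ages = {"dani": 33, "jean": 30, "pablo": 25, "hannah": 75}

# iterating a dict yields its keys; ages.get maps each key to its value,
# so the keys are compared by their values
oldest = max(ages, key=ages.get)
youngest = min(ages, key=ages.get)

print(oldest)    # hannah
print(youngest)  # pablo
```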

### Exercises: Spotify data

Let's use some of my streaming data from Spotify and analyze it!

The original format is JSON (we'll learn about this, basically it's nested lists and dictionaries). We're gonna convert it into a pure dictionary with the following structure:

```Python
{..., 
    song_n:{
        "endTime": endTime,
        "artistName": artistName,
        "trackName": trackName,
        "msPlayed": msPlayed
    },
...}
```

In [None]:
# we'll go through this in the next session, for now just follow the steps
import json
with open("spotify.json") as json_file:
    json_data = json.load(json_file)
    
# some preparation of the data
spotify = {}

for i, item in enumerate(json_data):
    spotify[f"song_{i}"] = item
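A pattern that may help with some of the questions below is accumulating a total per key with `get()`. This is only a sketch on toy data: the list `plays`, the artist names and the numbers are all made up, and only the field names follow the structure shown above.

```python
# toy data, NOT taken from the real file
plays = [
    {"artistName": "artist_a", "msPlayed": 1000},
    {"artistName": "artist_b", "msPlayed": 500},
    {"artistName": "artist_a", "msPlayed": 2000},
]

ms_per_artist = {}
for play in plays:
    artist = play["artistName"]
    # start from 0 the first time we see an artist, then keep adding
    ms_per_artist[artist] = ms_per_artist.get(artist, 0) + play["msPlayed"]

print(ms_per_artist)  # {'artist_a': 3000, 'artist_b': 500}
```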

1. How much time does this file span?

2. How many different artists have I listened to in that period?

3. How many different songs have I listened to?

4. How much time have I spent listening to music?

5. Which is the most listened-to artist?

6. Which is the most listened-to song?