<a href="https://colab.research.google.com/github/krauseannelize/nb-py-ms-exercises/blob/main/notebooks/24_exercises_set.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 24 | Exercises - Set

**Sets** are used when you need to store an unordered collection of unique items. Since they are unordered, elements do not have indices and cannot be accessed by position.
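
As a minimal sketch of what "no indices" means in practice, the snippet below tries to index a set and catches the resulting error. The set `colors` and the flag `indexing_worked` are illustrative names, not part of the exercises.

```python
# Hypothetical example set; the name `colors` is illustrative only
colors = {"red", "green", "blue"}

try:
    colors[0]  # sets do not support indexing
    indexing_worked = True
except TypeError as error:
    indexing_worked = False
    print("Indexing failed:", error)

# Iteration still visits every element, but in no guaranteed order
for color in colors:
    print(color)
```

To read a specific position, convert the set to a list first (for example with `sorted(colors)`), keeping in mind that the set itself carries no order.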

## Creating a Set

In [1]:
# Method 1 | Using curly brackets {}
# Useful for initializing a set where some values are already known
syntax_set = {1, 2, 3}
print(syntax_set)
print(type(syntax_set))

{1, 2, 3}
<class 'set'>


In [5]:
# Method 2 | Using the set function
# Useful when you want to create an empty set or convert a list to a set
syntax_set = set([1, 2, 3])
print(syntax_set)
print(type(syntax_set))

{1, 2, 3}
<class 'set'>


Because empty curly brackets `{}` create a dictionary, a set literal needs at least one element. Use `set()` to create an empty set.

In [8]:
# no element included, dictionary created
empty_set = {}
print(empty_set)
print(type(empty_set))

{}
<class 'dict'>


In [12]:
empty_set = set()
print(empty_set)
print(type(empty_set))

set()
<class 'set'>


## No Duplicates

Duplicate elements are not allowed in sets and will be ignored.

In [7]:
# duplicates removed
duplicate_set = {1, 1, 2, 2, 3, 3}
print(duplicate_set)

# element 1 exists so addition not successful
duplicate_set.add(1)
print(duplicate_set)

# element 4 does not exist so addition successful
duplicate_set.add(4)
print(duplicate_set)

{1, 2, 3}
{1, 2, 3}
{1, 2, 3, 4}


## Set Methods

| Method | Description |
| --- | --- |
| `add` | Adds an element to the set (no effect if it is already present) |
| `remove` | Removes an element from the set; raises `KeyError` if it is missing |
| `pop` | Removes and returns an arbitrary element from the set |
| `clear` | Removes all elements from the set |

In [15]:
manipulation_set = {1, 2, 3, 4}
print("Before:", manipulation_set)

manipulation_set.add(5)
print("After add:", manipulation_set)

manipulation_set.remove(3)
print("After remove:", manipulation_set)

removed_element = manipulation_set.pop()
print("After pop:", manipulation_set)
print("Removed element:", removed_element)

manipulation_set.clear()
print("After clear:", manipulation_set)

Before: {1, 2, 3, 4}
After add: {1, 2, 3, 4, 5}
After remove: {1, 2, 4, 5}
After pop: {2, 4, 5}
Removed element: 1
After clear: set()
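
One detail the run above does not show is what happens when the element to remove is missing. The sketch below contrasts `remove` with the built-in `discard` method, which is not covered in the table; the variable names are illustrative.

```python
numbers = {1, 2, 3}

# remove() raises KeyError when the element is not in the set
try:
    numbers.remove(10)
    remove_raised = False
except KeyError:
    remove_raised = True
print("remove raised KeyError:", remove_raised)

# discard() leaves the set unchanged when the element is not in the set
numbers.discard(10)
print("After discard:", numbers)

# pop() on an empty set also raises KeyError
try:
    set().pop()
    pop_raised = False
except KeyError:
    pop_raised = True
print("pop on empty set raised KeyError:", pop_raised)
```

Prefer `discard` when a missing element is not an error, and `remove` when it should be treated as one.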


## Common Set Use Cases

The most common use case for sets is removing duplicates from a list.

⚠️ Be aware of the unordered nature of sets when the order of the data is important.

In [18]:
original_list = [4, 1, 2, 3, 3, 2, 1, 4]
new_set = set(original_list)
print("Original list:", original_list)
print("Converted to a set:", new_set)
# order 4, 1, 2, 3 not retained

converted_list = list(new_set)
print("Converted back to a list:", converted_list)

Original list: [4, 1, 2, 3, 3, 2, 1, 4]
Converted to a set: {1, 2, 3, 4}
Converted back to a list: [1, 2, 3, 4]
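
When the original order matters, one possible alternative (not part of the original exercise) is `dict.fromkeys`, which keeps the first occurrence of each item because dictionaries preserve insertion order in Python 3.7 and later. The name `ordered_unique` is illustrative.

```python
original_list = [4, 1, 2, 3, 3, 2, 1, 4]

# Dictionary keys are unique and keep insertion order,
# so the first occurrence of each value survives in place
ordered_unique = list(dict.fromkeys(original_list))
print(ordered_unique)  # [4, 1, 2, 3]
```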


## Membership Testing

The `in` and `not in` operators can be used to test membership in a set.

In [20]:
members_set = {"Adam", "John", "Peter"}

if "Adam" in members_set:
    print("Adam is a member")
else:
    print("Adam is not a member")

Adam is a member


## Sets Versus Lists & Dictionaries

### Similarities

- mutable
- dynamic sizing
- iterable

### Differences from Lists

| Sets | Lists |
| --- | --- |
| Elements must be unique | Duplicates allowed |
| Elements must be hashable (in practice, immutable types such as numbers, strings, and tuples) | Any data type |
| Unordered | Ordered |
| Faster membership testing | Slower membership testing |
| Slower iteration | Faster iteration |
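
The element-type restriction in the table can be checked directly. This is a small sketch with illustrative names: immutable values such as tuples and `frozenset` objects are accepted, while a list is rejected with a `TypeError` because it is not hashable.

```python
# Numbers, strings, tuples, and frozensets can all be set elements
valid_set = {1, "two", (3, 4), frozenset({5, 6})}
print(len(valid_set))  # 4

# A list is mutable and unhashable, so it cannot be a set element
try:
    invalid_set = {1, [2, 3]}
    list_accepted = True
except TypeError as error:
    list_accepted = False
    print("Rejected:", error)
```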

### Differences from Dictionaries

| Sets | Dictionaries |
| --- | --- |
| Elements must be hashable (in practice, immutable types) | Keys must be hashable, values can be any type |
| Elements must be unique | Keys must be unique, values can be duplicated |
| Finding an element by value is fast | Finding an element by key is fast, but by value is slow |
| Unordered | Insertion-ordered |

## Set Operations

### Union

Combines all unique elements from both sets.

💡 Think SQL `UNION`

In [27]:
first_set = {1, 2, 3, 4}
second_set = {3, 4, 5, 6}

union_set = first_set.union(second_set)
print(union_set)

{1, 2, 3, 4, 5, 6}
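
The same result can also be written with the `|` operator. As a brief sketch: the operator requires both operands to be sets, whereas the `union` method accepts any iterable. The names `operator_union` and `method_union` are illustrative.

```python
first_set = {1, 2, 3, 4}
second_set = {3, 4, 5, 6}

# Operator form: both sides must be sets
operator_union = first_set | second_set
print(operator_union)

# Method form: the argument can be any iterable, such as a list
method_union = first_set.union([5, 6, 7])
print(method_union)
```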


### Intersection

Finds elements that are common to both sets.

💡 Think SQL `INNER JOIN`

In [28]:
first_set = {1, 2, 3, 4}
second_set = {3, 4, 5, 6}

intersection_set = first_set.intersection(second_set)
print(intersection_set)

{3, 4}
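
The `&` operator is the shorthand for intersection, and `isdisjoint` answers the related question of whether two sets share anything at all. The sketch below uses illustrative variable names.

```python
first_set = {1, 2, 3, 4}
second_set = {3, 4, 5, 6}

# Operator form of intersection
operator_intersection = first_set & second_set
print(operator_intersection)

# isdisjoint() returns True when the sets have no elements in common
shares_nothing = first_set.isdisjoint({7, 8})
print(shares_nothing)  # True
```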


### Symmetric Difference

Finds elements that are in either set, but not in both.

💡 Think SQL `FULL OUTER JOIN` with a `WHERE` clause that excludes the `INNER JOIN` rows

In [29]:
first_set = {1, 2, 3, 4}
second_set = {3, 4, 5, 6}

symmetric_difference_set = first_set.symmetric_difference(second_set)
print(symmetric_difference_set)

{1, 2, 5, 6}
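
To make the definition concrete, the sketch below checks that the symmetric difference equals the union minus the intersection, and shows the `^` operator shorthand. Variable names are illustrative.

```python
first_set = {1, 2, 3, 4}
second_set = {3, 4, 5, 6}

# Operator form of symmetric difference
operator_result = first_set ^ second_set

# Everything in either set, minus what both sets share
manual_result = (first_set | second_set) - (first_set & second_set)

print(operator_result == manual_result)  # True
```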


## Exercise 1

Create a variable called `my_set` and assign it to a set with 5 numbers:

In [3]:
my_set = {1, 2, 3, 4, 5}
print(my_set)

{1, 2, 3, 4, 5}


## Exercise 2

Create a variable called `duplicate_set` and assign it to a set with the numbers 1, 1, 2, 2, 3, 3. What numbers will be in the set?

In [4]:
duplicate_set = {1, 1, 2, 2, 3, 3}
print(duplicate_set)

{1, 2, 3}


## Exercise 3

Given the set `my_set`, follow the instructions in the comments to solve this task.

```python
my_set = {1, 3, 5, 6}
# add a number that is already in the set

print(my_set)
# add a number that is NOT already in the set

print(my_set)
# pop one element out of the set

print(my_set)
# clear the set
```

In [16]:
my_set = {1, 3, 5, 6}
# add a number that is already in the set
my_set.add(1)
print(my_set)
# add a number that is NOT already in the set
my_set.add(4)
print(my_set)
# pop one element out of the set
my_set.pop()
print(my_set)
# clear the set
my_set.clear()
print(my_set)

{1, 3, 5, 6}
{1, 3, 4, 5, 6}
{3, 4, 5, 6}
set()


## Exercise 4

Remove the duplicates from the list `original_list` and store the result in the variable `my_unique_list`. Make sure `my_unique_list` is a list: `original_list = [3, 2, 1, 1, 2, 3, 4, 5]`

In [19]:
original_list = [3, 2, 1, 1, 2, 3, 4, 5]
my_unique_list = list(set(original_list))
print(type(my_unique_list))

<class 'list'>


## Exercise 5

Complete the function `is_friendly` that returns `True` if the insect is in the `friendly_insects` set and `False` if it is not: `friendly_insects = {"ladybug", "butterfly", "bee"}`

In [22]:
friendly_insects = {"ladybug", "butterfly", "bee"}

def is_friendly(insect):
    # The membership test already evaluates to True or False
    return insect in friendly_insects

print(is_friendly("ladybug"))
print(is_friendly("fly"))

True
False


## Exercise 6

Given a list of animals, use a set to check if there are duplicates in the list. Assign the result to the variable `has_duplicates`: `animals = ["ladybug", "butterfly", "bee", "grasshopper", "ladybug"]`

In [23]:
animals = ["ladybug", "butterfly", "bee", "grasshopper", "ladybug"]

has_duplicates = len(animals) != len(set(animals))
print(has_duplicates)

True
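
The length comparison says whether duplicates exist but not which items repeat. As a possible extension (not required by the exercise), a second set can track what has already been seen; `seen` and `duplicates` are illustrative names.

```python
animals = ["ladybug", "butterfly", "bee", "grasshopper", "ladybug"]

seen = set()        # every animal encountered so far
duplicates = set()  # animals encountered more than once

for animal in animals:
    if animal in seen:
        duplicates.add(animal)
    else:
        seen.add(animal)

print(duplicates)  # {'ladybug'}
```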


## Exercise 7

Find the union of the four sets and assign it to the variable `union_set`.

```python
set_1 = {1, 8, 5, 7}
set_2 = {1, 8, 2, 3, 4, 8}
set_3 = {1, 8, 5, 6, 8}
set_4 = {0, 1, 2, 3, 4, 5}

union_set = set_1
```

In [30]:
set_1 = {1, 8, 5, 7}
set_2 = {1, 8, 2, 3, 4, 8}
set_3 = {1, 8, 5, 6, 8}
set_4 = {0, 1, 2, 3, 4, 5}

union_set = set_1.union(set_2, set_3, set_4)
print(union_set)

{0, 1, 2, 3, 4, 5, 6, 7, 8}


## Exercise 8

Finn and his friends are trying to agree on what food they want to order. Find the food they all have in common and assign it to the variable `intersection_set`.

```python
food_finn_likes = {"sushi", "pizza", "thai", "greek"}
food_jake_likes = {"thai", "italian", "salad", "greek", "sushi"}
food_marceline_likes = {"chocolate", "cocoa", "thai", "marshmallows"}

intersection_set = food_finn_likes          # Add your code here

print(intersection_set)
```

In [31]:
food_finn_likes = {"sushi", "pizza", "thai", "greek"}
food_jake_likes = {"thai", "italian", "salad", "greek", "sushi"}
food_marceline_likes = {"chocolate", "cocoa", "thai", "marshmallows"}

intersection_set = food_finn_likes.intersection(food_jake_likes, food_marceline_likes)

print(intersection_set)

{'thai'}


## Exercise 9

You are planning a movie night where you’ll show _Wayne’s World 1_ and _Wayne’s World 2_. Some people have only purchased tickets to one movie, while others have purchased tickets to both.
Find the people who have only purchased tickets to one movie and assign them to the variable `offer_promotion_to`.

```python
movie_1 = {"Alice", "Bob", "David", "Ella", "Frank", "Isla", "James"}
movie_2 = {"Alice", "Bob", "Chloe", "Ella", "Frank", "Grace", "Henry", "James"}

offer_promotion_to = movie_1  ## Add your code here

print(offer_promotion_to)
```

In [32]:
movie_1 = {"Alice", "Bob", "David", "Ella", "Frank", "Isla", "James"}
movie_2 = {"Alice", "Bob", "Chloe", "Ella", "Frank", "Grace", "Henry", "James"}

offer_promotion_to = movie_1.symmetric_difference(movie_2)

print(offer_promotion_to)

{'Henry', 'David', 'Isla', 'Grace', 'Chloe'}


## Exercise 10

You want to count the number of unique visitors to your website. Given the following list of website visit IDs, create a collection with only unique visitor IDs and assign it to the variable `unique_visitors`.

```python
website_visitors = [524, 335, 306, 28, 42, 181, 463, 45, 45, 524, 28, 42]

unique_visitors = website_visitors # change this line

print(unique_visitors)
```

In [33]:
website_visitors = [524, 335, 306, 28, 42, 181, 463, 45, 45, 524, 28, 42]

unique_visitors = set(website_visitors)

print(unique_visitors)

{42, 524, 45, 335, 463, 306, 181, 28}
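
Since the stated goal is a count, a short follow-up sketch applies `len` to the set. The names `total_visits` and `unique_count` are illustrative additions.

```python
website_visitors = [524, 335, 306, 28, 42, 181, 463, 45, 45, 524, 28, 42]
unique_visitors = set(website_visitors)

total_visits = len(website_visitors)  # every visit, including repeats
unique_count = len(unique_visitors)   # each visitor ID counted once

print("Total visits:", total_visits)      # Total visits: 12
print("Unique visitors:", unique_count)   # Unique visitors: 8
```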


## Exercise 11

You have created a dating website and want to match people based on their interests. Given an individual (`main_person`) and two options (`option_1` and `option_2`), use set operations to determine which option shares the most interests with `main_person`.

```python
main_person = {"football", "wine", "reading", "travel", "swimming", "golf", "fashion", "long term dating"}
option_1 = {"movies", "math", "netflix", "short term dating", "fashion", "wine", "golf"}
option_2 = {"travel", "long term dating", "golf", "fashion"}

shared_interests_with_option_1 = main_person # change this line
print(shared_interests_with_option_1)

shared_interests_with_option_2 = main_person # change this line
print(shared_interests_with_option_2)

if len(shared_interests_with_option_1) > len(shared_interests_with_option_2):
    print("Option 1 is the best match")
elif len(shared_interests_with_option_1) < len(shared_interests_with_option_2):
    print("Option 2 is the best match")
else:
    print("Both options are equally good")
```

In [35]:
main_person = {"football", "wine", "reading", "travel", "swimming", "golf", "fashion", "long term dating"}
option_1 = {"movies", "math", "netflix", "short term dating", "fashion", "wine", "golf"}
option_2 = {"travel", "long term dating", "golf", "fashion"}

shared_interests_with_option_1 = main_person.intersection(option_1)
print(shared_interests_with_option_1)

shared_interests_with_option_2 = main_person.intersection(option_2)
print(shared_interests_with_option_2)

if len(shared_interests_with_option_1) > len(shared_interests_with_option_2):
    print("Option 1 is the best match")
elif len(shared_interests_with_option_1) < len(shared_interests_with_option_2):
    print("Option 2 is the best match")
else:
    print("Both options are equally good")

{'wine', 'fashion', 'golf'}
{'golf', 'fashion', 'long term dating', 'travel'}
Option 2 is the best match


## Exercise 12

When nominating a candidate for an election, a voter can write in any name. Only one nomination is needed for a candidate to appear on the ballot.

Given the following lists of write-in nominations from different polls, create a set with no repeat nominations and assign it to the variable `unique_nominations`.

```python
poll_a = ["Maxwell Sterling", "Maxwell Sterling", "Harriet Vane", "Leonora Quint", "Harriet Vane", "Maxwell Sterling"]
poll_b = ["Harriet Vane", "Vincent Thorne", "Harriet Vane", "Selina Morrow", "Harriet Vane", "Harriet Vane"]
poll_c = ["Selina Morrow", "Jasper Creed", "Selina Morrow", "Jasper Creed", "Selina Morrow", "Maxwell Sterling"]

unique_nominations = set()  # change this line

print(unique_nominations)
```

In [36]:
poll_a = ["Maxwell Sterling", "Maxwell Sterling", "Harriet Vane", "Leonora Quint", "Harriet Vane", "Maxwell Sterling"]
poll_b = ["Harriet Vane", "Vincent Thorne", "Harriet Vane", "Selina Morrow", "Harriet Vane", "Harriet Vane"]
poll_c = ["Selina Morrow", "Jasper Creed", "Selina Morrow", "Jasper Creed", "Selina Morrow", "Maxwell Sterling"]

unique_nominations = set(poll_a + poll_b + poll_c)

print(unique_nominations)

{'Leonora Quint', 'Harriet Vane', 'Selina Morrow', 'Jasper Creed', 'Vincent Thorne', 'Maxwell Sterling'}
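
An alternative sketch avoids building the intermediate concatenated list by passing the three lists straight to `union`, which accepts any number of iterables. The name `unique_nominations_alt` is illustrative.

```python
poll_a = ["Maxwell Sterling", "Maxwell Sterling", "Harriet Vane", "Leonora Quint", "Harriet Vane", "Maxwell Sterling"]
poll_b = ["Harriet Vane", "Vincent Thorne", "Harriet Vane", "Selina Morrow", "Harriet Vane", "Harriet Vane"]
poll_c = ["Selina Morrow", "Jasper Creed", "Selina Morrow", "Jasper Creed", "Selina Morrow", "Maxwell Sterling"]

# Start from an empty set and merge each poll into the result
unique_nominations_alt = set().union(poll_a, poll_b, poll_c)

# Same contents as the concatenation approach
print(unique_nominations_alt == set(poll_a + poll_b + poll_c))  # True
print(len(unique_nominations_alt))  # 6
```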
