# Sets

## Learning objectives
- Understand the nature of a set.
- Learn how to add to and update a set.
- Learn how to join sets using different methods.


## A Brief Introduction to Sets

- Sets are a data type in Python.
- They follow the rules of mathematical sets that you should already be familiar with.
- They are mutable and unordered, and they do not contain repeated items (items are unique).
- This means one common use of a set is to find all the unique items in a list, as we will see.
- Sets also have their own methods, with operations derived from mathematical sets.

We can define a set using the built-in `set()` constructor to convert, for example, a list into a set. If the list contains repeated elements, the duplicates are removed in the set.

In [18]:
my_set = set([1, 2, 3, 4, 4, 4, 6])
print(my_set)

{1, 2, 3, 4, 6}


Observe above that number 4 appears only once in the set.
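
As a small sketch of the unique-items use case mentioned earlier, the cell below removes duplicates from a list. The variable names and data are illustrative assumptions, not part of the original notebook. Because a set has no order, `sorted` is used to get a predictable list back.

```python
# Illustrative data; these names are hypothetical
readings = [3, 1, 3, 2, 1, 3]

# Converting to a set drops the repeated values
unique_readings = set(readings)

# A set has no order, so sort it to get a predictable list
unique_sorted = sorted(unique_readings)
print(unique_sorted)  # [1, 2, 3]
```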

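Also, observe that sets are displayed with curly brackets (`{}`). This gives a second way to define a set: write the items inside curly brackets when assigning to a variable. Note that empty curly brackets create an empty dictionary, not an empty set; use `set()` for an empty set.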

In [19]:
my_set = {1, 2, 3, 4, 4, 4, 6}
print(my_set)

{1, 2, 3, 4, 6}
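
One detail of the curly-bracket syntax is easy to trip over, so here is a minimal check of it. The variable names are hypothetical.

```python
empty_dict = {}     # empty curly brackets create a dict, not a set
empty_set = set()   # set() is the way to create an empty set

print(type(empty_dict))  # <class 'dict'>
print(type(empty_set))   # <class 'set'>
```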


As mentioned above, sets are unordered and mutable. Mutable means that we can change a set's contents, as we will see later in this notebook.

Unordered means that its elements don't have a specific order, and therefore sets can't be indexed.

In [20]:
# Trying to index a set
my_set[1]

TypeError: 'set' object is not subscriptable

After running the code above, we obtain a `TypeError`, because sets do not support indexing.
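
Although a set can't be indexed, there are other ways to inspect its contents. The cell below is a sketch of three common ones: a membership test with `in`, a loop, and conversion to a sorted list when positions are really needed. The name `as_list` is an illustrative choice.

```python
my_set = {1, 2, 3, 4, 6}

# Membership tests work without any index
print(4 in my_set)  # True
print(5 in my_set)  # False

# Looping visits every element, in no guaranteed order
for item in my_set:
    print(item)

# Convert to a sorted list when positions are needed
as_list = sorted(my_set)
print(as_list[1])  # 2
```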

### Set functions and methods

We can retrieve the number of elements in a set using the `len` function, just like with a list.

In [22]:
my_set = {1, 5, 3, 6, 7, 5, 4, 5, 5, 5, 6}
len(my_set)

6

We can also retrieve the minimum and maximum values in the set using the `min` and `max` functions.

In [23]:
min(my_set)

1
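
For completeness, here is a short sketch of `max` alongside two other built-in functions that accept a set, `sum` and `sorted`. It reuses the same values as `my_set` above; the other variable names are illustrative.

```python
my_set = {1, 5, 3, 6, 7, 5, 4, 5, 5, 5, 6}

largest = max(my_set)     # largest element
total = sum(my_set)       # duplicates were dropped, so each value counts once
ordered = sorted(my_set)  # returns a list, not a set

print(largest)  # 7
print(total)    # 26
print(ordered)  # [1, 3, 4, 5, 6, 7]
```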

#### <font size=+2>`.add()`</font>

`add` (as the name suggests) adds an item to the set. As mentioned, sets are unordered, so there is no position at which the item is inserted.

In [24]:
set_x = set()

print(set_x)

set_x.add(1)

print(set_x)

set_x.add(2)

print(set_x)

# if we add 2 again, we see the set does not change, as items in a set are unique

set_x.add(2)

print(set_x)

set()
{1}
{1, 2}
{1, 2}
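
The learning objectives also mention updating a set. The cell below is a minimal sketch of three other standard set methods: `update`, `remove`, and `discard`. The values used are illustrative.

```python
set_x = {1, 2}

# update adds every item from an iterable, skipping ones already present
set_x.update([2, 3, 4])
print(set_x == {1, 2, 3, 4})  # True

# remove deletes an item and raises KeyError if it is missing
set_x.remove(4)

# discard deletes an item but does nothing if it is missing
set_x.discard(10)
print(set_x == {1, 2, 3})  # True
```

Choosing between `remove` and `discard` mostly depends on whether a missing item should be treated as an error.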


#### <font size=+1>Mathematical Operations on Sets</font>

Sets in Python share the same principles as sets in maths, so you can use the same operations. 

The most common ones are union, intersection, difference, and symmetric difference.

<p align=center><img src=images/sets.png width=400></p>

#### <font size=+2>`.union()`</font>

`union` returns a new set containing all the items from both sets. The original sets are not modified.

In [27]:
set_1 = {'Dog', 'Cat', 'Platypus', 'Koala'}
set_2 = {'Crocodile', 'Hyena', 'Koala', 'Cat'}
print(set_1)
print(set_2)
union_set = set_1.union(set_2)
print(union_set)

{'Platypus', 'Koala', 'Dog', 'Cat'}
{'Crocodile', 'Koala', 'Hyena', 'Cat'}
{'Platypus', 'Crocodile', 'Dog', 'Koala', 'Hyena', 'Cat'}


Once again, the resulting set doesn't contain repeated values.
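
Python also provides the `|` operator for union. The sketch below checks that it matches `.union()` and that neither input set is changed; the variable names `union_method` and `union_operator` are illustrative.

```python
set_1 = {'Dog', 'Cat', 'Platypus', 'Koala'}
set_2 = {'Crocodile', 'Hyena', 'Koala', 'Cat'}

union_method = set_1.union(set_2)
union_operator = set_1 | set_2  # operator form of union

print(union_method == union_operator)  # True

# Both originals still hold four items each
print(len(set_1), len(set_2))  # 4 4
```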

#### <font size=+2>`.intersection()`</font>

`intersection` returns a set containing the items common to both sets.

In [28]:
inter_set = set_1.intersection(set_2)
print(inter_set)

{'Koala', 'Cat'}
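
The same result is available through the `&` operator, and `.intersection()` accepts more than one set. The sketch below shows both, plus `isdisjoint` for checking whether two sets share anything at all. `set_3` and the extra animal names are hypothetical additions for illustration.

```python
set_1 = {'Dog', 'Cat', 'Platypus', 'Koala'}
set_2 = {'Crocodile', 'Hyena', 'Koala', 'Cat'}
set_3 = {'Koala', 'Emu'}  # hypothetical third set

# The & operator gives the same result as .intersection()
print((set_1 & set_2) == set_1.intersection(set_2))  # True

# .intersection() accepts several sets at once
common_to_all = set_1.intersection(set_2, set_3)
print(common_to_all)  # {'Koala'}

# isdisjoint reports whether two sets share no items
print(set_1.isdisjoint({'Emu', 'Wombat'}))  # True
```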


#### <font size=+2>`.difference()`</font>

`difference` returns a set with the items that are in `set_1` but not in `set_2`. The order of the two sets matters.

In [None]:
# a.difference(b) returns the items in a that are NOT in b
differ_set = set_1.difference(set_2)
print(differ_set)
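
Unlike union and intersection, difference depends on which set comes first. The sketch below uses the `-` operator, which is equivalent to `.difference()`, in both directions. The names `in_1_only` and `in_2_only` are illustrative.

```python
set_1 = {'Dog', 'Cat', 'Platypus', 'Koala'}
set_2 = {'Crocodile', 'Hyena', 'Koala', 'Cat'}

in_1_only = set_1 - set_2  # items in set_1 that are not in set_2
in_2_only = set_2 - set_1  # items in set_2 that are not in set_1

print(in_1_only == {'Dog', 'Platypus'})     # True
print(in_2_only == {'Crocodile', 'Hyena'})  # True
```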

#### <font size=+2>`.symmetric_difference()`</font>

`symmetric_difference` returns a set with the items that are in either `set_1` or `set_2`, but not in both.

In [31]:
differ_set = set_1.symmetric_difference(set_2)
print(differ_set)

{'Crocodile', 'Platypus', 'Dog', 'Hyena'}
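
One way to read the symmetric difference is as the union with the intersection taken out. The sketch below checks that reading against the `^` operator, which is equivalent to `.symmetric_difference()`. The variable names are illustrative.

```python
set_1 = {'Dog', 'Cat', 'Platypus', 'Koala'}
set_2 = {'Crocodile', 'Hyena', 'Koala', 'Cat'}

sym_operator = set_1 ^ set_2  # operator form of symmetric difference

# Everything in either set, minus what they have in common
sym_manual = (set_1 | set_2) - (set_1 & set_2)

print(sym_operator == sym_manual)  # True
```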


## Summary
We now understand:
- The nature of lists and sets.
- The basic concept of mutability.
<br><br>

We now know:
- How to index and slice lists.
- List functions and methods including len(), .append(), .extend() etc.
- How to use a set to find the unique values in a list.

<br>

Please use this notebook as a reference, and refer to the links below for more information.

## Further reading
- List methods: https://docs.python.org/3/tutorial/datastructures.html
- Sets: https://docs.python.org/3/library/stdtypes.html#set