# Sets

You've already learned quite a bit about sets, so most of this is just a refresher, but you may still pick up a few new tricks here. In between and at the end there are some exercises that you can do together with your partner to see if you can put the theory into practice.

The next data structure that we're going to take a look at today is the ```set```. A set combines some of the awesome features of both the ```list``` and the ```dictionary```. A set is defined as an unordered, mutable collection of unique items. This means that a ```set``` is a data structure where you can store items without caring about their order, knowing that there will be at most one copy of each item in the structure.

This description, while highly informal, is rather spot on. Sets in Python are analogous to sets in math. For this reason, much of what you will hear when learning and talking about Python sets is similar to, if not exactly the same as, what applies to mathematical sets ([here's](https://en.wikipedia.org/wiki/Set_(mathematics)) the wiki on sets if you want a quick overview of them).

## Objectives
At the end of this notebook you should be able to:

- understand the difference between sets and lists
- apply mathematical operations on sets
- know when to use sets

Let's take a look at how we construct sets. The first way should look familiar. The second way not so much.

In [1]:
my_set = set([2, 4, 6])

In [4]:
second_set = {2, 4, 6}

In [5]:
# we create identical sets:
my_set == second_set

True

Here, we see the two ways we have to make sets. We can use the ```set``` constructor, which takes an iterable, or the syntactic sugar of curly braces. (**Note**: curly braces are also used for dictionaries. With dictionaries, a colon separates each key from its value, and this is how Python determines whether you're declaring a set or a dictionary. The only ambiguous case is an empty structure, where Python can't tell which of the two you want. For this reason, empty curly braces ```{}``` always mean an empty dictionary, and an empty set has to be written as ```set()```.) Sets with the same items in them will evaluate as equal.
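As a small illustration of the points above, the sketch below shows duplicates collapsing, order being irrelevant for equality, and the difference between ```{}``` and ```set()```. The variable names (```numbers```, ```empty_dict```, ```empty_set```) are made up for this example.

```python
# Duplicates collapse: a set keeps at most one copy of each item.
numbers = set([2, 4, 4, 6, 6, 6])
print(numbers == {2, 4, 6})    # True

# Order does not matter when comparing sets.
print({6, 4, 2} == {2, 4, 6})  # True

# Empty curly braces create a dictionary, not a set.
empty_dict = {}
empty_set = set()
print(type(empty_dict))        # <class 'dict'>
print(type(empty_set))         # <class 'set'>
```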

There are several methods available on sets.

As discussed earlier, many of these methods are similar to, if not the same as, those available to mathematical sets. Naturally, we see ways to compute set operations (```intersection()```, ```union()```, etc.) and alter the set (```add()```, ```update()```, ```pop()``` and ```remove()```). Let's take a look at some of these methods in action.

In [6]:
my_set, second_set = {1, 2, 3}, {5, 6, 7}

In [7]:
my_set.union(second_set)

{1, 2, 3, 5, 6, 7}

In [8]:
my_set.add(4)
my_set

{1, 2, 3, 4}

In [9]:
my_set.update(second_set)
my_set

{1, 2, 3, 4, 5, 6, 7}

In [10]:
my_set.remove(5)
my_set

{1, 2, 3, 4, 6, 7}

In [11]:
my_set.intersection(second_set)

{6, 7}

All of these methods should look fairly intuitive. The ```update()``` method is like an ```add()``` en masse. The ```union()``` method is like adding two sets together, but since there are only unique elements in a set, it removes duplicates. The ```intersection()``` method returns those elements that the sets have in common.
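One detail worth seeing in code is which methods return a new set and which change the set in place. The sketch below uses two made-up sets, ```a``` and ```b```, and also tries out ```pop()```, which was mentioned earlier but not demonstrated.

```python
a = {1, 2, 3}
b = {3, 4}

# union() returns a new set and leaves both originals untouched.
combined = a.union(b)
print(combined == {1, 2, 3, 4})  # True
print(a == {1, 2, 3})            # True

# update() modifies the set in place (and returns None).
a.update(b)
print(a == {1, 2, 3, 4})         # True

# pop() removes and returns an arbitrary element, so don't rely on which one.
popped = a.pop()
print(popped in a)               # False
print(len(a))                    # 3

# remove() raises a KeyError if the item is not in the set.
try:
    a.remove(99)
except KeyError:
    print("99 was not in the set")
```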

These are some of the most common set operations you will ever use. If you'd like to take a look at the documentation for all of them, check it out [here](https://docs.python.org/3/library/stdtypes.html#set).

### Why Do We Need Sets?
Alright, that's cool, but when would I use a set? The most apparent answer is for times when you need to perform set operations, like checking what elements two lists have in common: convert both lists to sets and find the intersection of those sets. Another obvious use case is finding the unique items in an iterable. There's also another amazing place where we'll want to use sets that might not be so apparent.
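Here is a minimal sketch of those two use cases. The lists ```morning_orders``` and ```evening_orders``` are invented for illustration, and ```sorted()``` is only used so that the printed order is predictable.

```python
morning_orders = ['tea', 'coffee', 'tea', 'juice']
evening_orders = ['coffee', 'water', 'juice', 'juice']

# Unique items in one list.
unique_morning = set(morning_orders)
print(sorted(unique_morning))  # ['coffee', 'juice', 'tea']

# Items the two lists have in common.
# intersection() accepts any iterable, so the second list needs no conversion.
common = set(morning_orders).intersection(evening_orders)
print(sorted(common))          # ['coffee', 'juice']
```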

Remember, when discussing dictionaries above, we talked about how checking if an item is in a list requires us to check every item in the list? This can be computationally expensive and generally we want to avoid it. What do we do instead, then?

We use a set! The reason why lies in the fact that sets in Python are built very similarly to dictionaries. There's an underlying hash table that allows elements to be stored and queried for membership in the set (_Note: this means that the elements of a set have to be hashable, which in practice usually means immutable_). This operation happens much faster with sets than with lists ([here's](https://wiki.python.org/moin/TimeComplexity) some coverage on how quickly some Python methods run). Let's take a look at this in action, and simultaneously learn about how to time things in IPython.
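To make the note about element types concrete, here is a short sketch using a made-up set called ```points```. A tuple can be stored in a set, while a list cannot, because lists are mutable and therefore not hashable.

```python
points = set()

# A tuple of numbers is hashable, so it can be a set element.
points.add((1, 2))

# A list is mutable and unhashable, so adding it raises a TypeError.
try:
    points.add([3, 4])
except TypeError as error:
    print("Cannot add a list:", error)

print(points)  # {(1, 2)}
```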

In [10]:
my_list = list(range(1000000))

In [11]:
my_set = set(my_list)

In [12]:
%timeit 100000 in my_list

555 µs ± 4.03 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


In [13]:
%timeit 100000 in my_set

28.9 ns ± 0.427 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)


Here, we used the ```%timeit``` magic command that's built into IPython and available in Jupyter notebooks (with automagic enabled, which is the default, the leading ```%``` can be omitted). To use it, write ```%timeit``` followed by a line of code. We can see that checking membership in the list took roughly 20,000 times longer than in the set. That gap generally grows with the size of the collection, because in the worst case a list has to be scanned item by item, while a set lookup takes roughly constant time on average.
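Outside of IPython, a similar comparison can be made with the standard library's ```timeit``` module. The sketch below is only an approximation of the measurement above: it uses a smaller collection and just 100 repetitions (both arbitrary choices for this example), so the absolute numbers will differ from machine to machine.

```python
import timeit

# Smaller than the notebook example so the script finishes quickly.
my_list = list(range(100000))
my_set = set(my_list)
target = 99999  # last element: the worst case for the list scan

# timeit.timeit() calls the function `number` times and returns total seconds.
list_seconds = timeit.timeit(lambda: target in my_list, number=100)
set_seconds = timeit.timeit(lambda: target in my_set, number=100)

print(f"list: {list_seconds:.6f} s for 100 lookups")
print(f"set:  {set_seconds:.6f} s for 100 lookups")
```

On a typical machine the list lookups should take far longer than the set lookups, in line with the notebook measurement.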

## Check your understanding

**Part 1**

1. Make a set called ```first_set``` with the values 1-10 and another with the values 5-15 called ```second_set```.
2. Add the value 11 to ```first_set```.
3. Add the string ```'hey_you'``` to ```second_set```.
4. Using one of the methods discussed above, find what elements ```first_set``` and ```second_set``` have in common.
5. In one line of code, add all the elements of ```second_set``` to ```first_set```.

In [1]:
first_set = {1,2,3,4,5,6,7,8,9,10}
second_set = {5,6,7,8,9,10,11,12,13,14,15}
first_set.add(11)
print("First set",first_set)
second_set.add("hey_you")
print("second set",second_set)
print("Common",first_set.intersection(second_set))
first_set.update(second_set)
print("First set",first_set)


First set {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11}
second set {5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 'hey_you'}
Common {5, 6, 7, 8, 9, 10, 11}
First set {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 'hey_you'}


**Part 2**

Given the lists ```first_list = ['hello', 'there', 'things', 'stuff', 'other', 'soda', 'chicken wings', 'things', 'soda']``` and ```second_list = ['turkey sandwich', 'guacamole', 'chicken wings', 'OJ', 'soda']```:

1. Find the unique elements in ```first_list```.
2. In one line, find the common elements in the two lists.
3. Write a single line that outputs ```True``` or ```False``` depending on whether the string ```'pizza'``` is in both lists.

In [None]:
#1
first_list = ['hello', 'there', 'things', 'stuff', 'other', 'soda', 'chicken wings', 'things', 'soda']
second_list = ['turkey sandwich', 'guacamole', 'chicken wings', 'OJ', 'soda']

unique_elem = set(first_list)

print(unique_elem)

{'other', 'things', 'soda', 'chicken wings', 'stuff', 'hello', 'there'}


In [None]:
#2
first_list = ['hello', 'there', 'things', 'stuff', 'other', 'soda', 'chicken wings', 'things', 'soda']
second_list = ['turkey sandwich', 'guacamole', 'chicken wings', 'OJ', 'soda']

common = list(set(first_list).intersection(second_list))
print(common)

['chicken wings', 'soda']


In [None]:
#3
is_pizza_in_both = 'pizza' in set(first_list).intersection(second_list)
print(is_pizza_in_both)

False


**Part 3**

Write a Python program to create an intersection of the sets below:
- setx = {"green", "blue"}
- sety = {"blue", "yellow"}

In [5]:
setx = {"green", "blue"}
sety = {"blue", "yellow"}
print(setx.intersection(sety))

{'blue'}


**Part 4**

Write a Python program to find the length of the set given below:

set_n = {5, 10, 3, 15, 2, 20}

In [6]:
set_n = {5, 10, 3, 15, 2, 20}
print(len(set_n))

6
