## Sets

A set is a data structure that holds only unique items.

### Sets vs lists:

Sets are a good alternative to lists when we want to store only the unique values of a list.
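As a small sketch of this idea, converting a list to a set drops the repeated values. The list `fruits` and its contents are made up for illustration.

```python
# Hypothetical list with repeated values (illustration only).
fruits = ['apple', 'banana', 'apple', 'pear', 'banana']

# Converting to a set keeps one copy of each value.
unique_fruits = set(fruits)

print(len(fruits))         # 5 items, including duplicates
print(len(unique_fruits))  # 3 unique items

# sorted() returns a list, which gives a predictable order for display.
print(sorted(unique_fruits))
```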

The greatest benefit of using sets is that membership lookup uses a hash table. When you search a list, you must, in the worst case, look through all n entries to find what you are looking for. This is O(n) complexity. A set instead hashes the item and can tell you whether it is an element in roughly one step, regardless of how large the set is, for example when checking whether a username is already taken. This is O(1) complexity on average.
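The username check mentioned above could look roughly like the sketch below. The usernames and the helper `is_taken` are hypothetical, chosen only for illustration. Both calls return the same answer; the difference is in how the answer is found.

```python
# Hypothetical usernames (illustration only).
taken_list = ['ada', 'grace', 'linus', 'guido']
taken_set = set(taken_list)

def is_taken(username, taken):
    """Return True if username is already in the collection."""
    return username in taken

# The list is scanned item by item until a match is found.
print(is_taken('guido', taken_list))

# The set uses the hash of the username to find it directly.
print(is_taken('guido', taken_set))
print(is_taken('alan', taken_set))
```

With only four names the difference is not noticeable; it starts to matter as the collection grows.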

Sets are created by enclosing a sequence of values inside a pair of curly brackets...

In [1]:
set1 = {'banana', 'apple', 'pear', 'peach'}

In [2]:
print(len(set1))

4


In [3]:
print(type(set1))

<class 'set'>


...or by passing a list of values to the `set` function:

In [4]:
set2 = set(['banana', 'apple', 'pear', 'peach'])
print(len(set2))
print(type(set2))

4
<class 'set'>


Notice that the `set` function iterates over its argument, so a single string is split into characters unless it is enclosed in a list.

In [5]:
print(set('apple'))

{'l', 'e', 'p', 'a'}


Notice that only one of each character is stored.

In [6]:
print(set(['apple']))

{'apple'}
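A related pitfall when creating sets: an empty pair of curly brackets does not create an empty set. The short sketch below shows the difference; the variable names are chosen only for illustration.

```python
empty_set = set()   # an empty set
empty_dict = {}     # NOT a set: empty curly brackets create a dictionary

print(type(empty_set))
print(type(empty_dict))
```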


Notice that the values in a set are not necessarily stored in the same order that they were passed.

In [7]:
set1 = {'banana', 'apple', 'pear', 'peach'}

print(set1)

{'banana', 'apple', 'peach', 'pear'}


Therefore, we cannot access an item in a set using its index.

In [8]:
set1[0] 

TypeError: 'set' object is not subscriptable
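If a position is really needed, one possible workaround is to build a sorted list from the set first. This is only a sketch; the name `ordered` is made up for illustration, and the list is a copy, so the set itself stays unordered.

```python
set1 = {'banana', 'apple', 'pear', 'peach'}

# sorted() builds a new list in alphabetical order; the set is unchanged.
ordered = sorted(set1)

print(ordered[0])  # apple
print(ordered)     # ['apple', 'banana', 'peach', 'pear']
```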

To find out what a set contains, we must either test for a specific item with the membership operator...

In [9]:
set1 = {'banana', 'apple', 'pear', 'peach'}

In [None]:
'apple' in set1

...or loop over all of the items in the set:

In [None]:
for item in set1:
    print(item)

Sets differ from lists in that the order of the items does not matter:

Here we test if the lists are equal:

In [None]:
list1 = [1, 2, 4, 3]
list2 = [1, 2, 3, 4]

list1 == list2

Here we test if the sets are equal:

In [None]:
set1 = {1, 2, 4, 3}
set2 = {1, 2, 3, 4}

set1 == set2

Sets only care about which elements are present, not the order.
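One way to use this, sketched below, is to compare two lists while ignoring order by converting both to sets. Keep in mind that the conversion also ignores duplicates, as the last line shows.

```python
list1 = [1, 2, 4, 3]
list2 = [1, 2, 3, 4]

print(list1 == list2)            # False: the order differs
print(set(list1) == set(list2))  # True: the same elements are present

# Caveat: duplicates are dropped, so these also compare as equal.
print(set([1, 1, 2]) == set([1, 2]))  # True
```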

### Common set operations:

In [None]:
set1 = {'banana', 'apple', 'pear', 'peach'}

Add an item to a set:

In [None]:
set1.add('pineapple')
set1

Because sets do not have duplicate elements, adding an already existing item to a set has no effect.

In [None]:
set1.add('pineapple')
set1

Items can be removed from a set using either the `remove` method...

In [None]:
set1.remove('pineapple')
set1

...or the `discard` method:

In [None]:
set1.discard('apple')
set1

However, notice that `remove` will raise a `KeyError` if the item is not in the set.

In [None]:
set1.remove('pineapple')
set1

In contrast, `discard` removes the item if it is in the set and does nothing, without raising an error, if it is not.

In [None]:
set1.discard('pineapple')
set1
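The difference between the two can be seen side by side in the sketch below, which catches the `KeyError` from `remove` so the code keeps running. The printed message is just an illustration.

```python
set1 = {'banana', 'pear', 'peach'}

# remove() raises KeyError when the item is missing, so guard the call.
try:
    set1.remove('pineapple')
except KeyError:
    print('pineapple is not in the set')

# discard() silently does nothing when the item is missing.
set1.discard('pineapple')
print(len(set1))  # 3
```

Which one to use depends on whether a missing item should be treated as a mistake (`remove`) or as a normal situation (`discard`).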

### Specific set operations:

In addition to the common operations above, sets support some operations that are specific to sets.

In [None]:
canadian = {'Red', 'White'}
british = {'Red', 'Blue', 'White'}
italian = {'Red', 'White', 'Green'}

Two sets are equal if, and only if, they contain the exact same values.

In [None]:
canadian == british

In [None]:
canadian != british

The `issubset` method checks whether every value in one set is also in another set.

In [None]:
canadian.issubset(british) 

In [None]:
italian.issubset(british) 
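The same check can also be written with the `<=` operator, and `issuperset` performs the check in the opposite direction. A short sketch using the flag sets from above:

```python
canadian = {'Red', 'White'}
british = {'Red', 'Blue', 'White'}
italian = {'Red', 'White', 'Green'}

print(canadian <= british)           # True, same as canadian.issubset(british)
print(italian <= british)            # False, 'Green' is not in british
print(british.issuperset(canadian))  # True, the reverse check
```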

The `union` method returns a new set containing every value that appears in either of the two sets.

In [None]:
british.union(italian) 

The `intersection` method returns a set containing only the values that are in both sets.

In [None]:
british.intersection(italian) 

The `difference` method returns a set with the values that are in the first set but not in the second. Think of it as a 'minus' operation.

In [None]:
british.difference(italian) 

In [None]:
italian.difference(british) 
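Union, intersection, and difference can also be written with the operators `|`, `&`, and `-`. The sketch below uses the same flag sets and wraps each result in `sorted` only so that the printed order is predictable.

```python
british = {'Red', 'Blue', 'White'}
italian = {'Red', 'White', 'Green'}

print(sorted(british | italian))  # union: ['Blue', 'Green', 'Red', 'White']
print(sorted(british & italian))  # intersection: ['Red', 'White']
print(sorted(british - italian))  # difference: ['Blue']
print(sorted(italian - british))  # difference: ['Green']
```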