<div class="pagebreak"></div>

# Sets
A set is an unordered collection of objects with no duplicates.

Sets support many of the same operations that lists and tuples have, but set objects also support mathematical operations such as union, intersection, and difference.  Since sets are unordered, index operations to retrieve and insert elements at specific points in the collection do not make sense and have no implementation.

Another way to think about a set is as a dictionary that has keys but no values associated with those keys.

[Set theory](https://en.wikipedia.org/wiki/Set_theory) plays a pivotal role in many computer science applications, most significantly relational databases, and programmers use sets for many everyday purposes as well. For example, we can create a unique collection (no repeated elements) from a list by converting it to a set.
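
As a minimal sketch of that conversion (the `readings` list is hypothetical sample data, not part of the original notebook):

```python
# Hypothetical sample data: a list with repeated entries.
readings = [3, 1, 3, 2, 1, 3]

# Converting to a set drops the duplicates (and the original order).
unique_readings = set(readings)
print(len(readings), len(unique_readings))  # 6 3

# sorted() turns the set back into a list in a predictable order.
print(sorted(unique_readings))  # [1, 2, 3]
```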

To create a set, use the built-in function `set()` or enclose a comma-separated list of values within curly brackets. The `set()` function iterates over its argument and adds each item to the set, removing duplicates.

To create an empty set, you must use the `set()` function, because `{}` creates an empty dictionary.
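
A quick check of the types makes the distinction visible. The variable names here are illustrative only:

```python
braces = {}          # curly brackets alone create a dictionary
no_items = set()     # set() creates an empty set
one_item = {"red"}   # curly brackets with at least one value create a set

print(type(braces).__name__)    # dict
print(type(no_items).__name__)  # set
print(type(one_item).__name__)  # set
```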

In [None]:
empty_set = set()
primes = {2,3,5,7,13,11,13,17,19}   # 13 is listed twice; the set keeps only one copy
colors = set(("red","green","blue","red")) # without the inner parentheses, we would pass 4 arguments; set() expects a single iterable.

print(colors)

In [None]:
set('abracadabra')

In the above example, we pass a string, which is an iterable sequence of characters. Only five unique characters (a, b, c, d, and r) exist in the string, so the resulting set has five members.

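Because members must be unique within a set, we cannot place mutable objects such as lists into a set. Otherwise, the value of an item could change after insertion and break the set's uniqueness guarantee. (More precisely, set members must be hashable.) Of the two examples below, the first tries to build a set containing a list and fails. The second builds a set containing a single tuple, which works because tuples are immutable.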

In [None]:
invalid_set = {['test', 'word']}    # raises a TypeError, sets cannot contain mutable members

In [None]:
valid_set = { ('test', 'word')}

## Getting the size (number of entries) of a set
As with the other data structures, we can get the number of entries (length, size) of a set by using the `len()` function.

In [None]:
len(primes)

## Adding to a set
Use the `add()` method on a set to add another element.  To add multiple items, use the `update()` method.

In [None]:
primes.add(23)
primes.update([29,31,37])
primes
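
The following sketch shows two details of these methods on a small, hypothetical set named `evens`: adding a value that is already present changes nothing, and `update()` accepts one or more iterables of any kind.

```python
evens = {2, 4}

evens.add(4)                  # 4 is already a member, so nothing changes
print(len(evens))             # 2

evens.add(6)                  # a new member is added
evens.update([6, 8], (10,))   # a list and a tuple; the repeated 6 is ignored
print(sorted(evens))          # [2, 4, 6, 8, 10]
```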

## Deleting from a set
Use `remove()` to delete (remove) an item from a set by the value. If the value does not exist, the Python interpreter raises a `KeyError`.  `discard()` will remove a value from a set, but will not raise an error if the value does not exist.

To remove multiple items, use `difference_update()`. Values that are not in the set are ignored.

In [None]:
primes.add(4)
print(primes)
primes.remove(4)
print(primes)
primes.difference_update([6,8,29,31,37])
print(primes)
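
To see the difference between `remove()` and `discard()` for a missing value, here is a small sketch using a hypothetical set named `digits`:

```python
digits = {1, 2, 3}

digits.discard(9)             # 9 is not a member: nothing happens, no error

try:
    digits.remove(9)          # 9 is not a member: raises KeyError
except KeyError as err:
    missing = err.args[0]
    print("remove() raised KeyError for", missing)

digits.remove(3)                  # 3 is a member, so it is removed
digits.difference_update([1, 8])  # removes 1; 8 is absent and ignored
print(digits)                     # {2}
```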

## Iterating over Sets
Just like the other built-in data structures, we can use the `for` statement (loop) to process all of the items in a set. Because sets are unordered, do not rely on the items arriving in any particular order.

In [None]:
for color in colors:
    print(color)

## Checking if a value exists
Use the `in` operator to test whether or not a set contains a value.

In [None]:
'magenta' in colors

## Set Operations
Python supports standard math operations on sets: union, intersection, difference, and symmetric difference.

Set operations can be performed either by method calls or by operators.


![](images/set_operations.png)

<small>Source:https://www.datacamp.com/community/tutorials/sets-in-python</small>

In [None]:
us_flag_colors = set(["red", "white", "blue"])
france_flag_colors = set(["blue", "white", "red"])
switzerland_flag_colors = set(["white", "red"])
mexico_flag_colors = {"green", "white", "red"}
germany_flag_colors = {"black", "red", "gold"}

### Union
Returns a new set with all of the unique elements from both sets. Use either the `union()` method or the `|` operator.

Note: unlike `union()`, the `update()` method modifies the set it is called on rather than creating a new set.

In [None]:
print(us_flag_colors | germany_flag_colors)
print(mexico_flag_colors.union(switzerland_flag_colors))
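
The note about `update()` can be illustrated with a short sketch. The sets `warm` and `cool` are hypothetical and exist only for this comparison:

```python
warm = {"red", "orange"}
cool = {"blue", "green"}

combined = warm.union(cool)   # builds a new set; warm is left untouched
print(sorted(warm))           # ['orange', 'red']
print(sorted(combined))       # ['blue', 'green', 'orange', 'red']

result = warm.update(cool)    # changes warm in place and returns None
print(result)                 # None
print(sorted(warm))           # ['blue', 'green', 'orange', 'red']
```

The `|=` operator is the in-place counterpart of `|`, in the same way that `update()` is the in-place counterpart of `union()`.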

The union operation can be performed on multiple sets at once:

In [None]:
print(us_flag_colors | germany_flag_colors | france_flag_colors | mexico_flag_colors | switzerland_flag_colors)
print(us_flag_colors.union(germany_flag_colors, france_flag_colors, mexico_flag_colors, switzerland_flag_colors))

### Intersection
Returns a new set with the elements that exist in both sets. Use either the `intersection()` method or the `&` operator.

This operation can be performed on multiple sets as well.

In [None]:
print(us_flag_colors & germany_flag_colors)
print(mexico_flag_colors.intersection(switzerland_flag_colors))

In [None]:
print(us_flag_colors & germany_flag_colors & france_flag_colors & mexico_flag_colors & switzerland_flag_colors)
print(us_flag_colors.intersection(germany_flag_colors, france_flag_colors, mexico_flag_colors, switzerland_flag_colors))

### Difference
Returns a new set with the elements of the first set that are not in the second set. (Difference removes any elements in the second set from the first set.) Use either the `difference()` method or the `-` operator.

Yes, you can chain these operators. The operations evaluate left to right.

In [None]:
print(us_flag_colors - germany_flag_colors)
print(mexico_flag_colors.difference(switzerland_flag_colors))

print(us_flag_colors - germany_flag_colors - switzerland_flag_colors)
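
Because difference is not associative, the left-to-right order matters. The sketch below uses three small hypothetical sets (`x`, `y`, `z`) to show that regrouping with parentheses changes the result:

```python
x = {1, 2, 3, 4}
y = {2, 3}
z = {3, 4}

left_to_right = x - y - z      # evaluated as (x - y) - z
regrouped = x - (y - z)        # y - z is {2}, so only 2 is removed from x

print(left_to_right)           # {1}
print(sorted(regrouped))       # [1, 3, 4]

# difference() with several arguments matches the chained operator.
print(x.difference(y, z) == left_to_right)   # True
```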

### Symmetric Difference
Returns a new set with the elements that are in either the first set or the second set, but not in both. Use either the `symmetric_difference()` method or the `^` operator.

In [None]:
print(us_flag_colors ^ germany_flag_colors)
print(mexico_flag_colors.symmetric_difference(switzerland_flag_colors))
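
One way to check your understanding is to rebuild symmetric difference from the other three operations. This sketch uses two hypothetical sets, `left` and `right`, with the same colors as the US and German flags above:

```python
left = {"red", "white", "blue"}
right = {"black", "red", "gold"}

sym_diff = left ^ right
print(sorted(sym_diff))        # ['black', 'blue', 'gold', 'white']

# Everything in either set, minus what the two sets share.
print(sym_diff == (left | right) - (left & right))   # True

# What is only in left, combined with what is only in right.
print(sym_diff == (left - right) | (right - left))   # True
```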

### Set Operation Discussion

Given [De Morgan's Laws](https://en.wikipedia.org/wiki/De_Morgan%27s_laws) for union and intersection, demonstrate these two equations in Python using sets:

$\overline{A \cup B} = \overline{A} \cap \overline{B}$

$\overline{A \cap B} = \overline{A} \cup \overline{B}$

where
* $\overline{A}$ is the complement of set $A$
* $\cap$ is the intersection operator
* $\cup$ is the union operator

At first glance, this problem seems easy - these laws have been around since the 19th century.  However, what does $\overline{A}$ mean to a computer?  Fundamentally, it is everything in the universe $U$ that is not in set $A$.  Python, like most other general-purpose programming languages, does not have direct support for such a universal set. Therefore, we need to define the universal set for our problem domain.

So, let us practice these laws on the flag colors. We can define a universal set of colors:

$U = \{ red, orange, yellow, green, blue, purple, pink, brown, grey, black, white, gold, silver \}$



In [None]:
a = { "green", "white", "red"}
b = { "black", "red", "gold"}
U = { "red", "orange", "yellow", "green", "blue", "purple", "pink", "brown", "grey", "black", "white", "gold", "silver" }

In [None]:
print(U - (a | b) == (U - a) & (U - b))
print(U - (a | b))
print(a | b)

In [None]:
print(U - (a & b) == (U - a) | (U - b))
print(U - (a & b))
print(a & b)

Throughout your programming career, you will find De Morgan's laws arising in Boolean operations: an `if` statement may have a condition like `not (A or B)`, which is equivalent to `not A and not B`.
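
Since two Boolean conditions have only four possible combinations of values, the Boolean form of both laws can be checked exhaustively. The following is a minimal sketch, with `p` and `q` standing in for any two conditions:

```python
from itertools import product

# Try every combination of truth values for two conditions.
results = []
for p, q in product([False, True], repeat=2):
    first_law = (not (p or q)) == ((not p) and (not q))
    second_law = (not (p and q)) == ((not p) or (not q))
    results.append((p, q, first_law, second_law))
    print(p, q, first_law, second_law)

all_hold = all(first and second for _, _, first, second in results)
print("Both laws hold for every case:", all_hold)   # True
```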

## Comparing Sets
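As with lists, we can use the complete range of comparison operators on sets. For sets, however, the ordering operators test subset and superset relationships rather than comparing elements in sequence.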

|Comparison | Method| Description|
|:----:|:---|:----|
| `a == b` | | Do sets `a` and `b` contain the same members? |
| `a != b` | | Is there at least one difference between sets `a` and `b`? |
| `a < b` | | Is set `a` a proper subset of `b`? |
| `a <= b` | `a.issubset(b)` | Is set `a` a subset of or equal to `b`? |
| `a > b` | | Is set `a` a proper superset of `b`? |
| `a >= b` | `a.issuperset(b)` | Is set `a` a superset of or equal to `b`? |

In [None]:
print("us_flag_colors == france_flag_colors:",us_flag_colors == france_flag_colors)
print("us_flag_colors != france_flag_colors:",us_flag_colors != france_flag_colors)
print("us_flag_colors <  france_flag_colors:",us_flag_colors < france_flag_colors)
print("us_flag_colors <= france_flag_colors:",us_flag_colors <= france_flag_colors)
print("us_flag_colors >  france_flag_colors:",us_flag_colors > france_flag_colors)
print("us_flag_colors >= france_flag_colors:",us_flag_colors >= france_flag_colors)

print("switzerland_flag_colors < us_flag_colors:", switzerland_flag_colors < us_flag_colors)
print("us_flag_colors > switzerland_flag_colors:", us_flag_colors > switzerland_flag_colors)
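
The sketch below contrasts a proper subset with a plain subset, and shows one practical difference between the operators and the methods: `issubset()` accepts any iterable, while `<=` requires a set on both sides. The names `small`, `big`, and `same_as_big` are hypothetical.

```python
small = {"white", "red"}
big = {"red", "white", "blue"}
same_as_big = {"blue", "white", "red"}

print(small < big)           # True: small is inside big, and big has more
print(big < same_as_big)     # False: equal sets are not proper subsets
print(big <= same_as_big)    # True: equal sets are subsets of each other

# The method accepts any iterable, such as a list.
print(small.issubset(["red", "white", "blue"]))   # True

# The operator raises a TypeError when the other operand is a list.
try:
    small <= ["red", "white", "blue"]
    operator_accepts_list = True
except TypeError:
    operator_accepts_list = False
print(operator_accepts_list)   # False
```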

## Exercises
TODO: Create two lists of strings naming common household items (book, cup, glass, knife, fork, ...), then:
- create sets from the lists
- perform various set operations on them
