# Sets

## Definition

A set is an abstract data type that represents a well-defined, unordered collection of distinct elements. Sets have been one of the most fundamental concepts in mathematics since they were developed near the end of the 19th century.

Unlike many other data structures, a set is used more often to test whether an element is a member than to retrieve a specific element from it.


In [8]:
# To create a set in Python, we use curly braces, much like creating a dictionary
# Let's create a set of numbers ranging from 1 to 7
A = {1, 2, 3, 4, 5, 6, 7}
print ('Set A: ', A)

# Since a set only holds distinct values, what happens if we create a set with duplicates?
B = {1, 2, 3, 3, 3, 4, 5, 5, 6, 6, 7}
print ('Set B: ', B)

# Assert that A is in fact equal to B since we discard the additional 3, 5 and 6 values
assert A == B

Set A:  {1, 2, 3, 4, 5, 6, 7}
Set B:  {1, 2, 3, 4, 5, 6, 7}
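
Membership testing, the typical use of a set described in the definition, can be sketched with the `in` operator. The set `primes` and the variable names below are illustrative and not part of the original notebook.

```python
# Hypothetical example data, chosen only for illustration
primes = {2, 3, 5, 7}

# The `in` operator tests whether a value is a member of the set
has_five = 5 in primes
has_six = 6 in primes

# `not in` is the negated form
missing_four = 4 not in primes

print(has_five, has_six, missing_four)
```

Python sets are hash-based, so a membership test takes roughly constant time on average, regardless of how many elements the set holds.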


A small gotcha in Python: how would you create an empty set?

Like so?

A = {}

No. This creates an empty dict object. To create an empty set, we need to call the set constructor:

In [None]:
# An empty set
A = set()
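
The gotcha can be checked directly by inspecting the types. This is a minimal sketch; the names `empty_braces` and `s` are illustrative only.

```python
empty_braces = {}   # empty curly braces create a dict, not a set
s = set()           # the set constructor creates an empty set

assert isinstance(empty_braces, dict)
assert isinstance(s, set)
assert len(s) == 0

# Elements can be added to the set afterwards
s.add(1)
print(type(empty_braces).__name__, type(s).__name__, s)
```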

## Operations

### Union

The union of two sets A and B is the set of all elements that are in A, in B, or in both. In Python, the *union* operation is performed with either the | operator or the instance method _union()_.

Let's take our set A from above and create a new set C, which contains additional values, to demonstrate set union.

In [7]:
C = {11, 12, 13}

# Now, let's union set A and set C
union_of_two_sets = A.union(C)

# Check that the union contains every element of A and C
assert union_of_two_sets == {1, 2, 3, 4, 5, 6, 7, 11, 12, 13}
print (union_of_two_sets)

{1, 2, 3, 4, 5, 6, 7, 11, 12, 13}
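
For comparison, here is a short sketch of the same union written with the | operator. It redefines A and C so that it runs on its own. The variable names are illustrative.

```python
A = {1, 2, 3, 4, 5, 6, 7}
C = {11, 12, 13}

# The operator form and the method form give the same result
union_via_operator = A | C
union_via_method = A.union(C)
assert union_via_operator == union_via_method

# The method accepts any iterable, whereas | requires both operands to be sets
union_with_list = A.union([11, 12, 13])
assert union_with_list == union_via_operator

# Neither form modifies the original sets
assert A == {1, 2, 3, 4, 5, 6, 7}
assert C == {11, 12, 13}
```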


### Intersection

The intersection of two sets A and B is the set of elements that are present in _both_ sets. For example, take sets A and B, each a set of names.

Intersection is performed with the ampersand operator & or, alternatively, the instance method _intersection()_.


In [16]:
# Two sample sets of names
A = {'john', 'mary', 'joseph'}
B = {'john', 'derek', 'sandra'}

C = A.intersection(B) # Or A & B

print ('A - {} \r\nB - {} \r\nIntersection of A and B - {}'.format(A, B, C))

assert C == {'john'}

A - {'joseph', 'mary', 'john'} 
B - {'derek', 'sandra', 'john'} 
Intersection of A and B - {'john'}
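
The operator form can be sketched as follows. The set `D` is a hypothetical addition used to show what happens when two sets share no elements.

```python
A = {'john', 'mary', 'joseph'}
B = {'john', 'derek', 'sandra'}

# The & operator is equivalent to the intersection() method
common = A & B
assert common == A.intersection(B) == {'john'}

# Hypothetical set with no names in common with A
D = {'alice', 'bob'}

# Sets with no shared elements have an empty intersection
no_overlap = A & D
assert no_overlap == set()

# isdisjoint() checks the same condition without building a new set
assert A.isdisjoint(D)
```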


### Difference

The difference of sets A and B is the set of elements that are in A but not in B. Unlike union and intersection, the order of the operands matters: A - B and B - A are generally different sets.

Difference is performed using the hyphen operator - or the instance method _difference()_.

In [18]:
# Two sample sets of colours
A = {'blue', 'red', 'green'}
B = {'magenta', 'cyan', 'blue'}

C = A.difference(B) # Or A - B

print ('A - {} \r\nB - {} \r\nDifference of A and B - {}'.format(A, B, C))

assert C == {'red', 'green'}

A - {'red', 'blue', 'green'} 
B - {'magenta', 'cyan', 'blue'} 
Difference of A and B - {'red', 'green'}
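
Because difference depends on operand order, it can help to see both directions side by side. This is a small sketch reusing the colour sets. The names `only_in_a` and `only_in_b` are illustrative.

```python
A = {'blue', 'red', 'green'}
B = {'magenta', 'cyan', 'blue'}

# Elements of A that are not in B
only_in_a = A - B
# Elements of B that are not in A
only_in_b = B - A

assert only_in_a == A.difference(B) == {'red', 'green'}
assert only_in_b == B.difference(A) == {'magenta', 'cyan'}

# The two results differ, so difference is not commutative
assert only_in_a != only_in_b
```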


### Symmetric Difference

The symmetric difference of two sets A and B is the set of elements that are in either A or B but not in both.

Symmetric difference is performed using the caret operator ^ or the instance method _symmetric_difference()_.

In [23]:
# Two sample sets of car manufacturers
A = {'ford', 'opel', 'land rover'}
B = {'lotus', 'toyota', 'nissan', 'opel'}

C = A.symmetric_difference(B) # Or A ^ B

print ('A - {} \r\nB - {} \r\nSymmetric Difference of A and B - {}'.format(A, B, C))

assert C == {'nissan', 'ford', 'land rover', 'toyota', 'lotus'}

A - {'ford', 'land rover', 'opel'} 
B - {'nissan', 'toyota', 'opel', 'lotus'} 
Symmetric Difference of A and B - {'nissan', 'ford', 'land rover', 'toyota', 'lotus'}
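
The symmetric difference can also be expressed in terms of the other operations. The sketch below checks two such identities with the same car manufacturer sets. The variable name `sym` is illustrative.

```python
A = {'ford', 'opel', 'land rover'}
B = {'lotus', 'toyota', 'nissan', 'opel'}

sym = A ^ B

# Everything in the union, minus what the two sets share
assert sym == (A | B) - (A & B)

# Equivalently, the union of the two one-sided differences
assert sym == (A - B) | (B - A)

# Unlike difference, symmetric difference does not depend on operand order
assert sym == B ^ A
```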
