# Sets 

A set is similar to a list, but it is unordered and cannot contain duplicate values. A benefit of using a set instead of a list is speed: checking whether a value is in a set is much faster than checking whether it is in a large list. 

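Sets can be created in several ways: 
- set() creates an empty set
- set(listVariable) creates a set from a list, string or tuple (any iterable)
- a set comprehension such as {x**2 for x in range(10)} builds a set from an expression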

In [3]:
Set = set()
print(type(Set))

<class 'set'>


In [6]:
L = [10, 13, 10, 5, 6, 13, 2, 10, 5]
S = set(L) # Duplicates in L will be removed
print(type(S))
print(S) 

<class 'set'>
{2, 5, 6, 10, 13}


In [31]:
S = {x**2 for x in range(10)}
print(type(S))
print(S) 

<class 'set'>
{0, 1, 64, 4, 36, 9, 16, 49, 81, 25}
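
As a small sketch of the other input types listed above, the cell below builds sets from a string and a tuple, and shows that empty braces create a dictionary rather than a set. The variable names are illustrative only and are not used elsewhere in this notebook.

```python
# Build sets from a string and a tuple (names here are illustrative)
letters = set("banana")          # each character becomes an element
numbers = set((3, 1, 3, 2, 1))   # duplicates in the tuple are dropped

print(sorted(letters))  # sorted() gives a stable order for display
print(sorted(numbers))

# Empty braces create a dict, not a set; use set() for an empty set
empty_braces = {}
print(type(empty_braces))
print(type(set()))
```

Sets are unordered, so the examples sort the elements before printing to get a predictable display.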


## Set Functions & Methods 

- len() function: returns the number of elements in the set
- add method: adds an element to a set
- update method: adds a group of elements to a set
- remove and discard methods: remove the specified item from the set; remove raises a KeyError if the item is missing, discard does nothing in that case
- clear method: clears all the elements of the set

In [7]:
Set = S 
print(type(Set))
print(Set) 

<class 'set'>
{2, 5, 6, 10, 13}


In [9]:
#Returns the number of elements in the set 
len(Set)

5

In [13]:
#Adds a single element to the set
Set.add(89)
print(type(Set))
print(Set)

<class 'set'>
{2, 5, 6, 10, 13, 89}


In [18]:
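#update() adds each element of an iterable; a string is added one character at a time
#Note: the '2' and '9' in the output below do not come from 'Hello'; they were presumably added by an earlier update call that is not shown
Set.update('Hello')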
print(Set)

{2, '2', '9', 5, 6, 'e', 10, 13, 'H', 'l', 89, 'o'}


In [19]:
Set.remove("2")
print(Set)

{2, '9', 5, 6, 'e', 10, 13, 'H', 'l', 89, 'o'}


In [20]:
Set.discard('9')
print(Set)

{2, 5, 6, 'e', 10, 13, 'H', 'l', 89, 'o'}


In [22]:
Set.clear()
print(type(Set))
print(Set)

<class 'set'>
set()
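
The remove and discard methods only differ when the item is not in the set. A minimal sketch of that difference, using a hypothetical `colors` set that is separate from the cells above:

```python
# Hypothetical example set, separate from the cells above
colors = {"red", "green", "blue"}

colors.discard("purple")     # not present: discard() does nothing
print(len(colors))

try:
    colors.remove("purple")  # not present: remove() raises KeyError
except KeyError as err:
    print("KeyError:", err)

colors.remove("red")         # present: removed as usual
print(sorted(colors))
```

One way to choose between them: use discard when a missing item is fine, and remove when a missing item should be treated as an error.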


## Set Operators 

In [23]:
Set1 = {2, 5, 7, 8, 9, 12}
Set2 = {1, 5, 6, 7, 11, 12}
print(type(Set1))

<class 'set'>


In [25]:
#What elements are in Set1, Set2 or both Sets
Set1 | Set2

#Could also call the .union method 
#Set1.union(Set2)

{1, 2, 5, 6, 7, 8, 9, 11, 12}

In [26]:
#What elements are in both Set1 and Set2
Set1 & Set2

#Could also call the .intersection method 
#Set1.intersection(Set2)

{5, 7, 12}

In [27]:
#What elements are in Set1 but NOT in Set2
Set1 - Set2 

#Could also call the .difference method 
#Set1.difference(Set2)

{2, 8, 9}

In [28]:
#What elements are in Set2 but NOT in Set1
Set2 - Set1

{1, 6, 11}

In [29]:
#What elements are in Set1 or Set2 but NOT both 
Set1 ^ Set2 

#Could also call the symmetric_difference method
#Set1.symmetric_difference(Set2)

{1, 2, 6, 8, 9, 11}
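
The operator and method forms are not fully interchangeable: the methods accept any iterable, while the operators require both sides to be sets. A short sketch, where `other` is an illustrative list holding the same values as Set2:

```python
Set1 = {2, 5, 7, 8, 9, 12}
other = [1, 5, 6, 7, 11, 12]   # a list, not a set

# The method form accepts the list directly
print(sorted(Set1.union(other)))
print(sorted(Set1.intersection(other)))

# The operator form requires both operands to be sets
try:
    Set1 | other
except TypeError as err:
    print("TypeError:", err)
```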

## Subsets 

Set A is a subset of set B if every element of set A is also in set B. There are two ways to determine if one set is a subset of another: 
- .issubset method 
- <= operator 

In [32]:
#determine whether set 1 is subset of set 2
Set1.issubset(Set2)

False

In [35]:
#The order matters, SetB is a subset of SetA
SetA = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
SetB = {1, 2, 3, 4, 5}
SetB.issubset(SetA)

True

In [36]:
#SetA is NOT a subset of SetB
SetA.issubset(SetB)

False

In [37]:
#Is SetB a subset of SetA
SetB <= SetA

True
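
Every set is a subset of itself, so `<=` also returns True for equal sets. When the two sets must differ, the `<` operator tests for a proper subset. A brief sketch reusing the SetA and SetB values from above:

```python
SetA = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
SetB = {1, 2, 3, 4, 5}

print(SetB <= SetA)  # True: every element of SetB is in SetA
print(SetB < SetA)   # True: SetB is a subset and the sets differ
print(SetA <= SetA)  # True: every set is a subset of itself
print(SetA < SetA)   # False: a set is not a proper subset of itself
```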

## Supersets 

Set A is a superset of set B if it contains all the elements of set B. There are two ways to determine if one set is a superset of another: 

- .issuperset method 
- >= operator 

In [38]:
SetA = {1, 2, 3, 4}
SetB = SetA 

#Is SetA a superset of SetB? Equal sets count as supersets of each other 
SetA.issuperset(SetB)

True

In [40]:
#Is SetA a superset of SetB, using the >= operator 
SetA >= SetB

True
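
A superset check is not the same as an equality check: a set can be a superset of a smaller set without matching it. The sketch below compares `>=` with `==`; the SetC name is introduced here for illustration.

```python
SetA = {1, 2, 3, 4}
SetB = {1, 2}
SetC = {4, 3, 2, 1}  # same elements as SetA, written in another order

print(SetA >= SetB)  # True: SetA contains every element of SetB
print(SetA == SetB)  # False: the sets are not equal
print(SetA >= SetC)  # True: equal sets are supersets of each other
print(SetA == SetC)  # True: element order does not matter
```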

## Sets & Computing Time 

Run the code below to see the computing time difference for a large list and a large set. The results on my computer were: 
- List elapsed: 11.654843807220459
- Set elapsed: 0.08272624015808105

In [15]:
import time

# time.clock() was removed in Python 3.8; this cell uses time.time() instead
# (time.perf_counter() is the recommended replacement for timing code)

# Data structure size
size = 1000

# Make a big set
S = {x**2 for x in range(size)}

# Make a big list
L = [x**2 for x in range(size)]

# Verify the type of S and L
print('Set:', type(S), ' List:', type(L))

# Search size
search_size = 1000000

# Time list access
start_time = time.time()
for i in range(search_size):
    if i in L:
        pass

stop_time = time.time()
print('List elapsed:', stop_time - start_time)

# Time set access
start_time = time.time()
for i in range(search_size):
    if i in S:
        pass
stop_time = time.time()
print('Set elapsed:', stop_time - start_time)

Set: <class 'set'>  List: <class 'list'>
List elapsed: 11.654843807220459
Set elapsed: 0.08272624015808105
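
The gap comes from how each container is searched: a list is scanned element by element, while a set uses hashing to look a value up directly. As an alternative sketch, the standard library `timeit` module can time a single membership test repeatedly. The `missing` value and `repeats` count are assumptions chosen to keep the run short, and the timings will differ from machine to machine.

```python
import timeit

size = 1000
S = {x**2 for x in range(size)}
L = [x**2 for x in range(size)]

missing = -1  # not in either container, so the list is scanned in full

# Run each membership test the same number of times
repeats = 10_000
list_time = timeit.timeit(lambda: missing in L, number=repeats)
set_time = timeit.timeit(lambda: missing in S, number=repeats)

print('List elapsed:', list_time)
print('Set elapsed:', set_time)
```

Searching for a value that is absent is the worst case for the list, which makes the difference easy to see even with a small number of repeats.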
