# Sets

Sets are unordered collections with no duplicate entries. Let's say you want to collect the distinct words in a paragraph:

In [1]:
print(set("my name is Eric and Eric is my name".split()))

{'my', 'and', 'is', 'name', 'Eric'}


In [2]:
print(list("my name is Eric and Eric is my name".split()))

['my', 'name', 'is', 'Eric', 'and', 'Eric', 'is', 'my', 'name']


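The 'set' call above prints only the 5 unique words, while 'list' prints all 9 words, duplicates included. Because sets are unordered, the order of the printed elements may differ between runs.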
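
As a quick check of those counts, the sketch below measures both collections with `len` and also tests membership with `in`. The variable names are illustrative only.

```python
# Split the sentence into words, then build a set from the same words.
words = "my name is Eric and Eric is my name".split()
unique_words = set(words)

total = len(words)            # every word, duplicates included
distinct = len(unique_words)  # each word counted once

# Membership tests work directly on a set.
has_eric = "Eric" in unique_words

print(total, distinct, has_eric)
```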

Sets are useful in Python because they can compute differences and intersections with other sets. For example, take the participants in events A and B:

In [47]:
a = set(["Jake", "John", "Eric","Jill"])
print(a)
b = set(["James", "Bill"])
b.add("Limmy")

print(b)

len(b)

{'Jake', 'Eric', 'John', 'Jill'}
{'James', 'Limmy', 'Bill'}


3

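To find out which people attended both events, one can use the 'intersection' method. Note that the outputs in cells 40 to 43 correspond to `b` containing "John", "Jill" and "Limmy"; with `b` as defined in cell 47 above, the two sets share no members and the intersection would be empty: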

In [40]:
print(a.intersection(b))
print(b.intersection(a))

{'John', 'Jill'}
{'John', 'Jill'}


To find out which members attended exactly one of the events, one can use the 'symmetric_difference' method:

In [41]:
print(a.symmetric_difference(b))
print(b.symmetric_difference(a))

{'Jake', 'Eric', 'Limmy'}
{'Jake', 'Eric', 'Limmy'}


One can use the 'difference' method to find out which members attended one event but not the other. Unlike 'symmetric_difference', the result depends on which set the method is called on:

In [42]:
print(a.difference(b))
print(b.difference(a))

{'Jake', 'Eric'}
{'Limmy'}


One can use the 'union' method to generate a set of all participants:

In [43]:
print(a.union(b))

{'Jake', 'John', 'Jill', 'Limmy', 'Eric'}
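
Each of the four methods above also has an operator form. The sketch below assumes `b = {"John", "Jill", "Limmy"}`, a value chosen because it reproduces the outputs printed for the intersection, symmetric difference, difference and union cells; the result names are illustrative.

```python
# Assumed inputs: these values reproduce the outputs shown above.
a = {"Jake", "John", "Eric", "Jill"}
b = {"John", "Jill", "Limmy"}

both = a & b         # same as a.intersection(b)
exactly_one = a ^ b  # same as a.symmetric_difference(b)
only_a = a - b       # same as a.difference(b)
everyone = a | b     # same as a.union(b)

# sorted() gives a stable order for printing, since sets are unordered.
print(sorted(both))
print(sorted(exactly_one))
print(sorted(only_a))
print(sorted(everyone))
```

One difference worth knowing: the operators require both operands to be sets, while the methods accept any iterable, such as a list.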


In [44]:
a.issubset(b)

False

In [45]:
a.issuperset(b)

False

In [48]:
if a.isdisjoint(b):
    print("blahem")

blahem
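
The three predicates above return booleans, so a case where they return `True` may make their meaning clearer. This is a minimal sketch with made-up sets (`attendees`, `vips` and `others` are not from the cells above).

```python
# Hypothetical example sets.
attendees = {"Jake", "John", "Eric", "Jill"}
vips = {"John", "Jill"}
others = {"James", "Bill"}

# Every element of vips is also in attendees.
vips_in_attendees = vips.issubset(attendees)

# attendees contains every element of vips (the mirror image of issubset).
attendees_cover_vips = attendees.issuperset(vips)

# attendees and others have no element in common.
no_overlap = attendees.isdisjoint(others)

print(vips_in_attendees, attendees_cover_vips, no_overlap)
```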


In [59]:
c = set(["Theo","Keir"])

d = ["Germany", "Germany", "Austria"]
if c.isdisjoint(b):
    print("yes")

print(c)
print(d)

yes
{'Keir', 'Theo'}
['Germany', 'Germany', 'Austria']


In [64]:
my_unique = list(set(d))
print(my_unique)

['Germany', 'Austria']


In [52]:
set([x ** 2 for x in [1,2,3,4]])

{1, 4, 9, 16}

In [51]:
e = set(x ** 2 for x in [1,2,3,4])
print(e)

{16, 1, 4, 9}


In [53]:
set({x: x**2 for x in [1,2,3,4]})

{1, 2, 3, 4}
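
The three cells above build sets from a list comprehension, a generator expression and a dict. Python also has a dedicated set comprehension syntax, and calling `set` on a dict keeps only its keys. A short sketch, with illustrative names:

```python
# Set comprehension: curly braces with no key/value colon.
squares = {x ** 2 for x in [1, 2, 3, 4]}

# Dict comprehension: the colon makes this a dict, not a set.
square_map = {x: x ** 2 for x in [1, 2, 3, 4]}

# Iterating over a dict yields its keys, so set() collects the keys.
keys_only = set(square_map)

# Use .values() to collect the values instead.
values_only = set(square_map.values())

print(sorted(squares), sorted(keys_only), sorted(values_only))
```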

In [56]:
g = {**a, **b}
print(g)

TypeError: 'set' object is not a mapping
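
The `**` operator unpacks mappings, which is why it fails on sets. A single `*` unpacks any iterable, so the merge can be sketched as follows, using the sets from cell 47:

```python
a = {"Jake", "John", "Eric", "Jill"}
b = {"James", "Bill", "Limmy"}

# Single-star unpacking inside a set display merges both sets.
g = {*a, *b}

# The result is the same as the union.
same_as_union = g == a.union(b)

print(sorted(g))
print(same_as_union)
```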

## Exercise

In the exercise below, use the given sets to print a set containing all the participants from event A who did not attend event B.

In [9]:
a = set(["Jake", "John", "Eric"])
b = set(["John", "Jill"])

print(a)
print(b)

print(a.difference(b))

{'John', 'Jake', 'Eric'}
{'John', 'Jill'}
{'Jake', 'Eric'}


## Initialising a Set

In [2]:
my_set = set(1, 2, 3)

TypeError: set expected at most 1 argument, got 3

In [6]:
my_set = set([1,2,3])
print(my_set)

{1, 2, 3}


In [4]:
my_set = set()
print(my_set)

set()


In [8]:
my_set = {1,2,3}
print(my_set)

{1, 2, 3}
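
Since `set()` takes a single iterable, passing a string splits it into characters, which is a common surprise. The sketch below contrasts that with the literal syntax; the variable names are illustrative.

```python
# set() iterates over its argument, so a string becomes a set of characters.
letters = set("hello")

# The literal syntax stores the string as one element.
one_word = {"hello"}

# Any iterable works, for example a tuple or a range.
from_tuple = set((1, 2, 3))
from_range = set(range(3))

print(sorted(letters), one_word, from_tuple, from_range)
```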


## Creating an empty set

In [9]:
my_set = {[]}

TypeError: unhashable type: 'list'

In [10]:
my_set = set(None)

TypeError: 'NoneType' object is not iterable

In [15]:
my_set = {}  # caution: {} creates an empty dict, not a set; use set() for an empty set
print(my_set)

{}
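
The `{}` printed above is an empty dict, because empty curly braces are reserved for dicts. A minimal sketch that checks the types and shows `set()` as the way to get an empty set:

```python
empty_dict = {}     # empty braces always mean dict
empty_set = set()   # the only way to write an empty set

dict_type = type(empty_dict).__name__
set_type = type(empty_set).__name__
print(dict_type, set_type)

# An empty set supports set methods such as add().
empty_set.add(1)
print(empty_set)
```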


## Making a copy of a set

In [16]:
# my_set is still the empty dict created with {} above, so the outputs below show dict behaviour
set(my_set)

set()

In [17]:
my_set.copy()

{}

In [18]:
my_set[:]  # slicing is not a valid way to copy; a real set raises "'set' object is not subscriptable"

TypeError: unhashable type: 'slice'

In [19]:
from copy import copy
copy(my_set)

{}
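
The copy cells above operate on a value created with `{}`. For comparison, here is the same set of approaches applied to an actual set; the names are illustrative. All three produce an independent shallow copy.

```python
from copy import copy

original = {1, 2, 3}

by_constructor = set(original)  # build a new set from the old one
by_method = original.copy()     # the set's own copy method
by_module = copy(original)      # generic shallow copy

# Changing the original afterwards leaves the copies untouched.
original.add(4)

print(original, by_constructor, by_method, by_module)
```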

## Removing duplicate elements from a list

In [50]:
my_list = [1,1,2,2]
my_unique_list = list(set(my_list))
print(my_unique_list)

[1, 2]
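
Converting through a set does not promise to keep the original order of the list. If order matters, one possible alternative (assuming Python 3.7 or later, where dicts preserve insertion order) is `dict.fromkeys`; sorting the set is another option when a sorted result is wanted. The sample list is made up for illustration.

```python
my_list = [3, 1, 3, 2, 1]

# dict keys are unique and keep first-seen order (Python 3.7+).
ordered_unique = list(dict.fromkeys(my_list))

# sorted() accepts a set and returns a sorted list.
sorted_unique = sorted(set(my_list))

print(ordered_unique)
print(sorted_unique)
```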
