## Sets and Frozensets


### Introduction

<img width=250 height=250 class="imgright" src="../images/sets_with_notations.webp" alt="Graphical Depiction of Sets as Circles " />

In this chapter of our tutorial, we deal with Python's implementation of sets. Though sets are nowadays an integral part of modern mathematics, this has not always been the case. Set theory was rejected by many, even by some great thinkers. One of them was the philosopher Wittgenstein. He didn't like set theory and complained that mathematics is "ridden through and through with the pernicious idioms of set theory...". He dismissed set theory as "utter nonsense", as being "laughable" and "wrong". His criticism appeared years after the death of the German mathematician Georg Cantor, the founder of set theory. David Hilbert defended it from its critics by famously declaring: "No one shall expel us from the Paradise that Cantor has created."

Cantor defined a set at the beginning of his "Beiträge zur Begründung der transfiniten Mengenlehre" as:
"A set is a gathering together into a whole of definite, distinct objects of our perception and of our thought - which are called elements of the set." Nowadays, we can say in "plain" English: A set is a well-defined collection of objects.

The elements or members of a set can be anything: numbers, characters, words, names, letters of the alphabet, even other sets, and so on. Sets are usually denoted with capital letters. This is not the exact mathematical definition, but it is good enough for the following.

The data type "set", which is a collection type, has been part of Python since version 2.4. A set contains an unordered collection of unique objects, each of which must be hashable (which in practice usually means immutable). The set data type is, as the name implies, a Python implementation of sets as they are known from mathematics. This explains why sets, unlike lists or tuples, can't have multiple occurrences of the same element.

### Sets
If we want to create a set, we can call the built-in set function with a sequence or another iterable object.

In the following example, a string is broken down into its individual characters to build the resulting set x:

In [1]:
x = set("A Python Tutorial")
x

{' ', 'A', 'P', 'T', 'a', 'h', 'i', 'l', 'n', 'o', 'r', 't', 'u', 'y'}

In [2]:
type(x)

set

We can pass a list to the built-in set function, as we can see in the following:

In [3]:
x = set(["Perl", "Python", "Java"])
x

{'Java', 'Perl', 'Python'}

Now we want to show what happens if we pass a tuple with repeated elements to the set function, in our example the city "Paris":

In [4]:
cities = set(("Paris", "Lyon", "London","Berlin","Paris","Birmingham"))
cities

{'Berlin', 'Birmingham', 'London', 'Lyon', 'Paris'}

As expected, no duplicates occur in the resulting set of cities.

### Immutable Sets
Sets are implemented in a way that doesn't allow mutable objects as elements: every element has to be hashable. The following example demonstrates that we cannot include, for example, lists as elements:

In [2]:
cities = set((["Python","Perl"], ["Paris", "Berlin", "London"]))

cities

TypeError: unhashable type: 'list'

Tuples, on the other hand, are fine:

In [None]:
cities = set((("Python","Perl"), ("Paris", "Berlin", "London")))
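Strictly speaking, the requirement is hashability rather than immutability alone. The following is a minimal sketch of that distinction; the names `pairs`, `broken` and `raised` are chosen for illustration only and do not appear elsewhere in this chapter.

```python
# A tuple of strings is hashable, so it can be a set element.
pairs = {("Python", "Perl"), ("Paris", "Berlin", "London")}
print(("Python", "Perl") in pairs)    # True

# A tuple is immutable, but a tuple containing a list is still not hashable.
raised = False
try:
    broken = {("Python", ["Perl"])}
except TypeError:
    raised = True
print(raised)                         # True
```

So a tuple is only accepted as a set element if everything inside it is hashable as well.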

### Frozensets
Though sets can't contain mutable objects, sets are mutable:

In [1]:
cities = set(["Frankfurt", "Basel","Freiburg"])
cities.add("Strasbourg")
cities

{'Basel', 'Frankfurt', 'Freiburg', 'Strasbourg'}

Frozensets are like sets except that they cannot be changed, i.e. they are immutable:

In [10]:
cities = frozenset(["Frankfurt", "Basel","Freiburg"])
cities.add("Strasbourg")

AttributeError: 'frozenset' object has no attribute 'add'
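Because a frozenset cannot change, it is hashable and can be used where an ordinary set cannot. The sketch below illustrates this under the assumption that we want sets of cities as set elements and as dictionary keys; the names `inner`, `regions` and `distances` and the stored value are purely hypothetical.

```python
# A frozenset is hashable, so it can be an element of another set ...
inner = frozenset(["Frankfurt", "Basel"])
regions = {inner, frozenset(["Paris", "Lyon"])}
print(inner in regions)                                  # True

# ... or a dictionary key. The order of the elements does not matter.
distances = {frozenset(["Frankfurt", "Basel"]): "hypothetical value"}
print(frozenset(["Basel", "Frankfurt"]) in distances)    # True

# Operations that do not modify the frozenset still work.
combined = inner | {"Freiburg"}
print(type(combined).__name__)                           # frozenset
print(len(inner))                                        # 2, inner is unchanged
```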

### Improved notation
We can define sets (since Python 2.7) without using the built-in set function. We can write the elements in curly braces instead:

In [11]:
adjectives = {"cheap","expensive","inexpensive","economical"}
adjectives

{'cheap', 'economical', 'expensive', 'inexpensive'}
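One pitfall of the brace notation is worth a short sketch: empty braces do not create an empty set. The variable names below are illustrative only.

```python
empty_dict = {}      # empty braces create a dictionary, not a set
empty_set = set()    # an empty set has to be created with set()
print(type(empty_dict).__name__)    # dict
print(type(empty_set).__name__)     # set

# Curly braces can also enclose a comprehension, which builds a set.
words = ["cheap", "expensive", "inexpensive", "economical"]
lengths = {len(word) for word in words}
print(sorted(lengths))              # [5, 9, 10, 11]
```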

### Set Operations

#### add(element)
A method which adds an element to a set. This element has to be hashable, which rules out mutable objects such as lists.


In [2]:
colours = {"red","green"}
colours.add("yellow")
colours

{'green', 'red', 'yellow'}

In [3]:
colours.add(["black","white"])

TypeError: unhashable type: 'list'

Of course, an element will only be added if it is not already contained in the set. If it is already contained, the method call has no effect.
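This behaviour can be checked with a short sketch. It also shows that add() changes the set in place and returns None; the name `result` is used for illustration only.

```python
colours = {"red", "green"}
colours.add("red")             # already present, the set stays the same
print(len(colours))            # 2

result = colours.add("blue")   # new element, the set grows
print(result)                  # None
print(len(colours))            # 3
```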

#### clear()

Removes all elements from the set.

In [13]:
cities = {"Stuttgart", "Konstanz", "Freiburg"}
cities.clear()
cities

set()

#### copy()
Creates a shallow copy, which is returned.

In [14]:
more_cities = {"Winterthur","Schaffhausen","St. Gallen"}
cities_backup = more_cities.copy()
more_cities.clear()
cities_backup 

{'Schaffhausen', 'St. Gallen', 'Winterthur'}

You might think that a simple assignment would be enough:

In [15]:
more_cities = {"Winterthur","Schaffhausen","St. Gallen"}
cities_backup = more_cities
more_cities.clear()
cities_backup

set()

The assignment "cities_backup = more_cities" does not copy anything. It just creates another name for the same set object, so clearing the set through one name empties it for both.

#### difference()

This method returns the difference of two or more sets as a new set, leaving the original set unchanged.

In [20]:
x = {"a","b","c","d","e"}
y = {"b","c"}
z = {"c","d"}
x.difference(y) 

{'a', 'd', 'e'}

In [21]:
x.difference(y).difference(z)

{'a', 'e'}

Instead of using the method difference, we can use the operator "-":

In [22]:
x - y

{'a', 'd', 'e'}

In [23]:
x - y - z

{'a', 'e'}
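The method and the operator are not completely interchangeable. The sketch below shows two differences, assuming the same sets x, y and z as above: difference() accepts several arguments and arbitrary iterables, whereas "-" only works between sets. The flag `operator_failed` is a name made up for this example.

```python
x = {"a", "b", "c", "d", "e"}
y = {"b", "c"}
z = {"c", "d"}

# Several arguments in one call give the same result as x - y - z.
print(sorted(x.difference(y, z)))          # ['a', 'e']

# The method accepts any iterable, for example a list or a string.
print(sorted(x.difference(["a", "b"])))    # ['c', 'd', 'e']
print(sorted(x.difference("de")))          # ['a', 'b', 'c']

# The operator requires a set on both sides.
operator_failed = False
try:
    x - ["a", "b"]
except TypeError:
    operator_failed = True
print(operator_failed)                     # True
```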

#### difference_update()

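The method difference_update removes all elements of another set from this set. x.difference_update(y) changes x in place and can also be written as "x -= y". The statement "x = x - y" leaves x with the same content, but it builds a new set and rebinds the name x instead of modifying the existing set.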

In [16]:
x = {"a","b","c","d","e"}
y = {"b","c"}
x.difference_update(y)   # modifies x in place
# the same content, this time computed with the "-" operator:
x = {"a","b","c","d","e"}
y = {"b","c"}
x = x - y
x

{'a', 'd', 'e'}
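The difference between changing a set in place and building a new one becomes visible as soon as a second name refers to the same set. A minimal sketch, in which `alias` is a name invented for this example:

```python
x = {"a", "b", "c", "d", "e"}
alias = x                          # a second name for the same set object

x.difference_update({"b", "c"})    # in place: alias sees the change
print(alias is x)                  # True
print(sorted(alias))               # ['a', 'd', 'e']

x = x - {"d"}                      # new set: x is rebound, alias is not
print(alias is x)                  # False
print(sorted(alias))               # ['a', 'd', 'e']
print(sorted(x))                   # ['a', 'e']
```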

#### discard(el)
An element el will be removed from the set if it is contained in the set. If el is not a member of the set, nothing happens.

In [26]:
x = {"a","b","c","d","e"}
x.discard("a")
x    

{'b', 'c', 'd', 'e'}

In [27]:
x.discard("z")
x   

{'b', 'c', 'd', 'e'}

#### remove(el)

Works like discard(), but if el is not a member of the set, a KeyError will be raised.

In [17]:
x = {"a","b","c","d","e"}
x.remove("a")
x   

{'b', 'c', 'd', 'e'}

In [18]:
x.remove("z")    

KeyError: 'z'
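If a missing element should be treated as an error that the program can react to, remove() can be combined with exception handling. The following is only a sketch of two common patterns; the flag `missing` is an illustrative name.

```python
x = {"a", "b", "c"}

# Pattern 1: catch the KeyError raised by remove().
missing = False
try:
    x.remove("z")
except KeyError:
    missing = True
print(missing)      # True

# Pattern 2: test for membership before removing.
if "a" in x:
    x.remove("a")
print(sorted(x))    # ['b', 'c']
```

When a missing element is not an error at all, discard() is the shorter choice.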

#### union(s)
This method returns the union of two sets as a new set, i.e. all elements that are in either set.

In [30]:
x = {"a","b","c","d","e"}
y = {"c","d","e","f","g"}
x.union(y)   

{'a', 'b', 'c', 'd', 'e', 'f', 'g'}

This can be abbreviated with the pipe operator "|":

In [31]:
x = {"a","b","c","d","e"}
y = {"c","d","e","f","g"}
x | y

{'a', 'b', 'c', 'd', 'e', 'f', 'g'}
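Like difference(), the union() method is more flexible than its operator. The sketch below passes several iterables at once and then uses update() and "|=", the in-place counterparts, which are not covered elsewhere in this chapter.

```python
x = {"a", "b"}
y = {"b", "c"}

# union() accepts several iterables, not only sets.
print(sorted(x.union(y, ["d"], "ef")))    # ['a', 'b', 'c', 'd', 'e', 'f']
print(sorted(x))                          # ['a', 'b'], x is unchanged

# update() and |= add the elements to x itself.
x.update(y)
x |= {"z"}
print(sorted(x))                          # ['a', 'b', 'c', 'z']
```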

#### intersection(s)
Returns the intersection of the instance set and the set s as a new set. In other words, a set with all the elements which are contained in both sets is returned.

In [32]:
x = {"a","b","c","d","e"}
y = {"c","d","e","f","g"}
x.intersection(y)

{'c', 'd', 'e'}

This can be abbreviated with the ampersand operator "&":

In [33]:
x = {"a","b","c","d","e"}
y = {"c","d","e","f","g"}
x  & y

{'c', 'd', 'e'}

#### isdisjoint()
This method returns True if two sets have no elements in common, i.e. if their intersection is empty.

In [34]:
x = {"a","b","c"}
y = {"c","d","e"}
x.isdisjoint(y)

False

In [35]:
x = {"a","b","c"}
y = {"d","e","f"}
x.isdisjoint(y) 

True
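The relationship to the intersection can be spelled out in a short sketch. The name `via_intersection` is illustrative only.

```python
x = {"a", "b", "c"}
y = {"d", "e", "f"}

# Two sets are disjoint exactly when their intersection is empty.
via_intersection = len(x & y) == 0
print(x.isdisjoint(y))              # True
print(via_intersection)             # True

# The argument may be any iterable, not only a set.
print(x.isdisjoint(["c", "z"]))     # False
```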

#### issubset()
x.issubset(y) returns True if x is a subset of y. The operator "<=" is an abbreviation for "subset of" and ">=" for "superset of". The operator "<" checks whether a set is a proper subset of another set.

In [36]:
x = {"a","b","c","d","e"}
y = {"c","d"}
x.issubset(y)

False

In [37]:
y.issubset(x)

True

In [38]:
x < y

False

In [39]:
y < x # y is a proper subset of x   

True

In [40]:
x < x # a set can never be a proper subset of itself.

False

In [41]:
x <= x 

True
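The following sketch summarises how the operators relate to each other. It also shows that two sets need not be comparable at all; the set `u` is made up for this purpose.

```python
x = {"a", "b", "c", "d", "e"}
y = {"c", "d"}

# A proper subset is a subset that is not equal to the other set.
print(y < x)                              # True
print((y <= x and y != x) == (y < x))     # True

# issubset() accepts any iterable as its argument.
print(y.issubset(["c", "d", "e"]))        # True

# Neither of these two sets is a subset of the other.
u = {"a", "z"}
print(u <= x, x <= u)                     # False False
```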

#### issuperset()
x.issuperset(y) returns True if x is a superset of y. The operator ">=" is an abbreviation for "superset of". The operator ">" checks whether a set is a proper superset of another set.

In [42]:
x = {"a","b","c","d","e"}
y = {"c","d"}
x.issuperset(y)

True

In [43]:
x > y

True

In [44]:
x >= y

True

In [45]:
x >= x   

True

In [46]:
x > x

False

In [47]:
x.issuperset(x)

True

#### pop()
pop() removes and returns an arbitrary set element. The method raises a KeyError if the set is empty. Because the element is arbitrary, the results shown below may differ on your machine.

In [48]:
x = {"a","b","c","d","e"}
x.pop()

'e'

In [49]:
x.pop()

'a'
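A typical use of pop() is to process a set until it is empty. The sketch below relies on the fact that an empty set counts as false in a condition; `seen` and `empty_error` are names chosen for this example.

```python
x = {"a", "b", "c"}
seen = []
while x:                    # an empty set is treated as False
    seen.append(x.pop())    # the order of removal is arbitrary

print(sorted(seen))         # ['a', 'b', 'c']
print(x)                    # set()

# Calling pop() on the now empty set raises a KeyError.
empty_error = False
try:
    x.pop()
except KeyError:
    empty_error = True
print(empty_error)          # True
```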