# The Set Class

The previous tutorial examined the frozenset class, which is an unordered, immutable collection of unique references. The set is its mutable counterpart and is more commonly used.

## Initialisation Signature

The initialisation signature can be viewed by inputting:

In [1]:
? set

Init signature: set(self, /, *args, **kwargs)
Docstring:
set() -> new empty set object
set(iterable) -> new set object

Build an unordered collection of unique elements.
Type:           type
Subclasses:

The set class is in Python builtins, so it can be used without an import. Called with no arguments it returns an empty set; called with an iterable it returns a set of the unique values in that iterable.

For example, a set can be initialised from an iterable such as a string:

In [2]:
set('hello')

{'e', 'h', 'l', 'o'}

The formal representation in the cell output shows the standard shorthand notation used to initialise a set. This notation is also used for a single element set:

In [3]:
set('h')

{'h'}

However, this notation cannot be used for an empty set:

In [4]:
set('')

set()

The reason for this is that braces {} are also used to enclose the dictionary class. Empty braces {} are therefore assumed to enclose an empty dictionary and not an empty set:

In [5]:
type({})

dict

There is no ambiguity for one or more elements, as a dict has a colon in every key:value pair, which is not present in a set:

In [6]:
{'h'} # set

{'h'}

In [7]:
{'h': ord('h')} # dict

{'h': 104}
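The difference between empty braces and set(), and the way duplicates collapse when a set is built from an iterable, can be sketched as follows. The variable names are chosen only for illustration:

```python
# Empty braces create a dict, so an empty set needs the set() call.
empty_braces = {}
empty_set = set()

print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set

# Any iterable can be supplied; duplicate values collapse to one entry.
from_list = set([1, 2, 2, 3, 3, 3])
from_range = set(range(3))

print(from_list == {1, 2, 3})   # True
print(from_range == {0, 1, 2})  # True
print(len(set('hello')))        # 4, the repeated 'l' is stored once
```

Comparisons with == are used here instead of printing the sets, because the display order of a set is not guaranteed.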

The mutable set is more commonly used than the immutable frozenset. Notice that the formal representation of the frozenset encloses the shorthand notation used for a set:

In [8]:
frozenset('hello')

frozenset({'e', 'h', 'l', 'o'})

## Identifiers

The set identifiers can be viewed by using the help function on the set class:

In [9]:
help(set)

Help on class set in module builtins:

class set(object)
 |  set() -> new empty set object
 |  set(iterable) -> new set object
 |  
 |  Build an unordered collection of unique elements.
 |  
 |  Methods defined here:
 |  
 |  __and__(self, value, /)
 |      Return self&value.
 |  
 |  __contains__(...)
 |      x.__contains__(y) <==> y in x.
 |  
 |  __eq__(self, value, /)
 |      Return self==value.
 |  
 |  __ge__(self, value, /)
 |      Return self>=value.
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __gt__(self, value, /)
 |      Return self>value.
 |  
 |  __iand__(self, value, /)
 |      Return self&=value.
 |  
 |  __init__(self, /, *args, **kwargs)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  __ior__(self, value, /)
 |      Return self|=value.
 |  
 |  __isub__(self, value, /)
 |      Return self-=value.
 |  
 |  __iter__(self, /)
 |      Implement iter(self).
 |  
 |  __ixor__(self, value, /)
 |      Re

These identifiers can be split into attributes (there aren't any):

In [10]:
for identifier in dir(set):
    isfunction = callable(getattr(set, identifier))
    isdatamodel = identifier[0] == '_'
    if (not isfunction and not isdatamodel):
        print(identifier, end=' ')

Data model attributes:

In [11]:
for identifier in dir(set):
    isfunction = callable(getattr(set, identifier))
    isdatamodel = identifier[0] == '_'
    if (not isfunction and isdatamodel):
        print(identifier, end=' ')

__doc__ __hash__ 

Data model methods:

In [13]:
for identifier in dir(set):
    isfunction = callable(getattr(set, identifier))
    isdatamodel = identifier[0] == '_'
    if (isfunction and isdatamodel):
        print(identifier, end=' ')

__and__ __class__ __class_getitem__ __contains__ __delattr__ __dir__ __eq__ __format__ __ge__ __getattribute__ __getstate__ __gt__ __iand__ __init__ __init_subclass__ __ior__ __isub__ __iter__ __ixor__ __le__ __len__ __lt__ __ne__ __new__ __or__ __rand__ __reduce__ __reduce_ex__ __repr__ __ror__ __rsub__ __rxor__ __setattr__ __sizeof__ __str__ __sub__ __subclasshook__ __xor__ 

Note that \_\_hash\_\_ is listed as an attribute and not a callable method. In the set class \_\_hash\_\_ is set to None, which marks the class as unhashable; this is expected because a set is mutable.
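A minimal sketch of what this means in practice is shown below. The dictionary name lookup is hypothetical and only illustrates that a frozenset can be a key while a set cannot be hashed:

```python
# set.__hash__ is None, which is how Python marks a class as unhashable.
print(set.__hash__ is None)          # True
print(callable(frozenset.__hash__))  # True

# Hashing a set therefore fails.
try:
    hash({1, 2, 3})
except TypeError as error:
    print(type(error).__name__)      # TypeError

# A frozenset is hashable, so it can be used as a dictionary key.
lookup = {frozenset({1, 2, 3}): 'first three'}
print(lookup[frozenset({3, 2, 1})])  # first three
```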

Since the set is the mutable counterpart of the frozenset, it shares the frozenset's immutable methods, which behave analogously. The methods in a set that are not present in a frozenset can be examined:

In [14]:
for identifier in dir(set):
    isfunction = callable(getattr(set, identifier))
    isinfrozenset = identifier in dir(frozenset)
    isdatamodel = identifier[0] == '_'
    if (isfunction and not isdatamodel and not isinfrozenset):
        print(identifier, end=' ')

add clear difference_update discard intersection_update pop remove symmetric_difference_update update 

In [15]:
for identifier in dir(set):
    isfunction = callable(getattr(set, identifier))
    isinfrozenset = identifier in dir(frozenset)
    isdatamodel = identifier[0] == '_'
    if (isfunction and isdatamodel and not isinfrozenset):
        print(identifier, end=' ')

__iand__ __ior__ __isub__ __ixor__ 
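As an aside, the same comparison can be sketched with a set difference, which uses the class being studied to do the filtering. The name set_only is an assumption made for this example, and the exact names printed may vary between Python versions:

```python
# Names defined on set that frozenset lacks, computed with a set difference.
set_only = set(dir(set)) - set(dir(frozenset))

# Sorting gives a stable display order, since a set itself is unordered.
for name in sorted(set_only):
    print(name, end=' ')
print()
```

This returns the regular methods and the data model methods together, so it corresponds to the combined output of the two loops above.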

## Mutable Set Methods

Four of these methods and four of the data model identifiers perform previously examined frozenset operations in place:

|Immutable Method|Immutable Data Model Method|Immutable Operator|Mutable Method|Mutable Data Model Method|Mutable Operator|
|---|---|---|---|---|---|
|union|\_\_or\_\_|\||update|\_\_ior\_\_|\|=|
|intersection|\_\_and\_\_|&|intersection_update|\_\_iand\_\_|&=|
|difference|\_\_sub\_\_|-|difference_update|\_\_isub\_\_|-=|
|symmetric_difference|\_\_xor\_\_|^|symmetric_difference_update|\_\_ixor\_\_|^=|

union and update are compared below:

In [16]:
unique1 = {0, 1, 2, 3, 4, 5, 6}
unique2 = {4, 5, 6, 7, 8, 9}

In [17]:
unique1.union(unique2) # Return value - immutable

{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}

In [18]:
unique1

{0, 1, 2, 3, 4, 5, 6}

In [19]:
unique1.update(unique2) # No return value - mutable

In [20]:
unique1

{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}

intersection and intersection_update are compared below:

In [21]:
unique1 = {0, 1, 2, 3, 4, 5, 6}
unique2 = {4, 5, 6, 7, 8, 9}

In [22]:
unique1.intersection(unique2) # Return value - immutable

{4, 5, 6}

In [23]:
unique1

{0, 1, 2, 3, 4, 5, 6}

In [24]:
unique1.intersection_update(unique2) # No return value - mutable

In [25]:
unique1

{4, 5, 6}

difference and difference_update are compared below:

In [26]:
unique1 = {0, 1, 2, 3, 4, 5, 6}
unique2 = {4, 5, 6, 7, 8, 9}

In [27]:
unique1.difference(unique2) # Return value - immutable

{0, 1, 2, 3}

In [28]:
unique1

{0, 1, 2, 3, 4, 5, 6}

In [29]:
unique1.difference_update(unique2) # No return value - mutable

In [30]:
unique1

{0, 1, 2, 3}

symmetric_difference and symmetric_difference_update are compared below:

In [31]:
unique1 = {0, 1, 2, 3, 4, 5, 6}
unique2 = {4, 5, 6, 7, 8, 9}

In [32]:
unique1.symmetric_difference(unique2) # Return value - immutable

{0, 1, 2, 3, 7, 8, 9}

In [33]:
unique1

{0, 1, 2, 3, 4, 5, 6}

In [34]:
unique1.symmetric_difference_update(unique2) # No return value - mutable

In [35]:
unique1

{0, 1, 2, 3, 7, 8, 9}
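The in-place operators from the table behave like the corresponding update methods. The sketch below applies all four in sequence to one set and checks that the same object is modified throughout; original_id is a name introduced only for this illustration:

```python
unique1 = {0, 1, 2, 3, 4, 5, 6}
unique2 = {4, 5, 6, 7, 8, 9}

original_id = id(unique1)

unique1 |= unique2  # same effect as unique1.update(unique2)
print(unique1 == {0, 1, 2, 3, 4, 5, 6, 7, 8, 9})  # True

unique1 &= unique2  # same effect as unique1.intersection_update(unique2)
print(unique1 == {4, 5, 6, 7, 8, 9})  # True

unique1 -= {4, 5}  # same effect as unique1.difference_update({4, 5})
print(unique1 == {6, 7, 8, 9})  # True

unique1 ^= {9, 10}  # same effect as unique1.symmetric_difference_update({9, 10})
print(unique1 == {6, 7, 8, 10})  # True

# The name still refers to the original object, so it was mutated in place.
print(id(unique1) == original_id)  # True
```

One practical difference is that the operators require a set or frozenset on the right-hand side, whereas the named methods accept any iterable.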

The mutable identifier remove is similar to the list identifier remove and will remove a reference from a set. Because a set can only store unique references, there is at most one occurrence of a value within the set:

In [36]:
? list.remove

Signature: list.remove(self, value, /)
Docstring:
Remove first occurrence of value.

Raises ValueError if the value is not present.
Type:      method_descriptor

In [37]:
? set.remove

Docstring:
Remove an element from a set; it must be a member.

If the element is not a member, raise a KeyError.
Type:      method_descriptor

A related set method is discard, which behaves like remove when the reference is in the set. When the reference is not in the set, discard does nothing and does not raise an error:

In [38]:
? set.discard

Docstring:
Remove an element from a set if it is a member.

Unlike set.remove(), the discard() method does not raise
an exception when an element is missing from the set.
Type:      method_descriptor

For example:

In [39]:
unique1 = {0, 1, 2, 3, 4, 5, 6}

In [40]:
unique1.remove(0) # No return value - mutable

In [41]:
unique1

{1, 2, 3, 4, 5, 6}

In [42]:
unique1.discard(1) # No return value - mutable

In [43]:
unique1

{2, 3, 4, 5, 6}

In [44]:
# unique1.remove(0)

<span style='color:red'>KeyError</span>: 0

In [45]:
unique1.discard(1) # No return value - mutable

In [46]:
unique1 # No changes as reference previously removed

{2, 3, 4, 5, 6}
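When it is not known whether a value is present, remove can be guarded with a try block, or discard can be used instead. A minimal sketch of both options:

```python
unique1 = {2, 3, 4, 5, 6}

# remove raises KeyError for a missing value, so guard it when unsure.
try:
    unique1.remove(0)
except KeyError as error:
    print('KeyError:', error)  # KeyError: 0

# discard is silent for a missing value and returns None either way.
print(unique1.discard(0))          # None
print(unique1 == {2, 3, 4, 5, 6})  # True
```

remove is the better choice when a missing value indicates a bug that should be surfaced; discard suits cases where absence is acceptable.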

The mutable set method add can be used to add a single reference to a set. This behaves similarly to the mutable list method append, but recall that a set is unordered, meaning there is no numeric index and therefore no concept of the end of a set:

In [47]:
? list.append

Signature: list.append(self, object, /)
Docstring: Append object to the end of the list.
Type:      method_descriptor

In [48]:
? set.add

Docstring:
Add an element to a set.

This has no effect if the element is already present.
Type:      method_descriptor

In [49]:
unique1 = {0, 1, 2, 3, 4, 5, 6}

In [50]:
unique1.add(7) # No return value - mutable

In [51]:
unique1

{0, 1, 2, 3, 4, 5, 6, 7}
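Two details of add are worth a short sketch: adding a value that is already present changes nothing, and add stores its argument as a single element whereas update unpacks an iterable. The set named letters is used only for illustration:

```python
unique1 = {0, 1, 2}

unique1.add(2)  # already present, so nothing changes
unique1.add(3)  # new value, so the set grows by one
print(unique1 == {0, 1, 2, 3})  # True

# add stores its argument as one element; update unpacks an iterable.
letters = set()
letters.add('hi')
print(letters == {'hi'})            # True
letters.update('hi')
print(letters == {'hi', 'h', 'i'})  # True
```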

It is more helpful to think of add as being analogous to the mutable list method append and not the list data model method \_\_add\_\_, which performs concatenation and returns a new list. The data model method \_\_add\_\_ is not defined in the set class and therefore the + operator cannot be used:

In [52]:
unique1 = {0, 1, 2, 3, 4, 5, 6}
unique2 = {4, 5, 6, 7, 8, 9}

In [53]:
# unique1 + unique2

<span style='color:red'>TypeError</span>: unsupported operand type(s) for +: 'set' and 'set'

Two sets cannot be concatenated, as a set cannot contain duplicate references. Instead recall the concept of the union of two sets, which is the closest operation to concatenation:

In [54]:
unique1 | unique2

{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}

The method pop is similar to the list method pop, but as a set is an unordered collection there is no concept of an index and an arbitrary value gets popped. Recall that most methods are either immutable and return a value, or are mutable, modify the instance in place and have no return value. pop is an exception as it mutates the original instance and returns the reference popped:

In [55]:
? list.pop

Signature: list.pop(self, index=-1, /)
Docstring:
Remove and return item at index (default last).

Raises IndexError if list is empty or index is out of range.
Type:      method_descriptor

In [56]:
? set.pop

Docstring:
Remove and return an arbitrary set element.
Raises KeyError if the set is empty.
Type:      method_descriptor

For example:

In [57]:
unique1 = {0, 1, 2, 3, 4, 5, 6}

In [58]:
unique1.pop() # Returns the reference popped

0

In [59]:
unique1 # Mutates the set in place

{1, 2, 3, 4, 5, 6}
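Because the value returned by pop is arbitrary, code should not rely on which element comes out first. The sketch below drains a set with pop and shows the KeyError raised once the set is empty; the list named popped is introduced only to collect the results:

```python
unique1 = {'a', 'b', 'c'}
popped = []

# Drain the set; the order in which values come out is not guaranteed.
while unique1:
    popped.append(unique1.pop())

print(sorted(popped))    # ['a', 'b', 'c']
print(unique1 == set())  # True

# Popping from an empty set raises KeyError.
try:
    unique1.pop()
except KeyError as error:
    print(type(error).__name__)  # KeyError
```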

The set method clear is analogous to the list method clear and clears all references from the collection:

In [60]:
? list.clear

Signature: list.clear(self, /)
Docstring: Remove all items from list.
Type:      method_descriptor

In [61]:
? set.clear

Docstring: Remove all elements from this set.
Type:      method_descriptor

For example:

In [62]:
unique1 = {0, 1, 2, 3, 4, 5, 6}

In [63]:
unique1.clear() # No return value - mutable

In [64]:
unique1

set()
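Since clear mutates the set in place, every name that refers to the same set sees the change. This differs from rebinding a name to a new empty set, as the following sketch shows; the name alias is an assumption made for the example:

```python
unique1 = {0, 1, 2}
alias = unique1            # second name for the same set

unique1.clear()            # empties the shared object in place
print(alias == set())      # True

unique1 = {0, 1, 2}
alias = unique1
unique1 = set()            # rebinds the name, the old set is untouched
print(alias == {0, 1, 2})  # True
```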