# Builtins Module: The Set Class (set)

The ```set``` class is the mutable counterpart to the immutable ```frozenset```. It is a mutable, unordered ```Collection``` of unique references to hashable ```object``` instances.

## Categorize_Identifiers Module

This notebook uses the functions ```dir2```, ```variables``` and ```view``` from the custom module ```categorize_identifiers```, which is found in the same directory as this notebook file. ```dir2``` is a variant of ```dir``` that groups identifiers into a ```dict``` under categories, ```variables``` is an IPython-based variable inspector and ```view``` is used to view a ```Collection``` in more detail:

In [1]:
from categorize_identifiers import dir2, variables, view

## Initialisation Signature

The initialisation signature of a ```set``` is generally used for creation of an empty ```set``` or type casting an iterable to a ```set``` instance:

In [2]:
set?

Init signature: set(self, /, *args, **kwargs)
Docstring:
set() -> new empty set object
set(iterable) -> new set object

Build an unordered collection of unique elements.
Type:           type
Subclasses:     LazySet, LazySet, LazySet

The top line shows the initialisation signature, in which ```self``` is the ```set``` instance being initialised:

```python
set(self, /, *args, **kwargs)
```

Because ```set``` is a ```builtins``` class, an instance can also be created using the shorthand of braces enclosing comma-separated values:

In [3]:
unique_active_archive = {'h', 'e', 'l', 'o'}

The ```set``` is an unordered ```Collection```, so the order in which the elements were written is not preserved. The default representation shown in the cell output is sorted for display, which for these characters means ordering by their ordinal values:

In [4]:
unique_active_archive

{'e', 'h', 'l', 'o'}
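
The consequences of being unordered and unique can be checked directly. The following is a minimal sketch; the names ```letters``` and ```ordered_letters``` are illustrative and not part of the notebook:

```python
letters = {'h', 'e', 'l', 'o'}

# Equality ignores the order in which the elements were written.
print(letters == {'o', 'l', 'e', 'h'})  # True

# A duplicate in the literal collapses to a single element.
print(len({'h', 'e', 'l', 'l', 'o'}))   # 4

# sorted returns a list with a deterministic order when one is needed.
ordered_letters = sorted(letters)
print(ordered_letters)                  # ['e', 'h', 'l', 'o']
```

When a predictable order matters, for example in printed output, sorting the elements into a ```list``` is usually safer than relying on the displayed representation.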

When the initialisation signature is used explicitly with an existing ```set``` instance, the instance is cast into a new ```set``` containing the same elements:

In [5]:
unique_active_archive = set({'h', 'e', 'l', 'o'})

It is more common to use the shorthand form:

In [6]:
unique_active_archive = {'h', 'e', 'l', 'o'}

Use of the ```set``` class is required to create an empty ```set```:

```python
set() -> new empty set object
```

In [7]:
unique_active__archive_empty = set()

In Python braces ```{}``` are also used to enclose a ```dict```, where each item is a ```key: value``` pair. The ```dict``` is much more commonly used than the ```set``` and therefore empty ```{}``` create an empty ```dict``` instance and not an empty ```set``` instance:

In [8]:
not_a_set = {}

In [9]:
variables()

|Instance Name|Type|Size/Shape|Value|
|---|---|---|---|
|unique_active_archive|set|4|{'l', 'o', 'h', 'e'}|
|unique_active__archive_empty|set|0|set()|
|not_a_set|dict|0|{}|
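
The distinction can also be confirmed with ```type```. This is a minimal sketch and the names ```empty_braces``` and ```empty_set``` are illustrative only:

```python
# Empty braces create a dict; the set class must be called for an empty set.
empty_braces = {}
empty_set = set()

print(type(empty_braces))  # <class 'dict'>
print(type(empty_set))     # <class 'set'>

# Both are empty, but an empty dict and an empty set are not equal.
print(empty_braces == empty_set)  # False
```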


For a ```set``` and ```dict``` that are not empty there is no ambiguity because the ```dict``` has a colon ```:``` which separates out each ```key: value``` pair:

In [10]:
unique_active_archive = {'value1', 'value2', 'value3'}

In [11]:
mapping = {'key1': 'value1', 'key2': 'value2', 'key3': 'value3'}

In [12]:
del unique_active_archive, not_a_set
variables()

|Instance Name|Type|Size/Shape|Value|
|---|---|---|---|
|unique_active__archive_empty|set|0|set()|
|mapping|dict|3|{'key1': 'value1', 'key2': 'value2', 'key3': 'value3'}|


The ```dict``` class will be examined in more detail in the next tutorial.

The initialisation signature of a ```set``` can be used to type cast an iterable such as a ```str```. Each unique element of the iterable is kept once:

```python
set(iterable) -> new set object
```

In [13]:
unique_active_archive_letters = set('hello')

In [14]:
variables()

|Instance Name|Type|Size/Shape|Value|
|---|---|---|---|
|unique_active__archive_empty|set|0|set()|
|mapping|dict|3|{'key1': 'value1', 'key2': 'value2', 'key3': 'value3'}|
|unique_active_archive_letters|set|4|{'l', 'o', 'h', 'e'}|
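
Casting is not restricted to a ```str```. The sketch below, which uses illustrative names such as ```from_tuple```, casts several kinds of iterable and shows that duplicates collapse to a single element in each case:

```python
# The same characters supplied as a str, a tuple and a list.
from_str = set('hello')
from_tuple = set(('h', 'e', 'l', 'l', 'o'))
from_list = set(['h', 'e', 'l', 'l', 'o'])

# The duplicate 'l' is dropped, so 5 characters become 4 elements.
print(len('hello'), len(from_str))          # 5 4
print(from_str == from_tuple == from_list)  # True

# Any iterable can be cast, for example a range.
from_range = set(range(3))
print(from_range == {0, 1, 2})              # True
```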


## Identifiers

The ```set``` is the mutable counterpart to the immutable ```frozenset```. Because a ```frozenset``` is immutable, its ```__hash__``` datamodel identifier is defined, making it hashable, and all of its methods have a ```return``` value:

In [15]:
frozenset.__hash__ is None

False

The ```__hash__``` identifier in the ```set``` is ```None```, meaning a ```set``` is not hashable, because it is mutable. The ```set``` has mutable methods which mutate the instance inplace and have no ```return``` value:

In [16]:
set.__hash__ is None

True
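
One practical consequence of hashability is sketched below: a ```frozenset``` can be used as a ```dict``` key whereas a ```set``` cannot be hashed. The names ```frozen```, ```mutable``` and ```lookup``` are illustrative only:

```python
frozen = frozenset({'h', 'e', 'l', 'o'})
mutable = {'h', 'e', 'l', 'o'}

# A frozenset is hashable, so it can be used as a dict key.
lookup = {frozen: 'letters'}
print(lookup[frozen])  # letters

# A set is not hashable, so hash raises a TypeError.
try:
    hash(mutable)
    set_is_hashable = True
except TypeError as error:
    set_is_hashable = False
    print(type(error).__name__)  # TypeError

# The two still compare equal because they contain the same elements.
print(frozen == mutable)  # True
```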

As the ```set``` is the mutable counterpart to the ```frozenset```, it shares the same immutable methods:

In [17]:
dir2(set, frozenset, consistent_only=True)

{'method': ['copy',
            'difference',
            'intersection',
            'isdisjoint',
            'issubset',
            'issuperset',
            'symmetric_difference',
            'union'],
 'datamodel_attribute': ['__doc__', '__hash__'],
 'datamodel_method': ['__and__',
                      '__class__',
                      '__class_getitem__',
                      '__contains__',
                      '__delattr__',
                      '__dir__',
                      '__eq__',
                      '__format__',
                      '__ge__',
                      '__getattribute__',
                      '__getstate__',
                      '__gt__',
                      '__init__',
                      '__init_subclass__',
                      '__iter__',
                      '__le__',
                      '__len__',
                      '__lt__',
                      '__ne__',
                      '__new__',
                      '__or__',
       

It also has the following mutable methods:

In [18]:
dir2(set, frozenset, unique_only=True)

{'method': ['add',
            'clear',
            'difference_update',
            'discard',
            'intersection_update',
            'pop',
            'remove',
            'symmetric_difference_update',
            'update'],
 'datamodel_method': ['__iand__', '__ior__', '__isub__', '__ixor__']}


## Mutable Set Methods

Four of these methods and four of the datamodel identifiers perform previously examined ```frozenset``` operations inplace:

|Immutable Method|Immutable Data Model Method|Immutable Operator|Mutable Method|Mutable Data Model Method|Mutable Operator|
|---|---|---|---|---|---|
|union|\_\_or\_\_|\||update|\_\_ior\_\_|\|=|
|intersection|\_\_and\_\_|&|intersection_update|\_\_iand\_\_|&=|
|difference|\_\_sub\_\_|-|difference_update|\_\_isub\_\_|-=|
|symmetric_difference|\_\_xor\_\_|^|symmetric_difference_update|\_\_ixor\_\_|^=|

```union``` and ```update``` are compared below:

In [19]:
unique1 = {0, 1, 2, 3, 4, 5, 6}
unique2 = {4, 5, 6, 7, 8, 9}

In [20]:
unique1.union(unique2) # Return value - immutable

{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}

In [21]:
unique1

{0, 1, 2, 3, 4, 5, 6}

In [22]:
unique1.update(unique2) # No return value - mutable

In [23]:
variables(['unique1', 'unique2'])

|Instance Name|Type|Size/Shape|Value|
|---|---|---|---|
|unique1|set|10|{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}|
|unique2|set|6|{4, 5, 6, 7, 8, 9}|
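
Two further details of ```update``` are sketched here: it returns ```None```, and it accepts any iterable, including several at once. The names ```collected``` and ```result``` are illustrative only:

```python
collected = {0, 1, 2}

# update mutates the set in place and returns None.
result = collected.update({2, 3})
print(result)  # None

# update accepts any iterable, and more than one in a single call.
collected.update([4, 5], range(6, 8))
print(collected == {0, 1, 2, 3, 4, 5, 6, 7})  # True
```

Because the ```return``` value is ```None```, assigning the result of ```update``` back to the instance name would lose the reference to the ```set```.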


```intersection``` and ```intersection_update``` are compared below:

In [24]:
unique1 = {0, 1, 2, 3, 4, 5, 6}
unique2 = {4, 5, 6, 7, 8, 9}

In [25]:
unique1.intersection(unique2) # Return value - immutable

{4, 5, 6}

In [26]:
unique1.intersection_update(unique2) # No return value - mutable

In [27]:
variables(['unique1', 'unique2'])

|Instance Name|Type|Size/Shape|Value|
|---|---|---|---|
|unique1|set|3|{4, 5, 6}|
|unique2|set|6|{4, 5, 6, 7, 8, 9}|


```difference``` and ```difference_update``` are compared below:

In [28]:
unique1 = {0, 1, 2, 3, 4, 5, 6}
unique2 = {4, 5, 6, 7, 8, 9}

In [29]:
unique1.difference(unique2) # Return value - immutable

{0, 1, 2, 3}

In [30]:
unique1.difference_update(unique2) # No return value - mutable

In [31]:
variables(['unique1', 'unique2'])

|Instance Name|Type|Size/Shape|Value|
|---|---|---|---|
|unique1|set|4|{0, 1, 2, 3}|
|unique2|set|6|{4, 5, 6, 7, 8, 9}|


```symmetric_difference``` and ```symmetric_difference_update``` are compared below:

In [32]:
unique1 = {0, 1, 2, 3, 4, 5, 6}
unique2 = {4, 5, 6, 7, 8, 9}

In [33]:
unique1.symmetric_difference(unique2) # Return value - immutable

{0, 1, 2, 3, 7, 8, 9}

In [34]:
unique1.symmetric_difference_update(unique2) # No return value - mutable

In [35]:
variables(['unique1', 'unique2'])

|Instance Name|Type|Size/Shape|Value|
|---|---|---|---|
|unique1|set|7|{0, 1, 2, 3, 7, 8, 9}|
|unique2|set|6|{4, 5, 6, 7, 8, 9}|
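
The inplace operators listed in the table behave like the mutable methods. The sketch below uses the illustrative names ```left```, ```right``` and ```alias``` to show that ```|=``` keeps the same instance whereas ```|``` creates a new one, and that the operators, unlike the methods, require both operands to be sets:

```python
left = {0, 1, 2, 3, 4, 5, 6}
right = {4, 5, 6, 7, 8, 9}
alias = left  # a second reference to the same set instance

# |= mutates in place, so the alias sees the change.
left |= right
same_after_inplace = alias is left
print(same_after_inplace)  # True

# | returns a new set, so the name is rebound to a different instance.
left = left | right
same_after_rebind = alias is left
print(same_after_rebind)   # False

# The operators require set operands; the methods accept any iterable.
try:
    left |= [10, 11]
    operator_accepts_list = True
except TypeError:
    operator_accepts_list = False
left.update([10, 11])
print(operator_accepts_list, 10 in left)  # False True
```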


The ```set``` mutable method ```remove``` is similar to the ```list``` mutable method ```remove``` and will remove a reference from a ```set```. Because a ```set``` can only store unique references, there is at most one occurrence of a value within the ```set```:

In [36]:
list.remove?

Signature: list.remove(self, value, /)
Docstring:
Remove first occurrence of value.

Raises ValueError if the value is not present.
Type:      method_descriptor

In [37]:
set.remove?

Docstring:
Remove an element from a set; it must be a member.

If the element is not a member, raise a KeyError.
Type:      method_descriptor

An associated ```set``` method is ```discard```, which behaves the same as ```remove``` when the reference exists in the ```set```. When the reference is not in the ```set```, ```discard``` does nothing and does not raise an error:

In [38]:
set.discard?

Docstring:
Remove an element from a set if it is a member.

Unlike set.remove(), the discard() method does not raise
an exception when an element is missing from the set.
Type:      method_descriptor

For example:

In [39]:
unique1 = {0, 1, 2, 3, 4, 5, 6}

In [40]:
unique1.remove(0) # No return value - mutable

In [41]:
variables(['unique1'])

|Instance Name|Type|Size/Shape|Value|
|---|---|---|---|
|unique1|set|6|{1, 2, 3, 4, 5, 6}|


In [42]:
unique1.discard(1) # No return value - mutable

In [43]:
variables(['unique1'])

|Instance Name|Type|Size/Shape|Value|
|---|---|---|---|
|unique1|set|5|{2, 3, 4, 5, 6}|


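Attempting to ```remove``` a reference that doesn't exist will result in a ```KeyError```:

```python
unique1.remove(0)
```

In contrast, attempting to ```discard``` a reference that doesn't exist leaves the ```set``` unchanged and raises no error: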

In [44]:
unique1.discard(1) # No return value - mutable

In [45]:
unique1 # No changes as reference previously removed

{2, 3, 4, 5, 6}
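
The difference between the two methods can be sketched with a ```try``` and ```except``` block. The names ```values``` and ```missing``` are illustrative only:

```python
values = {2, 3, 4, 5, 6}

# remove raises a KeyError carrying the missing element.
try:
    values.remove(0)
except KeyError as error:
    missing = error.args[0]
    print('KeyError for element:', missing)  # KeyError for element: 0

# discard silently does nothing when the element is absent.
values.discard(0)

# A membership test is an alternative way of guarding remove.
if 2 in values:
    values.remove(2)

print(values == {3, 4, 5, 6})  # True
```

```remove``` is usually the better choice when a missing element indicates a bug, and ```discard``` when the element may legitimately be absent.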

The ```set``` mutable method ```add``` can be used to add a single reference to a ```set``` instance. This behaves similarly to the mutable ```list``` method ```append``` but recall that a ```set``` is unordered meaning there is no numeric index and therefore no concept of the "end" of a ```set```:

In [46]:
list.append?

Signature: list.append(self, object, /)
Docstring: Append object to the end of the list.
Type:      method_descriptor

In [47]:
set.add?

Docstring:
Add an element to a set.

This has no effect if the element is already present.
Type:      method_descriptor

In [48]:
unique1 = {0, 1, 2, 3, 4, 5, 6}

In [49]:
unique1.add(7) # No return value - mutable

In [50]:
variables(['unique1'])

|Instance Name|Type|Size/Shape|Value|
|---|---|---|---|
|unique1|set|8|{0, 1, 2, 3, 4, 5, 6, 7}|
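
A few properties of ```add``` are sketched below using the illustrative name ```letters```: adding an existing element has no effect, ```add``` treats its argument as a single element whereas ```update``` iterates over it, and only hashable elements can be added:

```python
letters = {'a', 'b'}

# Adding an element that is already present has no effect.
letters.add('a')
print(len(letters))  # 2

# add stores the str as one element; update iterates over its characters.
letters.add('cd')
letters.update('ef')
print(letters == {'a', 'b', 'cd', 'e', 'f'})  # True

# Elements must be hashable, so a list cannot be added.
try:
    letters.add(['g'])
    list_was_added = True
except TypeError:
    list_was_added = False
print(list_was_added)  # False
```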


It is more helpful to think of the ```set``` mutable method ```add``` as being analogous to the mutable ```list``` method ```append``` and not the immutable ```list``` datamodel method ```__add__``` which performs concatenation. The datamodel method ```__add__``` is not defined in the ```set``` class and therefore the ```+``` operator cannot be used:

In [51]:
unique1 = {0, 1, 2, 3, 4, 5, 6}
unique2 = {4, 5, 6, 7, 8, 9}

Attempting to use the ```+``` operator between these two ```set``` instances raises a ```TypeError```:

```python
unique1 + unique2
```

Two ```set``` instances cannot be concatenated as a ```set``` cannot contain duplicate references. Instead recall the concept of ```union``` of two ```set``` instances, which is the closest operation to concatenation:

In [52]:
unique1 | unique2

{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
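
The contrast between concatenation and ```union``` is sketched below. The names ```first```, ```second```, ```combined``` and ```concatenated``` are illustrative only:

```python
first = {0, 1, 2, 3, 4, 5, 6}
second = {4, 5, 6, 7, 8, 9}

# The set class does not define __add__, so + raises a TypeError.
print(hasattr(set, '__add__'))  # False
try:
    first + second
    plus_supported = True
except TypeError:
    plus_supported = False

# Union keeps each value once.
combined = first | second

# Concatenating lists keeps the duplicates 4, 5 and 6.
concatenated = list(first) + list(second)
print(len(concatenated), len(combined))  # 13 10
```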

The ```set``` method ```pop``` is similar to the ```list``` method ```pop``` but as a ```set``` is an unordered ```Collection``` there is no concept of index and an arbitrary element gets popped. Arbitrary means the choice cannot be controlled or relied upon; it is not a random selection. Recall that most methods are either immutable and ```return``` a value or are mutable, modify the instance inplace and have no ```return``` value. ```pop``` is an exception as it mutates the original instance and also has a ```return``` value, returning the reference popped:

In [53]:
list.pop?

Signature: list.pop(self, index=-1, /)
Docstring:
Remove and return item at index (default last).

Raises IndexError if list is empty or index is out of range.
Type:      method_descriptor

In [54]:
set.pop?

Docstring:
Remove and return an arbitrary set element.
Raises KeyError if the set is empty.
Type:      method_descriptor

For example:

In [55]:
unique1 = {0, 1, 2, 3, 4, 5, 6}

In [56]:
unique1.pop() # Returns the reference popped

0

In [57]:
variables(['unique1'])

|Instance Name|Type|Size/Shape|Value|
|---|---|---|---|
|unique1|set|6|{1, 2, 3, 4, 5, 6}|
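
A common use of ```pop``` is to drain a ```set``` one element at a time. The sketch below uses the illustrative names ```pending``` and ```popped```; the order in which elements are popped should not be relied upon, so the result is sorted before being printed:

```python
pending = {'a', 'b', 'c'}
popped = []

# An empty set is falsy, so the loop ends once every element is popped.
while pending:
    popped.append(pending.pop())

print(sorted(popped))  # ['a', 'b', 'c']
print(len(pending))    # 0

# Popping from an empty set raises a KeyError.
try:
    pending.pop()
    pop_on_empty_ok = True
except KeyError:
    pop_on_empty_ok = False
print(pop_on_empty_ok)  # False
```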


The ```set``` method ```clear``` is analogous to the ```list``` method ```clear``` and clears all references from the ```Collection```:

In [58]:
list.clear?

Signature: list.clear(self, /)
Docstring: Remove all items from list.
Type:      method_descriptor

In [59]:
set.clear?

Docstring: Remove all elements from this set.
Type:      method_descriptor

For example:

In [60]:
unique1 = {0, 1, 2, 3, 4, 5, 6}

In [61]:
unique1.clear()

In [62]:
variables(['unique1'])

|Instance Name|Type|Size/Shape|Value|
|---|---|---|---|
|unique1|set|0|set()|
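
Because ```clear``` mutates the instance inplace, it differs from rebinding the instance name to a new empty ```set```. The sketch below uses the illustrative names ```cleared```, ```rebound``` and two aliases to show the difference:

```python
# clear empties the existing instance, so every reference sees the change.
cleared = {0, 1, 2}
cleared_alias = cleared
cleared.clear()
print(cleared_alias)  # set()

# Rebinding the name creates a new instance and leaves the original intact.
rebound = {0, 1, 2}
rebound_alias = rebound
rebound = set()
print(rebound_alias)  # {0, 1, 2}
```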

