# Sets

A set is a data structure that stores unique, hashable elements in no particular order.

In [1]:
s = set() # create an empty set
s

set()

In [2]:
s = set((1,2,3,4,5)) # create a set from a tuple
s = set([1,2,3,4,5]) # create a set from a list
s = {1, 2, 3, 4, 5} # another way to create a set
s

{1, 2, 3, 4, 5}

In [3]:
s = {1, 2, 3, 4, 5, "test", ()}
s

{(), 1, 2, 3, 4, 5, 'test'}

In [4]:
s.add("new test") # add a new element
s

{(), 1, 2, 3, 4, 5, 'new test', 'test'}

In [5]:
s.add(3)
s # 3 is already in the set, so nothing changes

{(), 1, 2, 3, 4, 5, 'new test', 'test'}

In [6]:
s.add([]) # can't add list as it's not hashable

TypeError: unhashable type: 'list'
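Hashability is what decides whether a value can be stored in a set. As a small illustration (the name `points` is only for this sketch), a tuple can stand in for a list, but a tuple that contains a list is still unhashable:

```python
points = {(0, 0), (1, 2)}

# A tuple of hashable items is itself hashable, so it can be added.
points.add((3, 4))

# A tuple that contains a list is not hashable, because its hash
# is computed from the hashes of its items.
try:
    points.add((5, [6]))
    nested_ok = True
except TypeError:
    nested_ok = False

print((3, 4) in points, nested_ok)  # True False
```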

In [7]:
s.discard(3) # remove element from set
s

{(), 1, 2, 4, 5, 'new test', 'test'}
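Besides discard, sets offer remove and pop. The sketch below (with a throwaway set called `numbers`) shows how they differ: discard silently ignores a missing element, remove raises KeyError, and pop removes and returns an arbitrary element.

```python
numbers = {1, 2, 3}

numbers.discard(10)        # 10 is not in the set: nothing happens

try:
    numbers.remove(10)     # remove insists that the element exists
except KeyError as exc:
    missing = exc.args[0]  # the key that was not found

popped = numbers.pop()     # removes and returns some element of the set

print(missing)             # 10
print(len(numbers))        # 2
```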

In [8]:
first_part = set((1,2,3,4,5))
second_part = set((6,7,8,9,10))
first_part.union(second_part) # union of two sets

{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

In [9]:
first_part.intersection(second_part) # intersection of two sets

set()
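The same operations are also available as operators. The example below uses two illustrative sets, `small` and `evens`, that do overlap, so the intersection is not empty:

```python
small = {1, 2, 3, 4, 5}
evens = {2, 4, 6, 8}

union = small | evens            # same as small.union(evens)
common = small & evens           # same as small.intersection(evens)
only_small = small - evens       # same as small.difference(evens)
either_not_both = small ^ evens  # same as small.symmetric_difference(evens)

print(sorted(union))             # [1, 2, 3, 4, 5, 6, 8]
print(sorted(common))            # [2, 4]
print(sorted(only_small))        # [1, 3, 5]
print(sorted(either_not_both))   # [1, 3, 5, 6, 8]

# The methods accept any iterable, while the operators require sets.
from_list = small.union([6, 7])
print(sorted(from_list))         # [1, 2, 3, 4, 5, 6, 7]
```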

Set comprehensions: as with list comprehensions, Python provides syntactic sugar for building sets:

In [10]:
{s for s in [1, 2, 1, 0]}

{0, 1, 2}

In [11]:
{s**2 for s in [1, 2, 1, 0]}

{0, 1, 4}

In [12]:
{s**2 for s in range(10)}

{0, 1, 4, 9, 16, 25, 36, 49, 64, 81}

You can also filter elements with an if clause:

In [13]:
{s for s in [1, 2, 3] if s % 2}

{1, 3}

Use multiple for loops:

In [14]:
{(m, n) for n in range(2) for m in range(3, 5)}

{(3, 0), (3, 1), (4, 0), (4, 1)}

### Complexity of basic operations in Big O notation

The average-case running time of set operations in Big O notation looks like this (s and t are sets, n is the number of elements in the set):

In [None]:
Operation               Average case
len                     O(1)
add                     O(1)
membership test (in)    O(1)
.remove                 O(1)
.discard                O(1)
.pop                    O(1)
.clear                  O(n)
creation from t         O(len(t))
Checks ==, !=           O(len(s))
s.issubset(t)           O(len(s))
s.issuperset(t)         O(len(t))
.union                  O(len(s)+len(t))
.intersection           O(min(len(s), len(t)))
.difference             O(len(s))
.symmetric_difference   O(len(s))
iteration               O(n)
.copy                   O(n)
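To make the O(1) membership test concrete without relying on timings, the sketch below counts equality comparisons. The `Counted` class is a hypothetical helper written for this example; it wraps an integer and increments a counter each time == is evaluated.

```python
class Counted:
    """Wraps an int and counts how many times == is evaluated."""
    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return hash(self.value)

    def __eq__(self, other):
        Counted.comparisons += 1
        return self.value == other.value


n = 1000
items = [Counted(i) for i in range(n)]
target = Counted(n - 1)          # equal to the last item, but a distinct object

Counted.comparisons = 0
found_in_list = target in items  # scans the list from front to back
list_comparisons = Counted.comparisons

as_set = set(items)
Counted.comparisons = 0
found_in_set = target in as_set  # goes to the slot selected by the hash
set_comparisons = Counted.comparisons

print(found_in_list, found_in_set)
print(list_comparisons, set_comparisons)
```

The list has to compare the target with every element before the match, while the set only compares it with elements whose hash is the same, so the second count stays small no matter how large n is.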

### How sets are arranged inside Python

Internally, a set is a hash table whose slots hold the elements. In CPython, a set is represented by the following C structure:

In [None]:
typedef struct {
    PyObject_HEAD
    Py_ssize_t fill;
    Py_ssize_t used;
    Py_ssize_t mask;
    setentry *table;
    Py_hash_t hash;
    Py_ssize_t finger;
    setentry smalltable[PySet_MINSIZE];
    PyObject *weakreflist;
} PySetObject;

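Where:

PyObject_HEAD - a C macro used when declaring new types that represent objects without variable length;

Py_ssize_t fill - the number of active and dummy entries (a dummy entry marks a slot whose element was removed);

Py_ssize_t used - the number of active entries;

Py_ssize_t mask - the table size minus one; the table always has mask + 1 slots, which is a power of two;

setentry *table - a pointer to the table where the data is stored;

Py_hash_t hash - the cached hash value, used only by frozenset objects;

Py_ssize_t finger - the search finger: the position from which pop() starts looking for an element to remove;

setentry smalltable[PySet_MINSIZE] - a fixed-size table embedded in the object and used for small sets, which avoids a separate memory allocation;

PyObject *weakreflist - the list of weak references to the set.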
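The roles of fill, used, mask and dummy entries can be sketched with a toy hash set in pure Python. This is a simplified model under stated assumptions, not CPython's algorithm: the `ToySet` class, its plain linear probing and its 60% growth threshold are all choices made for this illustration.

```python
class ToySet:
    """Toy open-addressing hash set (a simplified model, not CPython's code)."""

    _EMPTY = object()  # marks a slot that has never been used
    _DUMMY = object()  # marks a slot whose element was removed

    def __init__(self):
        self.mask = 7                              # table has mask + 1 slots
        self.table = [self._EMPTY] * (self.mask + 1)
        self.used = 0                              # active entries
        self.fill = 0                              # active + dummy entries

    def _probe(self, key):
        """Yield the slot indices to inspect for key (linear probing)."""
        i = hash(key) & self.mask
        while True:
            yield i
            i = (i + 1) & self.mask

    def __contains__(self, key):
        for i in self._probe(key):
            slot = self.table[i]
            if slot is self._EMPTY:                # end of the probe chain
                return False
            if slot is not self._DUMMY and slot == key:
                return True

    def add(self, key):
        if key in self:
            return
        # Grow while the table is under 60% full, so an empty slot always exists.
        if (self.fill + 1) * 5 >= (self.mask + 1) * 3:
            self._resize()
        for i in self._probe(key):
            slot = self.table[i]
            if slot is self._EMPTY:
                self.table[i] = key
                self.used += 1
                self.fill += 1
                return
            if slot is self._DUMMY:                # reuse a dummy slot
                self.table[i] = key
                self.used += 1                     # fill already counts it
                return

    def discard(self, key):
        for i in self._probe(key):
            slot = self.table[i]
            if slot is self._EMPTY:
                return
            if slot is not self._DUMMY and slot == key:
                self.table[i] = self._DUMMY        # keep the probe chain intact
                self.used -= 1
                return

    def _resize(self):
        """Double the table and reinsert active entries, dropping dummies."""
        active = [s for s in self.table
                  if s is not self._EMPTY and s is not self._DUMMY]
        self.mask = (self.mask + 1) * 2 - 1
        self.table = [self._EMPTY] * (self.mask + 1)
        self.used = self.fill = 0
        for key in active:
            self.add(key)


toy = ToySet()
for value in (1, 2, 3, 3):
    toy.add(value)
toy.discard(2)
print(toy.used, toy.fill)  # 2 3
```

After the discard, used drops to 2 while fill stays at 3, because the removed element leaves a dummy entry behind. Replacing the slot with an empty marker instead would cut the probe chain and could hide elements stored further along it.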

### Application in practice

In practice, sets are used to quickly test whether an element belongs to a collection, to insert or remove values cheaply, and to compute the union or intersection of two collections.
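A typical use of the fast membership test is removing duplicates while keeping the original order. The helper `unique_in_order` below is a hypothetical name for this sketch; it remembers the elements it has already seen in a set:

```python
def unique_in_order(items):
    """Return the items without duplicates, keeping first occurrences."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:  # O(1) on average, unlike a list lookup
            seen.add(item)
            result.append(item)
    return result


print(unique_in_order([3, 1, 3, 2, 1]))  # [3, 1, 2]
```

The elements must be hashable for this to work, since each of them is stored in the `seen` set.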

### Frozen Set

A frozenset is the immutable version of a set. Like a regular set, it stores unique, hashable elements, but elements cannot be added or removed after it is created. Because it is immutable, a frozenset is itself hashable.

In [16]:
a = frozenset((1,2,3))
a

frozenset({1, 2, 3})

In [17]:
a.add(2)

AttributeError: 'frozenset' object has no attribute 'add'
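Since a frozenset can be hashed, it can serve as a dictionary key or as an element of another set, which a regular set cannot. A short sketch with made-up data (`route_cost` and `groups` are illustrative names):

```python
# The order of the elements does not matter, so both spellings are the same key.
route_cost = {frozenset({"A", "B"}): 5}
cost = route_cost[frozenset({"B", "A"})]
print(cost)         # 5

# A set of sets: the elements must be frozensets.
groups = {frozenset({1, 2}), frozenset({2, 1}), frozenset({3})}
print(len(groups))  # 2

# Operations that do not modify the object still work and return a new frozenset.
extended = frozenset({1, 2}) | {3}
print(extended == frozenset({1, 2, 3}))  # True
```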