# Sets

A set is an unordered, mutable, iterable collection that contains no duplicate elements. Python’s `set` class represents the mathematical notion of a set.

**Advantages**:
 
* Membership tests (`x in some_set`) are fast: on average they take constant time, regardless of the size of the set

**Disadvantages**:

* Sets do not preserve the order of their elements
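
As a rough illustration of the two points above, the following sketch compares a membership test on a list with the same test on a set. The names `items_list`, `items_set` and `needle`, as well as the sizes, are arbitrary choices made for this example, and the measured times will differ from machine to machine.

```python
import timeit

# Hypothetical data: the same 50,000 integers stored in a list and in a set.
items_list = list(range(50_000))
items_set = set(items_list)

# Worst case for the list: the searched value sits at the very end.
needle = 49_999

list_time = timeit.timeit(lambda: needle in items_list, number=50)
set_time = timeit.timeit(lambda: needle in items_set, number=50)

# The exact numbers vary; the set lookup is expected to be much faster.
print('List lookup time:', list_time)
print('Set lookup time:', set_time)

# Duplicates are dropped when a set is built.
deduplicated = set([3, 1, 3, 2, 1])
print(len(deduplicated))  # 3
```

The printed order of a set's elements should not be relied upon, which is why the example prints only the size of the set.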

## Creating

There are several ways to create a `set` object.

In [1]:
def g(): yield from 'abc'

print('Empty set')
empty_set = set()
print(empty_set, type(empty_set))

print('\nSet from list')
set_from_iteration = set(['a', 'b', 'c'])
print(set_from_iteration, type(set_from_iteration))

print('\nSet by simplified syntax')
set_from_syntax = {'a', 'b', 'c'}
print(set_from_syntax, type(set_from_syntax))

print('\nSet from generator')
set_from_generator = set(g())
print(set_from_generator, type(set_from_generator))

print('\nSet comprehensions')
set_comprehensions = {x for x in 'abc'}
print(set_comprehensions, type(set_comprehensions))
set_from_comprehension_and_generator = set(x for x in 'abc')
print(set_from_comprehension_and_generator, type(set_from_comprehension_and_generator))

Empty set
set() <class 'set'>

Set from list
{'c', 'a', 'b'} <class 'set'>

Set by simplified syntax
{'c', 'a', 'b'} <class 'set'>

Set from generator
{'c', 'a', 'b'} <class 'set'>

Set comprehensions
{'c', 'a', 'b'} <class 'set'>
{'c', 'a', 'b'} <class 'set'>
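
One pitfall worth a short sketch: empty braces do not create a set, and a string passed to `set()` is split into characters. The variable names below are chosen only for this illustration.

```python
# Empty braces create a dict, not a set.
empty_braces = {}
print(type(empty_braces))   # <class 'dict'>

# set() is the usual way to create an empty set.
empty_set = set()
print(type(empty_set))      # <class 'set'>

# A string is iterable, so set() splits it into characters;
# braces around a string keep it as a single element.
chars = set('abc')
single = {'abc'}
print(len(chars), len(single))  # 3 1
```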


## Adding elements

Use the `add` method to add a single element and the `update` method to add every element of an iterable. Adding an element that is already in the set does not raise an error; the element is simply ignored. Both methods mutate the set in place.

In [2]:
some_set = set()
print('Set before:', some_set)

some_set.add(1)
print('Set after:', some_set)

some_set.update([2, 3, 4])
print('Set after adding array:', some_set)

some_set.update([1, 2, 3, 4])
print('Set after adding existing elements', some_set)

Set before: set()
Set after: {1}
Set after adding array: {1, 2, 3, 4}
Set after adding existing elements {1, 2, 3, 4}
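
The difference between `add` and `update` is easiest to see with a string, which is itself an iterable. A small sketch, with variable names invented for this example:

```python
words = set()
words.add('abc')        # adds the whole string as one element
print(words)            # {'abc'}

letters = set()
letters.update('abc')   # iterates over the string and adds each character
print(sorted(letters))  # ['a', 'b', 'c']

# update() also accepts several iterables at once.
numbers = {1}
numbers.update([2, 3], (4,), range(5, 7))
print(sorted(numbers))  # [1, 2, 3, 4, 5, 6]
```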


## Removing elements

To remove a single element, use `remove` or `discard`; both take the element to remove. If the element is not in the set, `remove` raises a `KeyError`, while `discard` does nothing.

To remove multiple elements, use the `difference_update` method (described later) or the `-=` operator.

To remove all elements, use the `clear` method.

The `pop` method takes no arguments; it removes and returns one element. The documentation describes the choice as arbitrary rather than random: which element is returned depends on the internal order of elements in the set. Calling `pop` on an empty set raises a `KeyError`.

All removal methods mutate the set in place.

In [3]:
some_set = set([1, 2, 3, 4, 5])
print('Before removing:', some_set)
some_set.remove(3)
print('After removing:', some_set)

print('Pop operation:', some_set.pop())
print('After pop:', some_set)

some_set.clear()

print('After clear:', some_set)

Before removing: {1, 2, 3, 4, 5}
After removing: {1, 2, 4, 5}
Pop operation: 1
After pop: {2, 4, 5}
After clear: set()
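
A short sketch of how `remove`, `discard` and `pop` behave when there is nothing to remove. The flag variables `remove_failed` and `pop_failed` exist only to make the outcome visible in this example.

```python
some_set = {1, 2, 3}

# discard: a missing element is silently ignored.
some_set.discard(99)
print(some_set == {1, 2, 3})  # True

# remove: a missing element raises KeyError.
remove_failed = False
try:
    some_set.remove(99)
except KeyError:
    remove_failed = True
print('remove raised KeyError:', remove_failed)  # True

# pop: an empty set raises KeyError as well.
pop_failed = False
try:
    set().pop()
except KeyError:
    pop_failed = True
print('pop raised KeyError:', pop_failed)  # True
```

Choosing `discard` over `remove` is a reasonable default when a missing element is not an error in the calling code.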


## Operations on sets

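In Python, each set operation comes in two versions: a non-mutating one that returns a new set, and a mutating one that changes the set in place and returns `None`. The mutating methods carry the `_update` suffix, except for union, whose in-place method is simply `update`.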

The differences between using a method and using an operator:

* Operators require both operands to be sets
* Methods accept any iterable as an argument
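
The two points above can be sketched with `union` and `|`; the same holds for the other operations. The flag `operator_failed` is introduced only for this example.

```python
a = {1, 2, 3}

# The method accepts any iterable, here a list.
print(a.union([3, 4]) == {1, 2, 3, 4})     # True

# The operator needs a set on both sides; a list raises TypeError.
try:
    a | [3, 4]
    operator_failed = False
except TypeError:
    operator_failed = True
print(operator_failed)                     # True

# Converting the other operand to a set first makes the operator work.
print((a | set([3, 4])) == {1, 2, 3, 4})   # True
```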

### Sets difference

The result contains all elements that are in the first set but not in the second.

Methods: `difference`, `difference_update`.

Operator: `-`.

In [4]:
a = set('abracadabra')
b = set('alacazam')

print('Letters in a but not in b:', a - b)
print('Letters in a but not in b:', a.difference(b))
print('Set a after above operations:', a)

print('Result of difference_update:', a.difference_update(b))
print('Set a after above difference_update:', a)

Letters in a but not in b: {'r', 'd', 'b'}
Letters in a but not in b: {'r', 'd', 'b'}
Set a after above operations: {'d', 'a', 'b', 'r', 'c'}
Result of difference_update: None
Set a after above difference_update: {'d', 'b', 'r'}


### Sets union

The resulting set contains all elements of the first set and all elements of the second set.

Methods: `union`, `update`.

Operator: `|`.

In [5]:
a = set('abracadabra')
b = set('alacazam')

print('Letters both from a and b:', a | b)
print('Letters both from a and b:', a.union(b))
print('Set a after above operations:', a)

print('Result of update:', a.update(b))
print('Set a after above update:', a)

Letters both from a and b: {'d', 'l', 'a', 'b', 'm', 'z', 'r', 'c'}
Letters both from a and b: {'d', 'l', 'a', 'b', 'm', 'z', 'r', 'c'}
Set a after above operations: {'d', 'a', 'b', 'r', 'c'}
Result of update: None
Set a after above update: {'d', 'l', 'a', 'b', 'm', 'z', 'r', 'c'}


### Sets intersections

The resulting set contains all elements that are in both the first set and the second set.

Methods: `intersection`, `intersection_update`.

Operator: `&`.

In [6]:
a = set('abracadabra')
b = set('alacazam')

print('Letters which are both in a and b:', a & b)
print('Letters which are both in a and b:', a.intersection(b))
print('Set a after above operations:', a)

print('Result of intersection_update:', a.intersection_update(b))
print('Set a after above intersection_update:', a)

Letters which are both in a and b: {'c', 'a'}
Letters which are both in a and b: {'c', 'a'}
Set a after above operations: {'d', 'a', 'b', 'r', 'c'}
Result of intersection_update: None
Set a after above intersection_update: {'c', 'a'}


### Sets symmetric difference

The result contains all elements that are in exactly one of the two sets.

Methods: `symmetric_difference`, `symmetric_difference_update`.

Operator: `^`.

In [7]:
a = set('abracadabra')
b = set('alacazam')

print('Letters which are only in the a or in the b:', a ^ b)
print('Letters which are only in the a or in the b:', a.symmetric_difference(b))
print('Set a after above operations:', a)

print('Result of symmetric_difference_update:', a.symmetric_difference_update(b))
print('Set a after above symmetric_difference_update:', a)

Letters which are only in the a or in the b: {'d', 'm', 'z', 'l', 'r', 'b'}
Letters which are only in the a or in the b: {'d', 'm', 'z', 'l', 'r', 'b'}
Set a after above operations: {'d', 'a', 'b', 'r', 'c'}
Result of symmetric_difference_update: None
Set a after above symmetric_difference_update: {'d', 'l', 'b', 'm', 'z', 'r'}
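
The `-=` operator mentioned in the section on removing elements belongs to a family of in-place operators, one for each operation in this chapter. The sketch below reuses the sets from the examples above; the result names (`diff`, `union`, `inter`, `sym`) are invented for this illustration.

```python
a = set('abracadabra')   # {'a', 'b', 'c', 'd', 'r'}
b = set('alacazam')      # {'a', 'c', 'l', 'm', 'z'}

diff = set(a)
diff -= b                # same effect as diff.difference_update(b)
print(sorted(diff))      # ['b', 'd', 'r']

union = set(a)
union |= b               # same effect as union.update(b)
print(sorted(union))     # ['a', 'b', 'c', 'd', 'l', 'm', 'r', 'z']

inter = set(a)
inter &= b               # same effect as inter.intersection_update(b)
print(sorted(inter))     # ['a', 'c']

sym = set(a)
sym ^= b                 # same effect as sym.symmetric_difference_update(b)
print(sorted(sym))       # ['b', 'd', 'l', 'm', 'r', 'z']
```

Each example starts from a copy (`set(a)`) because the in-place operators modify the set on their left-hand side.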


## Sets comparison

Methods for checking how two sets relate to each other. Each of them returns a boolean value.

The notation: **first set**`.`**method**`(`**second set**`)`

In [8]:
orginal_set = set('abcdef')
subset_of_original = set('cde')
set_share_with_original = set('defghi')
set_out_of_original = set('yvz')
larger_containg_original_set = set('abcdefghi')

### Is element in the set

Syntax support: `in`.

In [9]:
print('Is "a" in {a, b, c, d, e, f}:', 'a' in orginal_set)
print('Is "a" not in {a, b, c, d, e, f}:', 'a' not in orginal_set)
print('Is "z" in {a, b, c, d, e, f}:', 'z' in orginal_set)

Is "a" in {a, b, c, d, e, f}: True
Is "a" not in {a, b, c, d, e, f}: False
Is "z" in {a, b, c, d, e, f}: False


### Is set a subset

To check whether the first set is a subset of the second, use the `issubset` method (the equivalent operator is `<=`).

The `issuperset` method checks the same relation in the opposite direction (the equivalent operator is `>=`).

The strict operators `<` and `>` test for a proper subset and a proper superset: they additionally require that the two sets are not equal (do not contain exactly the same elements). These strict checks are available only as operators, not as methods.

In [10]:
print('Is {c, d, e} subset of {a, b, c, d, e, f}:', subset_of_original.issubset(orginal_set))
print('Is {d, e, f, g, h, i} subset of {a, b, c, d, e, f}:', set_share_with_original.issubset(orginal_set))
print('Is {y, v, z} subset of {a, b, c, d, e, f}:', set_out_of_original.issubset(orginal_set))
print('Is {a, b, c, d, e, f, g, h, i} subset of {a, b, c, d, e, f}:', larger_containg_original_set.issubset(orginal_set))
print()
print('Is {c, d, e} < {a, b, c, d, e, f}:', set('cde') < set('abcdef'))
print('Is {a, b, c, d, e, f} < {c, d, e}:', set('abcdef') < set('cde'))
print('Is {c, d, e} < {c, d, e}:', set('cde') < set('cde'))
print('Is {c, d, e} <= {c, d, e}:', set('cde') <= set('cde'))


Is {c, d, e} subset of {a, b, c, d, e, f}: True
Is {d, e, f, g, h, i} subset of {a, b, c, d, e, f}: False
Is {y, v, z} subset of {a, b, c, d, e, f}: False
Is {a, b, c, d, e, f, g, h, i} subset of {a, b, c, d, e, f}: False

Is {c, d, e} < {a, b, c, d, e, f}: True
Is {a, b, c, d, e, f} < {c, d, e}: False
Is {c, d, e} < {c, d, e}: False
Is {c, d, e} <= {c, d, e}: True


In [11]:
print('Is {c, d, e} issuperset of {a, b, c, d, e, f}:', subset_of_original.issuperset(orginal_set))
print('Is {d, e, f, g, h, i} issuperset of {a, b, c, d, e, f}:', set_share_with_original.issuperset(orginal_set))
print('Is {y, v, z} issuperset of {a, b, c, d, e, f}:', set_out_of_original.issuperset(orginal_set))
print('Is {a, b, c, d, e, f, g, h, i} issuperset of {a, b, c, d, e, f}:', larger_containg_original_set.issuperset(orginal_set))
print()
print('Is {c, d, e} > {a, b, c, d, e, f}:', set('cde') > set('abcdef'))
print('Is {a, b, c, d, e, f} > {c, d, e}:', set('abcdef') > set('cde'))
print('Is {c, d, e} > {c, d, e}:', set('cde') > set('cde'))
print('Is {c, d, e} >= {c, d, e}:', set('cde') >= set('cde'))

Is {c, d, e} issuperset of {a, b, c, d, e, f}: False
Is {d, e, f, g, h, i} issuperset of {a, b, c, d, e, f}: False
Is {y, v, z} issuperset of {a, b, c, d, e, f}: False
Is {a, b, c, d, e, f, g, h, i} issuperset of {a, b, c, d, e, f}: True

Is {c, d, e} > {a, b, c, d, e, f}: False
Is {a, b, c, d, e, f} > {c, d, e}: True
Is {c, d, e} > {c, d, e}: False
Is {c, d, e} >= {c, d, e}: True


### Sets with the same elements

The equality operator (`==`) compares sets by their elements: two sets are equal when they contain exactly the same elements. The inequality operator (`!=`) is supported as well.

In [12]:
first_set = { 1, 2, 3 }
second_set = { 1, 2, 3 }

print('Are sets equal:', first_set == second_set)
print('Is first set a subset of second:', first_set.issubset(second_set))
print('Is first set a superset of second:', first_set.issuperset(second_set))

Are sets equal: True
Is first set a subset of second: True
Is first set a superset of second: True
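
Because only the elements matter, the order in which they were written and any duplicates in the source data do not affect equality. A minimal sketch with names chosen for this example:

```python
ordered = {1, 2, 3}
reversed_order = {3, 2, 1}
with_duplicates = set([1, 1, 2, 2, 3])
smaller = {1, 2}

print(ordered == reversed_order)    # True
print(ordered == with_duplicates)   # True
print(ordered != smaller)           # True
```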


### Is set not sharing any elements with another

Method: `isdisjoint`. It returns `True` when the two sets have no elements in common.

In [13]:
print('Is {c, d, e} isdisjoint of {a, b, c, d, e, f}:', subset_of_original.isdisjoint(orginal_set))
print('Is {d, e, f, g, h, i} isdisjoint of {a, b, c, d, e, f}:', set_share_with_original.isdisjoint(orginal_set))
print('Is {y, v, z} isdisjoint of {a, b, c, d, e, f}:', set_out_of_original.isdisjoint(orginal_set))
print('Is {a, b, c, d, e, f, g, h, i} isdisjoint of {a, b, c, d, e, f}:', larger_containg_original_set.isdisjoint(orginal_set))

Is {c, d, e} isdisjoint of {a, b, c, d, e, f}: False
Is {d, e, f, g, h, i} isdisjoint of {a, b, c, d, e, f}: False
Is {y, v, z} isdisjoint of {a, b, c, d, e, f}: True
Is {a, b, c, d, e, f, g, h, i} isdisjoint of {a, b, c, d, e, f}: False
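
Put differently, two sets are disjoint exactly when their intersection is empty. The sketch below checks that equivalence on sets with the same contents as the examples above, under the shorter names `a`, `b` and `c` used only here.

```python
a = set('abcdef')
b = set('defghi')
c = set('yvz')

# isdisjoint agrees with an empty-intersection check.
print(a.isdisjoint(b), len(a & b) == 0)   # False False
print(a.isdisjoint(c), len(a & c) == 0)   # True True

# Like the other set methods, isdisjoint accepts any iterable.
print(a.isdisjoint('xyz'))                # True
```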


## Objects in set

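Instances of user-defined classes can be stored in a set by default; unless overridden, they are hashed and compared by identity.

To control how objects are treated, override the `__hash__` and `__eq__` methods. `__hash__` is used to "place" the object inside the hash table; `__eq__` decides whether two objects with the same hash count as the same element.

Implementing only `__eq__` makes the class unhashable, so adding an instance to a set raises a `TypeError`.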

In [14]:
class Node():
    def __init__(self, value):
        self.value = value
        self.other_value = (value * 45) % 4

    def __hash__(self):
        return hash(self.value)

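    def __eq__(self, other):
        # Comparing with a non-Node object should not fail on a missing attribute.
        if not isinstance(other, Node):
            return NotImplemented
        return self.value == other.value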

    def __repr__(self):
        return f'Node({self.value})'

print({ Node(1), Node(2), Node(3) })

{Node(1), Node(2), Node(3)}
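
The effect of both methods can be sketched as follows. `Node` here is a reduced copy of the class above, and `EqOnly` is a hypothetical class written for this example to reproduce the `TypeError`.

```python
class Node:
    """Same idea as the Node class above, reduced to the essentials."""
    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return hash(self.value)

    def __eq__(self, other):
        return isinstance(other, Node) and self.value == other.value


class EqOnly:
    """Hypothetical class that defines __eq__ but not __hash__."""
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, EqOnly) and self.value == other.value


# Two nodes that are equal and have equal hashes become one set element.
nodes = {Node(1), Node(1), Node(2)}
print(len(nodes))   # 2

# Without __hash__, instances are unhashable and cannot be stored in a set.
try:
    {EqOnly(1)}
    eq_only_failed = False
except TypeError as error:
    eq_only_failed = True
    print('TypeError:', error)
```

Hashing on `value` only works well as long as `value` is not changed while the object is inside a set; otherwise the object may no longer be found under its old hash.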


## Links

* [Python 3 Data Structures, Sets](https://docs.python.org/3/tutorial/datastructures.html#sets)
* [Real Python](https://realpython.com/python-sets/)
