# Sets
A set is a collection of hashable objects in which every element is unique. Adding a value that is already present leaves the set unchanged.

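An object is hashable if it has a hash value that never changes during its lifetime and it can be compared for equality with other objects. Python's immutable built-in types (such as `int`, `str`, and `tuple`) are hashable, while mutable containers (such as `list`, `dict`, and `set`) are not. A tuple is hashable only if all of its elements are hashable.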
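To make the idea of hashability concrete, the following is a minimal sketch using only built-in types. The variable names are illustrative, and the exact wording of the error message may differ between Python versions.

```python
# Immutable built-ins such as int, str and tuple are hashable.
point = (1, 2)
print(hash(point) == hash((1, 2)))  # True: equal values hash equally

# Mutable built-ins such as list are not hashable, so a set rejects them.
try:
    bad_set = {[1, 2], [3, 4]}
except TypeError as error:
    error_message = str(error)
    print(error_message)

# A tuple is only hashable when everything inside it is hashable too.
try:
    hash((1, [2, 3]))
    tuple_with_list_is_hashable = True
except TypeError:
    tuple_with_list_is_hashable = False
print(tuple_with_list_is_hashable)  # False
```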

## Sets vs Lists
Although both are collections, a list can contain duplicate values whereas a set cannot. Dictionaries can be stored in a list but not in a set, because they are not hashable.

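Unlike lists, sets are also unordered: elements have no position, so a set cannot be indexed or sorted in place. `sorted()` can still be applied to a set, but it returns a new list.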
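The differences between sets and lists can be checked with a short sketch like the one below. The names `numbers` and `records_list` are made up for illustration.

```python
numbers = {30, 10, 20}

# sorted() accepts any iterable and returns a new list; the set is unchanged.
ordered = sorted(numbers)
print(ordered)  # [10, 20, 30]

# Sets have no positions, so indexing is not supported.
try:
    numbers[0]
    supports_indexing = True
except TypeError:
    supports_indexing = False
print(supports_indexing)  # False

# A list can hold dictionaries, but a set cannot.
records_list = [{'name': 'Ana'}, {'name': 'Ana'}]
try:
    records_set = set(records_list)
    dicts_allowed = True
except TypeError:
    dicts_allowed = False
print(dicts_allowed)  # False
```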

## Common Functions
This section covers only the commonly used set functions. An exhaustive list can be found in Python's documentation:

https://docs.python.org/3/library/stdtypes.html#set-types-set-frozenset

### Insertion Functions
These functions insert data into the set. `add()` inserts a single element, while `update()` inserts every element of an iterable.

In [7]:
my_set = {1, 2, 3, 2, 1}
print(my_set)

my_set.add(4)
print(my_set)

my_set.update({1, 3, 5})
print(my_set)

my_set.update([2, 4, 6])
print(my_set)

{1, 2, 3}
{1, 2, 3, 4}
{1, 2, 3, 4, 5}
{1, 2, 3, 4, 5, 6}
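One pitfall worth illustrating is the difference between `add()` and `update()` when the argument is a string. The sketch below uses a hypothetical `letters` set and sorts the output so that it does not depend on the set's arbitrary iteration order.

```python
letters = set()

# add() inserts its argument as one element.
letters.add('abc')
print(letters)  # {'abc'}

# update() iterates over its argument, so a string is split into characters.
letters.update('de')
print(sorted(letters))  # ['abc', 'd', 'e']

# update() also accepts several iterables at once.
letters.update(['f'], ('g',))
print(sorted(letters))  # ['abc', 'd', 'e', 'f', 'g']
```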


### Deletion Functions
These functions remove data from the set.

#### pop()
Removes an arbitrary element from the set and returns it. Unlike the list's `pop()`, it takes no index, so there is no control over which element is removed. Useful for consuming a set one element at a time until it is empty.

**This raises a `KeyError` if the set is empty.**

In [11]:
my_set = {10, 73, 99, 2, 45, 68}
removed_value = my_set.pop()
print(removed_value)

2
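The empty-set case can be handled either by catching the error or by checking the set first. This is a minimal sketch with a hypothetical `queue` set; neither approach is the only valid one.

```python
queue = {10}

print(queue.pop())  # 10, the only element in the set

# The set is now empty, so another pop() raises KeyError.
try:
    queue.pop()
    pop_raised = False
except KeyError:
    pop_raised = True
print(pop_raised)  # True

# Alternatively, check that the set is non-empty before popping.
popped = queue.pop() if queue else None
print(popped)  # None
```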


#### remove() and discard()
Both remove the given element from the set. However, `remove()` raises a `KeyError` if the element is not in the set, whereas `discard()` does nothing in that case. Useful when the set needs to be cleaned.

**`remove()` raises a `KeyError` if the value is not in the set.**

In [13]:
authors = {'Bob', 'Ana', 'Kay'}
authors.remove('Bob')
print(authors)
# NOTE: Calling `authors.remove('Bob')` again will raise a `KeyError`

authors.discard('Kay')
print(authors)
authors.discard('Kay')
print(authors)

{'Ana', 'Kay'}
{'Ana'}
{'Ana'}
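The contrast between the two methods on a missing element can be sketched as follows, reusing the author names from the cell above.

```python
authors = {'Ana', 'Kay'}

# remove() raises KeyError because 'Bob' is not in the set.
try:
    authors.remove('Bob')
    remove_raised = False
except KeyError:
    remove_raised = True
print(remove_raised)  # True

# discard() silently does nothing for a missing element.
authors.discard('Bob')
print(sorted(authors))  # ['Ana', 'Kay']

# A membership check makes remove() safe to call.
if 'Kay' in authors:
    authors.remove('Kay')
print(authors)  # {'Ana'}
```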


#### clear()
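This removes all elements from the collection in place. Useful when the collection is to be reused but its existing content is no longer needed. Note that `batch` in the example below is a list rather than a set; `list.clear()` and `set.clear()` behave the same way.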

In [14]:
authors = {'Bob', 'Ana', 'Kay', 'Will', 'Cyril', 'Smith'}
def process_authors(authors):
    """
    TODO: Store the authors in the database
    """
    return authors

batch = []
for author in authors:
    batch.append(author)
    print(f'{batch=}')
    if len(batch) > 2:
        process_authors(batch)
        batch.clear()
        print(f'`batch` has been cleared: {batch=}')

if len(batch) > 0:
    process_authors(batch)
    batch.clear()
    print(f'`batch` has been cleared: {batch=}')

batch=['Cyril']
batch=['Cyril', 'Will']
batch=['Cyril', 'Will', 'Ana']
`batch` has been cleared: batch=[]
batch=['Bob']
batch=['Bob', 'Kay']
batch=['Bob', 'Kay', 'Smith']
`batch` has been cleared: batch=[]
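Since `batch` in the cell above is a list, a minimal sketch of `clear()` on an actual set may help. The names `seen_authors` and `alias` are illustrative. The point is that `clear()` empties the existing object instead of creating a new one.

```python
seen_authors = {'Bob', 'Ana', 'Kay'}
alias = seen_authors  # a second name for the same set object

seen_authors.clear()
print(seen_authors)  # set()
print(alias)  # set(), because the set was emptied in place
print(alias is seen_authors)  # True
```

Note that an empty set is printed as `set()`, because `{}` denotes an empty dictionary.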


### frozenset()
Creates an immutable copy of the set. Because a frozenset is hashable, it can be used where a regular set cannot, for example as a dictionary key or as an element of another set.

In [18]:
author_patents = {
    frozenset({'Ana'}): ['light bulb'],
    frozenset({'Ana', 'Bob'}): ['computer', 'electronics'],
    frozenset({'Bob'}): ['website']
}

print(author_patents[frozenset({'Ana', 'Bob'})])

['computer', 'electronics']
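A few properties of `frozenset` can be verified with the sketch below. The `team` and `bigger_team` names are hypothetical.

```python
team = frozenset({'Ana', 'Bob'})

# Element order does not matter for equality or for dictionary lookups.
print(team == frozenset(['Bob', 'Ana']))  # True

# A frozenset has no mutating methods such as add().
print(hasattr(team, 'add'))  # False

# Set operations still work and return a new frozenset.
bigger_team = team | {'Kay'}
print(sorted(bigger_team))  # ['Ana', 'Bob', 'Kay']
print(type(bigger_team).__name__)  # frozenset
print(sorted(team))  # ['Ana', 'Bob'], the original is untouched
```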


### copy()
Returns a shallow copy of the set as a new, independent set object. This is useful when a set needs to be updated without touching the original one.

In [19]:
def function_update_set(set_input):
    """
    A function that updates the input set
    """
    set_input.clear()

my_set = {1,2,3}
copy_set = my_set.copy()
print(f'{my_set=}')
print(f'{copy_set=}')
function_update_set(copy_set)
print(f'{my_set=}')
print(f'{copy_set=}')

my_set={1, 2, 3}
copy_set={1, 2, 3}
my_set={1, 2, 3}
copy_set=set()
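It may help to contrast `copy()` with plain assignment, which only creates a second name for the same set. A minimal sketch with illustrative variable names:

```python
original = {1, 2, 3}
alias = original             # same object, no copy is made
duplicate = original.copy()  # new set with the same elements

alias.add(4)
print(original == {1, 2, 3, 4})  # True: the change is visible through both names
print(duplicate == {1, 2, 3})    # True: the copy is unaffected
print(duplicate is original)     # False
```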


### Set Comparison Functions
These functions return the elements that two given sets have in common or that differ between them.

In [20]:
patent_1_authors = {'Bob', 'Ana'}
patent_2_authors = {'Will', 'Cyril', 'Bob'}

# Returns the values that are in
# `patent_1_authors` but not in `patent_2_authors`
print(f'{patent_1_authors.difference(patent_2_authors)=}')

# Returns the values that are in both sets
print(f'{patent_1_authors.intersection(patent_2_authors)=}')

# Returns the combined values from both sets
print(f'{patent_1_authors.union(patent_2_authors)=}')

patent_1_authors.difference(patent_2_authors)={'Ana'}
patent_1_authors.intersection(patent_2_authors)={'Bob'}
patent_1_authors.union(patent_2_authors)={'Cyril', 'Will', 'Ana', 'Bob'}
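These methods also have operator forms, and there is a fourth operation, `symmetric_difference()`, which is not shown above. The sketch below reuses the same two sets and sorts the results so the printed order is predictable.

```python
patent_1_authors = {'Bob', 'Ana'}
patent_2_authors = {'Will', 'Cyril', 'Bob'}

# Operator forms of the methods.
only_in_first = patent_1_authors - patent_2_authors   # difference()
in_both = patent_1_authors & patent_2_authors         # intersection()
in_either = patent_1_authors | patent_2_authors       # union()
in_exactly_one = patent_1_authors ^ patent_2_authors  # symmetric_difference()

print(sorted(only_in_first))   # ['Ana']
print(sorted(in_both))         # ['Bob']
print(sorted(in_either))       # ['Ana', 'Bob', 'Cyril', 'Will']
print(sorted(in_exactly_one))  # ['Ana', 'Cyril', 'Will']

# The methods accept any iterable, but the operators require sets.
print(sorted(patent_1_authors.union(['Kay'])))  # ['Ana', 'Bob', 'Kay']
try:
    patent_1_authors | ['Kay']
    operator_accepts_list = True
except TypeError:
    operator_accepts_list = False
print(operator_accepts_list)  # False
```

None of these calls modify the original sets; each returns a new set.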


### Set Conditional Functions
These functions compare two sets and return a boolean.

#### isdisjoint()
Returns `True` if the two sets have no elements in common; otherwise, returns `False`.

In [58]:
patent_1_authors = {'Bob', 'Ana'}
patent_2_authors = {'Will', 'Cyril', 'Bob'}

print(patent_1_authors.isdisjoint(patent_2_authors))

False
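For contrast, here is a sketch of a case where `isdisjoint()` returns `True`. The set `patent_3_authors` is a hypothetical third patent added only for illustration.

```python
patent_1_authors = {'Bob', 'Ana'}
patent_3_authors = {'Will', 'Cyril'}  # hypothetical: shares no author with patent 1

print(patent_1_authors.isdisjoint(patent_3_authors))  # True

# Equivalent check: the intersection of disjoint sets is empty.
print(len(patent_1_authors & patent_3_authors) == 0)  # True

# An empty set is disjoint from every set.
print(set().isdisjoint(patent_1_authors))  # True
```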


#### issubset()
Returns `True` if every element of the set is also in the other set.

In [21]:
celestial_bodies = {'Sun','Earth', 'Moon'}
solar_system_bodies = {'Sun', 'Mercury', 'Venus', 'Earth', 'Moon', 'Mars'}

print(celestial_bodies.issubset(solar_system_bodies))

True


#### issuperset()
Returns `True` if the set contains every element of the other set.

In [60]:
celestial_bodies = {'Sun','Earth', 'Moon'}
solar_system_bodies = {'Sun', 'Mercury', 'Venus', 'Earth', 'Moon', 'Mars'}

print(solar_system_bodies.issuperset(celestial_bodies))

True
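Both `issubset()` and `issuperset()` have comparison-operator forms, and the strict variants `<` and `>` test for a proper subset or superset. A minimal sketch, with `inner` and `outer` as illustrative names:

```python
inner = {'Sun', 'Earth'}
outer = {'Sun', 'Earth', 'Moon'}

print(inner <= outer)  # True, same as inner.issubset(outer)
print(outer >= inner)  # True, same as outer.issuperset(inner)

# Every set is a subset of itself, but not a proper subset.
print(inner <= inner)  # True
print(inner < inner)   # False
print(inner < outer)   # True
```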


## Other notes
When used in a condition, a non-empty set is treated as true and an empty set as false. Useful when needing to check whether the set is empty or not.

**"Treated as" means that the set satisfies an `if` or `while` condition, but its value is not itself a boolean, i.e. `{1, 2, 3} != True`.**

In [22]:
authors = {'Bob', 'Ana'}

while authors:
    author = authors.pop()
    print(author)

# Another `pop()` will raise a `KeyError` because the set is now empty
# authors.pop()

Ana
Bob
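The truthiness rule can be checked directly with `bool()`. The sketch below also shows a related pitfall: `{}` creates an empty dictionary, so an empty set has to be written as `set()`.

```python
print(bool({1, 2, 3}))  # True: non-empty sets are truthy
print(bool(set()))      # False: an empty set is falsy

# Truthy is not the same as being equal to True.
print({1, 2, 3} == True)  # False

# {} is an empty dict, not an empty set.
print(type({}).__name__)     # dict
print(type(set()).__name__)  # set
```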
