<small><small><i>
All the IPython Notebooks in this lecture series by Dr. Milan Parmar are available @ **[GitHub](https://github.com/milaan9/02_Python_Datatypes)**
</i></small></small>

# Python Sets

A set is an unordered collection of items. Every set element is unique (no duplicates) and must be immutable (more precisely, hashable).

However, a set itself is mutable. We can add or remove items from it.

Sets can also be used to perform mathematical set operations like union, intersection, symmetric difference, etc.

The elements of a set can be of different data types, and they are both unordered and unindexed. Because of this, there is no guarantee that a set is printed in the same order in which its elements were written.

## How Are Sets Better Than Other Data Types?

Because a set never contains multiple occurrences of the same element, it is very useful for removing duplicate elements from a list or a tuple. Sets are also useful for computing mathematical set operations such as union and intersection.
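As a minimal sketch of the de-duplication use case mentioned above, the snippet below converts a list to a set. The list `readings` is hypothetical sample data. Since a set does not keep the original order, the sketch also shows `dict.fromkeys()` as one possible alternative when order matters.

```python
# Hypothetical sample data: a list with repeated values
readings = [3, 1, 3, 2, 1, 2, 3]

# Converting to a set drops duplicates, but the original order is not kept
unique_readings = set(readings)

# If order matters, dict.fromkeys() keeps the first occurrence of each item
# (dictionaries preserve insertion order in Python 3.7+)
ordered_unique = list(dict.fromkeys(readings))

# sorted() gives a predictable order for display
print(sorted(unique_readings))
print(ordered_unique)
```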


## Creating Python Sets

A set is created by placing all the items (elements) inside curly braces **`{}`**, separated by commas, or by using the built-in **`set()`** function.

It can have any number of items and they may be of different types (integer, float, tuple, string etc.). However, a set cannot contain mutable items such as lists, sets or dictionaries.

In [1]:
# Different types of sets in Python

# Example 1:

# set of integers
my_set = {1, 2, 3}
print(my_set)

# set of mixed datatypes
my_set = {1.0, "Hello", (1, 2, 3)}
print(my_set)

{1, 2, 3}
{1.0, (1, 2, 3), 'Hello'}


In [2]:
# Different types of sets in Python

# Example 2:

# set cannot have duplicates
# Output: {1, 2, 3, 4}
my_set = {1, 2, 3, 4, 3, 2}
print(my_set)

# we can make set from a list
# Output: {1, 2, 3}
my_set = set([1, 2, 3, 2])
print(my_set)

# set cannot have mutable items
# here [3, 4] is a mutable list
# this will cause an error.

my_set = {1, 2, [3, 4]}

{1, 2, 3, 4}
{1, 2, 3}


TypeError: unhashable type: 'list'
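The `TypeError` above mentions hashing because set elements must be hashable. The following sketch, which assumes a hypothetical helper named `can_be_set_element`, checks whether a value could be stored in a set by calling the built-in `hash()` on it.

```python
def can_be_set_element(obj):
    """Hypothetical helper: return True if obj is hashable."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

# Tuples and frozensets are hashable when everything inside them is hashable
print(can_be_set_element((1, 2)))             # True
print(can_be_set_element(frozenset({1, 2})))  # True

# Lists, sets and tuples that contain a list are not hashable
print(can_be_set_element([1, 2]))             # False
print(can_be_set_element({1, 2}))             # False
print(can_be_set_element((1, [2])))           # False
```

Note that a tuple is only usable as a set element when all of its own items are hashable.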

### Creating an empty set is a bit tricky.

Empty curly braces **`{}`** will make an empty dictionary in Python. To make a set without any elements, we use the **`set()`** function without any argument.

In [3]:
# Distinguish set and dictionary while creating empty set

# initialize a with {}
a = {}

# check data type of a
print(type(a))

# initialize a with set()
a = set()

# check data type of a
print(type(a))

<class 'dict'>
<class 'set'>


## Modifying a set in Python

Sets are mutable. However, since they are unordered, indexing has no meaning.

We cannot access or change an element of a set using indexing or slicing. Set data type does not support it.

We can add a single element using the **`add()`** method, and multiple elements using the **`update()`** method. The **`update()`** method can take tuples, lists, strings or other sets as its argument. In all cases, duplicates are avoided.

In [4]:
# initialize my_set
my_set = {1, 3}
print(my_set)

# if you uncomment line 9,
# you will get an error
# TypeError: 'set' object is not subscriptable

# my_set[0]

# add an element
# Output: {1, 2, 3}
my_set.add(2)
print(my_set)

# add multiple elements
# Output: {1, 2, 3, 4}
my_set.update([2, 3, 4])
print(my_set)

# add list and set
# Output: {1, 2, 3, 4, 5, 6, 8}
my_set.update([4, 5], {1, 6, 8})
print(my_set)

{1, 3}
{1, 2, 3}
{1, 2, 3, 4}
{1, 2, 3, 4, 5, 6, 8}
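One detail that is easy to miss is how **`add()`** and **`update()`** treat a string. The short sketch below (the variable name `letters` is only illustrative) shows that `add()` stores the string as a single element, while `update()` iterates over it and adds each character.

```python
letters = set()

# add() stores the whole string as one element
letters.add("ab")

# update() iterates over its argument, so each character is added separately
letters.update("cd")

# update() also accepts several iterables at once
letters.update(("e", "f"), {"g"})

print(sorted(letters))  # ['ab', 'c', 'd', 'e', 'f', 'g']
```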


## Removing elements from a set

A particular item can be removed from a set using the methods **`discard()`** and **`remove()`**.

The only difference between the two is that the **`discard()`** method leaves a set unchanged if the element is not present in the set. On the other hand, the **`remove()`** method raises a **`KeyError`** in that case.

The following example will illustrate this.

In [5]:
# Difference between discard() and remove()

# initialize my_set
my_set = {1, 3, 4, 5, 6}
print(my_set)

# discard an element
# Output: {1, 3, 5, 6}
my_set.discard(4)
print(my_set)

# remove an element
# Output: {1, 3, 5}
my_set.remove(6)
print(my_set)

# discard an element
# not present in my_set
# Output: {1, 3, 5}
my_set.discard(2)
print(my_set)

# remove an element
# not present in my_set
# you will get an error.
# Output: KeyError

my_set.remove(2)

{1, 3, 4, 5, 6}
{1, 3, 5, 6}
{1, 3, 5}
{1, 3, 5}


KeyError: 2
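When a missing element should be handled rather than crash the program, one option is to wrap **`remove()`** in `try`/`except`. This is a minimal sketch. The flag name `removed` is an assumption made for illustration.

```python
my_set = {1, 3, 5}

# Option 1: discard() when a missing element is not a problem
my_set.discard(2)

# Option 2: remove() inside try/except when a missing element should be noticed
try:
    my_set.remove(2)
    removed = True
except KeyError:
    removed = False

print(sorted(my_set))  # [1, 3, 5]
print(removed)         # False
```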

Similarly, we can remove and return an item using the **`pop()`** method.

Since set is an unordered data type, there is no way of determining which item will be popped. It is completely arbitrary.

We can also remove all the items from a set using the **`clear()`** method.

In [6]:
# initialize my_set
# Output: set of unique elements
my_set = set("HelloWorld")
print(my_set)

# pop an element
# Output: an arbitrary element
print(my_set.pop())

# pop another element
my_set.pop()
print(my_set)

# clear my_set
# Output: set()
my_set.clear()
print(my_set)

{'W', 'r', 'H', 'e', 'l', 'o', 'd'}
W
{'H', 'e', 'l', 'o', 'd'}
set()
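Calling **`pop()`** on an empty set raises a `KeyError`, so a common pattern is to loop while the set is non-empty. The sketch below uses hypothetical names (`pending`, `processed`) to drain a set one element at a time.

```python
pending = {"a", "b", "c"}
processed = []

# An empty set is falsy, so the loop stops once every element has been popped
while pending:
    processed.append(pending.pop())

# pop() on an empty set raises KeyError
empty_pop_failed = False
try:
    pending.pop()
except KeyError:
    empty_pop_failed = True

print(sorted(processed))  # ['a', 'b', 'c']
print(empty_pop_failed)   # True
```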


## Set Membership Test

We can test if an item exists in a set or not, using the **`in`** keyword.

In [7]:
# in keyword in a set
# initialize my_set
my_set = set("apple")

# check if 'a' is present
# Output: True
print('a' in my_set)

# check if 'p' is absent
# Output: False
print('p' not in my_set)

True
False
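Membership tests on a set take constant time on average, which makes sets a natural fit for filtering. As an illustration with hypothetical data (`stop_words` and `words` are made up for this sketch), the snippet below keeps only the words that are not in a set.

```python
# Hypothetical data for illustration
stop_words = {"a", "an", "the"}
words = ["the", "quick", "fox", "ate", "an", "apple"]

# Keep every word that is not a member of stop_words
kept = [word for word in words if word not in stop_words]

print(kept)  # ['quick', 'fox', 'ate', 'apple']
```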


## Python Set Operations

Sets can be used to carry out mathematical set operations like union, intersection, difference and symmetric difference. We can do this with operators or methods.

Let us consider the following two sets for the following operations.

```python
>>> A = {1, 2, 3, 4, 5}
>>> B = {4, 5, 6, 7, 8}
```

### Set Union

<div>
<img src="img/set1.png" width="300"/>
</div>

Union of **`A`** and **`B`** is a set of all elements from both sets.

Union is performed using the **`|`** operator. The same can be accomplished using the **`union()`** method.

In [8]:
# Example 1:

# Set union method
# initialize A and B
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}

# use | operator
# Output: {1, 2, 3, 4, 5, 6, 7, 8}
print(A | B)

{1, 2, 3, 4, 5, 6, 7, 8}


In [9]:
# Example 2:

# use union method on A
A.union(B)
# Output: {1, 2, 3, 4, 5, 6, 7, 8}

# use union method on B
B.union(A)
# Output: {1, 2, 3, 4, 5, 6, 7, 8}

{1, 2, 3, 4, 5, 6, 7, 8}
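The operator and the method are not fully interchangeable. The **`union()`** method accepts any iterable (and several at once), whereas the **`|`** operator requires both operands to be sets. A small sketch of the difference:

```python
A = {1, 2, 3}

# union() accepts lists, tuples and other iterables, and more than one of them
combined = A.union([3, 4], (5, 6))
print(sorted(combined))  # [1, 2, 3, 4, 5, 6]

# The | operator only works between sets, so a list operand raises TypeError
try:
    A | [3, 4]
    operator_accepts_list = True
except TypeError:
    operator_accepts_list = False

print(operator_accepts_list)  # False
```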

### Set Intersection

<div>
<img src="img/set2.png" width="300"/>
</div>

Intersection of **`A`** and **`B`** is a set of elements that are common in both the sets.

Intersection is performed using the **`&`** operator. The same can be accomplished using the **`intersection()`** method.

In [10]:
# Example 1:

# Intersection of sets
# initialize A and B
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}

# use & operator
# Output: {4, 5}
print(A & B)

{4, 5}


In [11]:
# Example 2:

# use intersection method on A
A.intersection(B)
# Output: {4, 5}

# use intersection method on B
B.intersection(A)
# Output: {4, 5}

{4, 5}
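Intersection also works for more than two sets. As a sketch with hypothetical data, the list `groups` below holds three sets, and unpacking it into `set.intersection()` yields the elements common to all of them.

```python
# Hypothetical data: three overlapping groups
groups = [{1, 2, 3, 4}, {2, 3, 4, 5}, {3, 4, 6}]

# Unpack the list so that the first set is intersected with all the others
common = set.intersection(*groups)
print(sorted(common))  # [3, 4]

# isdisjoint() reports whether two sets have no element in common
print({1, 2}.isdisjoint({3, 4}))  # True
```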

### Set Difference

<div>
<img src="img/set3.png" width="300"/>
</div>

The difference of set **`B`** from set **`A`**, written **`A - B`**, is the set of elements that are in **`A`** but not in **`B`**. Similarly, **`B - A`** is the set of elements in **`B`** but not in **`A`**.

Difference is performed using the **`-`** operator. The same can be accomplished using the **`difference()`** method.

In [12]:
# Example 1:

# Difference of two sets
# initialize A and B
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}

# use - operator on A
# Output: {1, 2, 3}
print(A - B)

{1, 2, 3}


In [13]:
# Example 2:

# use difference method on A
A.difference(B)
# Output: {1, 2, 3}

# use - operator on B
B - A
# Output: {8, 6, 7}

# use difference method on B
B.difference(A)
# Output: {8, 6, 7}

{8, 6, 7}

### Set Symmetric Difference

<div>
<img src="img/set4.png" width="300"/>
</div>

Symmetric difference of **`A`** and **`B`** is the set of elements that are in **`A`** or **`B`** but not in both (that is, excluding the intersection).

Symmetric difference is performed using the **`^`** operator. The same can be accomplished using the **`symmetric_difference()`** method.

In [14]:
# Example 1:

# Symmetric difference of two sets
# initialize A and B
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}

# use ^ operator
# Output: {1, 2, 3, 6, 7, 8}
print(A ^ B)

{1, 2, 3, 6, 7, 8}


In [15]:
# Example 2:

# use symmetric_difference method on A
A.symmetric_difference(B)
# Output: {1, 2, 3, 6, 7, 8}

# use symmetric_difference method on B
B.symmetric_difference(A)
# Output: {1, 2, 3, 6, 7, 8}

{1, 2, 3, 6, 7, 8}
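The definition of symmetric difference can be checked against the other operations. The sketch below verifies two equivalent ways of building it from union, intersection and difference, using the same `A` and `B` as above.

```python
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}

sym = A ^ B

# Everything in either set, minus what they share
via_union = (A | B) - (A & B)

# What is only in A, together with what is only in B
via_differences = (A - B) | (B - A)

print(sym == via_union)        # True
print(sym == via_differences)  # True
```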

## Other Python Set Methods

There are many set methods, some of which we have already used above. Here is a list of all the methods that are available with the set objects:

| Operation | Operator | Method | Description |
|:---- |:---- |:---- | :--- |
| **add** |  | **`add()`** |  Adds an element to the set. | 
| **copy** |  | **`copy()`** |  Returns a copy of the set. | 
| **remove** |  | **`remove()`** |  Removes an element from the set. If the element is not a member, raises a **`KeyError`**. |
| **discard** |  | **`discard()`** |  Removes an element from the set if it is a member. (Do nothing if the element is not in set). | 
| **pop** |  | **`pop()`** |  Removes and returns an arbitrary set element. Raises **`KeyError`** if the set is empty. | 
| **clear** |  | **`clear()`** |  Removes all elements from the set. | 
| **union** | **`A \| B`** | **`A.union(B)`** |  Returns the union of sets in a new set. | 
| **update** | **`A \|= B`** | **`A.update(B)`** | Updates the set with the union of itself and others. |
| **intersection** | **`A & B`** | **`A.intersection(B)`** | Returns the intersection of two sets as a new set. |
| **intersection_update** | **`A &= B`** | **`A.intersection_update(B)`** | Updates the set with the intersection of itself and another. |
| **disjoint** |  | **`isdisjoint()`** | Returns **`True`** if two sets have a null intersection. |
| **difference** | **`A - B`** | **`A.difference(B)`** | Returns the difference of two or more sets as a new set. |
| **difference_update** | **`A -= B`** | **`A.difference_update(B)`** | Removes all elements of another set from this set. |
| **symmetric_difference** | **`A ^ B`** | **`A.symmetric_difference(B)`** | Returns the symmetric difference of two sets as a new set. |
| **symmetric_difference_update** | **`A ^= B`** | **`A.symmetric_difference_update(B)`** | Updates a set with the symmetric difference of itself and another. |
| **subset** | **`A <= B`** | **`A.issubset(B)`** | Returns **`True`** if another set contains this set. |
| **superset** | **`A >= B`** | **`A.issuperset(B)`** | Returns **`True`** if this set contains another set. |
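The `*_update` methods in the table modify a set in place instead of returning a new one. The following sketch works on copies (the names `C`, `D` and `E` are only illustrative) so that the original `A` stays unchanged, and then tries the subset, superset and disjoint checks.

```python
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}

# Work on copies so that A itself is not modified
C = A.copy()
C.intersection_update(B)          # same effect as C &= B

D = A.copy()
D.difference_update(B)            # same effect as D -= B

E = A.copy()
E.symmetric_difference_update(B)  # same effect as E ^= B

print(sorted(C))  # [4, 5]
print(sorted(D))  # [1, 2, 3]
print(sorted(E))  # [1, 2, 3, 6, 7, 8]

# Relationship checks
print(C.issubset(A))    # True
print(A.issuperset(C))  # True
print(D.isdisjoint(B))  # True
```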

## Python Frozenset

A frozenset is a built-in type that has the characteristics of a set, but its elements cannot be changed once assigned. Just as tuples behave like immutable lists, frozensets are immutable sets.

Sets being mutable are unhashable, so they can't be used as dictionary keys. On the other hand, frozensets are hashable and can be used as keys to a dictionary.

Frozensets can be created using the **`frozenset()`** function.

The **`frozenset()`** function returns an immutable frozenset object initialized with elements from the given iterable.

Syntax:

**`frozenset([iterable])`**
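Because frozensets are hashable, they can serve as dictionary keys where an ordinary set cannot. In the sketch below, the dictionary `weights` and its values are hypothetical. It maps unordered pairs of labels to a number.

```python
# Hypothetical data: a value attached to each unordered pair of labels
weights = {
    frozenset({"a", "b"}): 5,
    frozenset({"b", "c"}): 2,
}

# The order in which the pair is written does not matter
print(weights[frozenset({"b", "a"})])  # 5

# A regular set is unhashable, so using it as a key raises TypeError
try:
    bad = {set(["a", "b"]): 5}
    set_key_allowed = True
except TypeError:
    set_key_allowed = False

print(set_key_allowed)  # False
```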

In [16]:
# Example 1:

# Frozensets
# initialize A and B
A = frozenset([1, 2, 3, 4])
B = frozenset([3, 4, 5, 6])

In [17]:
# Example 2: frozenset() for Dictionary

# When you use a dictionary as an iterable for a frozen set, 
# it only takes keys of the dictionary to create the set.

# random dictionary
person = {"name": "John", "age": 23, "sex": "male"}

fSet = frozenset(person)
print('The frozen set is:', fSet)

The frozen set is: frozenset({'sex', 'name', 'age'})


In [18]:
# Example 3:

# tuple of vowels
vowels = ('a', 'e', 'i', 'o', 'u')

fSet = frozenset(vowels)
print('The frozen set is:', fSet)
print('The empty frozen set is:', frozenset())

# frozensets are immutable
fSet.add('v')

The frozen set is: frozenset({'u', 'i', 'e', 'a', 'o'})
The empty frozen set is: frozenset()


AttributeError: 'frozenset' object has no attribute 'add'

In [19]:
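# Example 4:

# A and B (from Example 1) share 3 and 4, so this is False
A.isdisjoint(B)

# frozenset({1, 2})
A.difference(B)

# frozenset({1, 2, 3, 4, 5, 6})
A | B

# frozensets are immutable, so add() is not available
A.add(3)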


AttributeError: 'frozenset' object has no attribute 'add'

### Frozenset methods

This data type supports methods like **`copy()`**, **`difference()`**, **`intersection()`**, **`isdisjoint()`**, **`issubset()`**, **`issuperset()`**, **`symmetric_difference()`** and **`union()`**. Being immutable, it does not have methods that add or remove elements.

In [20]:
# Frozensets
# initialize A and B
A = frozenset([1, 2, 3, 4])
B = frozenset([3, 4, 5, 6])

# copying a frozenset
C = A.copy()  # Output: frozenset({1, 2, 3, 4})
print(C)

# union
print(A.union(B))  # Output: frozenset({1, 2, 3, 4, 5, 6})

# intersection
print(A.intersection(B))  # Output: frozenset({3, 4})

# difference
print(A.difference(B))  # Output: frozenset({1, 2})

# symmetric_difference
print(A.symmetric_difference(B))  # Output: frozenset({1, 2, 5, 6})

frozenset({1, 2, 3, 4})
frozenset({1, 2, 3, 4, 5, 6})
frozenset({3, 4})
frozenset({1, 2})
frozenset({1, 2, 5, 6})


### Other Frozenset methods 

Similarly, other set methods like **`isdisjoint()`**, **`issubset()`**, and **`issuperset()`** are also available.

In [21]:
# Frozensets
# initialize A, B and C
A = frozenset([1, 2, 3, 4])
B = frozenset([3, 4, 5, 6])
C = frozenset([5, 6])

# isdisjoint() method
print(A.isdisjoint(C))  # Output: True

# issubset() method
print(C.issubset(B))  # Output: True

# issuperset() method
print(B.issuperset(C))  # Output: True

True
True
True
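The comparison operators offer the same checks as **`issubset()`** and **`issuperset()`**, and add a strict variant. In this sketch, `<` tests for a proper subset, which is false when the two sets are equal.

```python
B = frozenset([3, 4, 5, 6])
C = frozenset([5, 6])

print(C <= B)  # True: same as C.issubset(B)
print(B >= C)  # True: same as B.issuperset(C)

# < is a proper-subset test: a subset that is not equal to the other set
print(C < B)   # True
print(B <= B)  # True
print(B < B)   # False
```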
