## Sets in Python

In Python, a set is an unordered collection of unique data items. In other words, a Python set is a collection of elements (or objects) that contains no duplicates.

Unlike a list, a Python set doesn't maintain the order of its elements, i.e., it is an unordered data set. So you cannot access elements by their index or insert an item at a given index.
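As a minimal sketch of the point above, the following example tries to index a set. The set name `numbers` is only illustrative and does not appear elsewhere in this notebook.

```python
# Illustrative set; the name `numbers` is an assumption for this sketch
numbers = {10, 20, 30}

try:
    numbers[0]  # sets do not support indexing
except TypeError as exc:
    error_message = str(exc)

print(error_message)
# 'set' object is not subscriptable
```

Items of a set are reached by iterating over it or by testing membership with `in`, as the later sections show.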

![image.png](attachment:image.png)

In [1]:
# Creating a set from list
# list with duplicate items
number_list = [20, 30, 20, 30, 50, 30]
# create a set from a list
sample_set = set(number_list)

print(sample_set)
# Output {50, 20, 30}

{50, 20, 30}


## Creating a set with mutable elements


In [2]:
# set of mutable types
sample_set = {'Mark', 'Jessa', [35, 78, 92]}
print(sample_set)
# Output TypeError: unhashable type: 'list'

TypeError: unhashable type: 'list'
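The error above comes from the list being mutable and therefore unhashable. One possible workaround, sketched here under the assumption that the marks do not need to change later, is to store them as a tuple instead.

```python
# A tuple is hashable as long as all of its items are hashable,
# so it can be stored inside a set where a list cannot
sample_set = {'Mark', 'Jessa', (35, 78, 92)}

print(len(sample_set))
# 3
print((35, 78, 92) in sample_set)
# True
```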

In [3]:
# Empty Set
empty_set = set()
print(type(empty_set))


<class 'set'>


**When a pair of curly brackets is written without any items inside, Python creates a dictionary, which is another built-in data structure, and not a set.

So whenever you want to create an empty set, always use the set() constructor.**

In [4]:
emptySet = {}
print(type(emptySet)) # class 'dict'

<class 'dict'>


## Accessing the items of a set

The items of a set are unordered and don't have an index number. To access the items of a set, we need to iterate through the set object using a for loop.

In [5]:
book_set = {"Harry Potter", "Angels and Demons", "Atlas Shrugged"}
for book in book_set:
    print(book)

Angels and Demons
Harry Potter
Atlas Shrugged


## Checking if an item exists in Set


In [6]:
book_set = {"Harry Potter", "Angels and Demons", "Atlas Shrugged"}
if 'Harry Potter' in book_set:
    print("Book exists in the book set")
else:
    print("Book doesn't exist in the book set")
# Output Book exists in the book set

# check another item which is not present inside a set
print("A Man called Ove" in book_set)  
# Output False

Book exists in the book set
False


## Finding length of set

In [10]:
# finding length of set
print(len(book_set))

3


## Adding items to a Set

The value of an existing item in a set can't be modified, but we can add new items to the set in the following two ways.

1. The add() method: The add() method is used to add one item to the set.

2. The update() method: The update() method is used to add multiple items to the set. We need to pass an iterable of items, such as a list, to the update() method.

In [None]:
book_set = {'Harry Potter', 'Angels and Demons'}
# add() method
book_set.add('The God of Small Things')
# display the updated set
print(book_set)
# Output {'Harry Potter', 'The God of Small Things', 'Angels and Demons'}

# update() method to add more than one item
book_set.update(['Atlas Shrugged', 'Ulysses'])
# display the updated set
print(book_set)
# Output {'The God of Small Things', 'Angels and Demons', 'Atlas Shrugged', 'Harry Potter', 'Ulysses'}
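
One detail worth a short sketch: add() stores its argument as a single item, while update() iterates over each argument. The set name `letters` below is only illustrative.

```python
letters = set()

letters.add('abc')                 # the whole string becomes one item
letters.update('xy')               # a string is iterated: adds 'x' and 'y'
letters.update(('p', 'q'), {'r'})  # several iterables can be passed at once

print(sorted(letters))
# ['abc', 'p', 'q', 'r', 'x', 'y']
```

So passing a plain string to update() adds its characters, not the string itself.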

## Removing item(s) from a set


In [11]:
color_set = {'red', 'orange', 'yellow', 'white', 'black'}

# remove single item
color_set.remove('yellow')
print(color_set)
# Output {'red', 'orange', 'white', 'black'}

# remove single item from a set without raising an error
color_set.discard('white')
print(color_set)
# Output {'orange', 'black', 'red'}

# remove and return an arbitrary item from a set
deleted_item = color_set.pop()
print(deleted_item)

# remove all items
color_set.clear()
print(color_set)
# output set()

# delete a set
del color_set

{'black', 'red', 'white', 'orange'}
{'black', 'red', 'orange'}
black
set()


## remove() vs discard()

1. The remove() method raises a KeyError if the item you want to delete is not present in the set.

2. The discard() method does not raise any error if the item you want to delete is not present in the set.
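
The difference between the two methods can be sketched as follows. The item `'purple'` is chosen only because it is not in the example set.

```python
color_set = {'red', 'orange'}

# discard(): the item is missing, so nothing happens
color_set.discard('purple')

# remove(): the item is missing, so a KeyError is raised
try:
    color_set.remove('purple')
except KeyError as exc:
    missing_item = exc.args[0]
    print('KeyError for', missing_item)
# KeyError for purple

print(color_set == {'red', 'orange'})
# True
```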

## Set Operations


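**Union of sets**

The union of two sets returns a new set containing every item from both sets, with common items stored only once. It can be performed using the | operator or the union() method.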

![image.png](attachment:image.png)

In [12]:
color_set = {'violet', 'indigo', 'blue', 'green', 'yellow'}
remaining_colors = {'indigo', 'orange', 'red'}

# union of two set using OR operator
vibgyor_colors = color_set | remaining_colors
print(vibgyor_colors)
# Output {'indigo', 'blue', 'violet', 'yellow', 'red', 'orange', 'green'}

# union using union() method
vibgyor_colors = color_set.union(remaining_colors)
print(vibgyor_colors)
# Output {'indigo', 'blue', 'violet', 'yellow', 'red', 'orange', 'green'}

{'indigo', 'red', 'yellow', 'violet', 'orange', 'blue', 'green'}
{'indigo', 'red', 'yellow', 'violet', 'orange', 'blue', 'green'}
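
The operator and the method are not fully interchangeable. As a small sketch with illustrative names, union() accepts any iterables, while the | operator requires both operands to be sets.

```python
primary = {'red', 'blue'}
more_colors = ['green', 'red']  # a list, not a set

# union() accepts one or more iterables of any kind
combined = primary.union(more_colors, ('yellow',))
print(sorted(combined))
# ['blue', 'green', 'red', 'yellow']

# the | operator raises TypeError when the other operand is not a set
try:
    primary | more_colors
except TypeError:
    operator_accepts_list = False
else:
    operator_accepts_list = True
print(operator_accepts_list)
# False
```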


## Intersection of Sets

The intersection of two sets will return only the common elements in both sets. The intersection can be done using the & operator and intersection() method.

The intersection() method will return a new set with only the common elements in all the sets. Use this method to find the common elements between two or more sets.

The following image shows the intersection operation of two sets A and B.

![image.png](attachment:image.png)

In [13]:
color_set = {'violet', 'indigo', 'blue', 'green', 'yellow'}
remaining_colors = {'indigo', 'orange', 'red'}

# intersection of two set using & operator
new_set = color_set & remaining_colors
print(new_set)
# Output {'indigo'}

# using intersection() method
new_set = color_set.intersection(remaining_colors)
print(new_set)
# Output {'indigo'}


{'indigo'}
{'indigo'}
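
As noted above, intersection() also works with more than two sets. A minimal sketch, using three illustrative sets of numbers:

```python
set_a = {1, 2, 3, 4}
set_b = {2, 3, 4, 5}
set_c = {3, 4, 6}

# pass several sets to intersection() ...
common = set_a.intersection(set_b, set_c)
print(sorted(common))
# [3, 4]

# ... or chain the & operator
print(sorted(set_a & set_b & set_c))
# [3, 4]
```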


## Intersection update
In addition to the above intersection() method, we have one more method called intersection_update().

There are two key differences between intersection() and intersection_update()

1. intersection() will not update the original set but intersection_update() will update the original set with only the common elements.

2. intersection() will have a return value which is the new set with common elements between two or more sets whereas intersection_update() will not have any return value.

In [14]:
color_set = {'violet', 'indigo', 'blue', 'green', 'yellow'}
remaining_colors = {'indigo', 'orange', 'red'}

# intersection of two sets
common_colors = color_set.intersection(remaining_colors)
print(common_colors)  # output {'indigo'}
# original set after intersection
print(color_set)
# Output {'indigo', 'violet', 'green', 'yellow', 'blue'}

# intersection of two sets using intersection_update()
color_set.intersection_update(remaining_colors)
# original set after intersection
print(color_set)
# output {'indigo'}

{'indigo'}
{'indigo', 'yellow', 'violet', 'blue', 'green'}
{'indigo'}


## Difference of Sets

The difference operation returns the items that are present only in the first set, i.e., the set on which the method is called. This can be done with the - operator or the difference() method.

![image.png](attachment:image.png)

In [15]:
color_set = {'violet', 'indigo', 'blue', 'green', 'yellow'}
remaining_colors = {'indigo', 'orange', 'red'}

# difference using '-' operator
print(color_set - remaining_colors)
# output {'violet', 'blue', 'green', 'yellow'}

# using difference() method
print(color_set.difference(remaining_colors))
# Output {'violet', 'blue', 'green', 'yellow'}

{'violet', 'blue', 'yellow', 'green'}
{'violet', 'blue', 'yellow', 'green'}


## Difference update

In addition to the difference(), there is one more method called difference_update(). There are two main differences between these two methods.

1. The difference() method will not update the original set while difference_update() will update the original set.
2. The difference() method will return a new set containing only the elements of the set on which it was called that are not in the other set. difference_update() will not return anything.

In [16]:
color_set = {'violet', 'indigo', 'blue', 'green', 'yellow'}
remaining_colors = {'indigo', 'orange', 'red'}

# difference of two sets
new_set = color_set.difference(remaining_colors)
print(new_set)
# output {'violet', 'yellow', 'green', 'blue'}
# original set after difference
print(color_set)
# {'green', 'indigo', 'yellow', 'blue', 'violet'}

# difference of two sets
color_set.difference_update(remaining_colors)
# original set after difference_update
print(color_set)
# Output {'green', 'yellow', 'blue', 'violet'}

{'violet', 'blue', 'yellow', 'green'}
{'indigo', 'yellow', 'violet', 'blue', 'green'}
{'yellow', 'violet', 'blue', 'green'}


## Symmetric difference of Sets

The symmetric difference operation returns the elements that are present in exactly one of the two sets, i.e., everything except the common elements. In that sense it is the opposite of the intersection. It is performed using the ^ operator or the symmetric_difference() method.

![image.png](attachment:image.png)


In [17]:
color_set = {'violet', 'indigo', 'blue', 'green', 'yellow'}
remaining_colors = {'indigo', 'orange', 'red'}

# symmetric difference between using ^ operator
unique_items = color_set ^ remaining_colors
print(unique_items)
# Output {'blue', 'orange', 'violet', 'green', 'yellow', 'red'}

# using symmetric_difference()
unique_items2 = color_set.symmetric_difference(remaining_colors)
print(unique_items2)
# Output {'blue', 'orange', 'violet', 'green', 'yellow', 'red'}

{'red', 'violet', 'orange', 'blue', 'green', 'yellow'}
{'red', 'violet', 'orange', 'blue', 'green', 'yellow'}
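
One way to check this definition is to build the same result from the union and the intersection. The names `either` and `both` are only illustrative.

```python
color_set = {'violet', 'indigo', 'blue', 'green', 'yellow'}
remaining_colors = {'indigo', 'orange', 'red'}

either = color_set | remaining_colors  # items in at least one set
both = color_set & remaining_colors    # items in both sets

# symmetric difference = union minus intersection
print((either - both) == (color_set ^ remaining_colors))
# True

# the common item 'indigo' is excluded
print('indigo' in (color_set ^ remaining_colors))
# False
```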


## Symmetric difference update

In addition to symmetric_difference(), there is one more method called symmetric_difference_update(). There are two main differences between these two methods.

1. The symmetric_difference() method will not update the original set while symmetric_difference_update() will update the original set with the elements that belong to exactly one of the two sets.
2. The symmetric_difference() method will return a new set. symmetric_difference_update() will not return anything.



In [18]:
color_set = {'violet', 'indigo', 'blue', 'green', 'yellow'}
remaining_colors = {'indigo', 'orange', 'red'}

# symmetric difference
unique_items = color_set.symmetric_difference(remaining_colors)
print(unique_items)
# output {'yellow', 'green', 'violet', 'red', 'blue', 'orange'}
# original set after symmetric difference
print(color_set)
# {'yellow', 'green', 'indigo', 'blue', 'violet'}

# using symmetric_difference_update()
color_set.symmetric_difference_update(remaining_colors)
# original set after symmetric_difference_update()
print(color_set)
# {'yellow', 'green', 'red', 'blue', 'orange', 'violet'}

{'red', 'violet', 'orange', 'blue', 'green', 'yellow'}
{'indigo', 'yellow', 'violet', 'blue', 'green'}
{'red', 'yellow', 'violet', 'orange', 'blue', 'green'}
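
The update methods shown in the last few sections all modify the set in place and return None. They also have augmented-assignment operator forms (`&=`, `-=`, `^=`). A short sketch with illustrative names:

```python
colors = {'violet', 'indigo', 'blue'}
other = {'indigo', 'red'}
same_object = colors  # second name for the same set

# in-place methods return None
returned = colors.difference_update(other)
print(returned)
# None
print(sorted(colors))
# ['blue', 'violet']

colors ^= other  # like symmetric_difference_update(other)
print(sorted(colors))
# ['blue', 'indigo', 'red', 'violet']

colors &= {'red', 'blue', 'green'}  # like intersection_update(...)
print(sorted(colors))
# ['blue', 'red']

# the set was changed in place, not replaced by a new object
print(same_object is colors)
# True
```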


## Copying a Set

In Python, we can copy the items from one set to another in two ways.

1. Using the copy() method.
2. Using the set() constructor.

The = (assignment) operator does not create a copy: it binds a second name to the same set object, so a change made through either name is visible through both.

In [19]:
color_set = {'violet', 'blue', 'green', 'yellow'}

# creating a copy using copy()
color_set2 = color_set.copy()

# creating a copy using set()
color_set3 = set(color_set)

# assignment with = does not copy; color_set4 refers to the same set object
color_set4 = color_set

# printing the original and new copies
print('Original set:', color_set)
# {'violet', 'green', 'yellow', 'blue'}

print('Copy using copy():', color_set2)
# {'green', 'yellow', 'blue', 'violet'}

print('Copy using set(): ', color_set3)
# {'green', 'yellow', 'blue', 'violet'}

print('Copy using assignment', color_set4)
# {'green', 'yellow', 'blue', 'violet'}

Original set: {'violet', 'blue', 'green', 'yellow'}
Copy using copy(): {'violet', 'blue', 'green', 'yellow'}
Copy using set():  {'violet', 'blue', 'green', 'yellow'}
Copy using assignment {'violet', 'blue', 'green', 'yellow'}
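
The printed sets look identical, but they do not behave the same once the original changes. A minimal sketch, with illustrative names `shallow_copy` and `alias`:

```python
color_set = {'violet', 'blue'}

shallow_copy = color_set.copy()  # a new, independent set
alias = color_set                # another name for the same set

color_set.add('green')

print(alias is color_set)
# True
print('green' in alias)
# True
print('green' in shallow_copy)
# False
```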


## Subset and Superset

In Python, we can find whether a set is a subset or superset of another set. We need to use the set methods issubset() and issuperset().

**issubset()**

The issubset() method is used to find whether a set is a subset of another set, i.e., whether all the items in the set on which this method is called are present in the set passed as the argument.

This method returns True if the set is a subset of the other set; otherwise, it returns False.

**issuperset()**

This method determines whether the set is a superset of another set.

It checks whether the set on which the method is called contains all the items present in the set passed as the argument. It returns True if the set is a superset of the other set; otherwise, it returns False.

In [20]:
color_set1 = {'violet', 'indigo', 'blue', 'green', 'yellow', 'orange', 'red'}
color_set2 = {'indigo', 'orange', 'red'}

# subset
print(color_set2.issubset(color_set1))
# True
print(color_set1.issubset(color_set2))
# False

# superset
print(color_set2.issuperset(color_set1))
# False
print(color_set1.issuperset(color_set2))
# True

True
False
False
True
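
The same checks can also be written with comparison operators. As a sketch with illustrative names, `<=` and `>=` correspond to issubset() and issuperset(), while `<` and `>` test for a proper subset or superset (the two sets must not be equal).

```python
small = {'indigo', 'red'}
large = {'indigo', 'red', 'blue'}

print(small <= large)  # subset
# True
print(large >= small)  # superset
# True
print(small < large)   # proper subset
# True
print(large <= large)  # every set is a subset of itself
# True
print(large < large)   # but not a proper subset of itself
# False
```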


## Find whether two sets are disjoint

The isdisjoint() method finds whether two sets are disjoint, i.e., whether they have no common elements. This method returns True if they are disjoint; otherwise, it returns False.

In [21]:
color_set1 = {'violet', 'blue', 'yellow', 'red'}
color_set2 = {'orange', 'red'}
color_set3 = {'green', 'orange'}

# disjoint
print(color_set2.isdisjoint(color_set1))
# Output 'False' because contains 'red' as a common item

print(color_set3.isdisjoint(color_set1))
# Output 'True' because no common items

False
True
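
Put differently, two sets are disjoint exactly when their intersection is empty. A small sketch with illustrative sets, which also shows that isdisjoint() accepts any iterable as its argument:

```python
warm = {'red', 'orange'}
cool = {'blue', 'green'}

print(warm.isdisjoint(cool))
# True
print(len(warm & cool) == 0)  # same check via the intersection
# True

# the argument can be any iterable, for example a list
print(warm.isdisjoint(['red', 'pink']))
# False
```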


## Sort the set

A set is an unordered collection of data items, so it cannot be sorted in place. To get its items in sorted order, pass the set to the built-in sorted() function.

sorted() returns a new list and does not update the original set.

In [22]:
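set1 = {20, 4, 6, 10, 8, 15}
# sorted() returns a new list; set1 itself is not changed
sorted_list = sorted(set1)
# keep the result as a list: converting it back to a set would discard the order
print(sorted_list)
# output [4, 6, 8, 10, 15, 20]

[4, 6, 8, 10, 15, 20]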
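
sorted() also takes the optional `reverse` and `key` arguments. The following sketch uses an illustrative set of words, chosen so that every word has a different length.

```python
set1 = {20, 4, 6, 10, 8, 15}

# descending order
print(sorted(set1, reverse=True))
# [20, 15, 10, 8, 6, 4]

# sort by a computed key, here the length of each word
words = {'kiwi', 'fig', 'banana'}
print(sorted(words, key=len))
# ['fig', 'kiwi', 'banana']

# the original set is unchanged
print(set1 == {20, 4, 6, 10, 8, 15})
# True
```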


## Frozen Set

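A frozenset is an immutable version of a set: it is created with the frozenset() constructor and its items cannot be changed after creation.

**When to use frozenset?**

1. When you want to create an immutable set that doesn't allow adding or removing items.

2. When you want to create a read-only set.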


In [23]:
rainbow = ('violet', 'indigo', 'blue')
f_set = frozenset(rainbow)
# Trying to add an item to a frozenset fails
f_set.add('green')
# output AttributeError: 'frozenset' object has no attribute 'add'


AttributeError: 'frozenset' object has no attribute 'add'
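
Because a frozenset is immutable, it is also hashable. That means it can be used where an ordinary set cannot: as an item of another set or as a dictionary key. A minimal sketch with illustrative names:

```python
vowels = frozenset('aeiou')

# a set whose items are frozensets
groups = {vowels, frozenset('xyz')}
print(len(groups))
# 2

# a frozenset as a dictionary key; item order does not matter for lookup
labels = {vowels: 'vowels'}
print(labels[frozenset('uoiea')])
# vowels

# set operations still work and return a new frozenset
extended = vowels | {'y'}
print(type(extended).__name__, sorted(extended))
# frozenset ['a', 'e', 'i', 'o', 'u', 'y']
```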

## When to use a Set Data structure?

1. Eliminating duplicate entries: If a set is initialized with multiple entries of the same value, the duplicate entries are dropped. A set stores each item only once.

2. Membership testing: When we need to check whether an item is present in our dataset, a set is a good container. A set is implemented using a hash table, so lookups are fast: a hash value is computed for each item and determines where the item is stored. To search for an item, we just have to compute its hash value and look in the table at that position, so a lookup takes O(1) time on average.

3. Performing operations similar to mathematical sets: Operations such as union, intersection, and difference that we perform on two mathematical sets can be performed directly on this data structure.
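
The first two uses can be combined in one small sketch. The helper `unique_in_order` is a hypothetical function written for this example, not a built-in: it removes duplicates from a list while keeping the first occurrence of each item, using a set for the fast membership test.

```python
def unique_in_order(items):
    """Return the items without duplicates, keeping first occurrences."""
    seen = set()   # items met so far; membership tests are fast
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


print(unique_in_order([20, 30, 20, 30, 50, 30]))
# [20, 30, 50]
```

Calling set() on the list would also drop the duplicates, but it would not keep the original order of the items.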
