Hello and welcome to this video on data structures in Python. In the previous video, we learned about dictionaries, which are collections of key-value pairs that can store different types of data and allow fast lookup and modification. In this video, we will learn about another simple and useful data structure: sets. Sets are unordered collections of unique items that support set operations such as union, intersection, and difference.

## Introduction to Sets

A set is a data structure that contains only distinct items, meaning that there are no duplicates in a set. Sets are useful when what matters is membership, that is, whether an item belongs to a collection, such as categories, tags, or groups. Sets also support mathematical set operations, such as finding the elements that two sets have in common or the elements in which they differ.

To create a set, we use curly braces { } and separate the items with commas. For example, we can create a set of colors like this:

In [2]:
colors = {"red", "green", "blue"}
colors

{'blue', 'green', 'red'}

We can also create an empty set by using the set function with no arguments:

In [3]:
empty_set = set()

Note that we cannot use just the curly braces to create an empty set, because that would create an empty dictionary instead.
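
As a quick check of the points above, the following sketch shows that duplicates collapse when a set is built, that the set function accepts any iterable, and that empty braces give a dictionary rather than a set. The names `shades` and `letters` are made up for illustration.

```python
# Duplicate items are stored only once.
shades = {"red", "green", "red", "blue", "green"}
print(len(shades))  # 3

# set() also accepts any iterable, such as a list or a string.
letters = set("hello")
print(len(letters))  # 4, because "l" appears twice

# Empty braces create a dictionary; set() creates an empty set.
print(type({}))     # <class 'dict'>
print(type(set()))  # <class 'set'>
```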

To access the items in a set, we cannot use indexing or slicing, because sets are unordered and have no fixed positions. Instead, we can use a loop to visit every item, or the in operator to check if an item is in a set. For example, to print each color in the colors set, we can use a for loop (the items may come out in a different order than shown here):

In [4]:
for color in colors:
    print(color)

blue
green
red
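
When a predictable order matters, one option is to sort the items before looping. This is a minimal sketch, which redefines `colors` so that it runs on its own:

```python
colors = {"red", "green", "blue"}

# sorted() returns a new list; the set itself is unchanged.
for color in sorted(colors):
    print(color)
# blue
# green
# red
```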


To check if "yellow" is in the colors set, we can use the in operator:

In [5]:
"yellow" in colors

False

To modify a set, we can use some of the common set methods, such as add, remove, discard, and clear. For example, to add a new item "black" to the colors set, we can use the add method:

In [6]:
colors.add("black")

In [7]:
colors

{'black', 'blue', 'green', 'red'}

To remove the item "red" from the colors set, we can use the remove method, which raises a KeyError if the item does not exist:

In [8]:
colors.remove("red")

In [9]:
colors

{'black', 'blue', 'green'}

To remove the item "yellow" from the colors set if it is present, and do nothing if it is not, we can use the discard method:

In [10]:
colors.discard("yellow")

In [11]:
colors

{'black', 'blue', 'green'}
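
To see the difference between the two methods side by side, here is a small sketch that tries to remove an item that is not in the set. It redefines `colors` with the same contents as above so that it runs on its own.

```python
colors = {"black", "blue", "green"}

# remove() raises KeyError when the item is missing.
try:
    colors.remove("yellow")
except KeyError:
    print("yellow is not in the set")

# discard() does nothing in the same situation.
colors.discard("yellow")
print(len(colors))  # 3
```

As a rule of thumb, remove fits cases where a missing item signals a bug, and discard fits cases where a missing item is expected.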

To delete all the items from the colors set, we can use the clear method:

In [12]:
colors.clear()

In [13]:
colors

set()

There are many other set methods that we can use to perform various operations on sets, such as union, intersection, difference, symmetric_difference, issubset, issuperset, and isdisjoint. You can find more information about these methods in the Python documentation or by using the help function in Python.

## Set Operations

In this section, we will look at some of the set operations that we can use to manipulate and compare sets in Python. We will use the following example sets of fruits and vegetables:

In [16]:
fruits = {"apple", "banana", "orange", "grape", "cucumber"}
vegetables = {"carrot", "potato", "onion", "tomato", "cucumber"}

- The intersection operation returns a new set that contains only the items that are common to both sets. For example, to get a set of the items that are both fruits and vegetables, we can use the intersection method or the & operator:

In [17]:
fruits.intersection(vegetables)
fruits & vegetables

{'cucumber'}

- The difference operation returns a new set that contains only the items that are in the first set but not in the second set. For example, to get a set of the items that are fruits but not vegetables, we can use the difference method or the - operator:

In [18]:
fruits.difference(vegetables)
fruits - vegetables

{'apple', 'banana', 'grape', 'orange'}

- The symmetric_difference operation returns a new set that contains only the items that are in either set but not in both sets. For example, to get a set of the items that are either fruits or vegetables but not both, we can use the symmetric_difference method or the ^ operator:

In [19]:
fruits.symmetric_difference(vegetables)
fruits ^ vegetables

{'apple', 'banana', 'carrot', 'grape', 'onion', 'orange', 'potato', 'tomato'}

- The issubset operation returns a boolean value that indicates whether the first set is a subset of the second set, meaning that all the items in the first set are also in the second set. For example, to check if the set {"apple", "banana"} is a subset of the fruits set, we can use the issubset method or the <= operator:

In [20]:
{"apple", "banana"}.issubset(fruits)
{"apple", "banana"} <= fruits

True

- The issuperset operation returns a boolean value that indicates whether the first set is a superset of the second set, meaning that all the items in the second set are also in the first set. For example, to check if the fruits set is a superset of the set {"apple", "banana"}, we can use the issuperset method or the >= operator:

In [21]:
fruits.issuperset({"apple", "banana"})
fruits >= {"apple", "banana"}

True

- The isdisjoint operation returns a boolean value that indicates whether the two sets have no items in common. For example, to check if the fruits set and the vegetables set are disjoint, we can use the isdisjoint method:

In [22]:
fruits.isdisjoint(vegetables)

False
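
The union operation, mentioned in the introduction, follows the same pattern: the union method or the | operator returns a new set with every item that appears in either set. The sketch below uses the same example sets, redefined so that it runs on its own, and the name `everything` is made up for illustration. It also shows one difference between the two spellings: the methods accept any iterable, while the operators require both operands to be sets.

```python
fruits = {"apple", "banana", "orange", "grape", "cucumber"}
vegetables = {"carrot", "potato", "onion", "tomato", "cucumber"}

# Both spellings give the same result; "cucumber" appears only once.
everything = fruits.union(vegetables)
print(everything == fruits | vegetables)  # True
print(len(everything))                    # 9

# Methods accept any iterable, such as a list.
print(fruits.intersection(["apple", "kiwi"]))  # {'apple'}

# Operators require sets on both sides.
try:
    fruits & ["apple", "kiwi"]
except TypeError:
    print("the & operator needs two sets")
```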

## Practical Use Cases of Sets

- Sets can be used to compare collections, such as finding the elements that two sets have in common or the elements that belong to only one of them. For example, we can create two sets of friends and find their intersection and symmetric difference like this:

In [None]:
friends1 = {"Alice", "Bob", "Charlie"}
friends2 = {"Bob", "David", "Eve"}
common_friends = friends1 & friends2 # intersection
different_friends = friends1 ^ friends2 # symmetric difference

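- Sets can be used to remove duplicates from a list or a string, by converting them to a set and then back to a list or a string. Because sets are unordered, the original order of the items is not preserved, so the result may differ from run to run. For example, we can remove duplicate letters from a word like this: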

In [23]:
word = "banana"
unique_letters = list(set(word)) # convert to set and then to list
unique_word = "".join(unique_letters) # join the list to a string

In [24]:
unique_word

'bna'
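
When the letters should stay in order of first appearance, one common alternative is `dict.fromkeys`, which relies on dictionaries keeping insertion order (guaranteed since Python 3.7). This is a sketch of a different technique, not the set-based approach shown above, and the names `unique_in_order` and `numbers` are made up for illustration.

```python
word = "banana"

# dict keys are unique and keep insertion order (Python 3.7+).
unique_in_order = "".join(dict.fromkeys(word))
print(unique_in_order)  # ban

# The same idea works for lists.
numbers = [3, 1, 3, 2, 1]
print(list(dict.fromkeys(numbers)))  # [3, 1, 2]
```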