# Sets in Python

Sets are unordered collections of unique elements in Python. They are useful for storing items when the order does not matter and duplicates are not allowed.

In [1]:
# a simple set example
students = {"Alice", "Bob", "Charlie", "David", "Bob"}  # 'Bob' is duplicated
print(students)  # the duplicate 'Bob' is dropped; element order is not guaranteed

{'Alice', 'Charlie', 'Bob', 'David'}


In [2]:
# we can also perform set operations like union, intersection, and difference
set_a = {1, 2, 3, 4}
set_b = {3, 4, 5, 6}
union_set = set_a.union(set_b)
union_set  # Output: {1, 2, 3, 4, 5, 6}

{1, 2, 3, 4, 5, 6}

In [3]:
# intersection - elements common to both sets
intersection_set = set_a.intersection(set_b)
intersection_set  # Output: {3, 4}

{3, 4}

In [4]:
# we can also do difference - elements in set_a but not in set_b
difference_set = set_a.difference(set_b)  # order matters: set_b.difference(set_a) would give {5, 6}
difference_set  # Output: {1, 2}

{1, 2}

In [5]:
# finally, the symmetric difference - elements in either set_a or set_b but not in both
symmetric_difference_set = set_a.symmetric_difference(set_b)
symmetric_difference_set  # Output: {1, 2, 5, 6}

{1, 2, 5, 6}
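
The four methods above also have operator forms. The sketch below redefines the same `set_a` and `set_b` so it runs on its own; the `*_result` names are chosen here for illustration only.

```python
set_a = {1, 2, 3, 4}
set_b = {3, 4, 5, 6}

# Each operator mirrors one of the methods used above
union_result = set_a | set_b         # same as set_a.union(set_b)
intersection_result = set_a & set_b  # same as set_a.intersection(set_b)
difference_result = set_a - set_b    # same as set_a.difference(set_b)
symmetric_result = set_a ^ set_b     # same as set_a.symmetric_difference(set_b)

# sorted() is used only to get a stable display order
print(sorted(union_result))         # [1, 2, 3, 4, 5, 6]
print(sorted(intersection_result))  # [3, 4]
print(sorted(difference_result))    # [1, 2]
print(sorted(symmetric_result))     # [1, 2, 5, 6]
```

One difference worth knowing: the operators require both operands to be sets, while the methods accept any iterable, so `set_a.union([5, 6])` works but `set_a | [5, 6]` raises a `TypeError`.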


## Use cases for Sets

* Removing duplicates from a list
* Membership testing (checking if an item is in a collection)
* Performing mathematical set operations like union, intersection, and difference
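
The first two use cases can be sketched as follows. The `emails` list and the `allowed_ids` set are made-up sample data, not part of the examples above.

```python
# Hypothetical sample data for illustration
emails = ["a@example.com", "b@example.com", "a@example.com", "c@example.com"]

# Removing duplicates: converting to a set drops repeats but loses the original order
unique_emails = set(emails)
print(len(unique_emails))  # 3

# If the original order matters, dict.fromkeys keeps the first occurrence of each item
ordered_unique = list(dict.fromkeys(emails))
print(ordered_unique)  # ['a@example.com', 'b@example.com', 'c@example.com']

# Membership testing: `in` on a set is a hash lookup (constant time on average)
allowed_ids = {101, 205, 330}
print(205 in allowed_ids)  # True
print(999 in allowed_ids)  # False
```

For large collections that are checked repeatedly, building the set once and reusing it is usually faster than scanning a list each time.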

## References

* [Python Sets Documentation](https://docs.python.org/3/tutorial/datastructures.html#sets)

## Pandas and Sets

Pandas offers several features that behave like set operations:

* Removing duplicate rows from a DataFrame using `drop_duplicates()`
* Performing set-like operations on Series and DataFrames using methods like `merge()`, `isin()`, and `concat()`
* Getting the unique values of a Series, such as a DataFrame column, using `unique()`
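
A minimal sketch of these methods, assuming pandas is installed. The DataFrames and their column names (`name`, `city`, `id`) are hypothetical sample data.

```python
import pandas as pd

# Hypothetical sample data for illustration
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Bob", "Charlie"],
    "city": ["Paris", "Berlin", "Berlin", "Paris"],
})

# drop_duplicates() removes repeated rows, keeping the first occurrence by default
deduped = df.drop_duplicates()
print(len(deduped))  # 3

# unique() returns the distinct values of a column in order of appearance
cities = list(df["city"].unique())
print(cities)  # ['Paris', 'Berlin']

# isin() tests each value for membership in a collection, here a Python set
in_group = df["name"].isin({"Alice", "Charlie"}).tolist()
print(in_group)  # [True, False, False, True]

# An inner merge on a key column keeps only keys present in both frames,
# which is analogous to a set intersection
left = pd.DataFrame({"id": [1, 2, 3, 4]})
right = pd.DataFrame({"id": [3, 4, 5, 6]})
common_ids = left.merge(right, on="id", how="inner")["id"].tolist()
print(common_ids)  # [3, 4]
```

Unlike a Python set, these results keep an order and an index, so they are set-like rather than true sets.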