# **`Data Science Learners Hub`**

**Module : Python**

**email** : [datasciencelearnershub@gmail.com](mailto:datasciencelearnershub@gmail.com)

### **`Exploring the 'Set' Data Type in Python`**

### 1. Introduction

A **set** in Python is an unordered collection of unique elements. Like a list or tuple it groups several values, but it does not allow duplicates. Sets are commonly used for mathematical set operations such as union, intersection, and difference. A set literal is written with curly braces, for example `{1, 2, 3}`. Note that empty braces `{}` create a dictionary, so an empty set is written `set()`.
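
To make the definition concrete, here is a minimal sketch of creating sets. The variable names are illustrative only.

```python
# Duplicates collapse automatically when a set is built.
numbers = {1, 2, 2, 3, 3, 3}
print(len(numbers))                  # 3

# Empty braces create a dict, not a set.
empty_braces = {}
empty_set = set()
print(type(empty_braces).__name__)   # dict
print(type(empty_set).__name__)      # set

# set() also accepts any iterable, such as a string or a list.
letters = set("hello")
print(sorted(letters))               # ['e', 'h', 'l', 'o']
```

`sorted()` is used for printing because a set has no guaranteed display order.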

### 2. Why Learn This Topic

Sets are useful throughout Python work, and the same ideas appear in SQL: the `UNION`, `INTERSECT`, and `EXCEPT` operators combine query results in the same way that union, intersection, and difference combine Python sets. Understanding sets in Python therefore makes it easier to reason about data and to perform these operations efficiently.
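
As a rough illustration of the SQL analogy, the sketch below uses Python's set operators on two hypothetical groups of customer IDs (the names and values are made up for this example).

```python
# Hypothetical customer IDs from two sources.
online_customers = {101, 102, 103, 104}
store_customers = {103, 104, 105}

all_customers = online_customers | store_customers   # comparable to UNION
both_channels = online_customers & store_customers   # comparable to INTERSECT
online_only = online_customers - store_customers     # comparable to EXCEPT

print(sorted(all_customers))   # [101, 102, 103, 104, 105]
print(sorted(both_channels))   # [103, 104]
print(sorted(online_only))     # [101, 102]
```

The operators `|`, `&`, and `-` are equivalent to the `union()`, `intersection()`, and `difference()` methods when both operands are sets.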

### 3. Real-world Scenario

Imagine you have a list of products, and you want to find the unique items across different categories. A set would be perfect for this scenario, giving you a collection of distinct products.
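
A small sketch of this scenario, assuming two hypothetical product lists that contain repeats:

```python
# Hypothetical product lists; some items repeat within and across categories.
electronics = ["phone", "laptop", "charger", "phone"]
accessories = ["charger", "case", "cable", "case"]

# Converting each list to a set drops repeats; the union merges the categories.
unique_products = set(electronics) | set(accessories)
print(sorted(unique_products))
# ['cable', 'case', 'charger', 'laptop', 'phone']
```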


### 4. Practical Applications
* `Removing Duplicates`: Use sets to quickly remove duplicate elements from a list or string.
* `Membership Testing`: Check if an element exists in a set, useful in scenarios like checking for unique usernames.
* `Mathematical Operations`: Perform operations like union, intersection, and difference for data analysis.
* Finding common or unique items in data sets.
* Supporting algorithms that need to track already-seen items, such as detecting repeated elements.
* Representing relationships between entities (e.g., social networks).
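
The membership-testing use case above can be sketched as follows. The `taken_usernames` set and the `is_available` helper are hypothetical names chosen for this example.

```python
# Hypothetical set of usernames that are already registered (stored lowercase).
taken_usernames = {"alice", "bob", "carol"}

def is_available(username):
    """Return True if the username, compared case-insensitively, is not taken."""
    return username.lower() not in taken_usernames

print(is_available("Dave"))    # True
print(is_available("ALICE"))   # False
```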


### 5. Set Methods Summary

| Method                      | Syntax                                           | Return Type | In-place or Copy | Input Parameters       | One-liner Explanation                           | Peculiarities/Considerations                          |
|-----------------------------|--------------------------------------------------|-------------|------------------|-------------------------|-------------------------------------------------|------------------------------------------------------|
| Creating                    | my_set = {1, 2, 3}                               | Set         | Copy             | Elements                | Initialize a set                                  |                                                      |
| Adding Elements             | my_set.add(element)                              | None        | In-place         | Element                 | Add an element to the set                        | If the element already exists, it won't be added      |
| Removing Elements           | my_set.remove(element)                           | None        | In-place         | Element                 | Remove an element from the set                   | Raises KeyError if the element is not present         |
| Removing Elements (Safe)    | my_set.discard(element)                          | None        | In-place         | Element                 | Remove an element safely from the set           | Won't raise an error if the element is not present    |
| Removing Arbitrary Element | my_set.pop()                                     | Element     | In-place         | None                    | Remove and return an arbitrary element          | Raises KeyError if the set is empty                  |
| Clearing the Set            | my_set.clear()                                   | None        | In-place         | None                    | Remove all elements from the set                |                                                      |
| Checking for Membership     | element in my_set                               | bool        | N/A              | Element                 | Check if an element is in the set               |                                                      |
| Getting Set Size            | len(my_set)                                     | int         | N/A              | None                    | Get the number of elements in the set          |                                                      |
| Copying the Set             | my_set_copy = my_set.copy()                      | Set         | Copy             | None                    | Create a shallow copy of the set               |                                                      |
| Set Union                   | union_set = set1.union(set2)                    | Set         | Copy             | Another set (set2)      | Get the union of two sets                       |                                                      |
| Set Intersection            | intersection_set = set1.intersection(set2)      | Set         | Copy             | Another set (set2)      | Get the intersection of two sets                |                                                      |
| Set Difference              | difference_set = set1.difference(set2)          | Set         | Copy             | Another set (set2)      | Get the difference between two sets             |                                                      |
| Set Symmetric Difference    | symmetric_diff_set = set1.symmetric_difference(set2) | Set    | Copy             | Another set (set2)      | Get the symmetric difference of two sets       |                                                      |
| Set Update                  | set1.update(set2)                               | None        | In-place         | Another set (set2)      | Update set1 with elements from set2            |                                                      |
| Set Intersection Update     | set1.intersection_update(set2)                 | None        | In-place         | Another set (set2)      | Update set1 with the intersection of two sets  |                                                      |
| Set Difference Update       | set1.difference_update(set2)                   | None        | In-place         | Another set (set2)      | Update set1 with the difference of two sets    |                                                      |
| Set Symmetric Diff Update   | set1.symmetric_difference_update(set2)         | None        | In-place         | Another set (set2)      | Update set1 with the symmetric diff. of sets  |                                                      |
| Checking Subset             | is_subset = set1.issubset(set2)                 | bool        | N/A              | Another set (set2)      | Check if set1 is a subset of set2              |                                                      |
| Checking Superset           | is_superset = set1.issuperset(set2)             | bool        | N/A              | Another set (set2)      | Check if set1 is a superset of set2            |                                                      |
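
The "In-place or Copy" column in the table can be checked with a short sketch like the one below. The sample values are arbitrary.

```python
set1 = {1, 2, 3}
set2 = {3, 4}

# union() returns a new set and leaves set1 untouched.
combined = set1.union(set2)
print(sorted(combined), sorted(set1))   # [1, 2, 3, 4] [1, 2, 3]

# update() modifies set1 in place and returns None.
result = set1.update(set2)
print(result, sorted(set1))             # None [1, 2, 3, 4]

# discard() ignores a missing element; remove() raises KeyError.
set1.discard(99)
try:
    set1.remove(99)
except KeyError:
    print("remove() raised KeyError")
```

A common slip is writing `set1 = set1.update(set2)`, which rebinds `set1` to `None`.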


### 6. Syntax and Example

#### Creating a Set:
```python
my_set = {1, 2, 3, 4, 5}
```

#### Set Operations:
```python
set1 = {1, 2, 3}
set2 = {3, 4, 5}

# Union
union_set = set1.union(set2)  # {1, 2, 3, 4, 5}

# Intersection
intersection_set = set1.intersection(set2)  # {3}

# Difference
difference_set = set1.difference(set2)  # {1, 2}
```

### 7. Set Methods

#### Adding Elements:
```python
my_set = {1, 2, 3, 4, 5}
my_set.add(6)          # Add a single element
my_set.update({7, 8})  # Add every element of another set (or any iterable)
```

#### Removing Elements:
```python
my_set = {1, 2, 3, 4, 5}
my_set.remove(3)    # Remove an element; raises KeyError if it is not present
my_set.discard(4)   # Remove an element; does nothing if it is not present
my_set.pop()        # Remove and return an arbitrary element
```


### 8. Peculiarities and Considerations

* `Unordered`: Elements don't have a specific order.
* `Unique`: No duplicate elements allowed.
* `Mutable`: Can be modified after creation.
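* `Not hashable`: Because a set is mutable, it cannot be used as a dictionary key or as an element of another set. Use a `frozenset` when that is needed.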
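
The hashability point can be illustrated with a minimal sketch. The names `members` and `lookup` are placeholders for this example.

```python
members = {1, 2}

# A mutable set is unhashable, so using it as a dict key raises TypeError.
try:
    lookup = {members: "value"}
except TypeError:
    print("A set cannot be a dictionary key")

# A frozenset has the same members but is hashable.
lookup = {frozenset(members): "value"}
print(lookup[frozenset({2, 1})])   # value
```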

### 9. Common Mistakes

* `Indexing a set`: Sets are unordered and do not support indexing or slicing, so `my_set[0]` raises a `TypeError`.
* `Trying to create sets with mutable elements`: Set elements must be hashable. Immutable values such as numbers, strings, and tuples work; lists, dictionaries, and other sets do not.
* `Assuming sets are ordered`: Use a list or tuple if order matters.
* `Modifying elements inside a set`: A set stores references to its elements and offers no way to change one in place. To change a value, remove the old element and add the new one. Mutating a custom hashable object in a way that changes its hash after it has been added makes lookups unreliable.
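
The first two mistakes above can be reproduced with a short sketch like this one (sample values are arbitrary):

```python
my_set = {10, 20, 30}

# Mistake: indexing a set raises TypeError.
try:
    first = my_set[0]
except TypeError:
    print("Sets do not support indexing")

# When an order is needed, build a sorted list first.
ordered = sorted(my_set)
print(ordered[0])            # 10

# Mistake: a list is unhashable, so it cannot be a set element.
try:
    bad = {[1, 2], [3, 4]}
except TypeError:
    print("Lists cannot be set elements")

# Tuples are hashable, so they work.
good = {(1, 2), (3, 4)}
print((1, 2) in good)        # True
```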



### 10. Hands-On Experience

#### `PART - I`

#### Question 1:

```python
# Find the common elements between two sets
set1 = {1, 2, 3, 4}
set2 = {3, 4, 5, 6}
common_elements = set1.intersection(set2)
print(common_elements)
# Output: {3, 4}
```


#### Question 2:

```python
# Remove duplicates from a list
my_list = [1, 2, 2, 3, 4, 4, 5]
unique_elements = set(my_list)
# A set has no guaranteed order, so sort it to get a predictable list
print(sorted(unique_elements))
# Output: [1, 2, 3, 4, 5]
```


#### Question 3:

```python
# Check if two sets are disjoint
set1 = {1, 2, 3}
set2 = {4, 5, 6}
are_disjoint = set1.isdisjoint(set2)
print(are_disjoint)
# Output: True
```


#### Explanation:

Two sets are considered disjoint if they have no common elements, meaning the intersection of the sets is the empty set. In mathematical terms, two sets A and B are disjoint if \(A \cap B = \emptyset\), where \(\cap\) represents the intersection operation.

For example:
- Set \(A = \{1, 2, 3\}\)
- Set \(B = \{4, 5, 6\}\)

In this case, sets A and B are disjoint because their intersection is the empty set (\(A \cap B = \emptyset\)). There are no elements that exist in both sets.

In Python, you can check if two sets are disjoint using the `isdisjoint()` method. Here's an example:

```python
set1 = {1, 2, 3}
set2 = {4, 5, 6}

if set1.isdisjoint(set2):
    print("Sets are disjoint.")
else:
    print("Sets have common elements.")
```

In the example, the `isdisjoint()` method returns `True` because set1 and set2 have no common elements, making them disjoint.

#### Question 4:

```python
# Perform a symmetric difference operation on sets
set1 = {1, 2, 3, 4}
set2 = {3, 4, 5, 6}
symmetric_difference = set1.symmetric_difference(set2)
print(symmetric_difference)
# Output: {1, 2, 5, 6}
```


#### Explanation:

The `symmetric_difference()` method returns a new set containing elements that are in either of the sets, but not in both. In other words, it returns the set of elements that are unique to each set.
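
As a quick check of this definition, the sketch below compares the method with the `^` operator and with the union minus the intersection.

```python
set1 = {1, 2, 3, 4}
set2 = {3, 4, 5, 6}

by_method = set1.symmetric_difference(set2)
by_operator = set1 ^ set2
by_definition = (set1 | set2) - (set1 & set2)

print(by_method == by_operator == by_definition)   # True
print(sorted(by_method))                            # [1, 2, 5, 6]
```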

#### `PART - II`

#### Question 1:
Write a Python program to remove the common elements between two sets and display the result.

```python
set1 = {1, 2, 3, 4}
set2 = {3, 4, 5, 6}
result_set = set1.symmetric_difference(set2)
print(result_set)
# Output: {1, 2, 5, 6}
```


#### Question 2:
Create a function that takes two sets as parameters and returns a new set containing only the elements common to both sets.

```python
def common_elements(set1, set2):
    return set1.intersection(set2)

set1 = {1, 2, 3, 4}
set2 = {3, 4, 5, 6}
result_set = common_elements(set1, set2)
print(result_set)
# Output: {3, 4}
```


#### Question 3:
Write a Python program to find the union of three sets.

```python
set1 = {1, 2, 3}
set2 = {3, 4, 5}
set3 = {5, 6, 7}
union_set = set1.union(set2, set3)
print(union_set)
# Output: {1, 2, 3, 4, 5, 6, 7}
```


#### Question 4:
Create a function to check if a given set is a proper subset of another set.

```python
def is_proper_subset(set1, set2):
    return set1.issubset(set2) and set1 != set2

set1 = {1, 2}
set2 = {1, 2, 3, 4}
result = is_proper_subset(set1, set2)
print(result)
# Output: True
```


#### Question 5:
Write a Python program to find the difference between two sets.

```python
set1 = {1, 2, 3, 4}
set2 = {3, 4, 5, 6}
difference_set = set1.difference(set2)
print(difference_set)
# Output: {1, 2}
```


#### Question 6:
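Create a function that removes a specific element from a set. A set holds each element at most once, so there is only one occurrence to remove.

```python
def remove_element(my_set, element):
    # discard() modifies the set in place and ignores missing elements
    my_set.discard(element)
    return my_set

# The repeated 1 and 2 in the literal are dropped when the set is created
original_set = {1, 2, 3, 4, 1, 2}
result_set = remove_element(original_set, 2)
print(result_set)
# Output: {1, 3, 4}
```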


#### Question 7:
Write a Python program to find the intersection of multiple sets.

```python
set1 = {1, 2, 3}
set2 = {2, 3, 4}
set3 = {3, 4, 5}
intersection_set = set.intersection(set1, set2, set3)
print(intersection_set)
# Output: {3}
```


#### Question 8:
Create a function that checks if two sets are disjoint.

```python
def are_disjoint(set1, set2):
    return set1.isdisjoint(set2)

set1 = {1, 2, 3}
set2 = {4, 5, 6}
result = are_disjoint(set1, set2)
print(result)
# Output: True
```


#### Question 9:
Write a Python program to perform a symmetric difference operation on sets.

```python
set1 = {1, 2, 3, 4}
set2 = {3, 4, 5, 6}
symmetric_difference = set1.symmetric_difference(set2)
print(symmetric_difference)
# Output: {1, 2, 5, 6}
```


#### Question 10:
Create a function to check if a set is a superset of another set.

```python
def is_superset(set1, set2):
    return set1.issuperset(set2)

set1 = {1, 2, 3, 4}
set2 = {2, 3}
result = is_superset(set1, set2)
print(result)
# Output: True
```


#### Question 11:
Write a Python program to find the Cartesian product of two sets.

```python
from itertools import product

set1 = {1, 2}
set2 = {'a', 'b'}
cartesian_product = set(product(set1, set2))
print(cartesian_product)
# Output (element order may vary): {(1, 'a'), (2, 'a'), (1, 'b'), (2, 'b')}
```


#### Question 12:
Create a function that returns the power set (all subsets) of a given set.

```python
from itertools import chain, combinations

def power_set(my_set):
    return list(chain.from_iterable(combinations(my_set, r) for r in range(len(my_set)+1)))

original_set = {1, 2, 3}
result_power_set = power_set(original_set)
print(result_power_set)
# Output: [(), (1,), (2,), (3,), (1, 2), (1, 3), (2, 3), (1, 2, 3)]
```


#### Explanation:


1. Import necessary modules:
   ```python
   from itertools import chain, combinations
   ```
   - The `itertools` module is used to work with iterators and generators.
   - Specifically, `chain` and `combinations` functions are imported.

2. Define the `power_set` function:
   ```python
   def power_set(my_set):
       return list(chain.from_iterable(combinations(my_set, r) for r in range(len(my_set)+1)))
   ```
   - The function takes a set `my_set` as an argument.
   - It uses a generator expression to create all combinations of the set's elements for each length `r` from 0 to the length of the set.
   - `chain.from_iterable` flattens these per-length groups of combinations into one sequence, which `list()` then collects.
   - The final result is a list of tuples, one per subset of the given set, including the empty tuple `()` for the empty set and a tuple of every element for the set itself.

3. Define the original set and call the function:
   ```python
   original_set = {1, 2, 3}
   result_power_set = power_set(original_set)
   ```

4. Print the result:
   ```python
   print(result_power_set)
   ```
   - The output will be:
     ```
     [(), (1,), (2,), (3,), (1, 2), (1, 3), (2, 3), (1, 2, 3)]
     ```
     This output represents all possible subsets (including the empty set and the original set) of the given set `{1, 2, 3}`.

The power set is a mathematical concept that refers to the set of all subsets of a set. A set with n elements has 2^n subsets, so the result grows quickly as the input grows. The code builds it concisely using `combinations` from the `itertools` module.
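
The function above returns tuples. If the subsets themselves should be sets, one possible variant (a sketch, with the hypothetical name `power_set_of_frozensets`) collects them as `frozenset` objects, since regular sets cannot be stored inside a set.

```python
from itertools import chain, combinations

def power_set_of_frozensets(items):
    """Return every subset of `items` as a frozenset, collected in a set."""
    elements = list(items)
    all_combos = chain.from_iterable(
        combinations(elements, r) for r in range(len(elements) + 1)
    )
    return {frozenset(combo) for combo in all_combos}

subsets = power_set_of_frozensets({1, 2, 3})
print(len(subsets))                    # 8, i.e. 2 ** 3
print(frozenset({1, 3}) in subsets)    # True
print(frozenset() in subsets)          # True
```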

#### Question 13:
Write a Python program to remove duplicate elements from a list by converting it to a set.

```python
my_list = [1, 2, 3, 2, 4, 5, 1]
# set() drops the duplicates; sorted() returns them as a list in a predictable order
unique_elements = sorted(set(my_list))
print(unique_elements)
# Output: [1, 2, 3, 4, 5]
```


#### Question 14:
Create a function that checks if a set is a frozen set.

```python
def is_frozen_set(my_set):
    return isinstance(my_set, frozenset)

frozen_set_example = frozenset({1, 2, 3})
result = is_frozen_set(frozen_set_example)
print(result)
# Output: True
```


#### Explanation:


1. Define the `is_frozen_set` function:
   ```python
   def is_frozen_set(my_set):
       return isinstance(my_set, frozenset)
   ```
   - The function takes a single argument `my_set`.
   - It uses the `isinstance` function to check if `my_set` is an instance of the `frozenset` class.
   - The function returns `True` if it's a frozen set and `False` otherwise.

2. Create an example of a frozen set:
   ```python
   frozen_set_example = frozenset({1, 2, 3})
   ```
   - `frozenset({1, 2, 3})` creates a frozen set with elements 1, 2, and 3.

3. Call the `is_frozen_set` function with the frozen set:
   ```python
   result = is_frozen_set(frozen_set_example)
   ```

4. Print the result:
   ```python
   print(result)
   ```
   - The output will be:
     ```
     True
     ```
     This indicates that the provided set (`frozen_set_example`) is indeed a frozen set.

A frozen set in Python is an immutable version of a set, and it is created using the `frozenset` constructor. The `is_frozen_set` function checks if a given set is a frozen set by using the `isinstance` function.
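
To show what "immutable" means in practice, here is a brief sketch. The variable names are illustrative.

```python
frozen = frozenset({1, 2, 3})

# Read-only operations work as they do on a regular set.
print(2 in frozen)                  # True
print(sorted(frozen | {4}))         # [1, 2, 3, 4]

# Mutating methods such as add() do not exist on frozenset.
print(hasattr(frozen, "add"))       # False

# Because it is hashable, a frozenset can be an element of another set.
nested = {frozenset({1, 2}), frozenset({3})}
print(frozenset({2, 1}) in nested)  # True
```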

#### Question 15:
Write a Python program to find the largest and smallest elements in a set.

```python
my_set = {10, 5, 3, 8, 15}
largest_element = max(my_set)
smallest_element = min(my_set)
print(f"Largest Element: {largest_element}, Smallest Element: {smallest_element}")
# Output: Largest Element: 15, Smallest Element: 3
```


#### `PART - III`

#### Question 1:
Finding Unique Words: Given a text file, write a Python program to find the number of unique words it contains.

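```python
def unique_words(filename):
    # Read the whole file, lowercase it, and split it on whitespace
    with open(filename, 'r') as f:
        words = set(f.read().lower().split())
    return len(words)

# Note: don't forget to create the "my_text.txt" file first
print(unique_words("my_text.txt"))  # Prints the number of unique words in the file
# Output (depends on the file contents): 25
```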

#### Question 2:
Common Interests: You have two sets representing the hobbies of Mary and John. Write a program to find all the hobbies they share.

```python
mary_hobbies = {"reading", "painting", "cooking"}
john_hobbies = {"hiking", "cooking", "photography"}

shared_hobbies = mary_hobbies.intersection(john_hobbies)
print(f"Shared hobbies: {shared_hobbies}")
# Output: Shared hobbies: {'cooking'}
```


#### Question 3:
Duplicate Movie Titles: You have a list of movie titles from different years. Write a program to find all titles that appear more than once, regardless of year.

```python
movie_titles = [
    ("The Godfather", 1972),
    ("The Shining", 1980),
    ("The Godfather Part II", 1974),
    ("Star Wars", 1977),
    ("Star Wars: Episode V - The Empire Strikes Back", 1980),
]

seen_titles = set()
duplicates = set()
for title, year in movie_titles:
    if title in seen_titles:
        duplicates.add(title)   # title was already seen, so it is a duplicate
    else:
        seen_titles.add(title)

print(f"Duplicate titles: {duplicates}")
# Output: Duplicate titles: set()
# No exact title repeats in this sample ("The Godfather" and
# "The Godfather Part II" are different strings), so the result is empty.
```


#### Question 4:
Restaurant Recommender: You have a set of cuisines liked by a user and a list of restaurants offering different cuisines. Write a program to recommend restaurants based on the user's preferences.

```python
user_cuisines = {"Italian", "Thai"}
restaurants = [
    ("La Cucina", "Italian"),
    ("Spice of Asia", "Thai"),
    ("Taco Fiesta", "Mexican"),
]

# Keep the name of every restaurant whose cuisine the user likes
recommended_restaurants = {name for name, cuisine in restaurants if cuisine in user_cuisines}
print(f"Recommended restaurants: {recommended_restaurants}")
# Output (element order may vary): Recommended restaurants: {'Spice of Asia', 'La Cucina'}
```


#### Question 5:
Missing Numbers: You have a set containing consecutive numbers except for one missing number. Write a program to find the missing number.

```python
numbers = set(range(1, 11))
numbers.remove(5)

missing_number = sum(range(1, 11)) - sum(numbers)
print(f"Missing number: {missing_number}")
# Output: Missing number: 5
```


#### Question 6:
Anagram Detector: Write a program to efficiently check if two words are anagrams (contain the same letters in different orders).

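```python
from collections import Counter

def is_anagram(word1, word2):
    # Comparing set(word1) == set(word2) ignores how often each letter occurs
    # (it would accept "aab" and "abb"), so compare letter counts instead.
    return Counter(word1.lower()) == Counter(word2.lower())

print(is_anagram("silent", "listen"))  # True
print(is_anagram("apple", "grape"))    # False
```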

### Homework Assignment on Sets in Python

#### Question 1:
Write a Python program to create an empty set.

#### Question 2:
Create a function that checks if two sets are disjoint.

#### Question 3:
Write a Python program to find the union of two sets.

#### Question 4:
Create a function that returns the symmetric difference of two sets.

#### Question 5:
Write a Python program to remove from one set the elements it shares with another set.

#### Question 6:
Create a function that calculates the Jaccard similarity coefficient between two sets.

#### Question 7:
Write a Python program to find the intersection of multiple sets.

#### Question 8:
Create a function that generates a power set of a given set.

#### Question 9:
Write a Python program to apply a set operation (for example, union) across a list of sets.

#### Question 10:
Create a function that checks if a set is a subset of another set.


### 11. Interesting Facts

- `Mathematical Roots`: Sets in Python are inspired by mathematical sets, making them a powerful tool for mathematical operations in programming.
- `Hash Tables`: Sets are implemented using hash tables, so a membership test takes roughly constant time on average, regardless of the size of the set.
- The `frozenset` data type is an immutable version of a set.
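
The hash-table point can be explored with a rough timing sketch like the one below. The collection size and repeat count are arbitrary choices, and the measured times vary by machine, so no expected numbers are shown.

```python
import timeit

values = list(range(100_000))
as_list = values
as_set = set(values)
target = 99_999  # last element: the slowest case for a list scan

# Time 100 membership tests against each container.
list_time = timeit.timeit(lambda: target in as_list, number=100)
set_time = timeit.timeit(lambda: target in as_set, number=100)

print(f"list: {list_time:.5f}s, set: {set_time:.5f}s")
```

On a typical run the set lookup should be far faster, because the list is scanned element by element while the set uses a hash lookup.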