In [1]:
# !pip install rich
from rich import print

## <span style='color: blue'>Learn Python</span> - Sets

- Creating Sets
- Adding & Removing Elements
- Set Operations (Union, Difference & Intersection)
- Keywords & Functions
- Subset Operators

<span style='color: blue'>**Creating Sets**</span>

<span style='color: blue'>**Sets**</span> are <span style='color: magenta'>**unordered**</span> collections of <span style='color: magenta'>**unique**</span> elements. A set can be written as <span style='color: magenta'>**comma-separated**</span> elements inside curly braces, or created from any iterable with the <span style='color: magenta'>**set()**</span> function.

<span style='color: magenta'>**NOTE:**</span> Although a set literal uses curly braces, an empty set cannot be created with empty curly braces:
```python
    # Incorrect - This will create an empty dictionary.
    new_set = {}
    
    # Correct
    new_set = set()
```
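
The points above can be sketched with a few small sets. The variable names here are illustrative only and do not come from the rest of this notebook.

```python
# A set literal with at least one element uses curly braces
vowels = {'a', 'e', 'i', 'o', 'u'}

# set() accepts any iterable; duplicates are dropped on creation
letters = set('mississippi')
numbers = set([1, 2, 2, 3, 3, 3])

# Empty curly braces create a dict, not a set
empty_braces = {}
empty_set = set()

print(type(empty_braces).__name__)      # dict
print(type(empty_set).__name__)         # set
print(numbers == {1, 2, 3})             # True
print(letters == {'m', 'i', 's', 'p'})  # True
```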

<span style='color: blue'>**Adding & Removing Elements**</span>

Adding elements to a <span style='color: blue'>**set**</span> can be done with the <span style='color: magenta'>**add()**</span> method.

In [2]:
# Create a populated set
cars = {'Aston Martin', 'Bentley', 'Ferrari'}

# Add a value with .add() method
cars.add('Lamborghini')

print(cars)

The <span style='color: blue'>**set**</span> method <span style='color: magenta'>**update()**</span> can add the elements from other <span style='color: magenta'>**sets**</span>, <span style='color: magenta'>**lists**</span>, <span style='color: magenta'>**tuples**</span> and <span style='color: magenta'>**dictionaries**</span>. When a dictionary is passed, only its keys are added.

In [3]:
# Create a populated set
cars = {'Aston Martin', 'Bentley', 'Ferrari'}

# Add more values with .update() and a set
cars.update({'Lexus', 'Porsche'})

# Add more values with .update() and a list
cars.update(['BMW', 'Jaguar'])

print(cars)
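
The cell above covers sets and lists. As a rough sketch of the remaining cases, the example below passes a tuple and a dictionary to <span style='color: magenta'>**update()**</span>, and shows one pitfall: a string is itself an iterable, so it is added character by character. The extra car names and the `letters` set are made up for illustration.

```python
cars = {'Aston Martin', 'Bentley', 'Ferrari'}

# A tuple works the same way as a list
cars.update(('Lotus', 'McLaren'))

# A dictionary contributes its keys only; the values are ignored
cars.update({'Mini': 'UK', 'Fiat': 'Italy'})

print('Mini' in cars)  # True
print('UK' in cars)    # False
print(len(cars))       # 7

# Pitfall: a string is split into its characters
letters = set()
letters.update('Kia')
print(letters == {'K', 'i', 'a'})  # True
```

To add a single string as one element, use <span style='color: magenta'>**add()**</span> or wrap the string in a list.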

The <span style='color: blue'>**set**</span> method <span style='color: magenta'>**update()**</span> can take more than one iterable as arguments.

In [4]:
# Create a populated set
cars = {'Aston Martin', 'Bentley', 'Ferrari'}

# Add more values with .update() and a list and a set
cars.update(['Lexus', 'Porsche'], {'BMW', 'Rolls-Royce'})

print(cars)

Python <span style='color: blue'>**sets**</span> do not allow <span style='color: magenta'>**duplicates**</span>. <span style='color: magenta'>**Updating**</span> with a value that already exists in the <span style='color: blue'>**set**</span> has no effect.

In [5]:
# Create a populated set
muppets = {'Kermit', 'Miss Piggy', 'Fozzie Bear', 'Gonzo'}

# Update with a muppet name that already exists in the set
muppets.update({'Gonzo'})

# Check update has had no effect
print(muppets)


The fact that <span style='color: blue'>**sets**</span> do <span style='color: magenta'>**not allow duplicates**</span> makes them ideal for removing duplicates from other data containers such as lists.

In [6]:
# Define a large list
my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9] * 10000

In [7]:
# Use a list comprehension to remove duplicates and timeit
# Note: unique_list persists between timeit loops, so only the first loop appends
unique_list = []
%timeit [unique_list.append(x) for x in my_list if x not in unique_list]

2.2 ms ± 54 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [8]:
# Use the dict.fromkeys() method to remove duplicates and timeit
%timeit unique_list = list(dict.fromkeys(my_list))

623 µs ± 10.4 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


In [9]:
# Convert to a set and back to a list to remove duplicates and timeit
%timeit list(set(my_list))

561 µs ± 3.97 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
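
Speed is not the only difference between these approaches. The sketch below, using a short made-up list of names, illustrates that <span style='color: magenta'>**dict.fromkeys()**</span> keeps the first occurrence of each item in its original position, while converting to a <span style='color: blue'>**set**</span> makes no promise about order.

```python
names = ['Kermit', 'Gonzo', 'Kermit', 'Fozzie Bear', 'Gonzo']

# dict.fromkeys() keeps the first occurrence of each item, in order
ordered_unique = list(dict.fromkeys(names))
print(ordered_unique)  # ['Kermit', 'Gonzo', 'Fozzie Bear']

# set() removes duplicates but the resulting order is arbitrary,
# so sort the result when a stable order is needed
unordered_unique = list(set(names))
print(sorted(unordered_unique))  # ['Fozzie Bear', 'Gonzo', 'Kermit']
```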


Elements can be <span style='color: magenta'>**removed**</span> from <span style='color: blue'>**sets**</span> in a number of different ways.

The <span style='color: blue'>**remove()**</span> method removes an element from a set. <span style='color: magenta'>**Note**</span>: if the element is not found, a <span style='color: magenta'>**KeyError**</span> is raised.

In [10]:
# Create a populated set
fraggles = {"Gobo", "Mokey", "Red", "Wembley"}

# Remove an element using .remove()
fraggles.remove("Mokey")

print(fraggles)

The <span style='color: blue'>**discard()**</span> method is similar to <span style='color: magenta'>**remove()**</span>; however, it <span style='color: magenta'>**will not**</span> raise an <span style='color: magenta'>**error**</span> if the element is missing.

In [11]:
# Create a populated set
fraggles = {"Gobo", "Mokey", "Red", "Wembley"}

# Remove an element using .discard()
fraggles.discard("Boober")

print(fraggles)
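
To make the contrast concrete, here is a minimal sketch that tries to delete the same missing element with both methods. The `remove_raised` flag is only there to record what happened.

```python
fraggles = {"Gobo", "Mokey", "Red", "Wembley"}

# remove() raises KeyError when the element is missing
remove_raised = False
try:
    fraggles.remove("Boober")
except KeyError as error:
    remove_raised = True
    print(f'KeyError: {error}')  # KeyError: 'Boober'

# discard() silently does nothing in the same situation
fraggles.discard("Boober")

print(remove_raised)  # True
print(len(fraggles))  # 4
```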


The <span style='color: blue'>**pop()**</span> method removes an <span style='color: magenta'>**arbitrary element**</span> from the set and <span style='color: magenta'>**returns**</span> the removed element. Since sets are <span style='color: magenta'>**unordered**</span>, the removed element is not necessarily the last element added to the set.

In [12]:
# Create a populated set
fraggles = {"Gobo", "Mokey", "Red", "Wembley"}

# Remove an element using .pop()
removed_fraggle = fraggles.pop()

print(f'{fraggles = }')
print(f'{removed_fraggle = }')
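
One way to see this behaviour is to keep popping until nothing is left. The sketch below drains a set into a list (the `popped` list is a helper introduced for this example) and then shows that calling <span style='color: blue'>**pop()**</span> on an empty set raises a `KeyError`.

```python
fraggles = {"Gobo", "Mokey", "Red", "Wembley"}
popped = []

# Keep popping until the set is empty; the order is arbitrary
while fraggles:
    popped.append(fraggles.pop())

print(sorted(popped))  # ['Gobo', 'Mokey', 'Red', 'Wembley']
print(fraggles)        # set()

# pop() on an empty set raises KeyError
pop_raised = False
try:
    fraggles.pop()
except KeyError:
    pop_raised = True
    print('Cannot pop from an empty set')
```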

The <span style='color: blue'>**clear()**</span> method <span style='color: magenta'>**removes**</span> all the elements from the <span style='color: blue'>**set**</span>.

In [13]:
# Create a populated set
fraggles = {"Gobo", "Mokey", "Red", "Wembley"}

# Use .clear() to remove all elements
fraggles.clear()

print(fraggles)

<span style='color: blue'>**Set Operations**</span> - Union, Difference & Intersection

The <span style='color: blue'>**union()**</span> method returns a new set containing <span style='color: magenta'>**all elements**</span> from both sets, without any duplicates.

In [14]:
# Create two populated sets
cats_1 = {'Garfield', 'Tom', 'Felix', 'Hello Kitty', 'Sylvester'}
cats_2 = {'Tom', 'Sylvester', 'Tweety', 'Top Cat', 'Pink Panther'}

# Use the .union() method
union = cats_1.union(cats_2)
print(union)

The <span style='color: blue'>**difference()**</span> method returns a set of all elements that are <span style='color: magenta'>**in the first set**</span> but <span style='color: magenta'>**not in the second**</span>.

In [15]:
# Create two populated sets
cats_1 = {'Garfield', 'Tom', 'Felix', 'Hello Kitty', 'Sylvester'}
cats_2 = {'Tom', 'Sylvester', 'Tweety', 'Top Cat', 'Pink Panther'}

# Use the .difference() method
diff = cats_1.difference(cats_2)
print(diff)
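
Unlike union, the difference depends on which set the method is called on. A short sketch using the same two sets (the names `only_in_1` and `only_in_2` are introduced here for illustration):

```python
cats_1 = {'Garfield', 'Tom', 'Felix', 'Hello Kitty', 'Sylvester'}
cats_2 = {'Tom', 'Sylvester', 'Tweety', 'Top Cat', 'Pink Panther'}

# Elements of cats_1 that are not in cats_2
only_in_1 = cats_1.difference(cats_2)

# Swapping the sets gives a different result
only_in_2 = cats_2.difference(cats_1)

print(sorted(only_in_1))  # ['Felix', 'Garfield', 'Hello Kitty']
print(sorted(only_in_2))  # ['Pink Panther', 'Top Cat', 'Tweety']
```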

The <span style='color: blue'>**symmetric_difference()**</span> of two sets is the set of elements that are in <span style='color: magenta'>**either**</span> of the sets, but <span style='color: magenta'>**not in both**</span>.

In [16]:
# Create two populated sets
cats_1 = {'Garfield', 'Tom', 'Felix', 'Hello Kitty', 'Sylvester'}
cats_2 = {'Tom', 'Sylvester', 'Tweety', 'Top Cat', 'Pink Panther'}

# Use the .symmetric_difference() method
sym_diff = cats_1.symmetric_difference(cats_2)
print(sym_diff)

The <span style='color: blue'>**intersection()**</span> method returns a set of all elements that are <span style='color: magenta'>**in both sets**</span>.

In [17]:
# Create two populated sets
cats_1 = {'Garfield', 'Tom', 'Felix', 'Hello Kitty', 'Sylvester'}
cats_2 = {'Tom', 'Sylvester', 'Tweety', 'Top Cat', 'Pink Panther'}

# Use the .intersection() method
inter = cats_1.intersection(cats_2)
print(inter)

The <span style='color: blue'>**intersection_update()**</span> method <span style='color: magenta'>**differs**</span> from <span style='color: blue'>**intersection()**</span> in that it is an <span style='color: magenta'>**in-place**</span> operation: it modifies the set it is called on instead of returning a new set.

In [18]:
# Create two populated sets
cats_1 = {'Garfield', 'Tom', 'Felix', 'Hello Kitty', 'Sylvester'}
cats_2 = {'Tom', 'Sylvester', 'Tweety', 'Top Cat', 'Pink Panther'}

# Use the .intersection_update() method
cats_1.intersection_update(cats_2)
print(cats_1)

Just like the aforementioned method, there are also <span style='color: blue'>**difference_update()**</span> and <span style='color: blue'>**symmetric_difference_update()**</span> methods that differ from their <span style='color: magenta'>**non-update counterparts**</span> only in that they perform <span style='color: magenta'>**in-place**</span> operations.
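
As a small sketch of these two methods, the example below uses shortened versions of the cat sets (`cats_3` is an extra copy added for this example). Like other in-place methods, they return `None`, so the result should not be assigned to a variable.

```python
cats_1 = {'Garfield', 'Tom', 'Felix'}
cats_2 = {'Tom', 'Tweety'}

# In-place methods modify the set and return None
result = cats_1.difference_update(cats_2)
print(result)          # None
print(sorted(cats_1))  # ['Felix', 'Garfield']

# Keep elements that are in exactly one of the two sets
cats_3 = {'Garfield', 'Tom', 'Felix'}
cats_3.symmetric_difference_update(cats_2)
print(sorted(cats_3))  # ['Felix', 'Garfield', 'Tweety']
```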

The <span style='color: blue'>**set operations**</span> have <span style='color: magenta'>**mathematical operators**</span> that can be used as <span style='color: magenta'>**shorthand**</span>.

```python
    # union using |
    union_set = set1 | set2
    
    # difference using -
    difference_set = set1 - set2
```

```python
    # symmetric difference using ^
    symmetric_difference_set = set1 ^ set2
    
    # intersection using &
    intersection_set = set1 & set2
```
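
The snippets above leave `set1` and `set2` undefined. A runnable sketch with two small example sets of integers (values chosen here purely for illustration) could look like this. It also shows one difference from the methods: the operators need a set on both sides, while the methods accept any iterable.

```python
set1 = {1, 2, 3, 4}
set2 = {3, 4, 5, 6}

print(sorted(set1 | set2))  # [1, 2, 3, 4, 5, 6]
print(sorted(set1 - set2))  # [1, 2]
print(sorted(set1 ^ set2))  # [1, 2, 5, 6]
print(sorted(set1 & set2))  # [3, 4]

# Operators raise TypeError when the other operand is not a set
operator_failed = False
try:
    set1 | [5, 6]
except TypeError:
    operator_failed = True
    print('The | operator needs a set on both sides')

# The equivalent method accepts a list
print(sorted(set1.union([5, 6])))  # [1, 2, 3, 4, 5, 6]
```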

Python <span style='color: blue'>**sets**</span> are <span style='color: magenta'>**efficient**</span> for comparing data containers: they are implemented as hash tables, so a membership test takes <span style='color: magenta'>**constant time**</span> on average. Using set operators can offer <span style='color: magenta'>**elegant solutions**</span> to complex problems that would otherwise require slower and more complicated code.

In [19]:
# Define the favorite, Marvel, and DC superhero lists
favorite_superheroes = ['Spider-Man', 'Iron Man', 'Batman', 'Wonder Woman']
marvel_superheroes = ['Iron Man', 'Spider-Man', 'Thor', 'Captain America']
dc_superheroes = ['Superman', 'Batman', 'Wonder Woman', 'Flash']

# Do intersections in two different ways
favorite_marvel_superheroes = set(favorite_superheroes).intersection(marvel_superheroes)
favorite_dc_superheroes = set(favorite_superheroes) & set(dc_superheroes)

print(f'{favorite_marvel_superheroes = }')
print(f'{favorite_dc_superheroes = }')


<span style='color: blue'>**Keywords & Functions**</span>

There are a number of useful Python <span style='color: blue'>**keywords**</span> and <span style='color: blue'>**functions**</span> that can be used with <span style='color: blue'>**sets**</span>.

The <span style='color: blue'>**in**</span> keyword checks set membership, returning a <span style='color: magenta'>**Boolean**</span>.

In [20]:
# Create a populated set
dogs = {'Scooby-Doo', 'Snoopy', 'Pluto', 'Lassie', 'Beethoven'}

# Print Boolean value
print('Scooby-Doo' in dogs)
print('Droopy' in dogs)


The <span style='color: blue'>**len()**</span> function in Python can be used to find the <span style='color: magenta'>**number of elements**</span> in a set.

In [21]:
# Create populated set
looney_tunes = {'Bugs Bunny', 'Daffy Duck', 'Tweety', 'Sylvester', 'Elmer Fudd', 'Porky Pig'}

# Use len() to get length 
number_of_characters = len(looney_tunes)

print(number_of_characters)
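
A few other built-ins also accept sets. The sketch below uses a made-up set of integers to show `not in`, `sorted()`, `min()`, `max()` and `sum()`. Note that `sorted()` returns a list, which is a convenient way to get a predictable order out of a set.

```python
scores = {42, 7, 19, 88}

# 'not in' is the negated membership test
print(19 not in scores)  # False

# sorted() returns a new list in ascending order
print(sorted(scores))    # [7, 19, 42, 88]

# min(), max() and sum() work on any iterable of numbers
print(min(scores))       # 7
print(max(scores))       # 88
print(sum(scores))       # 156
```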


<span style='color: blue'>**Subset Operators**</span> - issubset(), issuperset() & isdisjoint()

The <span style='color: blue'>**issubset()**</span> method is used to check whether <span style='color: magenta'>**all elements**</span> in one set are also <span style='color: magenta'>**present**</span> in another <span style='color: magenta'>**set**</span>. Returns a <span style='color: magenta'>**Boolean**</span> value.

In [22]:
# Create and populate 2 sets
soccer_players_1 = {'Messi', 'Suarez', 'Neymar', 'Alba', 'Pique'}
soccer_players_2 = {'Suarez', 'Neymar', 'Alba'}

# Store Boolean value in variable
is_subset = soccer_players_2.issubset(soccer_players_1)

print(is_subset)

The <span style='color: blue'>**issuperset()**</span> method is used to check whether a set contains <span style='color: magenta'>**all the elements**</span> of <span style='color: magenta'>**another set**</span>. It returns a <span style='color: magenta'>**Boolean**</span> response.

In [23]:
# Create and populate 2 sets
soccer_players_1 = {'Messi', 'Suarez', 'Neymar', 'Alba', 'Pique'}
soccer_players_2 = {'Suarez', 'Neymar', 'Alba'}

# Store Boolean value in variable
is_superset = soccer_players_1.issuperset(soccer_players_2)

# Print the result
print(is_superset)
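
Both checks also have comparison-operator forms. As a brief sketch with the same two sets: `<=` and `>=` correspond to <span style='color: blue'>**issubset()**</span> and <span style='color: blue'>**issuperset()**</span>, while `<` and `>` test for a proper subset or superset, meaning the two sets must not be equal.

```python
soccer_players_1 = {'Messi', 'Suarez', 'Neymar', 'Alba', 'Pique'}
soccer_players_2 = {'Suarez', 'Neymar', 'Alba'}

# <= is the operator form of issubset()
print(soccer_players_2 <= soccer_players_1)  # True

# >= is the operator form of issuperset()
print(soccer_players_1 >= soccer_players_2)  # True

# A set is a subset of itself, but not a proper subset
print(soccer_players_1 <= soccer_players_1)  # True
print(soccer_players_1 < soccer_players_1)   # False

# soccer_players_2 is smaller, so it is a proper subset
print(soccer_players_2 < soccer_players_1)   # True
```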

The <span style='color: blue'>**isdisjoint()**</span> method is used to check whether two sets have <span style='color: magenta'>**any elements**</span> in common. It returns a <span style='color: magenta'>**Boolean**</span> response.

In [24]:
# Create and populate 2 sets
soccer_players_1 = {'Messi', 'Suarez', 'Neymar', 'Alba', 'Pique'}
soccer_players_2 = {'Ronaldo', 'Modric', 'Benzema', 'Casemiro', 'Kroos'}

# Store Boolean value in variable
is_disjoint = soccer_players_1.isdisjoint(soccer_players_2)

print(is_disjoint)

A <span style='color: blue'>**True**</span> response means the sets have <span style='color: magenta'>**no elements in common**</span>.
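
For comparison, here is a sketch of the opposite case. The third set, `soccer_players_3`, is a hypothetical addition that shares one name with the first set, so the check returns `False`. The last line illustrates that being disjoint is the same as having an empty intersection.

```python
soccer_players_1 = {'Messi', 'Suarez', 'Neymar', 'Alba', 'Pique'}
soccer_players_2 = {'Ronaldo', 'Modric', 'Benzema', 'Casemiro', 'Kroos'}
soccer_players_3 = {'Messi', 'Ronaldo'}  # hypothetical set that overlaps with both

print(soccer_players_1.isdisjoint(soccer_players_2))  # True
print(soccer_players_1.isdisjoint(soccer_players_3))  # False

# Disjoint sets have an empty intersection
print(len(soccer_players_1 & soccer_players_2) == 0)  # True
```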