# Sets - Advanced Operations and Methods

## Overview

This notebook covers advanced set topics that build upon the basics:
- **Set comparison methods** - subset, superset, and disjoint relationships
- **Mathematical set operations** - union (∪), intersection (∩), difference (-), symmetric difference (⊕)
- **In-place update operations** - modifying sets with |=, &=, -=, ^= operators
- **Set comprehensions** - efficient set creation with filtering and transformations
- **Frozensets** - immutable sets for use as dictionary keys and set elements
- **Practical examples** - real-world applications of set operations
- **Performance considerations** - when and why sets are faster than lists


*Note: Basic set methods (add, remove, discard, etc.) are covered in the basics notebook.*

---


## Set Comparison Methods

In [None]:
# Creating sets for comparison examples
set_a = {1, 2, 3, 4, 5}
set_b = {3, 4, 5, 6, 7}
set_c = {1, 2, 3}
set_d = {1, 2, 3, 4, 5, 6, 7, 8}

print(f"Set A: {set_a}")
print(f"Set B: {set_b}")
print(f"Set C: {set_c}")
print(f"Set D: {set_d}")

print("\n=== Subset/Superset Tests ===")
# issubset() - checks if all elements are in another set
print(f"C is subset of A: {set_c.issubset(set_a)}")
print(f"A is subset of C: {set_a.issubset(set_c)}")
print(f"A is subset of D: {set_a.issubset(set_d)}")

# issuperset() - checks if contains all elements of another set
print(f"\nA is superset of C: {set_a.issuperset(set_c)}")
print(f"C is superset of A: {set_c.issuperset(set_a)}")
print(f"D is superset of A: {set_d.issuperset(set_a)}")

# isdisjoint() - checks if no common elements
print(f"\nA and B are disjoint: {set_a.isdisjoint(set_b)}")
set_e = {10, 11, 12}
print(f"A and E are disjoint: {set_a.isdisjoint(set_e)}")
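
The comparison methods above also have operator forms. The sketch below reuses the values of `set_a` and `set_c` and adds the strict variants `<` and `>`, which have no method equivalent. The result names (`is_subset`, `self_proper`, and so on) are illustrative and not part of the original notebook.

```python
# Operator forms of the subset/superset tests (values mirror set_a and set_c above)
set_a = {1, 2, 3, 4, 5}
set_c = {1, 2, 3}

is_subset = set_c <= set_a         # same as set_c.issubset(set_a)
is_superset = set_a >= set_c       # same as set_a.issuperset(set_c)
is_proper_subset = set_c < set_a   # subset AND the two sets are not equal
self_proper = set_a < set_a        # a set is never a proper subset of itself
self_subset = set_a <= set_a       # but it is a (non-strict) subset of itself

print(f"C <= A: {is_subset}")
print(f"A >= C: {is_superset}")
print(f"C < A:  {is_proper_subset}")
print(f"A < A:  {self_proper}")
print(f"A <= A: {self_subset}")
```

The strict operators are useful when equal sets should not count as a match.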

## Mathematical Set Operations

These are the classic set theory operations that work between two or more sets. Each operation can be performed using either operators or method calls:

- **Union (∪)**: All unique elements from both sets - use `|` or `.union()`
- **Intersection (∩)**: Common elements in both sets - use `&` or `.intersection()`
- **Difference (-)**: Elements in first set but not second - use `-` or `.difference()`
- **Symmetric Difference (⊕)**: Elements in either set but not both - use `^` or `.symmetric_difference()`

In [None]:
# Sample sets for mathematical operations
math_students = {"Alice", "Bob", "Charlie", "Diana", "Eve"}
physics_students = {"Bob", "Charlie", "Frank", "Grace", "Henry"}
chemistry_students = {"Alice", "Charlie", "Grace", "Ivan", "Jack"}

print(f"Math students: {math_students}")
print(f"Physics students: {physics_students}")
print(f"Chemistry students: {chemistry_students}")

In [None]:
# UNION - all unique elements from both sets
print("=== UNION OPERATIONS ===")

# Using | operator
all_students = math_students | physics_students
print(f"Math ∪ Physics (using |): {all_students}")

# Using union() method
all_students = math_students.union(physics_students)
print(f"Math ∪ Physics (using union()): {all_students}")

# Union of multiple sets
all_science_students = math_students.union(physics_students, chemistry_students)
print(f"\nAll science students: {all_science_students}")
print(f"Total unique students: {len(all_science_students)}")

In [None]:
# INTERSECTION - common elements in both sets
print("=== INTERSECTION OPERATIONS ===")

# Using & operator
math_and_physics = math_students & physics_students
print(f"Math ∩ Physics (using &): {math_and_physics}")

# Using intersection() method
math_and_physics = math_students.intersection(physics_students)
print(f"Math ∩ Physics (using intersection()): {math_and_physics}")

# Intersection of multiple sets
all_three_subjects = math_students.intersection(physics_students, chemistry_students)
print(f"\nStudents in all three subjects: {all_three_subjects}")

# Students taking both math and chemistry
math_and_chemistry = math_students & chemistry_students
print(f"Math ∩ Chemistry: {math_and_chemistry}")

In [None]:
# DIFFERENCE - elements in first set but not in second
print("=== DIFFERENCE OPERATIONS ===")

# Using - operator
only_math = math_students - physics_students
print(f"Only Math (using -): {only_math}")

# Using difference() method
only_physics = physics_students.difference(math_students)
print(f"Only Physics (using difference()): {only_physics}")

# Students taking math but not chemistry
math_not_chemistry = math_students - chemistry_students
print(f"\nMath but not Chemistry: {math_not_chemistry}")

# Multiple differences
only_math_exclusive = math_students.difference(physics_students, chemistry_students)
print(f"Only Math (exclusive): {only_math_exclusive}")

In [None]:
# SYMMETRIC DIFFERENCE - elements in either set but not in both
print("=== SYMMETRIC DIFFERENCE OPERATIONS ===")

# Using ^ operator
math_xor_physics = math_students ^ physics_students
print(f"Math ⊕ Physics (using ^): {math_xor_physics}")

# Using symmetric_difference() method
math_xor_physics = math_students.symmetric_difference(physics_students)
print(f"Math ⊕ Physics (using symmetric_difference()): {math_xor_physics}")

# Students taking exactly one of math or chemistry
math_xor_chemistry = math_students ^ chemistry_students
print(f"\nMath ⊕ Chemistry: {math_xor_chemistry}")

# Verification: symmetric difference = (A - B) ∪ (B - A)
verification = (math_students - physics_students) | (physics_students - math_students)
print(f"\nVerification (A-B)∪(B-A): {verification}")
print(f"Same as A⊕B: {verification == math_xor_physics}")
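
One practical difference between the operator and method forms shown above: the methods accept any iterable, while the operators require both operands to be sets or frozensets. A minimal sketch, using a hypothetical `evens` set that is not part of the original examples:

```python
evens = {2, 4, 6}  # hypothetical sample data

# Methods accept any iterable: lists, tuples, ranges, ...
merged = evens.union([6, 8], range(10, 12))
print(f"union() with a list and a range: {sorted(merged)}")

# Operators require set or frozenset operands on both sides
operator_failed = False
try:
    evens | [6, 8]
except TypeError as e:
    operator_failed = True
    print(f"| with a list failed: {e}")
```

When the other operand may be a list or generator, the method form avoids an explicit `set(...)` conversion.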

## Update Operations (Modify Sets In-Place)

These methods, and their augmented-assignment operator forms, modify the original set in place instead of creating a new one:

In [None]:
# In-place update operations
set1 = {1, 2, 3, 4}
set2 = {3, 4, 5, 6}
set3 = {5, 6, 7, 8}

print(f"Original set1: {set1}")
print(f"set2: {set2}")
print(f"set3: {set3}")

# update() or |= - union update
set1_copy = set1.copy()
set1_copy.update(set2)
print(f"\nAfter set1.update(set2): {set1_copy}")

set1_copy = set1.copy()
set1_copy |= set2
print(f"After set1 |= set2: {set1_copy}")

# intersection_update() or &= - intersection update
set1_copy = set1.copy()
set1_copy.intersection_update(set2)
print(f"\nAfter set1.intersection_update(set2): {set1_copy}")

set1_copy = set1.copy()
set1_copy &= set2
print(f"After set1 &= set2: {set1_copy}")

# difference_update() or -= - difference update
set1_copy = set1.copy()
set1_copy.difference_update(set2)
print(f"\nAfter set1.difference_update(set2): {set1_copy}")

set1_copy = set1.copy()
set1_copy -= set2
print(f"After set1 -= set2: {set1_copy}")

# symmetric_difference_update() or ^= - symmetric difference update
set1_copy = set1.copy()
set1_copy.symmetric_difference_update(set2)
print(f"\nAfter set1.symmetric_difference_update(set2): {set1_copy}")

set1_copy = set1.copy()
set1_copy ^= set2
print(f"After set1 ^= set2: {set1_copy}")
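
The in-place behavior matters when more than one name refers to the same set. The sketch below contrasts `|=` with the plain `|` operator; the names `original` and `alias` are illustrative.

```python
original = {1, 2, 3}
alias = original              # a second name for the same set object

original |= {4}               # in-place update: the shared object changes
alias_after_inplace = set(alias)
print(f"alias after |= : {sorted(alias)}")

original = original | {5}     # builds a new set and rebinds only 'original'
print(f"alias after |  : {sorted(alias)}")
print(f"original       : {sorted(original)}")
print(f"Same object?   : {alias is original}")
```

After the `|=` step both names see the new element; after the plain `|` step only `original` does, because it now points to a different set.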

## Set Comprehensions

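Set comprehensions use the same syntax as list comprehensions, with curly braces instead of square brackets, and duplicates are discarded automatically: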

In [None]:
# Basic set comprehension
squares = {x**2 for x in range(10)}
print(f"Squares: {squares}")

# Set comprehension with condition
even_squares = {x**2 for x in range(10) if x % 2 == 0}
print(f"Even squares: {even_squares}")

# Processing strings
words = ["hello", "world", "python", "programming"]
first_letters = {word[0].upper() for word in words}
print(f"\nFirst letters: {first_letters}")

# Unique lengths
word_lengths = {len(word) for word in words}
print(f"Word lengths: {word_lengths}")

# More complex example: unique vowels in words
sentences = ["The quick brown fox", "jumps over the lazy dog"]
all_vowels = {char.lower() for sentence in sentences 
              for char in sentence 
              if char.lower() in 'aeiou'}
print(f"\nUnique vowels in sentences: {all_vowels}")

In [None]:
# Set comprehension for data processing
# Example: Processing a list of dictionaries
students = [
    {"name": "Alice", "grade": 85, "subject": "Math"},
    {"name": "Bob", "grade": 92, "subject": "Physics"},
    {"name": "Charlie", "grade": 78, "subject": "Math"},
    {"name": "Diana", "grade": 88, "subject": "Chemistry"},
    {"name": "Eve", "grade": 95, "subject": "Physics"}
]

# Unique subjects
subjects = {student["subject"] for student in students}
print(f"Subjects offered: {subjects}")

# Students with high grades (90+)
top_students = {student["name"] for student in students if student["grade"] >= 90}
print(f"Top students (90+): {top_students}")

# Grade categories
grade_categories = {"A" if s["grade"] >= 90 else "B" if s["grade"] >= 80 else "C" 
                   for s in students}
print(f"Grade categories: {grade_categories}")
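
A common use of set comprehensions is deduplicating values after normalizing them. The sketch below uses a made-up `raw_emails` list (not from the original notebook) and also shows that empty braces create a dictionary, not a set.

```python
# Hypothetical input with inconsistent case and stray whitespace
raw_emails = [
    "Alice@Example.com",
    "alice@example.com ",
    "BOB@example.com",
    "bob@example.com",
]

# Normalize each value, then let the set drop the duplicates
unique_emails = {email.strip().lower() for email in raw_emails}
print(f"Unique emails: {sorted(unique_emails)}")

# {} is an empty dict; use set() for an empty set
empty_braces = {}
empty_set = set()
print(f"type({{}}): {type(empty_braces).__name__}")
print(f"type(set()): {type(empty_set).__name__}")
```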

## Frozensets - Immutable Sets

Frozensets are immutable versions of sets that can be used as dictionary keys or elements in other sets:

In [None]:
# Creating frozensets
frozen_colors = frozenset(["red", "green", "blue"])
frozen_numbers = frozenset(range(1, 6))

print(f"Frozen colors: {frozen_colors}")
print(f"Frozen numbers: {frozen_numbers}")
print(f"Type: {type(frozen_colors)}")

# Frozensets support all query operations
print(f"\n'red' in frozen_colors: {'red' in frozen_colors}")
print(f"Length: {len(frozen_colors)}")

# Set operations work with frozensets
regular_set = {"blue", "yellow", "purple"}
union_result = frozen_colors | regular_set
print(f"\nUnion with regular set: {union_result}")
print(f"Result type: {type(union_result)}")

# Frozensets are hashable - can be used as dictionary keys
team_scores = {
    frozenset(["Alice", "Bob"]): 85,
    frozenset(["Charlie", "Diana"]): 92,
    frozenset(["Eve", "Frank"]): 78
}

print(f"\nTeam scores: {team_scores}")

# Accessing team score
team_key = frozenset(["Alice", "Bob"])
print(f"Score for Alice & Bob: {team_scores[team_key]}")

In [None]:
# Frozensets can be elements in sets
set_of_sets = {
    frozenset([1, 2, 3]),
    frozenset([4, 5, 6]),
    frozenset([1, 2, 3]),  # duplicate, will be ignored
    frozenset([7, 8, 9])
}

print(f"Set of frozensets: {set_of_sets}")
print(f"Number of unique frozensets: {len(set_of_sets)}")

# This would cause an error with regular sets
try:
    invalid = {{1, 2, 3}, {4, 5, 6}}  # regular sets are not hashable
except TypeError as e:
    print(f"\nError with regular sets in set: {e}")

# Frozensets cannot be modified
try:
    frozen_colors.add("yellow")  # This will raise an error
except AttributeError as e:
    print(f"Error modifying frozenset: {e}")
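
Because frozensets are hashable and ignore element order, they can represent unordered pairs. The sketch below stores undirected graph edges this way; the `edges` data is hypothetical and only illustrates the idea.

```python
# Each edge is an unordered pair, so ("A", "B") and ("B", "A") are the same edge
edges = {
    frozenset(("A", "B")),
    frozenset(("B", "A")),   # duplicate of the first edge
    frozenset(("B", "C")),
}
print(f"Number of distinct edges: {len(edges)}")

# Lookup works regardless of the order the endpoints are given in
has_edge = frozenset(("C", "B")) in edges
print(f"Edge C-B present: {has_edge}")

# A frozenset compares equal to a regular set with the same members
same_members = frozenset([1, 2, 3]) == {3, 2, 1}
print(f"frozenset == set with same members: {same_members}")
```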

## Practical Examples and Use Cases

In [None]:
# Example 1: Simple Text Analysis
# Finding common and unique words between documents
doc1_words = set("Python is great for data science".lower().split())
doc2_words = set("Python is good for web development".lower().split())

print(f"Document 1 words: {doc1_words}")
print(f"Document 2 words: {doc2_words}")
print(f"Common words: {doc1_words & doc2_words}")
print(f"Unique to doc1: {doc1_words - doc2_words}")
print(f"Unique to doc2: {doc2_words - doc1_words}")
print(f"All unique words: {doc1_words | doc2_words}")

In [None]:
# Example 2: Finding Mutual Friends
alice_friends = {"Bob", "Charlie", "Diana"}
bob_friends = {"Alice", "Charlie", "Eve"}
charlie_friends = {"Alice", "Bob", "Diana", "Frank"}

print(f"Alice's friends: {alice_friends}")
print(f"Bob's friends: {bob_friends}")
print(f"Charlie's friends: {charlie_friends}")

# Find mutual friends
alice_bob_mutual = alice_friends & bob_friends
alice_charlie_mutual = alice_friends & charlie_friends
bob_charlie_mutual = bob_friends & charlie_friends

print(f"\nAlice & Bob mutual friends: {alice_bob_mutual}")
print(f"Alice & Charlie mutual friends: {alice_charlie_mutual}")
print(f"Bob & Charlie mutual friends: {bob_charlie_mutual}")

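# Who knows everyone?
all_people = {"Alice", "Bob", "Charlie", "Diana", "Eve", "Frank"}
alice_knows = alice_friends | {"Alice"}
bob_knows = bob_friends | {"Bob"}
charlie_knows = charlie_friends | {"Charlie"}

print(f"\nAlice knows: {len(alice_knows)} people")
print(f"Bob knows: {len(bob_knows)} people")
print(f"Charlie knows: {len(charlie_knows)} people")

# Someone knows everyone only if their circle covers all_people
for name, knows in [("Alice", alice_knows), ("Bob", bob_knows), ("Charlie", charlie_knows)]:
    print(f"{name} knows everyone: {knows >= all_people} (missing: {all_people - knows})")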

## Performance Considerations

A set is backed by a hash table, so a membership test takes roughly constant time on average. A list has to be scanned element by element, so the same test takes time proportional to the length of the list.

In [None]:
# Performance comparison: sets vs lists for membership testing
import time
import random

# Create test data
size = 5000000
large_list = list(range(size))
large_set = set(range(size))
search_value = random.randint(0, size-1)

# Time list search (perf_counter is a high-resolution clock meant for timing)
start = time.perf_counter()
result1 = search_value in large_list
list_time = time.perf_counter() - start

# Time set search
start = time.perf_counter()
result2 = search_value in large_set
set_time = time.perf_counter() - start

print(f"Searching for {search_value} in {size:,} elements:")
print(f"List search: {list_time:.6f} seconds")
print(f"Set search: {set_time:.6f} seconds")

if set_time > 0:
    speedup = list_time / set_time
    print(f"In this run, the set lookup was {speedup:.1f}x faster than the list lookup")
else:
    print("Set search was too fast to measure - essentially instant!")
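
A single timed lookup is noisy, and its result depends on where the random value happens to sit in the list. As a rough alternative, the sketch below uses the standard-library `timeit` module to repeat each lookup and always searches for the last element, which is the worst case for the list. The sizes and repeat count are arbitrary choices for illustration, and absolute timings will vary by machine.

```python
import timeit

size = 100_000
data_list = list(range(size))
data_set = set(data_list)
target = size - 1            # last element: worst case for a list scan

repeats = 50
list_time = timeit.timeit(lambda: target in data_list, number=repeats)
set_time = timeit.timeit(lambda: target in data_set, number=repeats)

print(f"List: {list_time / repeats:.8f} seconds per lookup")
print(f"Set:  {set_time / repeats:.8f} seconds per lookup")
```

Averaging over repeated runs gives a steadier comparison than one measurement, though it still reflects only this data size and access pattern.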