# Topic 07: Sets - Unique Collections

## Overview
Sets are unordered collections of unique elements. They are well suited to eliminating duplicates, fast membership tests, and mathematical set operations.

### What You'll Learn:
- Set creation and characteristics
- Set operations (union, intersection, difference)
- Set methods and membership testing
- Set comprehensions
- Frozen sets for immutable collections
- Practical applications

---

## 1. Creating Sets

Various ways to create and initialize sets:

In [None]:
# Creating sets
print("Set Creation:")
print("=" * 12)

# Empty set (note: {} creates empty dict, not set)
empty_set = set()
print(f"Empty set: {empty_set} (type: {type(empty_set)})")

# Set with initial values
numbers_set = {1, 2, 3, 4, 5}
colors_set = {'red', 'green', 'blue'}
# Note: True == 1 and both hash alike, so the set keeps only the first one (1)
mixed_set = {1, 'hello', 3.14, True}

print(f"Numbers set: {numbers_set}")
print(f"Colors set: {colors_set}")
print(f"Mixed set: {mixed_set}")

# Creating sets from other iterables
list_to_set = set([1, 2, 2, 3, 3, 4])  # Duplicates removed
string_to_set = set('hello')           # Unique characters
tuple_to_set = set((1, 2, 3, 2, 1))   # From tuple

print(f"\nFrom list with duplicates: {list_to_set}")
print(f"From string 'hello': {string_to_set}")
print(f"From tuple: {tuple_to_set}")

# Set comprehension
squares_set = {x**2 for x in range(1, 6)}
even_squares = {x**2 for x in range(1, 11) if x % 2 == 0}

print(f"\nSet comprehensions:")
print(f"Squares: {squares_set}")
print(f"Even squares: {even_squares}")

# Demonstrate uniqueness
duplicate_list = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
unique_set = set(duplicate_list)
print(f"\nRemoving duplicates:")
print(f"Original list: {duplicate_list}")
print(f"Unique set: {unique_set}")
print(f"Back to list: {list(unique_set)}")
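
The creation rules above have a few edge cases that are easy to trip over. The following is a minimal sketch (variable names are illustrative, not from the notebook) showing that values which compare equal collapse into one element, that `{}` is a dict, and that `dict.fromkeys` is one way to deduplicate while keeping first-seen order:

```python
# Values that compare equal and share a hash count as one element.
collapsed = {1, 1.0, True}
print(len(collapsed))          # 1

# {} is an empty dict; set() is the empty set.
print(type({}).__name__)       # dict
print(type(set()).__name__)    # set

# dict.fromkeys keeps first-seen order while dropping duplicates
# (dicts preserve insertion order in Python 3.7+).
ordered_unique = list(dict.fromkeys([3, 1, 3, 2, 1]))
print(ordered_unique)          # [3, 1, 2]
```

Converting through `set()` alone makes no promise about the order of the resulting list, which is why the `dict.fromkeys` form is often preferred when order matters.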

## 2. Set Characteristics and Properties

Understanding set behavior:

In [None]:
# Set characteristics
print("Set Characteristics:")
print("=" * 18)

# Unordered - no guaranteed order
my_set = {3, 1, 4, 1, 5, 9, 2, 6}
print(f"Set: {my_set} (order may vary)")
print(f"Same set again: {my_set}")

# Unique elements only
print(f"\nUniqueness:")
original_list = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
unique_set = set(original_list)
print(f"Original: {original_list}")
print(f"Unique: {unique_set}")
print(f"Count reduced from {len(original_list)} to {len(unique_set)}")

# Mutable (can add/remove elements)
mutable_set = {1, 2, 3}
print(f"\nMutability:")
print(f"Original: {mutable_set}")
mutable_set.add(4)
print(f"After add(4): {mutable_set}")
mutable_set.remove(1)
print(f"After remove(1): {mutable_set}")

# Elements must be hashable
print(f"\nHashable elements only:")
hashable_set = {1, 'hello', 3.14, (1, 2), frozenset([3, 4])}
print(f"Valid set: {hashable_set}")

# This would cause an error (unhashable types):
# invalid_set = {[1, 2], {3: 4}}  # TypeError: unhashable type
print("Cannot add lists or dicts to sets (unhashable)")

# Set as a whole is not hashable (mutable)
try:
    hash({1, 2, 3})
except TypeError as e:
    print(f"Set not hashable: {e}")

## 3. Set Methods - Adding and Removing Elements

Methods to modify sets:

In [None]:
# Adding elements to sets
print("Adding Elements to Sets:")
print("=" * 25)

# Start with empty set
fruits = set()
print(f"Initial set: {fruits}")

# add() - add single element
fruits.add('apple')
fruits.add('banana')
fruits.add('cherry')
print(f"After adding fruits: {fruits}")

# Adding duplicate (no effect)
fruits.add('apple')
print(f"After adding duplicate 'apple': {fruits}")

# update() - add multiple elements
fruits.update(['date', 'elderberry'])
print(f"After update with list: {fruits}")

fruits.update({'fig', 'grape'})
print(f"After update with set: {fruits}")

# update() with multiple iterables
fruits.update(['kiwi'], ('lemon', 'mango'), 'no')  # 'no' -> 'n', 'o'
print(f"After multiple updates: {fruits}")

# |= operator (in-place union; unlike update(), the right side must be a set)
fruits |= {'orange', 'papaya'}
print(f"After |= operator: {fruits}")

In [None]:
# Removing elements from sets
print("Removing Elements from Sets:")
print("=" * 28)

# Create test set
numbers = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
print(f"Original set: {numbers}")

# remove() - removes element (raises KeyError if not found)
numbers.remove(5)
print(f"After remove(5): {numbers}")

# discard() - removes element (no error if not found)
numbers.discard(3)
print(f"After discard(3): {numbers}")

numbers.discard(100)  # No error even though 100 not in set
print(f"After discard(100): {numbers}")

# pop() - removes and returns arbitrary element
popped = numbers.pop()
print(f"Popped element: {popped}")
print(f"After pop(): {numbers}")

# clear() - removes all elements
test_set = {1, 2, 3}
print(f"\nBefore clear: {test_set}")
test_set.clear()
print(f"After clear: {test_set}")

# Difference between remove() and discard()
demo_set = {1, 2, 3}
print(f"\nDemonstrating remove vs discard:")
print(f"Demo set: {demo_set}")

# This works
demo_set.discard(10)  # Element not in set, but no error
print(f"After discard(10): {demo_set}")

# remove() raises KeyError for a missing element:
try:
    demo_set.remove(10)
except KeyError:
    print("remove(10) raised KeyError - element not found")

## 4. Set Operations - Mathematical Set Theory

Python sets support mathematical set operations:

In [None]:
# Mathematical set operations
print("Mathematical Set Operations:")
print("=" * 28)

# Define sample sets
set_a = {1, 2, 3, 4, 5}
set_b = {4, 5, 6, 7, 8}
set_c = {1, 2, 3}

print(f"Set A: {set_a}")
print(f"Set B: {set_b}")
print(f"Set C: {set_c}")

# Union - elements in either set
print(f"\nUNION (elements in either set):")
union_method = set_a.union(set_b)
union_operator = set_a | set_b
print(f"A.union(B): {union_method}")
print(f"A | B: {union_operator}")
print(f"Results equal: {union_method == union_operator}")

# Intersection - elements in both sets
print(f"\nINTERSECTION (elements in both sets):")
intersection_method = set_a.intersection(set_b)
intersection_operator = set_a & set_b
print(f"A.intersection(B): {intersection_method}")
print(f"A & B: {intersection_operator}")

# Difference - elements in first set but not second
print(f"\nDIFFERENCE (in A but not B):")
difference_method = set_a.difference(set_b)
difference_operator = set_a - set_b
print(f"A.difference(B): {difference_method}")
print(f"A - B: {difference_operator}")

print(f"\nDIFFERENCE (in B but not A):")
print(f"B - A: {set_b - set_a}")

# Symmetric difference - elements in either set, but not both
print(f"\nSYMMETRIC DIFFERENCE (in either, but not both):")
sym_diff_method = set_a.symmetric_difference(set_b)
sym_diff_operator = set_a ^ set_b
print(f"A.symmetric_difference(B): {sym_diff_method}")
print(f"A ^ B: {sym_diff_operator}")

In [None]:
# Set relationships and comparisons
print("Set Relationships:")
print("=" * 17)

# Define test sets
universal = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
subset = {2, 4, 6}
superset = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}
disjoint_set = {11, 12, 13}
overlapping = {5, 6, 7, 8, 9}

print(f"Universal: {universal}")
print(f"Subset: {subset}")
print(f"Superset: {superset}")
print(f"Disjoint: {disjoint_set}")
print(f"Overlapping: {overlapping}")

# Subset relationships
print(f"\nSubset relationships:")
print(f"subset <= universal: {subset <= universal} (is subset)")
print(f"subset < universal: {subset < universal} (is proper subset)")
print(f"subset.issubset(universal): {subset.issubset(universal)}")

# Superset relationships
print(f"\nSuperset relationships:")
print(f"superset >= universal: {superset >= universal} (is superset)")
print(f"superset > universal: {superset > universal} (is proper superset)")
print(f"superset.issuperset(universal): {superset.issuperset(universal)}")

# Disjoint sets (no common elements)
print(f"\nDisjoint sets:")
print(f"universal.isdisjoint(disjoint_set): {universal.isdisjoint(disjoint_set)}")
print(f"universal.isdisjoint(overlapping): {universal.isdisjoint(overlapping)}")

# Set equality
set1 = {1, 2, 3}
set2 = {3, 2, 1}  # Order doesn't matter
set3 = {1, 2, 3, 3, 2, 1}  # Duplicates ignored
print(f"\nSet equality:")
print(f"{set1} == {set2}: {set1 == set2}")
print(f"{set1} == {set3}: {set1 == set3}")

## 5. Set Comprehensions and Advanced Operations

Creating sets with comprehensions and complex operations:

In [None]:
# Set comprehensions
print("Set Comprehensions:")
print("=" * 18)

# Basic set comprehension
squares = {x**2 for x in range(1, 6)}
print(f"Squares: {squares}")

# Set comprehension with condition
even_squares = {x**2 for x in range(1, 11) if x % 2 == 0}
print(f"Even squares: {even_squares}")

# String operations
words = ['hello', 'world', 'python', 'programming']
first_letters = {word[0] for word in words}
print(f"First letters: {first_letters}")

# Unique word lengths
word_lengths = {len(word) for word in words}
print(f"Word lengths: {word_lengths}")

# Set comprehension with multiple conditions
numbers = range(1, 21)
special_numbers = {n for n in numbers if n % 3 == 0 or n % 5 == 0}
print(f"Divisible by 3 or 5: {special_numbers}")

# Nested comprehension
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
all_elements = {element for row in matrix for element in row}
print(f"All matrix elements: {all_elements}")

# Set comprehension from dictionary
grades = {'Alice': 85, 'Bob': 92, 'Charlie': 78, 'Diana': 96}
high_performers = {name for name, grade in grades.items() if grade >= 90}
print(f"High performers (≥90): {high_performers}")

In [None]:
# Advanced set operations
print("Advanced Set Operations:")
print("=" * 24)

# Working with multiple sets
sets = [
    {1, 2, 3, 4},
    {3, 4, 5, 6},
    {5, 6, 7, 8},
    {1, 3, 5, 7}
]

print(f"Multiple sets: {sets}")

# Union of all sets
all_union = set()
for s in sets:
    all_union |= s
print(f"Union of all: {all_union}")

# Alternative using union with *args
union_alt = set().union(*sets)
print(f"Union alternative: {union_alt}")

# Intersection of all sets
all_intersection = sets[0].copy()  # copy first: &= would otherwise mutate sets[0]
for s in sets[1:]:
    all_intersection &= s
print(f"Intersection of all: {all_intersection}")

# Alternative using intersection with *args
intersection_alt = set.intersection(*sets)
print(f"Intersection alternative: {intersection_alt}")

# Find elements that appear in exactly 2 sets
element_counts = {}
for s in sets:
    for element in s:
        element_counts[element] = element_counts.get(element, 0) + 1

exactly_two = {elem for elem, count in element_counts.items() if count == 2}
print(f"Elements in exactly 2 sets: {exactly_two}")

# Elements unique to each set
print(f"\nUnique elements in each set:")
for i, current_set in enumerate(sets):
    others = set()
    for j, other_set in enumerate(sets):
        if i != j:
            others |= other_set
    unique = current_set - others
    print(f"  Set {i}: {current_set} -> Unique: {unique}")

## 6. Frozen Sets - Immutable Sets

Immutable version of sets:

In [None]:
# Frozen sets
print("Frozen Sets:")
print("=" * 12)

# Creating frozen sets
frozen1 = frozenset([1, 2, 3, 4, 5])
frozen2 = frozenset({4, 5, 6, 7, 8})
frozen3 = frozenset('hello')  # From string

print(f"Frozen set 1: {frozen1}")
print(f"Frozen set 2: {frozen2}")
print(f"Frozen set 3: {frozen3}")

# Frozen sets are hashable
print(f"\nHashability:")
print(f"Hash of frozen1: {hash(frozen1)}")
print(f"Hash of frozen2: {hash(frozen2)}")

# Can be used as dictionary keys
frozen_dict = {
    frozenset([1, 2]): 'pair one',
    frozenset([3, 4]): 'pair two',
    frozenset([5, 6, 7]): 'triplet'
}
print(f"\nDictionary with frozenset keys: {frozen_dict}")

# Can be elements of regular sets
set_of_frozensets = {
    frozenset([1, 2]),
    frozenset([3, 4]),
    frozenset([1, 2])  # Duplicate, will be ignored
}
print(f"Set of frozensets: {set_of_frozensets}")

# Frozen sets support the same non-mutating operations as regular sets
print(f"\nFrozenset operations:")
print(f"Union: {frozen1 | frozen2}")
print(f"Intersection: {frozen1 & frozen2}")
print(f"Difference: {frozen1 - frozen2}")
print(f"Symmetric difference: {frozen1 ^ frozen2}")

# But cannot be modified
print(f"\nImmutability:")
try:
    frozen1.add(6)
except AttributeError as e:
    print(f"Cannot modify frozenset: {e}")

# Convert between set and frozenset
regular_set = {1, 2, 3}
to_frozen = frozenset(regular_set)
back_to_set = set(to_frozen)

print(f"\nConversions:")
print(f"Regular set: {regular_set}")
print(f"To frozenset: {to_frozen}")
print(f"Back to set: {back_to_set}")

## 7. Practical Applications of Sets

Real-world use cases for sets:

In [None]:
# Practical applications
print("Practical Set Applications:")
print("=" * 27)

# 1. Remove duplicates from a list while preserving order
def remove_duplicates_ordered(lst):
    """Remove duplicates while preserving order"""
    seen = set()
    result = []
    for item in lst:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

original_list = [1, 2, 3, 2, 4, 1, 5, 3, 6]
unique_list = remove_duplicates_ordered(original_list)
print(f"\n1. Remove duplicates:")
print(f"Original: {original_list}")
print(f"Unique (ordered): {unique_list}")
print(f"Unique (set): {list(set(original_list))}")

# 2. Find common elements between multiple lists
list1 = [1, 2, 3, 4, 5]
list2 = [4, 5, 6, 7, 8]
list3 = [3, 4, 5, 9, 10]

common_all = set(list1) & set(list2) & set(list3)
common_any_two = (set(list1) & set(list2)) | (set(list1) & set(list3)) | (set(list2) & set(list3))

print(f"\n2. Find common elements:")
print(f"List 1: {list1}")
print(f"List 2: {list2}")
print(f"List 3: {list3}")
print(f"Common in all three: {common_all}")
print(f"Common in any two: {common_any_two}")

# 3. Fast membership testing
def is_valid_user(user_id, valid_users_set):
    """Membership test; O(1) on average when valid_users_set is a set"""
    return user_id in valid_users_set

valid_users = set(range(1000, 2000))  # 1000 valid user IDs
test_users = [1500, 2500, 1750, 3000]

print(f"\n3. Fast membership testing:")
for user in test_users:
    valid = is_valid_user(user, valid_users)
    print(f"User {user}: {'Valid' if valid else 'Invalid'}")

# 4. Tag analysis
articles = [
    {'title': 'Python Basics', 'tags': {'python', 'programming', 'beginner'}},
    {'title': 'Advanced Python', 'tags': {'python', 'programming', 'advanced'}},
    {'title': 'Web Development', 'tags': {'web', 'html', 'css', 'javascript'}},
    {'title': 'Data Science', 'tags': {'python', 'data', 'science', 'pandas'}},
    {'title': 'Machine Learning', 'tags': {'python', 'ml', 'ai', 'science'}}
]

# Find all unique tags
all_tags = set()
for article in articles:
    all_tags |= article['tags']

# Find articles with Python
python_articles = [article for article in articles if 'python' in article['tags']]

# Find most common tags
tag_counts = {}
for article in articles:
    for tag in article['tags']:
        tag_counts[tag] = tag_counts.get(tag, 0) + 1

print(f"\n4. Tag analysis:")
print(f"All unique tags: {all_tags}")
print(f"Python articles: {len(python_articles)}")
print(f"Tag frequencies: {sorted(tag_counts.items(), key=lambda x: x[1], reverse=True)}")

In [None]:
# More practical examples
print("More Practical Examples:")
print("=" * 24)

# 5. Permission system
class User:
    def __init__(self, name, permissions):
        self.name = name
        self.permissions = set(permissions)
    
    def has_permission(self, permission):
        return permission in self.permissions
    
    def add_permission(self, permission):
        self.permissions.add(permission)
    
    def remove_permission(self, permission):
        self.permissions.discard(permission)
    
    def has_all_permissions(self, required_permissions):
        return set(required_permissions).issubset(self.permissions)

# Create users
admin = User('Admin', ['read', 'write', 'delete', 'admin'])
editor = User('Editor', ['read', 'write'])
viewer = User('Viewer', ['read'])

# Test permissions
print(f"\n5. Permission system:")
users = [admin, editor, viewer]
for user in users:
    print(f"{user.name}: {user.permissions}")
    print(f"  Can delete: {user.has_permission('delete')}")
    print(f"  Can read & write: {user.has_all_permissions(['read', 'write'])}")

# 6. Data validation
def validate_email_domains(emails, allowed_domains):
    """Validate email domains"""
    allowed_set = set(allowed_domains)
    valid_emails = []
    invalid_emails = []
    
    for email in emails:
        if '@' in email:
            domain = email.rsplit('@', 1)[1]  # text after the last '@'
            if domain in allowed_set:
                valid_emails.append(email)
            else:
                invalid_emails.append(email)
        else:
            invalid_emails.append(email)
    
    return valid_emails, invalid_emails

test_emails = [
    'user@company.com',
    'admin@company.com', 
    'guest@external.com',
    'invalid-email',
    'user@allowed.org'
]

allowed_domains = {'company.com', 'allowed.org', 'partner.net'}
valid, invalid = validate_email_domains(test_emails, allowed_domains)

print(f"\n6. Email domain validation:")
print(f"Allowed domains: {allowed_domains}")
print(f"Valid emails: {valid}")
print(f"Invalid emails: {invalid}")

# 7. Graph algorithms - find connected components
def find_connected_components(edges):
    """Find connected components in an undirected graph"""
    # Build adjacency list
    graph = {}
    vertices = set()
    for u, v in edges:
        vertices.update([u, v])
        if u not in graph:
            graph[u] = set()
        if v not in graph:
            graph[v] = set()
        graph[u].add(v)
        graph[v].add(u)
    
    visited = set()
    components = []
    
    def dfs(vertex, component):
        visited.add(vertex)
        component.add(vertex)
        for neighbor in graph.get(vertex, set()):
            if neighbor not in visited:
                dfs(neighbor, component)
    
    for vertex in vertices:
        if vertex not in visited:
            component = set()
            dfs(vertex, component)
            components.append(component)
    
    return components

# Test graph
graph_edges = [(1, 2), (2, 3), (4, 5), (6, 7), (7, 8)]
components = find_connected_components(graph_edges)

print(f"\n7. Graph connected components:")
print(f"Edges: {graph_edges}")
print(f"Connected components: {components}")

## Summary

In this notebook, you learned about:

✅ **Set Creation**: Various ways to create sets and handle duplicates  
✅ **Set Characteristics**: Unordered, unique, mutable, hashable elements only  
✅ **Set Methods**: Adding, removing, and querying elements  
✅ **Set Operations**: Union, intersection, difference, symmetric difference  
✅ **Set Relationships**: Subset, superset, disjoint relationships  
✅ **Set Comprehensions**: Concise way to create sets  
✅ **Frozen Sets**: Immutable sets for special use cases  
✅ **Practical Applications**: Real-world problems solved with sets  

### Key Takeaways:
1. Sets automatically eliminate duplicates
2. O(1) average case for membership testing
3. Perfect for mathematical set operations
4. Elements must be hashable
5. Frozen sets can be used as dictionary keys
6. Great for fast lookups and deduplication

### Next Topic: 08_dictionaries.ipynb
Learn about key-value pairs and associative arrays.