# 🐍 Python set Data Type — Complete Guide

- Unordered collection of unique elements.
- Mutable (you can add/remove items).
- No duplicate values allowed.
- Elements must be hashable (numbers, strings, frozensets, tuples that contain only hashable items). Lists, dicts, and sets themselves cannot be elements of a set.
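
The hashability rule can be checked directly. The following is a minimal sketch; the variable names `valid` and `nested` are illustrative only.

```python
# Hashable values are accepted as set elements.
valid = {42, "text", (1, 2), frozenset({3, 4})}
print(len(valid))            # 4

# A tuple is hashable only if everything inside it is hashable.
try:
    {(1, [2, 3])}
except TypeError as e:
    print("Error:", e)       # a TypeError caused by the inner list

# A regular set is mutable and cannot be nested, but a frozenset can.
nested = {frozenset({1, 2}), frozenset({2, 1})}
print(len(nested))           # 1, because the two frozensets are equal
```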

#### Creating sets

In [4]:
# --- Creating an Empty Set ---

s1 = set()           # ✅ Correct way to create an empty set
print(type(s1))      # <class 'set'>

# s2 = {}            # ❌ This creates an empty dictionary, NOT a set
print(type({}))      # <class 'dict'>

# --- Creating a Set with Values ---

s2 = {1, 2, 3, 3}   # Duplicates are automatically removed in sets
print(s2)            # Output: {1, 2, 3}

# Sets are unordered collections of unique elements.

# --- Creating Sets from Iterables ---

s3 = set([1, 2, 2, 3])    # From a list: duplicates removed, order not guaranteed
print(s3)                 # Output: {1, 2, 3}

s4 = set("banana")        # From a string: characters become elements, duplicates removed
print(s4)                 # Output: {'b', 'a', 'n'} (order may vary)

# Note: Sets never preserve insertion order, in any Python version. The insertion-order guarantee introduced in Python 3.7 applies to dicts, not sets, so never rely on a set's iteration order.


<class 'set'>
<class 'dict'>
{1, 2, 3}
{1, 2, 3}
{'a', 'b', 'n'}
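
Besides literals and `set(iterable)`, a set can also be built with a set comprehension. A small sketch, assuming a hypothetical list `words` that is not part of the original notebook:

```python
words = ["apple", "Avocado", "banana", "Blueberry", "cherry"]

# Set comprehension: collect the distinct lowercase first letters.
initials = {w[0].lower() for w in words}

# sorted() is used only to get a stable display order.
print(sorted(initials))   # ['a', 'b', 'c']
```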


#### Properties of sets
- Unordered → no indexing or slicing.
- Unique elements → duplicates auto-dropped.
- Mutable → add/remove items.
- Unhashable elements not allowed → you can’t put lists/dicts inside a set.

In [27]:
s = {1, 2, 3}

# Sets are unordered collections and do NOT support indexing or slicing
# print(s[0])          # ❌ TypeError: 'set' object is not subscriptable

# You cannot access elements of a set by index because sets don't have order
# To access elements, you can iterate over the set instead:
for element in s:
    print(element)     # Prints each element, order not guaranteed

# --- Trying to create a set with an unhashable element ---

try:
    bad = {1, [2, 3]}  # ❌ TypeError because lists are unhashable (mutable)
except TypeError as e:
    print("Error:", e)  # Output: Error: unhashable type: 'list'

1
2
3
Error: unhashable type: 'list'
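
The "mutable" property listed above is not exercised in the cell, so here is a short sketch of the standard mutating methods (`add`, `update`, `remove`, `discard`). The set name `s` and the values are arbitrary.

```python
s = {1, 2, 3}

s.add(4)            # insert one element
s.add(2)            # adding an existing element has no effect
s.update([5, 6])    # insert every element of an iterable

s.remove(1)         # remove an element; raises KeyError if it is missing
s.discard(99)       # remove if present; silently does nothing otherwise

try:
    s.remove(99)
except KeyError as e:
    print("KeyError:", e)   # KeyError: 99

print(sorted(s))    # [2, 3, 4, 5, 6]
```

Prefer `discard()` when a missing element is not an error, and `remove()` when it should be reported.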


#### Membership test

In [28]:
nums = {1, 2, 3}

# Check if an element is in the set
print(2 in nums)         # Output: True

# Check if an element is NOT in the set
print(5 not in nums)     # Output: True

# --- Why use sets for membership tests? ---
# Sets provide average O(1) time complexity for membership tests,
# which means they are much faster than lists, especially for large collections.

# Example: with a list, a membership test is O(n) because elements are scanned one by one
lst = [1, 2, 3]
print(2 in lst)          # True, but slower for large lists


True
True
True
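
The complexity claim can be observed with a rough timing sketch using the standard `timeit` module. The collection size and repeat count are arbitrary assumptions, and absolute timings will differ between machines.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)
target = n - 1   # worst case for the list: the last element

# Time 100 membership tests against each container.
list_time = timeit.timeit(lambda: target in as_list, number=100)
set_time = timeit.timeit(lambda: target in as_set, number=100)

print(f"list: {list_time:.6f} s")
print(f"set:  {set_time:.6f} s")
```

On typical hardware the set lookup should come out several orders of magnitude faster, since the list must be scanned to its end on every test.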


#### Set algebra (union, intersection, difference, symmetric difference)

In [43]:
a = {1, 2, 3}
b = {3, 4, 5}

# --- Union (all unique elements from both sets) ---
print(a | b)           # Output: {1, 2, 3, 4, 5}
print(a.union(b))      # Same as above

# Explanation:
# Union combines all elements in 'a' and 'b', but no duplicates.

# --- Intersection (elements common to both sets) ---
print(a & b)           # Output: {3}
print(a.intersection(b))  # Same as above

# Explanation:
# Intersection returns only elements present in both 'a' and 'b'.

# --- Difference (elements in one set but NOT in the other) ---
print(a - b)           # Output: {1, 2}  (in 'a' but not in 'b')
print(b - a)           # Output: {4, 5}  (in 'b' but not in 'a')
print(a.difference(b))  # Same as a - b

# Explanation:
# Difference gives the elements of the first set that are not in the second.

# --- Symmetric difference (elements in either set, but NOT both) ---
print(a ^ b)           # Output: {1, 2, 4, 5}
print(a.symmetric_difference(b))  # Same as above

# Explanation:
# Symmetric difference returns elements that are in either 'a' or 'b' but not in both.

{1, 2, 3, 4, 5}
{1, 2, 3, 4, 5}
{3}
{3}
{1, 2}
{4, 5}
{1, 2}
{1, 2, 4, 5}
{1, 2, 4, 5}
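
Operators and methods are not fully interchangeable: the methods accept any iterable, while the operators require both operands to be sets. The sketch below also shows the in-place variants, which mutate the left-hand set. Values are illustrative.

```python
a = {1, 2, 3}

# Methods accept any iterable as the argument.
print(sorted(a.union([3, 4, 5])))          # [1, 2, 3, 4, 5]
print(sorted(a.intersection((2, 3, 9))))   # [2, 3]

# Operators require a set (or frozenset) on both sides.
try:
    a | [3, 4, 5]
except TypeError as e:
    print("Error:", e)

# In-place variants modify 'a' instead of building a new set.
a |= {4}     # same as a.update({4})
a -= {1}     # same as a.difference_update({1})
print(sorted(a))                           # [2, 3, 4]
```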


#### Subset, superset, disjoint checks

In [50]:
x = {1, 2}
y = {1, 2, 3, 4}

# --- Subset checks ---
print(x <= y)            # True — all elements of x are in y (x is a subset of y)
print(x.issubset(y))     # Same as above, more explicit method

# --- Superset checks ---
print(y >= x)            # True — y contains all elements of x (y is a superset of x)
print(y.issuperset(x))   # Same as above, more explicit method

# --- Disjoint sets ---
a = {1, 2}
b = {3, 4}
print(a.isdisjoint(b))   # True — no elements in common between a and b


True
True
True
True
True
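
`<=` and `>=` also return True for equal sets. For a strict (proper) subset or superset, use `<` and `>`. A brief sketch with the same `x` and `y` as above:

```python
x = {1, 2}
y = {1, 2, 3, 4}

print(x < y)        # True:  x is a proper subset of y
print(x <= x)       # True:  every set is a subset of itself
print(x < x)        # False: a set is never a proper subset of itself
print(y > x)        # True:  y is a proper superset of x
print(set() <= x)   # True:  the empty set is a subset of every set
```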


#### Iterating through sets

In [52]:
s = {"apple", "banana", "cherry"}

# Iterating over a set
for item in s:
    print(item)

# Output example (order may vary):
# banana
# apple
# cherry

apple
banana
cherry
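
When a predictable order is needed, iterate over `sorted(s)` instead of the set itself. `sorted()` returns a new list and leaves the set unchanged. A minimal sketch:

```python
s = {"apple", "banana", "cherry"}

# Alphabetical order, independent of the set's internal layout.
for item in sorted(s):
    print(item)

# Reverse alphabetical order.
descending = sorted(s, reverse=True)
print(descending)   # ['cherry', 'banana', 'apple']
```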


#### Conversion (list, tuple, string ↔ set)

In [53]:
lst = [1, 2, 2, 3, 4]

# Convert list to set to remove duplicates (deduplication)
unique = set(lst)        # {1, 2, 3, 4}
print(unique)

# Convert back to list (order may change since sets are unordered)
back_to_list = list(unique)
print(back_to_list)

# --- Working with strings and sets ---

s = set("hello")         # Convert string to set of unique characters {'h', 'e', 'l', 'o'}
print(s)

# Convert set back to string by joining sorted characters
back_to_str = "".join(sorted(s))  # 'ehlo'
print(back_to_str)


{1, 2, 3, 4}
[1, 2, 3, 4]
{'h', 'o', 'e', 'l'}
ehlo


#### Advanced use cases

In [55]:
# 1. Deduplicate a list while preserving the original order
def dedupe(lst):
    seen = set()   # To track items we've already encountered
    out = []       # Output list with unique items in order
    for x in lst:
        if x not in seen:
            seen.add(x)   # Mark this item as seen
            out.append(x) # Append it to output only once
    return out

print(dedupe([1, 2, 2, 3, 1, 4]))  # Output: [1, 2, 3, 4]
# Explanation: Unlike using set(), this function keeps the first occurrence order.

# 2. Find common users between two datasets using set intersection
users1 = {"dhiraj", "pooja", "alex"}
users2 = {"alex", "maria"}

print(users1 & users2)  # Output: {'alex'}
# Explanation: '&' returns elements common to both sets.

# 3. Detect missing items by set difference
all_ids = {1, 2, 3, 4, 5}  # Complete set of IDs expected
present = {2, 3, 5}        # IDs actually present

missing = all_ids - present  # IDs missing from present
print(missing)               # Output: {1, 4}
# Explanation: '-' returns items in first set but not in second.


[1, 2, 3, 4]
{'alex'}
{1, 4}
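
As an alternative to the `dedupe` function above, order-preserving deduplication can also be sketched with `dict.fromkeys`, which relies on dicts keeping insertion order (guaranteed since Python 3.7). Like the set-based version, it assumes the items are hashable. The name `unique_in_order` is illustrative.

```python
items = [1, 2, 2, 3, 1, 4]

# Dict keys are unique and keep first-insertion order.
unique_in_order = list(dict.fromkeys(items))
print(unique_in_order)   # [1, 2, 3, 4]
```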


#### Pitfalls

In [56]:
# 1. Empty set literal
s = {}            # ❌ This creates an empty dictionary, NOT a set!
s = set()         # ✅ Correct way to create an empty set

# 2. Unhashable elements cannot be added to a set
try:
    bad = {[1, 2], 3}   # ❌ Raises TypeError because list is unhashable (mutable)
except TypeError as e:
    print("Error:", e)  # Output: Error: unhashable type: 'list'

# 3. pop() removes an arbitrary element
s = {1, 2, 3}
print(s.pop())    # Removes and returns an arbitrary element (do not rely on which one)
print(s)          # Remaining elements in the set


Error: unhashable type: 'list'
1
{2, 3}
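
A related pitfall: calling `pop()` on an empty set raises `KeyError`. The sketch below shows the error and one possible guard, a loop that drains the set only while it is non-empty. The names `empty`, `pending`, and `drained` are illustrative.

```python
empty = set()
try:
    empty.pop()
except KeyError as e:
    print("KeyError:", e)   # KeyError: 'pop from an empty set'

# Guard: an empty set is falsy, so the loop stops before pop() can fail.
pending = {"a", "b", "c"}
drained = []
while pending:
    drained.append(pending.pop())

print(sorted(drained))      # ['a', 'b', 'c']
print(pending)              # set()
```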
