# Sets, Hashing & Frozenset in Python

> **Author:** Muhammad Hammad ([GitHub](https://github.com/DevHammad0))

---

## 1. The Set Data Type

A **set** is a built-in Python data type that stores an unordered collection of **unique**, **hashable** elements. Unlike lists or tuples, a set silently drops duplicates.

**Key Properties:**
- Unordered (no guaranteed order)
- Unindexed (no positions)
- Mutable (can add or remove items)
- Elements must be hashable, which in practice means immutable built-ins (e.g., numbers, strings, tuples of hashable items)

Other Python collections: `list`, `tuple`, `dict`.

In [2]:
my_set = {1, 2, 3, 4, [1,2,3]}
print(my_set)

TypeError: unhashable type: 'list'

In [3]:
# Creating sets
my_set = {123, 452, 5, 6}
my_set2 = set([123, 452, 5, 6])
unknown = {}           # This creates an empty dict, not a set!
empty_set = set()      # Correct way to create an empty set

print("my_set:", my_set)
print("my_set2:", my_set2)
print("type(unknown):", type(unknown))
print("type(empty_set):", type(empty_set))
print("my_set == my_set2:", my_set == my_set2)

my_set: {123, 452, 5, 6}
my_set2: {123, 452, 5, 6}
type(unknown): <class 'dict'>
type(empty_set): <class 'set'>
my_set == my_set2: True


In [4]:
print("id of my_set",id(my_set))
print("id of my_set2",id(my_set2))

id of my_set 135546145367072
id of my_set2 135546145367296
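
The two `id` values differ even though the sets compare equal: `==` compares contents, while `id()` (and the `is` operator) identify the object itself. The sketch below illustrates the difference; the names `first`, `second`, and `alias` are made up for this example.

```python
# Two sets built separately from the same elements
first = {123, 452, 5, 6}
second = set([123, 452, 5, 6])

print(first == second)   # True: same elements
print(first is second)   # False: two distinct objects

# Assignment does not copy; it binds another name to the same object
alias = first
alias.add(7)
print(alias is first)    # True
print(7 in first)        # True: the change is visible through both names
print(7 in second)       # False: second is a separate object
```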


## 1.1 Sets Only Hold Immutable Objects

You can only store hashable elements in a set. Built-in immutable types such as numbers, strings, and tuples of hashable items qualify; lists, dictionaries, and other sets do not.

In [5]:
# Attempting to put a list in a set raises an error

my_set = {[123, 452, 5, 6]}
print(my_set)

TypeError: unhashable type: 'list'
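
A tuple is only usable as a set element if everything inside it is hashable too. The following is a small sketch of that rule; `is_hashable` is a hypothetical helper written for this example, not a built-in.

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


print(is_hashable((1, 2, 3)))        # True: a tuple of ints
print(is_hashable((1, [2, 3])))      # False: the tuple contains a list
print(is_hashable(frozenset({1})))   # True
print(is_hashable({1: "a"}))         # False: dicts are unhashable
```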

## 1.2 Sets Can Hold Multiple Data Types

As long as the elements are hashable, you can mix types!

In [6]:
multi_type_set = {7, 9.0, False, True, "Hello! World", (1, 5, 9, 'hi')}
print(multi_type_set)  # unordered

{False, True, (1, 5, 9, 'hi'), 7, 'Hello! World', 9.0}


## 1.3 Sets are Unordered

Order is not preserved and can change as elements are added/removed.

In [7]:
set2 = {'Java', 'Python', 'JavaScript', 'java'}
print(set2)

{'Java', 'Python', 'JavaScript', 'java'}


> Internally, Python stores set elements by their hash values. The order is not predictable or stable.
---

## 1.4 Sets Cannot Be Changed by Index

You cannot modify set items by index. You can only add or remove elements.

In [13]:
my_set = {1, 2, 3, 4, 5}
print(my_set)

my_set[0] = 10


{1, 2, 3, 4, 5}


TypeError: 'set' object does not support item assignment

### Add, Remove and Update Elements

In [14]:
print(my_set)

{1, 2, 3, 4, 5}


In [15]:
my_set.add(6)  # Add element
print("Added 6:", my_set)

Added 6: {1, 2, 3, 4, 5, 6}


In [16]:
my_set.remove(3)  # Remove element (raises KeyError if not present)
print("Removed 3:", my_set)

Removed 3: {1, 2, 4, 5, 6}


In [17]:
my_set.discard('A')  # Discard (no error if not present)
print("Discarded 'A':", my_set)

Discarded 'A': {1, 2, 4, 5, 6}


In [18]:
my_set.update([7, 8, 9, "Hello"])   # Add multiple items
print("Added multiple:", my_set)

Added multiple: {1, 2, 4, 5, 6, 7, 8, 9, 'Hello'}


In [19]:
my_set.difference_update({8, 9, "Hello"})   # Remove multiple elements at once
print("Removed multiple:", my_set)

Removed multiple: {1, 2, 4, 5, 6, 7}


### Set Union

Combine two sets using `union()` or the `|` operator.

In [20]:
set1 = {1, 2, 3, 5}
set2 = {1, 5, 6, 7}
print("Using union() method:", set1.union(set2))
print("Using | operator:", set1 | set2)

Using union() method: {1, 2, 3, 5, 6, 7}
Using | operator: {1, 2, 3, 5, 6, 7}


**Note:** Sets always store unique elements. Adding a duplicate does nothing.

In [21]:
my_set = {1, 2, 3, 4, 5, "Hello! World"}
my_set.add(2)
my_set.add("Hello! World")
print(my_set)

{1, 2, 3, 4, 5, 'Hello! World'}


### Remove vs Discard

- `remove(x)`: Removes x, raises KeyError if missing
- `discard(x)`: Removes x if present, does nothing if missing

In [23]:
my_set = {1, 2, 3}

# my_set.remove(4)    # Raises KeyError

my_set.discard(4)  # No error
print(my_set)

{1, 2, 3}


### Pop Method

`pop()` removes and returns an arbitrary element; you cannot choose which one. Calling it on an empty set raises `KeyError`.

In [24]:
my_set = {1, 2, 3}
print("Before pop:", my_set)
removed = my_set.pop()
print("Popped value:", removed)
print("After pop:", my_set)

Before pop: {1, 2, 3}
Popped value: 1
After pop: {2, 3}
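
Because `pop()` takes no argument, a common pattern is to drain a set in a loop until it is empty. A minimal sketch, with variable names chosen only for this example:

```python
remaining = {1, 2, 3}
popped = []

while remaining:                      # an empty set is falsy
    popped.append(remaining.pop())    # order of removal is arbitrary

print(sorted(popped))                 # [1, 2, 3]
print(remaining)                      # set()

try:
    remaining.pop()                   # nothing left to remove
except KeyError as exc:
    print("KeyError:", exc)
```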


### All Set Methods

In [25]:
[i for i in dir(my_set) if not i.startswith('__')]

['add',
 'clear',
 'copy',
 'difference',
 'difference_update',
 'discard',
 'intersection',
 'intersection_update',
 'isdisjoint',
 'issubset',
 'issuperset',
 'pop',
 'remove',
 'symmetric_difference',
 'symmetric_difference_update',
 'union',
 'update']

In [4]:
my_set = {1, 2, 3, 4, 5}
print(my_set)

{1, 2, 3, 4, 5}


In [5]:
dir(my_set)

['__and__',
 '__class__',
 '__class_getitem__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__iand__',
 '__init__',
 '__init_subclass__',
 '__ior__',
 '__isub__',
 '__iter__',
 '__ixor__',
 '__le__',
 '__len__',
 '__lt__',
 '__ne__',
 '__new__',
 '__or__',
 '__rand__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__ror__',
 '__rsub__',
 '__rxor__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__sub__',
 '__subclasshook__',
 '__xor__',
 'add',
 'clear',
 'copy',
 'difference',
 'difference_update',
 'discard',
 'intersection',
 'intersection_update',
 'isdisjoint',
 'issubset',
 'issuperset',
 'pop',
 'remove',
 'symmetric_difference',
 'symmetric_difference_update',
 'union',
 'update']

In [6]:
my_list = [1, 2, 3, 4, 5]
print(my_list)

[1, 2, 3, 4, 5]


In [7]:
dir(my_list)

['__add__',
 '__class__',
 '__class_getitem__',
 '__contains__',
 '__delattr__',
 '__delitem__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__iadd__',
 '__imul__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__reversed__',
 '__rmul__',
 '__setattr__',
 '__setitem__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'append',
 'clear',
 'copy',
 'count',
 'extend',
 'index',
 'insert',
 'pop',
 'remove',
 'reverse',
 'sort']

In [8]:
hash(my_list)

TypeError: unhashable type: 'list'

In [9]:
hash(my_set)

TypeError: unhashable type: 'set'

In [28]:
help(set.discard)

Help on method_descriptor:

discard(...) unbound builtins.set method
    Remove an element from a set if it is a member.
    
    Unlike set.remove(), the discard() method does not raise
    an exception when an element is missing from the set.



In [2]:
my_tuple = (1,2,3)

In [3]:
hash(my_tuple)

529344067295497451

In [10]:
x = 5
hash(x)

5

## 2. What is Hashing?

**Hashing** is a process in computer science where data (like a string or number) is converted into a fixed-size numerical value, called a **hash value** or **hash code**. This value is calculated using a hash function.

In Python, immutable built-in objects (strings, numbers, and tuples whose items are all hashable) have a hash value, available through `hash()`. Sets and dictionaries use these hash values to decide where each element goes in their internal hash table. This allows very fast lookup, insertion, and deletion.

### Example: Hash Values
Let's see the hash values for strings:

In [11]:
a = "Hello! World"
b = "Hello! World"
print("id(a):", id(a))
print("id(b):", id(b))
print("hash(a):", hash(a))
print("hash(b):", hash(b))

id(a): 133183602659312
id(b): 133184022566832
hash(a): -6007152737598741160
hash(b): -6007152737598741160


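- Notice how two equal strings have the same hash value, even though `a` and `b` are different objects here (different `id`s). The exact hash of a string changes between Python runs, because string hashing is randomized per process by default.
- Hashing enables **O(1)** average-time complexity for set and dict lookups, insertions, and deletions.
- Built-in mutable containers (`list`, `dict`, `set`) are not hashable, so they cannot be used as set elements or dict keys.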
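
The rule that equal objects share a hash also applies across types. As a short illustration, `1`, `1.0`, and `True` compare equal, so a set treats them as one element:

```python
# Equal values must have equal hashes, even across numeric types
print(1 == 1.0 == True)                      # True
print(hash(1) == hash(1.0) == hash(True))    # True

mixed = {1, 1.0, True}
print(len(mixed))                            # 1
print(mixed)                                 # {1}: the first inserted value is kept
```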

Trying to use a set (which is mutable) as a dictionary key will fail:

In [12]:
y_set = {1, 2, 3}
my_dict = {y_set: "Hello!"}  # Error: set is not hashable!

TypeError: unhashable type: 'set'

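## 3. The Inner Working of Sets (and Why Set Order Changes)

Sets use hash tables internally. Each element's slot is derived from its hash value, so iteration order reflects the table layout rather than insertion order. When the table is resized as the set grows, the elements are re-inserted into the new table (often called **rehashing**), which is why the displayed order can change after adding or removing elements. The exact order is an implementation detail and should never be relied on.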

In [None]:
my_set = {10, 3, 5, 8}
print("Initial:", my_set)
my_set.add(20)
print("After adding 20:", my_set)
my_set.remove(10)
print("After removing 10:", my_set)
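
To make the idea concrete, here is a deliberately simplified hash set. It is a sketch only: `ToyHashSet` is a hypothetical class that uses lists as buckets (separate chaining) and doubles the bucket count when it fills up. CPython's real `set` is implemented differently (open addressing in C), but the effect on iteration order is similar in spirit.

```python
class ToyHashSet:
    """Simplified hash set for illustration; not how CPython implements set."""

    def __init__(self, n_buckets=8):
        self._buckets = [[] for _ in range(n_buckets)]
        self._size = 0

    def _bucket_for(self, item):
        # The hash value decides which bucket the item lives in
        return self._buckets[hash(item) % len(self._buckets)]

    def add(self, item):
        bucket = self._bucket_for(item)
        if item not in bucket:            # keep elements unique
            bucket.append(item)
            self._size += 1
            if self._size > len(self._buckets):
                self._resize()

    def _resize(self):
        # "Rehashing": double the table and re-insert every element
        old_items = list(self)
        self._buckets = [[] for _ in range(2 * len(self._buckets))]
        for item in old_items:
            self._bucket_for(item).append(item)

    def __contains__(self, item):
        return item in self._bucket_for(item)

    def __iter__(self):
        for bucket in self._buckets:
            yield from bucket

    def __len__(self):
        return self._size


toy = ToyHashSet(n_buckets=4)
for number in [10, 3, 5, 8]:
    toy.add(number)
before = list(toy)
print("Before resize:", before)   # [8, 5, 10, 3]

toy.add(20)                       # fifth element triggers a resize to 8 buckets
after = list(toy)
print("After resize:", after)     # [8, 10, 3, 20, 5]
```

Note that `5` moved from second to last position without being touched: only the table size changed, which changed `hash(item) % len(buckets)` for some elements.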

## 4. Frozenset

A **frozenset** is an immutable version of a set:
- Cannot add or remove elements
- Hashable (can be used as dict keys or set elements)
- Created with `frozenset(iterable)`, e.g. `frozenset([1, 2, 3])`; calling `frozenset()` with no argument gives an empty frozenset

In [14]:
fset = frozenset([1, 2, 3, 4, "hello"])
print(fset)
# Uncommenting the next line raises AttributeError (frozenset has no add method):
# fset.add(5)

frozenset({1, 2, 3, 4, 'hello'})


### Frozenset as Dictionary Key

In [15]:
fset = frozenset([1, 2, 3])
my_dict = {fset: "Value"}
print(my_dict)

{frozenset({1, 2, 3}): 'Value'}


In [16]:
[i for i in dir(fset) if not i.startswith('__')]

['copy',
 'difference',
 'intersection',
 'isdisjoint',
 'issubset',
 'issuperset',
 'symmetric_difference',
 'union']
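
The methods listed above never modify a frozenset; they return a new one. Because frozensets are hashable, they can also be elements of another set, which a plain `set` cannot. A brief sketch, with names invented for this example:

```python
a = frozenset([1, 2, 3])
b = frozenset([3, 4])

# Operators and methods return a new frozenset; the originals are untouched
combined = a | b
print(combined == frozenset({1, 2, 3, 4}))   # True
print(type(combined).__name__)               # frozenset
print(a == frozenset({1, 2, 3}))             # True: a is unchanged

# Frozensets can be set elements; equal frozensets count once
groups = {a, b, frozenset([3, 2, 1])}
print(len(groups))                           # 2

# Mutating methods simply do not exist on frozenset
try:
    a.add(4)
except AttributeError as exc:
    print("AttributeError:", exc)
```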

## 5. Sets vs Frozensets: Comparison Table

| Feature         | Set           | Frozenset      |
| --------------- | ------------- | -------------- |
| Mutable         | Yes           | No             |
| Hashable        | No            | Yes            |
| Can be dict key | No            | Yes            |
| Can be changed  | Yes           | No             |
| Syntax          | set(), {1, 2} | frozenset()    |

## 6. Garbage Collection (GC) in Python (Advanced)

Python manages memory automatically. In CPython, the standard interpreter, this is done mainly by **reference counting**, backed by a cyclic **garbage collector** for objects that refer to each other. When objects like sets or frozensets are no longer referenced, Python frees their memory for you.

### How Does It Work?
- When you create an object, like a set `{1, 2, 3}`, it’s stored in memory.
- Python keeps track of whether an object is still in use by counting the references to it (for example, variables or containers that point to it).
- If an object has no references left (nothing points to it anymore), it’s like trash sitting around: its memory is reclaimed. Objects that only reference each other in a cycle are found and removed by the garbage collector.
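
One way to observe this is with a weak reference, which points at an object without keeping it alive. The sketch below assumes CPython; `TrackedSet` and `Node` are hypothetical classes defined only for the demonstration, and `gc.collect()` is called explicitly so the result does not depend on when the collector happens to run.

```python
import gc
import weakref


class TrackedSet(set):
    """Plain subclass of set, used here so a weak reference can be attached."""


data = TrackedSet({1, 2, 3})
watcher = weakref.ref(data)       # a weak reference does not keep data alive
print(watcher() is data)          # True: the object still exists

del data                          # drop the only strong reference
gc.collect()
print(watcher() is None)          # True: the object has been reclaimed


class Node:
    """Tiny object that can point at another Node, to build a cycle."""

    def __init__(self):
        self.partner = None


# Two objects that reference each other form a cycle
left, right = Node(), Node()
left.partner, right.partner = right, left
cycle_watcher = weakref.ref(left)

del left, right                   # the cycle is now unreachable
gc.collect()                      # the cyclic collector finds and frees it
print(cycle_watcher() is None)    # True
```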

## 7. Summary

- Sets are unordered, mutable collections of unique, hashable elements.
- Frozensets are immutable and can be used as dict keys.
- Hashing enables efficient set operations.
- Use sets for changeable collections, frozensets for fixed, hashable collections.
- Python cleans up unused objects automatically via garbage collection.

## 📝 Assignment: Explore Remaining Set Methods in Python

In this assignment, you are required to explore and demonstrate the **Python set methods** that we did **not cover in class**.

### ✅ Instructions:

1. Below is the list of the **remaining set methods** you need to explore.
2. You must **use each of the following methods one by one**:
   - `clear`
   - `copy`
   - `difference`
   - `intersection`
   - `intersection_update`
   - `isdisjoint`
   - `issubset`
   - `issuperset`
   - `symmetric_difference`
   - `symmetric_difference_update`
3. For **each method**, do the following:
   - Write a **one-line explanation** in a **Markdown/text cell**.
   - Show a **simple example** of how the method works in a **code cell**.

> ⚠️ Methods that we **already covered in class** (you do NOT need to include them again):
> - `add`
> - `pop`
> - `update`
> - `difference_update`
> - `discard`
> - `remove`
> - `union`

---

🔁 **Goal**: After completing this, you should be confident using all built-in set methods in Python and understand when to apply each.

📌 Submit your completed notebook as instructed.

Happy coding! 🐍✨