# Introduction to Sets in Python

## 🔥 Python Sets

**Sets in Python are a powerful built-in data structure designed for handling unique, unordered collections of elements.** Although beginners sometimes overlook them in favor of lists, understanding sets is essential for writing efficient, clean, and optimized Python code.

This article covers what sets are, how they work internally, the operations they support, and where they are useful in real-world code.

## 🧩 What Is a Set in Python?

- A set in Python is an **unordered, mutable collection of unique elements**.

### ✔ Unordered
- Items in a set do not have positions **(no indexing or slicing)**.
- Iteration order follows the **internal hash-table layout** and should not be relied on.

### ✔ Mutable

- You can **add or remove items** after creating a set.

### ✔ Unique Items

- A set **stores each value at most once**, so duplicates are dropped automatically.
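The three properties above can be seen in a few lines. This is a minimal sketch; the `tags` set is a made-up example.

```python
# Unique: duplicates in the literal collapse to a single element.
tags = {"python", "sets", "python"}
print(len(tags))  # 2

# Mutable: elements can be added after creation.
tags.add("hashing")
print("hashing" in tags)  # True

# Unordered: there is no positional access.
try:
    tags[0]
except TypeError as exc:
    print(exc)  # 'set' object is not subscriptable
```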

## 🛠️ How to Create a Set
### 1. Using curly braces {}
```python
s = {1, 2, 3}
```

### 2. Using the set() constructor
```python
s = set([1, 2, 3])
```

### ❗ Creating an empty set requires set()
```python
s = {}      # ❌ This creates an empty dict
s = set()   # ✔ Correct empty set
```
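Since `set()` accepts any iterable, the constructor is not limited to lists. A small sketch with illustrative variable names:

```python
letters = set("hello")           # a string is iterated character by character
from_tuple = set((1, 2, 2, 3))   # duplicates are dropped
from_range = set(range(3))

print(sorted(letters))     # ['e', 'h', 'l', 'o']
print(sorted(from_tuple))  # [1, 2, 3]
print(sorted(from_range))  # [0, 1, 2]

# Empty braces are a dict, not a set.
print(type({}).__name__, type(set()).__name__)  # dict set
```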

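## 🧱 **What Can a Set Contain?**

A set can contain **only hashable** objects. The built-in immutable types qualify:

- ✔ integers
- ✔ floats
- ✔ strings
- ✔ tuples (as long as every item inside the tuple is itself hashable)

- ❌ lists
- ❌ dictionaries
- ❌ other sets

**Why?**
Because these are **mutable**: if an element's value (and therefore its hash) could change after insertion, the set could no longer find it. Python prevents this by making the built-in mutable containers unhashable.

However, it can contain:

- ✔ `frozenset()` (because it is immutable)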

Example:

```python
s = {1, "hello", (2,3), frozenset([4,5])}
```
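One way to check whether an object can go into a set is to call `hash()` on it. The helper name `is_hashable` below is made up for this sketch:

```python
def is_hashable(obj):
    """Return True if obj can be used as a set element or dict key."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

print(is_hashable((1, 2)))            # True
print(is_hashable((1, [2, 3])))       # False: the tuple contains a list
print(is_hashable([1, 2]))            # False
print(is_hashable({"a": 1}))          # False
print(is_hashable(frozenset({1})))    # True
```

The second case is worth noting: a tuple is only hashable when everything inside it is hashable too.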

## ⚙️ **Internal Working of Sets (Important)**

Sets are implemented as **hash tables**, similar to dictionaries.

Each item is:

* hashed using Python's built-in `hash()` function
* placed in a slot of the table chosen from this hash

This gives sets two superpowers:

### ⭐ O(1) average lookup time

Checking membership is extremely fast:

```python
if 10 in s:
    ...
```
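A rough way to see the difference is to time the same membership test against a list and a set. This is only a sketch: the size `n` and repeat count are arbitrary choices, and absolute timings depend on the machine.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)
target = n - 1  # last element: the worst case for a linear list scan

list_time = timeit.timeit(lambda: target in as_list, number=200)
set_time = timeit.timeit(lambda: target in as_set, number=200)

print(f"list: {list_time:.4f}s  set: {set_time:.6f}s")
```

The list has to compare elements one by one, while the set jumps to the slot derived from the hash, so the set timing should stay roughly flat as `n` grows.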

### ⭐ Automatic duplicate removal

Objects that compare equal must have the same hash, so a second equal object lands on the existing entry and is stored only once.
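Uniqueness is decided by `==` together with `hash()`, not by type or identity. The sketch below shows this with built-in numbers and with a hypothetical `Point` class written for this example:

```python
# 1, 1.0 and True compare equal and share a hash, so only one is kept.
mixed = {1, 1.0, True}
print(len(mixed))  # 1

class Point:
    """Illustrative value type: equal coordinates mean equal points."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Must agree with __eq__: equal points hash alike.
        return hash((self.x, self.y))

points = {Point(1, 2), Point(1, 2), Point(3, 4)}
print(len(points))  # 2
```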

## 🧰 Basic Set Operations

```python
# add an item
s = set()
s.add(10)
s
# {10}
```

```python
# add multiple items
s.update([4, 5, 'a'])
s
# {10, 4, 5, 'a'}
```

```python
# remove an element: raises KeyError if the element is not in the set
try:
    s.remove(10)
except KeyError as e:
    print(e)

s
# {4, 5, 'a'}
```

```python
# discard: safe, no error if the element is missing
s.discard('a')
s
# {4, 5}
```

```python
# pop: removes and returns an arbitrary element
s.pop()
s
# {5}
```

```python
# clear: remove every element
s.clear()
s
# set()
```
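The difference between `remove()`, `discard()`, and `pop()` shows up when the element is missing or the set is empty. A minimal sketch:

```python
s = {1, 2}

s.discard(99)  # missing element: silently ignored

try:
    s.remove(99)  # missing element: raises KeyError
except KeyError as exc:
    print("KeyError:", exc)  # KeyError: 99

empty = set()
try:
    empty.pop()  # nothing to remove: raises KeyError
except KeyError as exc:
    print("KeyError:", exc)  # KeyError: 'pop from an empty set'
```

Prefer `discard()` when a missing element is not an error, and `remove()` when it signals a bug.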

## 🔗 Set Mathematical Operations

Python sets behave like mathematical sets.

```python
a = {1, 2, 3}
b = {3, 4, 5}
```

```python
# Union (| or union())
a | b
# {1, 2, 3, 4, 5}
```

```python
# Intersection (& or intersection())
a & b
# {3}
```

```python
# Difference (- or difference())
a - b
# {1, 2}
```

```python
# Symmetric Difference (^ or symmetric_difference())
a ^ b
# {1, 2, 4, 5}
```
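The operator and method forms are not fully interchangeable: the methods accept any iterable, while the operators require sets on both sides. The operators also have in-place variants. A short sketch:

```python
a = {1, 2, 3}
b = [3, 4, 5]  # a list, not a set

# Method form accepts any iterable.
print(sorted(a.union(b)))  # [1, 2, 3, 4, 5]

# Operator form requires both operands to be sets.
try:
    a | b
except TypeError:
    print("operators need a set on both sides")

# In-place variants modify the left-hand set.
a |= {4}   # same as a.update({4})
a -= {1}   # same as a.difference_update({1})
print(sorted(a))  # [2, 3, 4]
```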

## ⚖️ Comparison Operations

```python
# subset
a.issubset(b)
# False
```

```python
# superset
a.issuperset(b)
# False
```

```python
# disjoint
a.isdisjoint(b)
# False
```
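The comparison methods also have operator equivalents, which additionally distinguish proper subsets. A sketch with sets chosen so that the checks return `True`:

```python
small = {1, 2}
big = {1, 2, 3}

print(small <= big)   # True: same as small.issubset(big)
print(small < big)    # True: proper subset (subset and not equal)
print(big >= small)   # True: same as big.issuperset(small)
print(small < small)  # False: a set is not a proper subset of itself
print(small.isdisjoint({7, 8}))  # True: no elements in common
```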

## 📚 Full List of Set Methods

| Method                          | Purpose                                   |
| ------------------------------- | ----------------------------------------- |
| `add()`                         | Add an element                            |
| `update()`                      | Add multiple elements                     |
| `remove()`                      | Remove (`KeyError` if missing)            |
| `discard()`                     | Remove (no error if missing)              |
| `pop()`                         | Remove and return an arbitrary element    |
| `clear()`                       | Empty the set                             |
| `copy()`                        | Return a shallow copy                     |
| `union()`                       | Combine sets                              |
| `intersection()`                | Common items                              |
| `difference()`                  | Items in A not B                          |
| `symmetric_difference()`        | Items in A or B but not both              |
| `intersection_update()`         | Keep only common items, in place          |
| `difference_update()`           | Remove items found in B, in place         |
| `symmetric_difference_update()` | Keep items in A or B but not both, in place |
| `issubset()`                    | Subset check                              |
| `issuperset()`                  | Superset check                            |
| `isdisjoint()`                  | Check if intersection is empty            |


## 🧠 **Advanced Concepts**

### 1. Using Sets for Fast Membership Testing

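```python
allowed = {"png", "jpg", "jpeg"}
ext = "png"  # example value; in practice, taken from the file name

if ext in allowed:
    print("Valid file")
```

A set membership test takes O(1) time on average, versus O(n) for a list, so the gap widens as the collection grows.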

---

### 2. Set Comprehensions

```python
s = {x*x for x in range(5)}
# {0, 1, 4, 9, 16}
```

---

### 3. Removing Duplicates from a List

```python
nums = [1, 2, 2, 3, 3, 3]
unique = list(set(nums))  # note: the original order is not guaranteed to survive
```
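When the original order matters, a set can still do the duplicate detection while a list keeps the order. The helper name `dedupe` is made up for this sketch; `dict.fromkeys` is a built-in alternative that relies on dictionaries preserving insertion order (Python 3.7+).

```python
def dedupe(items):
    """Return items without duplicates, keeping first occurrences in order."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:  # O(1) average membership test
            seen.add(item)
            result.append(item)
    return result

nums = [3, 1, 3, 2, 1]
print(dedupe(nums))               # [3, 1, 2]
print(list(dict.fromkeys(nums)))  # [3, 1, 2]
```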

---

### 4. Finding Unique Words in Text

```python
text = "python is great and python is easy"
unique_words = set(text.split())
```
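Real text usually needs some normalization first, otherwise `"Python"`, `"python"`, and `"python."` count as three different words. A small sketch, assuming lowercasing and stripping ASCII punctuation is enough for the input at hand:

```python
import string

text = "Python is great, and python is easy."

# Strip surrounding punctuation and lowercase each token before collecting.
words = (w.strip(string.punctuation).lower() for w in text.split())
vocabulary = {w for w in words if w}

print(sorted(vocabulary))  # ['and', 'easy', 'great', 'is', 'python']
```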

---

### 5. Comparing Two Lists Efficiently

```python
list1 = [1, 2, 3, 4]  # example data
list2 = [3, 4, 5, 6]
common = set(list1) & set(list2)  # {3, 4}
```
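The same approach answers the other usual comparison questions: what is only in the first list and what is only in the second. A sketch with made-up user names:

```python
old_users = ["alice", "bob", "carol"]
new_users = ["bob", "dave", "alice"]

old_set, new_set = set(old_users), set(new_users)

still_here = old_set & new_set  # in both lists
removed = old_set - new_set     # only in the old list
added = new_set - old_set       # only in the new list

print(sorted(still_here))  # ['alice', 'bob']
print(sorted(removed))     # ['carol']
print(sorted(added))       # ['dave']
```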

---

### 6. Creating Sets of Sets Using `frozenset`

```python
s = {frozenset([1,2]), frozenset([3,4])}
```
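One place this is handy is representing undirected edges, where `("a", "b")` and `("b", "a")` should count as the same pair. A sketch with illustrative node names:

```python
edges = {
    frozenset(("a", "b")),
    frozenset(("b", "a")),  # same edge as above, so it is dropped
    frozenset(("b", "c")),
}
print(len(edges))  # 2

# frozensets are hashable, so they also work as dictionary keys.
weights = {frozenset(("a", "b")): 5}
print(weights[frozenset(("b", "a"))])  # 5
```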


## ⚠️ **Common Mistakes with Sets**

### 1. Trying to index a set

```python
s = {1, 2, 3}
s[0]   # ❌ TypeError: 'set' object is not subscriptable
```

### 2. Using mutable items in a set

```python
s = {[1, 2]}  # ❌ TypeError: unhashable type: 'list'
```

### 3. Expecting a consistent order

```python
print({"a", "b", "c"})
# may print {'b', 'c', 'a'}; string hashes are randomized per process
```


## 🆚 **Set vs List vs Tuple**

| Feature              | List        | Tuple                             | Set                  |
| -------------------- | ----------- | --------------------------------- | -------------------- |
| Ordered              | ✔ Yes       | ✔ Yes                             | ❌ No                 |
| Mutable              | ✔ Yes       | ❌ No                              | ✔ Yes                |
| Allows duplicates    | ✔ Yes       | ✔ Yes                             | ❌ No                 |
| Fast membership test | ❌ No, O(n)  | ❌ No, O(n)                        | ✔ Yes, O(1) average  |
| Hashable             | ❌ No        | ✔ Yes (if all items are hashable) | ❌ No                 |


## 🌍 **Real-world Use Cases of Sets**

### 🔹 1. Removing duplicates from large datasets

### 🔹 2. Membership checks (e.g., valid tokens, whitelists)

### 🔹 3. Graph algorithms (neighbors, visited nodes)
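As an illustration, a breadth-first search typically keeps a `visited` set so each node is processed once. The function name `reachable` and the sample `graph` are assumptions made for this sketch:

```python
from collections import deque

def reachable(graph, start):
    """Return every node reachable from start using breadth-first search."""
    visited = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in visited:  # O(1) average check
                visited.add(neighbor)
                queue.append(neighbor)
    return visited

# Adjacency mapping: node -> set of neighbors.
graph = {"a": {"b", "c"}, "b": {"d"}, "c": {"d"}, "d": set(), "e": {"a"}}
print(sorted(reachable(graph, "a")))  # ['a', 'b', 'c', 'd']
```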

### 🔹 4. Comparing user permissions
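A sketch of how such a check might look, with made-up permission names:

```python
required = {"read", "write"}
user_perms = {"read", "comment"}

# The user is allowed only if every required permission is present.
allowed = required <= user_perms
missing = required - user_perms

print(allowed)          # False
print(sorted(missing))  # ['write']
```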

### 🔹 5. Deduping rows in ETL pipelines

### 🔹 6. NLP (unique vocabulary extraction)

### 🔹 7. Parsing logs for unique IP addresses
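A minimal sketch, assuming IPv4 addresses and a simplified log format invented for this example. The regular expression is deliberately loose and does not validate octet ranges.

```python
import re

# Hypothetical log lines using documentation-reserved addresses.
log_lines = [
    "192.0.2.1 - - GET /index.html",
    "198.51.100.7 - - GET /about",
    "192.0.2.1 - - POST /login",
]

IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

# A set comprehension keeps each address once, however often it appears.
unique_ips = {ip for line in log_lines for ip in IPV4.findall(line)}

print(sorted(unique_ips))  # ['192.0.2.1', '198.51.100.7']
```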

### 🔹 8. Detecting overlaps in schedules, time slots


## 📝 **Summary**

Python sets are:

* **unordered**
* **mutable**
* **duplicate-free**
* **hash-table based**
* **fast for membership checks**
* **powerful for mathematical set operations**

They are ideal whenever:

* You need uniqueness
* You need fast lookup
* Order doesn't matter

Understanding sets (and their immutable sibling `frozenset`) is crucial for writing efficient and Pythonic code.

---