# 🧵 Python Regex

In [None]:
import re

## 🧩 1. Basic Functions

| Function | Description |
|---------|-------------|
| `re.search()` | Finds the **first match** anywhere in the string |
| `re.findall()` | Returns **all non-overlapping matches** as a list |
| `re.match()` | Matches **only at the start** of the string |
| `re.sub()` | **Replaces** matches with a replacement |
| `re.split()` | **Splits** the string on the pattern |

In [None]:
text = "Email: test123@example.com, phone: 9876543210"

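re.search(r'\d+', text)            # First run of digits: a Match object for '123'
re.findall(r'\w+@\w+\.\w+', text)  # All emails: ['test123@example.com']
re.sub(r'\d+', '###', text)        # Replace every run of digits with '###'
re.split(r',\s*', text)            # Split on a comma plus any following whitespace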
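
The difference between `re.search()` and `re.match()` is easy to miss, so here is a minimal sketch using the same `text` as above. The variable names (`first_number`, `first_span`, `at_start`) are only illustrative.

```python
import re

text = "Email: test123@example.com, phone: 9876543210"

# re.search scans the whole string and returns a Match object (or None).
m = re.search(r'\d+', text)
first_number = m.group()   # the matched text
first_span = m.span()      # (start, end) indexes of the match

# re.match only tries at position 0, so it fails here: the string starts with 'E'.
at_start = re.match(r'\d+', text)

print(first_number, first_span, at_start)  # 123 (11, 14) None
```

Because both functions return `None` when nothing matches, it is usually safer to check the result before calling `.group()`.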

## 🔡 2. Metacharacters

| Symbol | Meaning | Example | Matches |
|--------|---------|---------|---------|
| `.`     | Any char except newline | `a.c` | `abc`, `a7c` |
| `^`     | Start of string         | `^Hi` | `Hi there` |
| `$`     | End of string           | `end$`| `The end` |
| `[]`    | Any one char in the set | `[aeiou]` | any single vowel |
| `[^]`   | Any one char not in the set | `[^0-9]` | any single non-digit |
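
The table rows above can be tried out directly. This is a small sketch; the `samples` list is made up for illustration.

```python
import re

samples = ["abc", "a7c", "ac", "Hi there", "Oh, Hi"]

# '.' needs exactly one character between 'a' and 'c', so 'ac' is rejected.
dot_hits = [s for s in samples if re.search(r'a.c', s)]      # ['abc', 'a7c']

# '^Hi' only matches when the string starts with 'Hi'.
start_hits = [s for s in samples if re.search(r'^Hi', s)]    # ['Hi there']

# Sets match one character at a time.
vowels = re.findall(r'[aeiou]', "regex")                     # ['e', 'e']
non_digits = re.findall(r'[^0-9]', "a1b2")                   # ['a', 'b']

print(dot_hits, start_hits, vowels, non_digits)
```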

## 🔢 3. Quantifiers

| Symbol | Meaning | Example | Matches |
|--------|---------|---------|---------|
| `*` | 0 or more | `ab*` | `a`, `ab`, `abbb` |
| `+` | 1 or more | `ab+` | `ab`, `abbb` |
| `?` | 0 or 1    | `ab?` | `a`, `ab` |
| `{n}` | Exactly n | `a{3}` | `aaa` |
| `{n,}`| n or more | `a{2,}` | `aa`, `aaa` |
| `{n,m}`| Between n and m | `a{2,4}` | `aa`, `aaa`, `aaaa` |
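
To check the quantifier examples above, one option is `re.fullmatch()`, which succeeds only when the whole string fits the pattern. The `matches` helper below is a hypothetical convenience function, not part of the `re` module.

```python
import re

def matches(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    return [c for c in candidates if re.fullmatch(pattern, c)]

candidates = ["a", "ab", "abbb", "aa", "aaa", "aaaa", "aaaaa"]

star = matches(r'ab*', candidates)            # ['a', 'ab', 'abbb']
plus = matches(r'ab+', candidates)            # ['ab', 'abbb']
optional = matches(r'ab?', candidates)        # ['a', 'ab']
exactly3 = matches(r'a{3}', candidates)       # ['aaa']
two_to_four = matches(r'a{2,4}', candidates)  # ['aa', 'aaa', 'aaaa']

print(star, plus, optional, exactly3, two_to_four)
```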

## 🔠 4. Character Classes

| Class | Matches |
|-------|---------|
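| `\d`  | Digit (0-9; also other Unicode digits unless `re.ASCII` is set) |
| `\D`  | Non-digit |
| `\w`  | Word character (letters, digits, `_`; Unicode-aware by default) |
| `\W`  | Non-word character |
| `\s`  | Whitespace (space, tab, newline, ...) |
| `\S`  | Non-whitespace |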

In [None]:
re.findall(r'\d+', "My pin is 400088")  # ['400088']
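
A quick sketch of the classes side by side, on a made-up `sample` string. The last two lines show that for `str` patterns `\d` also accepts non-ASCII digits unless the `re.ASCII` flag is passed.

```python
import re

sample = "Room 42, floor_3!"

digits = re.findall(r'\d+', sample)      # ['42', '3']
words = re.findall(r'\w+', sample)       # ['Room', '42', 'floor_3']
non_words = re.findall(r'\W+', sample)   # [' ', ', ', '!']
spaces = re.findall(r'\s', sample)       # [' ', ' ']

# '\u0663' is ARABIC-INDIC DIGIT THREE: a digit, but not in 0-9.
unicode_digits = re.findall(r'\d', "\u0663 7")                # ['\u0663', '7']
ascii_digits = re.findall(r'\d', "\u0663 7", flags=re.ASCII)  # ['7']

print(digits, words, non_words, spaces, ascii_digits)
```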

## 🧪 5. Grouping & Alternation

### 🔹 Grouping: `( )`

In [None]:
re.findall(r'(ab)+', "ababab")  # ['ab']: findall returns the group's last capture, not the whole match
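
With a capturing group, `re.findall()` returns what the group captured rather than the full match, which can be surprising. A minimal sketch of two ways to get the whole match, assuming that is what you want: a non-capturing group `(?:...)`, or `group(0)` on a Match object.

```python
import re

s = "ababab"

captured = re.findall(r'(ab)+', s)     # ['ab'] (last repetition of the group)
whole = re.findall(r'(?:ab)+', s)      # ['ababab'] (no capture, so full match)

m = re.search(r'(ab)+', s)
full_match = m.group(0)                # 'ababab'
group_one = m.group(1)                 # 'ab'

print(captured, whole, full_match, group_one)
```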

### 🔹 Alternation: `|`

In [None]:
re.findall(r'cat|dog', "I have a cat and a dog.")  # ['cat', 'dog']

## ⚙️ 6. Anchors

| Anchor | Meaning |
|--------|---------|
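| `^`    | Start of string (or of each line with `re.MULTILINE`) |
| `$`    | End of string, or just before a trailing newline (end of each line with `re.MULTILINE`) |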
| `\b`   | Word boundary |
| `\B`   | Non-word boundary |

In [None]:
re.findall(r'\bcat\b', "dogcat cats cat.")  # ['cat'] (only the standalone word)
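
Since every hit is the same text `'cat'`, match positions make the effect of `\b` and `\B` easier to see. A small sketch on the same string, using `re.finditer()`:

```python
import re

s = "dogcat cats cat."

# Start index of each match for three variants of the pattern.
whole_word = [m.start() for m in re.finditer(r'\bcat\b', s)]   # [12]
word_start = [m.start() for m in re.finditer(r'\bcat', s)]     # [7, 12]
inside_word = [m.start() for m in re.finditer(r'\Bcat', s)]    # [3]

print(whole_word, word_start, inside_word)
```

Index 3 is the `cat` inside `dogcat`, 7 is the start of `cats`, and 12 is the standalone `cat`.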

## 🧼 7. Substitution & Cleaning

In [None]:
text = "Price: ₹500, ₹600, ₹750"
cleaned = re.sub(r'[₹,]', '', text)
print(cleaned)  # Price: 500 600 750

## 🎯 8. Common Use Cases

In [None]:
# ✅ Validate Email (simplified pattern; returns a Match object if valid, else None)
re.match(r'^\w+[\w.-]*@\w+\.\w{2,4}$', 'test@example.com')
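
The pattern above is a rough filter rather than a full email validator. The sketch below wraps it in a hypothetical helper, `looks_like_email`, and shows two valid-looking addresses it rejects: one with a subdomain and one with a top-level domain longer than four characters.

```python
import re

# Same pattern as above, compiled once. The name EMAIL_RE is illustrative.
EMAIL_RE = re.compile(r'^\w+[\w.-]*@\w+\.\w{2,4}$')

def looks_like_email(address):
    """Return True if the address fits the simplified pattern."""
    return EMAIL_RE.match(address) is not None

ok = looks_like_email('test@example.com')               # True
no_at = looks_like_email('test.example.com')            # False
subdomain = looks_like_email('user@mail.example.com')   # False: only one dot allowed after '@'
long_tld = looks_like_email('a@b.museum')               # False: 'museum' exceeds {2,4}

print(ok, no_at, subdomain, long_tld)
```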

In [None]:
# ✅ Extract Phone Numbers (exactly 10 digits)
re.findall(r'\b\d{10}\b', "Call me at 9876543210 or 8123456789")  # ['9876543210', '8123456789']

In [None]:
# ✅ Remove Special Characters
re.sub(r'[^\w\s]', '', "Hello @#World!!")  # 'Hello World'

## 📌 Tips
- Use raw strings (`r''`) for patterns so that backslashes reach the regex engine unchanged.
- Debug patterns at [regex101.com](https://regex101.com) with the Python flavor selected.
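
To see why raw strings matter, consider `\b`: in a normal Python string it is the backspace character, so the regex engine never sees a word boundary. A minimal sketch:

```python
import re

# Without r'', '\b' becomes a single backspace character (\x08).
plain_len = len('\b')     # 1
raw_len = len(r'\b')      # 2: a backslash followed by 'b'

plain = re.findall('\bcat\b', "a cat sat")   # []: looks for backspace + 'cat' + backspace
raw = re.findall(r'\bcat\b', "a cat sat")    # ['cat']

print(plain_len, raw_len, plain, raw)
```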

## 📚 Optional: Compile Patterns (for repeated use)

In [None]:
pattern = re.compile(r'\d+')
pattern.findall("There are 24 apples and 30 bananas")  # ['24', '30']
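
A compiled pattern is most useful when the same expression is applied to many strings. A short sketch, with a made-up `lines` list and the illustrative name `NUMBER`:

```python
import re

NUMBER = re.compile(r'\d+')   # compiled once, reused for every line

lines = [
    "There are 24 apples and 30 bananas",
    "No numbers here",
    "7 pears",
]

# Convert each run of digits to an int, line by line.
counts = [[int(n) for n in NUMBER.findall(line)] for line in lines]

print(counts)  # [[24, 30], [], [7]]
```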