## Part 2: RegEx
## 🔤 RegEx Shorthand Character Classes

In Regular Expressions, shorthand symbols are used to represent **groups of characters** that often appear in text, such as digits, letters, or spaces.  
They make patterns shorter and easier to read.

Below are the most common shorthand patterns and what they do 👇


### 🧩 `\d` → **Digit**
Represents **any single digit from 0 to 9**.  

Useful when you want to find numbers such as ages, phone numbers, or IDs.

In [1]:
import re

text = "Room 12, Floor 3"

print(re.findall(r'\d', text))    # ['1', '2', '3']
print(re.findall(r'\d+', text))   # ['12', '3'] → '+' means one or more digits

['1', '2', '3']
['12', '3']
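
One detail worth knowing about `\d` in Python 3: on a `str` pattern it matches any Unicode decimal digit, not only `0` to `9`. The sketch below is illustrative; the sample string is made up, and `\u0663` is the Arabic-Indic digit three. Passing `re.ASCII` limits the match to `0` to `9`.

```python
import re

# Hypothetical sample text; "\u0663" is ARABIC-INDIC DIGIT THREE
text = "Room 12, Floor \u0663"

# Default behaviour for str patterns: any Unicode decimal digit matches
unicode_digits = re.findall(r'\d', text)

# With re.ASCII, \d is restricted to the characters 0-9
ascii_digits = re.findall(r'\d', text, re.ASCII)

print(len(unicode_digits))   # 3
print(ascii_digits)          # ['1', '2']
```

If the input should only ever contain `0` to `9`, either `re.ASCII` or the explicit class `[0-9]` makes that intent visible.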


### 🧩 `\w` → **Word Character**
Represents letters (a–z, A–Z), digits (0–9), and underscore (_).

Used to match words, variable names, or identifiers.

In [None]:
text = "User_123 joined!"

print(re.findall(r'\w', text))     # ['U','s','e','r','_','1','2','3','j','o','i','n','e','d']
print(re.findall(r'\w+', text))    # ['User_123', 'joined']
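
As a small illustration of using `\w` for identifiers, the sketch below checks whether a whole string consists only of word characters. The helper name `is_identifier_like` is hypothetical and not part of the original notebook; `re.fullmatch` requires the pattern to cover the entire string.

```python
import re

def is_identifier_like(name):
    """Return True if name is made up only of word characters (hypothetical helper)."""
    return re.fullmatch(r'\w+', name) is not None

print(is_identifier_like("User_123"))   # True
print(is_identifier_like("User-123"))   # False: '-' is not a word character
print(is_identifier_like(""))           # False: '+' needs at least one character
```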


### 🧩 `\s` → **Whitespace**

Represents any space, tab, or newline character.

Used to locate or remove extra spaces or formatting.

In [8]:
text = "Hello   world"

print(re.findall(r'\s', text))     # [' ', ' ', ' ']
print(re.sub(r'\s+', ' ', text))   # 'Hello world' → collapses each run of whitespace into one space


[' ', ' ', ' ']
Hello world
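
Since `\s` also covers tabs and newlines, the same idea works on messier input. This is a minimal sketch with a made-up string; it normalises mixed whitespace and then splits the text into tokens.

```python
import re

# Hypothetical input mixing spaces, a tab, and newlines
messy = "  Hello \t world\n\nagain  "

# Collapse every run of whitespace to one space, then trim both ends
collapsed = re.sub(r'\s+', ' ', messy).strip()

# Split on runs of whitespace after trimming, so no empty tokens appear
tokens = re.split(r'\s+', messy.strip())

print(collapsed)   # Hello world again
print(tokens)      # ['Hello', 'world', 'again']
```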


### 🧩 `\D` → **Non-Digit**

Matches any character that is not a digit.

Useful when you want to exclude numbers from a search.

In [9]:
text = "A1B2C3"

print(re.findall(r'\D', text))     # ['A','B','C']


['A', 'B', 'C']
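
A common use of `\D` is stripping everything except digits. The sketch below assumes a made-up phone number string and removes every non-digit character with `re.sub`.

```python
import re

# Hypothetical formatted phone number
raw_phone = "(555) 123-4567"

# Replace every non-digit character with nothing
digits_only = re.sub(r'\D', '', raw_phone)

print(digits_only)   # 5551234567
```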


### 🧩 `\W` → **Non-Word Character**

Matches characters that are not letters, digits, or underscores, such as punctuation and symbols.

In [10]:
text = "Hi! How_are-you?"

print(re.findall(r'\W', text))     # ['!', ' ', '-', '?']


['!', ' ', '-', '?']
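
`\W` can also act as a separator when breaking text into words. The sketch below reuses the sample string from the cell above and splits on runs of non-word characters; the filtering step is there because `re.split` returns an empty string when the text ends with a separator.

```python
import re

text = "Hi! How_are-you?"

# Split on one or more non-word characters
parts = re.split(r'\W+', text)

# Drop the empty string produced by the trailing '?'
words = [p for p in parts if p]

print(parts)   # ['Hi', 'How_are', 'you', '']
print(words)   # ['Hi', 'How_are', 'you']
```

Note that `How_are` stays in one piece because the underscore counts as a word character.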


### 🧩 `\S` → **Non-Whitespace**

Matches any character that is not a space, tab, or newline.

In [11]:
text = "A B\tC"

print(re.findall(r'\S', text))     # ['A','B','C']


['A', 'B', 'C']
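
Adding `+` to `\S` yields whole whitespace-separated tokens rather than single characters. The input line here is a made-up example, not taken from the notebook.

```python
import re

# Hypothetical line with fields separated by spaces and a tab
line = "name=Ali  age=30\tcity=Paris"

# Each match is a maximal run of non-whitespace characters
fields = re.findall(r'\S+', line)

print(fields)   # ['name=Ali', 'age=30', 'city=Paris']
```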


### ✅ RegEx Shorthand Summary Table

| Pattern | Meaning | Example Match | Description |
|----------|----------|----------------|--------------|
| `\d` | Digit (0–9) | `'3'` in `'Room3'` | Matches any single numeric digit |
| `\w` | Word character (letter, digit, or underscore) | `'A'`, `'1'`, `'_'` | Used for words, variable names, or identifiers |
| `\s` | Whitespace (space, tab, newline) | `' '` or `'\t'` | Matches any kind of blank space |
| `\D` | Non-digit | `'A'`, `'%'` | Matches any character that is **not a digit** |
| `\W` | Non-word character | `'@'`, `'!'`, `' '` | Matches punctuation, symbols, and spaces |
| `\S` | Non-whitespace | `'A'`, `'9'`, `'_'` | Matches all visible (non-space) characters |
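
The table pairs each lowercase class with an uppercase complement, so any single character falls into exactly one class of each pair. The sketch below makes that concrete; `SHORTHANDS` and `classes_for` are hypothetical names introduced only for this illustration.

```python
import re

# The six shorthand classes from the table above
SHORTHANDS = [r'\d', r'\D', r'\w', r'\W', r'\s', r'\S']

def classes_for(ch):
    """Return the shorthand classes that match the single character ch."""
    return [p for p in SHORTHANDS if re.fullmatch(p, ch)]

for ch in ['A', '7', ' ', '!']:
    print(repr(ch), classes_for(ch))
```

Each character matches three of the six patterns, one from each complementary pair.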


## 🔢 RegEx Quantifiers

Quantifiers define **how many times** a character or pattern should appear in a string.  
They help match repeating characters, digits, or words.

---

### 🧩 Common Quantifiers

| Quantifier | Meaning | Example Pattern | Matches | Description |
|-------------|----------|----------------|----------|--------------|
| `.` | Any **single character** (except newline) | `a.b` | `acb`, `a1b`, `a-b` | Strictly a wildcard rather than a quantifier, listed here because it is usually combined with one; matches **any** one character between `a` and `b` |
| `+` | **One or more** repetitions | `\d+` | `123`, `4567` | Matches **at least one** digit |
| `*` | **Zero or more** repetitions | `\w*` | `abc`, `123`, *(empty string)* | Matches any number of word characters, even none |
| `?` | **Zero or one** repetition | `colou?r` | `color`, `colour` | Makes the preceding character optional |
| `{n}` | **Exactly n** repetitions | `\d{4}` | `2025` | Matches exactly 4 digits |
| `{n,}` | **At least n** repetitions | `\d{2,}` | `12`, `1234`, `12345` | Matches 2 or more digits |
| `{n,m}` | Between **n and m** repetitions | `\d{2,4}` | `12`, `123`, `1234` | Matches 2 to 4 digits |
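
The rows above can be checked one by one with `re.fullmatch`, which succeeds only when the pattern covers the whole string. This is a rough sketch; the candidate strings are made up to show one accepted and one rejected case for several of the patterns.

```python
import re

# Each entry: (pattern, candidate string, expected result of a full match)
cases = [
    (r'a.b',     'a-b',     True),
    (r'a.b',     'ab',      False),  # '.' must consume exactly one character
    (r'colou?r', 'color',   True),
    (r'colou?r', 'colouur', False),  # '?' allows at most one 'u'
    (r'\d{4}',   '2025',    True),
    (r'\d{4}',   '20255',   False),  # five digits is one too many
    (r'\d{2,}',  '7',       False),  # fewer than two digits
    (r'\d{2,4}', '123',     True),
    (r'\w*',     '',        True),   # '*' accepts zero repetitions
]

results = [re.fullmatch(pattern, string) is not None for pattern, string, _ in cases]

for (pattern, string, expected), got in zip(cases, results):
    print(f"{pattern!r:12} {string!r:10} -> {got}")
```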

---

### 💡 Example in Python

In [None]:
import re

text = "nasrin +912034 1234-5678-9872-2341 bita +9123039 @bitabita"

# Match standalone 4-digit sequences; \b stops '9120' (in +912034) and '9123' (in +9123039) from matching
print(re.findall(r'\b\d{4}\b', text))
# Output: ['1234', '5678', '9872', '2341']

# Match phone numbers starting with +91 followed by digits
print(re.findall(r'\+91\d+', text))
# Output: ['+912034', '+9123039']
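
Quantifiers and shorthand classes combine naturally. As a further sketch on the same sample text, the patterns below pull out the full hyphenated number, the `@` handle, and the `+` numbers by length. The variable names and the 6-to-7 digit range are assumptions chosen to fit this particular string.

```python
import re

text = "nasrin +912034 1234-5678-9872-2341 bita +9123039 @bitabita"

# Four groups of exactly four digits, separated by hyphens
card_like = re.findall(r'\d{4}-\d{4}-\d{4}-\d{4}', text)

# '@' followed by one or more word characters
handles = re.findall(r'@\w+', text)

# '+' followed by 6 to 7 digits (range assumed for this sample)
plus_numbers = re.findall(r'\+\d{6,7}', text)

print(card_like)      # ['1234-5678-9872-2341']
print(handles)        # ['@bitabita']
print(plus_numbers)   # ['+912034', '+9123039']
```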