```markdown
# Python Regular Expression Special Characters

Below is a table describing the special characters used in Python's regular expressions (via the `re` module) and their purposes.

| Special Character | Description | Example | Example Explanation |
|-------------------|-------------|---------|---------------------|
| `.` | Matches any single character except newline (`\n`). | `a.b` | Matches `axb`, `a1b`, `a@b`, etc., but not `ab` or `a\nb`. |
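| `^` | Matches the start of a string (and, in `re.MULTILINE` mode, the start of each line). | `^abc` | Matches `abc` at the beginning of a string, e.g., `abcde`, but not `xabc`. |
| `$` | Matches the end of a string or just before a trailing newline (and, in `re.MULTILINE` mode, the end of each line). | `xyz$` | Matches `xyz` at the end of a string, e.g., `wxyz`, but not `xyza`. |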
| `*` | Matches 0 or more repetitions of the preceding pattern. | `ab*` | Matches `a`, `ab`, `abb`, `abbb`, etc. |
| `+` | Matches 1 or more repetitions of the preceding pattern. | `ab+` | Matches `ab`, `abb`, `abbb`, etc., but not `a`. |
| `?` | Matches 0 or 1 repetition of the preceding pattern. | `ab?` | Matches `a` or `ab`. |
| `{m}` | Matches exactly `m` repetitions of the preceding pattern. | `a{3}` | Matches `aaa`, but not `aa` or `aaaa`. |
| `{m,n}` | Matches between `m` and `n` repetitions of the preceding pattern (inclusive). | `a{2,4}` | Matches `aa`, `aaa`, or `aaaa`, but not `a` or `aaaaa`. |
| `[]` | Defines a character class; matches any single character within the brackets. | `[abc]` | Matches `a`, `b`, or `c`. |
| `[^]` | Matches any single character not in the brackets. | `[^abc]` | Matches any character except `a`, `b`, or `c`. |
| `\|` | Matches either the pattern before or after the `\|`. | `a\|b` | Matches `a` or `b`. |
| `()` | Groups patterns together; captures the matched substring. | `(abc)` | Matches `abc` and captures it as a group. |
| `\` | Escapes a special character to treat it as a literal or denotes a special sequence. | `\.` | Matches a literal `.` (dot). |
| `\d` | Matches any digit (0-9). | `\d` | Matches `5` in `a5b`. |
| `\D` | Matches any non-digit. | `\D` | Matches `a` or `b` in `a5b`. |
| `\w` | Matches any word character (letters, digits, underscore). | `\w` | Matches `a`, `5`, or `_` in `a5_`. |
| `\W` | Matches any non-word character. | `\W` | Matches `@` in `a@5`. |
| `\s` | Matches any whitespace character (space, tab, newline, etc.). | `\s` | Matches a space in `a b`. |
| `\S` | Matches any non-whitespace character. | `\S` | Matches `a` or `b` in `a b`. |
| `\b` | Matches a word boundary. | `\bword\b` | Matches `word` in `word space` but not in `sword` or `words`. |
| `\B` | Matches a non-word boundary. | `\Bword\B` | Matches `word` in `swordplay` but not in `word space`. |
| `\A` | Matches the start of a string (like `^` but ignores multiline mode). | `\Aabc` | Matches `abc` at the start of the string. |
| `\Z` | Matches only at the end of a string; unlike `$`, it does not match before a trailing newline. | `xyz\Z` | Matches `xyz` at the end of `xyz`, but not in `xyz\n`. |
| `\z` | Synonym for `\Z`, available only in Python 3.14 and later; earlier versions reject it as a bad escape. | `xyz\z` | Matches `xyz` at the end of `xyz`, but not in `xyz\n`. |

## Notes
- Use raw strings (e.g., `r"pattern"`) in Python to avoid escaping backslashes.
- The `re` module provides functions like `re.match()`, `re.search()`, `re.findall()`, etc., to work with these patterns.
- For more complex patterns, combine these special characters, e.g., `r"\d{2}-\d{2}-\d{4}"` for dates like `12-31-2023`.
```
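
The notes above can be sketched in a few lines. This is a minimal illustration rather than a complete tour of the `re` module; the sample text and variable names are assumptions made for the example.

```python
import re

text = "Due 12-31-2023, shipped 01-15-2024"  # hypothetical sample text
date_pattern = r"\d{2}-\d{2}-\d{4}"          # raw string: backslashes reach the regex engine unchanged

# re.match() only tries to match at the very start of the string.
at_start = re.match(date_pattern, text)

# re.search() scans forward and returns the first match anywhere.
first = re.search(date_pattern, text)

# re.findall() returns every non-overlapping match as a list of strings.
all_dates = re.findall(date_pattern, text)

print(at_start)        # None, because the string starts with "Due"
print(first.group())   # 12-31-2023
print(all_dates)       # ['12-31-2023', '01-15-2024']
```

Choosing between the three functions mostly depends on whether the match must sit at the start, anywhere, or everywhere in the input.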

### 1. . (Dot – Any Character)
Meaning: Matches any single character except a newline (`\n`).

In [1]:
import re
result = re.findall(r'a.b', 'aab abb acb a\nb')
print(result)  # Output: ['aab', 'abb', 'acb']

['aab', 'abb', 'acb']


### 2. ^ (Caret – Start of String)
Meaning: Matches if the string starts with the given pattern.

Example:

In [2]:
result = re.findall(r'^Hello', 'Hello World')
print(result)  # Output: ['Hello']

['Hello']


### 3. $ (Dollar – End of String)
Meaning: Matches if the string ends with the given pattern.
 

In [3]:
result = re.findall(r'World$', 'Hello World')
print(result)  # Output: ['World']

['World']
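
By default `^` and `$` anchor to the whole string. As a hedged sketch (the two-line sample string is an assumption for illustration), the `re.MULTILINE` flag makes them also match at the start and end of each line, while `\A` keeps matching only at the very start:

```python
import re

lines = "Hello World\nHello Python"  # hypothetical two-line input

# Without flags, ^ matches only at the start of the whole string.
default_start = re.findall(r'^Hello', lines)

# With re.MULTILINE, ^ also matches right after each newline.
multiline_start = re.findall(r'^Hello', lines, re.MULTILINE)

# \A is unaffected by re.MULTILINE and still matches only at the very start.
absolute_start = re.findall(r'\AHello', lines, re.MULTILINE)

# With re.MULTILINE, $ also matches just before each newline.
line_ends = re.findall(r'\w+$', lines, re.MULTILINE)

print(default_start)    # ['Hello']
print(multiline_start)  # ['Hello', 'Hello']
print(absolute_start)   # ['Hello']
print(line_ends)        # ['World', 'Python']
```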


### 4. * (Asterisk – Zero or More Repetitions)
Meaning: Matches 0 or more of the preceding character.


In [4]:
result = re.findall(r'ab*', 'ab abb abbb ac')
print(result)  # Output: ['ab', 'abb', 'abbb', 'a']

['ab', 'abb', 'abbb', 'a']


### 5. + (Plus – One or More Repetitions)
Meaning: Matches 1 or more of the preceding character.

In [5]:
result = re.findall(r'ab+', 'ab abb abbb ac')
print(result)  # Output: ['ab', 'abb', 'abbb']

['ab', 'abb', 'abbb']


### 6. ? (Question Mark – Zero or One)
Meaning: Matches 0 or 1 of the preceding character.
It also makes a quantifier non-greedy when placed directly after it, as in `*?` or `+?`.

In [6]:
result = re.findall(r'ab?', 'ab abb ac')
print(result)  # Output: ['ab', 'ab', 'a']

['ab', 'ab', 'a']
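
The non-greedy use of `?` can be illustrated with a short sketch. The tag-like sample string is an assumption chosen only to make the difference visible:

```python
import re

sample = '<a><b>'  # hypothetical input

# Greedy: .+ grabs as much as possible, so the match runs to the last '>'.
greedy = re.findall(r'<.+>', sample)

# Non-greedy: .+? stops at the first '>' that lets the pattern succeed.
lazy = re.findall(r'<.+?>', sample)

print(greedy)  # ['<a><b>']
print(lazy)    # ['<a>', '<b>']
```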


### 7. {} (Braces – Specific Repetition)
Meaning: Matches exact or range of repetitions.

Examples:

In [7]:
re.findall(r'a{3}', 'aa aaaa aaa')       # ['aaa', 'aaa'] (the first three a's of 'aaaa' also match)
re.findall(r'a{2,4}', 'a aa aaa aaaa')   # ['aa', 'aaa', 'aaaa']

['aa', 'aaa', 'aaaa']

### 8. [] (Square Brackets – Character Set)
Meaning: Matches any one of the characters inside.

Example:

In [8]:
re.findall(r'[abc]', 'apple boy cat')  # ['a', 'b', 'c', 'a']

['a', 'b', 'c', 'a']

In [9]:
# You can also use ranges:
re.findall(r'[a-z]', 'ABC def XYZ')  # ['d', 'e', 'f']

['d', 'e', 'f']
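
A leading `^` inside the brackets negates the set. A minimal sketch, reusing the sample string from the range example:

```python
import re

# [^a-z] matches any single character that is NOT a lowercase ASCII letter,
# which includes uppercase letters and spaces.
not_lower = re.findall(r'[^a-z]', 'ABC def XYZ')

print(not_lower)  # ['A', 'B', 'C', ' ', ' ', 'X', 'Y', 'Z']
```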

### 9. \ (Backslash – Escape Character / Special Sequences)
Meaning: Used to escape characters or signal special sequences like \d, \w, etc.

Example (escape):

In [10]:
re.findall(r'\.', 'file.txt')  # ['.']

['.']

In [11]:
# Example (special):
re.findall(r'\d+', 'Age 24 and 35')  # ['24', '35']

['24', '35']

### 10. | (Pipe – OR)
Meaning: Acts like a logical OR.

Example:

In [12]:
re.findall(r'cat|dog', 'cat and dog')  # ['cat', 'dog']

['cat', 'dog']

### 11. () (Parentheses – Grouping and Capturing)
Meaning: Groups expressions and captures matched groups.

Example:

In [13]:
match = re.search(r'(a+)(b+)', 'aaabbb')
print(match.groups())  # ('aaa', 'bbb')

('aaa', 'bbb')
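
Capturing groups also change what `re.findall()` returns. As a small sketch (the input strings here are assumptions for illustration), a pattern with several groups yields one tuple per match, and individual groups can be read with `group(n)`:

```python
import re

pattern = r'(a+)(b+)'

# With two capturing groups, findall returns a list of (group1, group2) tuples.
pairs = re.findall(pattern, 'aaabbb ab')

match = re.search(pattern, 'aaabbb')
whole = match.group(0)   # the entire match
first = match.group(1)   # text captured by the first group
second = match.group(2)  # text captured by the second group

print(pairs)                 # [('aaa', 'bbb'), ('a', 'b')]
print(whole, first, second)  # aaabbb aaa bbb
```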


# Special Sequences in Regex

| Sequence | Meaning |
|----------|---------|
| `\d` | Matches any digit (0–9) |
| `\D` | Matches any non-digit |
| `\w` | Matches any word character (a–z, A–Z, 0–9, _) |
| `\W` | Matches any non-word character |
| `\s` | Matches any whitespace (space, tab, newline) |
| `\S` | Matches any non-whitespace character |
| `\b` | Matches a word boundary |
| `\B` | Matches a non-word boundary |
| `\\` | Matches a literal backslash |

Let’s go through each one step by step with examples.

### 1. \d – Digit
Meaning: Matches any digit from 0 to 9.

Example:

In [14]:
import re
re.findall(r'\d+', 'My phone number is 12345')  # ['12345']

['12345']

### 2. \D – Non-Digit
Meaning: Matches any character that is not a digit.

Example:

In [15]:
re.findall(r'\D+', 'Room 101')  # ['Room ']

['Room ']

### 3. \w – Word Character
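Meaning: Matches any alphanumeric character plus underscore. With the `re.ASCII` flag this is exactly `[a-zA-Z0-9_]`; by default, `str` patterns also match Unicode letters and digits.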

Example:

In [16]:
re.findall(r'\w+', 'Hello_world 123!')  # ['Hello_world', '123']


['Hello_world', '123']
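
For `str` patterns in Python 3, `\w` is Unicode-aware by default, so it can match more than `[a-zA-Z0-9_]`. A brief sketch, with a sample string chosen for illustration, shows the difference the `re.ASCII` flag makes:

```python
import re

text = 'caf\u00e9 na\u00efve'  # "café naïve", written with escapes to avoid encoding issues

# Default (Unicode) matching: accented letters count as word characters.
unicode_words = re.findall(r'\w+', text)

# re.ASCII restricts \w to [a-zA-Z0-9_], so accented letters split the words.
ascii_words = re.findall(r'\w+', text, re.ASCII)

print(unicode_words)  # ['café', 'naïve']
print(ascii_words)    # ['caf', 'na', 've']
```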

### 4. \W – Non-Word Character
Meaning: Matches anything not a word character.

Example:

In [17]:
re.findall(r'\W+', 'Hello world! 123')  # [' ', '! ']

[' ', '! ']

### 5. \s – Whitespace
Meaning: Matches spaces, tabs, newlines, etc.

Example:

In [18]:
re.findall(r'\s+', 'Hello   world\tPython')  # ['   ', '\t']

['   ', '\t']

### 6. \S – Non-Whitespace
Meaning: Matches any character that is not a whitespace.

Example:

In [19]:
re.findall(r'\S+', 'Line 1\nLine 2')  # ['Line', '1', 'Line', '2']

['Line', '1', 'Line', '2']

### 7. \b – Word Boundary
Meaning: Matches the boundary between a word and a non-word character.

Example:

In [20]:
re.findall(r'\bcat\b', 'cat scatter catalog')  # ['cat']

['cat']

### 8. \B – Non-Word Boundary
Meaning: Matches at any position that is not a word boundary, for example between two letters inside a word.

Example:

In [23]:
re.findall(r'\Bcat\B', 'scatter catalog cat')  # ['cat']

# This finds "cat" only inside "scatter". In "catalog" it sits at the start of the word, so \B fails there.

['cat']
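
To see exactly which occurrence matched, `re.finditer()` exposes the match positions. A minimal sketch using the same string as the `\B` example:

```python
import re

text = 'scatter catalog cat'

# Each match object reports where the match starts in the string.
inner = [m.start() for m in re.finditer(r'\Bcat\B', text)]
whole = [m.start() for m in re.finditer(r'\bcat\b', text)]

print(inner)  # [1]  -> the "cat" inside "scatter"
print(whole)  # [16] -> the standalone word "cat"
```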

### 9. \\ – Escaping a Backslash
Meaning: Matches a literal backslash \.

Example:

In [24]:
re.findall(r'\\', 'folder\\path')  # ['\\']

['\\']

---

# MORE EXAMPLES


### 1. Extract All Digits
Task: Match every run of consecutive digits in a string.

Input:

In [28]:
text = "My phone is 080-1234-5678 and zip is 100001"
import re
print(re.findall(r'\d+', text))

['080', '1234', '5678', '100001']


### 2. Match All Whole Words
Task: Extract all full words (alphanumeric + underscore).


In [30]:
text = "Welcome_to regex101! Learn regex at 100% efficiency."
print(re.findall(r'\w+', text))


['Welcome_to', 'regex101', 'Learn', 'regex', 'at', '100', 'efficiency']


### 3. Get Words That Start With Capital Letter
Task: Match all words starting with an uppercase letter.

In [31]:
text = "Python and JavaScript are Programming Languages."
print(re.findall(r'\b[A-Z]\w*', text))

['Python', 'JavaScript', 'Programming', 'Languages']


### 4. Extract Emails From Text
Task: Extract email addresses using a simplified pattern (it does not cover every valid address format).

In [32]:
text = "Contact us at support@example.com or info@domain.co.uk"
print(re.findall(r'\b[\w.-]+@[\w.-]+\.\w+', text))

['support@example.com', 'info@domain.co.uk']


### 5. Extract Hashtags
Task: Extract all hashtags from a tweet.

In [33]:
text = "Loving the #Python community! #Coding #Regex101"
print(re.findall(r'#\w+', text))

['#Python', '#Coding', '#Regex101']
