# Comprehensive Regular Expressions Tutorial

This tutorial covers regular expressions from basics to advanced topics. Each section contains detailed explanations, examples, and interactive exercises.

## Table of Contents

1. Introduction to Regular Expressions
2. Basic Symbols
3. Character Classes
4. Quantifiers
5. Anchors
6. Grouping and Capturing
7. Alternation
8. Escaping Special Characters
9. Lookahead and Lookbehind
10. Non-Capturing Groups
11. Flags
12. Greedy vs. Non-Greedy Matching
13. Word Boundaries
14. Start and End of String
15. Matching Digits and Non-Digits
16. Matching Whitespace and Non-Whitespace
17. Matching Word Characters and Non-Word Characters
18. Backreferences
19. Named Groups
20. Conditional Expressions
21. Unicode Matching
22. Dotall Mode
23. Verbose Mode
24. Inline Modifiers
25. Substitution
26. Splitting Strings
27. Finding All Matches
28. Compiling Regular Expressions
29. Performance Considerations
30. Common Pitfalls
31. Practical Examples
32. Regular Expressions in Python
33. Regular Expressions in JavaScript
34. Regular Expressions in Java
35. Regular Expressions in PHP
36. Regular Expressions in Ruby
37. Regular Expressions in Perl
38. Regular Expressions in Shell Scripts
39. Testing and Debugging Regular Expressions
40. Resources and Further Reading

## 1. Introduction to Regular Expressions

Regular expressions (regex) are a powerful tool for matching patterns in text. They are commonly used for search, replace, and validation tasks in text processing.

### Example
To match the word "hello" in a text, you would use the pattern `hello`.

In [None]:
# Example
import re

text = "hello world"
pattern = r"hello"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

### Interactive Exercises

1. Match the word `world` in the text `hello world`.
2. Match the word `Python` in the text `I love Python`.
3. Try to match the word `regex` in the text `Regular expressions are powerful`. No match is expected, because the literal string `regex` never appears in the text.

In [None]:
# Exercise 1
text = "hello world"
pattern = r"world"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

In [None]:
# Exercise 2
text = "I love Python"
pattern = r"Python"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

In [None]:
# Exercise 3
text = "Regular expressions are powerful"
pattern = r"regex"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

## 2. Basic Symbols

Regular expressions use a variety of symbols to define patterns. Here are some basic symbols:

- `.`: Matches any character except a newline.
- `^`: Matches the start of the string.
- `$`: Matches the end of the string.
- `*`: Matches 0 or more repetitions of the preceding element.
- `+`: Matches 1 or more repetitions of the preceding element.
- `?`: Matches 0 or 1 repetition of the preceding element.
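
The three repetition symbols are easy to confuse, so here is a minimal sketch of how each one behaves. The sample strings and variable names are made up for illustration.

```python
import re

# `?`: the "u" is optional, so both spellings match.
optional = re.findall(r"colou?r", "color colour")

# `*`: zero or more "b", so "ac" is accepted.
star = bool(re.fullmatch(r"ab*c", "ac"))

# `+`: at least one "b" is required, so "ac" is rejected.
plus = bool(re.fullmatch(r"ab+c", "ac"))

print(optional, star, plus)
```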

In [None]:
# Example
text = "hello"
pattern = r"h.llo"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

### Interactive Exercises

1. Match any character in the text `a1b2c3`.
2. Match the start of the string `^start` in the text `start here`.
3. Match the end of the string `end$` in the text `the end`.

In [None]:
# Exercise 1
text = "a1b2c3"
pattern = r"."
matches = re.findall(pattern, text)
print(matches)

In [None]:
# Exercise 2
text = "start here"
pattern = r"^start"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

In [None]:
# Exercise 3
text = "the end"
pattern = r"end$"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

## 3. Character Classes

Character classes allow you to match any one of a set of characters. Here are some examples:

- `[abc]`: Matches any one of the characters a, b, or c.
- `[a-z]`: Matches any one character from a to z.
- `[0-9]`: Matches any one digit from 0 to 9.
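
Ranges can be combined inside one class, and a leading `^` inside the brackets negates the class. The following sketch uses an invented sample string to show both.

```python
import re

text = "Room 42B, floor 3"

# A single class may hold several ranges; `+` then collects runs of them.
tokens = re.findall(r"[A-Za-z0-9]+", text)

# `^` as the first character inside the brackets means "anything except".
separators = re.findall(r"[^A-Za-z0-9]", text)

print(tokens)
print(separators)
```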

In [None]:
# Example
text = "cat"
pattern = r"[cb]at"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

### Interactive Exercises

1. Match any one of the characters `a`, `b`, or `c` in the text `abc`.
2. Match any one character from `a` to `z` in the text `hello`.
3. Match any one digit from `0` to `9` in the text `123`.

In [None]:
# Exercise 1
text = "abc"
pattern = r"[abc]"
matches = re.findall(pattern, text)
print(matches)

In [None]:
# Exercise 2
text = "hello"
pattern = r"[a-z]"
matches = re.findall(pattern, text)
print(matches)

In [None]:
# Exercise 3
text = "123"
pattern = r"[0-9]"
matches = re.findall(pattern, text)
print(matches)

## 4. Quantifiers

Quantifiers specify how many instances of a character, group, or character class must be present in the input for a match to be found:

- `*`: 0 or more
- `+`: 1 or more
- `?`: 0 or 1
- `{n}`: exactly n
- `{n,}`: n or more
- `{n,m}`: between n and m (inclusive)
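
As a rough illustration of the counted quantifiers, the sketch below runs each form against the same made-up text. The `\b` anchors are added here only so that each run of `a` is treated as a whole word.

```python
import re

text = "a aa aaa aaaa"

exactly_two = re.findall(r"\ba{2}\b", text)     # runs of exactly two
three_or_more = re.findall(r"\ba{3,}\b", text)  # runs of three or more
two_to_three = re.findall(r"\ba{2,3}\b", text)  # runs of two or three

print(exactly_two, three_or_more, two_to_three)
```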

In [None]:
# Example
text = "hello"
pattern = r"l+"
matches = re.findall(pattern, text)
print(matches)

### Interactive Exercises

1. Match 0 or more `a` in the text `aaab`.
2. Match 1 or more `b` in the text `bbbc`.
3. Match exactly 2 `c` in the text `ccc`.

In [None]:
# Exercise 1 (note: `a*` can also match the empty string, so expect empty matches in the output)
text = "aaab"
pattern = r"a*"
matches = re.findall(pattern, text)
print(matches)

In [None]:
# Exercise 2
text = "bbbc"
pattern = r"b+"
matches = re.findall(pattern, text)
print(matches)

In [None]:
# Exercise 3
text = "ccc"
pattern = r"c{2}"
matches = re.findall(pattern, text)
print(matches)

## 5. Anchors

Anchors are used to specify the position of the match:

- `^`: Start of string
- `$`: End of string
- `\b`: Word boundary
- `\B`: Non-word boundary
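
`\b` and `\B` are opposites, which is easiest to see by comparing where each one allows a match to start. This is a small sketch with an invented sample string.

```python
import re

text = "cat concat"

# `\bcat` only matches where "cat" begins a word.
at_boundary = [m.start() for m in re.finditer(r"\bcat", text)]

# `\Bcat` only matches where "cat" sits inside a longer word.
inside_word = [m.start() for m in re.finditer(r"\Bcat", text)]

print(at_boundary, inside_word)
```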

In [None]:
# Example
text = "hello world"
pattern = r"^hello"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

### Interactive Exercises

1. Match the start of the string `^start` in the text `start here`.
2. Match the end of the string `end$` in the text `the end`.
3. Match the word boundary `\bword\b` in the text `a word in a sentence`.

In [None]:
# Exercise 1
text = "start here"
pattern = r"^start"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

In [None]:
# Exercise 2
text = "the end"
pattern = r"end$"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

In [None]:
# Exercise 3
text = "a word in a sentence"
pattern = r"\bword\b"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

## 6. Grouping and Capturing

Parentheses `()` are used for grouping and capturing:

- `(abc)`: Matches and captures `abc`.
- `(a|b)`: Matches either `a` or `b`.
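
Captured groups are numbered from 1 in the order of their opening parentheses, and group 0 is the whole match. A minimal sketch, using an illustrative pattern and address:

```python
import re

match = re.search(r"(\w+)@(\w+)\.com", "contact: user@example.com")

whole = match.group(0)   # the entire matched text
local = match.group(1)   # first pair of parentheses
domain = match.group(2)  # second pair of parentheses

print(whole, local, domain)
```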

In [None]:
# Example
text = "hello"
pattern = r"(h)(e)(l)(l)(o)"
match = re.search(pattern, text)
if match:
    print(match.groups())
else:
    print("No match.")

### Interactive Exercises

1. Capture the groups `a` and `b` in the text `ab`.
2. Match either `cat` or `dog` in the text `cat and dog`.
3. Capture the groups `h`, `e`, and `llo` in the text `hello`.

In [None]:
# Exercise 1
text = "ab"
pattern = r"(a)(b)"
match = re.search(pattern, text)
if match:
    print(match.groups())
else:
    print("No match.")

In [None]:
# Exercise 2
text = "cat and dog"
pattern = r"(cat|dog)"
matches = re.findall(pattern, text)
print(matches)

In [None]:
# Exercise 3
text = "hello"
pattern = r"(h)(e)(llo)"
match = re.search(pattern, text)
if match:
    print(match.groups())
else:
    print("No match.")

## 7. Alternation

Alternation is used to match one of several patterns. The pipe symbol `|` is used for alternation.

### Example
To match either `cat` or `dog` in a text, you would use the pattern `cat|dog`.

In [None]:
# Example
text = "I have a cat and a dog"
pattern = r"cat|dog"
matches = re.findall(pattern, text)
print(matches)

### Interactive Exercises

1. Match either `apple` or `orange` in the text `I like apple and orange`.
2. Match either `red` or `blue` in the text `red, green, blue`.
3. Match either `cat`, `dog`, or `bird` in the text `I have a cat, a dog, and a bird`.

In [None]:
# Exercise 1
text = "I like apple and orange"
pattern = r"apple|orange"
matches = re.findall(pattern, text)
print(matches)

In [None]:
# Exercise 2
text = "red, green, blue"
pattern = r"red|blue"
matches = re.findall(pattern, text)
print(matches)

In [None]:
# Exercise 3
text = "I have a cat, a dog, and a bird"
pattern = r"cat|dog|bird"
matches = re.findall(pattern, text)
print(matches)

## 8. Escaping Special Characters

Special characters in regular expressions include `. ^ $ * + ? { } [ ] \ | ( )`. To match these characters literally, you need to escape them with a backslash `\`.

### Example
To match the character `.` in a text, you would use the pattern `\.`.

In [None]:
# Example
text = "example.com"
pattern = r"\."
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

### Interactive Exercises

1. Match the character `?` in the text `Is this correct?`.
2. Match the character `*` in the text `5 * 3 = 15`.
3. Match the character `[` in the text `Array[0]`.

In [None]:
# Exercise 1
text = "Is this correct?"
pattern = r"\?"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

In [None]:
# Exercise 2
text = "5 * 3 = 15"
pattern = r"\*"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

In [None]:
# Exercise 3
text = "Array[0]"
pattern = r"\["
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

## 9. Lookahead and Lookbehind

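Lookahead and lookbehind assert that a pattern is followed or preceded by another pattern, without including that other pattern in the match. In Python's `re` module, a lookbehind must match text of a fixed length.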

- Positive lookahead `(?=...)`
- Negative lookahead `(?!...)`
- Positive lookbehind `(?<=...)`
- Negative lookbehind `(?<!...)`
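
The lookbehind forms are often used to pick out numbers depending on what comes before them. The sketch below assumes a made-up sentence in which only one number is a price.

```python
import re

text = "cost $40, 30 items"

# Positive lookbehind: digits that directly follow a dollar sign.
prices = re.findall(r"(?<=\$)\d+", text)

# Negative lookbehind: whole numbers that do not follow a dollar sign.
# The `\b` stops the match from starting in the middle of "40".
counts = re.findall(r"(?<!\$)\b\d+", text)

print(prices, counts)
```

In both cases the `$` itself is never part of the returned match.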

In [None]:
# Example
text = "foo123"
pattern = r"foo(?=123)"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

### Interactive Exercises

1. Match `hello` only if it is followed by `world` in the text `hello world`.
2. Match `foo` only if it is not followed by `bar` in the text `foobar`.
3. Match `123` only if it is preceded by `abc` in the text `abc123`.

In [None]:
# Exercise 1
text = "hello world"
pattern = r"hello(?= world)"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

In [None]:
# Exercise 2
text = "foobar"
pattern = r"foo(?!bar)"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

In [None]:
# Exercise 3
text = "abc123"
pattern = r"(?<=abc)123"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

## 10. Non-Capturing Groups

Non-capturing groups are used to group patterns without capturing them. They are defined using `(?:...)`.

### Example
To match `cat` or `dog` without capturing the group, you would use the pattern `(?:cat|dog)`.

In [None]:
# Example
text = "I have a cat and a dog"
pattern = r"(?:cat|dog)"
matches = re.findall(pattern, text)
print(matches)

### Interactive Exercises

1. Match `apple` or `orange` without capturing the group in the text `I like apple and orange`.
2. Match `red` or `blue` without capturing the group in the text `red, green, blue`.
3. Match `cat`, `dog`, or `bird` without capturing the group in the text `I have a cat, a dog, and a bird`.

In [None]:
# Exercise 1
text = "I like apple and orange"
pattern = r"(?:apple|orange)"
matches = re.findall(pattern, text)
print(matches)

In [None]:
# Exercise 2
text = "red, green, blue"
pattern = r"(?:red|blue)"
matches = re.findall(pattern, text)
print(matches)

In [None]:
# Exercise 3
text = "I have a cat, a dog, and a bird"
pattern = r"(?:cat|dog|bird)"
matches = re.findall(pattern, text)
print(matches)

## 11. Flags

Flags are used to modify the behavior of the regular expression. Some common flags include:

- `re.IGNORECASE` (or `re.I`): Ignore case
- `re.MULTILINE` (or `re.M`): Multi-line matching
- `re.DOTALL` (or `re.S`): Dot matches all characters, including newline
- `re.VERBOSE` (or `re.X`): Allow verbose regexps
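
The effect of `re.MULTILINE` only shows up when the text contains more than one line, and flags can be combined with `|`. A short sketch with invented sample text:

```python
import re

text = "first line\nsecond line"

# Without the flag, `^` matches only at the start of the whole string.
default = re.findall(r"^\w+", text)

# With re.MULTILINE, `^` also matches right after each newline.
multiline = re.findall(r"^\w+", text, re.MULTILINE)

# Flags are combined with the bitwise OR operator.
combined = re.findall(r"^second", "first\nSECOND", re.IGNORECASE | re.MULTILINE)

print(default, multiline, combined)
```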

In [None]:
# Example
text = "Hello World"
pattern = r"hello"
match = re.search(pattern, text, re.IGNORECASE)
if match:
    print("Match found!")
else:
    print("No match.")

### Interactive Exercises

1. Match `hello` in the text `Hello World` ignoring case.
2. Match `^start` in the text `start here` with multi-line matching.
3. Match `.` in the text `line1\nline2` including newline characters.

In [None]:
# Exercise 1
text = "Hello World"
pattern = r"hello"
match = re.search(pattern, text, re.IGNORECASE)
if match:
    print("Match found!")
else:
    print("No match.")

In [None]:
# Exercise 2
text = "start here"
pattern = r"^start"
match = re.search(pattern, text, re.MULTILINE)
if match:
    print("Match found!")
else:
    print("No match.")

In [None]:
# Exercise 3
text = "line1\nline2"
pattern = r"."
matches = re.findall(pattern, text, re.DOTALL)
print(matches)

## 12. Greedy vs. Non-Greedy Matching

By default, quantifiers are greedy, meaning they match as much as possible. Non-greedy (or lazy) matching can be achieved by appending a `?` to the quantifier.
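
To make the difference concrete, this sketch runs the greedy and the lazy form of the same pattern on one string.

```python
import re

text = "<div>content</div>"

# Greedy: `.*` runs to the last `>` it can find.
greedy = re.findall(r"<.*>", text)

# Lazy: `.*?` stops at the first `>` that completes a match.
lazy = re.findall(r"<.*?>", text)

print(greedy)
print(lazy)
```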

### Example
To match the smallest possible string between `<` and `>`, you would use the pattern `<.*?>`.

In [None]:
# Example
text = "<div>content</div>"
pattern = r"<.*?>"
matches = re.findall(pattern, text)
print(matches)

### Interactive Exercises

1. Match the smallest string between `<` and `>` in the text `<div>content</div>`.
2. Match the smallest string between `"` in the text `"hello" world "python"`.
3. Match the smallest string between `{` and `}` in the text `{key:value}`.

In [None]:
# Exercise 1
text = "<div>content</div>"
pattern = r"<.*?>"
matches = re.findall(pattern, text)
print(matches)

In [None]:
# Exercise 2
text = "\"hello\" world \"python\""
pattern = r"\".*?\""
matches = re.findall(pattern, text)
print(matches)

In [None]:
# Exercise 3
text = "{key:value}"
pattern = r"\{.*?\}"
matches = re.findall(pattern, text)
print(matches)

## 13. Word Boundaries

A word boundary, written `\b`, matches the empty position between a word character (`\w`) and a non-word character (`\W`), or between a word character and the start or end of the string.

### Example
To match the word `cat` as a whole word, you would use the pattern `\bcat\b`.

In [None]:
# Example
text = "I have a cat."
pattern = r"\bcat\b"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

### Interactive Exercises

1. Match the word `dog` as a whole word in the text `I have a dog.`.
2. Match the word `hello` as a whole word in the text `hello world!`.
3. Match the word `python` as a whole word in the text `I love python programming`.

In [None]:
# Exercise 1
text = "I have a dog."
pattern = r"\bdog\b"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

In [None]:
# Exercise 2
text = "hello world!"
pattern = r"\bhello\b"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

In [None]:
# Exercise 3
text = "I love python programming"
pattern = r"\bpython\b"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

## 14. Start and End of String

The `^` symbol matches the start of a string, and the `$` symbol matches the end of a string.

### Example
To match the string `hello` at the start of a text, you would use the pattern `^hello`.

In [None]:
# Example
text = "hello world"
pattern = r"^hello"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

### Interactive Exercises

1. Match the string `start` at the beginning of the text `start here`.
2. Match the string `end` at the end of the text `this is the end`.
3. Match the string `Python` at the beginning of the text `Python is great`.

In [None]:
# Exercise 1
text = "start here"
pattern = r"^start"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

In [None]:
# Exercise 2
text = "this is the end"
pattern = r"end$"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

In [None]:
# Exercise 3
text = "Python is great"
pattern = r"^Python"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

## 15. Matching Digits and Non-Digits

The `\d` symbol matches any digit, while `\D` matches any non-digit.

### Example
To match any digit in a text, you would use the pattern `\d`.

In [None]:
# Example
text = "There are 2 apples"
pattern = r"\d"
matches = re.findall(pattern, text)
print(matches)

### Interactive Exercises

1. Match any digit in the text `123abc`.
2. Match any non-digit in the text `123abc`.
3. Match any digit in the text `Price: $100.00`.

In [None]:
# Exercise 1
text = "123abc"
pattern = r"\d"
matches = re.findall(pattern, text)
print(matches)

In [None]:
# Exercise 2
text = "123abc"
pattern = r"\D"
matches = re.findall(pattern, text)
print(matches)

In [None]:
# Exercise 3
text = "Price: $100.00"
pattern = r"\d"
matches = re.findall(pattern, text)
print(matches)

## 16. Matching Whitespace and Non-Whitespace

The `\s` symbol matches any whitespace character, while `\S` matches any non-whitespace character.

### Example
To match any whitespace character in a text, you would use the pattern `\s`.

In [None]:
# Example
text = "Hello World"
pattern = r"\s"
matches = re.findall(pattern, text)
print(matches)

### Interactive Exercises

1. Match any whitespace character in the text `Hello World`.
2. Match any non-whitespace character in the text `Hello World`.
3. Match any whitespace character in the text `a\tb\nc`.

In [None]:
# Exercise 1
text = "Hello World"
pattern = r"\s"
matches = re.findall(pattern, text)
print(matches)

In [None]:
# Exercise 2
text = "Hello World"
pattern = r"\S"
matches = re.findall(pattern, text)
print(matches)

In [None]:
# Exercise 3
text = "a\tb\nc"
pattern = r"\s"
matches = re.findall(pattern, text)
print(matches)

## 17. Matching Word Characters and Non-Word Characters

The `\w` symbol matches any word character (alphanumeric plus underscore), while `\W` matches any non-word character.

### Example
To match any word character in a text, you would use the pattern `\w`.

In [None]:
# Example
text = "Hello_World123"
pattern = r"\w"
matches = re.findall(pattern, text)
print(matches)

### Interactive Exercises

1. Match any word character in the text `Hello_World123`.
2. Match any non-word character in the text `Hello, World!`.
3. Match any word character in the text `a_b-c`.

In [None]:
# Exercise 1
text = "Hello_World123"
pattern = r"\w"
matches = re.findall(pattern, text)
print(matches)

In [None]:
# Exercise 2
text = "Hello, World!"
pattern = r"\W"
matches = re.findall(pattern, text)
print(matches)

In [None]:
# Exercise 3
text = "a_b-c"
pattern = r"\w"
matches = re.findall(pattern, text)
print(matches)

## 18. Backreferences

Backreferences allow you to reuse a previously captured group. They are represented by `\1`, `\2`, etc., where the number corresponds to the group number.
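
Backreferences can also appear in the replacement string of `re.sub`. As an illustrative sketch (the sentence is made up), the following collapses accidentally doubled words:

```python
import re

text = "this is is a test test"

# `\1` in the pattern requires the same word again;
# `\1` in the replacement keeps a single copy of it.
deduplicated = re.sub(r"\b(\w+) \1\b", r"\1", text)

print(deduplicated)
```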

### Example
To match a pair of repeated words, you would use the pattern `\b(\w+) \1\b`.

In [None]:
# Example
text = "hello hello"
pattern = r"\b(\w+) \1\b"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

### Interactive Exercises

1. Match a pair of repeated words in the text `test test`.
2. Match a pair of repeated digits in the text `123 123`.
3. Match a pair of repeated characters in the text `aa bb cc`.

In [None]:
# Exercise 1
text = "test test"
pattern = r"\b(\w+) \1\b"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

In [None]:
# Exercise 2
text = "123 123"
pattern = r"\b(\d+) \1\b"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

In [None]:
# Exercise 3
text = "aa bb cc"
pattern = r"(\w)\1"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

## 19. Named Groups

Named groups allow you to assign a name to a capturing group. They are defined using `(?P<name>...)`.
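
Besides `groupdict()`, a named group can be read individually by name. A minimal sketch with an invented `key=value` string:

```python
import re

match = re.search(r"(?P<key>\w+)=(?P<value>\w+)", "mode=fast")

key = match.group("key")  # access by name through group()
value = match["value"]    # subscript access, available since Python 3.6
first = match.group(1)    # named groups keep their numeric index too

print(key, value, first)
```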

### Example
To match and capture a date in the format `YYYY-MM-DD`, you would use the pattern `(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})`.

In [None]:
# Example
text = "2024-09-25"
pattern = r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})"
match = re.search(pattern, text)
if match:
    print(match.groupdict())
else:
    print("No match.")

### Interactive Exercises

1. Capture the year, month, and day in the date `2024-09-25`.
2. Capture the first name and last name in the text `John Doe`.
3. Capture the area code and phone number in the text `(123) 456-7890`.

In [None]:
# Exercise 1
text = "2024-09-25"
pattern = r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})"
match = re.search(pattern, text)
if match:
    print(match.groupdict())
else:
    print("No match.")

In [None]:
# Exercise 2
text = "John Doe"
pattern = r"(?P<first_name>\w+) (?P<last_name>\w+)"
match = re.search(pattern, text)
if match:
    print(match.groupdict())
else:
    print("No match.")

In [None]:
# Exercise 3
text = "(123) 456-7890"
pattern = r"\((?P<area_code>\d{3})\) (?P<phone_number>\d{3}-\d{4})"
match = re.search(pattern, text)
if match:
    print(match.groupdict())
else:
    print("No match.")

## 20. Conditional Expressions

Conditional expressions allow you to apply different patterns based on the presence or absence of a capturing group. They are defined using `(?(id/name)yes-pattern|no-pattern)`.
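
A common use is requiring a closing delimiter only when the opening one was present. The sketch below follows the pattern shown in Python's `re` documentation. The `is_valid` helper and the addresses are hypothetical names chosen for this illustration.

```python
import re

# If group 1 (the optional "<") matched, require ">"; otherwise require end of string.
pattern = re.compile(r"(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)")

def is_valid(candidate):
    """Return True when the brackets around the address are balanced."""
    return pattern.match(candidate) is not None

results = {s: is_valid(s) for s in [
    "<user@host.com>",
    "user@host.com",
    "<user@host.com",
    "user@host.com>",
]}
print(results)
```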

### Example
To match `abc` if the string starts with `a`, otherwise match `xyz`, you would use the pattern `^(a)?(?(1)bc|xyz)`.

In [None]:
# Example
text1 = "abc"
text2 = "xyz"
pattern = r"^(a)?(?(1)bc|xyz)"
match1 = re.search(pattern, text1)
match2 = re.search(pattern, text2)
print("Match1 found!" if match1 else "No match1.")
print("Match2 found!" if match2 else "No match2.")

### Interactive Exercises

1. Match `abc` if the string starts with `a`, otherwise match `def` in the text `abc`.
2. Match `123` if the string starts with `1`, otherwise match `456` in the text `123`.
3. Match `foo` if the string starts with `f`, otherwise match `bar` in the text `foo`.

In [None]:
# Exercise 1
text = "abc"
pattern = r"^(a)?(?(1)bc|def)"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

In [None]:
# Exercise 2
text = "123"
pattern = r"^(1)?(?(1)23|456)"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

In [None]:
# Exercise 3
text = "foo"
pattern = r"^(f)?(?(1)oo|bar)"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

## 21. Unicode Matching

Unicode characters can be matched by writing them directly in the pattern or by using the `\u` escape sequence followed by the four hexadecimal digits of the character's code point. Python supports this escape in `str` patterns.
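
Unicode also affects the shorthand classes: for `str` patterns in Python 3, `\w` matches letters from any script unless `re.ASCII` is set. A small sketch, with the accented characters written as escapes in the sample string to avoid encoding ambiguity:

```python
import re

text = "caf\u00e9 \u00fcber"  # "café über"

# Default (Unicode) mode: accented letters count as word characters.
unicode_words = re.findall(r"\w+", text)

# re.ASCII restricts `\w` to [a-zA-Z0-9_], so the accented letters split the words.
ascii_words = re.findall(r"\w+", text, re.ASCII)

print(unicode_words, ascii_words)
```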

### Example
To match the Unicode character `é`, you would use the pattern `\u00E9`.

In [None]:
# Example
text = "café"
pattern = r"caf\u00E9"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

### Interactive Exercises

1. Match the Unicode character `é` in the text `café`.
2. Match the Unicode character `ñ` in the text `piñata`.
3. Match the Unicode character `ü` in the text `über`.

In [None]:
# Exercise 1
text = "café"
pattern = r"\u00E9"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

In [None]:
# Exercise 2
text = "piñata"
pattern = r"\u00F1"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

In [None]:
# Exercise 3
text = "über"
pattern = r"\u00FC"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

## 22. Dotall Mode

Dotall mode allows the dot `.` to match all characters, including newline characters. It is enabled using the `re.DOTALL` flag or the inline modifier `(?s)`.

### Example
To match any character, including newlines, you would use the pattern `(?s).`.

In [None]:
# Example
text = "line1\nline2"
pattern = r"(?s)."
matches = re.findall(pattern, text)
print(matches)

### Interactive Exercises

1. Match any character, including newlines, in the text `line1\nline2`.
2. Match any character, including newlines, in the text `abc\ndef`.
3. Match any character, including newlines, in the text `123\n456`.

In [None]:
# Exercise 1
text = "line1\nline2"
pattern = r"(?s)."
matches = re.findall(pattern, text)
print(matches)

In [None]:
# Exercise 2
text = "abc\ndef"
pattern = r"(?s)."
matches = re.findall(pattern, text)
print(matches)

In [None]:
# Exercise 3
text = "123\n456"
pattern = r"(?s)."
matches = re.findall(pattern, text)
print(matches)

## 23. Verbose Mode

Verbose mode allows you to write regular expressions with whitespace and comments for better readability. It is enabled using the `re.VERBOSE` flag or the inline modifier `(?x)`.
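
Verbose mode pays off most when the pattern is spread over several lines with a comment for each part. This sketch combines it with named groups. The layout is one possible style, not a required one.

```python
import re

date_re = re.compile(
    r"""
    (?P<year>\d{4})   # four-digit year
    -                 # literal separator
    (?P<month>\d{2})  # two-digit month
    -
    (?P<day>\d{2})    # two-digit day
    """,
    re.VERBOSE,
)

parts = date_re.search("Released on 2024-09-25.").groupdict()
print(parts)
```

In this mode a literal space or `#` has to be escaped or placed inside a character class.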

### Example
To match a date in the format `YYYY-MM-DD` with verbose mode, you would use the pattern `(?x) \d{4} - \d{2} - \d{2}`.

In [None]:
# Example
text = "2024-09-25"
pattern = r"(?x) \d{4} - \d{2} - \d{2}"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

### Interactive Exercises

1. Match a date in the format `YYYY-MM-DD` with verbose mode in the text `2024-09-25`.
2. Match a phone number in the format `(123) 456-7890` with verbose mode in the text `(123) 456-7890`.
3. Match an email address with verbose mode in the text `user@example.com`.

In [None]:
# Exercise 1
text = "2024-09-25"
pattern = r"(?x) \d{4} - \d{2} - \d{2}"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

In [None]:
# Exercise 2
text = "(123) 456-7890"
pattern = r"(?x) \( \d{3} \) \s \d{3} - \d{4}"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

In [None]:
# Exercise 3
text = "user@example.com"
pattern = r"(?x) \b [\w.%+-]+ @ [\w.-]+ \.[a-zA-Z]{2,6} \b"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

## 24. Inline Modifiers

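Inline modifiers set flags from inside the pattern itself. Placed at the start of the pattern, `(?i)`, `(?m)`, `(?s)`, and `(?x)` apply to the whole expression. To enable a flag for only part of a pattern, use the scoped form, such as `(?i:...)` (Python 3.6 and later).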
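
A scoped modifier such as `(?i:...)` limits a flag to one group, which Python supports from version 3.6. The sketch below, with invented sample strings, makes only the first word case-insensitive.

```python
import re

pattern = r"(?i:hello) World"

# "HELLO" is accepted because the flag covers that group...
scoped_hit = re.search(pattern, "HELLO World") is not None

# ...but " World" is outside the group and stays case-sensitive.
scoped_miss = re.search(pattern, "HELLO world") is not None

print(scoped_hit, scoped_miss)
```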

### Example
To match `hello` case-insensitively, you would use the pattern `(?i)hello`.

In [None]:
# Example
text = "Hello World"
pattern = r"(?i)hello"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

### Interactive Exercises

1. Match `hello` case-insensitively in the text `Hello World`.
2. Match `start` at the beginning of the text `start here` with multi-line matching.
3. Match any character, including newlines, in the text `line1\nline2`.

In [None]:
# Exercise 1
text = "Hello World"
pattern = r"(?i)hello"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

In [None]:
# Exercise 2
text = "start here"
pattern = r"(?m)^start"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

In [None]:
# Exercise 3
text = "line1\nline2"
pattern = r"(?s)."
matches = re.findall(pattern, text)
print(matches)

## 25. Substitution

Substitution allows you to replace parts of a string that match a pattern. The `re.sub` function is used for this purpose.
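
The replacement passed to `re.sub` can refer to captured groups, or it can be a function that receives each match. A minimal sketch of both, using made-up sample text:

```python
import re

# Group references in the replacement reorder the captured parts.
reordered = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", "2024-09-25")

# A callable receives the match object and returns the replacement text.
doubled = re.sub(r"\d+", lambda m: str(int(m.group()) * 2), "3 apples and 5 pears")

print(reordered)
print(doubled)
```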

### Example
To replace all digits in a text with `#`, you would use the pattern `\d`.

In [None]:
# Example
text = "My phone number is 123-456-7890"
pattern = r"\d"
replacement = "#"
result = re.sub(pattern, replacement, text)
print(result)

### Interactive Exercises

1. Replace all digits with `#` in the text `My phone number is 123-456-7890`.
2. Replace all whitespace characters with `_` in the text `Hello World`.
3. Replace all vowels with `*` in the text `Regular Expressions`.

In [None]:
# Exercise 1
text = "My phone number is 123-456-7890"
pattern = r"\d"
replacement = "#"
result = re.sub(pattern, replacement, text)
print(result)

In [None]:
# Exercise 2
text = "Hello World"
pattern = r"\s"
replacement = "_"
result = re.sub(pattern, replacement, text)
print(result)

In [None]:
# Exercise 3
text = "Regular Expressions"
pattern = r"[aeiouAEIOU]"
replacement = "*"
result = re.sub(pattern, replacement, text)
print(result)

## 26. Splitting Strings

The `re.split` function allows you to split a string by a pattern.
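
Two details of `re.split` tend to surprise people: adjacent separators produce empty strings, and `maxsplit` limits the number of splits. A short sketch with invented inputs:

```python
import re

# Each single whitespace character is a separator, so two spaces yield an empty item.
single = re.split(r"\s", "a  b")

# `\s+` treats a run of whitespace as one separator.
collapsed = re.split(r"\s+", "a  b")

# A character class handles mixed delimiters; `\s*` swallows trailing spaces.
mixed = re.split(r"[,;]\s*", "apple, orange;banana")

# maxsplit stops after the given number of splits.
limited = re.split(r",", "a,b,c", maxsplit=1)

print(single, collapsed, mixed, limited)
```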

### Example
To split a text by whitespace characters, you would use the pattern `\s`.

In [None]:
# Example
text = "Split this text by whitespace"
pattern = r"\s"
result = re.split(pattern, text)
print(result)

### Interactive Exercises

1. Split the text `Split this text by whitespace` by whitespace characters.
2. Split the text `123-456-7890` by hyphens.
3. Split the text `apple,orange,banana` by commas.

In [None]:
# Exercise 1
text = "Split this text by whitespace"
pattern = r"\s"
result = re.split(pattern, text)
print(result)

In [None]:
# Exercise 2
text = "123-456-7890"
pattern = r"-"
result = re.split(pattern, text)
print(result)

In [None]:
# Exercise 3
text = "apple,orange,banana"
pattern = r","
result = re.split(pattern, text)
print(result)

## 27. Finding All Matches

The `re.findall` function returns all non-overlapping matches of a pattern in a string.
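
When the pattern contains capturing groups, `re.findall` returns the groups rather than the whole match. `re.finditer` yields match objects when positions are needed. A sketch with an invented input string:

```python
import re

text = "a=1, b=22"

# Two groups in the pattern: each result is a tuple of the captured parts.
pairs = re.findall(r"(\w+)=(\d+)", text)

# finditer gives match objects, so start/end offsets are available.
spans = [m.span() for m in re.finditer(r"\d+", text)]

print(pairs)
print(spans)
```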

### Example
To find all digits in a text, you would use the pattern `\d`.

In [None]:
# Example
text = "There are 2 apples and 3 oranges"
pattern = r"\d"
matches = re.findall(pattern, text)
print(matches)

### Interactive Exercises

1. Find all digits in the text `There are 2 apples and 3 oranges`.
2. Find all words in the text `Find all words in this text`.
3. Find all whitespace characters in the text `Whitespace characters`.

In [None]:
# Exercise 1
text = "There are 2 apples and 3 oranges"
pattern = r"\d"
matches = re.findall(pattern, text)
print(matches)

In [None]:
# Exercise 2
text = "Find all words in this text"
pattern = r"\w+"
matches = re.findall(pattern, text)
print(matches)

In [None]:
# Exercise 3
text = "Whitespace characters"
pattern = r"\s"
matches = re.findall(pattern, text)
print(matches)

## 28. Compiling Regular Expressions

The `re.compile` function allows you to compile a regular expression pattern into a regular expression object, which can be used for matching.
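
Compiling is mainly useful when the same pattern is applied many times. In this sketch one compiled object is reused across several made-up lines.

```python
import re

word_re = re.compile(r"\w+")  # compiled once, reused below

lines = ["one two", "three", ""]

# The compiled object exposes the same methods as the module-level functions.
word_counts = [len(word_re.findall(line)) for line in lines]

print(word_counts)
print(word_re.pattern)  # the original pattern string is kept on the object
```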

### Example
To compile a pattern for matching digits, you would use the pattern `\d`.

In [None]:
# Example
text = "There are 2 apples and 3 oranges"
pattern = re.compile(r"\d")
matches = pattern.findall(text)
print(matches)

### Interactive Exercises

1. Compile a pattern for matching digits and find all digits in the text `There are 2 apples and 3 oranges`.
2. Compile a pattern for matching words and find all words in the text `Find all words in this text`.
3. Compile a pattern for matching whitespace characters and find all whitespace characters in the text `Whitespace characters`.

In [None]:
# Exercise 1
text = "There are 2 apples and 3 oranges"
pattern = re.compile(r"\d")
matches = pattern.findall(text)
print(matches)

In [None]:
# Exercise 2
text = "Find all words in this text"
pattern = re.compile(r"\w+")
matches = pattern.findall(text)
print(matches)

In [None]:
# Exercise 3
text = "Whitespace characters"
pattern = re.compile(r"\s")
matches = pattern.findall(text)
print(matches)

## 29. Performance Considerations

Regular expressions can be computationally expensive, especially with complex patterns and large input strings. Here are some tips for improving performance:

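- Use raw strings (`r"pattern"`) to keep patterns readable. This does not change matching speed, but it prevents escaping mistakes that are hard to debug.
- Compile regular expressions that you reuse with `re.compile`, so the pattern object is created once and reused.
- Limit backtracking by avoiding nested quantifiers such as `(a+)+`. Atomic groups `(?>...)` and possessive quantifiers are available in Python 3.11 and later.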
- Use specific character classes instead of `.` when possible.
- Test and optimize your regular expressions using tools like regex101 or regexr.
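
The tip about specific character classes can be sketched as follows. The sample text and the names `greedy` and `specific` are assumptions chosen for illustration. A greedy `.*` runs to the end of the line and then backtracks, while a negated class stops at the first closing quote, which here also changes the result.

```python
import re

text = 'name="regex" lang="python"'

# Greedy dot: consumes to the end of the line, then backtracks to the
# last double quote, so both attributes end up in one match.
greedy = re.compile(r'"(.*)"')

# Negated class: cannot run past a double quote, so each quoted value
# is matched on its own without backtracking over the rest of the line.
specific = re.compile(r'"([^"]*)"')

print(greedy.findall(text))    # ['regex" lang="python']
print(specific.findall(text))  # ['regex', 'python']
```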

### Interactive Exercises

1. Compile a pattern for matching digits and find all digits in the text `There are 2 apples and 3 oranges`.
2. Use a specific character class to match vowels in the text `Regular Expressions`.
3. Replace the pattern `.*` with an explicit negated character class that matches any character except a newline, and apply it to the text `Hello World`.

In [None]:
# Exercise 1
text = "There are 2 apples and 3 oranges"
pattern = re.compile(r"\d")
matches = pattern.findall(text)
print(matches)

In [None]:
# Exercise 2
text = "Regular Expressions"
pattern = re.compile(r"[aeiouAEIOU]")
matches = pattern.findall(text)
print(matches)

In [None]:
# Exercise 3
text = "Hello World"
pattern = re.compile(r"[^\n]*")
matches = pattern.findall(text)
print(matches)

## 30. Common Pitfalls

Here are some common pitfalls to avoid when working with regular expressions:

- Forgetting to escape special characters.
- Using greedy quantifiers when non-greedy ones are needed.
- Not using raw strings for regular expressions.
- Overusing backreferences, which can lead to performance issues.
- Not testing regular expressions with various input cases.
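
Two of these pitfalls fail silently, which makes them easy to miss. The sketch below shows an unescaped metacharacter and a missing raw string; the sample strings and the name `literal` are illustrative assumptions.

```python
import re

# Pitfall: unescaped metacharacters. "+" means "one or more", so the
# pattern a+b does not match the literal text "a+b".
unescaped = re.search(r"a+b", "a+b")
print(unescaped)  # None

# re.escape builds a pattern that matches the given text literally.
literal = re.escape("a+b")
print(re.search(literal, "a+b").group())  # a+b

# Pitfall: missing raw string. In a normal string literal "\b" is a
# backspace character, so the word-boundary pattern changes meaning.
without_raw = re.findall("\bcat\b", "cat concat")
with_raw = re.findall(r"\bcat\b", "cat concat")
print(without_raw)  # []
print(with_raw)     # ['cat']
```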

### Interactive Exercises

1. Correct the pattern to match the string `a+b` literally in the text `a+b`.
2. Use a non-greedy quantifier to match the smallest string between `<` and `>` in the text `<div>content</div>`.
3. Use a raw string to match the pattern `\d+` in the text `123`.

In [None]:
# Exercise 1
text = "a+b"
pattern = r"a\+b"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

In [None]:
# Exercise 2
text = "<div>content</div>"
pattern = r"<.*?>"
matches = re.findall(pattern, text)
print(matches)

In [None]:
# Exercise 3
text = "123"
pattern = r"\d+"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

## 31. Practical Examples

Let's look at some practical examples of regular expressions in real-world scenarios.

### Example 1: Email Validation
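To validate an email address, you can use the pattern `^[\w.%+-]+@[\w.-]+\.[a-zA-Z]{2,6}$`. This is a simplified check, not a complete validator: for example, `{2,6}` rejects top-level domains longer than six letters.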

In [None]:
# Example 1
text = "user@example.com"
pattern = r"^[\w.%+-]+@[\w.-]+\.[a-zA-Z]{2,6}$"
match = re.match(pattern, text)
if match:
    print("Valid email address.")
else:
    print("Invalid email address.")
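
One detail is worth checking in code: in Python, `$` also matches just before a trailing newline, so `re.match` with `^...$` accepts `"user@example.com\n"`. The sketch below uses `fullmatch` to require that the whole string matches. The names `EMAIL` and `is_valid_email` are hypothetical helpers, and the pattern is still the tutorial's simplified one.

```python
import re

# Same pattern as above, without ^ and $ because fullmatch anchors both ends.
EMAIL = re.compile(r"[\w.%+-]+@[\w.-]+\.[a-zA-Z]{2,6}")

def is_valid_email(candidate):
    """Return True if the entire string matches the simplified pattern."""
    return EMAIL.fullmatch(candidate) is not None

candidates = [
    "user@example.com",         # accepted
    "user@example.com\n",       # rejected: trailing newline
    "user@@example.com",        # rejected: two @ signs
    "user@example.technology",  # rejected: TLD longer than six letters
]
results = {c: is_valid_email(c) for c in candidates}
print(results)

# For comparison, match with ^...$ lets the trailing newline through.
loose = re.match(r"^[\w.%+-]+@[\w.-]+\.[a-zA-Z]{2,6}$", "user@example.com\n")
print(loose is not None)  # True
```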

### Example 2: Phone Number Extraction
To extract phone numbers from a text, you can use the pattern `\(\d{3}\) \d{3}-\d{4}`.

In [None]:
# Example 2
text = "Contact: (123) 456-7890 or (987) 654-3210"
pattern = r"\(\d{3}\) \d{3}-\d{4}"
matches = re.findall(pattern, text)
print(matches)

### Example 3: URL Extraction
To extract URLs from a text, you can use the pattern `https?://[\w.-]+`. This pattern matches only the scheme and host name, not paths or query strings.

In [None]:
# Example 3
text = "Visit https://example.com or http://test.com"
pattern = r"https?://[\w.-]+"
matches = re.findall(pattern, text)
print(matches)
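
When URLs carry paths or are followed by sentence punctuation, the host-only pattern behaves in ways that may surprise. The following is a rough sketch of one workaround, not a full URL parser; the sample text and the set of trimmed punctuation characters are assumptions.

```python
import re

text = "Docs: https://example.com/guide?page=2. Mirror: http://test.com."

# Tutorial pattern: stops at "/" and swallows a trailing ".".
host_only = re.findall(r"https?://[\w.-]+", text)
print(host_only)  # ['https://example.com', 'http://test.com.']

# Sketch: take any run of non-whitespace, then trim trailing punctuation.
urls = [u.rstrip(".,;:!?)") for u in re.findall(r"https?://\S+", text)]
print(urls)  # ['https://example.com/guide?page=2', 'http://test.com']
```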

### Interactive Exercises

1. Validate the email address `test@example.com`.
2. Extract phone numbers from the text `Call me at (555) 123-4567 or (555) 765-4321`.
3. Extract URLs from the text `Check out https://example.com and http://example.org`.

In [None]:
# Exercise 1
text = "test@example.com"
pattern = r"^[\w.%+-]+@[\w.-]+\.[a-zA-Z]{2,6}$"
match = re.match(pattern, text)
if match:
    print("Valid email address.")
else:
    print("Invalid email address.")

In [None]:
# Exercise 2
text = "Call me at (555) 123-4567 or (555) 765-4321"
pattern = r"\(\d{3}\) \d{3}-\d{4}"
matches = re.findall(pattern, text)
print(matches)

In [None]:
# Exercise 3
text = "Check out https://example.com and http://example.org"
pattern = r"https?://[\w.-]+"
matches = re.findall(pattern, text)
print(matches)

## 32. Regular Expressions in Python

Python's `re` module provides support for regular expressions. Here are some commonly used functions:

- `re.search(pattern, string, flags=0)`: Scans the string and returns a match object for the first match, or `None` if there is none.
- `re.match(pattern, string, flags=0)`: Matches the pattern at the beginning of the string.
- `re.findall(pattern, string, flags=0)`: Finds all non-overlapping matches of the pattern in the string.
- `re.sub(pattern, repl, string, count=0, flags=0)`: Replaces occurrences of the pattern with `repl` in the string.
- `re.split(pattern, string, maxsplit=0, flags=0)`: Splits the string by occurrences of the pattern.
- `re.compile(pattern, flags=0)`: Compiles a regular expression pattern into a regular expression object.
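
The differences between these functions are easiest to see side by side on one input. This is a small sketch; the sample `text` is an assumption chosen so that each call returns something distinct.

```python
import re

text = "order 66: 3 apples, 12 pears"

# match only succeeds at the start of the string; search scans all of it.
at_start = re.match(r"\d+", text)
first = re.search(r"\d+", text).group()
print(at_start)  # None
print(first)     # 66

# findall collects every non-overlapping match.
numbers = re.findall(r"\d+", text)
print(numbers)   # ['66', '3', '12']

# sub with count=1 replaces only the first match.
replaced = re.sub(r"\d+", "N", text, count=1)
print(replaced)  # order N: 3 apples, 12 pears

# split cuts the string at each match; maxsplit limits the number of cuts.
parts = re.split(r"[:,]\s*", text)
head_tail = re.split(r"[:,]\s*", text, maxsplit=1)
print(parts)      # ['order 66', '3 apples', '12 pears']
print(head_tail)  # ['order 66', '3 apples, 12 pears']
```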

### Example
To find all words in a text, you can use the `re.findall` function with the pattern `\w+`.

In [None]:
# Example
import re
text = "Find all words in this text"
pattern = r"\w+"
matches = re.findall(pattern, text)
print(matches)

### Interactive Exercises

1. Use `re.search` to find the first occurrence of `Python` in the text `I love Python programming`.
2. Use `re.match` to check if the text `Hello World` starts with `Hello`.
3. Use `re.sub` to replace all digits with `#` in the text `My phone number is 123-456-7890`.

In [None]:
# Exercise 1
text = "I love Python programming"
pattern = r"Python"
match = re.search(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

In [None]:
# Exercise 2
text = "Hello World"
pattern = r"Hello"
match = re.match(pattern, text)
if match:
    print("Match found!")
else:
    print("No match.")

In [None]:
# Exercise 3
text = "My phone number is 123-456-7890"
pattern = r"\d"
replacement = "#"
result = re.sub(pattern, replacement, text)
print(result)

## 33. Regular Expressions in JavaScript

JavaScript provides support for regular expressions through the `RegExp` object and methods of the `String` object:

- `RegExp(pattern, flags)`: Creates a regular expression object.
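- `string.match(pattern)`: Returns the first match, or an array of all matches if the pattern has the `g` flag; returns `null` when nothing matches.
- `string.search(pattern)`: Returns the index of the first match, or `-1` if there is none.
- `string.replace(pattern, replacement)`: Replaces the first match with `replacement`, or every match if the pattern has the `g` flag.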
- `string.split(pattern)`: Splits a string by occurrences of a pattern.

### Example
To find all words in a text, you can use the `match` method with the pattern `/\w+/g`.

In [None]:
// Example (JavaScript)
text = "Find all words in this text";
pattern = /\w+/g;
matches = text.match(pattern);
console.log(matches);

### Interactive Exercises

1. Use `search` to find the first occurrence of `JavaScript` in the text `I love JavaScript programming`.
2. Use `match` to find all digits in the text `My phone number is 123-456-7890`.
3. Use `replace` to replace all whitespace characters with `_` in the text `Hello World`.

In [None]:
// Exercise 1 (JavaScript)
text = "I love JavaScript programming";
pattern = /JavaScript/;
index = text.search(pattern);
console.log(index !== -1 ? "Match found!" : "No match.");

In [None]:
// Exercise 2 (JavaScript)
text = "My phone number is 123-456-7890";
pattern = /\d+/g;
matches = text.match(pattern);
console.log(matches);

In [None]:
// Exercise 3 (JavaScript)
text = "Hello World";
pattern = /\s/g;
result = text.replace(pattern, "_");
console.log(result);

## 34. Regular Expressions in Java

Java provides support for regular expressions through the `java.util.regex` package:

- `Pattern.compile(pattern)`: Compiles a regular expression pattern.
- `Matcher matcher = pattern.matcher(input)`: Creates a matcher for the input string.
- `matcher.find()`: Finds the next match.
- `matcher.group()`: Returns the matched subsequence.
- `input.replaceAll(regex, replacement)`: A `String` method that replaces every match of the regex with `replacement`.

### Example
To find all words in a text, you can use the `Pattern` and `Matcher` classes with the pattern `\w+`.

In [None]:
// Example (Java)
import java.util.regex.*;
String text = "Find all words in this text";
Pattern pattern = Pattern.compile("\\w+");
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
    System.out.println(matcher.group());
}

### Interactive Exercises

1. Use `Pattern` and `Matcher` to find the first occurrence of `Java` in the text `I love Java programming`.
2. Use `Pattern` and `Matcher` to find all digits in the text `My phone number is 123-456-7890`.
3. Use `replaceAll` to replace all whitespace characters with `_` in the text `Hello World`.

In [None]:
// Exercise 1 (Java)
import java.util.regex.*;
String text = "I love Java programming";
Pattern pattern = Pattern.compile("Java");
Matcher matcher = pattern.matcher(text);
if (matcher.find()) {
    System.out.println("Match found!");
} else {
    System.out.println("No match.");
}

In [None]:
// Exercise 2 (Java)
import java.util.regex.*;
String text = "My phone number is 123-456-7890";
Pattern pattern = Pattern.compile("\\d+");
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
    System.out.println(matcher.group());
}

In [None]:
// Exercise 3 (Java)
String text = "Hello World";
String result = text.replaceAll("\\s", "_");
System.out.println(result);

## 35. Regular Expressions in PHP

PHP provides support for regular expressions through the `preg_` functions:

- `preg_match(pattern, subject)`: Searches the subject for the first match; returns `1` if found, `0` if not, or `false` on error.
- `preg_match_all(pattern, subject)`: Finds all matches of the pattern in the subject.
- `preg_replace(pattern, replacement, subject)`: Replaces occurrences of the pattern with `replacement` in the subject.
- `preg_split(pattern, subject)`: Splits the subject by occurrences of the pattern.

### Example
To find all words in a text, you can use the `preg_match_all` function with the pattern `/\w+/`.

In [None]:
# Example (PHP)
$text = "Find all words in this text";
$pattern = "/\\w+/";
preg_match_all($pattern, $text, $matches);
print_r($matches[0]);

### Interactive Exercises

1. Use `preg_match` to find the first occurrence of `PHP` in the text `I love PHP programming`.
2. Use `preg_match_all` to find all digits in the text `My phone number is 123-456-7890`.
3. Use `preg_replace` to replace all whitespace characters with `_` in the text `Hello World`.

In [None]:
# Exercise 1 (PHP)
$text = "I love PHP programming";
$pattern = "/PHP/";
if (preg_match($pattern, $text)) {
    echo "Match found!";
} else {
    echo "No match.";
}

In [None]:
# Exercise 2 (PHP)
$text = "My phone number is 123-456-7890";
$pattern = "/\\d+/";
preg_match_all($pattern, $text, $matches);
print_r($matches[0]);

In [None]:
# Exercise 3 (PHP)
$text = "Hello World";
$pattern = "/\\s/";
$replacement = "_";
$result = preg_replace($pattern, $replacement, $text);
echo $result;

## 36. Regular Expressions in Ruby

Ruby provides support for regular expressions through the `Regexp` class and methods of the `String` class:

- `Regexp.new(pattern)`: Creates a regular expression object.
- `string.match(pattern)`: Returns a `MatchData` object for the first match, or `nil` if there is none.
- `string.scan(pattern)`: Finds all non-overlapping matches of the pattern in the string.
- `string.gsub(pattern, replacement)`: Replaces occurrences of the pattern with `replacement` in the string.
- `string.split(pattern)`: Splits the string by occurrences of the pattern.

### Example
To find all words in a text, you can use the `scan` method with the pattern `/\w+/`.

In [None]:
# Example (Ruby)
text = "Find all words in this text"
pattern = /\w+/
matches = text.scan(pattern)
puts matches

### Interactive Exercises

1. Use `match` to find the first occurrence of `Ruby` in the text `I love Ruby programming`.
2. Use `scan` to find all digits in the text `My phone number is 123-456-7890`.
3. Use `gsub` to replace all whitespace characters with `_` in the text `Hello World`.

In [None]:
# Exercise 1 (Ruby)
text = "I love Ruby programming"
pattern = /Ruby/
match = text.match(pattern)
puts match ? "Match found!" : "No match."

In [None]:
# Exercise 2 (Ruby)
text = "My phone number is 123-456-7890"
pattern = /\d+/
matches = text.scan(pattern)
puts matches

In [None]:
# Exercise 3 (Ruby)
text = "Hello World"
pattern = /\s/
replacement = "_"
result = text.gsub(pattern, replacement)
puts result

## 37. Regular Expressions in Perl

Perl provides built-in support for regular expressions using the `=~` and `!~` operators:

- `=~`: Matches a string against a pattern.
- `!~`: Checks if a string does not match a pattern.
- `s/pattern/replacement/`: Replaces the first occurrence of the pattern with `replacement`; add the `g` modifier to replace every occurrence.
- `split(pattern, string)`: Splits the string by occurrences of the pattern.

### Example
To find all words in a text, you can use the pattern `\w+` with the `=~` operator and the `g` modifier in list context.

In [None]:
# Example (Perl)
my $text = "Find all words in this text";
my @matches = ($text =~ /\w+/g);
print "@matches\n";

### Interactive Exercises

1. Use `=~` to find the first occurrence of `Perl` in the text `I love Perl programming`.
2. Use `=~` to find all digits in the text `My phone number is 123-456-7890`.
3. Use `s///` to replace all whitespace characters with `_` in the text `Hello World`.

In [None]:
# Exercise 1 (Perl)
my $text = "I love Perl programming";
if ($text =~ /Perl/) {
    print "Match found!\n";
} else {
    print "No match.\n";
}

In [None]:
# Exercise 2 (Perl)
my $text = "My phone number is 123-456-7890";
my @matches = ($text =~ /\d+/g);
print "@matches\n";

In [None]:
# Exercise 3 (Perl)
my $text = "Hello World";
$text =~ s/\s/_/g;
print "$text\n";

## 38. Regular Expressions in Shell Scripts

Shell scripts, such as those written in Bash, typically use regular expressions through the `grep`, `sed`, and `awk` commands:

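- `grep 'pattern' file`: Prints the lines of the file that match the pattern.
- `sed 's/pattern/replacement/' file`: Replaces the first occurrence of the pattern on each line with `replacement` and writes the result to standard output; add the `g` flag to replace every occurrence.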
- `awk '/pattern/ {print}' file`: Prints lines matching the pattern in the file.

### Example
To find all lines containing the word `error` in a log file, you can use the `grep` command with the pattern `error`.

In [None]:
# Example (Shell Script)
# grep 'error' logfile.log

### Interactive Exercises

1. Use `grep` to find all lines containing the word `warning` in a file `logfile.log`.
2. Use `sed` to replace all occurrences of `foo` with `bar` in a file `input.txt`.
3. Use `awk` to print lines containing the word `success` in a file `results.txt`.

In [None]:
# Exercise 1 (Shell Script)
# grep 'warning' logfile.log

In [None]:
# Exercise 2 (Shell Script)
# sed 's/foo/bar/g' input.txt

In [None]:
# Exercise 3 (Shell Script)
# awk '/success/ {print}' results.txt

## 39. Testing and Debugging Regular Expressions

Testing and debugging regular expressions can be challenging. Here are some tools and techniques to help you:

- **Online Tools**: Websites like [regex101](https://regex101.com/) and [regexr](https://regexr.com/) provide interactive environments for testing and debugging regular expressions.
- **Unit Tests**: Write unit tests to verify that your regular expressions work as expected with various input cases.
- **Verbose Mode**: Use verbose mode (if supported) to add comments and whitespace to your regular expressions for better readability.
- **Step-by-Step Analysis**: Break down complex regular expressions into smaller parts and test each part individually.
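
Verbose mode and step-by-step analysis work well together, since each part of the pattern gets its own commented line. Below is a minimal sketch in Python using `re.VERBOSE`; the named groups and the name `DATE` are assumptions added for readability.

```python
import re

# In verbose mode, whitespace in the pattern is ignored and "#" starts
# a comment, so each component can be documented on its own line.
DATE = re.compile(r"""
    (?P<year>\d{4})  -   # four-digit year, then a literal hyphen
    (?P<month>\d{2}) -   # two-digit month, then a literal hyphen
    (?P<day>\d{2})       # two-digit day
""", re.VERBOSE)

m = DATE.search("Released on 2024-03-15.")
print(m.group("year"), m.group("month"), m.group("day"))  # 2024 03 15
```

Note that this pattern checks only the shape of the date; a value such as `2024-99-99` would still match.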

### Example
To test a regular expression for matching email addresses, you can use regex101 to see the matches and explanations.

### Interactive Exercises

1. Test the pattern `^[\w.%+-]+@[\w.-]+\.[a-zA-Z]{2,6}$` for matching email addresses using regex101.
2. Write a unit test to verify that the pattern `\d+` matches all digits in the text `12345`.
3. Use verbose mode to add comments and whitespace to the pattern `\d{4}-\d{2}-\d{2}` for matching dates.
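
One possible approach to exercise 2 is sketched below with Python's `unittest` module. The class name, test names, and the extra input cases are assumptions. The tests are run through a loader and runner rather than `unittest.main()` so the cell also works inside a notebook.

```python
import re
import unittest

DIGITS = re.compile(r"\d+")

class DigitPatternTest(unittest.TestCase):
    def test_matches_whole_digit_string(self):
        # "12345" is one unbroken run of digits, so there is one match.
        self.assertEqual(DIGITS.findall("12345"), ["12345"])

    def test_splits_on_non_digits(self):
        # A hyphen ends one run and the next digit starts another.
        self.assertEqual(DIGITS.findall("123-456"), ["123", "456"])

    def test_no_digits(self):
        self.assertIsNone(DIGITS.search("no digits"))

# Load and run the test case explicitly instead of calling unittest.main().
suite = unittest.defaultTestLoader.loadTestsFromTestCase(DigitPatternTest)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```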

## 40. Resources and Further Reading

Here are some resources and further reading materials to help you master regular expressions:

- **Books**:
  - "Mastering Regular Expressions" by Jeffrey E.F. Friedl
  - "Regular Expressions Cookbook" by Jan Goyvaerts and Steven Levithan
- **Online Tutorials**:
  - [RegexOne](https://regexone.com/): Interactive tutorial for learning regular expressions.
  - [Regular-Expressions.info](https://www.regular-expressions.info/): Comprehensive resource on regular expressions.
- **Cheat Sheets**:
  - [Regex Cheat Sheet](https://www.rexegg.com/regex-quickstart.html): Quick reference for regular expression syntax and patterns.
- **Tools**:
  - [regex101](https://regex101.com/): Online tool for testing and debugging regular expressions.
  - [regexr](https://regexr.com/): Interactive tool for learning, testing, and debugging regular expressions.

### Conclusion
Regular expressions are a powerful tool for text processing and pattern matching. By understanding the syntax and common patterns, you can leverage regular expressions to solve a wide range of problems in various programming languages. Practice regularly and refer to the resources listed above to continue improving your skills.