# Regular Expressions

### Extract Dates from the given text using Regex

In [4]:
import re

text = "On 2024-01-12, the conference will begin, and the event will end on 2024-02-30" 

# Regex pattern to match dates in the format YYYY-MM-DD
date_pattern = r"\d{4}-\d{2}-\d{2}"

dates = re.findall(date_pattern, text)

print(dates)

['2024-01-12', '2024-02-30']
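
Note that the pattern above only checks the shape `YYYY-MM-DD`, so it also returns `2024-02-30`, which is not a real calendar date. One possible way to filter such values, sketched here with a hypothetical helper named `is_real_date` (not part of the original notebook), is to pass each candidate through `datetime.strptime`:

```python
import re
from datetime import datetime

text = "On 2024-01-12, the conference will begin, and the event will end on 2024-02-30"
date_pattern = r"\d{4}-\d{2}-\d{2}"


def is_real_date(candidate):
    """Hypothetical helper: True if candidate is a valid YYYY-MM-DD calendar date."""
    try:
        datetime.strptime(candidate, "%Y-%m-%d")
        return True
    except ValueError:
        # strptime rejects impossible dates such as February 30
        return False


# Keep only the regex matches that are also valid calendar dates
valid_dates = [d for d in re.findall(date_pattern, text) if is_real_date(d)]
print(valid_dates)  # ['2024-01-12']
```

The regex does the extraction and `datetime` does the validation, which is usually simpler than encoding month lengths in the pattern itself.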


### Types of functions in Regular Expressions

#### 1. Matching Functions
Used to check if a pattern exists in a specific part of the string.

| **Function**        | **Description**                                  | **Example**                                   |
|----------------------|--------------------------------------------------|-----------------------------------------------|
| `re.match()`         | Matches a pattern **at the start** of the string.| `re.match(r"\d+", "123abc")` → Matches `123`. |
| `re.fullmatch()`     | Matches the **entire string** against the pattern.| `re.fullmatch(r"\d+", "123")` → Matches `123`.|
| `re.search()`        | Searches for the **first occurrence** of the pattern in the string.| `re.search(r"\d+", "abc123def")` → Matches `123`. |
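
To make the difference between the three matching functions concrete, here is a small sketch that runs the same pattern against one string. The sample string and variable names are chosen for illustration only:

```python
import re

s = "abc123"

# match() only looks at the start of the string, and "abc123" starts with a letter
at_start = re.match(r"\d+", s)

# search() scans forward and finds the first run of digits anywhere
anywhere = re.search(r"\d+", s)

# fullmatch() requires the pattern to cover the whole string
whole_digits = re.fullmatch(r"\d+", s)
whole_mixed = re.fullmatch(r"[a-z]+\d+", s)

print(at_start)          # None
print(anywhere.group())  # 123
print(whole_digits)      # None
print(whole_mixed.group())  # abc123
```

All three return `None` on failure, so the result should be checked before calling `.group()`.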

---

#### 2. Finding and Extracting Functions
Used to find or extract matches in a string.

| **Function**        | **Description**                                  | **Example**                                   |
|----------------------|--------------------------------------------------|-----------------------------------------------|
| `re.findall()`       | Returns **all non-overlapping matches** as a list.| `re.findall(r"\d+", "abc123def456")` → `['123', '456']`. |
| `re.finditer()`      | Returns an **iterator of match objects** for all matches.| `for match in re.finditer(r"\d+", "abc123def456"): print(match.group())` → prints `123`, then `456`. |
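
Two details of these functions are easy to miss: `re.findall()` returns tuples when the pattern contains more than one capturing group, and `re.finditer()` gives access to match positions. A short sketch (the grouped pattern is an illustrative assumption):

```python
import re

text = "abc123def456"

# With two capturing groups, findall returns a list of tuples, one per match
pairs = re.findall(r"([a-z]+)(\d+)", text)
print(pairs)  # [('abc', '123'), ('def', '456')]

# finditer yields match objects, so start/end indices are available
positions = [(m.group(), m.start(), m.end()) for m in re.finditer(r"\d+", text)]
print(positions)  # [('123', 3, 6), ('456', 9, 12)]
```

`re.finditer()` is also the better choice for large inputs, since it produces matches lazily instead of building the whole list.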

---

#### 3. Substitution Functions
Used to replace parts of a string that match a pattern.

| **Function**        | **Description**                                  | **Example**                                   |
|----------------------|--------------------------------------------------|-----------------------------------------------|
| `re.sub()`           | Replaces **all occurrences** of the pattern with a specified string.| `re.sub(r"\d+", "X", "abc123def456")` → `abcXdefX`. |
| `re.subn()`          | Same as `re.sub()` but also returns the **number of substitutions**.| `re.subn(r"\d+", "X", "abc123def456")` → `('abcXdefX', 2)`. |
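
Beyond a fixed replacement string, `re.sub()` also accepts backreferences, a replacement function, and a `count` limit. The following is a minimal sketch of those three options on the same sample string:

```python
import re

text = "abc123def456"

# A function replacement receives each match object and returns the new text
doubled = re.sub(r"\d+", lambda m: str(int(m.group()) * 2), text)
print(doubled)  # abc246def912

# Backreferences \1 and \2 refer to the captured groups
swapped = re.sub(r"([a-z]+)(\d+)", r"\2\1", text)
print(swapped)  # 123abc456def

# count limits how many matches are replaced
first_only = re.sub(r"\d+", "X", text, count=1)
print(first_only)  # abcXdef456
```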

---

#### 4. Splitting Functions
Used to split a string based on a pattern.

| **Function**        | **Description**                                  | **Example**                                   |
|----------------------|--------------------------------------------------|-----------------------------------------------|
| `re.split()`         | Splits the string **at each match of the pattern**.| `re.split(r"\d+", "abc123def456")` → `['abc', 'def', '']`. |
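
The trailing empty string in the example above appears because the input ends with a match. The sketch below shows one way to drop empty pieces, plus two `re.split()` options: a capturing group to keep the separators and `maxsplit` to limit the number of splits.

```python
import re

text = "abc123def456"

# Filter out empty strings produced by a match at the end of the input
non_empty = [part for part in re.split(r"\d+", text) if part]
print(non_empty)  # ['abc', 'def']

# A capturing group keeps the separators in the result
with_separators = re.split(r"(\d+)", text)
print(with_separators)  # ['abc', '123', 'def', '456', '']

# maxsplit stops after the given number of splits
split_once = re.split(r"\d+", text, maxsplit=1)
print(split_once)  # ['abc', 'def456']
```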

---

#### 5. Compilation Functions
Used to compile regular expressions for repeated use.

| **Function**        | **Description**                                  | **Example**                                   |
|----------------------|--------------------------------------------------|-----------------------------------------------|
| `re.compile()`       | Compiles a regex pattern into a **regex object** for reuse.| `pattern = re.compile(r"\d+"); pattern.findall("123abc456")` → `['123', '456']`. |
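
Compiling pays off mainly when the same pattern is applied many times. As an illustration (the list of sample lines is made up for this sketch), a compiled pattern can be reused across inputs and exposes the same methods as the module-level functions:

```python
import re

# Compile once, reuse for every line
number = re.compile(r"\d+")

lines = ["123abc456", "no digits", "7 up"]
results = [number.findall(line) for line in lines]
print(results)  # [['123', '456'], [], ['7']]

# The compiled object keeps its source pattern and offers search/match/sub as methods
print(number.pattern)                  # \d+
print(number.search("abc123").span())  # (3, 6)
```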

---

#### 6. Escaping Functions
Used to escape special characters in a string.

| **Function**        | **Description**                                  | **Example**                                   |
|----------------------|--------------------------------------------------|-----------------------------------------------|
| `re.escape()`        | Escapes all special characters in a string.      | `re.escape("a+b*c")` → `a\+b\*c`. |
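
A typical use of `re.escape()` is searching for text that comes from a user or a file and should be treated literally. In this sketch, `user_input` and the sample text are illustrative assumptions:

```python
import re

user_input = "a+b*c"
text = "a+b*c and aabbc"

# Unescaped, + and * act as quantifiers, so the pattern matches "aabbc" instead
unescaped_hits = re.findall(user_input, text)
print(unescaped_hits)  # ['aabbc']

# Escaped, every special character is matched literally
escaped_hits = re.findall(re.escape(user_input), text)
print(escaped_hits)  # ['a+b*c']
```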


In [7]:
pattern = 'Python'
text = 'Python is very powerful, and python has a lot of features'

re.search(pattern, text)

<re.Match object; span=(0, 6), match='Python'>

In [15]:
pattern = 'Python'
text = 'Python is very powerful, and python has a lot of features'

match = re.search(pattern, text)
print(match)

if match:
    print('found', match.group())
else: 
    print('Not Found')

<re.Match object; span=(0, 6), match='Python'>
found Python
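
The match object returned by `re.search()` carries more than the matched text. The sketch below, which reuses the sentence from the cell above and adds a made-up search for `"Java"`, shows its position methods and the `None` result for a missing pattern:

```python
import re

text = "Python is very powerful, and python has a lot of features"

match = re.search("Python", text)
if match:
    # group() is the matched text; span() is the (start, end) index pair
    print(match.group(), match.span())  # Python (0, 6)

# A pattern that does not occur yields None, so guard before calling .group()
missing = re.search("Java", text)
print(missing is None)  # True

# Without flags the search is case-sensitive: the lowercase "python" is skipped
case_sensitive_hits = re.findall("Python", text)
print(case_sensitive_hits)  # ['Python']
```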


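#### Flags in Regular Expressions
Flags are optional parameters that modify how the regex engine interprets a pattern. In Python they are passed through the `flags` argument (for example `re.I`) or written inline at the start of the pattern (for example `(?i)`), and several flags can be combined with `|`.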

| **Flag**   | **Name**           | **Description**                                                                 | **Example**                                                                                       |
|------------|--------------------|---------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------|
| `i`        | **Ignore Case**    | Makes the regex case-insensitive.                                               | Pattern: `r"cat"`<br>Flags: `re.I`<br>Matches: `cat`, `Cat`, `CAT`                               |
| `m`        | **Multiline**      | Allows `^` and `$` to match the start and end of each line, not just the string. | Pattern: `r"^cat"`<br>Flags: `re.M`<br>Matches: `cat` at the beginning of multiple lines.        |
| `s`        | **Dot All**        | Allows the dot (`.`) to match newline characters.                                | Pattern: `r"cat.*dog"`<br>Flags: `re.S`<br>Matches: `cat\n\ndog`                                 |
| `x`        | **Verbose**        | Ignores unescaped whitespace in the pattern and treats `#` to the end of the line as a comment, for more readable regex patterns. | Pattern: `r"cat  # matches the word 'cat'"`<br>Flags: `re.X`<br>Matches: `cat`                 |
| `a`        | **ASCII**          | Restricts `\w`, `\W`, `\b`, `\B`, `\d`, `\D`, `\s`, and `\S` to match ASCII characters only. | Pattern: `r"\w+"`<br>Flags: `re.A`<br>Matches: `abc`, but not Unicode like `你好`.               |
| `u`        | **Unicode**        | Allows `\w`, `\W`, `\b`, `\B`, `\d`, `\D`, `\s`, and `\S` to match Unicode characters. This is already the default for `str` patterns in Python 3, so the flag is redundant. | Pattern: `r"\w+"`<br>Flags: `re.U`<br>Matches: Unicode strings like `你好`.                      |
| `L`        | **Locale**         | Makes `\w`, `\W`, `\b`, `\B`, and case-insensitive matching depend on the current locale. Works only with bytes patterns. | Pattern: `rb"\w+"`<br>Flags: `re.L`<br>Matches characters in a locale-aware manner.              |
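| `g`        | **Global**         | Not a Python flag. It belongs to other regex flavors such as JavaScript; in Python, use `re.findall()` or `re.finditer()` to get every match instead of only the first. | Pattern: `r"cat"`<br>`re.findall(r"cat", text)` returns all occurrences of `cat` in the text.           |
| `t`        | **Template**       | Undocumented flag with no working effect. `re.T` / `re.TEMPLATE` were deprecated in Python 3.11 and removed in Python 3.13, so it should not be used. | Pattern: `r"(cat\|dog)"`<br>Flags: `re.T`<br>Unavailable on Python 3.13 and later.                             |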
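
The flags that matter most in everyday Python code can be tried out with a few one-liners. This is a minimal sketch: the sample strings are illustrative, and only flags documented in the standard `re` module are used.

```python
import re

text = "Python is very powerful, and python has a lot of features"

# re.IGNORECASE (re.I) finds both spellings; (?i) is the inline equivalent
ignore_case = re.findall(r"python", text, flags=re.IGNORECASE)
inline_flag = re.findall(r"(?i)python", text)
print(ignore_case)  # ['Python', 'python']

# re.MULTILINE (re.M) lets ^ match at the start of every line
line_starts = re.findall(r"^cat", "cat\ndog\ncat", flags=re.M)
print(line_starts)  # ['cat', 'cat']

# re.DOTALL (re.S) lets . match newlines
across_lines = re.search(r"cat.*dog", "cat\n\ndog", flags=re.S)
print(across_lines is not None)  # True

# re.VERBOSE (re.X) allows whitespace and comments inside the pattern
date_re = re.compile(
    r"""
    \d{4}   # year
    -
    \d{2}   # month
    -
    \d{2}   # day
    """,
    re.VERBOSE,
)
print(date_re.findall("Starts on 2024-01-12"))  # ['2024-01-12']

# re.ASCII (re.A) restricts \w to ASCII; the default for str patterns is Unicode
ascii_words = re.findall(r"\w+", "abc 你好", flags=re.ASCII)
unicode_words = re.findall(r"\w+", "abc 你好")
print(ascii_words, unicode_words)  # ['abc'] ['abc', '你好']
```

Flags can be combined with `|`, for example `flags=re.I | re.M`.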
