Sure! Here’s a breakdown of commonly used regex patterns with explanations and examples:

---

### 1. **Basic Characters and Escaping**
- **`.`**: Matches any single character except newline.
  - Example: `a.c` matches `abc`, `a1c`, but not `ac`.
  
- **`\`**: Escapes special characters to match them literally.
  - Example: `a\.c` matches `a.c` (the literal dot).
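
A minimal sketch of the difference between an unescaped and an escaped dot, using Python's standard `re` module. The sample strings are the ones from the examples above; `re.escape` is shown as one way to build the escaped pattern from literal text.

```python
import re

samples = ["abc", "a1c", "a.c", "ac"]

# Unescaped: the dot stands for any single character (except newline).
dot_any = [s for s in samples if re.fullmatch(r"a.c", s)]

# Escaped: the dot must be a literal ".".
dot_literal = [s for s in samples if re.fullmatch(r"a\.c", s)]

# re.escape adds the backslashes needed to match a string literally.
escaped = re.escape("a.c")

print(dot_any)      # ['abc', 'a1c', 'a.c']
print(dot_literal)  # ['a.c']
print(escaped)      # a\.c
```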

---

### 2. **Character Sets**
- **`[abc]`**: Matches any one of the characters `a`, `b`, or `c`.
  - Example: `[aeiou]` matches any vowel.

- **`[^abc]`**: Matches any character *except* `a`, `b`, or `c`.
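  - Example: `[^aeiou]` matches any character that is not a lowercase vowel, which includes consonants but also digits, spaces, punctuation, and uppercase letters.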

- **`[a-z]`**: Matches any lowercase letter.
  - Example: `[A-Z]` matches any uppercase letter.
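
The sketch below shows what a negated set really matches. The sample text and the consonant-only pattern are illustrative assumptions, not part of the list above.

```python
import re

text = "Hi, Bob 42!"

# Any lowercase vowel.
vowels = re.findall(r"[aeiou]", text)

# Anything that is NOT a lowercase vowel: letters, digits, spaces, punctuation.
non_vowels = re.findall(r"[^aeiou]", text)

# Consonants only: list the consonant ranges explicitly, ignoring case.
consonants = re.findall(r"[b-df-hj-np-tv-z]", text, re.IGNORECASE)

print(vowels)      # ['i', 'o']
print(non_vowels)  # ['H', ',', ' ', 'B', 'b', ' ', '4', '2', '!']
print(consonants)  # ['H', 'B', 'b']
```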

---

### 3. **Predefined Character Classes**
- **`\d`**: Matches any digit (equivalent to `[0-9]`).
  - Example: `\d+` matches `123`, `4567`.

- **`\D`**: Matches any non-digit character.
  - Example: `\D+` matches `abc`, `#%$`.

- **`\w`**: Matches any word character (letters, digits, underscore).
  - Example: `\w+` matches `hello_world`, `abc123`.

- **`\W`**: Matches any non-word character.
  - Example: `\W+` matches `@#$%`.

- **`\s`**: Matches any whitespace character (space, tab, newline).
  - Example: `\s+` matches spaces between words.

- **`\S`**: Matches any non-whitespace character.
  - Example: `\S+` matches `HelloWorld`.
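
As a quick illustration, the classes above can be applied to one sample string with `re.findall`. The string itself is a made-up example.

```python
import re

text = "Order #42: 3 apples_total"

digits = re.findall(r"\d+", text)    # runs of digits
words = re.findall(r"\w+", text)     # runs of letters, digits, underscore
non_word = re.findall(r"\W+", text)  # runs of everything else
chunks = re.findall(r"\S+", text)    # runs of non-whitespace

print(digits)    # ['42', '3']
print(words)     # ['Order', '42', '3', 'apples_total']
print(non_word)  # [' #', ': ', ' ']
print(chunks)    # ['Order', '#42:', '3', 'apples_total']
```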

---

### 4. **Anchors**
- **`^`**: Matches the beginning of a string.
  - Example: `^Hello` matches `Hello World`, but not `World Hello`.

- **`$`**: Matches the end of a string.
  - Example: `World$` matches `Hello World`, but not `World Hello`.

- **`\b`**: Matches a word boundary.
  - Example: `\bcat\b` matches `cat` but not `category`.
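
The anchor examples above can be checked with `re.search` and `re.findall`. The longer sentence used for `\b` is an illustrative assumption.

```python
import re

# ^ and $ pin the match to the start and end of the string.
starts = bool(re.search(r"^Hello", "Hello World"))
starts_late = bool(re.search(r"^Hello", "World Hello"))
ends = bool(re.search(r"World$", "Hello World"))

# \b only matches between a word character and a non-word character,
# so "cat" inside "category" or "concat" is skipped.
whole_words = re.findall(r"\bcat\b", "cat category concat cat.")

print(starts, starts_late, ends)  # True False True
print(whole_words)                # ['cat', 'cat']
```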

---

### 5. **Quantifiers**
- **`*`**: Matches 0 or more of the preceding element (a single character, a character class, or a group).
  - Example: `ab*` matches `a`, `ab`, `abb`.

- **`+`**: Matches 1 or more of the preceding element.
  - Example: `ab+` matches `ab`, `abb`, but not `a`.

- **`?`**: Matches 0 or 1 of the preceding element.
  - Example: `ab?` matches `a`, `ab`.

- **`{n}`**: Matches exactly `n` occurrences of the preceding element.
  - Example: `a{3}` matches `aaa`.

- **`{n,}`**: Matches `n` or more occurrences of the preceding element.
  - Example: `a{3,}` matches `aaa`, `aaaa`.

- **`{n,m}`**: Matches between `n` and `m` occurrences (inclusive) of the preceding element.
  - Example: `a{2,4}` matches `aa`, `aaa`, `aaaa`.
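
A short sketch of how a quantifier binds to whatever comes immediately before it, whether that is one character, a group, or a character class. The helper `full` is a hypothetical convenience wrapper around `re.fullmatch`.

```python
import re

def full(pattern, s):
    """Return True if the whole string matches the pattern."""
    return re.fullmatch(pattern, s) is not None

# {2} applies to "b" only, so the pattern means "a" followed by "bb".
char_only = (full(r"ab{2}", "abb"), full(r"ab{2}", "abab"))

# Wrapping "ab" in a group makes {2} repeat the whole unit.
grouped = full(r"(ab){2}", "abab")

# A quantifier after a character class repeats the class.
class_range = full(r"[0-9]{2,4}", "123")

# {2,4} rejects five repetitions when the whole string must match.
too_many = full(r"a{2,4}", "aaaaa")

print(char_only)    # (True, False)
print(grouped)      # True
print(class_range)  # True
print(too_many)     # False
```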

---

### 6. **Groups and Alternation**
- **`(abc)`**: Groups multiple characters into a single unit and captures the text they match.
  - Example: `(ab)+` matches `ab`, `abab`.

- **`|`**: Matches either the expression before or after it.
  - Example: `cat|dog` matches `cat` or `dog`.
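
A small sketch combining grouping and alternation. The optional `s` suffix and the sample sentence are assumptions added for illustration.

```python
import re

# Alternation inside a group: the group captures whichever branch matched.
m = re.fullmatch(r"(cat|dog)s?", "dogs")
captured = m.group(1)

# A quantified group repeats the whole unit.
repeated = re.fullmatch(r"(ab)+", "abab") is not None

# Bare alternation finds either word anywhere in the text.
animals = re.findall(r"cat|dog", "cat, dog, cow")

print(captured)  # dog
print(repeated)  # True
print(animals)   # ['cat', 'dog']
```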

---

### 7. **Lookahead and Lookbehind**
- **Positive Lookahead `(?=...)`**: Ensures that a certain pattern follows.
  - Example: `\d(?=px)` matches `3` in `3px`, but not `3em`.

- **Negative Lookahead `(?!...)`**: Ensures that a certain pattern does not follow.
  - Example: `\d(?!px)` matches `3` in `3em`, but not `3px`.

- **Positive Lookbehind `(?<=...)`**: Ensures that a certain pattern precedes.
  - Example: `(?<=\$)\d+` matches `100` in `$100`.

- **Negative Lookbehind `(?<!...)`**: Ensures that a certain pattern does not precede.
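  - Example: `(?<!\$)\d+` matches `100` in `EUR100`. In `$100` it skips the `1` (which follows `$`) but still matches `00`; to reject the whole number, also exclude a preceding digit: `(?<![\$\d])\d+`.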
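
The lookaround examples can be verified with `re.findall`. The combined sample strings are assumptions; the last pattern is one possible way to stop a negative lookbehind from matching the tail of a number.

```python
import re

sizes = "3px 4em"
prices = "$100 EUR200"

followed_by_px = re.findall(r"\d(?=px)", sizes)
not_followed_by_px = re.findall(r"\d(?!px)", sizes)
after_dollar = re.findall(r"(?<=\$)\d+", prices)

# The lookbehind is checked at every position, so the match can start
# at the second digit of "$100", which is preceded by "1", not "$".
not_after_dollar = re.findall(r"(?<!\$)\d+", prices)

# Also forbidding a preceding digit rejects the whole "$100".
strict = re.findall(r"(?<![\$\d])\d+", prices)

print(followed_by_px)      # ['3']
print(not_followed_by_px)  # ['4']
print(after_dollar)        # ['100']
print(not_after_dollar)    # ['00', '200']
print(strict)              # ['200']
```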

---

### 8. **Flags**
- **`re.IGNORECASE` (`re.I`)**: Makes the regex case-insensitive.
  - Example: `re.search('hello', 'Hello World', re.I)` matches `Hello`.

- **`re.MULTILINE` (`re.M`)**: Allows `^` and `$` to match at the start and end of each line.
  - Example: with `re.M`, `^World` matches `World` at the start of the second line (without the flag it would not match) in:
    ```
    Hello
    World
    ```

- **`re.DOTALL` (`re.S`)**: Makes `.` match newline characters as well.
  - Example: `a.*z` matches across lines in:
    ```
    a
    b
    z
    ```
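
A minimal sketch of the three flags in action, using the same two small multi-line strings as the examples above.

```python
import re

two_lines = "Hello\nWorld"
spread = "a\nb\nz"

# re.I: case-insensitive matching.
ignore_case = re.search("hello", "Hello World", re.I).group()

# Without re.M, ^ only matches at the very start of the string.
line_starts_default = re.findall(r"^\w+", two_lines)
line_starts_multiline = re.findall(r"^\w+", two_lines, re.M)

# Without re.S, "." stops at newlines, so the pattern cannot span lines.
dot_default = re.search(r"a.*z", spread)
dot_all = re.search(r"a.*z", spread, re.S).group()

print(ignore_case)            # Hello
print(line_starts_default)    # ['Hello']
print(line_starts_multiline)  # ['Hello', 'World']
print(dot_default)            # None
print(repr(dot_all))          # 'a\nb\nz'
```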

---

### Practical Examples
1. **Validate Email**:
   ```regex
   ^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$
   ```
   Matches `test@example.com`.

2. **Extract Dates**:
   ```regex
   \b\d{4}-\d{2}-\d{2}\b
   ```
   Matches `2024-12-14`.

3. **Match Hex Colors**:
   ```regex
   #[A-Fa-f0-9]{6}
   ```
   Matches `#FFFFFF`, `#abc123`.

4. **Find Repeated Words**:
   ```regex
   \b(\w+)\s+\1\b
   ```
   Matches `word word`.
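
The four patterns above can be tried out as follows. The constant names and the longer sample sentences are assumptions. These patterns check shape only: the email pattern is a rough filter, not a full address validator, and the date pattern accepts impossible dates such as `2024-13-45`.

```python
import re

EMAIL = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")
DATE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
HEX_COLOR = re.compile(r"#[A-Fa-f0-9]{6}")
REPEATED_WORD = re.compile(r"\b(\w+)\s+\1\b")

email_ok = bool(EMAIL.match("test@example.com"))
email_bad = bool(EMAIL.match("not-an-email"))
dates = DATE.findall("Released 2024-12-14, patched 2025-01-03.")
colors = HEX_COLOR.findall("color: #FFFFFF; border: #abc123")
repeat = REPEATED_WORD.search("this is is a test").group()

print(email_ok, email_bad)  # True False
print(dates)                # ['2024-12-14', '2025-01-03']
print(colors)               # ['#FFFFFF', '#abc123']
print(repeat)               # is is
```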

Let me know if you'd like deeper examples or help building specific regex patterns!

Here are several examples comparing the use of **regex** and **string methods** for filtering text data in the context of **data analysis**. 

---

### 1. **Filter Rows Containing a Specific Word**
#### Task: Find rows where a string contains the word "error" (case-insensitive).
- **Regex**:
  ```python
  import pandas as pd
  
  df = pd.DataFrame({"logs": ["Error: File not found", "All good", "Critical ERROR detected"]})
  filtered = df[df["logs"].str.contains(r"\berror\b", case=False, regex=True)]
  print(filtered)
  ```
  Matches `Error: File not found` and `Critical ERROR detected`.

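- **String Method**:
  ```python
  # regex=False forces a literal substring search; the default treats the pattern as a regex.
  filtered = df[df["logs"].str.contains("error", case=False, regex=False)]
  ```
  Unlike `\berror\b`, this also matches `error` inside longer words such as `errors`.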
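
The difference between a whole-word regex and a plain substring test can be seen outside pandas as well. In this sketch the log line `No errors today` is a made-up addition to the sample data.

```python
import re

logs = ["Error: File not found", "No errors today", "Critical ERROR detected"]

# Whole word only: "errors" is skipped because "s" follows "error".
word_matches = [s for s in logs if re.search(r"\berror\b", s, re.IGNORECASE)]

# Plain substring test: any occurrence counts, including "errors".
substring_matches = [s for s in logs if "error" in s.lower()]

print(word_matches)       # ['Error: File not found', 'Critical ERROR detected']
print(substring_matches)  # all three lines
```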

---

### 2. **Match Rows with Digits**
#### Task: Filter rows containing any digits.
- **Regex**:
  ```python
  df = pd.DataFrame({"text": ["abc", "123", "abc123"]})
  filtered = df[df["text"].str.contains(r"\d", regex=True)]
  print(filtered)
  ```
  Matches `123` and `abc123`.

- **String Method**:
  ```python
  # Per-row check: True if any character in the string is a digit.
  filtered = df[df["text"].map(lambda s: any(ch.isdigit() for ch in s))]
  ```

Regex is more concise and handles any digit automatically; without it, the check needs a per-character test such as `str.isdigit` (or ten chained `contains` calls, one per digit).
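
As a sketch, the digit check can be written as a small plain-Python function and compared with the regex. The name `has_digit` is hypothetical; a function like this could be passed to `Series.map` to build a boolean mask. Note that `str.isdigit` also accepts some non-ASCII digit characters.

```python
import re

def has_digit(s):
    """Return True if any character in s is a digit."""
    return any(ch.isdigit() for ch in s)

texts = ["abc", "123", "abc123"]

by_function = [t for t in texts if has_digit(t)]
by_regex = [t for t in texts if re.search(r"\d", t)]

print(by_function)  # ['123', 'abc123']
print(by_regex)     # ['123', 'abc123']
```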

---

### 3. **Extract Domain Names from Email Addresses**
#### Task: Extract the domain from email addresses like `user@example.com`.
- **Regex**:
  ```python
  df = pd.DataFrame({"emails": ["user@example.com", "admin@test.org"]})
  df["domain"] = df["emails"].str.extract(r"@([A-Za-z0-9.-]+)")
  print(df)
  ```
  Extracts `example.com` and `test.org`.

- **String Method**:
  ```python
  df["domain"] = df["emails"].str.split("@").str[1]
  ```

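Regex is more flexible if email formats vary, for example when the address is embedded in surrounding text and `split` would keep the trailing characters.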
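
A sketch of where the two approaches diverge. The helper names and the `Admin <admin@test.org>` input are illustrative assumptions.

```python
import re

def domain_by_split(address):
    # Everything after the first "@", whatever it contains.
    return address.split("@")[1]

def domain_by_regex(address):
    # Only the run of domain-like characters after "@".
    m = re.search(r"@([A-Za-z0-9.-]+)", address)
    return m.group(1) if m else None

plain = "user@example.com"
wrapped = "Admin <admin@test.org>"

print(domain_by_split(plain), domain_by_regex(plain))      # example.com example.com
print(domain_by_split(wrapped), domain_by_regex(wrapped))  # test.org> test.org
```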

---

### 4. **Filter Rows Starting or Ending with a Specific String**
#### Task: Find rows that start with "temp" or end with ".csv".
- **Regex**:
  ```python
  df = pd.DataFrame({"files": ["temp123", "data.csv", "tempfile.csv"]})
  filtered = df[df["files"].str.contains(r"^temp|\.csv$", regex=True)]
  print(filtered)
  ```
  Matches `temp123`, `data.csv`, and `tempfile.csv`.

- **String Method**:
  ```python
  filtered = df[df["files"].str.startswith("temp") | df["files"].str.endswith(".csv")]
  ```

---

### 5. **Filter Rows with Patterns Like Dates**
#### Task: Match strings that look like dates in the format `YYYY-MM-DD`.
- **Regex**:
  ```python
  df = pd.DataFrame({"text": ["2024-12-14", "no date", "14-12-2024"]})
  filtered = df[df["text"].str.contains(r"\b\d{4}-\d{2}-\d{2}\b", regex=True)]
  print(filtered)
  ```
  Matches `2024-12-14`.

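- **String Method**: Possible, but clumsier. A whole-string check can parse the value with `datetime.strptime(s, "%Y-%m-%d")` and treat a `ValueError` as a non-match, which also rejects impossible dates such as `2024-13-45` that the regex accepts. Finding a date embedded in longer text is much easier with regex.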
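
One possible regex-free check, sketched with the standard library. The function name `is_iso_date` is hypothetical, and unlike `str.contains` it tests the whole string, not a substring.

```python
from datetime import datetime

def is_iso_date(s):
    """Return True if s parses as a zero-padded YYYY-MM-DD calendar date."""
    # strptime is lenient about padding, so check the shape first.
    if len(s) != 10 or not s.replace("-", "").isdigit():
        return False
    try:
        datetime.strptime(s, "%Y-%m-%d")
    except ValueError:
        return False
    return True

texts = ["2024-12-14", "no date", "14-12-2024", "2024-13-45"]
valid = [t for t in texts if is_iso_date(t)]

print(valid)  # ['2024-12-14']
```

Applied to a pandas column, a function like this could be passed to `Series.map` to build the filter mask.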

---

### 6. **Filter Rows with Multiple Substrings**
#### Task: Filter rows containing "apple" or "banana".
- **Regex**:
  ```python
  df = pd.DataFrame({"fruits": ["apple pie", "banana smoothie", "grape juice"]})
  filtered = df[df["fruits"].str.contains(r"apple|banana", regex=True)]
  print(filtered)
  ```
  Matches `apple pie` and `banana smoothie`.

- **String Method**:
  ```python
  filtered = df[df["fruits"].str.contains("apple") | df["fruits"].str.contains("banana")]
  ```

Regex is shorter and more scalable for more terms.

---

### 7. **Filter by Complex Patterns**
#### Task: Match rows containing phone numbers in the format `(123) 456-7890`.
- **Regex**:
  ```python
  df = pd.DataFrame({"contact": ["Call (123) 456-7890", "No phone here"]})
  filtered = df[df["contact"].str.contains(r"\(\d{3}\) \d{3}-\d{4}", regex=True)]
  print(filtered)
  ```
  Matches `Call (123) 456-7890`.

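- **String Method**: Possible but verbose: the code has to slide over the text and check every position for a digit or the expected punctuation, which the regex states in one line.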
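
For comparison, a regex-free version might look like the sketch below. The template string and both function names are assumptions made for illustration, with `d` standing for "any digit".

```python
TEMPLATE = "(ddd) ddd-dddd"  # "d" marks a digit; other characters are literal

def looks_like_phone(s):
    """Return True if s has exactly the shape of TEMPLATE."""
    if len(s) != len(TEMPLATE):
        return False
    return all(ch.isdigit() if t == "d" else ch == t
               for ch, t in zip(s, TEMPLATE))

def contains_phone(text):
    """Return True if any window of the text has the phone-number shape."""
    n = len(TEMPLATE)
    return any(looks_like_phone(text[i:i + n])
               for i in range(len(text) - n + 1))

contacts = ["Call (123) 456-7890", "No phone here"]
with_phone = [c for c in contacts if contains_phone(c)]

print(with_phone)  # ['Call (123) 456-7890']
```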

---

### 8. **Remove Non-Alphanumeric Characters**
#### Task: Clean up text data by removing non-alphanumeric characters.
- **Regex**:
  ```python
  df = pd.DataFrame({"text": ["hello@world!", "python#rocks"]})
  df["cleaned"] = df["text"].str.replace(r"[^a-zA-Z0-9\s]", "", regex=True)
  print(df)
  ```
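  Outputs `helloworld` and `pythonrocks`: the symbols are deleted, not replaced with spaces.

- **String Method**: Possible without regex by keeping only the characters for which `str.isalnum()` or `str.isspace()` is true, but this needs a per-row Python function instead of a single `.str` call, and `str.isalnum()` also keeps non-ASCII letters that `[a-zA-Z0-9]` would drop.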
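
A sketch of both variants on the same sample strings. The function name `strip_symbols` is hypothetical, and the last step shows one way to get space-separated words instead, by replacing each run of symbols with a space.

```python
import re

def strip_symbols(s):
    """Keep only alphanumeric and whitespace characters."""
    return "".join(ch for ch in s if ch.isalnum() or ch.isspace())

texts = ["hello@world!", "python#rocks"]

deleted = [strip_symbols(t) for t in texts]

# Replace each run of symbols with one space, then trim the ends.
spaced = [re.sub(r"[^a-zA-Z0-9\s]+", " ", t).strip() for t in texts]

print(deleted)  # ['helloworld', 'pythonrocks']
print(spaced)   # ['hello world', 'python rocks']
```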

---

### 9. **Filter Rows Without a Specific Pattern**
#### Task: Exclude rows containing numbers.
- **Regex**:
  ```python
  df = pd.DataFrame({"text": ["abc123", "hello", "456"]})
  filtered = df[~df["text"].str.contains(r"\d", regex=True)]
  print(filtered)
  ```
  Keeps only `hello`.

- **String Method**: Possible by negating a per-character check such as `any(ch.isdigit() for ch in s)`, at the cost of running a Python function on every row.

---

### 10. **Extract Numbers**
#### Task: Extract numbers from strings.
- **Regex**:
  ```python
  df = pd.DataFrame({"text": ["abc123", "no number", "45 apples"]})
  df["numbers"] = df["text"].str.extract(r"(\d+)")
  print(df)
  ```
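  Extracts `123` and `45` (the first run of digits in each string); `no number` yields `NaN`.

- **String Method**: Possible but awkward: the code has to walk the string and collect consecutive digits by hand, and it gets harder as soon as the pattern is more than "a run of digits".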
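
One regex-free way to collect digit runs, sketched with `itertools.groupby`. The function name `digit_runs` and the longer sample sentence are assumptions.

```python
import re
from itertools import groupby

def digit_runs(s):
    """Return every run of consecutive digits in s, in order."""
    return ["".join(group)
            for is_digit, group in groupby(s, key=str.isdigit)
            if is_digit]

texts = ["abc123", "no number", "45 apples and 7 pears"]

by_groupby = [digit_runs(t) for t in texts]
by_regex = [re.findall(r"\d+", t) for t in texts]

print(by_groupby)  # [['123'], [], ['45', '7']]
print(by_regex)    # [['123'], [], ['45', '7']]
```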

---

### Summary Table

| **Task**                        | **Regex**                                | **String Methods**                                   |
|---------------------------------|------------------------------------------|------------------------------------------------------|
| Find specific word              | `\berror\b`                              | `contains("error", regex=False)` (substring match)   |
| Match rows with digits          | `\d`                                     | Per-character `str.isdigit` check                    |
| Extract domain                  | `@([A-Za-z0-9.-]+)`                      | `split("@").str[1]`                                  |
| Rows starting/ending pattern    | `^temp\|\.csv$`                          | `startswith()`/`endswith()`                          |
| Match date patterns             | `\b\d{4}-\d{2}-\d{2}\b`                  | `datetime.strptime` (whole string only)              |
| Match multiple substrings       | `apple\|banana`                          | Chain multiple `contains()`                          |
| Match phone numbers             | `\(\d{3}\) \d{3}-\d{4}`                  | Manual per-character check (verbose)                 |
| Remove non-alphanumeric         | `[^a-zA-Z0-9\s]`                         | Filter with `str.isalnum`/`str.isspace`              |
| Exclude by pattern              | `~contains(r"\d", regex=True)`           | Negated `str.isdigit` check                          |
| Extract numbers                 | `(\d+)`                                  | Manual digit grouping (awkward)                      |

---

Regex is far more versatile for complex text filtering and pattern recognition, while string methods are simpler to read and often faster for basic tasks such as literal substring, prefix, or suffix checks.