# 🔍 Notepad++ Regular Expression (Regex) Codes

### Introduction
These regular expression patterns can be used in Notepad++'s Find and Replace dialog to perform complex text manipulations efficiently. Always ensure "Regular expression" is selected under "Search Mode" when using these patterns.

### Objectives
- Explore regular expression codes.
- Learn to search by expressions.

Regular expressions are powerful tools for searching and manipulating text. Here's a list of commonly used regex codes in Notepad++:

### Basic Regex Codes

1. **`.` (Dot)**
   - Matches any single character except newline characters (unless ". matches newline" is checked in the Find dialog).
   - **Example**: `a.c` matches "abc", "adc", "a c", etc.

2. **`^` (Caret)**
   - Matches the start of a line.
   - **Example**: `^Hello` finds "Hello" at the beginning of lines.

3. **`$` (Dollar)**
   - Matches the end of a line.
   - **Example**: `end$` finds "end" at the end of lines.

4. **`*` (Asterisk)**
   - Matches zero or more of the preceding element.
   - **Example**: `lo*` matches "l", "lo", "loo", etc.

5. **`+` (Plus)**
   - Matches one or more of the preceding element.
   - **Example**: `lo+` matches "lo", "loo", "looo", etc.

6. **`?` (Question Mark)**
   - Makes the preceding element optional (matches zero or one occurrence).
   - **Example**: `colou?r` matches "color" and "colour".

7. **`[]` (Character Class)**
   - Matches any single character contained within the brackets.
   - **Example**: `[aeiou]` matches any vowel.

8. **`[^]` (Negated Character Class)**
   - Matches any single character not contained within the brackets.
   - **Example**: `[^aeiou]` matches any single character that is not a lowercase vowel, including consonants, digits, spaces, and punctuation.

9. **`{n}`**
   - Matches exactly `n` occurrences of the preceding element.
   - **Example**: `lo{2}` matches "loo".

10. **`{n,}`**
    - Matches `n` or more occurrences of the preceding element.
    - **Example**: `lo{2,}` matches "loo", "looo", "loooo", etc.

11. **`{n,m}`**
    - Matches from `n` to `m` occurrences of the preceding element.
    - **Example**: `lo{1,3}` matches "lo", "loo", "looo".

12. **`\` (Backslash)**
    - Escapes a special character.
    - **Example**: `\.` matches a literal dot.

13. **`|` (Pipe)**
    - Acts as a logical OR.
    - **Example**: `cat|dog` matches "cat" or "dog".

14. **`()` (Grouping)**
    - Groups multiple tokens together and creates a capture group for extracting a substring or using back-references.
    - **Example**: `(abc)+` matches "abc", "abcabc", "abcabcabc", etc.

15. **`\1, \2, ...` (Back-references)**
    - Matches the same text as previously matched by a capturing group.
    - **Example**: `(abc)\1` matches "abcabc".

16. **`\s`**
    - Matches any whitespace character (spaces, tabs, line breaks).
    - **Example**: `\s+` matches any sequence of whitespace.

17. **`\S`**
    - Matches any non-whitespace character.
    - **Example**: `\S+` matches any sequence of non-whitespace characters.

18. **`\d`**
    - Matches any digit (equivalent to `[0-9]`).
    - **Example**: `\d+` matches any sequence of digits.

19. **`\D`**
    - Matches any non-digit.
    - **Example**: `\D+` matches any sequence of non-digit characters.

20. **`\w`**
    - Matches any word character (letters, digits, or underscore).
    - **Example**: `\w+` matches any sequence of word characters.

21. **`\W`**
    - Matches any non-word character.
    - **Example**: `\W+` matches any sequence of non-word characters.

Use these patterns in Notepad++'s Find and Replace dialog by selecting "Regular expression" under "Search Mode".
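
If you would like to check a few of these codes outside Notepad++, the sketch below runs them through Python's built-in `re` module. Notepad++ uses a different regex engine (Boost), so this is only an approximation, although the basic codes listed above behave the same way in both. The helper name `find_all` is made up for this example. Note that Python needs the `re.MULTILINE` flag before `^` and `$` will match at every line, which Notepad++ does by default.

```python
import re

def find_all(pattern, text, flags=0):
    """Return the full text of every non-overlapping match of pattern in text."""
    return [m.group(0) for m in re.finditer(pattern, text, flags)]

# Dot: any single character between 'a' and 'c'
print(find_all(r"a.c", "abc adc a c ac"))          # ['abc', 'adc', 'a c']

# Question mark: the 'u' is optional
print(find_all(r"colou?r", "color colour colr"))   # ['color', 'colour']

# {n,}: an 'l' followed by two or more 'o' characters
print(find_all(r"lo{2,}", "lo loo looo"))          # ['loo', 'looo']

# Pipe: either word
print(find_all(r"cat|dog", "cat, dog, bird"))      # ['cat', 'dog']

# Back-reference: the captured group must repeat immediately
print(find_all(r"(abc)\1", "abcabc abc"))          # ['abcabc']

# \d+: runs of digits
print(find_all(r"\d+", "Room 101, floor 3"))       # ['101', '3']

# Caret: start of each line (requires re.MULTILINE in Python)
text = "Hello world\nSay Hello\nHello again"
print(find_all(r"^Hello", text, re.MULTILINE))     # ['Hello', 'Hello']
```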



# 🤖 Regular Expression Test

Copy this text into Notepad++ to try some regular expression searches.

---
---
Hello, welcome to the regex playground! Here are some lines to test:
    
1. The quick brown fox jumps over 13 lazy dogs.
2. 2023-08-30 is a significant date for project launch.
3. Email addresses like john.doe@example.com should be matched.
4. Look for special characters like `%`, `$`, and `&` within this line.
5. The cost of the item was $299.99 on 02/01/2020.
6. My phone number is 555-1234-567, call me maybe!
7. Find lines with only one word: Success
8. This line contains, commas, semicolons; and colons: should be interesting.
9. Match digits like 12345 and non-digits with characters together 123abc.
10. Identify lines that end with a period.
11. Mr. Smith bought cheapsite.com for 1.5 million dollars, i.e., he paid a lot for it.
12. Hello? Who is there? It's me, wondering why you're not here!
13. Catch multi-line statements
that break over two lines.
14. Try matching line breaks and tabs 		here.
15. There should be lines that contain the word 'lines' multiple times in different lines.
16. What about matching words with apostrophes like it's, you're, and they're?
17. Look for patterns that start with a capital letter and end with a question mark?
18. This line is very simple.
19. End of the list.

---
---

Try these **Find Patterns**.

1.   Find any digit: Use the regex `\d` to find all the digits in the document.
2.   Match email addresses: You might use something like `\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b`.
3.   Identify monetary values: To find a pattern like "$299.99", try this search term: `\$\d+(\.\d{2})?`.
4.   A date like mm/dd/yyyy can be found with this pattern: `\b\d{2}/\d{2}/\d{4}\b`.

Ask ChatGPT to break this pattern down for you: `\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b`.
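
The find patterns above can also be tried in Python as a quick sanity check. This is a minimal sketch, assuming Python's `re` module treats these particular patterns the same way Notepad++ does. The helper name `matches` and the variable names are made up for this example, and the sample text is three lines copied from the test text above.

```python
import re

# Three lines taken from the regex playground text
sample = "\n".join([
    "3. Email addresses like john.doe@example.com should be matched.",
    "5. The cost of the item was $299.99 on 02/01/2020.",
    "9. Match digits like 12345 and non-digits with characters together 123abc.",
])

EMAIL = r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"
MONEY = r"\$\d+(\.\d{2})?"
DATE = r"\b\d{2}/\d{2}/\d{4}\b"

def matches(pattern, text):
    """Return the full text of each match (group 0), ignoring capture groups."""
    return [m.group(0) for m in re.finditer(pattern, text)]

emails = matches(EMAIL, sample)
money = matches(MONEY, sample)
dates = matches(DATE, sample)

print("Emails:", emails)   # ['john.doe@example.com']
print("Money: ", money)    # ['$299.99']
print("Dates: ", dates)    # ['02/01/2020']
```

`re.finditer` is used instead of `re.findall` because the money pattern contains a capture group, and `re.findall` would return only the captured `.99` part rather than the whole match.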


# 🔍 Comparing Files Using Python

## Introduction
Comparing data is a common requirement in many data analysis and system administration tasks. This section introduces basic techniques for comparing files using Python.

## Objectives
- Understand how to use Python for file comparison.
- Learn to compare text files line by line.
- Explore methods to compare binary files.
- Implement file comparison in practical programming scenarios.

## Tools and Libraries
- **`filecmp` module**: A module that provides functions to compare files and directories in Python.
- **`difflib` module**: A standard-library module for identifying differences between sequences, including lines in text files.
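
As a first taste of `filecmp`, here is a minimal sketch that checks whether two files have identical contents. It works for binary files as well as text files. The helper name `files_identical` and the temporary demo files are made up for this example. Passing `shallow=False` asks `filecmp.cmp` to compare the file contents rather than relying only on file size and modification time.

```python
import filecmp
import tempfile
from pathlib import Path

def files_identical(path_a, path_b):
    """Return True when both files contain exactly the same bytes."""
    return filecmp.cmp(path_a, path_b, shallow=False)

# Build three small demo files in a temporary folder
with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp) / "first.bin"
    second = Path(tmp) / "second.bin"
    third = Path(tmp) / "third.bin"
    first.write_bytes(b"\x00\x01\x02")
    second.write_bytes(b"\x00\x01\x02")   # same contents as first
    third.write_bytes(b"\x00\x01\xff")    # last byte differs

    same_first_second = files_identical(first, second)
    same_first_third = files_identical(first, third)

print(same_first_second)  # True
print(same_first_third)   # False
```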

## Comparing Text Files
To compare text files, you can read the files line by line and identify differences using the `difflib` library. This allows you to see exactly what has changed between the two files.
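
The idea can be sketched with `difflib.unified_diff`, which reports changed lines with a `-` (removed) or `+` (added) prefix. This is an illustration only: the helper name `diff_lines` and the sample lines are invented, and in practice the two lists would come from reading your own files.

```python
import difflib

def diff_lines(old_lines, new_lines, old_name="old", new_name="new"):
    """Return a unified diff of two lists of lines as a list of strings."""
    return list(difflib.unified_diff(
        old_lines, new_lines,
        fromfile=old_name, tofile=new_name,
        lineterm="",  # the sample lines carry no trailing newline
    ))

old = ["depth 1.0", "depth 2.0", "depth 3.0"]
new = ["depth 1.0", "depth 2.5", "depth 3.0"]

result = diff_lines(old, new)
for line in result:
    print(line)
```

Lines starting with `-depth 2.0` and `+depth 2.5` in the printed result mark the changed line. Comparing a list with itself returns an empty list.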

### Getting a List of Files
To build a list of files in a directory, follow these steps:

1. **Navigate to the Directory**: Open File Explorer and go to the desired directory.
2. **Select All Files**: Press `CTRL + A` to select all files in the directory.
3. **Copy as Path**: Hold `Shift`, right-click, and select "Copy as path."
4. **Use the List**: Paste the copied paths into your ChatGPT prompt. This is the list that you feed to ChatGPT.

If you encounter issues with this process, you can ask ChatGPT to help build a list, even if the data is not on a single line.
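
If you prefer to build the list in Python instead of File Explorer, a minimal sketch using `pathlib` is shown below. The helper name `list_files` is made up for this example, and the demo uses a temporary folder with invented contents. Point it at your own directory instead.

```python
import tempfile
from pathlib import Path

def list_files(directory):
    """Return the full paths of the files (not subfolders) in directory, sorted."""
    return sorted(str(p.resolve()) for p in Path(directory).iterdir() if p.is_file())

# Demo: a temporary folder with two files and one subfolder
with tempfile.TemporaryDirectory() as tmp:
    for name in ("DEPFP.OUT", "FINALDEP.OUT"):
        (Path(tmp) / name).write_text("demo\n")
    (Path(tmp) / "subfolder").mkdir()

    listing = list_files(tmp)
    names = [Path(p).name for p in listing]

print("\n".join(listing))  # one full path per line, ready to paste into a prompt
```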

### Getting a Script from ChatGPT
Ask ChatGPT to write a Python script that performs the file comparison for you.


# 🤖 Example Query

<div class="alert alert-block alert-info">

- **Objective**: Compare text files across two model folders for differences, ignoring variations in whitespace.
- **Data**: Text files named DEPFP.OUT, FINALDEP.OUT, FPINFILTRATION.OUT.
- **Output Format**: Output whether there are differences directly to the console and log specific differences to a file named diff_log.txt.
- **Dependencies**: Use Python standard libraries for file handling and text manipulation.
- **Path**: C:\Users\Karen\Chat GPT Workshop\Data\File Diff
- **Request**: Create a Python script that compares text files in 'Model1' and 'Model2' folders.
- **White Space**: Ignore whitespace differences using strip and split methods. 
- **Output**: Write results to the console and log specific differences to a file.

</div>

## ▶️ Code Block

The following code block is the result of this query.

In [None]:
# Automatically set base path to the project directory where the notebook is running
from pathlib import Path

# Path.cwd() returns the current working directory, which is normally the folder the notebook was started in
base_path = Path.cwd()

print(f"📂 Base path automatically set to: {base_path}")

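import os
from itertools import zip_longest

def compare_files(path1, path2):
    """Return (line1, line2) pairs that differ once whitespace is ignored."""
    with open(path1, 'r') as file1, open(path2, 'r') as file2:
        lines1 = file1.readlines()
        lines2 = file2.readlines()
    
    differences = []
    # zip_longest pads the shorter file with '' so extra lines in the longer file are still compared
    # (plain zip would stop at the end of the shorter file and silently skip them)
    for line1, line2 in zip_longest(lines1, lines2, fillvalue=''):
        # Strip whitespace from the ends and split by whitespace to ignore differences within lines
        if line1.strip().split() != line2.strip().split():
            differences.append((line1, line2))
    return differences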

def main():
    folder_path = base_path / 'Data' / 'File Diff'
    models = ['Model1', 'Model2']
    files = ['DEPFP.OUT', 'FINALDEP.OUT', 'FPINFILTRATION.OUT']
    
    # Create or open the log file
    with open(os.path.join(folder_path, 'diff_log.txt'), 'w') as log:
        for file in files:
            file_path1 = os.path.join(folder_path, models[0], file)
            file_path2 = os.path.join(folder_path, models[1], file)
            
            if os.path.exists(file_path1) and os.path.exists(file_path2):
                differences = compare_files(file_path1, file_path2)
                if differences:
                    print(f"Differences found in {file}:")
                    log.write(f"Differences in {file}:\n")
                    for line1, line2 in differences:
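                        # rstrip() keeps the log layout stable even when a line has no trailing newline
                        log.write(f"Model1: {line1.rstrip()}\nModel2: {line2.rstrip()}\n\n")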
                else:
                    print(f"No differences in {file}.")
                    log.write(f"No differences in {file}.\n")
            else:
                print(f"One or both files are missing for {file}.")
                log.write(f"One or both files are missing for {file}.\n")

if __name__ == "__main__":
    main()
