# Regex Copy

## About Regex

This tool uses regular expressions ("regex"). A regex is a pattern that searches for text, and the parts of a match that you capture can be reused to build new text strings. Regex is widely used, extensively documented, and well understood by AI assistants, so help with writing patterns is easy to find.
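
As a small illustration of "defining new text strings based on found text", the sketch below uses Python's standard `re` module directly, not `file_tools`. The file name is only an example value.

```python
import re

# Example file name, chosen only for illustration.
name = "Tom_M_e5_best.txt"

# Group 1 captures everything before "_best"; group 2 captures the extension.
pattern = r"(.*)_best\.(txt)"

# \1 and \2 in the replacement refer back to the captured groups.
renamed = re.sub(pattern, r"\1.\2", name)
print(renamed)  # Tom_M_e5.txt
```

The parentheses define capture groups, and `\1`, `\2` insert whatever those groups matched into the new string.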

## Using the Tool

To use this tool, include the script `file_tools.py` in the same directory as your Jupyter notebook, and import it using `import file_tools`.

To try out the examples in this notebook, create the following directory structure (the `.txt` files can be empty):
```text
├── tests
│   ├── Arabic
│   │   ├── Leila_F_e10.txt
│   │   └── Leila_F_e20_best.txt
│   └── English
│       ├── Jane_F_e10_best.txt
│       ├── Tom_M_e5_best.txt
│       └── Tom_M_e5.txt
├── (this notebook)
└── file_tools.py
```
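
If you would rather not create these files by hand, the following sketch builds the same structure with `pathlib`. The helper name `create_example_tree` is hypothetical and is not part of `file_tools`.

```python
from pathlib import Path

# Relative paths copied from the directory tree above.
EXAMPLE_FILES = [
    "tests/Arabic/Leila_F_e10.txt",
    "tests/Arabic/Leila_F_e20_best.txt",
    "tests/English/Jane_F_e10_best.txt",
    "tests/English/Tom_M_e5_best.txt",
    "tests/English/Tom_M_e5.txt",
]


def create_example_tree(base="."):
    """Create the empty example files under `base` and return their paths."""
    created = []
    for rel in EXAMPLE_FILES:
        path = Path(base) / rel
        # Create the parent folders if they do not exist yet.
        path.parent.mkdir(parents=True, exist_ok=True)
        # touch() creates an empty file and leaves an existing one unchanged.
        path.touch(exist_ok=True)
        created.append(path)
    return created


if __name__ == "__main__":
    create_example_tree()
```

Run it once from the folder that contains the notebook. Running it again is harmless because existing files are left as they are.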

The `regex_copy` function has the following arguments:
- __source_pattern (str):__ regex to match source file paths.
- __target_pattern (str):__ target path pattern; it can reference capture groups from source_pattern as `\1`, `\2`, ...
- __dry_run (bool):__ if True, only print the actions that would be taken without actually copying files. Useful for preventing mistakes.
- __root (str or Path):__ directory to start searching from, e.g. `root="relative/path/to/root/folder"`. To save time, set the root to the most specific folder that contains the files you want to match.
- __verbose (bool):__ if True, print detailed debug information. Useful if you don't understand why files are not copying.
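
To make the relationship between `source_pattern` and `target_pattern` concrete, here is a minimal sketch of how a path could be mapped to its target. It is not the actual `file_tools` code: the function `plan_copies` is hypothetical, and the choice of full-string matching with `re.fullmatch` is an assumption about how the tool behaves.

```python
import re


def plan_copies(source_pattern, target_pattern, candidate_paths):
    """Return (source, target) pairs for every path matching source_pattern."""
    compiled = re.compile(source_pattern)
    plan = []
    for path in candidate_paths:
        # Normalize to forward slashes so one pattern works on any OS.
        posix = str(path).replace("\\", "/")
        # Assumption: the whole path must match, not just part of it.
        match = compiled.fullmatch(posix)
        if match:
            # expand() substitutes \1, \2, ... with the captured groups.
            plan.append((posix, match.expand(target_pattern)))
    return plan


paths = [
    "tests/English/Jane_F_e10_best.txt",
    "tests/English/Tom_M_e5.txt",
    "tests/Arabic/Leila_F_e10.txt",
]
for src, dst in plan_copies(r"tests/English/(.*)", r"tests/output/en/\1", paths):
    print(f"Would copy {src} -> {dst}")
```

Only the two English paths produce a pair here; the Arabic path does not match the source pattern and is skipped.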


__WARNING:__
- This script is POWERFUL. It will create new folders and overwrite existing files without warning. Use with care, and check the result with `dry_run=True` first!


In [1]:
# Always run this cell.
import file_tools

In [2]:
# Dry run: copy all English files to the folder output/en.
file_tools.regex_copy(r"tests/English/(.*)", r"tests/output/en/\1", root="tests", dry_run=True)

--------------------
--------------------
  - Would copy tests\English\Jane_F_e10_best.txt → tests\output\en\Jane_F_e10_best.txt
  - Would copy tests\English\Tom_M_e5.txt → tests\output\en\Tom_M_e5.txt
  - Would copy tests\English\Tom_M_e5_best.txt → tests\output\en\Tom_M_e5_best.txt


In [None]:
# Copy all English files to the folder output/en.
file_tools.regex_copy(r"tests/English/(.*)", r"tests/output/en/\1", root="tests", dry_run=False)

--------------------
--------------------
  - Would copy tests\regex_copy\English\Jane_F_e10_best.txt → tests\regex_copy\output\en\Jane_F_e10_best.txt
  - Would copy tests\regex_copy\English\Tom_M_e5.txt → tests\regex_copy\output\en\Tom_M_e5.txt
  - Would copy tests\regex_copy\English\Tom_M_e5_best.txt → tests\regex_copy\output\en\Tom_M_e5_best.txt


In [None]:
# Copy all "_best" files to the folder output/best, and remove the _best suffix.
file_tools.regex_copy(r"tests/.*/(.*)_best\.txt", r"tests/output/best/\1.txt", root="tests", dry_run=False)

--------------------
--------------------
  - Would copy tests\regex_copy\Arabic\Leila_F_e20_best.txt → tests\regex_copy\output\best\Leila_F_e20.txt
  - Would copy tests\regex_copy\English\Jane_F_e10_best.txt → tests\regex_copy\output\best\Jane_F_e10.txt
  - Would copy tests\regex_copy\English\Tom_M_e5_best.txt → tests\regex_copy\output\best\Tom_M_e5.txt
  - Would copy tests\regex_copy\output\en\Jane_F_e10_best.txt → tests\regex_copy\output\best\Jane_F_e10.txt
  - Would copy tests\regex_copy\output\en\Tom_M_e5_best.txt → tests\regex_copy\output\best\Tom_M_e5.txt
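
Note that the listing above also picks up the copies under `output/en`, because `.*` matches any folder path, including the output folder. One way to avoid this is a negative lookahead that excludes `output/`. The sketch below shows the idea with Python's `re` module on paths modeled on the example tree. It assumes that `regex_copy` passes patterns to `re` and matches the whole path, which this notebook does not confirm, so try any new pattern with `dry_run=True` first.

```python
import re

# Paths modeled on the example tree, written with forward slashes.
paths = [
    "tests/Arabic/Leila_F_e20_best.txt",
    "tests/English/Jane_F_e10_best.txt",
    "tests/English/Tom_M_e5.txt",
    "tests/output/en/Jane_F_e10_best.txt",
]

# The pattern used above: ".*" also matches "output/en".
broad = re.compile(r"tests/.*/(.*)_best\.txt")

# (?!output/) rejects any path whose first folder under tests/ is "output".
narrow = re.compile(r"tests/(?!output/).*/(.*)_best\.txt")

broad_hits = [p for p in paths if broad.fullmatch(p)]
narrow_hits = [p for p in paths if narrow.fullmatch(p)]

print(broad_hits)   # includes tests/output/en/Jane_F_e10_best.txt
print(narrow_hits)  # only the Arabic and English "_best" files
```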
