# Lesson 03 Reference

# `pathlib`

```python
import pathlib
```

In Jupyter, your current working directory ("cwd") is, by default, the directory that contains your notebook file.

**Relative path**: The current working directory is taken as the starting point and is called `.`. As you navigate into directories, your path deepens relative to your cwd, e.g. `./directory_1/directory_2`

**Absolute path**: Your relative cwd can be _resolved_ to create an absolute path. An absolute path starts at your computer's root directory and navigates deeper. On Windows, the root is a drive letter, e.g. `C:/`, and the path deepens from there, e.g. `C:/Users/your_name/Desktop`. On "POSIX" systems (Linux and macOS), the root directory is called `/`.

```python
here = pathlib.Path() # Relative path to cwd
here_abs = here.resolve()
```
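To see the difference between the two kinds of path, a minimal sketch like the following may help. It only inspects paths and does not create or change anything on disk. The absolute path printed depends on your own machine, so no output is shown for it.

```python
import pathlib

here = pathlib.Path()        # Relative path to the cwd
here_abs = here.resolve()    # Absolute path to the same directory

print(here)                    # .
print(here.is_absolute())      # False
print(here_abs.is_absolute())  # True
print(here_abs)                # Depends on your machine
```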

**Navigate paths by using the `/` operator (division)**

```python
my_file = here / "directory 1" / "my_file.txt"
```

**Go "up" a directory by using `.parent`**

```python
dir_1 = my_file.parent
my_other_file = dir_1 / "my_other_file.txt"
```
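Building a path with `/` or `.parent` does not touch your disk; it only describes a location. The sketch below reuses the names from the snippets above and inspects the pieces of a path. The `.name`, `.stem` and `.suffix` attributes are standard `Path` attributes, though only `.name` is used elsewhere in this reference.

```python
import pathlib

here = pathlib.Path()
my_file = here / "directory 1" / "my_file.txt"

print(my_file.name)         # my_file.txt
print(my_file.stem)         # my_file
print(my_file.suffix)       # .txt
print(my_file.parent.name)  # directory 1

# A "sibling" file lives in the same parent directory
my_other_file = my_file.parent / "my_other_file.txt"
print(my_other_file.parent == my_file.parent)  # True
```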

Use `[Tab]` to see the list of methods for `Path` objects. Use `[Shift]-[Tab]` to see how to use an individual method.

> See Reference 02 for more info on using `[Tab]` and `[Shift]-[Tab]` on Python objects in Jupyter.

### Some `Path` methods

* `path.exists()` - Returns True if the path exists on your computer; False otherwise
* `path.mkdir()` - Creates a new directory. Start with a path (absolute or relative) that exists, then extend it with a new directory name that does not exist yet (at this point, running `.exists()` on the new path should return False). Then run `.mkdir()` to create that directory on your computer.
* `path.touch()` - Creates a new, empty file. Similar to `.mkdir()`: start with a directory that exists, then extend it with the name of a file that does not exist yet (at this point, running `.exists()` on the file path should return False). Then run `.touch()` to create that file.
* `path.rename(new_path)` - Moves/renames the current path on your machine to `new_path`. If you just want to change the file name, you should still include the full path (i.e. all of the parents) up to that file, e.g. `my_path.rename(my_path.parent / "new_file_name.ext")`
* `path.glob(match_str)` - Returns an iterator over the contents of the directory at `path` (`path` must be a directory, not a file) that match `match_str`. Use `"*"` as a "wild card" to match files of certain types or names.
    * `"*"` will match all files and directories
    * `"*.txt"` will match all files ending in ".txt"
    * `"VAN.120*.xlsx"` will match all files with names that start with "VAN.120" and end with ".xlsx"
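The methods above can be tried out safely with a sketch like this one. It assumes you are happy to use the standard library's `tempfile` module (not covered in this reference) so that everything happens in a throwaway directory that is deleted at the end. The names `reports`, `draft.txt`, `final.txt` and `notes.md` are made up for illustration.

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    base = pathlib.Path(tmp)  # A directory that already exists

    new_dir = base / "reports"
    exists_before = new_dir.exists()  # False: only a path so far
    new_dir.mkdir()
    exists_after = new_dir.exists()   # True: now a real directory

    new_file = new_dir / "draft.txt"
    new_file.touch()                  # Create an empty file

    # Rename: keep the same parent, change only the file name
    new_file.rename(new_file.parent / "final.txt")

    (new_dir / "notes.md").touch()    # A second file that is not .txt

    txt_names = []
    for found in new_dir.glob("*.txt"):
        txt_names.append(found.name)

    all_names = []
    for found in new_dir.glob("*"):
        all_names.append(found.name)
    all_names.sort()  # glob() does not promise any particular order

print(exists_before, exists_after)  # False True
print(txt_names)                    # ['final.txt']
print(all_names)                    # ['final.txt', 'notes.md']
```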

---

# Looping with `for` loops

```python
for <name_of_individual_item_in_loop> in <container_name_or_iterable>:
    <do something here inside the loop>
    
<this code outside the loop>
```

**Note: The code you want to execute within the loop MUST be indented consistently (four spaces is the standard convention). Use the `[Tab]` key to indent if Jupyter does not auto-indent for you. To write code outside of the loop, unindent (use `[Shift]-[Tab]`).**

## "Pythonic" loop
```python
multiples_3 = [3, 6, 9, 12]

for multiple in multiples_3:
    print(multiple)
```

## "Non-pythonic" loop

```python
multiples_3 = [3, 6, 9, 12]

for idx in range(len(multiples_3)):
    print(multiples_3[idx])
```
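If you need the index as well as the item, one option that stays "pythonic" is the built-in `enumerate()`, which the recipes later in this reference also use. A minimal sketch, where `labels` is a name made up for illustration:

```python
multiples_3 = [3, 6, 9, 12]

labels = []
for idx, multiple in enumerate(multiples_3):
    # idx counts from 0; multiple is the item at that position
    labels.append(f"{idx}: {multiple}")

print(labels)  # ['0: 3', '1: 6', '2: 9', '3: 12']
```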

# Some loop "recipes"

## Transforming items in a list (or other collection)

```python
your_data = ['data1', 'data2', 'data3', ...]

acc = [] # Your "accumulator", an empty list
for <item> in <your_data>:
    new_item = <do something with item>
    acc.append(new_item)
```
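As one concrete instance of the template above, this sketch converts a list of lengths from millimetres to metres. The data and variable names are invented for illustration.

```python
lengths_mm = [1200, 3500, 4750]

lengths_m = []  # Accumulator
for length_mm in lengths_mm:
    length_m = length_mm / 1000  # "Do something" with the item
    lengths_m.append(length_m)

print(lengths_m)  # [1.2, 3.5, 4.75]
```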

## Combining data from two lists

```python
data_list_1 = [...]
data_list_2 = [...]

acc = [] # Accumulator
for idx, item in enumerate(data_list_1):
    other_item = data_list_2[idx]
    new_item = <do something with item and other_item>
    acc.append(new_item)
```
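A concrete version of this recipe might look like the sketch below, which pairs up widths and heights by position to compute areas. It assumes both lists have the same length; the data is invented for illustration.

```python
widths = [2, 3, 4]
heights = [10, 20, 30]

areas = []  # Accumulator
for idx, width in enumerate(widths):
    height = heights[idx]  # Item at the same position in the other list
    areas.append(width * height)

print(areas)  # [20, 60, 120]
```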

## Double loops

```python
nested_data = [[...], [...], ...]

outside_acc = [] # "Outside" single accumulator
for outside_item in nested_data:
    inner_acc = [] # You may or may not need an inner accumulator
    for inside_item in outside_item:
        new_inside_item = <do something with inside item>
        inner_acc.append(new_inside_item)
    new_outside_item = inner_acc
    outside_acc.append(new_outside_item)
```
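Filled in with made-up numbers, the double loop could look like this sketch, which doubles every value while keeping the nested shape of the data:

```python
nested_data = [[1, 2, 3], [4, 5], [6]]

outside_acc = []  # "Outside" accumulator
for outside_item in nested_data:
    inner_acc = []  # Fresh inner accumulator for each inner list
    for inside_item in outside_item:
        inner_acc.append(inside_item * 2)
    outside_acc.append(inner_acc)

print(outside_acc)  # [[2, 4, 6], [8, 10], [12]]
```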

## Cycling

```python
data_list_1 = [...] # 'data_list_1' is much longer than 'cycle_data'
cycle_data = [...]

acc = [] # Accumulator
for idx, item in enumerate(data_list_1):
    # Use the modulo operator to get the *remainder* of idx / len(cycle_data)
    cycle_idx = idx % len(cycle_data)
    other_item = cycle_data[cycle_idx]
    new_item = <do something with item and other_item>
    acc.append(new_item)
```
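To see the modulo trick in action, here is a small sketch with invented data: five beam labels are matched against two colours, and the colours repeat once the end of the shorter list is reached.

```python
beams = ["B1", "B2", "B3", "B4", "B5"]  # The longer list
colours = ["red", "blue"]               # The list to cycle through

acc = []  # Accumulator
for idx, beam in enumerate(beams):
    cycle_idx = idx % len(colours)  # 0, 1, 0, 1, 0
    colour = colours[cycle_idx]
    acc.append(f"{beam}-{colour}")

print(acc)  # ['B1-red', 'B2-blue', 'B3-red', 'B4-blue', 'B5-red']
```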

---

# File looping recipes

## Looping through lines of a file: reading

```python
here = pathlib.Path()
my_file = here / "file.txt" # Can be a text file with any extension but must be a text-based file (i.e. you can read it in Notepad)
print(my_file.exists())
```

```python
file_data = [] # Our accumulator for the data
with open(my_file, mode="r") as file:
    for line in file: # Looping over the file gives one line at a time; each line keeps its trailing "\n"
        file_data.append(line)
```

## Looping through lines of a file: writing

```python
my_new_file = here / "new_file.txt"
with open(my_new_file, mode="w") as file:
    for line in file_data: # Your file_data is a list of strings
        file.write(line)
```
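The reading and writing recipes can be combined into a round trip, as in the sketch below. It writes three lines and reads them back, and it shows that each line read from a file still ends in a newline character, which you may want to strip. It assumes use of the standard library's `tempfile` module so that the demo file is thrown away afterwards; `demo.txt` is a made-up name.

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    demo_file = pathlib.Path(tmp) / "demo.txt"

    lines_out = ["first\n", "second\n", "third\n"]
    with open(demo_file, mode="w") as file:
        for line in lines_out:
            file.write(line)  # write() does not add "\n" for you

    raw_lines = []    # Lines exactly as read
    clean_lines = []  # Lines with the trailing newline removed
    with open(demo_file, mode="r") as file:
        for line in file:
            raw_lines.append(line)
            clean_lines.append(line.rstrip("\n"))

print(raw_lines)    # ['first\n', 'second\n', 'third\n']
print(clean_lines)  # ['first', 'second', 'third']
```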

## Looping over files in a directory

```python
for file_path in here.glob("*.txt"): # Will return all files ending in .txt
    print(file_path)
```

### Example of bulk file operations

Create a bunch of empty text files (to start working on):

```python
new_dir = here / "New Directory"
new_dir.mkdir(exist_ok=True) # exist_ok=True avoids an error if you re-run this cell
print(new_dir.exists())

file_names = ["a.txt", "b.txt", "c.txt", "d.txt", "e.txt"]
for file_name in file_names:
    file_path = new_dir / file_name
    file_path.touch()
```

Now, change the file names to look like RJC project file names:

```python
for file_path in new_dir.glob("*.txt"):
    name_template = "VAN.123456.0001-NOTE-20220115-CMF"
    new_file_name = f"{name_template}-{file_path.name}"
    print("Renaming file: ", file_path.name, new_file_name) # Test first
    # file_path.rename(file_path.parent / new_file_name)
```

### FOR GOODNESS'S SAKE!!! BE CAREFUL WHEN DOING BULK FILE OPERATIONS!!!

There is NO UNDO for this.

To be safe:

1. Make backups and keep them safe before attempting anything
2. Use the `print()` function to test what you are about to do before you do it
3. Only perform these operations on backed up files on your local machine. **DO NOT OPERATE ON FILES ON THE SHARED DRIVES OR ON ANYONE ELSE'S FILES! YOU _WILL_ BE SORRY OTHERWISE!**
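The "print first" advice can be sketched as a two-step process: build and print the full list of planned renames, check it, and only then run the renames. This is one possible approach, not the only one. It assumes the standard library's `tempfile` module so that the demo only ever touches a throwaway directory, and the `DEMO` prefix is made up for illustration.

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    work_dir = pathlib.Path(tmp)
    for file_name in ["a.txt", "b.txt"]:
        (work_dir / file_name).touch()

    prefix = "DEMO"  # Hypothetical prefix, for illustration only

    # Step 1: plan and print, without changing anything
    planned = []
    for file_path in sorted(work_dir.glob("*.txt")):
        new_path = file_path.parent / f"{prefix}-{file_path.name}"
        planned.append((file_path, new_path))
        print("Would rename:", file_path.name, "->", new_path.name)

    # Step 2: rename, skipping anything that would overwrite a file
    for old_path, new_path in planned:
        if new_path.exists():
            print("Skipping, target already exists:", new_path.name)
        else:
            old_path.rename(new_path)

    final_names = []
    for found in work_dir.glob("*"):
        final_names.append(found.name)
    final_names.sort()

print(final_names)  # ['DEMO-a.txt', 'DEMO-b.txt']
```

Collecting the planned renames in a list first also means the directory contents are not changing while you loop over the results of `.glob()`.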
