# Pathlib

We often use the `os` module to manipulate file and directory paths, as shown below.

In [1]:
import os

print(os.getcwd())
print(os.path.exists("pathlib.ipynb"))

d:\Project\PythonUniverse\modules
True


Since Python 3.4, we can use the `pathlib` module, which wraps many of the path-related functions of `os` and `os.path` in the `Path` class. It makes working with files and directories more intuitive and more object-oriented.

## Path object

We pass the path of a file or directory as a plain string to the `Path` constructor, then perform operations on it through the resulting object's **instance methods**.

In [2]:
# old:
# os.path.exists('pathlib.ipynb')

from pathlib import Path

Path("pathlib.ipynb").exists()

True
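
As a small sketch of how the two styles line up, the cell below repeats the `os` calls from the first example next to their `Path` counterparts. The variable names are only illustrative.

```python
import os
from pathlib import Path

# os.getcwd() returns a string; Path.cwd() returns a Path object.
cwd = Path.cwd()
print(cwd == Path(os.getcwd()))  # True

# os.path.exists(...) and Path(...).exists() answer the same question.
here = Path(".")
print(here.exists() == os.path.exists("."))  # True

# A Path can always be turned back into a string for APIs that need one.
cwd_as_text = str(cwd)
```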

`Path` also overloads some operators. One of them is the **division operator** (`/`), which joins path segments.

In [3]:
Path("home") / Path("project") / Path("something.txt")

WindowsPath('home/project/something.txt')
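
Wrapping every segment in `Path(...)` is not required. The sketch below assumes `PurePosixPath` only so that the printed result looks the same on every operating system; `Path` joins segments the same way with the current platform's separator.

```python
from pathlib import PurePosixPath

base = PurePosixPath("home")

# Only one operand has to be a path object; the others can be plain strings.
joined = base / "project" / "something.txt"

# joinpath() is the method equivalent of the / operator.
also_joined = base.joinpath("project", "something.txt")

print(joined)                 # home/project/something.txt
print(joined == also_joined)  # True
```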

It is also possible to **compare** two `Path` objects.

In [4]:
print(Path("/tmp") == Path("/tmp"))
print(Path("/tmp/a") == Path("/tmp"))

True
False
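
The comparison works on the path text and does not touch the file system. A minimal sketch of what that implies, using the pure path classes so that the results do not depend on the machine running the cell:

```python
from pathlib import PurePosixPath, PureWindowsPath

# Redundant separators and single dots are collapsed before comparing.
same_after_cleanup = PurePosixPath("tmp//a/./b") == PurePosixPath("tmp/a/b")

# ".." is NOT collapsed, because it could cross a symbolic link.
dotdot_differs = PurePosixPath("tmp/a/../b") != PurePosixPath("tmp/b")

# POSIX paths are case-sensitive, Windows paths are not.
posix_case = PurePosixPath("TMP") == PurePosixPath("tmp")
windows_case = PureWindowsPath("TMP") == PureWindowsPath("tmp")

print(same_after_cleanup, dotdot_differs)  # True True
print(posix_case, windows_case)            # False True
```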


We can get the **filename** and **filename extension** from the `name` and `suffix` properties of a `Path` object.

In [20]:
file_ = Path("home") / Path("project") / Path("something.txt")

print(file_.name)    # last path component
print(file_.suffix)  # extension, including the leading dot

something.txt
.txt
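
A few related properties are worth knowing alongside `name` and `suffix`. The file name `archive.tar.gz` below is a made-up example chosen to show how multiple extensions are handled.

```python
from pathlib import PurePosixPath

p = PurePosixPath("home/project/archive.tar.gz")

print(p.name)      # archive.tar.gz
print(p.suffix)    # .gz  (only the last extension)
print(p.suffixes)  # ['.tar', '.gz']
print(p.stem)      # archive.tar  (name without the last extension)
print(p.parent)    # home/project
print(p.parts)     # ('home', 'project', 'archive.tar.gz')

# with_suffix() returns a new path; the original object is unchanged.
renamed = p.with_suffix(".zip")
print(renamed)     # home/project/archive.tar.zip
```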


## Create, Read, Write, Check, Remove Files

### Create

You can use `touch()` to create an empty file, and `mkdir()` to create an empty folder.

`mkdir()` has two important parameters, both `False` by default:

1. `parents`
   1. True = any **missing parents** of this path are created as needed
   2. False = a missing parent raises `FileNotFoundError`
2. `exist_ok`
   1. True = no error is raised if the target already exists and is a directory
   2. False = `FileExistsError` is raised if the target already exists

In [55]:
sub_folder = Path("subfolder/subfolder")

sub_folder.mkdir(parents=True, exist_ok=True)

file_ = sub_folder / Path("test.txt")
file_.touch()

# subfolder
#   - subfolder
#     - test.txt
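
To see each combination of `parents` and `exist_ok` in action, here is a sketch that runs inside a temporary directory so it leaves nothing behind. The folder names `a`, `b` and the file `blocker.txt` are hypothetical.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    nested = Path(tmp) / "a" / "b"

    # parents=False (default): the missing parent "a" is an error.
    try:
        nested.mkdir()
        missing_parent_failed = False
    except FileNotFoundError:
        missing_parent_failed = True

    # parents=True creates "a" and then "a/b".
    nested.mkdir(parents=True)

    # exist_ok=False (default): creating it again is an error.
    try:
        nested.mkdir()
        second_call_failed = False
    except FileExistsError:
        second_call_failed = True

    # exist_ok=True: the existing directory is accepted silently.
    nested.mkdir(exist_ok=True)

    # exist_ok=True does not help if the existing target is a file.
    blocker = nested / "blocker.txt"
    blocker.touch()
    try:
        blocker.mkdir(exist_ok=True)
        file_in_the_way_failed = False
    except FileExistsError:
        file_in_the_way_failed = True

print(missing_parent_failed, second_call_failed, file_in_the_way_failed)
```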

### Write

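We can write to a file with the **instance method** `write_text()`, which replaces any existing content, or with `open()` as a context manager, which accepts the usual file modes such as `"a"` for append.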

In [35]:
file_.write_text("hello world")

with file_.open("a") as f:
    f.write("!!!")

### Read

Reading mirrors writing: use `read_text()` to get the whole file as a string, or `open()` as a context manager.

In [36]:
print(file_.read_text())

with file_.open("r") as f:
    print(f.readlines())

hello world!!!
['hello world!!!']
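
Both `write_text()` and `read_text()` accept an `encoding` argument, and `write_bytes()` / `read_bytes()` exist for binary data. A minimal sketch, assuming a throwaway file named `note.txt` in a temporary directory:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    note = Path(tmp) / "note.txt"

    # write_text() returns the number of characters written.
    written = note.write_text("héllo", encoding="utf-8")

    # Passing the encoding explicitly avoids depending on the platform default.
    text = note.read_text(encoding="utf-8")

    # read_bytes() shows the raw bytes: "é" takes two bytes in UTF-8.
    raw = note.read_bytes()

    # A second write_text() replaces the content instead of appending.
    note.write_text("bye", encoding="utf-8")
    replaced = note.read_text(encoding="utf-8")

print(written, text, len(raw), replaced)  # 5 héllo 6 bye
```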


### Check

- check whether the `Path` is an existing file = `is_file()`
- check whether the `Path` is an existing directory = `is_dir()`
- get the file's metadata (size, timestamps, mode) = `stat()`

In [43]:
print(file_.is_file())
print(file_.is_dir())
print(file_.stat())

True
False
os.stat_result(st_mode=33206, st_ino=1970324836975828, st_dev=4289877840, st_nlink=1, st_uid=0, st_gid=0, st_size=14, st_atime=1612971481, st_mtime=1612971476, st_ctime=1612971425)
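
The `os.stat_result` object is easier to use through its attributes than through its printed form. The sketch below recreates a 14-character file in a temporary directory (the name `sample.txt` is illustrative) and reads its size and modification time.

```python
import tempfile
from datetime import datetime
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text("hello world!!!", encoding="utf-8")

    info = sample.stat()

    # st_size is the file size in bytes.
    size_in_bytes = info.st_size

    # st_mtime is a Unix timestamp; convert it to a local datetime for display.
    modified = datetime.fromtimestamp(info.st_mtime)

print(size_in_bytes)  # 14
print(modified.isoformat())
```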


### Remove

You can use `unlink()` to remove a file, and `rmdir()` to remove an **empty folder**.

If you want to remove a non-empty folder recursively, please check out: https://stackoverflow.com/questions/50186904/pathlib-recursively-remove-directory

In [54]:
# subfolder
#   - subfolder
#     - test.txt

file_.unlink()

Path("subfolder/subfolder").rmdir()
Path("subfolder").rmdir()
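
One common way to remove a non-empty folder is `shutil.rmtree()` from the standard library, which accepts `Path` objects. This is a sketch of that approach rather than a `pathlib` feature, and it deletes everything under the given folder, so it runs inside a temporary directory here.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "subfolder"
    (root / "subfolder").mkdir(parents=True)
    (root / "subfolder" / "test.txt").touch()

    # rmdir() refuses to remove a folder that still has content.
    try:
        root.rmdir()
        rmdir_failed = False
    except OSError:
        rmdir_failed = True

    # shutil.rmtree() removes the folder and everything inside it.
    shutil.rmtree(root)
    still_there = root.exists()

print(rmdir_failed, still_there)  # True False
```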

## Find all files 

There are two ways to find files in a folder and all of its subfolders: traverse the tree recursively with `iterdir()`, or use `glob()` with the recursive `**` pattern.

Suppose our file tree is:

- subfolder
  - test.txt
  - subfolder
    - test.txt

In [69]:
def iter_dir(folder):
    # `folder` avoids shadowing the built-in dir()
    for x in folder.iterdir():
        if x.is_file():
            yield x
        elif x.is_dir():
            # descend into subfolders
            yield from iter_dir(x)

for f in iter_dir(Path("subfolder")):
    print(f)

subfolder\subfolder\test.txt
subfolder\test.txt


In [74]:
for f in Path("subfolder").glob("**/*"):
    if f.is_file():
        print(f)

subfolder\test.txt
subfolder\subfolder\test.txt
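
`rglob(pattern)` is shorthand for `glob("**/" + pattern)`, and a more specific pattern can replace the `is_file()` filter when only one file type is wanted. The sketch below rebuilds the tree above in a temporary directory and adds a hypothetical `notes.md` to show the filtering.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "subfolder"
    (root / "subfolder").mkdir(parents=True)
    (root / "test.txt").touch()
    (root / "subfolder" / "test.txt").touch()
    (root / "subfolder" / "notes.md").touch()

    # rglob() searches the folder and all of its subfolders.
    txt_files = sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*.txt")
    )

    # glob() without "**" only looks at the folder itself.
    direct_children = sorted(p.name for p in root.glob("*.txt"))

print(txt_files)        # ['subfolder/test.txt', 'test.txt']
print(direct_children)  # ['test.txt']
```

The results are sorted because the order in which `glob()` and `iterdir()` yield entries is not guaranteed, as the two different orderings in the outputs above suggest.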


# Reference

* https://myapollo.com.tw/zh-tw/python-pathlib/
* https://docs.python.org/3/library/pathlib.html