##### Execute on Google Colab only

In [None]:
# Install the environment
%pip install git+https://github.com/AwePhD/NotebooksLabsessionImage.git

In [None]:
# Import dataset 
# Can be found at https://www.kaggle.com/vishalsubbiah/pokemon-images-and-types
!rm -rf ./*
!curl -LO https://github.com/AwePhD/NotebooksLabsessionImage/raw/main/pokemon_dataset.zip
!unzip -qq pokemon_dataset.zip
!rm pokemon_dataset.zip

##### Execute anywhere

In [None]:
# Standard import
from pathlib import Path

## The problem with paths
Handling paths can be a huge pain in any Python project. The functions involved often have a complicated syntax that is hard to remember and hard to search for.

Moreover, when several people develop a project, they may work on different operating systems (Linux, macOS, Windows). Each OS has its own way of writing paths, so code that runs perfectly on Linux may fail on Windows. We therefore want a way of browsing paths that does not depend on the OS.
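
As a minimal sketch of this OS difference, the pure path classes of `pathlib` can build the same path under both conventions without touching the file system. The folder and file names below are made up for illustration.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Hypothetical path components, chosen only for this example
parts = ("data", "images", "pikachu.png")

posix_path = PurePosixPath(*parts)      # Linux / macOS convention
windows_path = PureWindowsPath(*parts)  # Windows convention

print(posix_path)    # data/images/pikachu.png
print(windows_path)  # data\images\pikachu.png
```

In everyday code we simply use `Path`, which picks the convention of the OS it runs on.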

The point of this notebook is to provide a few lines that can be reused in any project.
Deeper explanations of the code can be found in the [Python doc](https://docs.python.org/fr/3/library/pathlib.html). I highly recommend reading that page at least once, or reading the excellent [Real Python blog](https://realpython.com/python-pathlib/).


### Get the current working directory

`Path.cwd()` is the basic call for finding out where Python is running. All [relative paths](https://www.computerhope.com/jargon/r/relapath.htm) are relative to the current working directory.

In [None]:
print(f"Path.cwd(): {Path.cwd()}\n")

### Relative and absolute path

The opposite of a relative path is an absolute path. An absolute path starts from the root folder and spells out every step needed to reach a location. It is absolute because the root folder is a fixed reference for every location in the computer's file tree. A relative path, by contrast, starts from a specific folder `.`. The same relative path `some/path` points to different locations depending on where `.` sits in the file tree, whereas the absolute version `/absolute/path/to/some/path` is longer but unambiguous and does not depend on anything else.

- `.` is the local path / current working directory. A path beginning with `.` (or with no leading `/` at all) is a **relative path**, relative to the current position in the computer's tree of folders and files.
- `/` (or `<DRIVE_LETTER>:/` on Windows) is the root path. A path starting with `/` is an **absolute path**. It depends on nothing but the root folder, which is the same for every file on the system.

In [None]:
print(
    f"Path.cwd().is_absolute(): {Path.cwd().is_absolute()}\n"
    f"Path('.').is_absolute(): {Path('.').is_absolute()}\n"
    f"Path.cwd().samefile(Path('.')): {Path.cwd().samefile(Path('.'))}\n"
)
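
To go from one form to the other, here is a short sketch: joining a relative path to the current working directory gives an absolute path, and `relative_to()` strips that prefix again. The `images/pikachu.png` path is only an assumption for the example; the file does not need to exist.

```python
from pathlib import Path

relative = Path("images") / "pikachu.png"  # hypothetical file
absolute = Path.cwd() / relative

print(relative.is_absolute())  # False
print(absolute.is_absolute())  # True

# Remove the working-directory prefix to get the relative path back
print(absolute.relative_to(Path.cwd()) == relative)  # True
```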

### Browsing files in a folder

One of the most useful features is access to the files under a specific folder. This is easily done with the iterator returned by `.iterdir()`, available on any `Path` object whose path is a directory.

In Python, an **iterator** is, roughly speaking, something you can loop over. For more details see the [Python wiki](https://wiki.python.org/moin/Iterator). So `iterdir()` lets us loop over everything contained in a folder, which is pretty useful. Each iteration gives a `Path` object representing one entry of the folder. The great advantage is that we can use the methods of the `Path` object while iterating, which makes the code much more readable.

Actually, as we see just below, `.iterdir()` returns a generator, which is a particular kind of iterator. For more details, see the [Python wiki](https://wiki.python.org/moin/Generators). In short, we can loop over it like any iterator, but its items are not computed until we ask for them: the generator produces each item on the fly during the iteration, whereas a container such as a list stores all its data beforehand.

Let's take a look at what is inside the current working directory.


In [None]:
print(f"type(Path.cwd().iterdir()): {type(Path.cwd().iterdir())}\n")
for pathfile in Path.cwd().iterdir():
  print(pathfile)
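
One practical consequence of this on-the-fly behaviour is that the object returned by `iterdir()` can only be consumed once. The sketch below illustrates it on a temporary folder with two made-up files (`a.txt` and `b.txt` are assumptions for the example, not part of the dataset).

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    for name in ("a.txt", "b.txt"):  # hypothetical files
        (folder / name).touch()

    entries = folder.iterdir()
    # First pass: consumes every item (sorted, as the order is arbitrary)
    first_pass = sorted(p.name for p in entries)
    # Second pass: nothing is left to iterate over
    second_pass = [p.name for p in entries]

print(first_pass)   # ['a.txt', 'b.txt']
print(second_pass)  # []
```

If the entries are needed more than once, either call `iterdir()` again or store the result in a list.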

We can also take a look at the data inside the `images` folder.

One detail: we turn the generator returned by `iterdir()` into a list. Why? Because a generator, like any iterator, has no length and cannot be sliced. So, if we want only the first $n$ elements, we convert it to a list (or a tuple, or any other sequence that supports slicing) and then take the first $n$ elements.

In [None]:
path_images_dir: Path = Path.cwd() / "images"
for pathfile in list(path_images_dir.iterdir())[:5]:
  print(pathfile)
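
An alternative to building the whole list is `itertools.islice` from the standard library, which stops after the first $n$ items. This is only a sketch on a temporary folder filled with ten made-up files; it is not the approach used in the rest of this notebook.

```python
import tempfile
from itertools import islice
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    for i in range(10):  # hypothetical files 0.png ... 9.png
        (folder / f"{i}.png").touch()

    # Take at most 3 entries without storing all of them in a list
    first_three = list(islice(folder.iterdir(), 3))

print(len(first_three))  # 3
```

Which three files come out is not guaranteed, since `iterdir()` yields entries in arbitrary order.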

As we said previously, we can take advantage of the `Path` object returned at each iteration. Here `pathfile` is a `Path` object that designates each file of the folder in turn. From this object we can easily get the file name, its extension, or its name without the extension. The `Path` object makes browsing files and getting information about them very convenient.

In [None]:
for pathfile in list(path_images_dir.iterdir())[:5]:
  print(
      f"name: {pathfile.name}\n"
      f"stem: {pathfile.stem}\n"
      f"suffix: {pathfile.suffix}\n"
  )
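
For reference, here is what these attributes return on two fixed paths. This is a small sketch using `PurePosixPath`, so the output is the same on every OS. The `archive.tar.gz` name is a made-up example showing that `suffix` only keeps the last extension.

```python
from pathlib import PurePosixPath

image = PurePosixPath("images/pikachu.png")
print(image.name)    # pikachu.png
print(image.stem)    # pikachu
print(image.suffix)  # .png

# Hypothetical file with a double extension
archive = PurePosixPath("backup/archive.tar.gz")
print(archive.suffix)    # .gz
print(archive.suffixes)  # ['.tar', '.gz']
print(archive.stem)      # archive.tar
```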

### Browsing up

We have just seen that we can browse down the file tree. We can also browse up, that is, go to the parent directory. As before, the parent directory is a `Path` object that gives us the same handy features.

To browse up, the straightforward `parent` attribute gives the direct parent directory. If we need to go further up, the `parents` attribute is a sequence of all the ancestors: `parents[0]` is the direct parent, `parents[1]` the grandparent, and so on. Writing `parents[2]` avoids `.parent.parent.parent` to go up 3 levels, which is repetitive and error-prone.

In [None]:
# We can navigate in the path by going to parent directories
print(
    f"path: {path_images_dir}\n"
    f"path.parent: {path_images_dir.parent}\n"
    f"path.parents[0]: {path_images_dir.parents[0]}\n"
    f"path.parents[1]: {path_images_dir.parents[1]}\n"
)
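
To make the indexing concrete, here is a sketch on a fixed, made-up path (`/home/user/project/images` is an assumption for the example), again with `PurePosixPath` so the result does not depend on the OS.

```python
from pathlib import PurePosixPath

path = PurePosixPath("/home/user/project/images")  # hypothetical location

# parents[2] goes up three levels, like .parent.parent.parent
print(path.parents[2])                             # /home
print(path.parents[2] == path.parent.parent.parent)  # True

# All ancestors, from the closest one up to the root
print([str(p) for p in path.parents])
# ['/home/user/project', '/home/user', '/home', '/']
```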

### File or folder

With `is_dir()` and `is_file()` we can check whether the `Path` object designates a directory or a file, respectively.

In [None]:
# We can test whether the path is a directory or a file
path_pikachu = path_images_dir / "pikachu.png"
print(
    f"Images path is dir ? {path_images_dir.is_dir()}\n"
    f"Images path is file ? {path_images_dir.is_file()}\n\n"
    f"Pikachu path is dir ? {path_pikachu.is_dir()}\n"
    f"Pikachu path is file ? {path_pikachu.is_file()}\n"
)
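
Note that both methods return `False` when the path does not exist at all, so `exists()` is the call that tells the difference. The sketch below checks this on a temporary folder; the file names are made up for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    existing = folder / "pikachu.png"   # hypothetical file, created below
    existing.touch()
    missing = folder / "missingno.png"  # hypothetical file, never created

    # (exists, is_dir, is_file) for each path
    checks = {
        "folder": (folder.exists(), folder.is_dir(), folder.is_file()),
        "existing": (existing.exists(), existing.is_dir(), existing.is_file()),
        "missing": (missing.exists(), missing.is_dir(), missing.is_file()),
    }

for label, result in checks.items():
    print(label, result)
# folder (True, True, False)
# existing (True, False, True)
# missing (False, False, False)
```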