<p>
  <b>AI Lab: Deep Learning for Computer Vision</b><br>
  <b><a href="https://www.wqu.edu/">WorldQuant University</a></b>
</p>

<div class="alert alert-success" role="alert">
  <p>
    <center><b>Usage Guidelines</b></center>
  </p>
  <p>
    This file is licensed under <a href="https://creativecommons.org/licenses/by-nc-nd/4.0/">Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International</a>.
  </p>
  <p>
    You <b>can</b>:
    <ul>
      <li><span style="color: green">✓</span> Download this file</li>
      <li><span style="color: green">✓</span> Post this file in public repositories</li>
    </ul>
    You <b>must always</b>:
    <ul>
      <li><span style="color: green">✓</span> Give credit to <a href="https://www.wqu.edu/">WorldQuant University</a> for the creation of this file</li>
      <li><span style="color: green">✓</span> Provide a <a href="https://creativecommons.org/licenses/by-nc-nd/4.0/">link to the license</a></li>
    </ul>
    You <b>cannot</b>:
    <ul>
      <li><span style="color: red">✗</span> Create derivatives or adaptations of this file</li>
      <li><span style="color: red">✗</span> Use this file for commercial purposes</li>
    </ul>
  </p>
  <p>
    Failure to follow these guidelines is a violation of your terms of service and could lead to your expulsion from WorldQuant University and the revocation of your certificate.
  </p>
</div>

### Starting to use `pathlib`

The first step is to import the `Path` class from the `pathlib` module with the statement `from pathlib import Path`. This single import gives you access to a powerful set of tools for path manipulation and file system operations.

In [1]:
from pathlib import Path

Let's access the home directory with `Path.home()`. Because the location is looked up by a method call rather than hard-coded as a string, the operation is cross-platform: the same code returns the current user's home directory on any operating system.

In [2]:
home_directory = Path.home()
print(home_directory)

/root


Now it's your turn to try to use the `Path` class to explore the file system.

**Task 3.1.1:** Find the current working directory using the `cwd` method on `Path`.

In [3]:
current_working_directory = Path.cwd()
print(current_working_directory)

/app


### The forward slash (/) operator

The forward slash (`/`) operator in `pathlib` is used for path joining. It combines a `Path` object with other `Path` objects or strings to create a new path. It automatically uses the correct path separator for the current operating system (`/` for Unix-based systems and `\` for Windows). As a result, the same code can run on different operating systems without any rewriting.

Below is an example of creating a variable that combines a `Path` object with several other elements. Note that creating a `Path` object does not touch the file system, so the path does not need to refer to a location that actually exists.

In [None]:
path = Path("folder") / "subfolder" / "file.txt"
print(path)
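
To see the separator handling without switching machines, here is a minimal sketch that uses the `PurePosixPath` and `PureWindowsPath` classes from `pathlib`. These classes are not covered in this lesson; they only build path text and never access the file system, so both can be created on any operating system.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Pure paths only manipulate text; they never touch the file system,
# so both flavours can be created on any operating system.
posix_path = PurePosixPath("folder") / "subfolder" / "file.txt"
windows_path = PureWindowsPath("folder") / "subfolder" / "file.txt"

print(posix_path)    # folder/subfolder/file.txt
print(windows_path)  # folder\subfolder\file.txt
```

The joining code is identical in both cases; only the class decides which separator appears in the result. The plain `Path` class picks the matching flavour for the system it runs on.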

**Task 3.1.2:** Create a variable that defines a folder named "scripts" in the current directory.


In [5]:
current_working_directory = Path.cwd()
scripts_path = current_working_directory / "scripts"
print(scripts_path)

/app/scripts


`Path` objects are different from Python strings (`str`). The addition operator (`+`), which concatenates strings, does not work with `Path` objects. Use the forward slash (`/`) operator to join `Path` objects instead.
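
The difference can be sketched as follows. The variable names are only illustrative, and the `try`/`except` block is there so the cell keeps running after the failed `+`.

```python
from pathlib import Path

base = Path("scripts")

# "/" works for Path / str, and also for str / Path.
joined = base / "tests" / "tests.py"
also_joined = "scripts" / Path("tests") / "tests.py"

# "+" between a Path and a str raises a TypeError.
try:
    broken = base + "/tests"
except TypeError as error:
    plus_error = error

print(joined == also_joined)      # True
print(type(plus_error).__name__)  # TypeError
```

At least one operand of `/` has to be a `Path` object; two plain strings joined with `/` would raise a `TypeError` as well.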

**Task 3.1.3:** Change the operator to work with a Python Path object.

In [7]:
path = Path("scripts") / "tests" / "tests.py"
print(path)

scripts/tests/tests.py


### Absolute vs. Relative Paths

When working with paths, it's important to understand the distinction between absolute and relative paths. An absolute path provides the complete route from the root directory to the specified location, while a relative path is interpreted relative to the current working directory. `pathlib` allows you to create both types: an absolute path might look like `Path('/home/jovyan')` and a relative path could be `Path('03-traffic-in-dhaka/lessons')`. Note that a relative path doesn't start with a root directory indicator (like `/` on Unix-based systems or `C:\` on Windows).

Below is an example of converting a relative path to an absolute path by prepending an absolute path to it.

In [9]:
relative_path = Path("config", "settings")
absolute_path = Path.home() / relative_path
print(absolute_path)

/root/config/settings


**Task 3.1.4:** Convert a relative path to an absolute path based on `Path.cwd()`.

In [10]:
relative_path = Path("scripts", "tests")
absolute_path = Path.cwd() / relative_path
print(absolute_path)

/app/scripts/tests
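
If you are unsure which kind of path you are holding, the `is_absolute` method (not used elsewhere in this lesson) reports it. The sketch below reuses the paths from the task above; neither path needs to exist for the check to work.

```python
from pathlib import Path

relative_path = Path("scripts", "tests")
absolute_path = Path.cwd() / relative_path

# is_absolute only inspects the path itself, not the file system.
print(relative_path.is_absolute())  # False
print(absolute_path.is_absolute())  # True
```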


### Listing directory contents

Instances of the `Path` class have an `iterdir` method that iterates over the contents (files and subdirectories) of a given directory. Calling `iterdir` returns an iterator that yields entries one at a time instead of loading all the results into memory at once.

Below is an example of using the `iterdir` method.

In [11]:
current_working_directory = Path.cwd()
print(current_working_directory.iterdir())

<generator object Path.iterdir at 0x7c882469b4c0>


To see the entries themselves, loop over the iterator with a `for` loop.

In [12]:
current_working_directory = Path.cwd()
for item in current_working_directory.iterdir():
    print(item)

/app/.ipynb_checkpoints
/app/031-fix-my-code.ipynb
/app/Arial.ttf
/app/Test.ipynb


`iterdir()` returns an iterator, not a list. That means printing it shows only the generator object rather than the directory contents, and calling `len` on it raises a `TypeError`.

In [13]:
print(Path.cwd().iterdir())

<generator object Path.iterdir at 0x7c882469b4c0>


If you want to load the data into memory all at once, the iterator can be cast into a list.

In [14]:
# Cast to list before printing
print(list(Path.cwd().iterdir()))

[PosixPath('/app/.ipynb_checkpoints'), PosixPath('/app/031-fix-my-code.ipynb'), PosixPath('/app/Arial.ttf'), PosixPath('/app/Test.ipynb')]
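
One consequence of working with an iterator is that it can only be consumed once. The sketch below, with illustrative variable names, converts the same iterator to a list twice; the second conversion comes back empty because the first one already used up every entry.

```python
from pathlib import Path

directory_iterator = Path.cwd().iterdir()

first_pass = list(directory_iterator)   # consumes every entry
second_pass = list(directory_iterator)  # nothing is left to yield

print(len(first_pass))
print(second_pass)  # []
```

If you need the contents more than once, keep the list, or call `iterdir()` again to get a fresh iterator.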


**Task 3.1.5:** Fix the code to get the number of elements in the current directory with `len`.

In [15]:
current_dir_iter = Path.cwd().iterdir()

# Fix the code
num_elements = len(list(current_dir_iter))

In [16]:
print(num_elements)

4
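
Because each item yielded by `iterdir` is itself a `Path` object, you can call further methods on it inside the loop. As a minimal sketch, the `is_dir` and `is_file` methods (not introduced above) can separate subdirectories from files; the counts depend on whatever is in your current directory.

```python
from pathlib import Path

current_working_directory = Path.cwd()

subdirectories = []
files = []
for item in current_working_directory.iterdir():
    if item.is_dir():
        subdirectories.append(item)
    elif item.is_file():
        files.append(item)

print(f"{len(subdirectories)} subdirectories, {len(files)} files")
```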


### Making directories

The `pathlib` module has a convenient way to create a new directory. Create an instance of the `Path` class that specifies the location of the new directory, then call the `mkdir` method to actually create it.

Below is an example of how this is done. If this cell is run twice, the second run will raise a `FileExistsError` because the directory already exists.

In [17]:
scripts_dir = Path.cwd() / "scripts"
scripts_dir.mkdir()

Adding the keyword argument `exist_ok=True` prevents raising an error if the directory already exists.

In [18]:
scripts_dir = Path.cwd() / "scripts"
scripts_dir.mkdir(exist_ok=True)
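
`mkdir` also fails when the parent of the new directory is missing. The sketch below shows the `parents=True` keyword argument, which creates any missing intermediate directories. It works inside a temporary directory from the standard `tempfile` module (an assumption made here so the example leaves no folders behind), and the folder names are only illustrative.

```python
import tempfile
from pathlib import Path

# A temporary directory keeps this sketch from touching the current directory.
with tempfile.TemporaryDirectory() as temporary_root:
    nested_dir = Path(temporary_root) / "scripts" / "tests" / "unit"

    # Without parents=True, mkdir raises FileNotFoundError because
    # "scripts" and "scripts/tests" do not exist yet.
    try:
        nested_dir.mkdir()
    except FileNotFoundError:
        missing_parent_error = True
    else:
        missing_parent_error = False

    # parents=True creates the whole chain of directories in one call.
    nested_dir.mkdir(parents=True, exist_ok=True)
    nested_dir_created = nested_dir.is_dir()

print(missing_parent_error)  # True
print(nested_dir_created)    # True
```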

**Task 3.1.6:** Make a "tests" folder in the current directory, even if it already exists.

In [19]:
tests_dir = Path.cwd() / "tests"
tests_dir.mkdir(exist_ok=True)

### Glob patterns

Glob patterns allow you to search for files and directories using wildcards, making it handy to find files with certain suffixes. We can use the `glob` method to find all the TrueType font (`.ttf`) files in the current directory.

In [20]:
current_working_directory = Path.cwd()
all_ttf_files = current_working_directory.glob("*.ttf")
for file in all_ttf_files:
    print(file)

/app/Arial.ttf


**Task 3.1.7:** Find all the Jupyter Notebooks (`.ipynb`) in the current directory.

In [21]:
current_working_directory = Path.cwd()
all_ipynb_files = current_working_directory.glob("*.ipynb")
for file in all_ipynb_files:
    print(file)

/app/031-fix-my-code.ipynb
/app/Test.ipynb
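
The pattern `"*.ipynb"` only matches entries directly inside the directory. To search subdirectories too, `Path` objects also offer an `rglob` method, which is not used elsewhere in this lesson. The sketch below builds a small throwaway folder tree with `tempfile` and `touch` (both assumptions made for the demonstration, with hypothetical file names) and compares the two methods.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as temporary_root:
    root = Path(temporary_root)

    # Build a tiny tree: one notebook at the top, one in a subfolder.
    (root / "notebooks").mkdir()
    (root / "top.ipynb").touch()
    (root / "notebooks" / "nested.ipynb").touch()

    # glob looks only in root; rglob also descends into subdirectories.
    top_level = sorted(p.name for p in root.glob("*.ipynb"))
    recursive = sorted(p.name for p in root.rglob("*.ipynb"))

print(top_level)  # ['top.ipynb']
print(recursive)  # ['nested.ipynb', 'top.ipynb']
```

Like `iterdir`, both `glob` and `rglob` return iterators, so the results are sorted into lists here before leaving the temporary directory.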


In summary, the `pathlib` module provides a powerful and intuitive way to work with file systems in Python. The module's object-oriented approach to handling paths simplifies common operations like listing directory contents, creating directories, and searching for files. 

---
This file &#169; 2024 by [WorldQuant University](https://www.wqu.edu/) is licensed under [CC BY-NC-ND 4.0](https://creativecommons.org/licenses/by-nc-nd/4.0/).