# Path Library Module

The Path Library Module ```pathlib``` provides an object-oriented approach to handling file paths.

## Categorize_Identifiers Module

This notebook uses the functions ```dir2```, ```variables``` and ```view``` from the custom module ```categorize_identifiers```, which is found in the same directory as this notebook file. ```dir2``` is a variant of ```dir``` that groups identifiers into a ```dict``` under categories, ```variables``` is an IPython-based variable inspector and ```view``` is used to view a ```Collection``` in more detail:

In [1]:
from categorize_identifiers import dir2, variables, view

## Identifiers

The ```pathlib``` module can be imported using:

In [2]:
import pathlib

Its identifiers can be viewed. The most commonly used identifier is the class ```Path```, which automatically determines the operating system and creates the appropriate path, a ```WindowsPath``` on Windows or a ```PosixPath``` on Linux/Mac. These two classes can only be instantiated on their own operating system. If a Windows path has to be manipulated on a Linux/Mac machine, or a Linux/Mac path on a Windows machine, the ```PureWindowsPath``` or ```PurePosixPath``` class is used instead:

In [3]:
dir2(pathlib)

{'constant': ['EBADF', 'ELOOP', 'ENOENT', 'ENOTDIR'],
 'module': ['fnmatch',
            'functools',
            'io',
            'ntpath',
            'os',
            'posixpath',
            're',
            'sys',
 'method': ['urlquote_from_bytes'],
 'upper_class': ['Path',
                 'PosixPath',
                 'PurePath',
                 'PurePosixPath',
                 'PureWindowsPath',
                 'Sequence',
                 'WindowsPath'],
 'datamodel_attribute': ['__all__',
                         '__builtins__',
                         '__cached__',
                         '__doc__',
                         '__file__',
                         '__loader__',
                         '__name__',
                         '__package__',
                         '__spec__'],
 'internal_attribute': ['_FNMATCH_PREFIX',
                        '_FNMATCH_SLICE',
                        '_FNMATCH_SUFFIX',
                        '_IGNORED_ERRNOS',
            

Since normally only the ```Path``` class is used, it is normally imported directly:

In [4]:
from pathlib import Path

There are a number of attributes and methods. The datamodel method ```__truediv__``` is also defined, which defines the behaviour of the ```/``` operator:

In [5]:
dir2(Path, object, unique_only=True)

{'attribute': ['anchor',
               'drive',
               'name',
               'parent',
               'parents',
               'parts',
               'root',
               'stem',
               'suffix',
               'suffixes'],
 'method': ['absolute',
            'as_posix',
            'as_uri',
            'chmod',
            'cwd',
            'exists',
            'expanduser',
            'glob',
            'group',
            'hardlink_to',
            'home',
            'is_absolute',
            'is_block_device',
            'is_char_device',
            'is_dir',
            'is_fifo',
            'is_file',
            'is_junction',
            'is_mount',
            'is_relative_to',
            'is_reserved',
            'is_socket',
            'is_symlink',
            'iterdir',
            'joinpath',
            'lchmod',
            'lstat',
            'match',
            'mkdir',
            'open',
            'owner',
            'read_by

## Windows Path

Recall that the Windows operating system uses ```\``` as the separator between folders and files. In Python ```\``` is also used to insert an escape character into a ```str``` instance. To insert a literal ```\```, the ```str``` instance must therefore contain the escape sequence ```\\```:

In [6]:
'C:\\Windows'

'C:\\Windows'

And if this is printed:

In [7]:
print('C:\\Windows')

C:\Windows


Manually converting each ```\``` to ```\\``` can be tedious for a long file path, so the ```str``` can instead be prefixed with ```r``` to make a raw string. A raw string does not process escape sequences, so each ```\``` represents a literal backslash:

In [8]:
r'C:\Windows'

'C:\\Windows'

In [9]:
print(r'C:\Windows')

C:\Windows
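
As a minimal sketch, the equivalence of the escaped and raw spellings can be checked directly. The variable names ```escaped``` and ```raw``` and the longer path are only for illustration:

```python
# Both spellings describe exactly the same characters.
escaped = 'C:\\Windows\\System32'
raw = r'C:\Windows\System32'
same_string = escaped == raw

# In a raw string the backslash is not processed as an escape character.
raw_length = len(r'\n')     # two characters: a backslash and the letter n
escaped_length = len('\n')  # one character: a newline

print(same_string, raw_length, escaped_length)
```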


An instance of the ```Path``` class can be instantiated from one of the raw strings:

In [10]:
windows_folder = Path(r'C:\Windows')

Notice that instead of a ```Path``` instance, a ```WindowsPath``` instance is created automatically. ```WindowsPath``` is a child class of ```Path``` and therefore has consistent identifiers:

In [11]:
variables(['windows_folder'])

| Instance Name | Type | Size/Shape | Value |
|---|---|---|---|
| windows_folder | WindowsPath |  | C:\Windows |


The formal representation uses the ```WindowsPath``` alternative separator ```/```:

In [12]:
repr(windows_folder)

"WindowsPath('C:/Windows')"

However the informal representation uses the ```WindowsPath``` default separator:

In [13]:
str(windows_folder)

'C:\\Windows'

These control the behaviour shown in a cell output and when printed respectively:

In [14]:
windows_folder

WindowsPath('C:/Windows')

In [15]:
print(windows_folder)

C:\Windows
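
The outputs above were produced on a Windows machine. As a sketch that should behave the same on any operating system, the ```PureWindowsPath``` class (listed by ```dir2(pathlib)```) can be used instead, since it only manipulates the text of the path. The name ```pure_windows_folder``` is an assumption made for this example:

```python
from pathlib import PureWindowsPath

# A pure path never touches the file system, so it can be created anywhere.
pure_windows_folder = PureWindowsPath(r'C:\Windows')

formal = repr(pure_windows_folder)            # alternative separator /
informal = str(pure_windows_folder)           # default separator \
posix_style = pure_windows_folder.as_posix()  # str with / separators

print(formal)
print(informal)
print(posix_style)
```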


Recall that the datamodel method ```__truediv__``` (*dunder truediv*) defines the behaviour of the ```/``` operator. It is used to join a directory or file name onto the file path:

In [16]:
system_32 = windows_folder / 'System32'

In [17]:
variables(['system_32'])

| Instance Name | Type | Size/Shape | Value |
|---|---|---|---|
| system_32 | WindowsPath |  | C:\Windows\System32 |


And the ```notepad.exe``` application is found here:

In [18]:
app = windows_folder / 'System32' / 'notepad.exe'

In [19]:
variables(['app'])

| Instance Name | Type | Size/Shape | Value |
|---|---|---|---|
| app | WindowsPath |  | C:\Windows\System32\notepad.exe |


The ```joinpath``` method carries out a similar function:

In [20]:
app2 = windows_folder.joinpath('System32', 'notepad.exe')

In [21]:
variables(['app2'])

| Instance Name | Type | Size/Shape | Value |
|---|---|---|---|
| app2 | WindowsPath |  | C:\Windows\System32\notepad.exe |
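
The equivalence of the ```/``` operator and ```joinpath``` can be sketched with ```PureWindowsPath``` so that it runs on any operating system. The variable names are illustrative only:

```python
from pathlib import PureWindowsPath

base = PureWindowsPath(r'C:\Windows')

# The / operator and joinpath build the same path.
with_operator = base / 'System32' / 'notepad.exe'
with_method = base.joinpath('System32', 'notepad.exe')
same_path = with_operator == with_method

# A str on the left-hand side also works, via the reflected
# datamodel method __rtruediv__.
reflected = 'C:/Windows' / PureWindowsPath('System32')

print(same_path)
print(reflected)
```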


This ```Path``` instance has a number of attributes, for example the app ```name```:

In [22]:
app.name

'notepad.exe'

Which includes the app ```stem``` and ```suffix```:

In [23]:
app.stem

'notepad'

In [24]:
app.suffix

'.exe'

The ```suffixes``` attribute returns a ```list``` of all the suffixes, which is useful for file names that have more than one extension:

In [25]:
app.suffixes

['.exe']
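
The difference between ```suffix``` and ```suffixes``` only becomes visible when a file name has more than one extension. A small sketch, using a hypothetical archive path that is not part of this notebook:

```python
from pathlib import PureWindowsPath

# Hypothetical file name with two extensions.
archive = PureWindowsPath(r'C:\Backups\archive.tar.gz')

archive_name = archive.name          # the final component
archive_suffixes = archive.suffixes  # every extension, in order
archive_suffix = archive.suffix      # only the last extension
archive_stem = archive.stem          # the name without the last extension

print(archive_name, archive_suffixes, archive_suffix, archive_stem)
```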

The ```parent``` directory:

In [26]:
app.parent

WindowsPath('C:/Windows/System32')

The ```parents``` attribute is a sequence of all the ancestor directories:

In [27]:
app.parents

<WindowsPath.parents>

This is typically indexed:

In [28]:
app.parents[0]

WindowsPath('C:/Windows/System32')

In [29]:
app.parents[1]

WindowsPath('C:/Windows')

In [30]:
app.parents[2]

WindowsPath('C:/')

In [31]:
tuple(app.parents)

(WindowsPath('C:/Windows/System32'),
 WindowsPath('C:/Windows'),
 WindowsPath('C:/'))

The ```anchor``` includes the ```drive``` and the ```root```:

In [32]:
app.anchor

'C:\\'

In [33]:
app.drive

'C:'

In [34]:
app.root

'\\'

The ```is_absolute``` method will check to see if the ```Path``` instance corresponds to an absolute path:

In [35]:
app.is_absolute()

True

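The ```is_relative_to``` method checks whether the ```Path``` instance lies under the supplied ```Path``` instance, in other words whether the supplied path is the path itself or one of its ancestors: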

In [36]:
windows_folder

WindowsPath('C:/Windows')

In [37]:
system_32

WindowsPath('C:/Windows/System32')

In [38]:
app.is_relative_to(windows_folder)

True

The ```relative_to``` method returns a relative ```Path``` instance that leads from the supplied directory to the file:

In [39]:
app.relative_to(windows_folder)

WindowsPath('System32/notepad.exe')
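
When the path does not lie under the supplied directory, ```relative_to``` raises a ```ValueError```. One possible pattern, sketched here with ```PureWindowsPath``` and a hypothetical helper ```safe_relative_to```, is to check with ```is_relative_to``` first (this method needs Python 3.9 or later):

```python
from pathlib import PureWindowsPath

pure_app = PureWindowsPath(r'C:\Windows\System32\notepad.exe')
pure_windows = PureWindowsPath(r'C:\Windows')
pure_users = PureWindowsPath(r'C:\Users')


def safe_relative_to(path, ancestor):
    """Return path relative to ancestor, or None if it is not under it."""
    if path.is_relative_to(ancestor):
        return path.relative_to(ancestor)
    return None


inside = safe_relative_to(pure_app, pure_windows)
outside = safe_relative_to(pure_app, pure_users)

# Without the check, a path outside the directory raises ValueError.
try:
    pure_app.relative_to(pure_users)
    error_raised = False
except ValueError:
    error_raised = True

print(inside, outside, error_raised)
```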

The ```is_reserved``` method will check if the instance corresponds to a path name that is reserved by the operating system such as:

* CON - Console
* PRN - Printer
* AUX - Auxiliary device
* NUL - Null device
* COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9 - Serial ports
* LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, LPT9 - Parallel ports

In [40]:
Path('CON').is_reserved()

True

The ```match``` method tests the ```Path``` instance against a glob-style pattern (using wildcards such as ```*```, not a regular expression) and returns a ```bool```, for example:

In [41]:
app.match('*.exe')

True
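
A few more patterns help to show how ```match``` behaves. In this sketch, which uses ```PureWindowsPath``` so that it runs on any operating system, a relative pattern is compared against the right-hand end of the path and the comparison is case-insensitive because Windows paths are:

```python
from pathlib import PureWindowsPath

pure_app = PureWindowsPath(r'C:\Windows\System32\notepad.exe')

results = {
    '*.exe': pure_app.match('*.exe'),                    # any .exe file
    '*.EXE': pure_app.match('*.EXE'),                    # case is ignored
    'note???.exe': pure_app.match('note???.exe'),        # ? is one character
    'System32/*.exe': pure_app.match('System32/*.exe'),  # last two components
    'Windows/*.exe': pure_app.match('Windows/*.exe'),    # parent is not Windows
}

for pattern, matched in results.items():
    print(pattern, matched)
```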

The ```with_stem```, ```with_name``` and ```with_suffix``` methods each return a new ```Path``` instance. ```with_stem``` changes the ```stem``` while keeping the file extension, ```with_name``` replaces the whole ```name``` (stem and suffix) and ```with_suffix``` changes only the file extension:

In [42]:
app.with_stem('explorer')

WindowsPath('C:/Windows/System32/explorer.exe')

In [43]:
app.with_name('debug')

WindowsPath('C:/Windows/System32/debug')

In [44]:
app.with_suffix('.txt')

WindowsPath('C:/Windows/System32/notepad.txt')
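
Since these methods return new instances, the original path is left unchanged. A short sketch with ```PureWindowsPath``` (the name ```pure_app``` is illustrative and ```with_stem``` needs Python 3.9 or later):

```python
from pathlib import PureWindowsPath

pure_app = PureWindowsPath(r'C:\Windows\System32\notepad.exe')

new_name = pure_app.with_name('explorer.exe')  # replaces stem and suffix
new_stem = pure_app.with_stem('explorer')      # keeps the .exe suffix
no_suffix = pure_app.with_suffix('')           # an empty str removes the suffix

# The original instance is not modified.
print(pure_app)
print(new_name, new_stem, no_suffix)
```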

The attribute ```parts``` will break down components of a path into a ```tuple```:

In [45]:
app.parts

('C:\\', 'Windows', 'System32', 'notepad.exe')

The ```Path``` class has two alternative constructors (class methods), ```cwd``` and ```home```, which create a ```Path``` instance corresponding to the current working directory and the user profile respectively:

In [46]:
current_working_directory = Path.cwd()

In [47]:
user_profile = Path.home()

In [48]:
variables(['current_working_directory', 'user_profile'])

| Instance Name | Type | Size/Shape | Value |
|---|---|---|---|
| current_working_directory | WindowsPath |  | C:\Users\phili\OneDrive\Documents\GitHub\python-notebooks\pathlib_module |
| user_profile | WindowsPath |  | C:\Users\phili |


To get to Documents, the ```/``` operator can be used:

In [49]:
documents = user_profile / 'Documents'

In [50]:
variables(['documents'])

| Instance Name | Type | Size/Shape | Value |
|---|---|---|---|
| documents | WindowsPath |  | C:\Users\phili\Documents |


The ```expanduser``` instance method is normally called from a ```Path``` instance beginning with ```~```. The ```~``` is expanded to the user profile:

In [51]:
relative_documents = Path('~/Documents')

In [52]:
relative_documents.expanduser()

WindowsPath('C:/Users/phili/Documents')

Normally this is used directly:

In [53]:
documents = Path('~/Documents').expanduser()

In [54]:
variables(['documents'])

| Instance Name | Type | Size/Shape | Value |
|---|---|---|---|
| documents | WindowsPath |  | C:\Users\phili\Documents |
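
The two routes to the Documents folder shown above should give equal ```Path``` instances. A minimal check, assuming the home directory can be determined on the machine running it:

```python
from pathlib import Path

expanded = Path('~/Documents').expanduser()
joined = Path.home() / 'Documents'
same_folder = expanded == joined

# Only a leading ~ is expanded, elsewhere it is treated as a normal name.
not_expanded = Path('data/~/Documents').expanduser()

print(same_folder)
print(not_expanded)
```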


A ```Path``` instance for the text file ```text.txt``` in the current working directory can be created using:

In [55]:
text = Path.cwd() / 'text.txt'

The ```exists``` method can be used to check whether or not this file exists:

In [56]:
text.exists()

False

The ```touch``` method can be used to create a file in the location specified in the ```Path``` instance:

In [57]:
if not text.exists():
    text.touch()

The ```mkdir``` method can be used to make a new directory in the location specified in the ```Path``` instance:

In [58]:
directory = Path.cwd() / 'new_folder'

In [59]:
if not directory.exists():
    directory.mkdir()

And a new file can also be created here:

In [60]:
text2 = directory / 'text2.txt'

In [61]:
if not text2.exists():
    text2.touch()
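
Both ```mkdir``` and ```touch``` accept an ```exist_ok``` parameter, and ```mkdir``` also accepts ```parents```, which can replace the explicit ```exists``` checks. The sketch below works in a temporary directory created with the standard ```tempfile``` module so that nothing is left behind. The folder and file names are assumptions for the example:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as temporary_name:
    base = Path(temporary_name)
    nested = base / 'new_folder' / 'sub_folder'

    # parents=True creates new_folder as well as sub_folder,
    # exist_ok=True makes a repeated call harmless.
    nested.mkdir(parents=True, exist_ok=True)
    nested.mkdir(parents=True, exist_ok=True)

    text_file = nested / 'text.txt'
    text_file.touch(exist_ok=True)

    created = (nested.is_dir(), text_file.is_file())

print(created)
```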

The ```iterdir``` method of a root ```Path``` instance can be used to create an iterator. The iterator cycles through all the files and subdirectories in the root folder and when ```next``` is used on the iterator the corresponding ```Path``` instance displays:

In [62]:
directory_iterator = Path.cwd().iterdir()

In [63]:
directory_iterator

<generator object Path.iterdir at 0x000002716C9F3780>

In [64]:
next(directory_iterator)

WindowsPath('C:/Users/phili/OneDrive/Documents/GitHub/python-notebooks/pathlib_module/.ipynb_checkpoints')

In [65]:
next(directory_iterator)

WindowsPath('C:/Users/phili/OneDrive/Documents/GitHub/python-notebooks/pathlib_module/categorize_identifiers.py')

In [66]:
next(directory_iterator)

WindowsPath('C:/Users/phili/OneDrive/Documents/GitHub/python-notebooks/pathlib_module/new_folder')

The iterator can also be cast into a ```tuple```:

In [67]:
tuple(Path.cwd().iterdir())

(WindowsPath('C:/Users/phili/OneDrive/Documents/GitHub/python-notebooks/pathlib_module/.ipynb_checkpoints'),
 WindowsPath('C:/Users/phili/OneDrive/Documents/GitHub/python-notebooks/pathlib_module/categorize_identifiers.py'),
 WindowsPath('C:/Users/phili/OneDrive/Documents/GitHub/python-notebooks/pathlib_module/new_folder'),
 WindowsPath('C:/Users/phili/OneDrive/Documents/GitHub/python-notebooks/pathlib_module/notebook.ipynb'),
 WindowsPath('C:/Users/phili/OneDrive/Documents/GitHub/python-notebooks/pathlib_module/text.txt'),
 WindowsPath('C:/Users/phili/OneDrive/Documents/GitHub/python-notebooks/pathlib_module/__pycache__'))
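
The ```Path``` instances returned by ```iterdir``` can be filtered with ```is_file``` and ```is_dir```, and the ```glob``` method can be used to select by pattern. The following sketch builds a small temporary directory with hypothetical file names and sorts the results, because the order returned by ```iterdir``` is not guaranteed:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as temporary_name:
    base = Path(temporary_name)
    (base / 'new_folder').mkdir()
    for file_name in ('text.txt', 'notes.txt', 'script.py'):
        (base / file_name).touch()

    # Separate files from folders.
    file_names = sorted(p.name for p in base.iterdir() if p.is_file())
    folder_names = sorted(p.name for p in base.iterdir() if p.is_dir())

    # Select only the text files with a glob pattern.
    text_names = sorted(p.name for p in base.glob('*.txt'))

print(file_names)
print(folder_names)
print(text_names)
```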

The ```rename``` method can be used to rename a file:

In [68]:
new_text = Path.cwd() / 'new_text.txt'

In [69]:
text.rename(new_text)

WindowsPath('C:/Users/phili/OneDrive/Documents/GitHub/python-notebooks/pathlib_module/new_text.txt')

The ```open``` method of the ```Path``` class is essentially a wrapper around the ```open``` function of the ```io``` module. The method does not require specification of the file name as it is taken from the ```Path``` instance.

The file object it returns defines the datamodel identifiers ```__enter__``` (*dunder enter*) and ```__exit__``` (*dunder exit*), meaning it can be used in a ```with``` code block:

In [70]:
with new_text.open(mode='w', encoding='utf-8', newline='\r\n') as file:
    file.writelines(['Hello World!\n', 'Hello'])

In [71]:
with new_text.open(mode='r', encoding='utf-8', newline='\r\n') as file:
    string = file.read()

string

'Hello World!\r\nHello'

The ```read_text``` and ```write_text``` methods make reading and writing text files simpler. Each method opens the file, carries out the read or write and closes the file in a single call, so no ```with``` code block is needed. The ```read_bytes``` and ```write_bytes``` methods are the ```bytes``` counterparts:

In [72]:
new_text.read_text(encoding='utf-8')

'Hello World!\nHello'

Writing will overwrite the old contents of the file:

In [73]:
new_text.write_text('Bye World!\nBye', encoding='utf-8', newline='\r\n')

14

In [74]:
new_text.read_text(encoding='utf-8')

'Bye World!\nBye'

There is no equivalent append method; however, the method ```open``` can be used with ```mode='a'``` to append text.
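
A minimal sketch of appending, using a temporary directory and a hypothetical file name ```log.txt``` so that the notebook's own files are not touched:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as temporary_name:
    log_file = Path(temporary_name) / 'log.txt'

    # write_text creates the file, or overwrites it if it exists.
    log_file.write_text('Hello World!\n', encoding='utf-8')

    # mode='a' adds to the end of the file instead of overwriting it.
    with log_file.open(mode='a', encoding='utf-8') as file:
        file.write('Hello again!\n')

    contents = log_file.read_text(encoding='utf-8')

print(contents)
```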

Another file ```text3.txt``` can be created using:

In [75]:
text3 = Path.cwd() / 'text3.txt'

In [76]:
text3.touch()

In [77]:
text3.write_text('Hello World!\nHello')

18

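The ```replace``` method moves a file to the supplied target path, overwriting the target if it already exists. Here ```new_text.txt``` is moved to ```text3.txt```, so the old contents of ```text3.txt``` are lost and ```new_text.txt``` no longer exists: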

In [78]:
new_text.replace(text3)

WindowsPath('C:/Users/phili/OneDrive/Documents/GitHub/python-notebooks/pathlib_module/text3.txt')

In [79]:
text3.read_text(encoding='utf-8')

'Bye World!\nBye'

In [80]:
tuple(Path.cwd().iterdir())

(WindowsPath('C:/Users/phili/OneDrive/Documents/GitHub/python-notebooks/pathlib_module/.ipynb_checkpoints'),
 WindowsPath('C:/Users/phili/OneDrive/Documents/GitHub/python-notebooks/pathlib_module/categorize_identifiers.py'),
 WindowsPath('C:/Users/phili/OneDrive/Documents/GitHub/python-notebooks/pathlib_module/new_folder'),
 WindowsPath('C:/Users/phili/OneDrive/Documents/GitHub/python-notebooks/pathlib_module/notebook.ipynb'),
 WindowsPath('C:/Users/phili/OneDrive/Documents/GitHub/python-notebooks/pathlib_module/text3.txt'),
 WindowsPath('C:/Users/phili/OneDrive/Documents/GitHub/python-notebooks/pathlib_module/__pycache__'))
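
The effect of ```replace``` on both files can be sketched in a temporary directory. The names ```source.txt``` and ```target.txt``` are assumptions for this example:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as temporary_name:
    base = Path(temporary_name)
    source = base / 'source.txt'
    target = base / 'target.txt'
    source.write_text('new contents', encoding='utf-8')
    target.write_text('old contents', encoding='utf-8')

    # The source is moved onto the target, which is overwritten.
    result = source.replace(target)

    source_still_exists = source.exists()
    target_contents = target.read_text(encoding='utf-8')
    returned_target = result == target

print(source_still_exists, target_contents, returned_target)
```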

The ```unlink``` method can be used to delete the file that the instance links to:

In [81]:
if text3.exists():
    text3.unlink()

In [82]:
tuple(Path.cwd().iterdir())

(WindowsPath('C:/Users/phili/OneDrive/Documents/GitHub/python-notebooks/pathlib_module/.ipynb_checkpoints'),
 WindowsPath('C:/Users/phili/OneDrive/Documents/GitHub/python-notebooks/pathlib_module/categorize_identifiers.py'),
 WindowsPath('C:/Users/phili/OneDrive/Documents/GitHub/python-notebooks/pathlib_module/new_folder'),
 WindowsPath('C:/Users/phili/OneDrive/Documents/GitHub/python-notebooks/pathlib_module/notebook.ipynb'),
 WindowsPath('C:/Users/phili/OneDrive/Documents/GitHub/python-notebooks/pathlib_module/__pycache__'))

The ```rmdir``` method can be used to remove a directory. The directory must be empty in order to be removed:

In [83]:
if text2.exists():
    text2.unlink()

In [84]:
directory.rmdir()
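
If ```rmdir``` is called on a directory that still contains a file, an ```OSError``` is raised. The sketch below shows this in a temporary directory, with folder and file names chosen only for illustration:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as temporary_name:
    folder = Path(temporary_name) / 'new_folder'
    folder.mkdir()
    inner_file = folder / 'text2.txt'
    inner_file.touch()

    # The directory is not empty, so rmdir fails.
    try:
        folder.rmdir()
        removed_while_full = True
    except OSError:
        removed_while_full = False

    # After the file is deleted the directory can be removed.
    inner_file.unlink()
    folder.rmdir()
    exists_after = folder.exists()

print(removed_while_full, exists_after)
```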

[Return to Python Tutorials](../readme.md)