# pathlib module

In the previous notebook, the ```os``` module was examined. It contains the submodule ```os.path```, which is used for working with file paths. That submodule follows a functional style: most of its functions accept and return strings. The ```pathlib``` module instead takes an object-oriented approach, representing a path as an instance of a class.

The ```pathlib``` module can be imported using:

In [1]:
import pathlib

To view the identifiers the function ```print_identifier_group``` from the custom ```helper_module``` will be imported:

In [2]:
from helper_module import print_identifier_group

The ```pathlib``` module has a number of classes that are in ```PascalCase```:

In [3]:
print_identifier_group(pathlib, kind='error_class')

['Path', 'PosixPath', 'PurePath', 'PurePosixPath', 'PureWindowsPath', 'WindowsPath']


In ```builtins``` there was a distinction between error classes, which use ```PascalCase```, and fundamental classes such as ```str```, ```int```, ```float```, ```tuple```, ```list``` and ```dict```, which are in lower case. Classes defined outside ```builtins```, in the standard library and in third-party packages, normally use ```PascalCase```, like these in ```pathlib```.

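The most commonly used class in ```pathlib``` is ```Path```. Instantiating it creates an instance of the class appropriate for the operating system: ```WindowsPath``` on Windows or ```PosixPath``` on Linux/Mac. A specific flavour is normally selected directly only when a Windows path has to be handled on a Linux/Mac machine, or a Linux/Mac path on a Windows machine. In that case the ```PureWindowsPath``` or ```PurePosixPath``` class is used, because ```WindowsPath``` and ```PosixPath``` can only be instantiated on their own operating system.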

```Path``` is a child class of ```PurePath```. ```PurePath``` provides purely computational operations on a path, such as joining it and splitting it into components, without ever accessing the file system. ```Path``` adds support for user-specific and relative locations, as well as input and output operations. Both of these classes can be imported and the identifiers they share examined:

In [4]:
from pathlib import Path, PurePath

In [5]:
print_identifier_group(Path, kind='datamodel_attribute', second=PurePath, show_only_intersection_identifiers=True)

['__doc__', '__module__', '__slots__']


In [6]:
print_identifier_group(Path, kind='datamodel_method', second=PurePath, show_only_intersection_identifiers=True)

['__bytes__', '__class__', '__delattr__', '__dir__', '__eq__', '__format__', '__fspath__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rtruediv__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__truediv__']


In [7]:
print_identifier_group(Path, kind='attribute', second=PurePath, show_only_intersection_identifiers=True)

['_cached_cparts', '_cparts', '_drv', '_hash', '_parts', '_pparts', '_root', '_str', 'anchor', 'drive', 'name', 'parent', 'parents', 'parts', 'root', 'stem', 'suffix', 'suffixes']


In [8]:
print_identifier_group(Path, kind='method', second=PurePath, show_only_intersection_identifiers=True)

['_format_parsed_parts', '_from_parsed_parts', '_from_parts', '_make_child', '_parse_args', 'as_posix', 'as_uri', 'is_absolute', 'is_relative_to', 'is_reserved', 'joinpath', 'match', 'relative_to', 'with_name', 'with_stem', 'with_suffix']
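
As a minimal sketch of what these shared identifiers allow, the pure classes can be instantiated on any operating system because they never touch the file system. The example assumes only the standard library, and the POSIX path ```/usr/bin/python3``` is an illustrative value rather than one taken from this notebook:

```python
from pathlib import PurePosixPath, PureWindowsPath

# Pure paths only manipulate text, so both flavours work on any machine.
win = PureWindowsPath(r'C:\Windows') / 'System32' / 'notepad.exe'
posix = PurePosixPath('/usr') / 'bin' / 'python3'

# The shared attributes behave the same way for each flavour.
print(win.parts, win.drive, win.name)
print(posix.parts, posix.root, posix.name)
```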


The Windows operating system uses ```\``` as a separator between folders and files. In a Python string ```\``` is the escape character, so to insert a literal ```\``` the string must contain ```\\```:

In [9]:
'C:\\Windows'

'C:\\Windows'

And if this is printed:

In [10]:
print('C:\\Windows')

C:\Windows


Manually converting each ```\``` to ```\\``` is tedious for a long file path, so the string can instead be prefixed with ```r``` to make a raw string. In a raw string ```\``` represents a literal backslash and escape sequences are not processed:

In [11]:
r'C:\Windows'

'C:\\Windows'

In [12]:
print(r'C:\Windows')

C:\Windows
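
The difference matters most when a folder name happens to start with a letter that forms an escape sequence. The sketch below uses the hypothetical path ```C:\new_folder``` to show how the ```\n``` is silently read as a newline unless the backslash is doubled or a raw string is used:

```python
# Hypothetical path chosen because '\n' is a valid escape sequence.
escaped = 'C:\new_folder'    # '\n' becomes a single newline character
doubled = 'C:\\new_folder'   # '\\' becomes a single backslash
raw = r'C:\new_folder'       # the backslash is kept literally

print(len(escaped), len(doubled), len(raw))
print('\n' in escaped, raw == doubled)
```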


An instance of the ```Path``` class can be instantiated from a raw string:

In [13]:
windows_folder = Path(r'C:\Windows')

Notice that in the formal representation of the ```windows_folder``` instance the ```\``` is changed to ```/```, which is recognised as an alternative folder and file separator on Windows and is the default on Linux.

In [14]:
windows_folder

WindowsPath('C:/Windows')

The datamodel method ```__truediv__``` (*dunder truediv*) is defined. Recall that this method defines the behaviour of the ```/``` operator, which is used here to append a directory or file name to the path:

In [15]:
system_32 = windows_folder / 'System32'

In [16]:
system_32

WindowsPath('C:/Windows/System32')

And the ```notepad.exe``` application is found here:

In [17]:
app = windows_folder / 'System32' / 'notepad.exe'

In [18]:
app

WindowsPath('C:/Windows/System32/notepad.exe')

The ```joinpath``` method of the ```Path``` class carries out a similar function:

In [19]:
app2 = windows_folder.joinpath('System32', 'notepad.exe')

In [20]:
app2

WindowsPath('C:/Windows/System32/notepad.exe')
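
Two details of joining are easy to miss, and the following sketch illustrates them with ```PurePosixPath``` so that it behaves identically on every operating system. The paths and the name ```tool``` are illustrative assumptions:

```python
from pathlib import PurePosixPath

base = PurePosixPath('/usr/local')

# The / operator and joinpath build the same path.
joined = base / 'bin' / 'tool'
same = base.joinpath('bin', 'tool')

# __rtruediv__ lets a plain string appear on the left-hand side.
reflected = '/usr' / PurePosixPath('local')

# An absolute right-hand operand replaces everything before it.
replaced = base / '/etc'

print(joined, same, reflected, replaced)
```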

This ```Path``` instance has a number of attributes, for example the app ```name```:

In [21]:
app.name

'notepad.exe'

The ```name``` is made up of the ```stem``` and the ```suffix```:

In [22]:
app.stem

'notepad'

In [23]:
app.suffix

'.exe'

The ```suffixes``` attribute returns every suffix in the name as a ```list```:

In [24]:
app.suffixes

['.exe']
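
The list form is mainly useful for names with more than one extension. As a small sketch, using the hypothetical file name ```archive.tar.gz```:

```python
from pathlib import PurePosixPath

archive = PurePosixPath('archive.tar.gz')  # hypothetical file name

# suffix is only the last extension; suffixes lists all of them.
print(archive.suffix)
print(archive.suffixes)
# stem removes only the last extension.
print(archive.stem)
```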

The ```parent``` directory:

In [25]:
app.parent

WindowsPath('C:/Windows/System32')

The ```parents``` attribute is an immutable sequence of all the ancestor directories, which is typically indexed:

In [26]:
app.parents

<WindowsPath.parents>

In [27]:
app.parents[0]

WindowsPath('C:/Windows/System32')

In [28]:
app.parents[1]

WindowsPath('C:/Windows')

In [29]:
app.parents[2]

WindowsPath('C:/')

In [30]:
tuple(app.parents)

(WindowsPath('C:/Windows/System32'),
 WindowsPath('C:/Windows'),
 WindowsPath('C:/'))

The ```anchor``` combines the ```drive``` and the ```root```:

In [31]:
app.anchor

'C:\\'

In [32]:
app.drive

'C:'

In [33]:
app.root

'\\'

The ```is_absolute``` method of the ```Path``` class will check if the instance corresponds to an absolute path:

In [34]:
app.is_absolute()

True

In [35]:
Path().joinpath('C:/', 'Windows', 'System32')

WindowsPath('C:/Windows/System32')
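
For contrast, the sketch below shows some paths that are not absolute. It uses the pure classes so the result does not depend on the machine running it. On Windows a path needs both a drive and a root to count as absolute, whereas on POSIX a root is enough:

```python
from pathlib import PurePosixPath, PureWindowsPath

checks = {
    'relative': PureWindowsPath('System32/notepad.exe').is_absolute(),
    'drive_only': PureWindowsPath('C:Windows').is_absolute(),   # no root
    'root_only': PureWindowsPath('/Windows').is_absolute(),     # no drive
    'drive_and_root': PureWindowsPath('C:/Windows').is_absolute(),
    'posix_root': PurePosixPath('/Windows').is_absolute(),
}
print(checks)
```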

The ```is_relative_to``` method of the ```Path``` class checks whether the instance is located under the supplied ```Path``` instance:

In [36]:
windows_folder

WindowsPath('C:/Windows')

In [37]:
system_32

WindowsPath('C:/Windows/System32')

In [38]:
app.is_relative_to(windows_folder)

True

If it is, the ```relative_to``` method returns a relative ```Path``` instance that leads from the supplied directory to the original path:

In [39]:
app.relative_to(windows_folder)

WindowsPath('System32/notepad.exe')
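
When the path is not under the supplied directory, ```relative_to``` raises a ```ValueError``` instead. A minimal sketch of guarding against this follows, where ```relative_or_none``` is a hypothetical helper name and pure paths are used so it runs on any operating system:

```python
from pathlib import PureWindowsPath

app = PureWindowsPath('C:/Windows/System32/notepad.exe')
windows_folder = PureWindowsPath('C:/Windows')
users_folder = PureWindowsPath('C:/Users')


def relative_or_none(path, base):
    """Return path relative to base, or None when path is not under base."""
    if path.is_relative_to(base):  # available from Python 3.9
        return path.relative_to(base)
    return None


inside = relative_or_none(app, windows_folder)
outside = relative_or_none(app, users_folder)

# Without the guard, the mismatch surfaces as an exception.
try:
    app.relative_to(users_folder)
    raised = False
except ValueError:
    raised = True

print(inside, outside, raised)
```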

The ```is_reserved``` method of the ```Path``` class will check if the instance corresponds to a path name that is reserved by Windows, such as:

* CON - Console
* PRN - Printer
* AUX - Auxiliary device
* NUL - Null device
* COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9 - Serial ports
* LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, LPT9 - Parallel ports

In [40]:
Path('CON').is_reserved()

True

The ```match``` method of the ```Path``` class can be used to see if the ```Path``` instance matches a pattern, for example:

In [41]:
app.match('*.exe')

True
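
A few further cases help show how the pattern is applied. In this sketch a relative pattern is matched against the right-hand end of the path, and matching follows the case rules of the path flavour, so it is case-insensitive for Windows paths and case-sensitive for POSIX paths:

```python
from pathlib import PurePosixPath, PureWindowsPath

app = PureWindowsPath('C:/Windows/System32/notepad.exe')

results = {
    'extension': app.match('*.exe'),
    'with_parent': app.match('System32/*.exe'),
    'wrong_parent': app.match('Windows/*.exe'),   # * does not span folders
    'upper_case_windows': app.match('*.EXE'),
    'upper_case_posix': PurePosixPath('notepad.exe').match('*.EXE'),
}
print(results)
```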

In [42]:
app.match?

Signature: app.match(path_pattern)
Docstring: Return True if this path matches the given pattern.
File:      c:\users\philip\miniconda3\envs\vscode\lib\pathlib.py
Type:      method

The ```with_stem```, ```with_name``` and ```with_suffix``` methods of the ```Path``` class each return a new ```Path``` instance. ```with_stem``` changes the ```stem``` while keeping the file extension, ```with_name``` replaces the whole ```name``` (here giving a subfolder without a file extension) and ```with_suffix``` changes the file extension:

In [43]:
app.with_stem('explorer')

WindowsPath('C:/Windows/System32/explorer.exe')

In [44]:
app.with_name('debug')

WindowsPath('C:/Windows/System32/debug')

In [45]:
app.with_suffix('.txt')

WindowsPath('C:/Windows/System32/notepad.txt')
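
These methods combine naturally with the ```stem``` and ```suffix``` attributes. As a sketch, the hypothetical helper ```backup_name``` below builds a sibling file name, and ```with_suffix``` is given an empty string to drop the extension:

```python
from pathlib import PureWindowsPath

app = PureWindowsPath('C:/Windows/System32/notepad.exe')


def backup_name(path):
    """Return a sibling path with '_backup' inserted before the suffix."""
    return path.with_name(f'{path.stem}_backup{path.suffix}')


backup = backup_name(app)
no_extension = app.with_suffix('')  # an empty suffix removes the extension

print(backup)
print(no_extension)
# The original instance is unchanged because paths are immutable.
print(app)
```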

The attribute ```parts``` breaks the path down into a ```tuple``` of its components:

In [46]:
app.parts

('C:\\', 'Windows', 'System32', 'notepad.exe')

The ```Path``` class has additional support for user specific relative paths, in addition to input and output operations:

In [47]:
print_identifier_group(Path, kind='datamodel_attribute', second=PurePath, show_unique_identifiers=True)

[]


In [48]:
print_identifier_group(Path, kind='datamodel_method', second=PurePath, show_unique_identifiers=True)

['__enter__', '__exit__']


In [49]:
print_identifier_group(Path, kind='attribute', second=PurePath, show_unique_identifiers=True)

[]


In [50]:
print_identifier_group(Path, kind='method', second=PurePath, show_unique_identifiers=True)

['_make_child_relpath', '_scandir', 'absolute', 'chmod', 'cwd', 'exists', 'expanduser', 'glob', 'group', 'hardlink_to', 'home', 'is_block_device', 'is_char_device', 'is_dir', 'is_fifo', 'is_file', 'is_mount', 'is_socket', 'is_symlink', 'iterdir', 'lchmod', 'link_to', 'lstat', 'mkdir', 'open', 'owner', 'read_bytes', 'read_text', 'readlink', 'rename', 'replace', 'resolve', 'rglob', 'rmdir', 'samefile', 'stat', 'symlink_to', 'touch', 'unlink', 'write_bytes', 'write_text']


```Path``` has two class methods, ```cwd``` and ```home```, which are used as alternative constructors to create a ```Path``` instance corresponding to the current working directory and the user profile respectively:

In [51]:
current_working_directory = Path.cwd()

In [52]:
current_working_directory

WindowsPath('c:/Users/Philip/Documents/GitHub/python-notebooks/pathlib_module')

In [53]:
user_profile = Path.home()

In [54]:
user_profile

WindowsPath('C:/Users/Philip')

To get to Documents, the ```/``` operator can be used:

In [55]:
documents = user_profile / 'Documents'

In [56]:
documents

WindowsPath('C:/Users/Philip/Documents')

The ```expanduser``` instance method is normally called from a ```Path``` instance beginning with ```~```. The ```~``` is expanded to the user profile:

In [57]:
relative_documents = Path('~/Documents')

In [58]:
relative_documents

WindowsPath('~/Documents')

In [59]:
relative_documents.expanduser()

WindowsPath('C:/Users/Philip/Documents')

Normally this is used directly:

In [60]:
documents = Path('~/Documents').expanduser()

In [61]:
documents

WindowsPath('C:/Users/Philip/Documents')
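
As a brief sketch of the rule being applied, only a leading ```~``` is expanded, and the result agrees with the ```home``` class method. The relative path ```Documents/~notes``` is a hypothetical example of a path that is left alone:

```python
from pathlib import Path

# A leading ~ expands to the same location that Path.home() returns.
expanded = Path('~').expanduser()

# A ~ that is not at the start of the path is left untouched.
untouched = Path('Documents/~notes').expanduser()

print(expanded == Path.home())
print(untouched)
```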

A ```Path``` instance to a ```text``` file in the current working directory can be created using:

In [62]:
text = Path.cwd() / 'text.txt'

The ```exists``` method of the ```Path``` class can be used to check whether or not this file exists:

In [63]:
text.exists()

False

The ```touch``` method of the ```Path``` class can be used to create a file in the location specified in the ```Path``` instance:

In [99]:
if not text.exists():
    text.touch()

The ```mkdir``` method of the ```Path``` class can be used to make a new directory in the location specified in the ```Path``` instance:

In [65]:
directory = Path.cwd() / 'new_folder'

In [66]:
if not directory.exists():
    directory.mkdir()

And a new file can also be created here:

In [67]:
text2 = directory / 'text2.txt'

In [92]:
if not text2.exists():
    text2.touch()
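
The explicit ```exists``` checks can also be folded into the calls themselves. The sketch below works inside a temporary directory, which is an assumption made so the example cleans up after itself, and uses the ```parents``` and ```exist_ok``` parameters of ```mkdir```; the folder name ```sub_folder``` is hypothetical:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    nested = Path(tmp) / 'new_folder' / 'sub_folder'

    # parents=True creates the missing intermediate directory as well.
    nested.mkdir(parents=True, exist_ok=True)
    # exist_ok=True makes a repeated call harmless instead of an error.
    nested.mkdir(parents=True, exist_ok=True)

    text2 = nested / 'text2.txt'
    text2.touch()  # touch defaults to exist_ok=True
    text2.touch()

    created = (nested.is_dir(), text2.is_file())

print(created)
```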

The ```iterdir``` method of a ```Path``` instance that points to a directory creates an iterator. The iterator cycles through all the files and subdirectories directly inside that directory, and each call to ```next``` returns the corresponding ```Path``` instance:

In [69]:
directory_iterator = Path.cwd().iterdir()

In [70]:
directory_iterator

<generator object Path.iterdir at 0x000001F03FE08E40>

In [71]:
next(directory_iterator)

WindowsPath('c:/Users/Philip/Documents/GitHub/python-notebooks/pathlib_module/helper_module.py')

In [72]:
next(directory_iterator)

WindowsPath('c:/Users/Philip/Documents/GitHub/python-notebooks/pathlib_module/new_folder')

In [73]:
next(directory_iterator)

WindowsPath('c:/Users/Philip/Documents/GitHub/python-notebooks/pathlib_module/notebook.ipynb')

The iterator can also be cast into a ```tuple```:

In [74]:
tuple(Path.cwd().iterdir())

(WindowsPath('c:/Users/Philip/Documents/GitHub/python-notebooks/pathlib_module/helper_module.py'),
 WindowsPath('c:/Users/Philip/Documents/GitHub/python-notebooks/pathlib_module/new_folder'),
 WindowsPath('c:/Users/Philip/Documents/GitHub/python-notebooks/pathlib_module/notebook.ipynb'),
 WindowsPath('c:/Users/Philip/Documents/GitHub/python-notebooks/pathlib_module/text.txt'),
 WindowsPath('c:/Users/Philip/Documents/GitHub/python-notebooks/pathlib_module/__pycache__'))
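
The ```Path``` instances produced by the iterator can be filtered with ```is_file``` and ```is_dir```, and the ```glob``` and ```rglob``` methods listed earlier select entries by pattern. The following sketch rebuilds a small directory tree in a temporary directory, which is an assumption made to keep it self-contained:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / 'new_folder').mkdir()
    (root / 'text.txt').touch()
    (root / 'new_folder' / 'text2.txt').touch()

    # iterdir only looks one level down; sort because order is arbitrary.
    files = sorted(p.name for p in root.iterdir() if p.is_file())
    folders = sorted(p.name for p in root.iterdir() if p.is_dir())

    # glob matches within the directory, rglob also searches subdirectories.
    top_level_txt = sorted(p.name for p in root.glob('*.txt'))
    all_txt = sorted(p.name for p in root.rglob('*.txt'))

print(files, folders, top_level_txt, all_txt)
```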

The ```rename``` method of the ```Path``` class can be used to rename a file:

In [75]:
new_text = Path.cwd() / 'new_text.txt'

In [76]:
text.rename(new_text)

WindowsPath('c:/Users/Philip/Documents/GitHub/python-notebooks/pathlib_module/new_text.txt')

The ```open``` method of the ```Path``` class is essentially a wrapper around the ```open``` function of the ```io``` module. The method does not require specification of the file name, as it is taken from the ```Path``` instance.

The file object that ```open``` returns defines the datamodel identifiers ```__enter__``` (*dunder enter*) and ```__exit__``` (*dunder exit*), meaning it can be used in a ```with``` code block, which closes the file automatically when the block ends.

In [77]:
with new_text.open(mode='w', encoding='utf-8', newline='\r\n') as file:
    file.writelines(['Hello World!\n', 'Hello'])

In [78]:
with new_text.open(mode='r', encoding='utf-8', newline='\r\n') as file:
    string = file.read()

string

'Hello World!\r\nHello'

The ```read_text``` and ```write_text``` methods of the ```Path``` instance make reading and writing text files simpler. Each call opens the file, carries out the operation and closes the file again, so no ```with``` code block is needed. The ```read_bytes``` and ```write_bytes``` methods are the ```bytes``` counterparts:

In [79]:
new_text.read_text(encoding='utf-8')

'Hello World!\nHello'

Writing will overwrite the old contents of the file:

In [80]:
new_text.write_text('Bye World!\nBye', encoding='utf-8', newline='\r\n')

14

In [81]:
new_text.read_text(encoding='utf-8')

'Bye World!\nBye'

There is no equivalent append method in the ```Path``` class; however, the ```open``` method can be used with ```mode='a'``` to append text.
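
A minimal sketch of appending follows, written against a temporary directory (an assumption, so that no file in the working directory is modified):

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    new_text = Path(tmp) / 'new_text.txt'
    new_text.write_text('Bye World!\n', encoding='utf-8')

    # mode='a' positions writes at the end, keeping the existing contents.
    with new_text.open(mode='a', encoding='utf-8') as file:
        file.write('Bye')

    contents = new_text.read_text(encoding='utf-8')

print(repr(contents))
```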

Another file ```text3.txt``` can be created using:

In [82]:
text3 = Path.cwd() / 'text3.txt'

In [83]:
text3.touch()

In [84]:
text3.write_text('Hello World!\nHello')

18

The ```replace``` method of the ```Path``` class renames the file to the target path, overwriting the target if it already exists:

In [85]:
new_text.replace(text3)

WindowsPath('c:/Users/Philip/Documents/GitHub/python-notebooks/pathlib_module/text3.txt')

In [86]:
text3.read_text(encoding='utf-8')

'Bye World!\nBye'

In [87]:
tuple(Path.cwd().iterdir())

(WindowsPath('c:/Users/Philip/Documents/GitHub/python-notebooks/pathlib_module/helper_module.py'),
 WindowsPath('c:/Users/Philip/Documents/GitHub/python-notebooks/pathlib_module/new_folder'),
 WindowsPath('c:/Users/Philip/Documents/GitHub/python-notebooks/pathlib_module/notebook.ipynb'),
 WindowsPath('c:/Users/Philip/Documents/GitHub/python-notebooks/pathlib_module/text3.txt'),
 WindowsPath('c:/Users/Philip/Documents/GitHub/python-notebooks/pathlib_module/__pycache__'))
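
The same sequence can be sketched in a temporary directory (an assumption made so the example is repeatable) to confirm that the source file disappears and the target takes on its contents:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / 'new_text.txt'
    target = Path(tmp) / 'text3.txt'
    source.write_text('Bye World!', encoding='utf-8')
    target.write_text('Hello World!', encoding='utf-8')

    # replace moves source onto target, overwriting the old target.
    result = source.replace(target)

    outcome = (
        source.exists(),
        target.read_text(encoding='utf-8'),
        result == target,  # the returned Path points at the new location
    )

print(outcome)
```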

The ```unlink``` method of the ```Path``` class can be used to delete the file that the instance links to:

In [88]:
if text3.exists():
    text3.unlink()

In [89]:
tuple(Path.cwd().iterdir())

(WindowsPath('c:/Users/Philip/Documents/GitHub/python-notebooks/pathlib_module/helper_module.py'),
 WindowsPath('c:/Users/Philip/Documents/GitHub/python-notebooks/pathlib_module/new_folder'),
 WindowsPath('c:/Users/Philip/Documents/GitHub/python-notebooks/pathlib_module/notebook.ipynb'),
 WindowsPath('c:/Users/Philip/Documents/GitHub/python-notebooks/pathlib_module/__pycache__'))

The ```rmdir``` method of the ```Path``` class can be used to remove a directory. The directory must be empty in order to be removed:

In [97]:
if text2.exists():
    text2.unlink()

In [98]:
directory.rmdir()
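
To illustrate the empty-directory requirement, the sketch below attempts the removal too early and catches the resulting ```OSError```. It runs in a temporary directory (an assumption) and also uses the ```missing_ok``` parameter of ```unlink```, available from Python 3.8, as an alternative to the ```exists``` check:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    directory = Path(tmp) / 'new_folder'
    directory.mkdir()
    text2 = directory / 'text2.txt'
    text2.touch()

    # rmdir refuses to remove a directory that still contains a file.
    try:
        directory.rmdir()
        removed_while_full = True
    except OSError:
        removed_while_full = False

    text2.unlink()
    text2.unlink(missing_ok=True)  # no error although the file is gone
    directory.rmdir()              # succeeds now the directory is empty

    exists_after = directory.exists()

print(removed_while_full, exists_after)
```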