# A primer on Pathlib

[`pathlib`](https://docs.python.org/3/library/pathlib.html) is one of the most amazing modules in Python's standard library. It handles file and directory management, it's well designed, and it's convenient for day-to-day work. It also has a highly intuitive and beautiful API. Wait, "beautiful API"?

![simpsons nerd](https://media1.tenor.com/images/32bc92e5ab305b8c5ad3edac00de47e9/tenor.gif?itemid=7884166)

`pathlib` is part of the standard library **only** in Python 3 (since version 3.4). If you want to use it in Python 2, you need to install a third-party backport, which is not official and might have some inconsistencies with the official Python 3 module.

The Pathlib API is inspired by the way paths are normally written in \*nix operating systems, which makes it intuitive and familiar for developers. In this notebook, we'll explore different situations where Pathlib shows all its power and strength.

### File exists? Is it a file or a directory?

How can you check if a file exists in Python? The usual solution ([most popular StackOverflow answer](https://stackoverflow.com/questions/82831/how-to-check-whether-a-file-exists)) looks something like this:

In [1]:
import os
os.path.exists('alice.txt')

True

In [2]:
os.path.isfile('alice.txt')

True

In [3]:
os.path.isdir('alice.txt')

False

Pathlib solution:

In [4]:
from pathlib import Path
path = Path('alice.txt')

In [5]:
path.exists()

True

In [6]:
path.is_file()

True

In [7]:
path.is_dir()

False

In [8]:
path

PosixPath('alice.txt')

Much cleaner, more intuitive and object-oriented, isn't it? With `pathlib`, you just need to create a `Path` object. `Path` works as an "abstract interface": when you instantiate it, you get the concrete implementation (`PosixPath` or `WindowsPath`) that matches your operating system:

![pathlib hierarchy](https://docs.python.org/3/_images/pathlib-inheritance.png)
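To make the hierarchy a bit more concrete, here is a minimal sketch. It assumes nothing beyond the standard library; the `alice.txt` and `data` names are reused from this notebook purely for illustration. The "pure" classes only manipulate path strings and never touch the disk, so both flavours can be created on any OS.

```python
from pathlib import Path, PurePosixPath, PureWindowsPath

# Path() returns the concrete class for the OS the code is running on.
local = Path('alice.txt')
print(type(local).__name__)  # PosixPath on Linux/macOS, WindowsPath on Windows

# Pure paths do no I/O, so either flavour can be used anywhere.
posix = PurePosixPath('data') / 'alice.txt'
windows = PureWindowsPath('data') / 'alice.txt'
print(posix)    # data/alice.txt
print(windows)  # data\alice.txt
```

Pure paths are handy when you need to reason about paths for a system other than the one you're on.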

### Concatenating Paths

Joining "parts" of paths can be tedious because you need to be aware of the path syntax of each operating system. For example, Linux and Mac use the forward slash `/` as the path separator, while Windows uses the backslash `\`. Example paths:

* Linux/Mac: `data/subdir/0005.txt`
* Windows: `data\subdir\0005.txt`

So, if you have those paths expressed in parts, it's hard to combine them to read the content:

In [9]:
BASE_DIR = 'data'
SUBDIR = 'subdir'
FILE_NAME = '0005.txt'

Using the `os` module:

In [10]:
os.path.join(BASE_DIR, SUBDIR)

'data/subdir'

In [11]:
os.path.join(BASE_DIR, SUBDIR, FILE_NAME)  # join accepts any number of parts

'data/subdir/0005.txt'

Using the pathlib module:

In [12]:
BASE_PATH = Path(BASE_DIR)

In [13]:
BASE_PATH / SUBDIR / FILE_NAME

PosixPath('data/subdir/0005.txt')
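As a small sketch of going from parts to actually reading the content: `joinpath` is the method form of the same join, and `read_text` / `write_text` read and write the file directly. The temporary directory and the `'hello'` content below are assumptions made only so the example can run anywhere.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp) / 'data'

    # Two equivalent ways of joining the same parts.
    target = base.joinpath('subdir', '0005.txt')
    same_target = base / 'subdir' / '0005.txt'

    # Create the parent directories, write a file, then read it back.
    target.parent.mkdir(parents=True)
    target.write_text('hello')
    content = same_target.read_text()

print(target == same_target)  # True
print(content)                # hello
```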

In this case, the `/` operator (normally division) is overloaded through the `__truediv__` method and given a different "meaning" by Pathlib's API. Isn't `/` intuitive to join paths? And given a full path, it's much easier to split it into each part:

In [14]:
p1 = Path('/home/rmotr/code/python/main.py')
p2 = Path('C:/home/rmotr/code/python/main.py')

In [15]:
p1.parts

('/', 'home', 'rmotr', 'code', 'python', 'main.py')

In [16]:
p2.parts

('C:', 'home', 'rmotr', 'code', 'python', 'main.py')

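Plus a few other convenient attributes (note that on a POSIX system `C:` is just an ordinary path component, so `p2` is treated as a relative path with no root):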

In [17]:
p1.root

'/'

In [18]:
p2.root

''

In [31]:
p1.parents[0]

PosixPath('/home/rmotr/code/python')

In [32]:
p2.parents[0]

PosixPath('C:/home/rmotr/code/python')

In [33]:
p1.parent

PosixPath('/home/rmotr/code/python')

In [34]:
p2.parent

PosixPath('C:/home/rmotr/code/python')
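The outputs above come from a POSIX machine. As a hedged illustration of how the same string would be interpreted with Windows rules, `PureWindowsPath` can parse it on any OS (the variable names here are just for this sketch):

```python
from pathlib import PurePosixPath, PureWindowsPath

raw = 'C:/home/rmotr/code/python/main.py'
as_posix = PurePosixPath(raw)
as_windows = PureWindowsPath(raw)

# POSIX rules: no drive concept, so the path is relative.
print(as_posix.parts[0])       # C:
print(as_posix.is_absolute())  # False

# Windows rules: 'C:' is the drive and the path is absolute.
print(as_windows.drive)          # C:
print(as_windows.root)           # \
print(as_windows.parts[0])       # C:\
print(as_windows.parent)         # C:\home\rmotr\code\python
print(as_windows.is_absolute())  # True
```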

### Extracting suffixes

How can you get the suffix (extension) and the name of the file at `/home/rmotr/code/python/main.py`? The usual `os` solution:

In [35]:
os.path.splitext('/home/rmotr/code/python/main.py')

('/home/rmotr/code/python/main', '.py')

`os.path.splitext` isn't really intuitive, and the first element still has the whole directory attached. What about Pathlib?

In [36]:
p = Path('/home/rmotr/code/python/main.py')

In [37]:
p.suffix

'.py'

In [38]:
p.stem

'main'
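A few related attributes and methods are worth a quick sketch as well. The `backup.tar.gz` and `app.py` names are hypothetical, chosen only to show the behaviour with multiple suffixes and renaming; `PurePosixPath` is used so the output is the same on every OS.

```python
from pathlib import PurePosixPath

script = PurePosixPath('/home/rmotr/code/python/main.py')
print(script.name)                 # main.py
print(script.with_suffix('.txt'))  # /home/rmotr/code/python/main.txt
print(script.with_name('app.py'))  # /home/rmotr/code/python/app.py

# Hypothetical file with two extensions.
archive = PurePosixPath('backup.tar.gz')
print(archive.suffix)    # .gz (only the last one)
print(archive.suffixes)  # ['.tar', '.gz']
print(archive.stem)      # backup.tar
```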

### Getting current dir and Home dir

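This is one of the [proposed solutions](https://stackoverflow.com/questions/5137497/find-current-directory-and-files-directory) to get the directory of the running script using the `os` module:

```python
# __file__ is the path of the current script (it is not defined in an interactive session)
os.path.dirname(os.path.realpath(__file__))
```
Not pretty. What about the other one, which returns the current working directory: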

In [39]:
os.getcwd()

'/app'

A little better. What about Pathlib?

In [40]:
Path.cwd()  # cwd: current working dir

PosixPath('/app')

What about the home directory?

In [41]:
from os.path import expanduser
expanduser("~")

'/root'

`expanduser` 😕

Pathlib:

In [42]:
Path.home()

PosixPath('/root')

![feels good](https://user-images.githubusercontent.com/872296/37731886-3856a696-2d22-11e8-9fd4-05be4b6672df.png)
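Two related helpers, sketched under the assumption that a home directory can be determined on your system: `expanduser` also exists as a method on `Path` objects, and `resolve` turns a relative path into an absolute one. The `notes.txt` file name is hypothetical and the file doesn't need to exist.

```python
from pathlib import Path

# '~' is expanded to the same directory that Path.home() returns.
notes = Path('~/notes.txt').expanduser()  # hypothetical file name
print(notes == Path.home() / 'notes.txt')  # True

# resolve() makes a path absolute and follows symlinks.
here = Path('.').resolve()
print(here.is_absolute())  # True
```

Inside a script, `Path(__file__).resolve().parent` plays the same role as the `os.path.dirname(os.path.realpath(...))` call shown earlier.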


### Creating directories

Let's create a new subdir using `os`:

In [43]:
new_dir_name = 'new-dir-1'
new_dir = os.path.join(BASE_DIR, SUBDIR, new_dir_name)
os.makedirs(new_dir)

In [44]:
os.path.exists(new_dir)

True

But with Pathlib, as expected, it's a lot more intuitive:

In [45]:
p = Path(BASE_DIR) / SUBDIR / 'new-dir-2'

In [46]:
p.mkdir()  # raises FileExistsError if the directory already exists

In [47]:
p.exists()

True
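One difference to keep in mind: `os.makedirs` creates missing parent directories, while a plain `mkdir()` does not. A minimal sketch of the equivalent Pathlib call follows; it runs inside a temporary directory, and `new-dir-3` is a hypothetical name used only here.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    nested = Path(tmp) / 'data' / 'subdir' / 'new-dir-3'  # hypothetical name

    # Plain mkdir() fails because 'data/subdir' doesn't exist yet.
    try:
        nested.mkdir()
    except FileNotFoundError:
        missing_parent = True
    else:
        missing_parent = False

    # parents=True creates intermediate directories (like os.makedirs);
    # exist_ok=True makes the call safe to repeat.
    nested.mkdir(parents=True, exist_ok=True)
    nested.mkdir(parents=True, exist_ok=True)
    created = nested.is_dir()

print(missing_parent)  # True
print(created)         # True
```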

### Searching for things

This is the last strike for `os`. The `os` module alone has no one-liner for what we'll do next; without Pathlib you'd have to reach for the separate `glob` module or loop over `os.walk`. So, we'll just skip the `os` versions.

Looking for all the `txt` files in a directory:

In [50]:
sorted(Path('data').glob('*.txt'))

[PosixPath('data/0001.txt'),
 PosixPath('data/0002.txt'),
 PosixPath('data/0003.txt'),
 PosixPath('data/0004.txt'),
 PosixPath('data/0005.txt')]

Now, looking for files **recursively** (hold my beer):

In [51]:
sorted(Path('data').glob('**/*.txt'))

[PosixPath('data/0001.txt'),
 PosixPath('data/0002.txt'),
 PosixPath('data/0003.txt'),
 PosixPath('data/0004.txt'),
 PosixPath('data/0005.txt'),
 PosixPath('data/subdir/0005.txt'),
 PosixPath('data/subdir/0006.txt')]

It also found stuff under `subdir`!

![mind blown](http://www.reactiongifs.com/wp-content/uploads/2013/10/tim-and-eric-mind-blown.gif)
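For recursive searches there is also a shorthand, `rglob`, which behaves like `glob` with `**/` prepended to the pattern. The sketch below builds a small throwaway tree (the file names are assumptions loosely mirroring the `data` folder above) and compares it with the `glob` module equivalent.

```python
import glob
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Build a tiny tree: data/0001.txt, data/subdir/0006.txt, data/subdir/notes.md
    root = Path(tmp) / 'data'
    (root / 'subdir').mkdir(parents=True)
    for rel in ('0001.txt', 'subdir/0006.txt', 'subdir/notes.md'):
        (root / rel).touch()

    # rglob('*.txt') and glob('**/*.txt') find the same files.
    with_rglob = sorted(p.relative_to(root).as_posix() for p in root.rglob('*.txt'))
    with_glob = sorted(p.relative_to(root).as_posix() for p in root.glob('**/*.txt'))

    # The glob module version, for comparison.
    pattern = os.path.join(str(root), '**', '*.txt')
    with_glob_module = sorted(
        Path(name).relative_to(root).as_posix()
        for name in glob.glob(pattern, recursive=True)
    )

print(with_rglob)  # ['0001.txt', 'subdir/0006.txt']
```

Both `glob` and `rglob` return lazy generators, which is why the examples wrap them in `sorted(...)` to get a stable, printable list.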

You've now seen how useful and clean the Pathlib API is. We encourage you to check other useful methods and practice with it: https://docs.python.org/3/library/pathlib.html