# `pathlib` primer

* Available from Python 3.4
* [pathlib doc](https://docs.python.org/3/library/pathlib.html)

## Imports

In [1]:
from pathlib import Path

## Basic use cases

### Get current working directory or home directory

In [2]:
Path.cwd()

PosixPath('/home/pawjast/Documents/my github/medium/code')

In [3]:
Path.home()

PosixPath('/home/pawjast')

### Turn a location into a `Path`

In [4]:
# Using forward slash
my_path = Path.cwd() / "datasets" / "data_4"
my_path

PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4')

In [5]:
# Using `joinpath` method
my_path = Path.cwd().joinpath("datasets", "data_4")
my_path

PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4')
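
Both spellings build the same path object. The sketch below checks this with `PurePosixPath`, which never touches the filesystem, so the result is the same on any machine. The base directory `/tmp/project` is a made-up value used only for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical base directory, used only for illustration
base = PurePosixPath("/tmp/project")

# Join with the `/` operator
with_slash = base / "datasets" / "data_4"

# Join with the `joinpath` method
with_joinpath = base.joinpath("datasets", "data_4")

# Both approaches produce equal path objects
print(with_slash == with_joinpath)  # True
print(with_slash)                   # /tmp/project/datasets/data_4
```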

## Current directory - list files and directories

In [6]:
# Using list comprehension and `iterdir()` method
[path for path in my_path.iterdir()]

[PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/more_data'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/hi.md'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/data1.csv'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/data2.csv'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/abc.txt'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/cde.txt'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/temp')]

In [7]:
# Using `.glob()` method
list(my_path.glob("*"))

[PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/more_data'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/hi.md'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/data1.csv'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/data2.csv'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/abc.txt'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/cde.txt'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/temp')]

## Current directory and sub directories - list files and directories

In [8]:
list(my_path.rglob("*"))

[PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/more_data'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/hi.md'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/data1.csv'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/data2.csv'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/abc.txt'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/cde.txt'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/temp'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/more_data/abc.csv'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/more_data/go_deeper'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/more_data/whatshere.md'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/more_data/hello.html'),
 PosixPath('/h

## Relative paths

In [9]:
[path.relative_to(my_path) for path in my_path.rglob("*")]

[PosixPath('more_data'),
 PosixPath('hi.md'),
 PosixPath('data1.csv'),
 PosixPath('data2.csv'),
 PosixPath('abc.txt'),
 PosixPath('cde.txt'),
 PosixPath('temp'),
 PosixPath('more_data/abc.csv'),
 PosixPath('more_data/go_deeper'),
 PosixPath('more_data/whatshere.md'),
 PosixPath('more_data/hello.html'),
 PosixPath('more_data/yyy.txt'),
 PosixPath('more_data/data3.csv'),
 PosixPath('more_data/xxx.txt'),
 PosixPath('more_data/go_deeper/abcd.py'),
 PosixPath('more_data/go_deeper/surprise.csv'),
 PosixPath('temp/dead_end')]
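
`relative_to()` only works when the path really sits under the given base; otherwise it raises `ValueError`. A minimal sketch of both cases, using `PurePosixPath` and hypothetical locations (`/tmp/project/datasets`, `/tmp/elsewhere`) so that nothing has to exist on disk:

```python
from pathlib import PurePosixPath

# Hypothetical locations, used only for illustration
root = PurePosixPath("/tmp/project/datasets")
inside = root / "more_data" / "abc.csv"
outside = PurePosixPath("/tmp/elsewhere/abc.csv")

# A path under `root` is reduced to the part below it
print(inside.relative_to(root))  # more_data/abc.csv

# A path that is not under `root` raises ValueError
try:
    outside.relative_to(root)
except ValueError as err:
    print(f"Not under root: {err}")
```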

## Various filters

### List all files only

In [10]:
# In current folder - way 1
[path for path in my_path.iterdir() if path.is_file()]

[PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/hi.md'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/data1.csv'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/data2.csv'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/abc.txt'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/cde.txt')]

In [11]:
# In current folder - way 2
[path for path in my_path.glob("*") if path.is_file()]

[PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/hi.md'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/data1.csv'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/data2.csv'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/abc.txt'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/cde.txt')]

In [12]:
# In current folder and sub directories
[path for path in my_path.rglob("*") if path.is_file()]

[PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/hi.md'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/data1.csv'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/data2.csv'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/abc.txt'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/cde.txt'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/more_data/abc.csv'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/more_data/whatshere.md'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/more_data/hello.html'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/more_data/yyy.txt'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/more_data/data3.csv'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/more_data/xxx.

### List all directories only

In [13]:
# In current folder - way 1
[path for path in my_path.iterdir() if path.is_dir()]

[PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/more_data'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/temp')]

In [14]:
# In current folder - way 2
[path for path in my_path.glob("*") if path.is_dir()]

[PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/more_data'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/temp')]

In [15]:
# In current folder and sub directories
[path for path in my_path.rglob("*") if path.is_dir()]

[PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/more_data'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/temp'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/more_data/go_deeper'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/temp/dead_end')]

In [16]:
# Same as above, sorted for a predictable order
sorted(path for path in my_path.rglob("*") if path.is_dir())

[PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/more_data'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/more_data/go_deeper'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/temp'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/temp/dead_end')]

### List specific file extensions only

**Note:** this works for both `glob()` and `rglob()`

In [17]:
list(my_path.glob("*.txt"))

[PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/abc.txt'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/cde.txt')]

In [18]:
list(my_path.rglob("*.csv"))

[PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/data1.csv'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/data2.csv'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/more_data/abc.csv'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/more_data/data3.csv'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/more_data/go_deeper/surprise.csv')]
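
The difference between the two calls is easiest to see on a small throwaway tree. The sketch below builds one in a temporary directory (the file and folder names are assumptions made up for the example) and also shows that `rglob("*.csv")` behaves like `glob("**/*.csv")`.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Hypothetical layout: one CSV at the top level, one in a sub directory
    (root / "top.csv").touch()
    (root / "sub").mkdir()
    (root / "sub" / "nested.csv").touch()

    # `glob` looks only in `root`; `rglob` also descends into sub directories
    shallow = sorted(p.name for p in root.glob("*.csv"))
    deep = sorted(p.name for p in root.rglob("*.csv"))

    # A recursive `**` pattern passed to `glob` gives the same result as `rglob`
    deep_via_glob = sorted(p.name for p in root.glob("**/*.csv"))

print(shallow)                # ['top.csv']
print(deep)                   # ['nested.csv', 'top.csv']
print(deep == deep_via_glob)  # True
```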

### List files that match a pattern

In [19]:
list(my_path.rglob("*abc*"))

[PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/abc.txt'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/more_data/abc.csv'),
 PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/more_data/go_deeper/abcd.py')]
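
A name pattern matches directories as well as files, so combine it with `is_file()` when only files are wanted. A minimal sketch on a temporary tree; the names `abc.txt`, `abc_dir` and `notes.md` are assumptions made up for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Hypothetical layout: a file and a directory that both contain "abc"
    (root / "abc.txt").touch()
    (root / "abc_dir").mkdir()
    (root / "abc_dir" / "notes.md").touch()

    # The pattern alone returns the file and the directory
    all_matches = sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*abc*")
    )

    # Adding `is_file()` keeps regular files only
    files_only = sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*abc*")
        if p.is_file()
    )

print(all_matches)  # ['abc.txt', 'abc_dir']
print(files_only)   # ['abc.txt']
```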

## Single path attributes

### For a file

In [20]:
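# Build the path explicitly: `iterdir()` yields entries in arbitrary order,
# so picking an item by index is not reliable
single_path = my_path / "hi.md"
single_path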

PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4/hi.md')

In [21]:
single_path.name

'hi.md'

In [22]:
single_path.stem

'hi'

In [23]:
single_path.suffix

'.md'

In [24]:
single_path.parent

PosixPath('/home/pawjast/Documents/my github/medium/code/datasets/data_4')

In [25]:
single_path.parts

('/',
 'home',
 'pawjast',
 'Documents',
 'my github',
 'medium',
 'code',
 'datasets',
 'data_4',
 'hi.md')
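
A few related attributes and methods are worth knowing for files with more than one extension, or when a new name has to be derived from an existing path. The sketch below uses `PurePosixPath` and a hypothetical file `/tmp/project/archive.tar.gz`, so nothing needs to exist on disk.

```python
from pathlib import PurePosixPath

# Hypothetical file with a double extension, used only for illustration
p = PurePosixPath("/tmp/project/archive.tar.gz")

print(p.suffix)    # .gz
print(p.suffixes)  # ['.tar', '.gz']
print(p.stem)      # archive.tar

# `with_suffix` replaces only the last suffix
print(p.with_suffix(".zip"))    # /tmp/project/archive.tar.zip

# `with_name` replaces the whole final component
print(p.with_name("notes.md"))  # /tmp/project/notes.md
```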