# pathlib - Path Objects in Python

### Links
pathlib - https://docs.python.org/3/library/pathlib.html

os.path - https://docs.python.org/3/library/os.path.html


### A Quick Example

In [1]:
from pathlib import Path # "Path is most likely what you'll need"

In [2]:
Path.cwd()

WindowsPath('c:/Users/calvin/Documents/GitHub/pathlib-os-boston-meetup')

In [3]:
path = Path("requirements.txt")
path

WindowsPath('requirements.txt')

In [4]:
path.exists()

True

In [5]:
path.absolute().parts

('c:\\',
 'Users',
 'calvin',
 'Documents',
 'GitHub',
 'pathlib-os-boston-meetup',
 'requirements.txt')

## Paths as Objects

### Pure Paths and Concrete Paths
Pure paths provide purely computational path operations; they never touch the file system.

Concrete paths inherit from pure paths and add methods that interact with the file system.

Most likely you'll just use the <b>Path</b> class: it gives you everything a PurePath does plus I/O operations, with no real tradeoffs (see appendix).
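A minimal sketch of that relationship, which should run the same on any OS. The path `docs/readme.md` is hypothetical and is never read from disk.

```python
from pathlib import Path, PurePath

# Path is a subclass of PurePath, so every pure operation is also on Path.
is_subclass = issubclass(Path, PurePath)

pure = PurePath("docs/readme.md")   # hypothetical path, never touches the disk
concrete = Path("docs/readme.md")

# Pure operations behave the same on both kinds of object.
same_name = pure.name == concrete.name

# File system methods such as exists() live only on the concrete class.
pure_has_exists = hasattr(pure, "exists")
concrete_has_exists = hasattr(concrete, "exists")

print(is_subclass, same_name, pure_has_exists, concrete_has_exists)
```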

In [6]:
from pathlib import PurePath, Path

In [7]:
# Change to your path
path_string = "c:/Users/calvin/Documents/Github/pathlib-os-boston-meetup/requirements.txt"

In [8]:
pure_path = PurePath(path_string)
pure_path

PureWindowsPath('c:/Users/calvin/Documents/Github/pathlib-os-boston-meetup/requirements.txt')

In [9]:
path = Path(path_string)
path

WindowsPath('c:/Users/calvin/Documents/Github/pathlib-os-boston-meetup/requirements.txt')

In [10]:
pure_path.parts

('c:\\',
 'Users',
 'calvin',
 'Documents',
 'Github',
 'pathlib-os-boston-meetup',
 'requirements.txt')

In [11]:
path.parts

('c:\\',
 'Users',
 'calvin',
 'Documents',
 'Github',
 'pathlib-os-boston-meetup',
 'requirements.txt')

In [12]:
pure_path.exists()

AttributeError: 'PureWindowsPath' object has no attribute 'exists'

In [13]:
path.exists()

True

### Flavo(u)rs

Paths come in two flavors, one per OS family: Windows and POSIX.

Both PurePath and Path have these flavors:

PurePath - PureWindowsPath, PurePosixPath

Path - WindowsPath, PosixPath

PurePath and Path figure out the right flavor for you, as you can see from the examples above. They use os.name to decide under the hood.

os.name == 'nt' (Windows) or not (POSIX)
https://github.com/python/cpython/blob/ba16324b276c7b2b5ecf09479f30fc82c12192ae/Lib/pathlib.py#L469
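To see the flavor selection for yourself, here is a small sketch that checks the class you get back against os.name. The relative path is hypothetical, and the `expected_flavor` variable is just a helper for this illustration.

```python
import os
from pathlib import Path, PurePath

# Instantiating the generic classes returns the flavor for the running OS.
pure = PurePath("some/file.txt")   # hypothetical relative path
concrete = Path("some/file.txt")

# Helper for this sketch: the flavor name we expect from os.name.
expected_flavor = "Windows" if os.name == "nt" else "Posix"

print(os.name, type(pure).__name__, type(concrete).__name__)
```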

In [14]:
from pathlib import PureWindowsPath, PurePosixPath, WindowsPath, PosixPath

Pure paths of either flavor can be instantiated on any OS. Concrete paths can only be instantiated for the OS you are running on.

In [15]:
PurePosixPath(path_string)

PurePosixPath('c:/Users/calvin/Documents/Github/pathlib-os-boston-meetup/requirements.txt')

In [16]:
PosixPath(path_string)

NotImplementedError: cannot instantiate 'PosixPath' on your system

Some Windows and POSIX flavor differences:

* Only the Windows flavor handles UNC paths.
* Separators differ: POSIX uses /, Windows uses \ (and also accepts /).
* POSIX path comparisons are case sensitive; Windows path comparisons are not.
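Because pure paths work on any OS, the differences listed above can be sketched with the two pure flavors side by side. The server, share, and folder names are hypothetical. Note that this shows how pathlib compares and renders paths, not how a particular file system behaves.

```python
from pathlib import PurePosixPath, PureWindowsPath

# UNC paths: the Windows flavor treats \\server\share as the drive.
unc = PureWindowsPath("//server/share/reports/q1.csv")  # hypothetical share
unc_drive = unc.drive

# Separators: both flavors accept "/", but the Windows flavor renders "\".
win_str = str(PureWindowsPath("Users/calvin"))
posix_str = str(PurePosixPath("Users/calvin"))

# Case sensitivity when comparing paths.
win_equal = PureWindowsPath("C:/Users") == PureWindowsPath("c:/users")
posix_equal = PurePosixPath("/Users") == PurePosixPath("/users")

print(unc_drive, win_str, posix_str, win_equal, posix_equal)
```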

### Path Parts

In [17]:
path = Path(path_string)
path.parts

('c:\\',
 'Users',
 'calvin',
 'Documents',
 'Github',
 'pathlib-os-boston-meetup',
 'requirements.txt')

In [18]:
{
    "path": str(path),
    "anchor": path.anchor,
    "parent": path.parent,
    "name": path.name,
    "suffix": path.suffix,
    "suffixes": path.suffixes,
    "stem": path.stem,
}

{'path': 'c:\\Users\\calvin\\Documents\\Github\\pathlib-os-boston-meetup\\requirements.txt',
 'anchor': 'c:\\',
 'parent': WindowsPath('c:/Users/calvin/Documents/Github/pathlib-os-boston-meetup'),
 'name': 'requirements.txt',
 'suffix': '.txt',
 'suffixes': ['.txt'],
 'stem': 'requirements'}
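The difference between suffix and suffixes only shows up when a file name has more than one extension. A short sketch with a hypothetical archive path, using PurePosixPath so it runs the same on any OS:

```python
from pathlib import PurePosixPath

archive = PurePosixPath("/data/backups/site.tar.gz")  # hypothetical file

parts = {
    "anchor": archive.anchor,
    "parent": str(archive.parent),
    "name": archive.name,
    "suffix": archive.suffix,      # only the last extension
    "suffixes": archive.suffixes,  # every extension, in order
    "stem": archive.stem,          # name minus the last extension only
}
print(parts)
```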

### Simple Methods

In [19]:
(
    str(path),
    path.is_absolute(),
    path.is_relative_to("c:\\Users"),  # Python 3.9+
    # Create similar paths
    path.with_name("dev-requirements.txt"),
    path.with_stem("dev-requirements"),  # Python 3.9+
    path.with_suffix(".in"),
)

('c:\\Users\\calvin\\Documents\\Github\\pathlib-os-boston-meetup\\requirements.txt',
 True,
 True,
 WindowsPath('c:/Users/calvin/Documents/Github/pathlib-os-boston-meetup/dev-requirements.txt'),
 WindowsPath('c:/Users/calvin/Documents/Github/pathlib-os-boston-meetup/dev-requirements.txt'),
 WindowsPath('c:/Users/calvin/Documents/Github/pathlib-os-boston-meetup/requirements.in'))
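Two related pure operations that are not in the cell above are relative_to() and parents. The sketch below uses hypothetical POSIX paths so the results do not depend on your OS.

```python
from pathlib import PurePosixPath

req = PurePosixPath("/home/calvin/project/requirements.txt")  # hypothetical
project = PurePosixPath("/home/calvin/project")

# relative_to() strips a leading directory from the path.
relative = req.relative_to(project)

# It raises ValueError when the path is not under the given directory.
try:
    req.relative_to("/etc")
    outside_raises = False
except ValueError:
    outside_raises = True

# parents walks up the tree, nearest ancestor first.
ancestors = [str(p) for p in req.parents]

print(relative, outside_raises, ancestors)
```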

In [20]:
# Concrete Path Only - These interact with the file system
(
    str(path),
    path.cwd(),  # using instance of Path
    Path.cwd(),  # using Path class
    path.home(),  # using instance of Path
    Path.home(),  # using Path class
    path.stat(),
    path.is_dir(),
    path.is_file(),
    path.absolute(),
    path.resolve(),
)

('c:\\Users\\calvin\\Documents\\Github\\pathlib-os-boston-meetup\\requirements.txt',
 WindowsPath('c:/Users/calvin/Documents/GitHub/pathlib-os-boston-meetup'),
 WindowsPath('c:/Users/calvin/Documents/GitHub/pathlib-os-boston-meetup'),
 WindowsPath('C:/Users/calvin'),
 WindowsPath('C:/Users/calvin'),
 os.stat_result(st_mode=33206, st_ino=11540474045207525, st_dev=2093496561, st_nlink=1, st_uid=0, st_gid=0, st_size=19, st_atime=1683825588, st_mtime=1683824985, st_ctime=1683742556),
 False,
 True,
 WindowsPath('c:/Users/calvin/Documents/Github/pathlib-os-boston-meetup/requirements.txt'),
 WindowsPath('C:/Users/calvin/Documents/GitHub/pathlib-os-boston-meetup/requirements.txt'))
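In the output above, absolute() and resolve() differ only in letter case, because resolve() asks the file system for the real path. Another difference is easier to see with a ".." component: absolute() keeps it, resolve() collapses it. A small sketch in a temporary directory, with hypothetical file and folder names:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical path with a ".." step in the middle.
    messy = Path(tmp) / "sub" / ".." / "requirements.txt"

    absolute_parts = messy.absolute().parts
    resolved_parts = messy.resolve().parts

    # absolute() only anchors the path; resolve() also normalizes it.
    kept_dotdot = ".." in absolute_parts
    removed_dotdot = ".." not in resolved_parts

print(kept_dotdot, removed_dotdot)
```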

### Joining Paths
You can use .joinpath() or the "/" operator 😮

In [21]:
Path("c:/Users") / Path("cforbes")

WindowsPath('c:/Users/cforbes')

In [22]:
Path("c:/Users").joinpath(Path("cforbes"))

WindowsPath('c:/Users/cforbes')

In [23]:
# mix and match strings and Path objects
Path("c:/Users") / "cforbes" / Path("Documents")

WindowsPath('c:/Users/cforbes/Documents')
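Two details worth knowing when joining: joinpath() takes several segments at once, and an absolute segment throws away everything to its left. A sketch with hypothetical POSIX paths:

```python
from pathlib import PurePosixPath

base = PurePosixPath("/home/calvin")  # hypothetical base directory

# The operator and the method build the same path.
joined_operator = base / "projects" / "demo" / "notes.txt"
joined_method = base.joinpath("projects", "demo", "notes.txt")

# An absolute segment discards everything before it.
reset = base / "/etc/hosts"

print(joined_operator, joined_method, reset)
```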

### Change Files

Create, change, and remove a file

In [24]:
new_file = Path("change_this.txt")

In [25]:
new_file.exists()

False

In [26]:
# Create Empty File
new_file.touch()

In [27]:
new_file.exists()

True

In [28]:
# chmod
import stat
new_file.chmod(stat.S_IWRITE) # there is no chown in pathlib - use os.chown()

In [29]:
# Rename
new_file_renamed = new_file.rename("change_this_1.txt")  # the old path no longer exists
new_file.exists(), new_file_renamed.exists()

(False, True)

In [30]:
# Delete File
new_file_renamed.unlink()
new_file_renamed.exists()

False

Create and remove directory

In [31]:
new_dir = Path("change_me")
new_dir.exists(), new_dir.is_dir()

(False, False)

In [32]:
# Make Directory
new_dir.mkdir()
new_dir.exists(), new_dir.is_dir()

(True, True)

In [33]:
# Remove Directory
new_dir.rmdir()
new_dir.exists(), new_dir.is_dir()

(False, False)
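mkdir() and rmdir() have a few options and limits the cells above don't show. This sketch works in a temporary directory with hypothetical names, and uses shutil.rmtree (from the standard library, not pathlib) to remove a non-empty tree.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    nested = Path(tmp) / "a" / "b" / "c"  # hypothetical nested directory

    # parents=True creates missing ancestors; exist_ok=True makes it repeatable.
    nested.mkdir(parents=True, exist_ok=True)
    nested.mkdir(parents=True, exist_ok=True)
    created = nested.is_dir()

    # rmdir() refuses to remove a directory that still has contents.
    (nested / "data.txt").write_text("x")
    try:
        nested.rmdir()
        rmdir_failed = False
    except OSError:
        rmdir_failed = True

    # shutil.rmtree removes a whole directory tree.
    shutil.rmtree(Path(tmp) / "a")
    removed = not nested.exists()

print(created, rmdir_failed, removed)
```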

### Iterate through Files

In [34]:
path = Path.cwd()
path

WindowsPath('c:/Users/calvin/Documents/GitHub/pathlib-os-boston-meetup')

In [35]:
iter_path = path.iterdir()
iter_path

<generator object Path.iterdir at 0x0000027F8454EA40>

If you delete or create a file after creating the iterator, "whether a path object for that file be included is unspecified".

In [36]:
list(iter_path)

[WindowsPath('c:/Users/calvin/Documents/GitHub/pathlib-os-boston-meetup/.git'),
 WindowsPath('c:/Users/calvin/Documents/GitHub/pathlib-os-boston-meetup/.gitignore'),
 WindowsPath('c:/Users/calvin/Documents/GitHub/pathlib-os-boston-meetup/.venv'),
 WindowsPath('c:/Users/calvin/Documents/GitHub/pathlib-os-boston-meetup/exercises'),
 WindowsPath('c:/Users/calvin/Documents/GitHub/pathlib-os-boston-meetup/file_list.txt'),
 WindowsPath('c:/Users/calvin/Documents/GitHub/pathlib-os-boston-meetup/gen_file_paths.py'),
 WindowsPath('c:/Users/calvin/Documents/GitHub/pathlib-os-boston-meetup/notes.txt'),
 WindowsPath('c:/Users/calvin/Documents/GitHub/pathlib-os-boston-meetup/pathlib_presentation.ipynb'),
 WindowsPath('c:/Users/calvin/Documents/GitHub/pathlib-os-boston-meetup/README.md'),
 WindowsPath('c:/Users/calvin/Documents/GitHub/pathlib-os-boston-meetup/requirements.txt'),
 WindowsPath('c:/Users/calvin/Documents/GitHub/pathlib-os-boston-meetup/sample.json'),
 WindowsPath('c:/Users/calvin/Docum
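iterdir() yields files and directories alike, in arbitrary order. One way to handle that, sketched here in a temporary directory with hypothetical entries, is to sort the results once and then filter with is_file() and is_dir().

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "notes.txt").write_text("hello")
    (root / "readme.md").write_text("# demo")
    (root / "exercises").mkdir()

    # Consume the generator right away and sort for a stable order.
    entries = sorted(root.iterdir())
    files = [p.name for p in entries if p.is_file()]
    dirs = [p.name for p in entries if p.is_dir()]

print(files, dirs)
```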

.glob and .rglob - https://en.wikipedia.org/wiki/Glob_(programming)


In [37]:
list(
    Path.cwd().glob("*.py")
)

[WindowsPath('c:/Users/calvin/Documents/GitHub/pathlib-os-boston-meetup/gen_file_paths.py'),
 WindowsPath('c:/Users/calvin/Documents/GitHub/pathlib-os-boston-meetup/speedtest.py')]

In [38]:
# Recursive Glob - equivalent to glob with "**/*.py"
list(
    Path.cwd().rglob("*.py")
)

[WindowsPath('c:/Users/calvin/Documents/GitHub/pathlib-os-boston-meetup/gen_file_paths.py'),
 WindowsPath('c:/Users/calvin/Documents/GitHub/pathlib-os-boston-meetup/speedtest.py'),
 WindowsPath('c:/Users/calvin/Documents/GitHub/pathlib-os-boston-meetup/.venv/Lib/site-packages/decorator.py'),
 WindowsPath('c:/Users/calvin/Documents/GitHub/pathlib-os-boston-meetup/.venv/Lib/site-packages/ipykernel_launcher.py'),
 WindowsPath('c:/Users/calvin/Documents/GitHub/pathlib-os-boston-meetup/.venv/Lib/site-packages/jupyter.py'),
 WindowsPath('c:/Users/calvin/Documents/GitHub/pathlib-os-boston-meetup/.venv/Lib/site-packages/nest_asyncio.py'),
 WindowsPath('c:/Users/calvin/Documents/GitHub/pathlib-os-boston-meetup/.venv/Lib/site-packages/pickleshare.py'),
 WindowsPath('c:/Users/calvin/Documents/GitHub/pathlib-os-boston-meetup/.venv/Lib/site-packages/pythoncom.py'),
 WindowsPath('c:/Users/calvin/Documents/GitHub/pathlib-os-boston-meetup/.venv/Lib/site-packages/six.py'),
 WindowsPath('c:/Users/calvin
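As the output above shows, rglob() also descends into .venv. A sketch of one way to skip it, built on a small hypothetical tree in a temporary directory; filtering on path parts is just one possible approach.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "pkg").mkdir()
    (root / ".venv" / "lib").mkdir(parents=True)
    for rel in ("main.py", "pkg/util.py", ".venv/lib/six.py", "notes.txt"):
        (root / rel).write_text("")

    # glob("*.py") only looks at the top level.
    top_level = sorted(p.name for p in root.glob("*.py"))

    # rglob("*.py") searches every subdirectory, hidden ones included.
    recursive = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.py"))

    # Skip anything that lives under a .venv directory.
    project_only = sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*.py")
        if ".venv" not in p.relative_to(root).parts
    )

print(top_level, recursive, project_only)
```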

### Reading and Writing Files

1. Context manager way with path string
2. Context manager way with Path object
3. Path read/write methods, which open and close the file automatically.


In [39]:
import json
import pickle
text_file = "file_list.txt"
json_file = "sample.json"
pickle_file = "sample.pickle"


Text File

In [40]:
# 1
with open(text_file) as f:
    result = f.read()
    
# 2    
path = Path(text_file)
with path.open() as f:
    result = f.read()
    
# 3
result = path.read_text()

JSON

In [41]:

# 1
with open(json_file) as f:
    result = json.load(f)
    
# 2    
path = Path(json_file)
with path.open() as f:
    result = json.load(f)
    
# 3
result = json.loads(path.read_text())
result

{'animals': ['cat', 'dog', 'rabbit', 'rat']}

Pickle - bytes

In [42]:
# 1
with open(pickle_file, "rb") as f:
    result = pickle.load(f)
    
# 2
path = Path(pickle_file)
with path.open("rb") as f:
    result = pickle.load(f)
    
# 3
result = pickle.loads(path.read_bytes())
result

'thisisapythonobject'
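The cells above only read. For the write side, here is a sketch of option 3 using write_text() and write_bytes(), each followed by a read to check the round trip. It writes into a temporary directory, so the file names only mirror the ones above, and the sample data is made up.

```python
import json
import pickle
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Text: write_text() opens, writes, and closes in one call.
    text_path = root / "file_list.txt"
    text_path.write_text("a.txt\nb.txt\n", encoding="utf-8")
    text_back = text_path.read_text(encoding="utf-8")

    # JSON: serialize to a string first, then write it as text.
    json_path = root / "sample.json"
    json_path.write_text(json.dumps({"animals": ["cat", "dog"]}), encoding="utf-8")
    json_back = json.loads(json_path.read_text(encoding="utf-8"))

    # Pickle: serialize to bytes first, then write_bytes().
    pickle_path = root / "sample.pickle"
    pickle_path.write_bytes(pickle.dumps("thisisapythonobject"))
    pickle_back = pickle.loads(pickle_path.read_bytes())

print(text_back.splitlines(), json_back, pickle_back)
```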

### Some Notes/Comparisons - os.path
* os.path functions return strings. pathlib returns Path objects.
* Some os and os.path functions are less intuitive and harder to remember.
* os.walk() alternatives - glob and rglob
* os.PathLike - ABC implemented by pathlib paths; os and os.path functions accept any os.PathLike object
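A quick side-by-side sketch of a few os.path calls and their pathlib counterparts. The relative path is hypothetical and nothing is read from disk.

```python
import os
from pathlib import Path

# Hypothetical relative path, built both ways.
raw = os.path.join("project", "data", "report.final.csv")
p = Path("project") / "data" / "report.final.csv"

comparisons = {
    "basename vs name": os.path.basename(raw) == p.name,
    "splitext vs suffix": os.path.splitext(raw)[1] == p.suffix,
    "dirname vs parent": os.path.dirname(raw) == str(p.parent),
    # os.fspath() turns any os.PathLike object back into a string.
    "fspath round trip": os.fspath(p) == raw,
    "Path is PathLike": isinstance(p, os.PathLike),
}
print(comparisons)
```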

## APPENDIX

### PurePath vs Path speed

There is almost no difference in speed between the two: about 3.8 s for Path versus 3.6 s for PurePath over 1 million paths in the runs below.

In [43]:
# Load a bunch of file path strings (1 million)
# file_list.txt created by gen_file_paths.py
lots_of_paths = Path("file_list.txt").read_text().splitlines()

In [44]:
%%timeit
# Path
for i in lots_of_paths:
    p = Path(i)
    p.name

3.81 s ± 81.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [45]:
%%timeit
# PurePath
for i in lots_of_paths:
    p = PurePath(i)
    p.name

3.63 s ± 67.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
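%%timeit only works inside IPython or Jupyter. If you want to repeat the comparison in a plain script, a rough sketch with the standard timeit module might look like this. The path strings are synthetic and there are far fewer of them than in file_list.txt, and `names_with` is a helper made up for this example, so expect different absolute numbers.

```python
import timeit
from pathlib import Path, PurePath

# Synthetic path strings standing in for file_list.txt (an assumption).
sample_paths = [f"dir_{i % 100}/file_{i}.txt" for i in range(10_000)]


def names_with(cls):
    """Build a path object of the given class for each string and read .name."""
    return [cls(s).name for s in sample_paths]


# Total seconds for 5 passes over the sample with each class.
path_seconds = timeit.timeit(lambda: names_with(Path), number=5)
pure_seconds = timeit.timeit(lambda: names_with(PurePath), number=5)

print(f"Path: {path_seconds:.3f} s, PurePath: {pure_seconds:.3f} s")
```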
