# Patching Python builtins (third-party library compatibility)

Not every Python library is implemented to accept pathlib-compatible objects like those implemented by cloudpathlib. Many libraries will only accept strings as filepaths. These libraries then may internally use `open`, functions from `os` and `os.path`, or other core library modules like `glob` to navigate paths and manipulate them.

This means that out of the box you can't just pass a `CloudPath` object to any method or function and have it work. Libraries implemented with `pathlib` will work. Anything else will typically raise an exception at some point or, as with the unpatched `os.path.isdir`, silently return the wrong answer.

The long-term solution is to ask developers to implement their library to support either (1) pathlib-compatible objects for files and directories, or (2) file-like objects passed directly (e.g., so you could call `CloudPath.open` in your code and pass the file-like object to the library).
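As a rough illustration of those two options, the sketch below shows a library-style function written against a file-like object and a thin wrapper that relies only on a pathlib-style `open` method. The function names `count_lines` and `count_lines_at` are hypothetical, and `io.StringIO` and a local `pathlib.Path` stand in for what `CloudPath.open` and a `CloudPath` would provide.

```python
import io
import tempfile
from pathlib import Path
from typing import TextIO


def count_lines(source: TextIO) -> int:
    """Option 2: operate on an already-open file-like object."""
    return sum(1 for _ in source)


def count_lines_at(path) -> int:
    """Option 1: rely only on the pathlib-style `open` method of `path`."""
    with path.open("r") as handle:
        return count_lines(handle)


# An in-memory buffer stands in for the object returned by `CloudPath.open`.
in_memory_count = count_lines(io.StringIO("a\nb\nc\n"))

# A local Path stands in for any pathlib-compatible object.
with tempfile.TemporaryDirectory() as tmp:
    local_file = Path(tmp) / "demo.txt"
    local_file.write_text("x\ny\n")
    path_count = count_lines_at(local_file)

print(in_memory_count, path_count)
```

Neither function calls the built-in `open` on a string, so neither needs to know where the bytes actually live.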

The short-term workaround that will be compatible with some libraries is to patch the builtins to make `open`, `os`, `os.path`, and `glob` work with `CloudPath` objects. Because this overrides default Python functionality, it is not on by default. When patched, these functions will use the `CloudPath` version if they are passed a `CloudPath` and will fall back to their normal implementations otherwise.
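The dispatch idea described above can be pictured with a small sketch. This is not cloudpathlib's implementation: `FakeCloudPath` and `make_dispatching_isdir` are hypothetical names used only to show a wrapper that prefers the path object's own method and otherwise defers to the original function.

```python
import os
import os.path


class FakeCloudPath:
    """Hypothetical stand-in for a cloud path; not the cloudpathlib class."""

    def __init__(self, url: str, is_dir: bool):
        self.url = url
        self._is_dir = is_dir

    def is_dir(self) -> bool:
        return self._is_dir


def make_dispatching_isdir(original):
    """Wrap `original` so cloud-like paths answer for themselves."""

    def isdir(path):
        if isinstance(path, FakeCloudPath):
            return path.is_dir()  # use the path object's own version
        return original(path)  # fall back to the normal implementation

    return isdir


dispatching_isdir = make_dispatching_isdir(os.path.isdir)

cloud_result = dispatching_isdir(FakeCloudPath("s3://bucket/dir/", is_dir=True))
local_result = dispatching_isdir(os.getcwd())
print(cloud_result, local_result)
```

Because the wrapper checks the argument type on every call, code that passes ordinary strings sees no change in behavior.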

These methods can be enabled by setting the following environment variables:
 - `CLOUDPATHLIB_PATCH_ALL=1` - patch all the builtins we implement: `open`, `os` functions, and `glob`
 - `CLOUDPATHLIB_PATCH_OPEN=1` - patch the builtin `open` function
 - `CLOUDPATHLIB_PATCH_OS_FUNCTIONS=1` - patch the `os` functions
 - `CLOUDPATHLIB_PATCH_GLOB=1` - patch the `glob` module

You can set environment variables in many ways, but it is common to either pass them at the command line with something like `CLOUDPATHLIB_PATCH_ALL=1 python my_script.py` or to set them in your Python script with `os.environ['CLOUDPATHLIB_PATCH_ALL'] = "1"` (the value must be a string). Note that these _must_ be set before `cloudpathlib` is imported.
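One detail worth checking when setting a variable from inside Python: `os.environ` accepts only string values. The sketch below uses a hypothetical variable name, `MY_LIBRARY_FLAG`, purely to show that behavior; substitute the variable you actually need.

```python
import os

FLAG = "MY_LIBRARY_FLAG"  # hypothetical name, for illustration only

# Assigning an integer is rejected by os.environ.
try:
    os.environ[FLAG] = 1
except TypeError as error:
    int_error = error
else:
    int_error = None

# Assigning a string works; do this before importing the library
# that reads the variable at import time.
os.environ[FLAG] = "1"
flag_value = os.environ[FLAG]
print(type(int_error).__name__, flag_value)

# Clean up so the demo leaves the environment unchanged.
del os.environ[FLAG]
```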

Alternatively, you can call the patch functions directly.

```python
from cloudpathlib import patch_open, patch_os_functions, patch_glob

# patch builtins
patch_open()
patch_os_functions()
patch_glob()
```

These patch methods are all context managers, so if you want to control where the patch is active, you can use them in a `with` statement. For example:

In [2]:
from glob import glob

from cloudpathlib import patch_glob, CloudPath

try:
    glob(CloudPath("s3://cloudpathlib-test-bucket/manual-tests/**/*dir*/**"))
except Exception as e:
    print("Unpatched version fails:")
    print(e)


with patch_glob():
    print("Patched succeeds:")
    print(glob(CloudPath("s3://cloudpathlib-test-bucket/manual-tests/**/*dir*/**/*")))

    # or equivalently
    print("`glob` module now is equivalent to `CloudPath.glob`")
    print(glob("**/*dir*/**/*", root_dir=CloudPath("s3://cloudpathlib-test-bucket/manual-tests/")))

Unpatched version fails:
'S3Path' object is not subscriptable
Patched succeeds:
[S3Path('s3://cloudpathlib-test-bucket/manual-tests/dirB/fileB'), S3Path('s3://cloudpathlib-test-bucket/manual-tests/dirC/dirD'), S3Path('s3://cloudpathlib-test-bucket/manual-tests/dirC/fileC'), S3Path('s3://cloudpathlib-test-bucket/manual-tests/dirC/dirD/fileD'), S3Path('s3://cloudpathlib-test-bucket/manual-tests/nested-dir/test.file'), S3Path('s3://cloudpathlib-test-bucket/manual-tests/dirC/dirD/fileD'), S3Path('s3://cloudpathlib-test-bucket/manual-tests/glob_test/dirB/fileB'), S3Path('s3://cloudpathlib-test-bucket/manual-tests/glob_test/dirC/dirD'), S3Path('s3://cloudpathlib-test-bucket/manual-tests/glob_test/dirC/fileC'), S3Path('s3://cloudpathlib-test-bucket/manual-tests/glob_test/dirC/dirD/fileD'), S3Path('s3://cloudpathlib-test-bucket/manual-tests/glob_test/dirC/dirD/fileD')]
`glob` module now is equivalent to `CloudPath.glob`
[S3Path('s3://cloudpathlib-test-bucket/manual-tests/dirB/fileB'), S3Path('

Patching the functions in the `os` module works the same way. Note that the unpatched `os.path.isdir` does not raise here; it silently returns `False`.

In [3]:
import os

from cloudpathlib import patch_os_functions, CloudPath

try:
    print(os.path.isdir(CloudPath("s3://cloudpathlib-test-bucket/manual-tests/")))
except Exception as e:
    print("Unpatched version fails:")
    print(e)


with patch_os_functions():
    result = os.path.isdir(CloudPath("s3://cloudpathlib-test-bucket/manual-tests/"))
    print("Patched version of `os.path.isdir` returns: ", result)

False
Patched version of `os.path.isdir` returns:  True
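The patch functions are described above as usable both as plain calls and in a `with` statement. One way such an object could behave is sketched below: it applies the replacement when constructed and restores the original on exit. `TemporaryPatch` and `always_true` are hypothetical names, and whether cloudpathlib is implemented this way is an assumption.

```python
import os.path


class TemporaryPatch:
    """Hypothetical helper: swap an attribute now, restore it on exit."""

    def __init__(self, target, name, replacement):
        self.target = target
        self.name = name
        self.original = getattr(target, name)
        setattr(target, name, replacement)  # the patch is active immediately

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        setattr(self.target, self.name, self.original)  # undo the patch
        return False  # never swallow exceptions


def always_true(path):
    """Toy replacement used only to make the patch visible."""
    return True


original_isdir = os.path.isdir

with TemporaryPatch(os.path, "isdir", always_true):
    inside = os.path.isdir("definitely-not-a-real-directory")

outside = os.path.isdir("definitely-not-a-real-directory")
restored = os.path.isdir is original_isdir
print(inside, outside, restored)
```

Restoring in `__exit__` keeps the override scoped to the block, which limits the chance of surprising unrelated code.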


## Patching `open`

Sometimes code uses the Python built-in `open` to open files and operate on them. In those cases, passing a `CloudPath` will fail. You can patch the built-in `open` so that it uses `CloudPath.open` when a `CloudPath` is provided and otherwise defers to the original behavior.

### Patching `open` in Jupyter notebooks

Jupyter notebooks inject their own `open` into the user namespace. After enabling the patch, ensure the notebook's `open` refers to the patched built-in:

```python
from cloudpathlib import patch_open

open = patch_open().patched   # rebind notebook's open to the patched version
```
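Why the rebinding matters can be shown with plain Python name lookup: a name defined in a namespace's globals shadows the one in `builtins`. The sketch below simulates two namespaces with `exec`; the namespace dictionaries and `patched_open` are illustrative stand-ins, not Jupyter or cloudpathlib internals.

```python
import builtins

original_open = builtins.open


def patched_open(*args, **kwargs):
    """Toy replacement used only to identify which `open` is found."""
    return original_open(*args, **kwargs)


source = "def which_open():\n    return open\n"

# A namespace with no `open` of its own resolves the name via builtins.
plain_namespace = {}
exec(source, plain_namespace)

# A namespace holding its own `open`, as the notebook namespace is said to.
shadowing_namespace = {"open": original_open}
exec(source, shadowing_namespace)

builtins.open = patched_open  # simulate enabling the patch
try:
    plain_sees_patch = plain_namespace["which_open"]() is patched_open
    shadow_sees_patch = shadowing_namespace["which_open"]() is patched_open

    # Rebinding the local name makes the patched version visible.
    shadowing_namespace["open"] = builtins.open
    rebound_sees_patch = shadowing_namespace["which_open"]() is patched_open
finally:
    builtins.open = original_open  # always restore the real built-in

print(plain_sees_patch, shadow_sees_patch, rebound_sees_patch)
```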

Here's an example that fails without the patch (as it would if you depended on a third-party library that calls `open`).

In [4]:
# deep in a third-party library a function calls the built-in open
def library_function(filepath: str):
    with open(filepath, "w") as f:
        f.write("hello!")

In [5]:
from cloudpathlib import CloudPath

# create file to read
cp = CloudPath("s3://cloudpathlib-test-bucket/patching_builtins/new_file.txt")

# fails if passed a CloudPath (see the error printed below)
try:
    library_function(cp)
    print(cp.read_text() == "hello!")
except Exception as e:
    print(e)

[Errno 2] No such file or directory: '/var/folders/sz/c8j64tx91mj0jb0vd1s4wj700000gn/T/tmpykd4wirh/cloudpathlib-test-bucket/patching_builtins/new_file.txt'


In [6]:
from cloudpathlib import CloudPath, patch_open

# enable patch and rebind notebook's open
open = patch_open().patched


# deep in a third-party library a function calls the built-in open
def library_function(filepath: str):
    with open(filepath, "w") as f:
        f.write("hello!")


# create file to read
cp = CloudPath("s3://cloudpathlib-test-bucket/patching_builtins/file.txt")

try:
    library_function(cp)
    print(cp.read_text() == "hello!")
except Exception as e:
    print(e)

True


## Examples: os.path functions with CloudPath

The snippet below demonstrates common `os.path` functions when patched to accept `CloudPath` values. These calls work for `CloudPath` and still behave normally for string paths.


In [7]:
import os
from cloudpathlib import CloudPath, patch_os_functions

# Create a small demo structure in your configured cloud provider (mocked in tests)
base = CloudPath("s3://cloudpathlib-test-bucket/patching_builtins/ospath_demo/")
file_path = base / "dir" / "example.txt"

with patch_os_functions():
    # ensure directory/file exist for demo purposes
    file_path.parent.mkdir(exist_ok=True)
    file_path.write_text("content")

    # basename
    print("basename:", os.path.basename(file_path))  # => "example.txt"

    # dirname
    print("dirname:", os.path.dirname(file_path))  # => CloudPath(.../ospath_demo/dir)

    # exists / isfile / isdir
    print("exists(file):", os.path.exists(file_path))
    print("isfile(file):", os.path.isfile(file_path))
    print("isdir(dir):", os.path.isdir(file_path.parent))

    # join
    joined = os.path.join(base, "dir", "sub", "name.txt")
    print("join:", joined)

    # split
    head, tail = os.path.split(file_path)
    print("split head:", head)
    print("split tail:", tail)

    # splitext
    root, ext = os.path.splitext(file_path)
    print("splitext root:", root)
    print("splitext ext:", ext)

    # commonpath/commonprefix
    p1 = base / "dir" / "a.txt"
    p2 = base / "dir" / "b.txt"
    print("commonpath:", os.path.commonpath([p1, p2]))  # => CloudPath(.../ospath_demo/dir)
    print("commonprefix:", os.path.commonprefix([p1, p2]))

basename: example.txt
dirname: s3://cloudpathlib-test-bucket/patching_builtins/ospath_demo/dir
exists(file): True
isfile(file): True
isdir(dir): True
join: s3://cloudpathlib-test-bucket/patching_builtins/ospath_demo/dir/sub/name.txt
split head: s3://cloudpathlib-test-bucket/patching_builtins/ospath_demo/dir
split tail: example.txt
splitext root: s3://cloudpathlib-test-bucket/patching_builtins/ospath_demo/dir/example
splitext ext: .txt
commonpath: s3://cloudpathlib-test-bucket/patching_builtins/ospath_demo/dir
commonprefix: s3://cloudpathlib-test-bucket/patching_builtins/ospath_demo/dir/
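For comparison, here is what the same standard-library functions return for plain string paths. The sketch uses `posixpath` so the results do not depend on the operating system; the `/demo/...` paths are made up. It also shows why the `commonprefix` result above ends in a slash: `commonprefix` compares character by character, while `commonpath` compares whole path components.

```python
import posixpath

file_path = "/demo/dir/example.txt"  # made-up path for illustration

basename = posixpath.basename(file_path)
dirname = posixpath.dirname(file_path)
head, tail = posixpath.split(file_path)
root, ext = posixpath.splitext(file_path)

p1 = "/demo/dir/a.txt"
p2 = "/demo/dir/b.txt"
common_path = posixpath.commonpath([p1, p2])  # component-wise
common_prefix = posixpath.commonprefix([p1, p2])  # character-wise

# Character-wise matching can stop in the middle of a file name.
partial_prefix = posixpath.commonprefix(["/demo/dir/abc.txt", "/demo/dir/abd.txt"])

print(basename, dirname, head, tail, root, ext)
print(common_path, common_prefix, partial_prefix)
```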


## Examples: glob with CloudPath

The snippet below demonstrates `glob.glob` and `glob.iglob` working with `CloudPath` as the pattern or `root_dir` when patched.


In [8]:
import glob
from cloudpathlib import CloudPath, patch_glob

root = CloudPath("s3://cloudpathlib-test-bucket/patching_builtins/glob_demo/")

with patch_glob():
    # setup demo files
    (root / "sub").mkdir(exist_ok=True)
    (root / "file1.txt").write_text("1")
    (root / "file2.py").write_text("2")
    (root / "sub" / "file3.txt").write_text("3")

    # Pattern as CloudPath
    print("*.txt:", glob.glob(root / "*.txt"))

    # Recursive patterns
    print("**/*.txt:", glob.glob(root / "**/*.txt"))

    # Using root_dir with string pattern
    print("root_dir + pattern:", glob.glob("*.py", root_dir=root))

    # iglob iterator
    it = glob.iglob(root / "*.txt")
    print("iglob first:", next(it))

*.txt: [S3Path('s3://cloudpathlib-test-bucket/patching_builtins/glob_demo/file1.txt')]
**/*.txt: [S3Path('s3://cloudpathlib-test-bucket/patching_builtins/glob_demo/file1.txt'), S3Path('s3://cloudpathlib-test-bucket/patching_builtins/glob_demo/sub/file3.txt')]
root_dir + pattern: [S3Path('s3://cloudpathlib-test-bucket/patching_builtins/glob_demo/file2.py')]
iglob first: s3://cloudpathlib-test-bucket/patching_builtins/glob_demo/file1.txt
