improvement: support custom cloudpathlib providers in path normalization#8929
improvement: support custom cloudpathlib providers in path normalization#8929
Conversation
Closes #8868 Custom cloudpathlib providers (e.g., SMBPath) that subclass `CloudPath` but live in external packages had their URI schemes corrupted (`smb://` → `smb:/`) because the cloud-path check used `__module__.startswith("cloudpathlib")`, which only matched built-in providers. Added an `is_cloudpath()` utility that uses `isinstance(path, CloudPath)` when cloudpathlib is installed, with a module-name fallback when it isn't. Replaced both call sites (`normalize_path` and `file_browser` limit defaulting) with the new utility.
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
There was a problem hiding this comment.
Pull request overview
This PR fixes URI scheme corruption for custom cloudpathlib providers by improving cloud-path detection during path normalization and in the file browser’s default limit selection.
Changes:
- Add
is_cloudpath()utility to detectCloudPathinstances (including external subclasses). - Update
normalize_path()andfile_browserlimit defaulting to useis_cloudpath(). - Expand/adjust tests for cloud path detection and normalization behavior.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
marimo/_utils/paths.py |
Adds is_cloudpath() and switches normalize_path() to use it for skipping cloud URI normalization. |
marimo/_plugins/ui/_impl/file_browser.py |
Uses is_cloudpath() to choose a conservative default file listing limit for cloud storage. |
marimo/_dependencies/dependencies.py |
Registers cloudpathlib in DependencyManager for dependency introspection. |
tests/_utils/test_paths.py |
Adds tests covering is_cloudpath() and custom provider normalization behavior. |
marimo/_utils/paths.py
Outdated
| # Return early if cloudpathlib is not installed | ||
| if not DependencyManager.cloudpathlib.imported(): | ||
| return False | ||
|
|
||
| try: | ||
| from cloudpathlib import CloudPath | ||
|
|
||
| return isinstance(path, CloudPath) | ||
| except ImportError: | ||
| return path.__class__.__module__.startswith("cloudpathlib") |
There was a problem hiding this comment.
is_cloudpath() short-circuits on DependencyManager.cloudpathlib.imported(), which only checks cloudpathlib in sys.modules. This makes the docstring’s “fallback to module-name check when cloudpathlib isn't installed” ineffective when the package is simply unavailable (i.e., not in sys.modules), and it also prevents the try/except ImportError logic from running in that case. Consider removing the imported() guard and instead try: import cloudpathlib (or use DependencyManager.cloudpathlib.has(quiet=True)) and fall back to the __module__ prefix check on ImportError.
tests/_utils/test_paths.py
Outdated
| def test_builtin_cloudpath_detected_via_isinstance(self) -> None: | ||
| """Built-in cloudpathlib paths are detected via isinstance.""" | ||
| from cloudpathlib import S3Path | ||
|
|
||
| assert is_cloudpath(S3Path("s3://bucket/key")) | ||
|
|
||
| def test_custom_cloudpath_subclass_detected(self) -> None: | ||
| """Custom CloudPath subclasses from external packages are detected. | ||
|
|
||
| This is the core scenario from issue #8868: a user creates a | ||
| custom provider whose __module__ does NOT start with 'cloudpathlib'. | ||
| We use virtual subclass registration to avoid cloudpathlib's | ||
| metaclass issues with direct subclassing. | ||
| """ | ||
| from cloudpathlib import CloudPath | ||
|
|
||
| class FakeSMBPath: |
There was a problem hiding this comment.
These tests import cloudpathlib unconditionally (from cloudpathlib import S3Path/CloudPath). CI runs the core test suite with the test dependency group (without optional deps like cloudpathlib), so this will raise ModuleNotFoundError and fail the core test job. Use pytest.importorskip("cloudpathlib") (or a skipif based on DependencyManager.cloudpathlib.has()) around cloudpathlib-dependent tests.
tests/_utils/test_paths.py
Outdated
| from cloudpathlib import S3Path | ||
|
|
||
| cloud_path = S3Path("s3://bucket/folder/file.txt") | ||
| result = normalize_path(cloud_path) | ||
| assert result is cloud_path |
There was a problem hiding this comment.
This test imports cloudpathlib directly (from cloudpathlib import S3Path). In the core CI test run (dependency group test), cloudpathlib isn’t installed (it’s in test-optional), so this test should be skipped when the dependency is missing (e.g., pytest.importorskip("cloudpathlib")).
tests/_utils/test_paths.py
Outdated
| from cloudpathlib import S3Path | ||
|
|
||
| cloud_path = S3Path("s3://bucket/folder/file.txt") | ||
| result = normalize_path(cloud_path) | ||
| assert result is cloud_path | ||
|
|
||
|
|
||
| def test_normalize_path_skips_custom_cloudpath_subclass() -> None: | ||
| """Custom CloudPath subclasses should also skip normalization (#8868).""" | ||
| from cloudpathlib import CloudPath |
There was a problem hiding this comment.
This test imports cloudpathlib directly (from cloudpathlib import CloudPath). In the core CI test run (dependency group test), cloudpathlib isn’t installed (it’s in test-optional), so this test should be skipped when the dependency is missing (e.g., pytest.importorskip("cloudpathlib")).
| from cloudpathlib import S3Path | |
| cloud_path = S3Path("s3://bucket/folder/file.txt") | |
| result = normalize_path(cloud_path) | |
| assert result is cloud_path | |
| def test_normalize_path_skips_custom_cloudpath_subclass() -> None: | |
| """Custom CloudPath subclasses should also skip normalization (#8868).""" | |
| from cloudpathlib import CloudPath | |
| cloudpathlib = pytest.importorskip("cloudpathlib") | |
| cloud_path = cloudpathlib.S3Path("s3://bucket/folder/file.txt") | |
| result = normalize_path(cloud_path) | |
| assert result is cloud_path | |
| def test_normalize_path_skips_custom_cloudpath_subclass() -> None: | |
| """Custom CloudPath subclasses should also skip normalization (#8868).""" | |
| cloudpathlib = pytest.importorskip("cloudpathlib") | |
| CloudPath = cloudpathlib.CloudPath |
…loudpath Tests that import cloudpathlib directly now use pytest.importorskip so they are skipped on core-deps CI where cloudpathlib is not installed. Restructured is_cloudpath() to: (1) use a fast module-name heuristic first, (2) skip the import when cloudpathlib hasn't been loaded yet, and (3) fall back to isinstance for virtual subclasses registered via CloudPath.register().
|
🚀 Development release published. You may be able to view the changes at https://marimo.app?v=0.21.2-dev99 |
Closes #8868
Custom cloudpathlib providers (e.g., SMBPath) that subclass
CloudPathbut live in external packages had their URI schemes corrupted (smb://→smb:/) because the cloud-path check used__module__.startswith("cloudpathlib"), which only matched built-in providers.Added an
is_cloudpath()utility that usesisinstance(path, CloudPath)when cloudpathlib is installed, with a module-name fallback when it isn't. Replaced both call sites (normalize_pathandfile_browserlimit defaulting) with the new utility.