In [1]:
import pathlib
import warnings

from upath import UPath
from upath.errors import DefaultImplementationWarning

warnings.filterwarnings(
    action="ignore",
    category=DefaultImplementationWarning,
    module="upath",
)

# local

If you give a local path, `UPath` defaults to `pathlib.PosixPath` or `pathlib.WindowsPath`, just as `pathlib.Path` does.

In [2]:
local_path = UPath(".")
assert isinstance(local_path, (pathlib.PosixPath, pathlib.WindowsPath))
local_path

PosixPath('.')

If you give it a URL whose scheme is registered with fsspec, it returns a `UPath` that uses the matching fsspec FileSystem as its backend.

In [3]:
local_upath = UPath(local_path.absolute().as_uri())
assert isinstance(local_upath, UPath)
print(type(local_upath))
local_upath.fs

<class 'upath.core.UPath'>


<fsspec.implementations.local.LocalFileSystem at 0x7f682472f0a0>
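The cell above relies on `as_uri()` producing a URL with a scheme. As a minimal standard-library sketch (it does not import `upath`; the variable names are only illustrative), this shows what that URL looks like and which scheme it carries:

```python
import pathlib

# Build an absolute local path and convert it to a URI, as in the cell above.
local_uri = pathlib.Path(".").absolute().as_uri()

# The scheme is everything before the first colon.
scheme = local_uri.split(":", 1)[0]

print(local_uri)
print(scheme)
```

The `file` scheme appears in fsspec's registry listing later in this notebook, mapped to `fsspec.implementations.local.LocalFileSystem`, which matches the `fs` shown above.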

# fsspec FileSystems

With `UPath` you can connect to any fsspec FileSystem and interact with it as you would with your local filesystem using pathlib. Connection arguments can be given in a couple of ways:

You can give them as keyword arguments as described for each filesystem in the fsspec docs:

In [4]:
gpath = UPath("github:/", org="Quansight", repo="universal_pathlib", sha="main")
assert gpath.exists()
gpath.fs

<fsspec.implementations.github.GithubFileSystem at 0x7f6824527700>

Or you can define them in the path/URL, in which case they will be parsed appropriately:

In [5]:
gpath = UPath("github://Quansight:universal_pathlib@main/")
gpath

UPath('github://Quansight:universal_pathlib@main/')
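To see how the pieces of such a URL line up with the keyword arguments from the previous cell, here is a rough illustration using only `urllib.parse` from the standard library. This is an assumption about the mapping for illustration purposes, not the parsing code that `UPath` or fsspec actually run:

```python
from urllib.parse import urlsplit

# Split the URL into its generic components.
parts = urlsplit("github://Quansight:universal_pathlib@main/README.md")

# Assumed mapping for illustration: user -> org, password -> repo, host -> sha.
# Note that urlsplit lowercases the hostname, so this sketch would not
# preserve a mixed-case branch name.
org, repo, sha = parts.username, parts.password, parts.hostname

print(parts.scheme, org, repo, sha, parts.path)
```

The three values correspond to the `org`, `repo`, and `sha` keyword arguments used in the earlier cell.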

With a UPath object instantiated, you can now interact with the paths with the usual `pathlib.Path` API.

In [6]:
for p in gpath.iterdir():
    print(p)

github://Quansight:universal_pathlib@main/.flake8
github://Quansight:universal_pathlib@main/.github
github://Quansight:universal_pathlib@main/.gitignore
github://Quansight:universal_pathlib@main/LICENSE
github://Quansight:universal_pathlib@main/README.md
github://Quansight:universal_pathlib@main/environment.yml
github://Quansight:universal_pathlib@main/notebooks
github://Quansight:universal_pathlib@main/noxfile.py
github://Quansight:universal_pathlib@main/pyproject.toml
github://Quansight:universal_pathlib@main/setup.py
github://Quansight:universal_pathlib@main/upath


The `glob` method is also available. Note that the pattern syntax follows the `fsspec` [docs](https://filesystem-spec.readthedocs.io/en/latest/api.html#fsspec.spec.AbstractFileSystem.glob), not pathlib's.

In [7]:
for p in gpath.glob("**.py"):
    print(p)
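For comparison, this is how the pathlib side of that difference behaves. The sketch below uses only the standard library and a temporary directory with made-up file names; it shows pathlib's own pattern rules, not fsspec's:

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    (root / "sub").mkdir()
    (root / "a.py").touch()
    (root / "sub" / "b.py").touch()

    # "*.py" only matches files directly inside root.
    top_level = sorted(p.relative_to(root).as_posix() for p in root.glob("*.py"))

    # In pathlib, recursion needs "**" as its own path component.
    recursive = sorted(p.relative_to(root).as_posix() for p in root.glob("**/*.py"))

print(top_level)
print(recursive)
```

When moving a pattern between `pathlib.Path.glob` and `UPath.glob`, it is worth checking it against the fsspec documentation linked above instead of assuming the two behave identically.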

All the standard path methods and attributes of `pathlib.Path` are available too:

In [8]:
readme = gpath.joinpath("README.md")
readme

UPath('github://Quansight:universal_pathlib@main/README.md')

To get the full path as a string use:

In [9]:
str(readme)

'github://Quansight:universal_pathlib@main/README.md'

You can also use the `path` attribute to get just the path, without the scheme and connection details:

In [10]:
# `path` is an attribute added by UPath
readme.path

'/README.md'

In [11]:
readme.name

'README.md'

In [12]:
readme.stem

'README'

In [13]:
readme.suffix

'.md'
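The values above match what pathlib itself returns for the same path string. A small standard-library sketch, using `PurePosixPath` as a stand-in and a hypothetical second file name with two suffixes:

```python
from pathlib import PurePosixPath

# Same path component as the README above, without any filesystem access.
p = PurePosixPath("/README.md")
print(p.name, p.stem, p.suffix)

# With several dots, only the last one starts the suffix.
archive = PurePosixPath("/data/archive.tar.gz")  # hypothetical file name
print(archive.stem, archive.suffix, archive.suffixes)
```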

In [14]:
readme.exists()

False

In [15]:
print(readme.read_text())

# Universal Pathlib

Universal Pathlib is a python library that aims to extend Python's built-in [`pathlib.Path`](https://docs.python.org/3/library/pathlib.html) api to use a variety of backend filesystems using [`fsspec`](https://filesystem-spec.readthedocs.io/en/latest/intro.html)

## Installation

### Pypi

```bash
pip install universal_pathlib
```

### conda

```bash
conda install -c conda-forge universal_pathlib
```

## Basic Usage

```python
>>> from upath import UPath

>>> path = UPath(file:/some/filepath.py)
>>> path.name
filepath.py
>>> path.stem
filepath
>>> path.suffix
.py
>>> path.exists()
True
```

Some backends may require other dependencies. For example to work with S3 paths, [`s3fs`](https://s3fs.readthedocs.io/en/latest/) is required.

For more examples, see the [example notebook here](notebooks/examples.ipynb)






In [16]:
s3path = UPath("s3://spacenet-dataset")

In [17]:
for p in s3path.iterdir():
    print(p)

s3://spacenet-dataset/LICENSE.md
s3://spacenet-dataset/
s3://spacenet-dataset/AOIs
s3://spacenet-dataset/Hosted-Datasets
s3://spacenet-dataset/SpaceNet_Off-Nadir_Dataset
s3://spacenet-dataset/spacenet-model-weights
s3://spacenet-dataset/spacenet-stac
s3://spacenet-dataset/spacenet


Some filesystems may require additional packages to be installed.
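One way to check up front whether such a package is present is to ask the import system, without importing anything. The sketch below uses only the standard library; the protocol-to-package pairs are taken from the messages in this notebook (`s3fs` for S3, `gcsfs` for Google Storage, `adlfs` for Azure), and the dictionary names are illustrative:

```python
import importlib.util

# Protocol -> package that provides it, per the install hints in this notebook.
optional_backends = {"s3": "s3fs", "gcs": "gcsfs", "abfs": "adlfs"}

# find_spec returns None when a top-level package cannot be found.
installed = {
    protocol: importlib.util.find_spec(package) is not None
    for protocol, package in optional_backends.items()
}

for protocol, available in installed.items():
    print(f"{protocol}: {'available' if available else 'missing'}")
```

The registry listing that follows gives the corresponding install hints for every protocol fsspec knows about.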

In [18]:
from fsspec.registry import known_implementations

# Print each protocol with the registry's install hint if it has one, else its class path.
for name, d in sorted(known_implementations.items()):
    print("%s:\t%s" % (name, d.get("err", d.get("class", ""))))

abfs:	Install adlfs to access Azure Datalake Gen2 and Azure Blob Storage
adl:	Install adlfs to access Azure Datalake Gen1
az:	Install adlfs to access Azure Datalake Gen2 and Azure Blob Storage
blockcache:	fsspec.implementations.cached.CachingFileSystem
cached:	fsspec.implementations.cached.CachingFileSystem
dask:	Install dask distributed to access worker file system
dbfs:	Install the requests package to use the DatabricksFileSystem
dropbox:	DropboxFileSystem requires "dropboxdrivefs","requests" and "dropbox" to be installed
file:	fsspec.implementations.local.LocalFileSystem
filecache:	fsspec.implementations.cached.WholeFileCacheFileSystem
ftp:	fsspec.implementations.ftp.FTPFileSystem
gcs:	Please install gcsfs to access Google Storage
gdrive:	Please install gdrivefs for access to Google Drive
git:	Install pygit2 to browse local git repos
github:	Install the requests package to use the github FS
gs:	Please install gcsfs to access Google Storage
hdfs:	pyarrow and local java libraries requ