In [1]:
import pathlib
import warnings
from tempfile import NamedTemporaryFile

from upath import UPath

warnings.filterwarnings(action="ignore", message="UPath .*", module="upath.core")

# Local Filesystem

If you pass a local path, `UPath` returns a `pathlib.PosixPath` or `pathlib.WindowsPath` instance, just as `pathlib.Path` does.

In [2]:
tmp = NamedTemporaryFile()
print(tmp.name, type(tmp.name))
local_path = UPath(tmp.name)
assert isinstance(local_path, (pathlib.PosixPath, pathlib.WindowsPath))
local_path

/tmp/tmpdeaokyh7 <class 'str'>


PosixPath('/tmp/tmpdeaokyh7')

If you pass a URI whose scheme is registered with `fsspec`, it returns a `UPath` that uses the corresponding `fsspec` filesystem as its backend:

In [3]:
local_uri = local_path.absolute().as_uri()
print(f"{local_uri=}")

local_upath = UPath(local_uri)
print(f"{local_upath=}")

print(f"{type(local_upath)=}")
assert isinstance(local_upath, UPath)

print(f"{type(local_upath.fs)=}")
tmp.close()

local_uri='file:///tmp/tmpdeaokyh7'
local_upath=UPath('file:/tmp/tmpdeaokyh7')
type(local_upath)=<class 'upath.core.UPath'>
type(local_upath.fs)=<class 'fsspec.implementations.local.LocalFileSystem'>
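
The difference between the two cases above comes down to whether the input carries a URI scheme. The following standard-library sketch only inspects the strings; it does not reproduce the dispatch logic inside `UPath`, and `example.txt` is a made-up file name that does not need to exist.

```python
import pathlib
from urllib.parse import urlsplit

# A plain POSIX-style path string carries no URI scheme.
plain_scheme = urlsplit("/tmp/example.txt").scheme

# as_uri() requires an absolute path and yields a "file" URI.
uri = pathlib.Path("example.txt").absolute().as_uri()
uri_scheme = urlsplit(uri).scheme

print(repr(plain_scheme), repr(uri_scheme))
```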


# `fsspec` FileSystems

With `UPath` you can connect to any `fsspec` FileSystem and interact with it as you would with your local filesystem using `pathlib`. Connection arguments can be given in a couple of ways:

You can give them as keyword arguments as described in the `fsspec` [docs](https://filesystem-spec.readthedocs.io/en/latest/api.html#built-in-implementations) for each filesystem implementation:

In [4]:
ghpath = UPath('github:/', org='fsspec', repo='universal_pathlib', sha='main')
assert ghpath.exists()
ghpath.fs

<fsspec.implementations.github.GithubFileSystem at 0x7f87bfea66b0>

Alternatively, encode them in the URL, in which case they are parsed into the corresponding connection arguments:

In [5]:
ghpath = UPath('github://fsspec:universal_pathlib@main/')
ghpath

UPath('github://fsspec:universal_pathlib@main/')
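
As a rough illustration of how the URL form can carry the same three connection arguments, the standard library's `urlsplit` separates the user, password, and host slots of the URL. The mapping of those slots to `org`, `repo`, and `sha` is an assumption inferred from the two equivalent examples above; the parsing that `fsspec` performs internally is not shown here and may differ in detail.

```python
from urllib.parse import urlsplit

url = "github://fsspec:universal_pathlib@main/"
parts = urlsplit(url)

# Assumed mapping, inferred from the keyword-argument example:
# user -> org, password -> repo, host -> sha
github_kwargs = {
    "org": parts.username,
    "repo": parts.password,
    "sha": parts.hostname,
}
print(parts.scheme, github_kwargs)
```

Note that `hostname` lower-cases its value, so this sketch would not preserve a mixed-case branch or tag name.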

With a `UPath` object instantiated, you can now interact with the paths with the usual `pathlib.Path` API.

In [6]:
for p in ghpath.iterdir():
    print(p)
    break

github://fsspec:universal_pathlib@main/.flake8


All the standard path methods and attributes of [`pathlib.Path`](https://docs.python.org/3/library/pathlib.html#pathlib.Path) are available too:

In [7]:
readme_path = ghpath / "README.md"
readme_path

UPath('github://fsspec:universal_pathlib@main/README.md')

To get the full path as a string use:

In [8]:
str(readme_path)

'github://fsspec:universal_pathlib@main/README.md'

You can also use the path attribute to get just the path:

In [9]:
# UPath adds a `path` attribute on top of the pathlib API
readme_path.path

'/README.md'

In [10]:
readme_path.name, readme_path.stem, readme_path.suffix

('README.md', 'README', '.md')
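
These attributes follow `pathlib` semantics. As a quick cross-check that needs no network access, the same values can be reproduced with `pathlib.PurePosixPath` applied to the `.path` string shown above. This is only a sketch of the naming rules, not of how `UPath` computes them.

```python
from pathlib import PurePosixPath

# The string returned by readme_path.path in the cell above.
pure = PurePosixPath("/README.md")

# name is the final component, suffix is the last dotted extension,
# and stem is the name without that suffix.
components = (pure.name, pure.stem, pure.suffix)
print(components)
```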

In [11]:
readme_path.read_text().splitlines()[0]

'# Universal Pathlib'

In [12]:
s3path = UPath("s3://spacenet-dataset")

In [13]:
for p in s3path.iterdir():
    if p.is_file():
        print(p)
        break

s3://spacenet-dataset/LICENSE.md


You can chain path segments with the `/` operator and call any path method on the result.

In [14]:
(s3path / "LICENSE.md").exists()

True

In [15]:
with (s3path / "LICENSE.md").open("rt", encoding="utf-8") as f:
    print(f.read(22))

The "SpaceNet Dataset"


The `glob` method is also available for most filesystems. Note that the pattern syntax follows the `fsspec` [docs](https://filesystem-spec.readthedocs.io/en/latest/api.html#fsspec.spec.AbstractFileSystem.glob) rather than `pathlib`.

In [16]:
for p in (s3path / "AOIs" / "AOI_3_Paris").glob("**.TIF"):
    print(p)
    break

s3://spacenet-dataset/AOIs/AOI_3_Paris/MS/16FEB29111913-M2AS_R01C1-055649178040_01_P001.TIF
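
For comparison, `pathlib` spells a recursive match as `**/` followed by a file pattern. The sketch below builds a throwaway local directory (the file names are made up for illustration) and globs it with plain `pathlib`. It does not exercise the `fsspec` glob syntax used above.

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    root = pathlib.Path(tmpdir)
    (root / "MS").mkdir()
    (root / "top.TIF").touch()
    (root / "MS" / "nested.TIF").touch()
    (root / "MS" / "notes.txt").touch()

    # In pathlib, "**" stands for "this directory and all subdirectories".
    matches = sorted(
        p.relative_to(root).as_posix() for p in root.glob("**/*.TIF")
    )

print(matches)
```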


### Works with fsspec filesystems

Some filesystems may require additional packages to be installed.

Check out some of the known implementations:

In [17]:
from fsspec.registry import known_implementations
from IPython.display import Markdown, display

known = [
    f"| {name} | {d['class']} |" for name, d in sorted(known_implementations.items())
]
known = "\n".join(["| Name | Class |\n| --- | --- |", *known])
display(Markdown(known))

| Name | Class |
| --- | --- |
| abfs | adlfs.AzureBlobFileSystem |
| abfss | adlfs.AzureBlobFileSystem |
| adl | adlfs.AzureDatalakeFileSystem |
| arrow_hdfs | fsspec.implementations.arrow.HadoopFileSystem |
| asynclocal | morefs.asyn_local.AsyncLocalFileSystem |
| az | adlfs.AzureBlobFileSystem |
| blockcache | fsspec.implementations.cached.CachingFileSystem |
| cached | fsspec.implementations.cached.CachingFileSystem |
| dask | fsspec.implementations.dask.DaskWorkerFileSystem |
| dbfs | fsspec.implementations.dbfs.DatabricksFileSystem |
| dir | fsspec.implementations.dirfs.DirFileSystem |
| dropbox | dropboxdrivefs.DropboxDriveFileSystem |
| dvc | dvc.api.DVCFileSystem |
| file | fsspec.implementations.local.LocalFileSystem |
| filecache | fsspec.implementations.cached.WholeFileCacheFileSystem |
| ftp | fsspec.implementations.ftp.FTPFileSystem |
| gcs | gcsfs.GCSFileSystem |
| gdrive | gdrivefs.GoogleDriveFileSystem |
| generic | fsspec.generic.GenericFileSystem |
| git | fsspec.implementations.git.GitFileSystem |
| github | fsspec.implementations.github.GithubFileSystem |
| gs | gcsfs.GCSFileSystem |
| hdfs | fsspec.implementations.arrow.HadoopFileSystem |
| hf | huggingface_hub.HfFileSystem |
| http | fsspec.implementations.http.HTTPFileSystem |
| https | fsspec.implementations.http.HTTPFileSystem |
| jlab | fsspec.implementations.jupyter.JupyterFileSystem |
| jupyter | fsspec.implementations.jupyter.JupyterFileSystem |
| libarchive | fsspec.implementations.libarchive.LibArchiveFileSystem |
| memory | fsspec.implementations.memory.MemoryFileSystem |
| oci | ocifs.OCIFileSystem |
| oss | ossfs.OSSFileSystem |
| reference | fsspec.implementations.reference.ReferenceFileSystem |
| root | fsspec_xrootd.XRootDFileSystem |
| s3 | s3fs.S3FileSystem |
| s3a | s3fs.S3FileSystem |
| sftp | fsspec.implementations.sftp.SFTPFileSystem |
| simplecache | fsspec.implementations.cached.SimpleCacheFileSystem |
| smb | fsspec.implementations.smb.SMBFileSystem |
| ssh | fsspec.implementations.sftp.SFTPFileSystem |
| tar | fsspec.implementations.tar.TarFileSystem |
| wandb | wandbfs.WandbFS |
| webdav | webdav4.fsspec.WebdavFileSystem |
| webhdfs | fsspec.implementations.webhdfs.WebHDFS |
| zip | fsspec.implementations.zip.ZipFileSystem |