In [1]:
from upath import UPath

### local filesystem

If you give it a local path, `UPath` defaults to `pathlib.PosixPath` or `pathlib.WindowsPath`, depending on the platform.

In [2]:
local_path = UPath('/tmp')
local_path

PosixPath('/tmp')

If you give it a scheme registered with fsspec, it returns a `UPath` that uses the fsspec filesystem backend.

In [3]:
local_upath = UPath('file:/tmp')
local_upath



UPath('file:/tmp')
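The two inputs above differ only in whether the string carries a scheme. As a rough standard-library illustration, `urllib.parse.urlsplit` shows which of the two strings has one. This is not how `UPath` itself decides which class to return, and `has_scheme` is a hypothetical helper introduced only for this sketch.

```python
from urllib.parse import urlsplit


def has_scheme(path: str) -> bool:
    """Hypothetical helper: True if the string starts with a URL scheme."""
    return urlsplit(path).scheme != ""


plain = has_scheme("/tmp")               # no scheme prefix
with_scheme = has_scheme("file:/tmp")    # "file" scheme prefix
scheme = urlsplit("file:/tmp").scheme
```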

### fsspec filesystems

With `UPath` you can connect to any fsspec filesystem and interact with it as you would with your local filesystem using pathlib. Connection arguments can be given in a couple of ways:

You can give them as keyword arguments as described for each filesystem in the fsspec docs:

In [4]:
ghpath = UPath('github:/', org='fsspec', repo='universal_pathlib', sha='main')

Or you can define them in the path/URL itself, in which case they are parsed out of it:

In [5]:
ghpath = UPath('github://fsspec:universal_pathlib@main/')
ghpath

GithubPath('github://fsspec:universal_pathlib@main/')
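To see how the URL form lines up with the keyword form, the sketch below splits the same kind of URL with the standard library. The mapping of user, password and host onto `org`, `repo` and `sha` is an assumption inferred from the keyword-argument cell above; this is an illustration, not the parsing code that fsspec runs.

```python
from urllib.parse import urlsplit

url = "github://fsspec:universal_pathlib@main/README.md"
parts = urlsplit(url)

# Assumed mapping, inferred from the keyword-argument example:
# user -> org, password -> repo, host -> sha.
connection = {
    "org": parts.username,
    "repo": parts.password,
    "sha": parts.hostname,
}

# What remains is the path inside the repository.
path_in_repo = parts.path
```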

Once a `UPath` object is instantiated, you can work with it through the usual `pathlib.Path` API.

In [6]:
for p in ghpath.iterdir():
    print(p)

github://fsspec:universal_pathlib@main/.flake8
github://fsspec:universal_pathlib@main/.github
github://fsspec:universal_pathlib@main/.gitignore
github://fsspec:universal_pathlib@main/LICENSE
github://fsspec:universal_pathlib@main/README.md
github://fsspec:universal_pathlib@main/environment.yml
github://fsspec:universal_pathlib@main/notebooks
github://fsspec:universal_pathlib@main/noxfile.py
github://fsspec:universal_pathlib@main/pyproject.toml
github://fsspec:universal_pathlib@main/setup.py
github://fsspec:universal_pathlib@main/upath


The `glob` method is also available for most filesystems. Note that the pattern syntax is the one defined by `fsspec`, not the one used by pathlib.

In [7]:
for p in ghpath.glob('**.py'):
    print(p)

github://fsspec:universal_pathlib@main/noxfile.py
github://fsspec:universal_pathlib@main/setup.py
github://fsspec:universal_pathlib@main/upath/__init__.py
github://fsspec:universal_pathlib@main/upath/core.py
github://fsspec:universal_pathlib@main/upath/errors.py
github://fsspec:universal_pathlib@main/upath/implementations/__init__.py
github://fsspec:universal_pathlib@main/upath/implementations/cloud.py
github://fsspec:universal_pathlib@main/upath/implementations/hdfs.py
github://fsspec:universal_pathlib@main/upath/implementations/http.py
github://fsspec:universal_pathlib@main/upath/implementations/memory.py
github://fsspec:universal_pathlib@main/upath/registry.py
github://fsspec:universal_pathlib@main/upath/tests/__init__.py
github://fsspec:universal_pathlib@main/upath/tests/cases.py
github://fsspec:universal_pathlib@main/upath/tests/conftest.py
github://fsspec:universal_pathlib@main/upath/tests/implementations/__init__.py
github://fsspec:universal_pathlib@main/upath/tests/implementati
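For contrast with the fsspec-style `**.py` pattern used above, here is how plain pathlib spells a recursive search. This is a standard-library sketch on a temporary directory; the file names are made up for the illustration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "pkg").mkdir()
    # Hypothetical files: two Python modules and one text file.
    for rel in ("a.py", "pkg/b.py", "pkg/notes.txt"):
        (root / rel).write_text("")

    # pathlib spells "every .py file at any depth" as **/*.py
    matches = sorted(
        p.relative_to(root).as_posix() for p in root.glob("**/*.py")
    )
```

The pathlib pattern needs the `/` between `**` and `*.py`, so a pattern written for one API may need adjusting before it is used with the other.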

All the standard path methods and attributes of `pathlib.Path` are available too:

In [8]:
readme_path = ghpath / 'README.md'
readme_path

GithubPath('github://fsspec:universal_pathlib@main/README.md')

To get the full path as a string use:

In [9]:
str(readme_path)

'github://fsspec:universal_pathlib@main/README.md'

You can also use the `path` attribute to get just the path within the filesystem, without the scheme and connection details:

In [10]:
# `path` is an extra attribute added by UPath; pathlib.Path does not have it
readme_path.path

'/README.md'

In [11]:
readme_path.name

'README.md'

In [12]:
readme_path.stem

'README'

In [13]:
readme_path.suffix

'.md'
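The `name`, `stem` and `suffix` values above match what pathlib itself reports for the same path string. A small standard-library sketch, using the `'/README.md'` value returned by `readme_path.path`:

```python
from pathlib import PurePosixPath

# Pure paths do no I/O, so this runs without any filesystem access.
p = PurePosixPath("/README.md")

name, stem, suffix = p.name, p.stem, p.suffix
parent = str(p.parent)

# with_suffix returns a new path object; p itself is unchanged.
renamed = p.with_suffix(".rst").name
```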

In [14]:
readme_path.exists()

True

In [15]:
readme_path.read_text()[:19]

'# Universal Pathlib'

Some filesystems require an extra package to be installed and imported before use, such as `s3fs` for S3.

In [16]:
import s3fs

In [17]:
s3path = UPath("s3://spacenet-dataset")

In [18]:
for p in s3path.iterdir():
    print(p)

s3://spacenet-dataset/LICENSE.md
s3://spacenet-dataset/
s3://spacenet-dataset/AOIs
s3://spacenet-dataset/Hosted-Datasets
s3://spacenet-dataset/SpaceNet_Off-Nadir_Dataset
s3://spacenet-dataset/spacenet-model-weights
s3://spacenet-dataset/spacenet-stac
s3://spacenet-dataset/spacenet


You can chain paths with the `/` operator and read text or binary contents.

In [19]:
(s3path / "LICENSE.md").read_text()

'The "SpaceNet Dataset" is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.  The "SpaceNet Dataset" includes all contents of this S3 bucket except for the contents of the "Hosted-Datasets" folder and its subfolders.\n\nhttps://creativecommons.org/licenses/by-sa/4.0/\n'

In [20]:
with (s3path / "LICENSE.md").open("rt", encoding="utf-8") as f:
    print(f.read(22))

The "SpaceNet Dataset"
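The same `read_text` and `open` calls can be tried offline against a local file. The sketch below uses plain `pathlib.Path` on a temporary directory and writes only the first sentence of the license text shown above, so it needs neither network access nor `s3fs`.

```python
import tempfile
from pathlib import Path

# First sentence of the license text shown above.
text = (
    'The "SpaceNet Dataset" is licensed under a Creative Commons '
    "Attribution-ShareAlike 4.0 International License.\n"
)

with tempfile.TemporaryDirectory() as tmp:
    license_path = Path(tmp) / "LICENSE.md"
    license_path.write_text(text, encoding="utf-8")

    # Read the whole file in one call.
    whole = license_path.read_text(encoding="utf-8")

    # Or open it and read only the first 22 characters.
    with license_path.open("rt", encoding="utf-8") as f:
        head = f.read(22)
```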


Globbing also works for many filesystems.

In [21]:
from itertools import islice

# Print only the first five matches
for p in islice((s3path / "AOIs" / "AOI_3_Paris").glob("**.TIF"), 5):
    print(p)

s3://spacenet-dataset/AOIs/AOI_3_Paris/MS/16FEB29111913-M2AS_R01C1-055649178040_01_P001.TIF
s3://spacenet-dataset/AOIs/AOI_3_Paris/MS/16FEB29111913-M2AS_R01C2-055649178040_01_P001.TIF
s3://spacenet-dataset/AOIs/AOI_3_Paris/MS/16FEB29111913-M2AS_R01C3-055649178040_01_P001.TIF
s3://spacenet-dataset/AOIs/AOI_3_Paris/MS/16FEB29111913-M2AS_R01C4-055649178040_01_P001.TIF
s3://spacenet-dataset/AOIs/AOI_3_Paris/MS/16FEB29111913-M2AS_R01C5-055649178040_01_P001.TIF
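`islice` is what keeps the output above to five lines. The sketch below shows the same pattern on a hypothetical generator, `fake_glob`, which stands in for a large result set and records how many items were actually requested. It says nothing about how a particular filesystem produces its glob results internally.

```python
from itertools import islice

produced = []


def fake_glob():
    """Hypothetical stand-in for a large glob: yields 1000 names lazily."""
    for i in range(1000):
        name = f"tile_{i:04d}.TIF"
        produced.append(name)  # record every item that is requested
        yield name


# Take the first five items and stop pulling from the generator.
first_five = list(islice(fake_glob(), 5))
```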
