# File path <-> URL conversion

Because there is (currently) no lightweight, cross-platform C/C++ library for converting paths to and from `file://` URLs, OpenAssetIO bundles its own utility for the job.

The utility is a class, `FileUrlPathConverter`, with two methods: `pathToUrl` and `pathFromUrl`.

The implementation conforms to the large test case database used in the [swift-url](https://github.com/karwa/swift-url) project.

The `FileUrlPathConverter` class is not cheap to instantiate (it precompiles several regular expressions on construction). High-performance scenarios should construct a single instance and re-use it for each conversion.

In [1]:
from openassetio.utils import FileUrlPathConverter


converter = FileUrlPathConverter()
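
One way to follow the advice about re-using a single instance is to hide construction behind a cached factory. The sketch below is only an illustration: `shared_instance` and `CostlyConverter` are hypothetical names, and the stand-in class exists purely so the example runs without OpenAssetIO installed. In real code the factory passed in would be `FileUrlPathConverter`.

```python
import functools


@functools.lru_cache(maxsize=None)
def shared_instance(factory):
    """Call `factory()` on first use and return the cached object afterwards."""
    return factory()


class CostlyConverter:
    """Hypothetical stand-in that counts how often it is constructed."""

    constructed = 0

    def __init__(self):
        type(self).constructed += 1


# Both lookups return the same object, so construction happens only once.
first = shared_instance(CostlyConverter)
second = shared_instance(CostlyConverter)
print(first is second, CostlyConverter.constructed)
```

With OpenAssetIO available, `shared_instance(FileUrlPathConverter)` would play the same role as the module-level `converter` used throughout this notebook.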

## The basics

The converter methods take a `PathType` parameter that specifies the intended platform: `kWindows`, `kPOSIX` or `kSystem`.

Let's try a canonical Windows path:

In [2]:
from resources.helpers import display_result
from openassetio.utils import PathType


url = converter.pathToUrl(r"C:\path\to\file.ext", PathType.kWindows)
path = converter.pathFromUrl(url, PathType.kWindows)

display_result((f"URL: {url}", f"Path: {path}"))

> **Result:**
> - `URL: file:///C:/path/to/file.ext`
> - `Path: C:\path\to\file.ext`

And similarly for POSIX:

In [3]:
url = converter.pathToUrl(r"/path/to/file.ext", PathType.kPOSIX)
path = converter.pathFromUrl(url, PathType.kPOSIX)

display_result((f"URL: {url}", f"Path: {path}"))

> **Result:**
> - `URL: file:///path/to/file.ext`
> - `Path: /path/to/file.ext`

If we pass `PathType.kSystem`, or omit that argument, the path is converted according to the rules of the current platform:

In [4]:
# Convert path to URL.

url_default = converter.pathToUrl(r"//path/to/file.ext")
url_system = converter.pathToUrl(r"//path/to/file.ext", PathType.kSystem)
assert url_default == url_system

url_windows = converter.pathToUrl(r"//path/to/file.ext", PathType.kWindows)
url_posix = converter.pathToUrl(r"//path/to/file.ext", PathType.kPOSIX)

# Convert URL back to path.

path_default = converter.pathFromUrl(url_default)
path_system = converter.pathFromUrl(url_system, PathType.kSystem)
assert path_default == path_system

path_windows = converter.pathFromUrl(url_windows, PathType.kWindows)
path_posix = converter.pathFromUrl(url_posix, PathType.kPOSIX)

display_result(
    (f"System URL: {url_system}", f"POSIX URL: {url_posix}", f"Windows URL: {url_windows}",
     f"Path from system URL: {path_system}", f"Path from POSIX URL: {path_posix}", f"Path from Windows URL: {path_windows}"
     ))

> **Result:**
> - `System URL: file:////path/to/file.ext`
> - `POSIX URL: file:////path/to/file.ext`
> - `Windows URL: file://path/to/file.ext`
> - `Path from system URL: //path/to/file.ext`
> - `Path from POSIX URL: //path/to/file.ext`
> - `Path from Windows URL: \\path\to\file.ext`

Note that on POSIX, `//path/to` is valid (the leading double-`/` is implementation-dependent), and on Windows, `//path/to` refers to a UNC share path with host `path` and share name `to`.
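
The two readings of `//path/to/file.ext` can also be seen with Python's standard `pathlib` module, which parses paths under POSIX or Windows rules regardless of the host platform. This is a rough cross-check only: `pathlib` is not what `FileUrlPathConverter` uses internally, and the two may disagree on edge cases.

```python
from pathlib import PurePosixPath, PureWindowsPath

text = "//path/to/file.ext"

posix = PurePosixPath(text)
windows = PureWindowsPath(text)

# Under POSIX rules the double slash is kept as a distinct root.
print("POSIX root:", posix.root)
print("POSIX parts:", posix.parts)

# Under Windows rules the first two components form the UNC host and share.
print("Windows drive:", windows.drive)
print("Windows parts:", windows.parts)
```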

## Platform-specific validation

The given path must be absolute, and the structure of an absolute path is platform-specific. If we try the Windows path with a POSIX path type, we get an error because there is no leading `/`:

In [5]:
from openassetio.errors import InputValidationException


try:
    url = converter.pathToUrl(r"C:\path\to\file.ext", PathType.kPOSIX)
except InputValidationException as exc:
    display_result(exc)

> **Result:**
> `Path is relative ('C:\path\to\file.ext')`

Similarly, if we try the POSIX path with a Windows path type, the same error is raised, because there is no leading drive letter or UNC host:

In [6]:
try:
    url = converter.pathToUrl(r"/path/to/file.ext", PathType.kWindows)
except InputValidationException as exc:
    display_result(exc)

> **Result:**
> `Path is relative ('/path/to/file.ext')`

These are only two of many cases where validation and conversion depend on the platform.
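
The platform-specific notion of "absolute" described above has a close analogue in the standard library. The sketch below uses `pathlib` to apply each platform's rules to both example paths. It is an analogy, not the converter's own validation logic, which may differ in detail.

```python
from pathlib import PurePosixPath, PureWindowsPath

windows_text = r"C:\path\to\file.ext"
posix_text = "/path/to/file.ext"

# Apply each platform's rules to each path.
checks = {
    "Windows path, Windows rules": PureWindowsPath(windows_text).is_absolute(),
    "Windows path, POSIX rules": PurePosixPath(windows_text).is_absolute(),
    "POSIX path, POSIX rules": PurePosixPath(posix_text).is_absolute(),
    "POSIX path, Windows rules": PureWindowsPath(posix_text).is_absolute(),
}

for label, is_absolute in checks.items():
    print(f"{label}: {is_absolute}")
```

Under POSIX rules the Windows path has no leading `/`, and under Windows rules the POSIX path has a root but no drive or UNC host, so both mismatched combinations report a relative path.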

## Windows specifics

Windows has several types of path: drive paths, UNC share paths and UNC device paths, both normalised and non-normalised.

In [7]:
# (Normalised) drive path
url_drive = converter.pathToUrl(r"C:/path\to\/\file.ext", PathType.kWindows)
path_drive = converter.pathFromUrl(url_drive, PathType.kWindows)

# (Normalised) UNC share path
url_unc_share = converter.pathToUrl(r"\\host/share\path\to\/\file.ext", PathType.kWindows)
path_unc_share = converter.pathFromUrl(url_unc_share, PathType.kWindows)

# Non-normalised UNC device drive path
url_device_drive = converter.pathToUrl(r"\\?\C:\path\to\file.ext", PathType.kWindows)
path_device_drive = converter.pathFromUrl(url_device_drive, PathType.kWindows)

# Non-normalised UNC device share path
url_device_share = converter.pathToUrl(r"\\?\UNC\host\share\path\to\file.ext", PathType.kWindows)
path_device_share = converter.pathFromUrl(url_device_share, PathType.kWindows)

display_result(
    (f"Drive URL: {url_drive}", f"UNC share URL: {url_unc_share}",
     f"UNC device drive URL: {url_device_drive}",
     f"UNC device share URL: {url_device_share}",
     f"Path from drive URL: {path_drive}", f"Path from UNC share URL: {path_unc_share}",
     f"Path from UNC device drive URL: {path_device_drive}",
     f"Path from UNC device share URL: {path_device_share}"
     ))

> **Result:**
> - `Drive URL: file:///C:/path/to/file.ext`
> - `UNC share URL: file://host/share/path/to/file.ext`
> - `UNC device drive URL: file:///C:/path/to/file.ext`
> - `UNC device share URL: file://host/share/path/to/file.ext`
> - `Path from drive URL: C:\path\to\file.ext`
> - `Path from UNC share URL: \\host\share\path\to\file.ext`
> - `Path from UNC device drive URL: C:\path\to\file.ext`
> - `Path from UNC device share URL: \\host\share\path\to\file.ext`

Note that converting to a `file` URL loses information, namely whether the path was originally a device path. This can have implications for device paths that cannot be normalised by the Windows API.
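
The URLs in the result above differ in where the host ends up: a drive path yields an empty authority with the drive letter at the start of the URL path, whereas a UNC share path puts the host in the authority. A small sketch with the standard `urllib.parse` module makes that structure visible. It only inspects the URL strings shown above and says nothing about how the converter itself parses them.

```python
from urllib.parse import urlsplit

drive_url = urlsplit("file:///C:/path/to/file.ext")
share_url = urlsplit("file://host/share/path/to/file.ext")

# Drive path: no host, and the drive letter is the first URL path segment.
print(repr(drive_url.netloc), drive_url.path)

# UNC share path: the host is the authority, and the share starts the URL path.
print(repr(share_url.netloc), share_url.path)
```

Neither form has anywhere to record a `\\?\` prefix, which is why the device and non-device variants produce identical URLs.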

As a special case, paths that exceed the Windows `MAX_PATH` limit will be automatically converted to non-normalised device paths (i.e. prefixed with `\\?\`):

In [8]:
long_drive_path = converter.pathFromUrl(
    "file:///C:/w/0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890"
    "12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234",
    PathType.kWindows)

long_unc_path = converter.pathFromUrl(
    "file://host/share/"
    "01234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890"
    "12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456",
    PathType.kWindows)

display_result((f"Long drive path: {long_drive_path}", f"Long share path: {long_unc_path}"))

> **Result:**
> - `Long drive path: \\?\C:\w\012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234`
> - `Long share path: \\?\UNC\host\share\0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456`
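
The prefixing rule can be sketched in a few lines. The helper below is hypothetical and not part of OpenAssetIO. It assumes the Win32 `MAX_PATH` value of 260 (which counts the terminating NUL) and assumes that a path of that length or longer triggers the conversion. The exact threshold the converter applies is not stated here, so treat the comparison as a guess.

```python
MAX_PATH = 260  # Win32 limit, counting the terminating NUL character.

DEVICE_PREFIX = "\\\\?\\"
DEVICE_UNC_PREFIX = "\\\\?\\UNC\\"


def add_device_prefix_if_long(path: str, limit: int = MAX_PATH) -> str:
    """Return `path` as a non-normalised device path if it is too long.

    Hypothetical helper; the `len(path) >= limit` threshold is an assumption.
    """
    if len(path) < limit or path.startswith(DEVICE_PREFIX):
        # Short enough, or already a device path: leave untouched.
        return path
    if path.startswith("\\\\"):
        # UNC share path: \\host\share\... becomes \\?\UNC\host\share\...
        return DEVICE_UNC_PREFIX + path[2:]
    # Drive path: C:\... becomes \\?\C:\...
    return DEVICE_PREFIX + path


print(add_device_prefix_if_long(r"C:\path\to\file.ext"))
print(add_device_prefix_if_long("C:\\" + "a" * 300)[:10])
print(add_device_prefix_if_long("\\\\host\\share\\" + "a" * 300)[:20])
```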

### Unsupported Windows path features

Normalised UNC device paths of the form `\\.\` are not yet supported.

Use of `/` in device paths is also not (yet) supported. In a non-normalised device path, `/` should be treated as part of a file name, not as a path separator.

In [9]:
try:
    url = converter.pathToUrl(r"\\.\C:\path\to\file.ext", PathType.kWindows)
except InputValidationException as exc:
    display_result(exc)

try:
    url = converter.pathToUrl(r"\\?\C:\path/to\file.ext", PathType.kWindows)
except InputValidationException as exc:
    display_result(exc)

> **Result:**
> `Path references an invalid hostname ('\\.\C:\path\to\file.ext')`

> **Result:**
> `Unsupported Win32 device path ('\\?\C:\path/to\file.ext')`