bpo-39090: Doc: Add a chapter on making paths absolute
Discussing Path.resolve(), os.path.abspath() and Path.cwd() / 'otherpath' in
that order, to emphasize that resolve() is the preferred way.
Floris Lambrechts committed Feb 12, 2020
1 parent 8c579b1 commit c146ad3
Showing 1 changed file with 49 additions and 0 deletions.
49 changes: 49 additions & 0 deletions Doc/library/pathlib.rst
@@ -1136,6 +1136,55 @@ call fails (for example because the path doesn't exist).
have the same meaning as in :func:`open`.

.. versionadded:: 3.5


.. _absolute-paths:

Absolute paths
--------------

A path is considered *absolute* (:meth:`PurePath.is_absolute`) if it has
a *root*.

eryksun Feb 12, 2020

An absolute Windows path has both a drive and a root directory. If it only has a root directory, it's relative to the drive of the working directory. ntpath.isabs gets this wrong. Path.is_absolute gets it right.
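
The drive-plus-root rule can be checked on any platform with `PureWindowsPath`, which only manipulates path strings and never touches the filesystem. This is a minimal sketch; the helper name `describe` and the sample paths are made up for illustration.

```python
from pathlib import PureWindowsPath


def describe(text):
    """Return (drive, root, is_absolute) for a Windows-style path string."""
    p = PureWindowsPath(text)
    return p.drive, p.root, p.is_absolute()


# Only a path with both a drive and a root counts as absolute.
for text in ["C:/spam", "/spam", "C:spam", "spam"]:
    drive, root, absolute = describe(text)
    print(f"{text!r}: drive={drive!r} root={root!r} absolute={absolute}")
```

`ntpath.isabs` is left out of the sketch on purpose, since its answer for a rooted path without a drive may depend on the Python version.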

florisla Feb 24, 2020

Owner

Will add this to separate section 'Windows considerations'.


But there are *multiple* details which may change when transforming
a path into its full, canonical variant.

- Add ``root`` (local or global) if it's not already present.
- Add ``drive`` letter or name if the Path flavour allows it and it's not
already present.
- Replace relative parts ("``..``") with absolute ones.
- Replace symbolic links or junctions with their destination.
- Change case to the canonical case on case-insensitive but case-preserving
file systems.
- Replace ``drive`` with the UNC share name if the drive is a Windows mapped
network share ("``X:``" becomes "``\\filehost\folder``")

eryksun Feb 12, 2020

I'd use "mapped share" instead of "mapped network share". A UNC provider can implement a filesystem redirector based on anything, not just a filesystem directory shared over a network protocol such as SMB or WebDAV.

eryksun Feb 12, 2020

It also replaces a substitute drive with the final path. For example, given subst W: C:\Windows, "W:\System32" is resolved as "C:\Windows\System32".


FYI, mapped and substitute drives are non-canonical for a few reasons.

  • They're not system-wide. They're only defined for a particular user's logon session.
  • They're not an object link to a device object, but rather to a filesystem directory, so API functions that expect a drive to be a device may fail.
  • They're not registered with the mountpoint manager, so there's no simple lookup to map them from the final path back to a drive.

The function :meth:`Path.resolve` applies all of the above
transformations.

If the path is not yet absolute, it adds ``root``, ``drive`` and the base
folder path of the current working directory as retrieved by
:meth:`Path.cwd`.

In contrast, :func:`os.path.abspath` also uses the current working directory
but it does *not* follow symbolic links, never modifies case, and does not


eryksun Feb 12, 2020

ntpath.abspath normally preserves case, but it does change the case of a drive letter from lower to upper when resolving drive-relative paths. For example:

>>> ntpath.abspath('d:spam')
'D:\\spam'
replace network share ``drive`` with the UNC path.

It is also always non-*strict*: :func:`os.path.abspath` never raises
:exc:`FileNotFoundError`, regardless of the Python version.

If you want even fewer side effects and only need to ensure that the path
has a ``root``, simply prepend the current working
directory::

>>> Path.cwd() / '../file.txt'
PosixPath('/home/anne/../file.txt')

This technique preserves the path if it was already absolute:

>>> Path.cwd() / "/absolute/path"
PosixPath('/absolute/path')


Correspondence to tools in the :mod:`os` module
-----------------------------------------------

3 comments on commit c146ad3

@ChrisBarker-NOAA

Perhaps it would be a good idea to have a general section, and then a "Considerations on Windows" section clearly delineated.

Maybe one for *nix, too, though I think there's less to that, yes? That is, the Windows behavior is a superset of the *nix behavior.

Question: IIUC, Windows (ntfs, anyway) does support soft links -- but they are rarely used by end users (the UI doesn't provide a way to make them) -- does pathlib "do the right thing" with links on Windows?

Also: it would be really good to have, at the top, a TL;DR: that is:

"""
If you want an absolute path, in most cases, you can do this: ********

For more detail, read on ....
"""
Final question, from the GitHub issue, I got the impression that Path.resolve only works if the path actually exists -- is that the case? If so, that needs to be made clear.

@eryksun

> Question: IIUC, Windows (ntfs, anyway) does support soft links -- but they are rarely used by end users (the UI doesn't provide a way to make them) -- does pathlib "do the right thing" with links on Windows?

pathlib resolve() calls nt._getfinalpathname, which calls WINAPI CreateFileW and GetFinalPathNameByHandleW. This is limited to however CreateFileW parses the path, which does not match the expectations of Unix regarding ".." components.

In Unix, parsing "/spam/symlink/../eggs" resolves "symlink" before evaluating "..". Windows does not support this. CreateFileW first normalizes a DOS path into a native NT path, which has to resolve ".." components because ".." has no meaning in NT path parsing (except in the target of a relative symlink -- just to satisfy our expectations that nothing is ever consistent). The name ".." may not even be a reserved name in a filesystem. NTFS reserves it, but FAT32 and exFAT allow creating a file named "..".

For example, WINAPI CreateFileW(L"C:/spam/symlink/../eggs", ...) translates from DOS to the NT path "\??\C:\spam\eggs" and calls NTAPI NtCreateFile.

Handling symlinks according to Unix rules would require parsing a path forward, calling _getfinalpathname one component at a time until it fails. But this result would not be consistent with passing the path to open, so it's the wrong result in Windows.
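
As a rough illustration of the Unix rule described above, the following sketch builds a throwaway directory tree and compares `Path.resolve()` with the purely lexical `os.path.abspath()`. The directory names `real`, `sub`, `spam` and `eggs` and the helper name are made up for the example. It assumes a POSIX system where creating symlinks is permitted; on Windows, `resolve()` is expected to give a different result, for the reasons given above.

```python
import os
import tempfile
from pathlib import Path


def compare_dotdot_handling():
    """Return (resolved, lexical) for spam/symlink/../eggs, relative to the temp dir."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp).resolve()  # canonical form of the temp dir itself
        (base / "real" / "sub").mkdir(parents=True)
        (base / "spam").mkdir()
        # spam/symlink points at real/sub
        os.symlink(base / "real" / "sub", base / "spam" / "symlink")

        tricky = base / "spam" / "symlink" / ".." / "eggs"

        # resolve() follows the symlink first, then applies "..".
        resolved = tricky.resolve()
        # abspath() collapses "symlink/.." textually, without looking at the disk.
        lexical = Path(os.path.abspath(tricky))

        return (resolved.relative_to(base).as_posix(),
                lexical.relative_to(base).as_posix())


if __name__ == "__main__":
    print(compare_dotdot_handling())
```

On POSIX the two results name different locations, which is why a lexical cleanup of ".." is not a safe substitute for resolving the path when symlinks may be involved.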

> Final question, from the GitHub issue, I got the impression that Path.resolve only works if the path actually exists -- is that the case? If so, that needs to be made clear.

resolve() defaults to non-strict mode, which tries _getfinalpathname in a loop that walks the path in reverse until it either succeeds or the path is exhausted. The remaining inaccessible components are appended to the result.
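
A small sketch of the difference between the two modes, assuming Python 3.6 or later (where the `strict` parameter exists). The helper name and the missing path `no/such/file.txt` are hypothetical.

```python
import tempfile
from pathlib import Path


def resolve_missing(strict):
    """Resolve a path that does not exist; return None if resolve() raises."""
    with tempfile.TemporaryDirectory() as tmp:
        missing = Path(tmp) / "no" / "such" / "file.txt"
        try:
            return missing.resolve(strict=strict)
        except FileNotFoundError:
            return None


if __name__ == "__main__":
    # Non-strict (the default): the missing components are kept in the result.
    print(resolve_missing(strict=False))
    # Strict: the missing path raises FileNotFoundError, reported here as None.
    print(resolve_missing(strict=True))
```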

@florisla
Owner

Thank you all for the great feedback.

I've made a new revision here:
https://github.com/florisla/cpython/blob/pathlib-chapter-absolute-paths-2/Doc/library/pathlib.rst#absolute-paths

Changes:

  • Be more 'in your face' about Path.resolve() being the recommended
    approach.
  • Add separate section on Windows considerations
  • Explain difference between Path.resolve() and os.path.isabs() w.r.t. checking
    for drive.
  • Refer to 'mapped share' instead of 'mapped network share'.
  • Explain replacement of substitute drive with final path.
  • Mention os.path.abspath's upcasing of drive letter in case of
    a path missing a root.
  • Mention different handling of junctions versus symlinks w.r.t.
    relative parts.

For brevity, I've kept the wording on substitute drive and handling of
junctions very short.

For the same reason I did not include eryksun's (interesting!) info
on why mapped and substitute drives are non-canonical.

Not mentioning Path.resolve()'s behaviour w.r.t. non-existing files since
that's documented in resolve() itself.
