Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os.path.realpath on Windows does not follow symbolic links #54158

Closed
stutzbach mannequin opened this issue Sep 25, 2010 · 33 comments
Closed

os.path.realpath on Windows does not follow symbolic links #54158

stutzbach mannequin opened this issue Sep 25, 2010 · 33 comments
Assignees
Labels
3.8 (EOL) end of life 3.9 only security fixes OS-windows stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@stutzbach
Copy link
Mannequin

stutzbach mannequin commented Sep 25, 2010

BPO 9949
Nosy @pfmoore, @atsuoishimoto, @jaraco, @pitrou, @vstinner, @ericvsmith, @tjguk, @takluyver, @zware, @eryksun, @zooba, @ethanhs, @pablogsal, @miss-islington
PRs
  • bpo-9949: Enable symlink traversal for ntpath.realpath #15287
  • [3.8] bpo-9949: Enable symlink traversal for ntpath.realpath (GH-15287) #15367
  • bpo-9949: Call normpath first in realpath #15369
  • [3.8] bpo-9949: Call normpath() in realpath() and avoid unnecessary prefixes (GH-15369) #15376
  • Files
  • ntpath_fix_issue9949.zip: Python 3.2 fix for issue 9949
  • issue9949.tar.bz2: Python 3.3 fix for issue os.path.realpath on Windows does not follow symbolic links #54158
  • issue9949-v2.tar.bz2: Python 3.3 fix for issue os.path.realpath on Windows does not follow symbolic links #54158, version 2
  • issue9949-v3.patch: Python 3.3 fix for issue os.path.realpath on Windows does not follow symbolic links #54158, version 3
  • issue9949-v4.patch: Python 3.5 fix for issue os.path.realpath on Windows does not follow symbolic links #54158, version 4
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/zooba'
    closed_at = <Date 2019-08-22.01:06:21.481>
    created_at = <Date 2010-09-25.15:03:13.041>
    labels = ['3.8', 'type-bug', 'library', '3.9', 'OS-windows']
    title = 'os.path.realpath on Windows does not follow symbolic links'
    updated_at = <Date 2019-08-22.09:39:44.437>
    user = 'https://bugs.python.org/stutzbach'

    bugs.python.org fields:

    activity = <Date 2019-08-22.09:39:44.437>
    actor = 'vstinner'
    assignee = 'steve.dower'
    closed = True
    closed_date = <Date 2019-08-22.01:06:21.481>
    closer = 'steve.dower'
    components = ['Library (Lib)', 'Windows']
    creation = <Date 2010-09-25.15:03:13.041>
    creator = 'stutzbach'
    dependencies = []
    files = ['24770', '25225', '25232', '25810', '38057']
    hgrepos = []
    issue_num = 9949
    keywords = ['patch', 'needs review']
    message_count = 33.0
    messages = ['117374', '142585', '155270', '155371', '155372', '155376', '158353', '158390', '162231', '166197', '223299', '223801', '229059', '235615', '275287', '310567', '349771', '349879', '350107', '350109', '350111', '350114', '350116', '350119', '350120', '350135', '350138', '350139', '350140', '350142', '350145', '350149', '350178']
    nosy_count = 19.0
    nosy_names = ['paul.moore', 'ishimoto', 'jaraco', 'pitrou', 'vstinner', 'eric.smith', 'tim.golden', 'stutzbach', 'living180', 'takluyver', 'zach.ware', 'ncdave4life', 'eryksun', 'steve.dower', 'Christian \xc3\x85kerstr\xc3\xb6m', 'ethan smith', 'pablogsal', '\xc3\x89tienne Dupuis', 'miss-islington']
    pr_nums = ['15287', '15367', '15369', '15376']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue9949'
    versions = ['Python 3.8', 'Python 3.9']

    @stutzbach
    Copy link
    Mannequin Author

    stutzbach mannequin commented Sep 25, 2010

    In Lib/ntpath.py:

    # realpath is a no-op on systems without islink support
    realpath = abspath

    However, Windows Vista and newer support symbolic links and other Python methods support them.

    (noticed this through source code inspection; haven't actually tested it)

    @stutzbach stutzbach mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Sep 25, 2010
    @vstinner
    Copy link
    Member

    The issue bpo-12799 has been marked as a duplicate of this issue:
    "From a look at the code, realpath() does not seem to resolve symlinks under Windows, even though we have the _getfinalpathname function to do that. Please indulge with me if I'm wrong :)"

    @ncdave4life
    Copy link
    Mannequin

    ncdave4life mannequin commented Mar 9, 2012

    This is a patch for the os.path.realpath() bug under Windows, http://bugs.python.org/issue9949 "os.path.realpath on Windows does not follow symbolic links"

    ntpath.diff fixes the realpath() function to resolve symbolic links to their targets (tested under Windows 7).

    Note: I tried to make the equivalent patch for Python 2.7, but it didn't work because the "from nt import _getfinalpathname" fails.

    @ncdave4life
    Copy link
    Mannequin

    ncdave4life mannequin commented Mar 11, 2012

    It seems that the nt module is implemented in the posixmodule.c source file, and the Python 3 version contains the posix__getfinalpathname entry point, but the Python 2 version does not.

    I presume that PyWin32 could also be used to work around this. Too bad it isn't automatically included with Python:
    http://sourceforge.net/projects/pywin32/files/pywin32/Build%20217/

    @briancurtin
    Copy link
    Member

    file, and the Python 3 version contains the posix__getfinalpathname entry
    point, but the Python 2 version does not.

    I presume that PyWin32 could also be used to work around this. Too bad
    it isn't automatically included with Python:
    http://sourceforge.net/projects/pywin32/files/pywin32/Build%20217/

    We can't use external packages in the standard library. If you're looking
    for a Python which includes pywin32, ActiveState provides this.

    I think we can just implement the function we need.

    @ncdave4life
    Copy link
    Mannequin

    ncdave4life mannequin commented Mar 11, 2012

    Excellent!

    The ntpath.py change is nearly identical in Python 2.7 to the change for Python 3.2. The only difference is that instead of:

    + elif isinstance(path, bytes):
    + path = os.getcwdb()

    It is:

    + elif isinstance(path, unicode):
    + path = os.getcwdu()

    @living180
    Copy link
    Mannequin

    living180 mannequin commented Apr 15, 2012

    I have attached a series of patches with (hopefully) provide more robust fix for this issue, against the Python 3.3 branch. It handles both bytes and str objects, paths that do not actually exist on the filesystem, and removal of the '\\?\' prefix returned by _getfinalpathname, unless the supplied path had that prefix already. A couple of the patches contain some separate infrastructure that could hopefully be used to simplify some of the other Windows code in posixmodule.c (One immediate possibility would be to combine the code provided in these patches with the symbolic-link resolution code in the Windows stat functions - there is quite a bit of duplication there that could be eliminated.)

    One thing these patches do not address is resolving a broken symbolic link. The Windows API function GetFinalPathNameByHandle does not handle this case (because CreateFile cannot be used to get a handle if the symbolic link is broken). This functionality could be implemented by manually following the reparse points, but that would basically require reimplementing GetFinalPathNameByHandle.

    Finally, this patch could be fairly easily backported to Python 3.2, but that shouldn't be done without careful consideration. It changes the return value from os.path.realpath on Windows even when there are no symbolic links in the path (the returned value will have the actual casing as stored on the filesystem, instead of the casing supplied by the user). I don't think it should be backported to Python 2.7, because that version, like all Python versions before Python 3.2 are unaware of symbolic links on Windows (e.g. lexists is the same function as exists).

    The patches were generated using git (I use git-hg) - if that format is a problem, let me know and I can regenerate them.

    @living180
    Copy link
    Mannequin

    living180 mannequin commented Apr 16, 2012

    Uploading a new series of patches - they are all the same as the first set, except for 0006-Make-realpath-follow-symbolic-links-on-Windows.patch. I realized that I could use os.readlink to handle broken symbolic links, so I changed the logic of os.path.realpath slightly to do that (including recursive symlink detection).

    @living180
    Copy link
    Mannequin

    living180 mannequin commented Jun 3, 2012

    The previous version of this patch did not handle bytes arguments correctly and could fail in conjunction with a non-ASCII compatible encoding. Also, if the result was a UNC path, it was not being handled correctly (the returned value would have started with a single backslash, not two). Version 3 of the patch fixes these issues. I am also attaching a single, condensed patch as requested by the Lifecycle of a Patch devguide. I can also provide a patch series as before if desired.

    @atsuoishimoto
    Copy link
    Mannequin

    atsuoishimoto mannequin commented Jul 23, 2012

    Yet another patch to support symlink on Windows.

    @BreamoreBoy
    Copy link
    Mannequin

    BreamoreBoy mannequin commented Jul 16, 2014

    @zach can you do a patch review on this as it's holding up bpo-13837 ?

    @atsuoishimoto
    Copy link
    Mannequin

    atsuoishimoto mannequin commented Jul 24, 2014

    I have unlinked my patch since it doesn't looks correct now. Sorry for disturbing.

    @zware
    Copy link
    Member

    zware commented Oct 11, 2014

    Daniel: It's taken two years, but I've reviewed your patch :). There are a few things that need to be addressed, but the basic change looks pretty good. If you're still interested in seeing this fixed, I look forward to reviewing an updated patch; otherwise, I'll try to pick this up myself soon.

    @living180
    Copy link
    Mannequin

    living180 mannequin commented Feb 9, 2015

    @zach - thanks for the review. Sorry that it has taken a few months to pick this issue back up. Anyway, here is an updated patch. It is pretty different than the previous patch, to the point that I would consider it a completely new patch.

    Anyway, here's a basic rundown of what the patch contains:

    Two functions in Modules/posixmodule.c were modified to handle bytes objects as well as str objects: win_readlink and _getfinalpathname. This was necessary to allow os.path.realpath() to continue returning bytes when supplied with bytes. I know you said in your review that you didn't care about this because using bytes paths on Windows is deprecated, but I think it still important to maintain backwards compatibility.

    A new implementation of realpath() was added to Lib\ntpath.py, used when _getfinalpathname is available (when _getfinalpathname is not available, realpath remains a synonym for abspath, as it was previously). The logic here has been completely reworked and is roughly modeled on the realpath implementation from Lib\posixpath.py.

    A number of tests were added to Lib\test\test_ntpath.py for realpath() functionality. Three of the tests were based on realpath() tests from test_posixpath.py, but one is completely new: test_realpath_broken_symlinks.

    A note about improved Windows support was added to the documentation for realpath(), but no other doc changes have been made.

    Anyway, feedback is welcome - I'd love to get this code in sufficient shape to see a working realpath() implementation for Windows land in Python.

    @SilentGhost SilentGhost mannequin added the OS-windows label Dec 12, 2015
    @Christiankerstrm
    Copy link
    Mannequin

    Christiankerstrm mannequin commented Sep 9, 2016

    Any update on this? Would be great with a fix for python symlinks on Windows.

    @tienneDupuis
    Copy link
    Mannequin

    tienneDupuis mannequin commented Jan 24, 2018

    @zooba
    Copy link
    Member

    zooba commented Aug 14, 2019

    FYI, there's been some discussion of this on bpo-37834, as the issues quickly became conflated.

    There's also bpo-14094 which is a dup of this one, but with a different patch.

    ---

    To move the relevant discussion here, my current PR is basically the tests from the last attached patch here plus a simpler implementation for non-strict resolution.

    Right now the big open question is how to deal with the \\?\ prefix. My last two commits will leave it alone if it was in the initial path, and otherwise strip it if the non-prefixed path resolves to the same path as the prefixed path (the _getfinalpathname always returns a prefixed path).

    The downside of this is that any unresolvable symlinks (dangling links, cycles, etc.) will now be returned with the prefix, whereas previously they were being returned without any attempt to resolve them being made (not because they are invalid, but because the old code wasn't attempting to resolve anything at all).

    I kinda feel like this is okay, but would appreciate any other opinions on the matter. The alternative is to always strip the prefix, which could make some valid paths become invalid (for example, the one in the new test_realpath_symlink_prefix).

    @zooba zooba added 3.8 (EOL) end of life 3.9 only security fixes labels Aug 14, 2019
    @zooba
    Copy link
    Member

    zooba commented Aug 16, 2019

    Another minor change worth calling out - I added a note to the docs that when a symlink cycle is detected, realpath() could return any member of the cycle, but does not guarantee which one.

    For our test cases it's generally stable enough, but if you change the starting point (e.g. from absolute to relative path) then it can change the result. It seemed easier all round just to not guarantee anything about which member of the cycle will be returned, rather than determining the exact rules, reproducing them across all platforms, and documenting them.

    @zooba
    Copy link
    Member

    zooba commented Aug 21, 2019

    New changeset 75e0649 by Steve Dower in branch 'master':
    bpo-9949: Enable symlink traversal for ntpath.realpath (GH-15287)
    75e0649

    @miss-islington
    Copy link
    Contributor

    New changeset c30c869 by Miss Islington (bot) in branch '3.8':
    bpo-9949: Enable symlink traversal for ntpath.realpath (GH-15287)
    c30c869

    @pablogsal
    Copy link
    Member

    There are multiple failures on several buildbots after commit 75e0649 was merged:

    BUILDBOT FAILURE REPORT
    =======================

    Builder name: AMD64 Windows7 SP1 3.x
    Builder url: https://buildbot.python.org/all/#/builders/40/
    Build url: https://buildbot.python.org/all/#/builders/40/builds/2886

    Failed tests
    ------------

    • test_realpath_curdir (test.test_ntpath.TestNtpath)

    • test_unicode_in_batch_file (test.test_venv.BasicTest)

    Test leaking resources
    ----------------------

    Build summary
    -------------

    == Tests result: FAILURE then FAILURE ==

    384 tests OK.

    10 slowest tests:

    • test_multiprocessing_spawn: 2 min 45 sec
    • test_tools: 2 min 5 sec
    • test_distutils: 1 min 50 sec
    • test_tokenize: 1 min 41 sec
    • test_concurrent_futures: 1 min 39 sec
    • test_lib2to3: 1 min 14 sec
    • test_mmap: 1 min 11 sec
    • test_largefile: 1 min 8 sec
    • test_asyncio: 1 min 4 sec
    • test_io: 1 min 3 sec

    2 tests failed:
    test_ntpath test_venv

    33 tests skipped:
    test_curses test_dbm_gnu test_dbm_ndbm test_devpoll test_epoll
    test_fcntl test_fork1 test_gdb test_grp test_ioctl test_kqueue
    test_multiprocessing_fork test_multiprocessing_forkserver test_nis
    test_openpty test_ossaudiodev test_pipes test_poll test_posix
    test_pty test_pwd test_readline test_resource test_spwd
    test_syslog test_threadsignals test_tix test_tk test_ttk_guionly
    test_wait3 test_wait4 test_xxtestfuzz test_zipfile64

    2 re-run tests:
    test_ntpath test_venv

    Total duration: 13 min 17 sec

    Tracebacks
    ----------

    Traceback (most recent call last):
      File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\test_venv.py", line 337, in test_unicode_in_batch_file
        out, err = check_output(
      File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\test_venv.py", line 42, in check_output
        raise subprocess.CalledProcessError(
    subprocess.CalledProcessError: Command '['\\\\?\\C:\\Users\\Buildbot\\AppData\\Local\\Temp\\tmpsbwlxx8h\\\u03fc\u045e\u0422\u03bb\u0424\u0419\\Scripts\\activate.bat', '&', 'python_d.exe', '-c', 'print(0)']' returned non-zero exit status 1.
    
    
    Traceback (most recent call last):
      File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\test_venv.py", line 337, in test_unicode_in_batch_file
        out, err = check_output(
      File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\test_venv.py", line 42, in check_output
        raise subprocess.CalledProcessError(
    subprocess.CalledProcessError: Command '['\\\\?\\C:\\Users\\Buildbot\\AppData\\Local\\Temp\\tmpxl3ubg90\\\u03fc\u045e\u0422\u03bb\u0424\u0419\\Scripts\\activate.bat', '&', 'python_d.exe', '-c', 'print(0)']' returned non-zero exit status 1.
    
    
    Traceback (most recent call last):
      File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\test_ntpath.py", line 210, in test_realpath_curdir
        tester("ntpath.realpath('/'.join(['.'] * 100))", expected)
      File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\test_ntpath.py", line 30, in tester
        raise TestFailed("%s should return: %s but returned: %s" \
    
    

    Current builder status
    ----------------------

    The builder is failing currently

    Commits
    -------

    Other builds with similar failures
    ----------------------------------

    Steve, can you take a look?

    @zooba zooba closed this as completed Aug 21, 2019
    @zooba zooba self-assigned this Aug 21, 2019
    @zooba
    Copy link
    Member

    zooba commented Aug 21, 2019

    On it

    @zooba
    Copy link
    Member

    zooba commented Aug 21, 2019

    I suspect the relevant failure here (which is not listed in Pablo's post) is this one:

    ======================================================================
    ERROR: test_realpath_curdir (test.test_ntpath.TestNtpath)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\test_ntpath.py", line 210, in test_realpath_curdir
        tester("ntpath.realpath('/'.join(['.'] * 100))", expected)
      File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\test_ntpath.py", line 30, in tester
        raise TestFailed("%s should return: %s but returned: %s" \
    test.support.TestFailed: ntpath.realpath('/'.join(['.'] * 100)) should return: C:\buildbot.python.org\3.x.kloth-win64\build\build\test_python_5340\test_python_worker_1408 but returned: \\?\C:\buildbot.python.org\3.x.kloth-win64\build\build\test_python_5340\test_python_worker_1408\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.\.

    The path being normalized is longer than MAX_PATH, which is clearly causing failures with Windows APIs as implemented on Windows 7.

    I can add the normpath() call back in, which should remove all the dots before ever actually testing realpath() and get it back down under MAX_PATH. That won't help anyone using real long paths on Win7 though, but then again the only reason this test was passing before was because there was a normpath() call, so we're no worse off (apart from performance-wise).

    The venv issues seem unrelated. I'll get the realpath one fixed first and then take a look at those once I've finished today's tasks.

    @zooba
    Copy link
    Member

    zooba commented Aug 21, 2019

    Okay, the venv break is related (and it should have broken more frequently).

    The new realpath() implementation leaves the \\?\ prefix behind if the path doesn't exist, since that's the error you get when the path is longer than MAX_PATH and you're on a system that can't handle it.

    I think the best fix is probably to strip the prefix if the file didn't exist at the start and still doesn't exist afterwards.

    @zooba
    Copy link
    Member

    zooba commented Aug 21, 2019

    And sorry, the test_ntpath traceback was half listed in Pablo's message, but was missing the critical lines :)

    @zooba
    Copy link
    Member

    zooba commented Aug 21, 2019

    New changeset 06be2c7 by Steve Dower in branch 'master':
    bpo-9949: Call normpath() in realpath() and avoid unnecessary prefixes (GH-15369)
    06be2c7

    @eryksun
    Copy link
    Contributor

    eryksun commented Aug 22, 2019

    I'm tentatively reopening this issue for you to consider the following point, Steve.

    A real path is not always the same as a final path. We can find code that does relpath(realpath(target), realpath(start)) to compute the relative path to target for a symlink. The final path can't be relied on for this unless we always evaluate the symlink from the final path to start. In particular, it cannot be relied on if the relative path traverses a junction.

    What code like this needs from a realpath() implementation is a solid (real) path, not a final path. In other words, the caller wants a solidified form of start that can be used to compute the path to a target for a relative symlink, but one that works when accessed from start, not the final path of start. Generally this means resolving symlinks in the path, but not mount points. That's what Unix realpath() does, but of course there it's simpler because the only name surrogate in Unix is a symlink, which is never a mount point and never a directory.

    Here's an example. In this first case "scripts" is a junction mount point that targets "C:/spam/etc/scripts":

    >>> eggs = r'C:\spam\dlls\eggs.dll'
    >>> scripts = r'C:\spam\scripts'
    
        >>> rel_eggs_right = os.path.relpath(eggs, scripts)
        >>> print(rel_eggs_right)
        ..\dlls\eggs.dll
        >>> os.symlink(rel_eggs_right, 'C:/spam/scripts/eggs_right.dll')
        >>> os.path.exists('C:/spam/scripts/eggs_right.dll')
        True
    
        >>> scripts_final = os.path._getfinalpathname(scripts)[4:]
        >>> print(scripts_final)
        C:\spam\etc\scripts
        >>> rel_eggs_wrong = os.path.relpath(eggs, scripts_final)
        >>> print(rel_eggs_wrong)
        ..\..\dlls\eggs.dll
        >>> os.symlink(rel_eggs_wrong, 'C:/spam/scripts/eggs_wrong.dll')
        >>> os.path.exists('C:/spam/scripts/eggs_wrong.dll')
        False

    If we remove the junction and replace it with a 'soft' symlink that targets the same directory, then using the final path works, and using the given path no longer works.

        >>> print(os.readlink('C:/spam/scripts'))
        C:\spam\etc\scripts
        >>> scripts_final = os.path._getfinalpathname(scripts)[4:]
        >>> rel_eggs_right_2 = os.path.relpath(eggs, scripts_final)
        >>> print(rel_eggs_right_2)
        ..\..\dlls\eggs.dll
        >>> os.symlink(rel_eggs_right_2, 'C:/spam/scripts/eggs_right_2.dll')
        >>> os.path.exists('C:/spam/scripts/eggs_right_2.dll')
        True
    
        >>> rel_eggs_wrong_2 = os.path.relpath(eggs, scripts)
        >>> print(rel_eggs_wrong_2)
        ..\dlls\eggs.dll
        >>> os.symlink(rel_eggs_wrong_2, 'C:/spam/scripts/eggs_wrong_2.dll')
        >>> os.path.exists('C:/spam/scripts/eggs_wrong_2.dll')
        False

    When the kernel traverses "scripts" as a soft link, it collapses to the target (i.e. "C:/spam/etc/scripts"), so our relative path that was computed from the final path is right in this case. On the other hand, if "scripts" is is a mount point (junction), it's a hard (solid) component. It does not collapse to the target (the kernel even checks the junction's security descriptor, which it does not do for a symlink), so ".." in the relative symlink traverses the junction component as if it were an actual directory.

    What we need is an implementation of realpath("C:/spam/scripts") that returns "C:\\spam\\scripts" when "scripts" is a mount point and returns "C:\\spam\\etc\\scripts" when "scripts" is a symlink.

    This means we need an implementation of realpath() that looks a lot like posixpath.realpath. Generally a mount point should be walked over like a directory, just as mount points are handled in Unix. The difference is that a mount point in Windows is allowed to target a symlink. (This is a design flaw; Unix doesn't allow it.) Since we need to know the target of a junction, we have to read the reparse point, until we hit a real directory target. As long as it targets another junction, it remains a hard component. As soon as it targets a symlink, however, it becomes a soft component that needs to be resolved. If the junction targets a name surrogate reparse point that we can't read, then our only option is to get a final path. This is dysfunctional. We should raise an exception for this case. Code can handle the exception and knowingly get a final path instead of a real path.

    This also means we can't reliably compute a real path for a remote path (UNC) because we can't manually evaluate the target of a remote junction. A remote junction is meaningless to us. If we're evaluating a UNC path and reach a junction, we have to give up on a real path and settle for a final path. We can get a final path because that lets the kernel in the server talk to our kernel to resolve any combination of mount points (handled on the server side) and symlinks (handled on our side). This case should also raise an exception. Aware code can handle it by getting a real path and taking appropriate measures.

    @eryksun eryksun reopened this Aug 22, 2019
    @zooba
    Copy link
    Member

    zooba commented Aug 22, 2019

    We can find code that does relpath(realpath(target), realpath(start)) to compute the relative path to target for a symlink.

    In other words, the caller wants a solidified form of start that can be used to compute the path to a target for a relative symlink, but one that works when accessed from start, not the final path of start.

    I don't know how common this scenario is, but I can certainly say that it's never worked on Windows. You'd also end up with a relative symlink in a real directory somewhere (that the junction was pointing at) that is unable to reach target, because it's now being resolved against the wrong start.

    Relative symlinks are nearly as evil as symlink loops :)

    Given there is no POSIX concept of a "final" path, a real path is the closest analogy. If this is the quality of edge case where that happens to break down, I'm okay with leaving it broken.

    (Similarly for the junction/symlink combination on remote systems. I don't see any way that POSIX emulation can properly support that, but it seems like the far more rare case - and I say this as someone whose employer *lives and breathes* unnecessarily complicated SMB shares ;) I've never seen this become an issue.)

    @zooba
    Copy link
    Member

    zooba commented Aug 22, 2019

    New changeset a50d2f7 by Steve Dower in branch '3.8':
    bpo-9949: Call normpath() in realpath() and avoid unnecessary prefixes (GH-15376)
    a50d2f7

    @eryksun
    Copy link
    Contributor

    eryksun commented Aug 22, 2019

    Aware code can handle it [the exception] by getting a real path and
    taking appropriate measures.

    That should be "by getting a final path".

    @eryksun
    Copy link
    Contributor

    eryksun commented Aug 22, 2019

    We can find code that does relpath(realpath(target), realpath(start)) to compute the relative path to target
    for a symlink.
    ...
    I don't know how common this scenario is, but I can certainly
    say that it's never worked on Windows. You'd also end up with a
    relative symlink in a real directory somewhere (that the
    junction was pointing at) that is unable to reach target,
    because it's now being resolved against the wrong start.

    With ntpath.realpath == ntpath.abspath, it actually works fine for just junctions because in most cases a mount point in Windows should be handled simply as just a directory, not resolved as its target. Revisit my example. This is what I showed when "C:/spam/scripts" is a junction. It's only wrong if the junction targets a directory symlink -- a chimeric mount-point-link that NT has made the mistake of allowing. (The kernel developers should have known to disallow this when they introduced Unix-like symlinks. When evaluating a mount point that targets a symlink, or any name-surrogate reparse point, it should fail the call as an invalid reparse point. For simplicity and our sanity, a mount point should always have consistent semantics in path parsing as a hard component that behaves like a regular directory. Then we could *always* handle it as just a directory -- whether it's local or remote. They messed up badly, IMO.)

    @zooba
    Copy link
    Member

    zooba commented Aug 22, 2019

    At this point, I'm inclined to say let's wait and see what the 3.8.0b4 feedback looks like.

    Given that WSL has been fudging the boundaries here and hasn't suffered as a result, I don't expect much of a problem (and this change does actually make realpath() consistent between Windows native and WSL, which is nice). But I'm definitely open to making changes to this again if it turns out we need to.

    @zooba zooba closed this as completed Aug 22, 2019
    @vstinner
    Copy link
    Member

    It seems like Windows buildbots are back to green (fixed), thanks ;-)

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 (EOL) end of life 3.9 only security fixes OS-windows stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    7 participants