Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-117596: Fix realpath ValueError on null byte #117573

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

stefanhoelzl
Copy link
Contributor

@stefanhoelzl stefanhoelzl commented Apr 5, 2024

Calling os.path.realpath with a path containg a null byte ("\x00") it raised an ValueError on posix systems.

This fix will catch the ValueError similar to how other path operations are handling such an Error. e.g. os.path.exists, os.path.ismount, os.fspath, os.path.islink

The ValueError is initially raised by os.lstat the other mentioned path operations are handing this ValueError of os.lstat
https://github.com/python/cpython/blob/main/Lib/genericpath.py#L30
https://github.com/python/cpython/blob/main/Lib/genericpath.py#L20
https://github.com/python/cpython/blob/main/Lib/genericpath.py#L64
https://github.com/python/cpython/blob/main/Lib/posixpath.py#L197
https://github.com/python/cpython/blob/main/Lib/posixpath.py#L213

@stefanhoelzl
Copy link
Contributor Author

@barneygale or @encukou maybe on of you could have a quick look at this fix? Since you merged/reviewed a PR in this code area very recently.

Thank you!

@aisk
Copy link
Member

aisk commented Apr 6, 2024

According to the devguide, I think this is not a trivial change like a typo fix, so we need to create an issue first if not exist and add gh-{issue number}: at the beginning of the PR title. The news entry file is also required.

@stefanhoelzl stefanhoelzl changed the title Fix realpath ValueError on null byte gh-117596: Fix realpath ValueError on null byte Apr 7, 2024
@stefanhoelzl
Copy link
Contributor Author

According to the devguide, I think this is not a trivial change like a typo fix, so we need to create an issue first if not exist and add gh-{issue number}: at the beginning of the PR title. The news entry file is also required.

Thanks! Created an issue and added entries to Misc/NEWS.d/ and Misc/ACKS

Calling os.path.realpath with a path containg a null byte ("\x00")
it raised an ValueError on posix systems.

This fix will catch the ValueError similar to how other path operations are handling such an Error.
e.g. os.path.exists, os.path.ismount, os.fspath, os.path.islink
Copy link
Contributor

@barneygale barneygale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! Please could you add a test that ValueError is still raised when strict=True? Also mention that your change affects non-strict mode in the NEWS please.

@bedevere-app
Copy link

bedevere-app bot commented Apr 7, 2024

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@barneygale
Copy link
Contributor

For future reference, please don't force-push to CPython PR branches -- it makes the changes a little harder to follow for reviewers, and every PR gets squashed anyway.

@stefanhoelzl
Copy link
Contributor Author

Looking good! Please could you add a test that ValueError is still raised when strict=True? Also mention that your change affects non-strict mode in the NEWS please.

Thank you, for your review!
I have made the requested changes; please review again

@bedevere-app
Copy link

bedevere-app bot commented Apr 7, 2024

Thanks for making the requested changes!

@barneygale: please review the changes made to this pull request.

@bedevere-app bedevere-app bot requested a review from barneygale April 7, 2024 15:25
Copy link
Contributor

@barneygale barneygale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thank you. I'll wait a few days before merging just in case another reviewer spots an issue I've missed.

@eryksun
Copy link
Contributor

eryksun commented Apr 7, 2024

Looking good! Please could you add a test that ValueError is still raised when strict=True? Also mention that your change affects non-strict mode in the NEWS please.

In 3.11+ on Windows, realpath() raises OSError for an embedded null in strict mode. This was changed by a PR for issue #106242. I was against changing normpath() to raise ValueError for an embedded null, but the realpath() behavior was a separate case. I summarized the realpath() case in this comment. On Windows, it had been mistakenly raising ValueError for an embedded null since 3.8. When the strict parameter was added in 3.10, it should have been fixed to ignore the ValueError in the default non-strict mode. Steve fixed it in 3.11, but for some reason he also decided to convert the ValueError to OSError in strict mode.

ntpath source:

        except ValueError as ex:
            # gh-106242: Raised for embedded null characters
            # In strict mode, we convert into an OSError.
            # Non-strict mode returns the path as-is, since we've already
            # made it absolute.
            if strict:
                raise OSError(str(ex)) from None

example:

>>> os.path.realpath('\0')
'C:\\Temp\\\x00'
>>> os.path.realpath('\0', strict=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<frozen ntpath>", line 704, in realpath
OSError: _getfinalpathname: embedded null character in path

@stefanhoelzl
Copy link
Contributor Author

Steve fixed it in 3.11, but for some reason he also decided to convert the ValueError to OSError in strict mode.

Good to know, I am not able to test with Windows.
Should I convert the ValueError to OSError in strict mode to be consistent with the Windows implementation?

The missing tests you mentioned in your comment are also added with this PR.

@eryksun
Copy link
Contributor

eryksun commented Apr 8, 2024

Should I convert the ValueError to OSError in strict mode to be consistent with the Windows implementation?

That's up to @barneygale and @zooba.

@zooba
Copy link
Member

zooba commented Apr 8, 2024

IIRC, we changed to OSError because someone pointed out that we didn't specify that ValueError might be raised (and apparently I either didn't win the argument that "ValueError is always implied when you pass a bad value", or I couldn't be bothered arguing that day).

Strictly speaking, since it's never going to work without the caller changing the value (as opposed to the OS changing), ValueError is correct.

@eryksun
Copy link
Contributor

eryksun commented Apr 8, 2024

Steve, in #106242 I argued against changing normpath() to raise ValueError. You opted to modify the C implementation of _path_normpath() instead, so the discussion about ValueError became irrelevant.

The behavior of realpath() also came up. On Windows, it started raising ValueError for embedded null characters in 3.8, and similarly on POSIX in 3.10. Personally, I just wanted realpath() to ignore ValueError in the default permissive mode.

Strictly speaking, since it's never going to work without the caller changing the value (as opposed to the OS changing), ValueError is correct.

Is that a vote for changing it back to raising ValueError in strict mode on Windows? Or has that ship sailed?

@zooba
Copy link
Member

zooba commented Apr 8, 2024

Ah, I was going for consistency with the earlier behaviour, which was the wrong behaviour as well by my definition above. (I do remember someone arguing that it wasn't documented though, so perhaps that was another issue? I don't think it was Eryk Sun.)

I guess that's a vote for changing the Windows implementation to not replace that particular error, but there's no particular urgency (i.e. not in this PR). And I think non-strict mode should be as garbage-in-garbage-out as we can make it, so suppressing the error in that case is fine as long as the final string includes the text after the null.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants