os.path.expanduser should not use HOME on windows #80445
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
assignee = 'https://github.com/zooba' closed_at = <Date 2019-03-12.22:16:16.883> created_at = <Date 2019-03-11.16:09:57.334> labels = ['3.8', 'type-bug', 'library', 'OS-windows'] title = 'os.path.expanduser should not use HOME on windows' updated_at = <Date 2020-05-16.17:18:51.381> user = 'https://github.com/asottile'
activity = <Date 2020-05-16.17:18:51.381> actor = 'jaraco' assignee = 'steve.dower' closed = True closed_date = <Date 2019-03-12.22:16:16.883> closer = 'steve.dower' components = ['Library (Lib)', 'Windows'] creation = <Date 2019-03-11.16:09:57.334> creator = 'Anthony Sottile' dependencies =  files =  hgrepos =  issue_num = 36264 keywords = ['patch'] message_count = 27.0 messages = ['337683', '337750', '337752', '337754', '337755', '337756', '337763', '337781', '337784', '337813', '338162', '357182', '357192', '357194', '357197', '360846', '360853', '368931', '368932', '368933', '368935', '368963', '369025', '369034', '369045', '369049', '369060'] nosy_count = 10.0 nosy_names = ['paul.moore', 'jaraco', 'blueyed', 'tim.golden', 'zach.ware', 'eryksun', 'steve.dower', 'lazka', 'Anthony Sottile', 'miss-islington'] pr_nums = ['12282', '12294', '17961', '18229'] priority = 'normal' resolution = 'fixed' stage = 'resolved' status = 'closed' superseder = None type = 'behavior' url = 'https://bugs.python.org/issue36264' versions = ['Python 3.8']
The text was updated successfully, but these errors were encountered:
The current code for
I can't find any documentation of
Seems to be (one of) the direct causes of a downstream bug for me: pre-commit/pre-commit#960
My proposal is to remove these lines and change the
I like the patch, but I'm not sure all the tests are properly preserving the real value of USERPROFILE.
Modifying this value could have a real impact on the rest of the process, so we should be very careful to undo it regardless of test result. (Modifying HOME is not a as big a deal since, as you point out, it's not "real" ;) )
Also, it's probably better to get the current value and check against that, rather than setting it to a known value. One day we might do the right thing and switch expanduser() from the environment variables to the correct API, which would break the test.
Agreed, though I'd like to write it off as an existing problem (if it is a problem, it looks to me like the tests are doing proper environment cleanup). If they are broken, they'd already be leaking environment variables on posix since these tests run on both
Auditing the tests I modified:
so actually, looks like this is already good to go on that front!
Definitely agree on using the true value in the long term -- I'd like to defer that until that's actually happening though as it'll be a much more involved patch!
Guido intentionally added support for HOME in ntpath.expanduser way back in Python 1.5 (circa 1997), and now we're removing it over 20 years later. I expect this to break some deployments, but it's not a serious problem.
My feeling here is "meh". Practically all use of expanduser() is dubious in Windows, varying from going against convention to being completely wrong. We're long due for a module in the standard library that abstracts access to platform-specific configuration and special directories, but attempts to adopt one keep losing momentum. I guess it's too prone to bike shedding and arguments.
Just as a sidenote: the other day I was checking Python's source to see how to override a user's home (in tests, when os.path.expanduser` is used in code), and found it easy to just set $HOME in the environment in the end.
I guess this change will cause quite some regressions in this regard - even though $HOME might not be used in practice on Windows.
Was pathlib forgotten here? Pathlib.home() is documented to return the same as expanduser("~") but it still prefers HOME instead of USERPROFILE.
Note that this change has some effect on cygwin/mingw environments which all set HOME and now potentially lead to programs no longer being able to find their config files. Not that I disagree with this change, just that it seems a bit rushed to me.
Yes, it was forgotten (why doesn't it just use expanduser?). We should file a new bug for that.
Firstly these are not supported environments, so it's not "rushed" for us to not preemptively consider them (though we'll happily merge most PRs that fix them without impacting supported environments).
And I thought the idea was that they'd use posixpath as os.path rather than ntpath? Cygwin in particular, which provides the full environment. MinGW is a bit lighter to be closer to normal Windows behaviour, which would suggest that using the Windows variables is preferable.
I'll file one.
Yeah, you're right, sorry, my comment wasn't appropriate.
I meant executing Python scripts with official Python in bash on Windows which sets HOME. But I just checked with "git for Windows" and it sets HOME to USERPROFILE anyway, so that might only affect cygwin/msys2 which have their own user dir.
Today I ran into an issue due to this change. I have a test that sets
First, I'll nitpick that Windows is posix-compliant. What you probably mean here is "Unix".
But the important thing here is that one of the beautiful things about Python is how it bridges the gap between Unix and Windows, in many ways providing a uniform interface across various platforms. By supporting
With this change, a user is now required to provide separate guidance for invocation on Windows.
In addition to the Setuptools test above that failed, I know I've relied on this behavior before in uses where I've written documentation advising users to set "HOME" to override a location derived from ~ (perhaps for .pypirc, but probably for others as well). Now guidance like that changes from:
Set HOME to the target directory.
If on Windows and Python 3.8 or later, set USERPROFILE to the target directory. Otherwise, set HOME to the target directory.
It also means that code like the tests in Setuptools need to implement this branching logic as well.
Can we restore _some_ mechanism by which a caller can supply the home directory in a unified way?
Config on Windows should go into APPDATA not USERPROFILE/HOME, on macOS it should go to "$HOME/Library/Application Support" and on Linux $XDG_CONFIG_HOME or $HOME/.config. So using using HOME on all platforms for config is not what those platforms recommend, though I understand why it's still done this way by many tools.
What about supporting a MYFANCYTOOL_CONFIG_DIR env var that the user can override? That's what I do for example.
Using HOME in Windows is not reliable because it is not a system environment variable. That makes it fair game for administrators, users, and applications to use however they like.
Platform differences can be papered over with functions. For example, add set_user_home(dir) and get_user_home(name=''). set_user_home sets a private global variable named _user_home, which defaults to None. get_user_home returns _user_home if it's not None and name is an empty string. Otherwise it gets the home directory per platform rules.
This is unfortunate. Modifying USERPROFILE is highly unusual. USERPROFILE is the location of the user's "NTUSER.DAT" registry hive and local application data ("AppData\Local"), including "UsrClass.dat" (the "Software\Classes" registry hive). It's also the default location for a user's known shell folders and home directory. Modifying USERPROFILE shouldn't cause problems with any of this, but I'm not completely at ease with it.
If a user profile defines a custom home directory (e.g. net user <username> /homedir:<dir>), it will be reflected in "%HOMEDRIVE%%HOMEPATH%", and not in USERPROFILE. In a more perfect world, in which no one ever depended on the default location of shell folders, I'd recommend flipping the order around in ntpath.expanduser to prefer "%HOMEDRIVE%%HOMEPATH%".
That said, a major snag with HOMEDRIVE and HOMEPATH is that they're sometimes wrong or missing. CreateEnvironmentBlock() will enumerate but won't populate the user's "Volatile Environment" registry key, which is where HOMEDRIVE and HOMEPATH are normally set. This key gets populated during startup of a desktop session, and, since it's volatile, it gets deleted when the profile is unloaded (i.e. the user logs off). Consequently the Secondary Logon service (i.e. CreateProcessWithLogonW), which runas.exe uses, will only set the correct HOMEDRIVE and HOMEPATH values if the user is already logged on as a desktop session user. Otherwise it uses a default value of "%SystemRoot%\System32" as the home directory, which is spectacularly wrong. Similarly, a scheduled task that uses an S4U batch logon won't even have HOMEDRIVE and HOMEPATH defined if the task user isn't already logged on as a desktop session user. USERPROFILE will still be correct in these cases.
The Windows subsystem has never been POSIX compliant. At most, Microsoft's C runtime library provides rudimentary POSIX API support (e.g. open, read).
On the other hand, NT's original POSIX subsystem (psxss.exe) was POSIX.1 (1990) compliant. In 1999, Microsoft purchased the more capable OpenNT Unix system from Softway Systems and rebranded it as Interix. In Server 2003 R2, Interix was integrated in NT as the Subsystem for UNIX-based Applications (SUA). It was removed in Server 2012 R2. Microsoft abandoned the old conception of an NT subsystem in favor of virtualization concepts such as pico processes and pico providers (WSL 1) or hosting a kernel in a light-weight VM (WSL 2).
Really, we shouldn't be using any environment variables on Windows here, because they open up too many security risks. There are API calls that are canonical, but the environment vars are compatibility helpers.
Breakage due to HOME being overridden is serious because it won't show up in any other cases - Python will be the first to suffer the consequences, which means we are facing a targeted exploit. Not really much choice but to fix it (though there was a choice whether to release a security advisory or not... ;-) )
The documentation was definitely updated, and it was in NEWS, but you're right there was no DeprecationWarning, not that we'd have been able to show it to most impacted library developers anyway.
Perhaps the best approach for the sake of POSIX compatibility is to set HOME on startup to the correct value? It won't normally be set, so anyone using it is likely broken on Windows, but if we make it valid then everyone can just rely on it?
If Python starts setting
Certainly, it's arguable that such programs are not respecting Windows conventions correctly, but in practical terms that's not really that relevant. The user will just see it as "Python can't call my program correctly".
This is a very real issue, that I used to hit a lot in the past, when native Windows support was a lot less common in open source code than it is now. And it was nearly always made worse by the programs that tried to be "too clever" about hiding the differences between Windows and Unix.
So I'm a strong -1 on Python doing anything with
I’m not suggesting that from within Python there should be another way to detect a home directory. The expanduser functionality as written is just fine for that, though I agree with Steve that using an API on Windows would be even better.
The issue is that from outside Python there’s now no platform-agnostic way to override the home directory, either for tests or to suppress user-local configuration. In addition to the test case, there is a user-facing use-case that's made more complicated:
It was previously difficult enough because “set an environment variable” is a shell-dependent operation, so already requires the user to have some expertise on how to set environment variables in their environment.
This change requires the guidance to be more complicated or for the expertise to be greater.
Evidence is to the contrary.
It seems other tools like 'git' honor the HOME directory:
PS C:\Users\jaraco> $env:HOME = "C:\Users\jaraco\temp"
Interestingly, I notice that Powershell does set the $HOME environment variable on Windows, reinforcing the concept that $HOME is a platform agnostic way to communicate a home directory.
It does appear as if Ruby honors HOME and sets it if unset on Windows. I'm not a fan of setting the variable, but there is precedent.
It seems to me Python 3.8 is the outlier and that honoring $HOME, while somewhat awkward on Windows, was a valuable unifying behavior, and its removal without any plans for a replacement is causing harm. Just searching through some of the projects I have checked out, I find a few.
Devpi client relies on setting home for tests.
What is the exploit? I don't think the downsides of honoring HOME on Windows were captured above.
From what I could tell, there was one (fairly obscure) case where HOME was set unexpectedly. That hardly seems like a justification to reverse a long-standing documented feature that aligns with the expectation of other tools in the ecosystem.
I thought it might be useful for testing purposes if os.path (i.e. ntpath and posixpath) had a way to set the home directory that it uses in way that wouldn't affect other libraries and child processes.
Many cross-platform projects follow Unix conventions, which is simpler for development and documentation. These projects are often targeted at developers, who usually have a strong understanding of Unix conventions and system configuration. But Python is a general-purpose scripting language used for everything from system administration to major applications. It needs to follow platform conventions, but it should also abstract away platform differences as much as possible.
Not for me in PowerShell 5 and 7:
I was thinking about this, and I'd argue that in the general case, you _want_ to be able to set the home directory in a way that affects child processes. If you care enough to try to isolate the behavior from the user-local state, you probably want other libraries and child processes to honor that isolation.
This argument extends to there being a (preferably one) way override a home directory that's honored by tools and processes of all flavors (e.g. overriding USERPROFILE has no effect on git commands in a subprocess).
I agree it may prove useful not to affect the global and subprocess state if possible, but I'd argue in that case, you could probably just patch the library under test directly.
These processes patching $HOME really are attempting to suppress the user's local state (or simulate a specific state).
Weird. You're right, I get the same thing for test-path. And indeed it's not set in the Python process.
PS C:\WINDOWS\system32> test-path env:home
But strangely, $HOME seems to have some effect.
PS C:\WINDOWS\system32> echo $HOME C:\Users\jaraco
Do you know what that's about?
And this Powershell documentation makes reference to $HOME.
I see. Powershell defines $HOME as an Automatic Variable internally but doesn't expose it as an environment variable.