Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

On Windows, shutil.which incorrectly resolves to non-executable extensionless files (python >= 3.12.0b1) #109590

Closed
dairiki opened this issue Sep 20, 2023 · 14 comments
Labels
3.12 bugs and security fixes 3.13 new features, bugs and security fixes OS-windows stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@dairiki
Copy link

dairiki commented Sep 20, 2023

Bug report

Bug description:

This change (935aa45, PR #103179) changed the behavior of shutil.which so that it checks for the extensionless filename first, before checking extensions from PATHEXT. Previously, the extensionless name was not checked at all.

This is causing problems because, at least in my case, there is both an extensionless and extensioned version of the program, and the extensionless version is a bash script (which apparently can't be executed directly on Windows.)


Full Description

On a GitHub Windows shared runner, node.js is installed.

Run Get-ChildItem -Path "C:\Program Files\nodejs"

    Directory: C:\Program Files\nodejs

Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
d----           9/10/2023 11:53 PM                node_modules
[... elided ...]
-a---            8/8/2023 11:20 PM       71207576 node.exe
-a---           6/23/2023  6:02 AM            702 nodevars.bat
-a---            8/8/2023 10:22 PM           1365 npm
-a---           6/23/2023  6:01 AM            483 npm.cmd
-a---            8/8/2023 10:22 PM           1567 npx
-a---           6/23/2023  6:01 AM            539 npx.cmd

Here, npm is a bash script, while npm.cmd is a cmd.exe script.

In python 3.11 (and before), shutil.which("npm") returns "C:\\Program Files\\nodejs\\npm.CMD". That can be executed successfully by subprocess.run.

In python 3.12rc3, shutil.which("npm") returns "C:\\Program Files\\nodejs\\npm", which, being a bash script, appears not to be by executable by subprocess.

subprocess.run((shutil.which("npm"), "install"))

fails with OSError: [WinError 193] %1 is not a valid Win32 application on python 3.12rc3.

Notes

On the system in question Get-Command npm returns C:\Program Files\nodejs\npm.cmd. So PowerShell gets it right.

(PATHEXT is .COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC;.CPL.)


I'm not absolutely convinced that this is a bug. But again, I'm also not sure how to work around it without essentially reimplementing shutil.which.

CPython versions tested on:

3.12

Operating systems tested on:

Windows

Linked PRs

@dairiki dairiki added the type-bug An unexpected behavior, bug, or error label Sep 20, 2023
@dairiki
Copy link
Author

dairiki commented Sep 20, 2023

More research as to the history of the change:

The original name, without added extension from PATHEXT, was added to the search list in #103179 to address #75586:

On windows, shutil.which does not match the semantics of built-in command lookup. If you pass the name of a script like foo.py and the PATHEXT variable doesn't include .py it will search for foo.py.exe, foo.py.bat, foo.py.cmd, etc. but not foo.py, which should be the first name checked.

But, later in #75586, there is this:

Note that CMD will only try to execute an extensionless file if "." is in PATHEXT. (In the Windows file namespace, "name" and "name." are equivalent.) [...]

Assuming that is true, then the original name should only be prepended to the search list if it has an extension. Extensionless names should not be considered when searching for an executable.

@dairiki dairiki changed the title Changes in the handling of PATHEXT by shutil.which are causing issues on Windows On Windows, shutil.which incorrectly resolves to extensionless files (python > 3.12.0b1) Sep 20, 2023
@dairiki dairiki changed the title On Windows, shutil.which incorrectly resolves to extensionless files (python > 3.12.0b1) On Windows, shutil.which incorrectly resolves to extensionless files (python >= 3.12.0b1) Sep 20, 2023
@dairiki dairiki changed the title On Windows, shutil.which incorrectly resolves to extensionless files (python >= 3.12.0b1) On Windows, shutil.which incorrectly resolves to non-executable extensionless files (python >= 3.12.0b1) Sep 21, 2023
@csm10495
Copy link
Contributor

@eryksun can you give some thoughts here?

I'm a bit nervous about this type of case hitting others in 3.12 and wondering your thoughts.

We could add logic to shutil.which to do:

if there is a directory (/ or \\) in the cmd OR there is a dot in it (denoting an extension), check the given cmd first. 

OR 

IF '.' is in PATHEXT, also check cmd first if it doesn't have a dot.

I think this gets it in-line with what CMD would do, but I'm not 100% on it.

@csm10495
Copy link
Contributor

Some testing with CMD:

C:\Users\csm10495\Desktop\cpython>PCbuild\amd64\<TAB>
_testembed.exe     python.exe         pyw.exe            venvwlauncher.exe
py.exe             pythonw.exe        venvlauncher.exe
C:\Users\csm10495\Desktop\cpython>PCbuild\amd64\python
Python 3.13.0a0 (heads/customization_modules_test:cbad68d4d2, Sep 23 2023, 21:14:32) [MSC v.1936 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> exit()
C:\Users\csm10495\Desktop\cpython>echo "TEST" > PCbuild\amd64\python

C:\Users\csm10495\Desktop\cpython>PCbuild\amd64\python
Python 3.13.0a0 (heads/customization_modules_test:cbad68d4d2, Sep 23 2023, 21:14:32) [MSC v.1936 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> exit()

C:\Users\csm10495\Desktop\cpython>PCbuild\amd64\python.
'PCbuild\amd64\python.' is not recognized as an internal or external command,
operable program or batch file.

C:\Users\csm10495\Desktop\cpython>type PCbuild\amd64\python
"TEST"

C:\Users\csm10495\Desktop\cpython>echo %PATHEXT%
.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC;.PY;.PYW

So i guess if you give a directory component, it still consults PATHEXT first.

csm10495 added a commit to csm10495/cpython that referenced this issue Sep 24, 2023
An extensionless file will on be attempted if '.' is in PATHEXT

Fix up the tests to make better use of the bytes testing of shutil.which
csm10495 added a commit to csm10495/cpython that referenced this issue Sep 24, 2023
An extensionless file will on be attempted if '.' is in PATHEXT

Fix up the tests to make better use of the bytes testing of shutil.which
@csm10495
Copy link
Contributor

I implemented the '.' in pathext to check in this draft pr: #109797
.. but i couldn't replicate the described behavior in CMD in Win 10.0.19045.3448:

I cleared C:\Program Files\nodejs\npm to be an empty file. If it was called it shouldn't really do anything.. but it kept hitting npm.cmd instead.

C:\Windows\System32>where npm
C:\Program Files\nodejs\npm
C:\Program Files\nodejs\npm.cmd

C:\Windows\System32>echo %PATHEXT%
.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC;.PY;.PYW

C:\Windows\System32>type "C:\Program Files\nodejs\npm"

C:\Windows\System32>npm --version
8.5.2

C:\Windows\System32>set PATHEXT=%PATHEXT%;.;

C:\Windows\System32>echo %PATHEXT%
.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC;.PY;.PYW;.;

C:\Windows\System32>where npm
C:\Program Files\nodejs\npm
C:\Program Files\nodejs\npm.cmd

C:\Windows\System32>type "C:\Program Files\nodejs\npm"

C:\Windows\System32>npm --version
8.5.2

C:\Windows\System32>set PATHEXT=.;%PATHEXT%;.;

C:\Windows\System32>npm --version
8.5.2

C:\Windows\System32>echo %PATHEXT%
.;.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC;.PY;.PYW;.;;.;

I'll wait for @eryksun 's thoughts.

@csm10495
Copy link
Contributor

csm10495 commented Sep 24, 2023

One last thing before bed :)

Maybe it does work but only if the rest of PATHEXT can't resolve something first (as in: it ignores the order in PATHEXT. Even if the dot is first, it will do all the other extensions before checking for an extensionless file).

I renamed npm.cmd to .npm.cmd, then:

C:\Windows\System32>npm
'npm' is not recognized as an internal or external command,
operable program or batch file.

C:\Windows\System32>echo %PATHEXT%
.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC;.PY;.PYW

C:\Windows\System32>set PATHEXT=%PATHEXT%;.

C:\Windows\System32>npm
<opened the npm file which led to the 'how should windows open this dialog'>

C:\Windows\System32>dir "C:\Program Files\nodejs"
 Volume in drive C has no label.
 Volume Serial Number is CAC3-BECF

 Directory of C:\Program Files\nodejs

09/23/2023  11:16 PM    <DIR>          .
09/23/2023  11:16 PM    <DIR>          ..
02/02/2022  06:00 AM               483 .npm.cmd
01/10/2022  04:22 PM               335 corepack
01/10/2022  04:22 PM               218 corepack.cmd
02/02/2022  06:00 AM             3,032 install_tools.bat
03/10/2022  06:31 PM        62,530,184 node.exe
10/14/2021  12:30 AM               702 nodevars.bat
03/15/2022  04:07 PM    <DIR>          node_modules
09/23/2023  11:09 PM                11 npm
02/02/2022  06:00 AM             1,552 npx
02/02/2022  06:00 AM               539 npx.cmd
               9 File(s)     62,537,056 bytes
               3 Dir(s)  390,529,925,120 bytes free

C:\Windows\System32>

If that's the desired behavior, it wouldn't be too hard to update the PR.

@csm10495
Copy link
Contributor

@zooba any thoughts on this?

@zooba
Copy link
Member

zooba commented Sep 26, 2023

Assuming that is true, then the original name should only be prepended to the search list if it has an extension. Extensionless names should not be considered when searching for an executable.

This sounds reasonable at face value.

CMD is not the hallmark of consistent behaviour. I wouldn't want to set it as our standard. If you want CMD behaviour, pass shell=True (no, don't really).

The closer we get to the underlying API behaviour, the happier I'll be - in this case, it's SearchPathW.

@csm10495
Copy link
Contributor

So unfortunately (or fortunately), SearchPathW doesn't match CMD behavior here and actually matches Python 3.12's current behavior:

Via pywin32:

In [1]: from win32api import *

In [2]: SearchPath(None, 'npm')
Out[2]: ('C:\\Program Files\\nodejs\\npm', 24)

In [3]: import os

In [4]: os.environ.get('pathext')
Out[4]: '.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC;.PY;.PYW'

So it seems fine giving back a the file without an extension. So I guess we match right now.

I can't seem to find any definitive documentation on the expectation of PATHEXT and extensionless. So I guess at the moment if we're matching SearchPath.. we match.

@AA-Turner AA-Turner added OS-windows stdlib Python modules in the Lib dir 3.12 bugs and security fixes 3.13 new features, bugs and security fixes labels Sep 27, 2023
@zooba
Copy link
Member

zooba commented Sep 27, 2023

Okay, I have another option.

When mode & os.X_OK is True (which would be the default), on Windows we should require that the returned path end with an extension from PATHEXT. This matches the intent of that argument to find an executable file.

If not (mode & os.X_OK), then we use normal SearchPath behaviour.

We should document this clearly (as well as marking it as a change in 3.12.1, and probably clarifying that it's more consistent with 3.11 and earlier as a result of the change).

@csm10495
Copy link
Contributor

How'd you come up with that behavior combo? I figured out behavior should match some existing documentation or behavior but can't think of how you came up with that. :P

Would it allow a direct match without an extension if a dot is in PATHEXT? Also if a dot is in there would it match a file with a dot as the last character but nothing after?

@zooba
Copy link
Member

zooba commented Sep 27, 2023

How'd you come up with that behavior combo?

os.startfile (internally, ShellExecute), which prefers a file with a PATHEXT over one without. So does Get-Command in PowerShell, and even Cmd when you actually go to run the progrem. It's only Cmd's where command that prioritises the extensionless file.

But basically, it's a combo that works for our own venv activate scripts. All of Activate, activate.bat and Activate.ps1 are there, but the shell you run in chooses the right one if you just write activate. For generic apps, it's probably fair to assume PATHEXT as the list of extensions that are executable, and so when we're specifically looking for executables, we should prioritise that.

Would it allow a direct match without an extension if a dot is in PATHEXT?

I don't think any of the examples above care about a dot in PATHEXT, but some of them will treat a trailing dot as a sign to ignore PATHEXT (and then the dot gets normalised away to match the dotless file).

I don't think you can't create a file association for an extensionless file (only by creating an association for all files), which means it will never be executable, so I'd be inclined to not support it unless someone comes up with a legitimate use (perhaps Git Bash? I'd rather special case "bash that works on Windows" though, rather than abusing PATHEXT).

@csm10495
Copy link
Contributor

@zooba take a look at the new pr: #109995 when you get a chance. I think I did what you described.

Too bad we missed Python 3.12.0... Is it worth having the release hold a bit and pull this in?

@zooba
Copy link
Member

zooba commented Sep 28, 2023

Having seen the bugs that are getting in vs. being deferred, I don't think we want this one in. The risk far outweighs the value right now.

If we'd found it before RC1, then definitely. Right now, the important thing to do is to make sure that anyone who fixes this in 3.12.0 is still going to work in 3.12.1 (which is why we check if there's already a PATHEXT extension on the requested name).

csm10495 added a commit to csm10495/cpython that referenced this issue Sep 29, 2023
zooba pushed a commit that referenced this issue Oct 2, 2023
…on on executable files (GH-109995)

The default arguments for shutil.which() request an executable file, but extensionless files are not executable on Windows and should be ignored.
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Oct 2, 2023
…xtension on executable files (pythonGH-109995)

The default arguments for shutil.which() request an executable file, but extensionless files are not executable on Windows and should be ignored.
(cherry picked from commit 29b875b)

Co-authored-by: Charles Machalow <csm10495@gmail.com>
Yhg1s pushed a commit that referenced this issue Oct 2, 2023
…extension on executable files (GH-109995) (#110202)

gh-109590: Update shutil.which on Windows to prefer a PATHEXT extension on executable files (GH-109995)

The default arguments for shutil.which() request an executable file, but extensionless files are not executable on Windows and should be ignored.
(cherry picked from commit 29b875b)

Co-authored-by: Charles Machalow <csm10495@gmail.com>
dairiki added a commit to dairiki/lektor that referenced this issue Oct 31, 2023
This works around a bug in python 3.12.0's `shutil.which` when running
on Windows.

See python/cpython#109590
dairiki added a commit to dairiki/lektor that referenced this issue Nov 1, 2023
This works around a bug in python 3.12.0's `shutil.which` when running
on Windows.

See python/cpython#109590
dairiki added a commit to dairiki/lektor that referenced this issue Nov 1, 2023
This works around a bug in python 3.12.0's `shutil.which` when running
on Windows.

See python/cpython#109590
dairiki added a commit to dairiki/lektor that referenced this issue Nov 1, 2023
This works around a bug in python 3.12.0's `shutil.which` when running
on Windows.

See python/cpython#109590
dairiki added a commit to dairiki/lektor that referenced this issue Nov 1, 2023
This works around a bug in python 3.12.0's `shutil.which` when running
on Windows.

See python/cpython#109590
dairiki added a commit to dairiki/lektor that referenced this issue Nov 1, 2023
This works around a bug in python 3.12.0's `shutil.which` when running
on Windows.

See python/cpython#109590
dairiki added a commit to lektor/lektor that referenced this issue Nov 2, 2023
* test: test under python 3.12

* fix: use shell=True rather than shutil.which to find npm binary

This works around a bug in python 3.12.0's `shutil.which` when running
on Windows.

See python/cpython#109590
@hugovk
Copy link
Member

hugovk commented Nov 9, 2023

Closing as the PRs have been merged. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.12 bugs and security fixes 3.13 new features, bugs and security fixes OS-windows stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

5 participants