🐛/🙏🏻? ~ CLI apps need original command line (WinOS) #9871

Open
rivy opened this issue Mar 23, 2021 · 27 comments
Labels
needs investigation requires further investigation before determining if it is an issue or not suggestion suggestions for new features (yet to be agreed) windows Related to Windows platform

Comments

@rivy
Contributor

rivy commented Mar 23, 2021

For Windows, CLI applications need access to the original command line to supply basic services that the Windows shells (CMD/PowerShell) do not, such as wildcard/glob expansion. For example:

In *nix (bash-shell):

$ # POSIX with bash shell
$ ls -l *.mkd
-rwxrwxrwx 1 toor toor 1.2K Jul 28  2020 kb-ToDO.mkd*
-rwxrwxrwx 1 toor toor 1.1K Jul 19  2020 kb-info.mkd*

$ deno eval "console.log(Deno.args)" "*.mkd" *.mkd
[ "*.mkd", "kb-ToDO.mkd", "kb-info.mkd" ]

The Windows versions:

>:: CMD shell
>dir /b *.mkd
kb-info.mkd
kb-ToDO.mkd

>deno eval -T "console.log(Deno.args)" "*.mkd" *.mkd
[ "*.mkd", "*.mkd" ]

>:: PowerShell
>powershell
...

PS> deno eval "console.log(Deno.args)" "*.mkd" *.mkd
[ "*.mkd", "*.mkd" ]

You'll note that the information required to perform reasonable wildcard/glob expansion is removed in the Deno.args array. Specifically here, the quotes are removed, so there is no application-detectable difference between the two arguments. For *nix/bash, it doesn't matter as the shell does the expansion for the application and supplies those tailored arguments, but Windows applications are expected to do that work themselves. The only way to do that is to have access to the original, unparsed command line (such as via GetCommandLineW).
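To illustrate the point, here is a minimal sketch of a tokenizer that keeps track of which arguments were quoted in the raw text. The names `Token` and `tokenize` are hypothetical, and the parsing is deliberately simplified (no backslash escapes, no CMD caret handling), so it is not a faithful reimplementation of any runtime's parser.

```typescript
// Hypothetical helper: split a raw Windows command line into tokens while
// remembering whether any part of each token was double-quoted.
// Simplified on purpose: no backslash-escape or caret (^) handling.
interface Token {
  text: string;
  quoted: boolean;
}

function tokenize(raw: string): Token[] {
  const tokens: Token[] = [];
  let text = "";
  let quoted = false; // did the current token contain a quoted section?
  let inQuotes = false; // are we currently between double quotes?
  let started = false; // distinguishes "" (an empty token) from no token

  for (const ch of raw) {
    if (ch === '"') {
      inQuotes = !inQuotes;
      quoted = true;
      started = true;
    } else if (!inQuotes && (ch === " " || ch === "\t")) {
      if (started) tokens.push({ text, quoted });
      text = "";
      quoted = false;
      started = false;
    } else {
      text += ch;
      started = true;
    }
  }
  if (started) tokens.push({ text, quoted });
  return tokens;
}

// Both arguments carry the same text; only the raw line tells them apart.
console.log(tokenize('"*.mkd" *.mkd'));
```

With the `quoted` flag available, a script could expand only the unquoted `*.mkd` and pass the quoted one through literally, which is exactly the distinction that is lost in `Deno.args`.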

@rivy rivy changed the title 🐛 (maybe feat request) CLI apps need original command line (Windows) 🐛 (or feat request?) CLI apps need original command line (Windows) Mar 23, 2021
@rivy
Contributor Author

rivy commented Mar 27, 2021

From reading other posts (thread #3892), I gather that the project doesn't currently want to supply access to OS-specific APIs. I don't know if this is relevant to any other platform; it really only adds potential parity with POSIX systems for Windows applications.

This capability could very simply be added as something like Deno.commandLine, which would be undefined for platforms which don't supply it (currently, only non-Windows). It's really the only way to correctly add wildcard/globbing to Windows CLI applications.
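A rough sketch of how a script might consume such a value. Note that `Deno.commandLine` does not exist; here it is modeled as a plain `commandLine` parameter, and `chooseArgs` and `naiveSplit` are hypothetical names. The sketch also ignores that a real raw command line would start with the executable and script names.

```typescript
// `commandLine` stands in for the proposed (hypothetical) Deno.commandLine:
// a raw string on Windows, undefined everywhere else.
function chooseArgs(
  commandLine: string | undefined,
  args: string[],
  reparse: (raw: string) => string[],
): string[] {
  // Where the shell has already expanded everything, trust the parsed args.
  return commandLine === undefined ? args : reparse(commandLine);
}

// Placeholder parser for the demo only: a naive whitespace split.
const naiveSplit = (raw: string): string[] => raw.trim().split(/\s+/);

console.log(chooseArgs(undefined, ["kb-ToDO.mkd", "kb-info.mkd"], naiveSplit));
console.log(chooseArgs("*.mkd extra", ["ignored"], naiveSplit));
```

The `undefined` case keeps POSIX behavior untouched, so a script only pays for re-parsing on platforms where the shell has not already done the work.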

@stale

stale bot commented May 27, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label May 27, 2021
@rivy
Contributor Author

rivy commented May 27, 2021

Not stale.

@stale stale bot removed the stale label May 27, 2021
@lucacasonato
Member

@rivy What is the Node / Go behaviour here?

cc @piscisaureus

@lucacasonato lucacasonato added needs investigation requires further investigation before determining if it is an issue or not windows Related to Windows platform labels May 27, 2021
@piscisaureus
Member

I don’t think processes have access to the original command line (except cmd on windows, but cmd doesn’t do shell globbing)

@piscisaureus
Member

Re-reading this, seems that @rivy's feature request is implied here:

Specifically here, the quotes are removed, so there is no application-detectable difference between the two arguments

On unix, the quotes are interpreted by the shell; the application won't have access to it.
On windows, it is technically possible to use GetCommandLineW to "see" the quotes. However you really shouldn't! No applications do this except for some dos legacy command line tools and cmd.exe itself.

So in order to protect yourself and the rest of the world, Deno won't provide access to the raw command line ;)

PS: the thing to do on windows is to always expand glob patterns, quoted or not. * and ? can never appear in valid file names.

@rivy
Contributor Author

rivy commented May 29, 2021

@lucacasonato , go has direct access via sys/GetCommandLine(...). NodeJS seems to use a similar process to Deno's recipe, but has access to GetCommandLine() via ffi (an admittedly crufty method of access).

@piscisaureus , TLDR; this is needed for reasonable cross-platform support and I've prototyped a full-featured alternate solution to show its utility.

... long post ...

Although what you said about *nix/POSIX is generally true, the rest is just false.

First, for *nix/POSIX, I know that the command line is not even available, hence my suggestion that Deno.commandLine() be (or return) undefined for those platforms. *nix/POSIX platforms already have strong shell support and attempting to reparse/re-expand the command line is not a fruitful endeavor.

Windows, however, has a long tradition of minimal shell support, forcing applications to go their own route to correctly parse and expand the command line. Most applications, when they do a reasonable and reliable job of it, reparse the command line from GetCommandLineA/W. And there are a digital ton of modern command line utilities which reparse the command line just to support correct basic globbing (see most rust utilities, [eg, bat, coreutils, ... (many more) ...] which use wild [all of which I've made significant contributions to...; I'm not naive to this problem space]).

Given that you ;) emoji'ed me, I'll take the next comment in good humor. But, all joking aside, you are not protecting me/world by not providing access to the raw command line on platforms which don't have strong shell parsing and globbing. That "protection", in fact, just makes it more difficult for developers to create utilities which are strongly cross-platform.

And, addressing your PS aside... No, you shouldn't always expand glob patterns on the Windows platform. It's a more nuanced problem. For example, just the simple command string deno run -A echo.ts "*bold*" will never be correctly/reliably interpreted on a Windows platform if you simply use the arguments as parsed by CMD (and currently Deno) and use blind glob expansion. The output will depend on the file-system context, which is crazy and un-fixable without having information about the raw text.
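The file-system dependence can be shown with a small sketch of "blind" expansion. The directory listing is passed in as a plain array so the example does not touch the disk; `globToRegExp` and `blindExpand` are hypothetical names, only `*` and `?` are supported, and matching is case-sensitive for simplicity (real Windows matching is not).

```typescript
// Translate a glob containing only * and ? into an anchored RegExp.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*").replace(/\?/g, ".") + "$");
}

// "Blind" expansion: every argument with a wildcard is matched against the
// directory listing; arguments with no match are passed through unchanged.
function blindExpand(args: string[], files: string[]): string[] {
  return args.flatMap((arg) => {
    if (!/[*?]/.test(arg)) return [arg];
    const re = globToRegExp(arg);
    const hits = files.filter((f) => re.test(f));
    return hits.length > 0 ? hits : [arg];
  });
}

// Same argument, two different directories, two different results.
console.log(blindExpand(["*bold*"], ["README.md"])); // [ "*bold*" ]
console.log(blindExpand(["*bold*"], ["README.md", "embolden.ts"])); // [ "embolden.ts" ]
```

Without knowing that `*bold*` was quoted on the raw command line, the script cannot tell whether the user meant the literal text or a pattern.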

Deno seems to have made strong efforts for cross-platform portability and to support Windows/CMD. This step (or some other variation) can help make that effort much more robust and work toward a goal of "a command line execution string means the same thing regardless of platform". I can make big strides towards parity between the *nix/POSIX and Windows platforms when I'm able to access that initial Windows command line.

As proof-of-concept, while waiting on replies to this, #9873, and #9874, I've gone ahead and prototyped a process library which works with enhanced script runners and/or an enhanced shim to supply improved quoting and bash-like command line expansion for Windows platforms. I leveraged the well-regarded braces and picomatch NPMjs libraries to supply full-featured brace expansion, fully implemented advanced glob expansion (notably with path-separator independence), sane double and single quoting, and support for ANSI-C quoted strings (eg, $'\n'). The prototype is somewhat raw and almost certainly has some rough corner cases, but it's already very capable.

This allows for the exact same command line expressions on Windows or *nix (bash/POSIX-compatible shell) platforms; for example, when installed with the enhanced shim, dxr SCRIPT_URL 'single-quoted argument' "double-quoted argument" ../{.,}?([a-m]-)[n-z].globs* $'endsWithNewlineThenExt\n.ext' will expand to the same set of arguments on both Windows and *nix/POSIX platforms.

See dxr and dxi from the dxx repo.

deno install -Af https://deno.land/x/dxx@v0.0.2/src/dxi.ts
dxi -Af https://deno.land/x/dxx@v0.0.2/src/dxr.ts
dxr https://deno.land/x/dxx@v0.0.2/eg/args.ts --debug --lines "*" {,.}* $'\e[31mANSI-C string\e[m' 'single quotes'
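For readers unfamiliar with brace expansion, a patterns such as `{,.}*` in the command above expands to `*` and `.*` before globbing. The following is a minimal sketch of that step only; it is not how the braces library or dxx implements it, `expandBraces` is a hypothetical name, and nested groups and ranges such as `{1..3}` are not supported.

```typescript
// Expand flat, comma-separated brace groups such as a{b,c}d.
// A group without a comma (e.g. {x}) is left untouched, as bash does.
// Nested groups and {1..3} ranges are out of scope for this sketch.
function expandBraces(pattern: string): string[] {
  const match = /\{([^{}]*,[^{}]*)\}/.exec(pattern);
  if (match === null) return [pattern];
  const prefix = pattern.slice(0, match.index);
  const suffix = pattern.slice(match.index + match[0].length);
  return match[1]
    .split(",")
    .flatMap((alt) => expandBraces(prefix + alt + suffix));
}

console.log(expandBraces("{,.}*")); // [ "*", ".*" ]
console.log(expandBraces("a{b,c}d{e,f}")); // [ "abde", "abdf", "acde", "acdf" ]
```

Each resulting pattern would then go through glob expansion separately.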

I also have some ideas about adding in command line variable expansion (environment variables and sub-shell expansions) but that will entail a more complicated parsing step and some development time.

And I plan to back-port all of the bash-like parsing/expansion to wild whenever I get the time (or can convince @kornelski to do it for me 😄).

P.S. The enhanced shim fixes the "Terminate batch job (Y/N)?" issue as well.

@piscisaureus
Member

piscisaureus commented May 29, 2021

That "protection", in fact, just makes it more difficult for developers to create utilities which are strongly cross-platform.

Sorry, I'm not buying this at all. What you're trying to do is not useful; I'm sure that with access to the raw command line you can write a program that distinguishes between "*foo*" and *foo* but you won't be able to actually invoke your program with these arguments and maintain the distinction, except if you're calling it directly from the cmd shell. But try invoking it from powershell, or python, or node, or java, or cygwin/mingw/wsl bash. In all of these environments it will be either super difficult or outright impossible to pass arguments like that to your program.

In the meantime, all windows software treats "*foo*" and *foo* as equivalent, because that's been the convention for the past 30 years. git add "*.md" does the same as git add *.md. dir "c:\*" and dir c:\* and dir "c:"\* and dir "c:""\*" all do exactly the same thing.

Single quotes (') are also a losing proposition: yourprogram.exe 'hello^fun"characters' will be mogrified; whether you or I agree with it, and whether we could've designed a system that deals with it properly (sure!) really doesn't matter.

@rivy
Contributor Author

rivy commented May 29, 2021

This seems to be devolving from discussion, but I'll address your points...

Sorry, I'm not buying this at all. What you're trying to do is not useful; I'm sure that with access to the raw command line you can write a program that distinguishes between "*foo*" and *foo* but you won't be able to actually invoke your program with these arguments and maintain the distinction, except if you're calling it directly from the cmd shell. But try invoking it from powershell, or python, or node, or java, or cygwin/mingw/wsl bash. In all of these environments it will be either super difficult or outright impossible to pass arguments like that to your program.

Simply, and provably, not true.

deno install -Af https://deno.land/x/dxx@v0.0.3/src/dxi.ts
dxi -Af https://deno.land/x/dxx@v0.0.3/eg/args.ts

args '*' *
node -e "const {exec} = require('child_process'); exec('args \'*\' *', (e, out, err) => console.log(out));"
perl -e "system(q{args '*' *})"
powershell -c args --% '*' *
python -c "import subprocess; subprocess.run('args \'*\' *', shell=True)"
C:> wsl bash --login
$ deno -V
deno 1.8.1
$ deno run -A https://deno.land/x/dxx@v0.0.2/eg/args.ts '*' *
Download https://deno.land/x/dxx@v0.0.2/eg/args.ts
Check https://deno.land/x/dxx@v0.0.2/eg/args.ts
* CHANGELOG.mkd LICENSE README.md eg src tests tools tsconfig.json

All of these invocations are parsed correctly and have the same output.

And I think the utility is obvious. Command line tools which are called and work the same between platforms? And how many times have you wanted to pass some unusual string construction to a Windows command tool? This makes most constructions simple, even passing control characters.

I haven't had the time to test MSYS or Cygwin (mostly just to install deno), but I believe they should function in the same manner as wsl/bash. I'm sure there will be corner cases and minor caveats, but this method will operate correctly for the vast majority of use cases.

In the meantime, all windows software treats "*foo*" and *foo* as equivalent, because that's been the convention for the past 30 years. git add "*.md" does the same as git add *.md. dir "c:\*" and dir c:\* and dir "c:"\* and dir "c:""\*" all do exactly the same thing.

This is hyperbole; "all windows software ..." is untrue. I've given specific counter examples.

Single quotes (') are also a losing proposition: yourprogram.exe 'hello^fun"characters' will be mogrified; whether you or I agree with it, and whether we could've designed a system that deals with it properly (sure!) really doesn't matter.

No, single quotes are not an issue. But this is partially true in ways unrelated to single quotes. Assuming this is run from the CMD shell, the only thing unretrievably 'mogrified' in your example is the ^ character. And, yes, there are always shell-based caveats and problematic characters, as exemplified by the myriad shellQuote functions that protect text when sending it on to a specific shell. Certain constructions are always going to be more portable than others between shells. Here, args 'hello'$'\x5e''fun"characters' would be a portable version of that construction (and will work when invoked by CMD, wsl/bash, node, perl, ...).
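As a sketch of what decoding the `$'\x5e'` piece involves, the function below handles the body of an ANSI-C quoted string (the text between `$'` and `'`). `decodeAnsiC` is a hypothetical name and only a small subset of the bash escapes is covered; this is not the dxx implementation.

```typescript
// Decode the body of a bash-style ANSI-C quoted string, i.e. the text
// between $' and '. Only a handful of escapes are handled here; anything
// else is left as written.
function decodeAnsiC(body: string): string {
  const simple: Record<string, string> = {
    n: "\n",
    t: "\t",
    r: "\r",
    e: "\x1b",
    "\\": "\\",
    "'": "'",
    '"': '"',
  };
  return body.replace(
    /\\(x[0-9a-fA-F]{1,2}|[ntre\\'"])/g,
    (_whole: string, esc: string) =>
      esc.startsWith("x")
        ? String.fromCharCode(parseInt(esc.slice(1), 16))
        : simple[esc],
  );
}

// 'hello'$'\x5e''fun"characters' is three adjacent quoted pieces:
const arg = "hello" + decodeAnsiC("\\x5e") + 'fun"characters';
console.log(arg); // hello^fun"characters
```

Because the caret never appears literally on the command line, CMD has nothing to strip, which is why the construction survives across shells.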

This can be done, as I have here, without Deno support. But it involves more code contortions, especially for sub-processes. I believe it would be simpler if Deno could just provide the raw text that it parses for Deno.args. There's nothing nefarious or dangerous. Providing the line text just makes things simpler for users and can lead to much improved portability.

@piscisaureus
Member

piscisaureus commented May 30, 2021

So I did a quick test: [getcmd.c source code]

cmd.exe
~~~~~~~
C:\>d:\getcmd\getcmd '*' "*" * END
d:\getcmd\getcmd  '*' "*" * END

powershell
~~~~~~~~~~
PS C:\> d:\getcmd\getcmd '*' "*" * END
"d:\getcmd\getcmd.exe" * * * END

bash (wsl)
~~~~~~~~~~
piscisaureus@guru:/mnt/c$ ~/d/getcmd/getcmd.exe '*' "*" * END
getcmd.exe * * $RECYCLE.BIN $WinREAgent "Documents and Settings" Intel LocalStorage PerfLogs "Program Files" "Program Files (x86)" ProgramData Recovery "System Volume Information" Users Windows hiberfil.sys pagefile.sys swapfile.sys END                                                          

So the actual command line that getcmd.exe sees is different each time.
Unsurprising of course: these three different shells (all commonly used) each have their own rules for quoting and escaping.
I can't imagine how getcmd.exe would be able to recover the original "intent", as it doesn't know which shell(?) cooked its command line.

It seems that msys/ does something similar, and people love it: msys2/msys2-runtime#36 msys2/MSYS2-packages#522 git-for-windows/git#1220 curl/curl#1813 magit/magit#2246 magit/magit#2246 magit/magit#2711 git-for-windows/git#561 git-for-windows/git#1019

@rivy
Contributor Author

rivy commented May 30, 2021

So, PowerShell is currently a special child going through many growing pains, one of which is command line argument passing (see PowerShell/PowerShell#13089 and PowerShell/PowerShell#15143). That's one of the reasons why, to a large extent, most applications will use cmd as the shell to run sub-processes.

But, generally, using the --% (as I did above...) will stop argument handling and leave it to the executable. So, I'd document that... if you're using PowerShell and want portable behavior, use something like getcmd.exe --% '*' "*" * END. Additionally, the standard CMD environment variable COMSPEC is cleared under PowerShell, so that could be used to signal prior command line processing.

For *nix/POSIX shells, you should be using an executable built for that system (which would then know to leave the command line alone), eg, use *nix/POSIX executables under wsl/bash. That's what the rust executables that I mentioned are designed to do. And that's why the command deno run -A https://deno.land/x/dxx@v0.0.2/eg/args.ts '*' * works correctly. It's using the deno for ubuntu/bash which tells the script that it's not 'windows' (ie, Deno.build.os !== 'windows'), so the script leaves the args as pass-throughs. That's why I recommended that deno should pass undefined as a value for Deno.commandLine to scripts running under non-Windows platforms with strong shell support, indicating that the command line is pre-processed by the shell into Deno.args.

What does Deno.build.os return for MSYS/Cygwin? Is there a platform-specific build for them or do you recommend installing a Windows-executable? Even if it does return "windows", command line processing could be bypassed based on a signal that the shell is more capable (such as the SHELL environment variable, which is usually /bin/bash or /usr/bin/bash). Though I'm not sure that planning for executing "through" another platform would really be common enough to make bullet-proof fallbacks.

To be clear, I'm not asking for Deno to process the command line, just asking for it to be supplied so that a script can do as all current regular Windows executables can do ... process the raw command line if they want to (preferably through a well tested library).

The prototype just shows that a lot can be done to make the scripts more useful and flexible at the command line for users.

Wouldn't you like to be able to use bash-like expansion, globs, etc from the Windows command line? I, personally, frequently miss it when I switch from bash back to CMD.

@rivy
Contributor Author

rivy commented May 30, 2021

Talking beyond the ask here, but, after thinking about your points, I was able to test my prototype further under MSYS and WSL.

Based on that testing and the discussion, I made some modifications to add shell detection (via SHELL) in addition to using Deno.build.os.
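A sketch of what such a detection step might look like. The exact heuristic used in dxx is not shown in this thread, so the rule below (skip re-parsing when `SHELL` names an sh-family shell) is an assumption, and `shouldReparse` is a hypothetical name. The OS string and environment are passed in as parameters so the function stays testable.

```typescript
// Decide whether a script should re-parse the raw command line itself.
// `os` mirrors Deno.build.os; `env` is a plain snapshot of the environment.
// Assumption: a SHELL value naming an sh-family shell means a POSIX-style
// shell has already processed the arguments.
function shouldReparse(
  os: string,
  env: Record<string, string | undefined>,
): boolean {
  if (os !== "windows") return false; // a POSIX shell already did the work
  const shell = env["SHELL"] ?? "";
  // e.g. /bin/bash or /usr/bin/bash under MSYS or a WSL pass-through
  if (/(^|[\\\/])(ba|da|z)?sh(\.exe)?$/.test(shell)) return false;
  return true;
}

console.log(shouldReparse("windows", {})); // true
console.log(shouldReparse("windows", { SHELL: "/usr/bin/bash" })); // false
console.log(shouldReparse("linux", { SHELL: "/bin/bash" })); // false
```

In a Deno script the inputs would come from `Deno.build.os` and `Deno.env.get("SHELL")`, the latter requiring environment permission.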

args now works correctly when used under CMD, WSL/bash, or MSYS (which calls the Win32 deno.exe as a passthru). (No Cygwin currently installed to test upon.)

deno install -Af https://deno.land/x/dxx@v0.0.4/src/dxi.ts
dxi -Af https://deno.land/x/dxx@v0.0.4/eg/args.ts
args --debug '*' "*" * END

Direct execution of the bash/sh shell shim script works as long as deno is installed in WSL. And WSL (passthru) execution of deno.exe ... works if you push the shell variable out to the Win32 process on invocation:

WSLENV=$WSLENV:SHELL/w deno.exe run -A https://deno.land/x/dxx@v0.0.4/eg/args.ts '*' "*" * END

I'm not sure how to detect (or if there's a way to detect) the passthru execution of deno.exe without that environment variable signal, but the "normal", *nix-specific WSL/bash execution works just fine.

@rivy
Contributor Author

rivy commented Jul 13, 2021

See also: https://github.com/rust-lang/rust/blob/7f9ab0300cd66f6f616e03ea90b2d71af474bf28/library/std/src/os/windows/process.rs#L113-L127

I'm not sure what you're trying to say here... But if you're implying that rust only supports blindly quoting arguments for process execution, that's no longer the case (see https://github.com/rust-lang/rust/blob/955b9c0d4cd9176b53f518e01cbe175545c69947/library/std/src/os/windows/process.rs#L130-L136). The discussion about the problem and need for change is at rust-lang/rust#29494 and the commit adding the change is at rust-lang/rust@d868da7.

I'm not sure that the referenced code (issue, discussion, commit) is directly relevant, but it does show that there is significant nuance and complexity to the problem of reading, creating, and using command line arguments for Windows.

@lucacasonato
Member

Duplicate of #8852?

@rivy
Contributor Author

rivy commented Jul 27, 2021

Duplicate of #8852?

No, not really. #8852 is about generating command lines for execution which is really the reverse of this problem (accessing the original command line).

@stale

stale bot commented Oct 28, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Oct 28, 2021
@rivy
Contributor Author

rivy commented Oct 29, 2021

Not stale.

@stale stale bot removed the stale label Oct 29, 2021
@rivy rivy changed the title 🐛 (or feat request?) CLI apps need original command line (Windows) 🐛/🙏🏻? ~ CLI apps need original command line (WinOS) Dec 21, 2021
@stale

stale bot commented Feb 19, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Feb 19, 2022
@rivy
Contributor Author

rivy commented Feb 20, 2022

Not stale.

@stale stale bot removed the stale label Feb 20, 2022
@stale

stale bot commented Apr 23, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Apr 23, 2022
@rivy
Contributor Author

rivy commented Apr 23, 2022

Not stale.

@stale stale bot removed the stale label Apr 23, 2022
@FKPSC

FKPSC commented May 13, 2022

I spent a long time trying to get a command that looks like start "app\\Cool app.exe" to run through Deno.run.
It seems to potentially be related to this?

@stale

stale bot commented Jul 14, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jul 14, 2022
@rivy
Contributor Author

rivy commented Jul 14, 2022

Not stale.

@stale stale bot removed the stale label Jul 14, 2022
@kitsonk kitsonk added the suggestion suggestions for new features (yet to be agreed) label Jul 15, 2022
@Artoria2e5

Artoria2e5 commented Feb 15, 2024

@piscisaureus In the meantime, all windows software treats "*foo*" and *foo* as equivalent, because that's been the convention for the past 30 years.

This is not true, if you check David Deley’s closest-to-authoritative document on Windows command-line parsing: https://daviddeley.com/autohotkey/parameters/parameters.htm#WINCRULES. Specifically:

  • Your git example uses msys, which uses its own parser in dcrt0.dll. That's a different parser with demonstrably different quote-handling behavior than the msvcrt one used by node and (I assume) deno.
  • msvcrt had a change in 2008. Every other MS product also parses differently.

Deley does not address globbing, but I recommend reading the part quoted in remkop/picocli#1761 (comment). There used to be a trove of Java complaints around the time they updated their MS C++ runtime & switched the setting. That was less than 30 years ago.

@rivy This allows for the exact same command line expressions on Windows or *nix (bash/POSIX-compatible shell) platforms; for example, when installed with the enhanced shim, dxr SCRIPT_URL 'single-quoted argument' "double-quoted argument" ../{.,}?([a-m]-)[n-z].globs* $'endsWithNewlineThenExt\n.ext' will expand to the same set of arguments on both Windows and *nix/POSIX platforms.

This is also not totally wise. Among Windows programs, very few things give you access to the raw command-line as the main interface. Other things, from .NET Arguments, Python spawn, Rust (whatever it's called), and Cygwin's wrapped stuff, all present an interface of an argument-array (argv) to be used; they have to do something to quote them into a command-line. Assuming nothing goes wrong, they use a quoting method that works for the majority of applications, one that gets reversed as-is by the MS C++ runtime into argv.

The point is: whatever extensions you bring in, it MUST NOT break dquoted msvcrt-style arguments. You still have the liberty of messing with unquoted things.

  • The single-quote, dollar-single-quote, and glob stuff can go in.
  • The double quote will not work like in UNIX if you don't want to break stuff.
  • There's also a layer of cmd's special character handling to wrestle with for cmd users. Nothing can be done for that.
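For reference, the argv-to-command-line quoting that spawning libraries typically perform can be sketched as follows. This follows the widely documented MS C runtime convention (backslashes are doubled only when they precede a double quote); `quoteArg` is a hypothetical name and the code is not taken from any particular library.

```typescript
// Quote one argument so that an msvcrt-style parser recovers it unchanged.
// Backslashes are literal unless they precede a double quote, in which case
// they are doubled and the quote itself is escaped with one more backslash.
function quoteArg(arg: string): string {
  // Nothing to do when there is no whitespace or quote (and it is not empty).
  if (arg !== "" && !/[ \t\n\v"]/.test(arg)) return arg;

  let out = '"';
  let backslashes = 0; // length of the current run of backslashes
  for (const ch of arg) {
    if (ch === "\\") {
      backslashes++;
    } else if (ch === '"') {
      out += "\\".repeat(backslashes * 2 + 1) + '"';
      backslashes = 0;
    } else {
      out += "\\".repeat(backslashes) + ch;
      backslashes = 0;
    }
  }
  // Trailing backslashes are doubled so they do not escape the closing quote.
  return out + "\\".repeat(backslashes * 2) + '"';
}

console.log(quoteArg("double-quoted argument")); // "double-quoted argument"
console.log(quoteArg('say "hi"')); // "say \"hi\""
console.log(quoteArg("*.md")); // *.md
```

Any re-parser that changes how these double-quoted forms are read would break callers that build their command lines this way. Note also that under this scheme an argument such as `*.md` is emitted bare, since it contains no whitespace or quotes.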

6 participants