Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Arguments for external executables aren't correctly escaped #1995

Closed
be5invis opened this issue Aug 21, 2016 · 201 comments
Closed

Arguments for external executables aren't correctly escaped #1995

be5invis opened this issue Aug 21, 2016 · 201 comments
Labels
Committee-Reviewed PS-Committee has reviewed this and made a decision Resolution-Fixed The issue is fixed. WG-Engine core PowerShell engine, interpreter, and runtime
Milestone

Comments

@be5invis
Copy link

be5invis commented Aug 21, 2016

Steps to reproduce

  1. write a C program native.exe which acquires ARGV
  2. Run native.exe "`"a`""

Expected behavior

ARGV[1] == "a"

Actual behavior

ARGV[1] == a

Environment data

Windows 10 x64

Name                           Value
----                           -----
PSVersion                      5.1.14393.0
PSEdition                      Desktop
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0...}
BuildVersion                   10.0.14393.0
CLRVersion                     4.0.30319.42000
WSManStackVersion              3.0
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1
@andyleejordan
Copy link
Member

Why do you think that "\"a\"" the expected behavior? My understanding of PowerShell escapes says that the actual behavior is the correct and expected behavior. ""a"" is a pair of quotes surrounding an escaped pair of quotes surrounding an a, so PowerShell interprets the outer unescaped pair as "this is a string argument" and so drops them, then interprets the escaped pair as escaped quotes and so keeps them, leaving you with "a". At no point was a \ added to the string.

The fact that Bash uses \ as an escape character is irrelevant. In PowerShell, the escape character is a backtick. See PowerShell escape characters.

If you want to pass literally "\"a\"", I believe you would use:

> echo `"\`"a\`"`"
"\"a\""

@be5invis
Copy link
Author

@andschwa
Yes, escapes works fine for internal cmdlets, but things get weird when communicate with native binaries, especially on Windows.
When running native.exe ""a"", the ARGV[1] should be

"a"

(three characters)

instead of

a

(one character).

@be5invis
Copy link
Author

be5invis commented Aug 31, 2016

Currently to make native.exe correctly receive an ARGV with two quotes and an a character, you have to use this weird call:

native.exe "\`"a\`""

@andyleejordan
Copy link
Member

Ah, I see. Re-opening.

@andyleejordan andyleejordan reopened this Aug 31, 2016
@andyleejordan
Copy link
Member

Out of a strong curiosity, what happens if you try a build using #1639?

@be5invis
Copy link
Author

@andschwa The same. You HAVE to double-esacpe to satisify both PowerShell and CommandLineToArgvW. This line:

native.exe "`"a`""

results a StartProcess equalivent to cmd

native.exe ""a""

@andyleejordan
Copy link
Member

@be5invis @douglaswth is this resolved via #2182?

@be5invis
Copy link
Author

be5invis commented Sep 20, 2016 via email

@vors
Copy link
Collaborator

vors commented Sep 20, 2016

Since ""a"" is equal to '"a"', do you suggest that native.exe '"a"' should result in "\"a\""?

@douglaswth
Copy link
Contributor

This seems like a feature request that if implemented could break a large number of already existing PowerShell scripts that use the required double escaping, so extreme care would be required with any solution.

@be5invis
Copy link
Author

@vors Yes.
@douglaswth The double-escaping is really silly: why do we need the “inner” escapes made in the DOS era?

@be5invis
Copy link
Author

be5invis commented Sep 20, 2016

@vors @douglaswth
This is a the C code used to show GetCommandLineW and CommandLineToArgvW results:

#include <stdio.h>
#include <wchar.h>
#include <Windows.h>

int main() {
  LPWSTR cmdline = GetCommandLineW();
  wprintf(L"Command Line : %s\n", cmdline);

  int nArgs;
  LPWSTR *szArglist = CommandLineToArgvW(cmdline, &nArgs);
  if (NULL == szArglist) {
    wprintf(L"CommandLineToArgvW failed\n");
    return 0;
  } else {
    for (int i = 0; i < nArgs; i++) {
      wprintf(L"argv[%d]: %s\n", i, szArglist[i]);
    }
  }
  LocalFree(szArglist);
}

@be5invis
Copy link
Author

Here is the result

$ ./a "a b"
Command Line : "Z:\playground\ps-cmdline\a.exe" "a b"
argv[0]: Z:\playground\ps-cmdline\a.exe
argv[1]: a b

$ ./a 'a b'
Command Line : "Z:\playground\ps-cmdline\a.exe" "a b"
argv[0]: Z:\playground\ps-cmdline\a.exe
argv[1]: a b

$ ./a 'a"b'
Command Line : "Z:\playground\ps-cmdline\a.exe" a"b
argv[0]: Z:\playground\ps-cmdline\a.exe
argv[1]: ab

$ ./a 'a"b"c'
Command Line : "Z:\playground\ps-cmdline\a.exe" a"b"c
argv[0]: Z:\playground\ps-cmdline\a.exe
argv[1]: abc

$ ./a 'a\"b\"c'
Command Line : "Z:\playground\ps-cmdline\a.exe" a\"b\"c
argv[0]: Z:\playground\ps-cmdline\a.exe
argv[1]: a"b"c

@douglaswth
Copy link
Contributor

@be5invis I do not disagree with you about the double escaping being annoying, but I am merely saying that a change to this would need to be backward compatible with what existing PowerShell scripts use.

@be5invis
Copy link
Author

How many are them? I do not think there are script writers know about such double-quoting. It is a bug, not feature, and it is not documented.

???? iPhone

? 2016?9?21??01:58?Douglas Thrift <notifications@github.commailto:notifications@github.com> ???

@be5invishttps://github.com/be5invis I do not disagree with you about the double escaping being annoying, but I am merely saying that a change to this would need to be backward compatible with what existing PowerShell scripts use.

You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHubhttps://github.com//issues/1995#issuecomment-248381045, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AAOp20f_W0mTl2YiJKi_flQBJUKaeAnLks5qsB7ZgaJpZM4JpVin.

@douglaswth
Copy link
Contributor

PowerShell has been around for 9 years so there are very likely a good number of scripts out there. I found plenty of information about the need for double escaping from StackOverflow and other sources when I ran into the need for it so I don't know if I agree with your claims about nobody knowing about the need for it or that it is not documented.

@vors
Copy link
Collaborator

vors commented Sep 20, 2016

For the additional context, I'd like to talk a little bit about the implementation.
PowerShell calls .NET API to spawn a new process, which calls a Win32 API (on windows).

Here, PS creates StartProcessInfo that is uses
https://github.com/PowerShell/PowerShell/blob/master/src/System.Management.Automation/engine/NativeCommandProcessor.cs#L1063

The provided API takes a single string for arguments and then it's re-parsed into an array of arguments to do the execution.
The rules of this re-parsing are not controlled by PowerShell. It's a Win32 API (and fortunately, it consistent in dotnet core and unix rules).
Particularly, this contract describes the \ and " behavior.

Although, PowerShell may try to be smarter and provide a nicer experience, the current behavior is consistent with cmd and bash: you can copy native executable line from them and use it in powershell and it works the same.

@be5invis If you know a way to enhance the expirience in non-breaking way, please line up the details. For the breaking changes, we would need to use RFC process, as described in https://github.com/PowerShell/PowerShell/blob/master/docs/dev-process/breaking-change-contract.md

@SteveL-MSFT SteveL-MSFT added the WG-Engine core PowerShell engine, interpreter, and runtime label Sep 29, 2016
@TSlivede
Copy link

TSlivede commented Oct 3, 2016

This applies to Windows, but when running commands on Linux or Unix, its strange that one needs to double escape quotes.

On Linux processes don't have a single commandline but instead an array of arguments.
Therefore arguments in powershell should be the same as those, that are passed to the executable, instead of merging all arguments and then resplitting.

Even on windows, the current behavior is inconsistent:
If an argument contains no spaces, it is passed unchanged.
If an argument contains spaces, if it will be surrounded by quotes, to keep it together through CommandLineToArgvW call. => Argument is changed to meet CommandLineToArgvW requirement.
But if argument contains quotes, those are not escaped. => Argument is not changed, although CommandLineToArgvW requires this.

I think arguments should either never be changed, or always be changed to meet CommandLineToArgvW requirements, but not in half of the cases.

Regarding breaking-the-contract:
As I couldn't find any official documentation about double escaping, I'd consider this as category "Bucket 2: Reasonable Grey Area", so there are chances to change this, or am I wrong?

@be5invis
Copy link
Author

be5invis commented Oct 3, 2016

@vors This is extremely annoying if your argument is an variable or something else: you have to manually escape it before sending it into a native app.
An "auto-escaping" operator may help. like ^"a"" -> "a\""`

@vors
Copy link
Collaborator

vors commented Oct 3, 2016

I think @TSlivede put it right with the inconsistency in the behavior.

I think arguments should either never be changed, or always be changed to meet CommandLineToArgvW requirements, but not in half of the cases.

I'm not sure about the bucket, but even the "clearly breaking change" bucket could potentially be changed. We want to make PowerShell better, but backward compatibility is one of our highest priorities. That's why it's not so easy.
We have a great community and I'm confident that we can find consensus.

Would anybody want to start an RFC process?

@lzybkr
Copy link
Member

lzybkr commented Oct 3, 2016

It would be worth investigating the use of P/Invoke instead of .Net to start a process if that avoids the need for PowerShell to add quotes to arguments.

@vors
Copy link
Collaborator

vors commented Oct 3, 2016

@lzybkr as far as I can tell, PInvoke would not help.
And this is where unix and windows APIs are different:

https://msdn.microsoft.com/en-us/library/20y988d2.aspx (treats spaces as separators)
https://linux.die.net/man/3/execvp (doesn't treat spaces as separators)

@lzybkr
Copy link
Member

lzybkr commented Oct 3, 2016

I wasn't suggesting changing the Windows implementation.

@vors
Copy link
Collaborator

vors commented Oct 3, 2016

I'd try to avoid having platform-specific behavior here. It will hurt scripts portability.
I think we can consider changing windows behavior in a non-breaking way. I.e. with preference variable. And then we can have different defaults or something like that.

@lzybkr
Copy link
Member

lzybkr commented Oct 3, 2016

We're talking about calling external commands - somewhat platform dependent anyway.

@TSlivede
Copy link

TSlivede commented Oct 3, 2016

Well, i think it can't be really platform independent, as Windows and Linux just have different ways to call executables. In Linux a process gets an argument array while on Windows a process just gets a single commandline (one string).
(compare the more basic
CreateProcess -> commandline (https://msdn.microsoft.com/library/windows/desktop/ms682425)
and
execve -> command array (https://linux.die.net/man/2/execve)
)

As Powershell adds those quotes when arguments have spaces in them, it seems to me, that powershell tries** to pass the arguments in a way, that CommandLineToArgvW splits the commandline to the arguments that were originally given in powershell. (This way a typical c-program gets the same arguments in its argv array as a powershell function gets as $args.)
This would perfectly match to just passing the arguments to the linux systemcall (as suggested via p/invoke).

** (and fails, as it doesn't escape quotes)

PS: What is necessary to start an RFC process?

@lzybkr
Copy link
Member

lzybkr commented Oct 3, 2016

Exactly - PowerShell tries to make sure CommandLineToArgvW produces the correct command and after reparsing what PowerShell has already parsed.

This has been a longstanding pain point on Windows, I see on reason to bring that difficulty over to *nix.

To me, this feels like an implementation detail, not really needing an RFC. If we changed behavior in Windows PowerShell, it might warrant an RFC, but even then, the right change might be considered a (possibly risky) bug fix.

@SteveL-MSFT
Copy link
Member

This appears to be fixed in 7.3:

PS> testexe -echoargs "`"a`""
Arg 0 is <"a">

Shell automation moved this from In progress to Done Aug 11, 2022
@jiasli
Copy link

jiasli commented Aug 11, 2022

This appears to be fixed in 7.3:

Thanks a lot @SteveL-MSFT. May I know which PR is responsible for the fix?

@jiasli
Copy link

jiasli commented Aug 11, 2022

I can verify double quotes are escaped correctly in v7.3.0-preview.6 when calling a .exe.

> echo $PSVersionTable

Name                           Value
----                           -----
PSVersion                      7.3.0-preview.6
...

> python -c "import sys; print(sys.argv)" '"ab"'
['-c', '"ab"']

However, special characters (", |, &) are still not escaped correctly when invoking a cmd script. Consider a very simply script test.cmd:

echo %*
python -c "import sys; print(sys.argv)" %*

For |, running test.cmd in PowerShell v7.3.0-preview.6 gives:

PS D:\> ./test.cmd "a|b"
'b' is not recognized as an internal or external command,
operable program or batch file.

This differs from the behavior of Command Prompt:

D:\>test.cmd "a|b"

D:\>echo "a|b"
"a|b"

D:\>python -c "import sys; print(sys.argv)" "a|b"
['-c', 'a|b']

This causes problem for Azure CLI when there is | or & in argument value: Azure/azure-cli#20972

For ", running test.cmd in PowerShell v7.3.0-preview.6 gives:

PS D:\> ./test.cmd "`"ab`""

D:\>echo "ab"
"ab"

D:\>python -c "import sys; print(sys.argv)" "ab"
['-c', 'ab']

This differs from the behavior of Command Prompt:

D:\>test.cmd "\"ab\""

D:\>echo "\"ab\""
"\"ab\""

D:\>python -c "import sys; print(sys.argv)" "\"ab\""
['-c', '"ab"']

@rivy
Copy link

rivy commented Aug 11, 2022

As noted by @jiasli, this issue is not actually fixed.

@brendandburns
Copy link
Contributor

brendandburns commented Aug 11, 2022

The PowerShell behavior here is the same as bash

test.sh:

#!/bin/bash

echo $@
bburns@DESKTOP-VHJJ57D:~$ ./test.sh "a|b"
a|b
bburns@DESKTOP-VHJJ57D:~$ ./test.sh "a&b"
a&b
bburns@DESKTOP-VHJJ57D:~$

In my opinion, the bash/powershell behavior makes more sense than the cmd behavior. In the above example, the quotes are intended for the shell interpreter, they are not part of the user's input. The shell's job is to pass each value as a String to the command that it calls. It is the executable/command scripts responsibility to re-escape as necessary before shelling out again.

@brendandburns
Copy link
Contributor

brendandburns commented Aug 11, 2022

Furthermore this seems to be a specific behavior related to cmd.exe calling command scripts.

import sys
print(sys.argv)
C:\Users\bburns>py test.py 'a|b'
'b'' is not recognized as an internal or external command,
operable program or batch file.

C:\Users\bburns>py test.py 'a&b'
['test.py', "'a"]
'b'' is not recognized as an internal or external command,
operable program or batch file.

@yecril71pl
Copy link
Contributor

As noted by @jiasli, this issue is not actually fixed.

This issue is about passing arguments to native executables that support passing arguments. The executable cmd.exe is not one of them, it uses the whole command line instead. The only reliable way to call cmd.exe is to print the expected command line out into a temporary command script and run that script without any arguments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Committee-Reviewed PS-Committee has reviewed this and made a decision Resolution-Fixed The issue is fixed. WG-Engine core PowerShell engine, interpreter, and runtime
Projects
Shell
  
Done
Development

Successfully merging a pull request may close this issue.