Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

normalize command line arguments #4724

Closed
wants to merge 7 commits into from
Closed

normalize command line arguments #4724

wants to merge 7 commits into from

Conversation

german-one
Copy link
Contributor

@german-one german-one commented Feb 26, 2020

Summary of the Pull Request

Use "" instead of treating \" as escaped ".
Try to ensure that the app name is always the argument with index 0. obsolet

References

#4632

PR Checklist

Detailed Description of the Pull Request / Additional comments

  • Parse the command line and tokenize it using an algorithm that treats "" rather than \" as escaped quotation mark to preserve it in the argument
  • Check if at least one command line argument has been received. Check if the first argument is reasonable to represent the own call. If either of these isn't true, add the app path as first argument. obsolet

Validation Steps Performed

  • Manually called WT with option -d and quoted paths with trailing backslash.
  • Used the CreateProcess function in a test app to simulate the missing application name in the first argument.

Copy link
Contributor

@DHowett-MSFT DHowett-MSFT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(comments left separately. thanks for looking into this!)

@ghost ghost added Needs-Author-Feedback The original author of the issue/PR needs to come back and respond to something and removed Needs-Author-Feedback The original author of the issue/PR needs to come back and respond to something labels Feb 26, 2020
Copy link
Member

@zadjii-msft zadjii-msft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm only not signing off because of the (arg0name.length() < 2U || std::towlower(arg0name.front()) != L'w') thing. Otherwise I'm happy with this.

src/cascadia/WindowsTerminal/AppHost.cpp Outdated Show resolved Hide resolved
@german-one
Copy link
Contributor Author

german-one commented Feb 29, 2020

After thinking twice we might actually need an alternative way to preserve quotes in an argument. E.g. in commands that shall be executed in a tab.
I updated the tokenizing to treat "" rather than \" as an escaped ". The figure below shows what this means in terms of the argument you have to pass.

expected parameter using CommandlineToArgvW using own algorithm
C:\Progam Files\ "C:\Program Files\\" "C:\Program Files\"
parameter "with" quotes "parameter \"with\" quotes" "parameter ""with"" quotes"

Doubling the quotes is not unusual on Windows. That's what you have to do in VB strings or what you have to pass to find.exe. And it doesn't lead to unbalanced quotation marks in the calling shell. However, consider to kick me out at this point and close #4571 as by design.

@DHowett
Copy link
Member

DHowett commented May 30, 2020

For the life of me, I cant understand why CommandLineToArgvW even exists.

The answer to this is typically "somebody made a decision 30 years ago and now it exists, and its deficiencies are not adequately documented such that people would know to stop using it." Consider some other parts of the Win32 API surface...

@DHowett
Copy link
Member

DHowett commented May 30, 2020

Part of the problem is that we're not launching via main. We're using wWinMain, which the CRT does not provide argv to. Now, there's probably a weird global __argv we could use . . .

@DHowett
Copy link
Member

DHowett commented May 30, 2020

We're a Windows subsystem application. Windows subsystem applications use wWinMain. If there's a reasonable way for us to use main or wmain without changing the subsystem of our application (and picking up the issues that come with that), I'm all ears. 😄

@german-one
Copy link
Contributor Author

The internal parser that generates argc and argv seems to use the same algorithms like CommandLineToArgvW. Same quirk with \" parsed as escaped ". So, this doesn't solve the issue.
Besides of that, parsing the command line using an own algorithm provides another opportunity which is not yet included here. Usually quoted arguments may contain characters that have to be escaped in unquoted context to change its semantics. In particular, I'm thinking about the semicolon. It's specified as a separator in the Terminal command line. At the same time it's a line separator in PowerShell. Passing PowerShell commands to the Terminal requires the escaping of the semicolon, even if it occurs in a quoted argument. This is rather unusual.
The own implementation can easily be extended to make unquoted semicolons a separate argument in the vector. Further processing should take the semicolon as a separator if an argument consists of a semicolon only. That behavior would look much more familiar to the users.
However, since this PR has not been touched for a while and decreased in priority, I'll get back to that when the time has come.

@DHowett
Copy link
Member

DHowett commented May 30, 2020

Couple points:

  1. I am not certain python 2.6 is, by any means, new
  2. Given that cmd is complaining, the set of arguments isn’t even making it to the argument splitter in Python. I’m not sure that’s a very good example.

We need to be sensitive of what the shell is splitting and what the recipient is splitting, especially when we are discussing a change to the algorithm on the receiving side.

@german-one
Copy link
Contributor Author

This PR addresses the quirk that \" becomes ". In particular a quoted path like
"C:\"
becomes
C:"
after parsing which is an invalid path.

@DHowett
Copy link
Member

DHowett commented May 31, 2020

I dunno if that holds on Windows for Windows tools though.

Ideally, we would have something like exec and parameter splitting would have been done before we were even started (where quotes wouldn’t be an issue), but as it stands we don’t, and a Windows path is making it into my process with a \ and a “ in it, and something has to handle that appropriately.

Escaping should have happened outside my process, and unquoting should have happened outside my process, but that’s just not the world we live in in Windows.

We will probably need our own parser anyway: these command strings are going to come from places that aren’t just argv/the environment block in the future(1) and the quicker we can converge how they’re handled the better the experience will be.

1: my phone browser isn’t letting me do the GitHub issue reference search, but we have one for supporting -f/--file to pass a list of commands in a file, and one for supporting a remote WT instance passing commands to a running copy to support “open tab or split pane in existing window”

@DHowett
Copy link
Member

DHowett commented May 31, 2020

Anyway... if our options are “fight the holy war to make all argument parsing in Windows happen before spawn instead of after spawn, and then wait ten years for the operating system parts to shake out” or “have our own parser for now that works on all existing versions for the use cases we need” I know which one I’m voting for

@german-one
Copy link
Contributor Author

I am not convinced that is a parser error. With several languages, backslashes
need to be escaped by the user, not by the parser. For example JavaScript:

I'm sure it was not intended to be an error. Indeed it is still a possibility to preserve a quotation mark in an argument. But we already have an alternative by doubling them. And doubling the backslash is the way out with the current parser. But think about an argument like "C:\program files\\".
Doesn't it feel wrong if the user has to double the last backslash only? And it's not the CMD shell which requires the double backslash...
@DHowett is IMO absolutely right. It's easy to prove that the app receives the "C:\" from my example above. Just print the command line that GetCommandLineW() returns. Various shells may require additional escaping in order to pass a final "C:\" to the recipient after internal parsing. But that's not ours. We have to ensure that the received arguments are parsed in a comprehensible way that meets the user's expectations.

@german-one
Copy link
Contributor Author

Command line options are supported for a couple of months now. I guess meanwhile users are used to escaping the trailing backslash or they just append a point. Changing the bahavior now may break scripts of too many people.
Closing this now. It's stale anyway.

@german-one german-one closed this Jul 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Commandline wt.exe's commandline arguments Product-Terminal The new Windows Terminal.
Projects
None yet
4 participants