Add Filename.quote_command function #1492
|
This is a great leap forward, though the essay I've just spent an hour writing still needs more investigation, so for now a holding comment to say that I've been looking at this, and to link to the original issue ocaml/dune#322 which led to MPR#7672 and which isn't (yet) addressed by this GPR. The TL;DR is that this still doesn't entirely deal with nasty escaping cases - to give a brief example, consider the correctly formatted set -50="&rd/s/q %USERPROFILE%\Documents&"
dir "25%-50% growth data.txt"
tries to erase your home profile... |
|
Your example is too complicated for me to follow. Can you make it shorter? Like 2 characters long? Also, is the problem in the quoting algorithm given here or in my implementation of it? |
|
I can't make it much shorter, no! The issue is the combination of:
The third point is the original problem in MPR#7672 which is that the command For that horrific example, your implementation escapes the outermost quotes, which doesn't work - I'm not yet certain that that counts as a bug in your implementation per se, because you explain that Unfortunately, I imagine that what may be required is a more expressive datatype than It's worth remembering that even cmd gets all this wrong - tab completion on a filename with I'm concerned that if we end up adding a function which is hard to use correctly then we end up appearing to be "doing it right", when in reality unexpected results like that "attack" I outline above are still possible. |
I think that would be a good solution as well — export something like |
|
All right, let me make one last attempt at being constructive and designing a proper solution. (As opposed to the terribly negative "it's completely broken" replies above.) To make the discussion more understandable, keep in mind that my current proposal encodes Special characters in First, I claim that this encoding, taken from here, is correct for the arguments to the command. In particular, Now, there is a problem with the first word of the command line: the path to the file to be executed. This path is parsed entirely by and So, we need to use
With this approach the encoding of I'll try to code this approach when I find enough energy. |
|
I'm not totally sure how you get from "This is a great leap forward" to "it's completely broken", but let's not fight that out here - I'll owe you a coffee/beer at the next dev meeting instead! I agree that the quoting for arguments completely works - the only reason we might want to avoid the extra escaping of quote characters is to keep the command line length down so we don't hit the limit. I'm not certain how that SS64 reference arrives at "%%" as a means of escaping "%" characters (I'd like to know, because it might help to refute it) but the only place where I know of that working is batch parameters inside shell scripts (i.e. it's Adding the quotes around the whole command creates an issue with your example for redirections (
(* This would be literally ""echo" ^"foo^"" *)
let command = Filename.quote_command ["echo"; "foo"];;
(* This would be literally ""C:\"^%"PATH"^%" Bar"" *)
let file = Filename.quote_command ["C:\\%PATH% Bar"];;
(* As given, the Sys.command below will echo
foo">C:\^%PATH^% Val
to the console. If instead command and file have
their outer quotes removed and the outer quotes go
over the whole command then it works. *)
Sys.command (command ^ ">" ^ file)
So I think that either quoting the entire argument should be by a new version of
# Sys.command "echo";;
ECHO is on.
- : int = 0
# Sys.command "\"echo\"";;
ECHO is on.
- : int = 0
# Sys.command "\"\"echo\"\"";;
- : int = 0
# Sys.command "findstr";;
FINDSTR: Bad command line
- : int = 2
# Sys.command "\"findstr\"";;
FINDSTR: Bad command line
- : int = 2
# Sys.command "\"\"findstr\"\"";;
FINDSTR: Bad command line
- : int = 2
perhaps |
|
Thanks a lot for the feedback, very useful indeed. I think we're close. Not quoting builtin commands such as I also think we have a show-stopper, namely the issue with redirections ( At any rate, I wasn't too proud of the documentation I wrote explaining how to call I can think of two APIs:
|
|
What's the detail of your experiments with One other might be to do the ocamlbuild-like API (describing it that way makes me shudder a little!) but also have a simpler function. Say |
|
Actually, having run a search for any filenames containing |
|
My observations concerning |
|
@dra27 @xavierleroy How are we doing on this one -- could it make 4.07? |
|
FWIW, I recently had to generate .bat scripts of the form "SET foo=bar", with bar coming from an OCaml string, with the intent of getting the variable defined to that string. Hence the need to escape. And I observed that % characters in the OCaml string need to be escaped as %%, and ^% does not work. I don't know if this is specific to the SET command, or if the same rules apply to command-line arguments for calling external programs from cmd, but I thought I would report this finding, just in case. |
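A minimal sketch of the observation above, for illustration only: it doubles every % in a value before emitting a SET line for a .bat script. The names escape_percent and set_line are hypothetical, and the sketch handles only the % character; other characters that are special to cmd.exe or to SET are not treated.

```ocaml
(* Hypothetical helper: double every '%' so that a .bat script
   sees a literal percent sign, as observed in the comment above.
   Only '%' is handled; other special characters are left as-is. *)
let escape_percent s =
  let buf = Buffer.create (String.length s + 8) in
  String.iter
    (fun c ->
      if c = '%' then Buffer.add_string buf "%%"
      else Buffer.add_char buf c)
    s;
  Buffer.contents buf

(* Hypothetical helper: build one "SET name=value" line for a .bat file. *)
let set_line name value =
  Printf.sprintf "SET %s=%s" name (escape_percent value)

let () = print_endline (set_line "foo" "25%-50% growth")
```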
|
Thanks for the ping! I swapped out a lot of the background knowledge (that's a known strategy to survive Windows-induced trauma), so I need to swap it back in... But I think we can get something to work, we just have to choose between two possible APIs:
|
|
We can have both, with the simplified API being a wrapper for the precise one? |
Unfortunately, the SET command has its own quoting rules. |
f09b853 to d4bd388
|
I closed this PR by accident. Actually I meant to push a new implementation of The API is now with the command In a nutshell, The novelty is that, under Win32, Finally, for good measure, the whole command line is wrapped in double quotes to please cmd.exe. |
|
I will try to find some time for a proper review again of this... in the meantime, there are some check-typo nits: |
damiendoligez
left a comment
The code looks good, although I can't say I master all the convolutions of the Windows quoting conventions, so let's wait for @dra27's review.
A question though: is it on purpose that you don't allow redirecting standard error output?
I vaguely remembered that Windows's cmd.exe would not support stderr redirection. I was very wrong: https://docs.microsoft.com/en-us/previous-versions/windows/it-pro/windows-xp/bb490982(v=technet.10) I can certainly add a Also, and independently: the optional arguments could be labeled |
3df3ccc to 3c09f8c
|
Rebased on trunk and fixed typo. |
|
I vote for For redirecting both, I see two solutions:
|
2cadaa1 to 4e6ca01
|
Thanks for the review. I added stderr redirection following @damiendoligez's approach #2 above, and also renamed the optional arguments. The PR was rebased and neatly squashed. From my viewpoint, it is ready for merging, and I will not develop it further. There's a long URL in a comment that causes check-typo to throw a fit, but I'll let the check-typo masters handle this. |
|
@xavierleroy: I initially thought about relaxing this restriction, but I see why it makes sense. For example, someone using a screen reader can decide to skip to the next line when encountering a URL, but they must be able to assume that there won't remain any useful text after the URL on the same line. |
if any are quoted using {!Filename.quote}, then concatenated.
Under Win32, additional quoting is performed as required by the
[cmd.exe] shell that is called by {!Sys.command}.
*)
Could you make it explicit that quote_command can raise (and in which condition)?
Also, I'm not fully convinced that Invalid_argument should be raised, and not Failure. Invalid_argument reports programming errors, and should not usually be caught; in this case, this would mean that the caller needs to validate the redirection filenames, since they could easily be derived from some program inputs. The validation is not overly difficult, but it is not straightforward and it is platform dependent. In practice, programmers will not implement the validation.
If one decides to raise Failure, one could get away with a more informal contract with the client ("The function will raise Failure if some arguments cannot be properly escaped on the current platform"), and programmers might actually decide to catch the exception.
@dbuenzli: as our champion for the Failure vs Invalid_argument debate, feel free to comment.
It should be Failure rather than Invalid_argument because it's a failure of the library itself rather than the program: in principle, every file name should be quotable, it's just that we don't (yet) know how to do it on Windows.
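To illustrate the contract discussed above, here is a small sketch of a caller that catches Failure instead of validating file names up front. It assumes Filename.quote_command as it later shipped in the standard library (OCaml 4.10 and later) with an optional ?stdout argument; the wrapper name try_quote_command is hypothetical.

```ocaml
(* Hypothetical wrapper: turn the Failure raised when an argument
   cannot be quoted on the current platform into a result value.
   Assumes OCaml >= 4.10 for Filename.quote_command. *)
let try_quote_command prog ?stdout args =
  match Filename.quote_command prog ?stdout args with
  | cmdline -> Ok cmdline
  | exception Failure msg -> Error msg

let () =
  match try_quote_command "ls" ~stdout:"listing.txt" ["-l"] with
  | Ok cmdline -> print_endline cmdline
  | Error msg -> prerr_endline ("cannot quote command: " ^ msg)
```

With Invalid_argument, the same caller would instead be expected to check the redirection file names itself before calling the function.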
968fe93 to 86b729d
|
Rebased and changed the exception to |
86b729d to eadee6f
|
A review and a decision, please? This would close #6107 among others. |
This function takes care of quoting the command and its arguments so that they are correctly parsed by the system shell (/bin/sh for Unix, cmd.exe for Win32). Redirections for standard input (< file), standard output (> file), and standard error (2> file) can also be specified as optional arguments. A 2>&1 redirection is used if stdout and stderr are redirected to the same file. The result is a string that can be passed directly to Sys.command or to the Unix functions that expect shell command lines.
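As a rough usage sketch of the description above: the helper below runs a program with its standard output redirected to a temporary file and returns the exit status together with the file name. It assumes OCaml 4.10 or later, where Filename.quote_command takes the optional ~stdout argument; run_to_file and read_all are hypothetical names introduced for the example.

```ocaml
(* Hypothetical helper: run [prog args] through the system shell with
   stdout redirected to a fresh temporary file.
   Assumes OCaml >= 4.10 for Filename.quote_command. *)
let run_to_file prog args =
  let out = Filename.temp_file "quote_command" ".out" in
  let cmdline = Filename.quote_command prog ~stdout:out args in
  let status = Sys.command cmdline in
  (status, out)

(* Read a whole file into a string. *)
let read_all path =
  let ic = open_in_bin path in
  let n = in_channel_length ic in
  let s = really_input_string ic n in
  close_in ic;
  s

let () =
  let status, out = run_to_file "echo" ["hello world"] in
  Printf.printf "exit status %d, output: %s\n" status
    (String.trim (read_all out));
  Sys.remove out
```

The argument containing a space is passed as a single list element; no manual quoting is done by the caller. How shell builtins such as echo behave when quoted under cmd.exe is one of the points debated in this thread, so the example is best read with a POSIX shell in mind.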
eadee6f to 5b9f9b6
|
Rebased again. @dra27 could you review the Windows-specific parts? The rest is OK for me. |
nojb
left a comment
I would like to get this nice PR merged, which helps solve a very real and tricky problem when trying to write platform-independent code in OCaml.
I looked over the code and it looks fine to me; the intricacies of quoting on Windows mean that it may not be perfect (and indeed, the docstring of Filename.quote_command makes it clear that it is a best-effort solution). On the basis of that, I am approving the PR and move that we merge it.
|
Thanks everyone! |
|
#9289 proposes a change of |
As reported in MPR#6107 and MPR#7672, it is not easy to properly quote the command name and its arguments in a shell command executed through Sys.command or the Unix.open_process* functions. For a POSIX shell it is enough to apply Filename.quote on every argument. But for the Windows shell cmd.exe, additional quoting is required on top of that, as explained here.
This pull request grants the feature wish from MPR#7672 by adding a Filename.quote_command function that takes a command and its arguments (as a list of strings) and produces a command line appropriately quoted for the shell that is going to parse and execute it.
Documentation of Sys.command, Unix.system and Unix.open_process* was updated to encourage the use of Filename.quote_command.
Preventive FAQ
Q: why string list -> string and not string array -> string for consistency with Unix.exec?
A: it felt right. MPR#7672 suggested string list. It's easier to build a list (with :: and @) than an array. Perhaps it's Unix.exec that is wrong in its use of a string array.