-
-
Notifications
You must be signed in to change notification settings - Fork 739
std.process: Command escaping functions #457
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
std.process: Command escaping functions #457
Conversation
Have you looked at http://blogs.msdn.com/b/twistylittlepassagesallalike/archive/2011/04/23/everyone-quotes-arguments-the-wrong-way.aspx ? I've tried implementing that C# algorithm in D but with no luck. I definitely need this sort of thing. :) |
No, I haven't (yet), but I think my unit tests are quite comprehensive! They literally brute-force a string and pass the output through the Win32 parsing function. |
I think that blog post is wrong about the way cmd handles double quotes on its command line. From what I understood, it contradicts the output of |
Thanks for the link, Andrej! I looked at the code again today and did some more tests and turns out I was wrong. I'll need to do more changes and testing to make sure I got this right. |
I am not sure if we're talking about the same thing, but @schveiguy wrote something like this for Tango's process module, and he has reimplemented it for the new std.process. (BTW, Walter included the necessary changes to snn.lib in 2.058, which means we can finally submit the new std.process for Phobos review!) |
It seems to be the same idea, but his implementation seems to be a good deal behind mine (original research which seems to miss some subtle issues, suboptimal code which needlessly uses unicode iteration, no tests).
Great! I'm really excited about popen functionality on Windows. You should be able to practically copy-paste the bottom of the current module into the new one to get my changes. |
The last time I've ran into issues with spaces it was with Optlink. When you have paths with spaces you have to use quotation marks, but this is problematic. E.g.:
Here optlink missed the last backslash and is treating the import path as a path to a library file:
To work around it, I have to pass an extra backslash:
Of course this wouldn't even be an issue if optlink didn't use such a silly command-line syntax where it differentiates between import paths and file paths based on whether there's a backslash at the end. |
Yes, that sounds like the usual CommandLineToArgvW algorithm. Backslashes at the end of quoted strings are treaded differently. |
OK, should be solid now. |
Fantastic,
Thanks for the hard work. |
Clarified. |
* Remove self-contradiction * Clarify escapeWindowsArgument implementation * Remove a redundant if subcondition in a unit test.
My std.process rules are correct, I spent a lot of time testing them. The MSDN docs are useless/wrong! But my implementation is lacking. In particular I don't handle repeated backslashes properly (contrary to what my rules state!). Good call about not doing unicode. I think when I did that I didn't understand unicode, and so I did what I thought was the most robust. I'll also cite a limitation in your code, your function returns a string per argument, but if you are building a command line you want a string for the whole command (CreateProcess requires a single command line string). This is a waste of allocations. I'd suggest having an internal function that takes an output range that it appends the data to, and wrap that with a version that returns a string. Then I'll probably use it in my std.process rewrite :) |
I don't see any unit tests. I'd like to say something from my history of writing this code. Initially, some of these functions were written for rdmd. I thought my unit test was comprehensive enough - turns out it was not. I had omitted the case of backslashes preceding quotes in the middle of a string. After I posted this pull request, I found another problem with the Therefore, I reserve my right to remain skeptical about statements like "don't worry, I tested it".
I admit, I skipped the text and looked at the code.
The cost of a few heap allocations is negligible compared to that of creating a process. Comparably, your code does a character append for each character. (Yes, I know |
If your string is so big that you worry about allocations you're going to have to output that long command to a response file anyway. And that is where File IO is probably going to overshadow any optimizations you had. Speaking of which, is there any way to programmatically figure out the maximum length of a command on Posixes? On win32 it's pretty simple since it's predefined.. |
I think Autotools configure scripts find it as part of the configuration process. |
The testing was from a previous life (when I wrote the code implementing these rules in tango). In fact, I don't mean that the implementation is well tested, I mean the rules are well tested. Clearly the implementation is poorly tested since I had an obvious error in it :) The only way to test the rules is to spawn processes and see if windows thinks you got it right. There is no accurate or valid documentation anywhere. It was about 2 or 3 days of playing with cmd.exe to try and nail down the Windows argument parsing black-box. Your implementation follows my rules, so I think we are in agreement on that.
Creating processes is not occurring in your argument processing function, nor is it a prerequisite for using the function. My point is, if it can be better implemented, it should be. I dislike the philosophy that one can be sloppy in writing software because most users won't notice. Especially in library code. If it's a design for performance tradeoff, that's one thing, but this should be free. |
Just to clarify, I'm perfectly happy having someone else implement this horrible convention of windows, but I think it should be as efficient as possible. Your code is certainly more advanced than mine, and I'm happy to use it. |
There is nothing philosophical about premature optimization. It is a misallocation of resources, and nothing more. You are forgetting about the cost of maintaining more code, or more complicated code, as well as the cost to actually do the change - my time would be better spent working on something else D-related. |
You understand that as library code, application writers don't have a choice to optimize this. And this isn't just an optimization, it's providing extra design flexibility. It's ok by me to pull this, I'll just fix it when the new std.process is ready to use it. Not a big deal. |
CommandLineToArgvW). | ||
*/ | ||
|
||
string escapeWindowsArgument(string arg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be in char[]
, the result is always a new buffer.
After a quick look, it looks like it should be pure
, nothrow
and @trusted
too (it looks like assumeUnique
isn't @safe
)
In terms of computational cost when allocating, indeed it's negligible. But the memory cost is significant, since we don't have any compacting garbage collection. Please don't produce more garbage than necessary, especially not in the standard library - you can never predict the demands of the user. edit: Oh, and, it looks you haven't considered |
Good point. I'll take care of it. |
Optimized functions: * escapeWindowsArgument * escapePosixArgument * escapeWindowsShellCommand
Now? |
Oops, forgot to remove the [WIP] tag. No further objections? |
Fix the test failures, I have no problem with the design. The allocator is perfect hook for my purposes. |
Thanks. Should be fixed now. |
std.process: Command escaping functions
Merged. |
This adds functions to escape shell commands for use with the
system
andshell
functions.This is a cleanup of the code I've written for rdmd. Let me know how I did with docs / semantics. "Real-life" unit tests are, unfortunately, hard to write.