Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sh_binary launcher does not quote arguments correctly on Windows #17487

Open
seh opened this issue Feb 14, 2023 · 6 comments
Open

sh_binary launcher does not quote arguments correctly on Windows #17487

seh opened this issue Feb 14, 2023 · 6 comments
Labels
area-Windows Windows-specific issues and feature requests help wanted Someone outside the Bazel team could own this P3 We're not considering working on this, but happy to review a PR. (No assignee) team-Rules-Server Issues for serverside rules included with Bazel type: bug

Comments

@seh
Copy link

seh commented Feb 14, 2023

Description of the bug:

When using the sh_binary rule to invoke a bash program on Windows, the BashBinaryLauncher C++ class used to invoke bash composes the command-line arguments incorrectly, quoting only those arguments that include at least one space character and escaping only double quotation and backslash characters. The launcher fails to quote arguments that include other characters that bash winds up misinterpreting.

In short, even when the sh_binary author quotes the command-line arguments as necessary to satisfy bash, the BashBinaryLauncher program consumes those arguments but then fails to emit them again in a manner that will satisfy bash.

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

Find attached here an archive of a Bazel repository named bazel-sh-binary-test. After extracting that archive, run the following command:

bazel test //test:print_args_test

This test compares the would-be printed output with expected "golden" output. This test succeeds on macOS and Linux.

When run on Windows, though, we never even make it as far as invoking the bash program in file test/print-args. Instead, bash fails while parsing the command-line arguments passed via the -c flag by the BashBinaryLauncher program.

In the genrule target named "capture_printed_args," we invoke the print-args program as follows:

print_args a 'b c' 'f(x).y'

Note that the final argument is f(x).y surrounded by single quotation marks. Quoting the argument like that is necessary because the parentheses are shell metacharacters. We don't intend for bash to interpret those parentheses; they're meant for the print-args program to receive and consume.

Bazel's genrule implementation invokes the program created by the sh_binary rule as follows:

.../bin/test/print_args.exe a 'b c' 'f(x).y' > "bazel-out/.../bin/test/printed-args.txt"

Note that the second b c and third f(x).y arguments are both surrounded by single quotation marks. Next, the print_args.exe program invokes bash as follows:

...\usr\bin\bash.exe -c '...\bin\test\print_args a "b c" f(x).y'

Note that while the second b c argument is surrounded with double quotation marks, the third f(x).y argument is not quoted at all. BashBinaryLauncher decided that that third argument didn't warrant quoting, but then bash fails as follows:

/usr/bin/bash: -c: line 1: syntax error near unexpected token `('

Which operating system are you running Bazel on?

Windows

What is the output of bazel info release?

release 6.0.0

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

No response

Have you found anything relevant by searching the web?

Somewhat related issues:

Somewhat related PRs:

Any other information, logs, or outputs that you want to share?

There have been a few discussions in the "Bazel" Slack workspace on this subject, most recently focused on interpreting the failures with the test case supplied here. As part of that discussion, @fmeum submitted #17484 to probe the argument inspection and escaping done in the BashEscapeArg function.

The bash source code offers the sh_contains_shell_metas function that shows which characters bash itself considers to warrant special treatment. Among other things, bash uses that function to determine when to quote words when printing commands. Looking there, space characters are just one of twenty-five possible characters that may warrant quoting.

Without us being more conservative in which arguments we quote before passing on to the sh_binary-wrapped program, we can't pass many otherwise valid arguments through, leaving us unable to support invocations on Windows that work fine in other operating systems, even though we have bash at our disposal.

@seh
Copy link
Author

seh commented Feb 14, 2023

Note that while sh_binary uses this intermediate launcher program on Windows, on other platforms it doesn't bother, and instead uses the lone file mentioned in the "srcs" attribute as the executable to invoke.

@seh
Copy link
Author

seh commented Feb 14, 2023

Note that Bazel's own Starlark Args type—or whatever digests the "arguments" parameter for ctx.actions.run—doesn't always quote arguments that need to be quoted on Windows, and fights back against a caller's attempt to surround argument values preemptively in quotation marks.

@ShreeM01 ShreeM01 added type: bug team-Rules-Server Issues for serverside rules included with Bazel untriaged labels Feb 14, 2023
@meteorcloudy meteorcloudy added area-Windows Windows-specific issues and feature requests P2 We'll consider working on this in future. (Assignee optional) help wanted Someone outside the Bazel team could own this and removed untriaged labels Feb 20, 2023
@meteorcloudy
Copy link
Member

@seh Thanks for reporting this one! We do have a test at least trying to make sure passing arguments work correctly with the launchers, see https://cs.opensource.google/bazel/bazel/+/master:src/test/py/bazel/launcher_test.py;l=307-328

Compared to your repo, the shell scripts deal with the arguments differently, but I'm not completely sure which way is "correct".

@seh
Copy link
Author

seh commented Feb 20, 2023

the shell scripts deal with the arguments differently, but I'm not completely sure which way is "correct".

If you look at @fmeum's #17484, you can see that he augmented the _buildAndCheckArgumentPassing Python function to include an argument with parentheses and a period, along with another test to exercise genrule. The period probably doesn't wind up making a difference, but the parentheses definitely do. There are many other characters that count as shell metacharacters for which we could check as well.

@meteorcloudy
Copy link
Member

I see. Parsing command line is always harder on Windows since the CreateProcessW only allows you to pass one single string for all arguments (while on Unix systems you can pass a list of strings), so it really depends on how the binary itself to parse arguments from the command line string.

Overall, I think it definitely makes sense to fix cases with special characters and keep the behaviour consistent with other platforms. The Bazel team probably don't have capacity to look into this right now, but happy to review a PR if someone wants to give it go. ;)

@seh
Copy link
Author

seh commented Feb 21, 2023

Parsing command line is always harder on Windows since the CreateProcessW only allows you to pass one single string for all arguments (while on Unix systems you can pass a list of strings)

Keep in mind, though, that here, the problem is not how we're composing several command-line arguments, but rather the single argument supplied to bash -c (witness how we compose and use the bash_command variable's value in BashBinaryLauncher::Launch()). We pass each individual argument through BashEscapeArg, and then pass the whole command line through BashEscapeArg one more time. That second invocation is strange.

@comius comius added P3 We're not considering working on this, but happy to review a PR. (No assignee) and removed P2 We'll consider working on this in future. (Assignee optional) labels Feb 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-Windows Windows-specific issues and feature requests help wanted Someone outside the Bazel team could own this P3 We're not considering working on this, but happy to review a PR. (No assignee) team-Rules-Server Issues for serverside rules included with Bazel type: bug
Projects
None yet
Development

No branches or pull requests

4 participants