Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

On Windows, empty args aren't wrapped with quotes, leads to wrong argv #4778

Closed
alexeagle opened this issue Mar 6, 2018 · 16 comments
Closed

Comments

@alexeagle
Copy link
Contributor

I add some empty args with an action, like this

args.add([], join_with=",")
args.add("hello")

On linux/mac, the program is run with
my_action '' hello
so I parse two arguments.

On windows it's run with
my_action hello

I can make a minimal repro if that's useful

@alexeagle
Copy link
Contributor Author

alexeagle commented Mar 6, 2018

Interestingly if I spill to a params file, it contains

''
hello

Which seems backwards - a blank line would have been more useful to parse args from a file, while the empty quotes would have worked in the argv

@laszlocsomor
Copy link
Contributor

Yes, this is definitely a bug.

C:\tmp>type print-argv.cc
#include <iostream>

int main(int argc, char** argv) {
  std::cout << "argc=" << argc << std::endl;
  for (int i = 0; i < argc; ++i) {
    std::cout << "argv[" << i << "]=(" << argv[i] << ")" << std::endl;
  }
  return 0;
}

C:\tmp>print-argv.exe '' foo "" bar
argc=5
argv[0]=(print-argv.exe)
argv[1]=('')
argv[2]=(foo)
argv[3]=()
argv[4]=(bar)

@laszlocsomor
Copy link
Contributor

...I should've added that the code is a demonstration of how empty args work on Windows.

Apparently '' is taken literally but "" means an empty string.

FYI here's an essay about cmd.exe quoting rules: https://blogs.msdn.microsoft.com/twistylittlepassagesallalike/2011/04/23/everyone-quotes-command-line-arguments-the-wrong-way/

@laszlocsomor laszlocsomor added type: bug P1 I'll work on this now. (Assignee required) platform: windows labels Mar 6, 2018
@laszlocsomor laszlocsomor self-assigned this Mar 6, 2018
@laszlocsomor
Copy link
Contributor

/cc @meteorcloudy

@laszlocsomor laszlocsomor removed the P1 I'll work on this now. (Assignee required) label Mar 6, 2018
@laszlocsomor
Copy link
Contributor

I'm not sure about the right priority, so I'm not assigning any.

@meteorcloudy
Copy link
Member

I'm taking a look at this.

@laszlocsomor
Copy link
Contributor

Thank you Yun!

@meteorcloudy
Copy link
Member

meteorcloudy commented Mar 6, 2018

@alexeagle I tried to use @laszlocsomor 's print-argv.cc in a skylark rule, it seems the parameters are passed correctly.

See the following files:

execute.bzl:

def _impl(ctx):
  # The list of arguments we pass to the script.
  args = ctx.actions.args()
  args.add([], join_with=",")
  args.add("hello")
  # Action to call the script.
  ctx.actions.run(
      inputs=[],
      outputs=[ctx.outputs.out],
      arguments=[args],
      executable=ctx.executable._print_bin)

print_param = rule(
  implementation=_impl,
  attrs={
      "out": attr.output(mandatory=False),
      "_print_bin": attr.label(executable=True, cfg="host", allow_files=True,
                                default=Label("//actions_run:print"))
  }
)

and BUILD:

load(":execute.bzl", "print_param")

print_param(
    name = "run",
    out = "output.txt",
)

cc_binary(
    name = "print",
    srcs = ["print.cc"],
)

The output:

SUBCOMMAND: # //actions_run:run [action 'SkylarkAction actions_run/output.txt']
cd C:/src/tmp/_bazel_pcloudy/8oryhygq/execroot/__main__
bazel-out/host/bin/actions_run/print.exe  hello
INFO: From SkylarkAction actions_run/output.txt:
argc=3
argv[0]=(C:\src\tmp\_bazel_pcloudy\8oryhygq\execroot\__main__\bazel-out\host\bin\actions_run\print.exe)
argv[1]=()
argv[2]=(hello)

Bazel shows the command is "bazel-out/host/bin/actions_run/print.exe hello", but actually the empty argument is passed correctly because we invoke the binary through system API instead of running in terminal.

@meteorcloudy
Copy link
Member

As I explained, this doesn't seem to be a bug in Bazel, so closing. Reopen if it's still a problem.

@alexeagle
Copy link
Contributor Author

I can still reproduce this:

BUILD.bazel

load("@build_bazel_rules_nodejs//:defs.bzl", "nodejs_binary")

nodejs_binary(
    name = "bin",
    entry_point = "__main__/repro.js",
    data = ["repro.js"],
)

load(":index.bzl", "my_rule")
my_rule(name = "my_rule", out="params.txt")
filegroup(name = "node_modules")

WORKSPACE

git_repository(
    name = "build_bazel_rules_nodejs",
    remote = "https://github.com/bazelbuild/rules_nodejs.git",
    tag = "0.5.3")

load("@build_bazel_rules_nodejs//:defs.bzl", "node_repositories")
node_repositories(package_json = [])

index.bzl

def _my_rule(ctx):
  args = ctx.actions.args()
  args.add([], join_with=",")
  args.add("hello")
  ctx.actions.run(
      inputs=[],
      executable = ctx.executable._bin,
      outputs = [ctx.outputs.out],
      arguments = [args],
      )

my_rule = rule(_my_rule,
    attrs = {
        "out": attr.output(mandatory=False),
        "_bin": attr.label(default=Label("//:bin"),
            executable=True, cfg="host"),
    },
)

repro.js

console.error(process.argv);
bazel build :my_rule
DEBUG: C:/users/alexeagle/appdata/local/temp/_bazel_alexeagle/znpsit4n/external/bazel_tools/tools/cpp/lib_cc_configure.bzl:37:3:
Auto-Configuration Warning: 'BAZEL_VC' is not set, start looking for the latest Visual C++ installed.
DEBUG: C:/users/alexeagle/appdata/local/temp/_bazel_alexeagle/znpsit4n/external/bazel_tools/tools/cpp/lib_cc_configure.bzl:37:3:
Auto-Configuration Warning: Looking for VS%VERSION%COMNTOOLS environment variables,eg. VS140COMNTOOLS
DEBUG: C:/users/alexeagle/appdata/local/temp/_bazel_alexeagle/znpsit4n/external/bazel_tools/tools/cpp/lib_cc_configure.bzl:37:3:
Auto-Configuration Warning: Visual C++ build tools found at C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\
INFO: Analysed target //:my_rule (0 packages loaded).
INFO: Found 1 target...
INFO: From SkylarkAction params.txt:
WARNING: source-map-support module not installed.
   Stack traces from languages like TypeScript will point to generated .js files.

[ 'C:\\users\\alexeagle\\appdata\\local\\temp\\_bazel_alexeagle\\znpsit4n\\external\\nodejs\\node.exe',
  '__main__/repro.js',
  'hello' ]

Maybe it's somehow related to nodejs specifically, but if I just run the program like this, I don't see the problem:

 C:\users\alexeagle\appdata\local\temp\_bazel_alexeagle\znpsit4n\external\nodejs\node.exe repro.js '' 'hello'
[ 'C:\\users\\alexeagle\\appdata\\local\\temp\\_bazel_alexeagle\\znpsit4n\\external\\nodejs\\node.exe',
  'C:\\Users\\alexeagle\\Projects\\repro_4778\\repro.js',
  '\'\'',
  '\'hello\'' ]

@meteorcloudy
Copy link
Member

I debugged this issue again. And found, if I modify

if (arg.isEmpty()) {
result.append("\"\"");
continue;
}

to

      if (arg.isEmpty()) {
        result.append("\'\'");
        continue;
      }

The nodejs example will have the expected output:

[ 'C:\\users\\pcloudy\\_bazel_pcloudy\\4isq3p4t\\external\\nodejs\\node.exe',
  '__main__/repro.js',
  '',
  'hello' ]

But the C++ example will output:

argc=3
argv[0]=(C:\users\pcloudy\_bazel_pcloudy\4isq3p4t\execroot\__main__\bazel-out\host\bin\print.exe)
argv[1]=('')
argv[2]=(hello)

The reason is different binaries have different command line parser. While the C++ binary treats "" as an empty argument, the nodejs binary just ignores it.

There is a very interesting article about this: http://daviddeley.com/autohotkey/parameters/parameters.htm

Saddly, I don't think there is any good solution from Bazel side, because Bazel doesn't know how the command line will be parsed. :(
@laszlocsomor @alexeagle

@laszlocsomor
Copy link
Contributor

@meteorcloudy : thanks for looking into this issue!

What exact command line did Bazel pass to CreateProcessW in these two cases? Is there a way to pass an empty argument to a C++ binary?

@meteorcloudy
Copy link
Member

Before the change:
<binary> "" hello
after the change:
<binary> '' hello

The first one is able to pass an empty argument to a C++ binary, but not to a nodejs binary.
The second one works for nodejs binary, but the C++ binary will receive a '' instead of an empty string.

@alexeagle
Copy link
Contributor Author

Interesting, I also read https://blogs.msdn.microsoft.com/twistylittlepassagesallalike/2011/04/23/everyone-quotes-command-line-arguments-the-wrong-way/ which doesn't mention empty arguments but does agree that parsing the command line is up to the process.
I tried one more thing, same repro.js as above but this time calling node directly:

C:\Users\alexeagle\Projects\repro_4778                   
λ node repro.js 1 2 3 ""                                 
[ 'C:\\cmder\\bin\\node.exe',                            
  'C:\\Users\\alexeagle\\Projects\\repro_4778\\repro.js',
  '1',                                                   
  '2',                                                   
  '3',                                                   
  '' ]                                                   
                                                         
C:\Users\alexeagle\Projects\repro_4778                   
λ node repro.js 1 2 3 "" ''                              
[ 'C:\\cmder\\bin\\node.exe',                            
  'C:\\Users\\alexeagle\\Projects\\repro_4778\\repro.js',
  '1',                                                   
  '2',                                                   
  '3',                                                   
  '',                                                    
  '\'\'' ]                                               
                                                         
C:\Users\alexeagle\Projects\repro_4778                   
λ node -v                                                
v8.11.1                                                  

As you can see, the double quotes do create an empty argument. So I guess the problem is related to how we spawn the nodejs process under Bazel.

@meteorcloudy
Copy link
Member

meteorcloudy commented Sep 19, 2018

@alexeagle You are absolutely correct! I finally found the bug.
Because we launch the nodejs binary through a bash script, and the bash script is again launched by the native binary launcher created by Bazel. We didn't parse empty argument correctly in the native launcher, see:

wstring GetEscapedArgument(const wstring& argument, bool escape_backslash) {

After adding

  if (argument.empty()) {
    return L"\"\"";
  }

The output is now correct:

$ bazel build //:my_rule
WARNING: The following rc files are no longer being read, please transfer their contents or import their path into one of the standard rc files:
C:\Users\pcloudy/.bazelrc
DEBUG: C:/users/pcloudy/_bazel_pcloudy/4isq3p4t/external/build_bazel_rules_nodejs/internal/common/check_bazel_version.bzl:46:5:
Current Bazel is not a release version, cannot check for compatibility.
DEBUG: C:/users/pcloudy/_bazel_pcloudy/4isq3p4t/external/build_bazel_rules_nodejs/internal/common/check_bazel_version.bzl:48:5: Make sure that you are running at least Bazel 0.5.4.
INFO: Analysed target //:my_rule (0 packages loaded).
INFO: Found 1 target...
INFO: From SkylarkAction params.txt:
WARNING: source-map-support module not installed.
   Stack traces from languages like TypeScript will point to generated .js files.

[ 'C:\\users\\pcloudy\\_bazel_pcloudy\\4isq3p4t\\external\\nodejs\\node.exe',
  '__main__/repro.js',
  '',
  'hello' ]

I'll send a fix for this issue.

@alexeagle
Copy link
Contributor Author

Awesome work again @meteorcloudy thank you!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants