Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

String arguments that contain dashes do not parse properly #145

Closed
andyburke opened this issue Jan 29, 2016 · 18 comments
Closed

String arguments that contain dashes do not parse properly #145

andyburke opened this issue Jan 29, 2016 · 18 comments
Labels

Comments

@andyburke
Copy link

Eg:

foo -a 'aaa' -b 'bbb' command "-- string with dashes in it --"

This will fall afoul of this case:

https://github.com/bcoe/yargs/blob/master/lib/parser.js#L103

I ran into this trying to encrypt a private ssh key using cryptex.

@andyburke
Copy link
Author

You can get around it with the bare -- option, but it was non-intuitive and took me quite a bit of digging in the debugger to understand what was happening.

@bcoe
Copy link
Member

bcoe commented Jan 29, 2016

@andyburke good find, would you like to submit a failing test?

@achingbrain
Copy link

achingbrain commented Sep 1, 2016

Here's one:

it('allows dashes in argument values', function () {
  var argv = yargs("foo --arg '--bar --baz'")
    .command('foo [options]', 'the foo command')
    .describe('arg', 'Should contain some values')
    .argv

  // fails with AssertionError: expected true to equal '--bar --baz'
  argv.should.have.property('arg').and.equal('--bar --baz')
})

@nklayman
Copy link

Going to go ahead and bump this. I could really use this feature as well, and it's been awhile since this was updates.

@masaeedu
Copy link

@bcoe Here's a failing test:

    it('parses positional arguments starting with a hyphen as positional arguments', () => {
      const raw = [
        'run',
        'foo',
        'bar',
        '-Baz quux'
      ]
      yargs
        .command({ command: 'run <foo> <bar> [args]', builder: ysub => ysub.strict().demandCommand(0, 0).options({}) })
        .strict()
        .demandCommand()
        .parse(raw)
    })

This fails (and apparently crashes the test runner) with:

_mocha run <foo> <bar> [args]

Options:
  --help      Show help                                                [boolean]
  --version   Show version number                                      [boolean]
  --reporter                                                   [default: "text"]

Unknown arguments: B, a, z

I'm not familiar enough with yargs to figure out how to express this as a testcase that results in a single testcase failure instead of crashing mocha.

@masaeedu
Copy link

The issue is actually arising in the separate yargs-parser library, so it should probably be addressed there. The argument string is tokenized correctly, but somewhere in the body of this loop (lines 147 to 305), the element corresponding to the contents of the quotes get broken apart and processed as options.

@tristanMatthias
Copy link

Hi guys, any update on this one? I'm running into this issue for a CLI I'm writing https://github.com/tristanMatthias/wc-sass-render/.

@bcoe bcoe transferred this issue from yargs/yargs Nov 8, 2018
@bcoe
Copy link
Member

bcoe commented Nov 8, 2018

@tristanMatthias @masaeedu I've transferred this issue to yargs-parser, would happily accept a patch if anyone feels like take a stab at the problem 👍

@tristanMatthias
Copy link

tristanMatthias commented Nov 9, 2018

@bcoe working on one now 😀

tristanMatthias added a commit to tristanMatthias/yargs-parser that referenced this issue Nov 9, 2018
@tristanMatthias
Copy link

tristanMatthias commented Nov 9, 2018

Ok @bcoe, I've submitted a pull request with the fix (#146). Please note the solution required a little workaround (detailed in the PR) in which I add an extra set of quotation marks (then remove them later in the parser). The reason for this is because previously the tokenizer would create an array with [..., '--flag', '-valueStartingWithDash'], and there was no way to determine if '-valueStartingWithDash' were flags or a string value (hence the extra wrapping). Let me know what you think!

@bcoe
Copy link
Member

bcoe commented Nov 9, 2018

@tristanMatthias @andyburke having thought on this a bit more, here's the ultimate problem:

node foo.js --hey "--test" '--test--'
[ '/Users/benjamincoe/.nvm/versions/node/v10.12.0/bin/node',
  '/private/tmp/foo.js',
  '--hey',
  '--test',
  '--test--' ]

the operating system strips " and ' before they're even passed in, so there is no way for the parser to end up with a string like "--foo".

perhaps a better solution would be introducing an escape character, e.g., \--hey, which would then stop parsing for whatever the next hyphenated option is.

@masaeedu
Copy link

masaeedu commented Nov 9, 2018

@bcoe I think the issue I'm having is a little different (maybe I need to open a different issue). When you pass something like the following: node foo.js --hey "-what -is -going -on", this will be parsed by the shell into ['node', 'foo.js', '--hey', '-what -is -going -on'] and then passed into node. When yargs looks at this list, it should interpret the array element after --hey as a positional argument instead of as a flag, because it is meaningless as a flag. I think if the regexen in the parser were anchored to the end they wouldn't accept -what -is -going -on as a valid flag, and so it would be treated positional argument as desired.

As far as escaping quotes goes, we can double up quotes so they survive the shell's expansion process and make it into node's argv. node foo.js '"-- string with dashes in it --"' will produce the argv ['node', 'foo.js', '"-- string with dashes in it --"'].

Then it's yargs-parser's responsibility to parse the argv element "-- string with dashes in it --" distinctly from -- string with dashes in it --.

@bcoe
Copy link
Member

bcoe commented Nov 9, 2018

@masaeedu aha! I'd missed this edge-case, interesting.

@tristanMatthias thanks for a first pass at the solution, I wonder if we can figure out a way to avoid the temporary addition of quotes? working in the parser feels like a tech interview question doesn't it?

@tristanMatthias
Copy link

It does indeed! Is there a way we can definitively differentiate between them with the quotes stripped? I thought about holding something in memory, but that doesn't work when it's passed from the Tokenizer to the Parser

@masaeedu
Copy link

@bcoe btw, it isn't the operating system that's stripping quotes; it will faithfully forward whatever gets passed into execvp (or whatever syscall was actually used to create the child process). The problem is our shells work with string command lines, but the system call for creating a new process requires an array of strings. So each shell has its own quirky way of tokenizing whatever string you type into it into a argument list, which it then uses to make the syscall. The syscall just forwards this information faithfully.

This gets even hairier on Windows where things are precisely backwards; the system call for creating a new process expects a string instead of a list of arguments, so a shell like Powershell or cmd or whatever will first take the string you passed into it, do some substitutions and expansions and whatnot, and then prepare a string to pass into the system call for creating a new process. The system call will provide Node with a big string, which then Node will parse (only only Windows), into an argv which is what we see when we console.log(process.argv).

I'm just braindumping all this stuff i've learned about the surprisingly convoluted world of command line parameters that I've learnt over the past couple of weeks, this may already have been common knowledge for you and @tristanMatthias.

@masaeedu
Copy link

The relevance of this stuff is that it's probably going to be pretty complicated to guess at what the original string and its quote wrapping was, because every shell can parse the string into an argv differently; whatever we get in the argv should be treated as the "source of truth". If a shell is stripping quotes before they ever make it into Node's argv, that should be avoided by using the shell's own rules for escaping and doubling up quotes.

@tristanMatthias
Copy link

Sounds like this should be the final answer to me. I agree with @masaeedu. For us to escape these issues, we can also use the = too! EG: cli --foo="-bar"

@bcoe
Copy link
Member

bcoe commented Jan 29, 2019

@andyburke @masaeedu @tristanMatthias please try:

+ yargs-parser@12.0.0

I think it solves a bunch of issues around quotes and dashes in strings \o/

@tristanMatthias thanks for helping with the initial work, it got us going in the right direction.

@bcoe bcoe closed this as completed Jan 29, 2019
markus456 added a commit to mariadb-corporation/MaxScale that referenced this issue Aug 12, 2021
The Yargs parser doesn't seem to play nicely with option-like arguments
that are passed as positional arguments. For more information, refer to
the following issues:

  yargs/yargs-parser#381
  yargs/yargs-parser#145
  yargs/yargs#1821
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants