Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ambiguous behavior #2

Closed
nhz-io opened this issue Oct 15, 2022 · 25 comments
Closed

Ambiguous behavior #2

nhz-io opened this issue Oct 15, 2022 · 25 comments

Comments

@nhz-io
Copy link
Contributor

nhz-io commented Oct 15, 2022

This is confusing

Case 1: as expected

$ node example/parse.js -a1
{ _: [], a: 1 }

Case 2: everything turned into boolean flags

$ node example/parse.js -a1b
{ '1': true, _: [], a: true, b: true }

Case 3: no more flags, just value again

node example/parse.js -a1b2
{ _: [], a: '1b2' }

Case 4: back to booleans

node example/parse.js -a1b2c
{ '1': true, '2': true, _: [], a: true, b: true, c: true }

So, when it ends with a letter - its booleans, and when it ends with a number its value
Any thoughts?

@shadowspawn
Copy link
Collaborator

There is a bug in the code detecting whether the remaining characters in the argument are a number. If the argument ends in a number it is treated as a number, even if there are letters before that.

The search /-?\d+(\.\d*)?(e-?\d+)?$/ is not anchored at the start and thinks 1b2 is a number because it ends in a number. So it treats -a1b2 like -a123 and assigns the "number" to the a option.

&& /-?\d+(\.\d*)?(e-?\d+)?$/.test(next)) {

@nhz-io
Copy link
Contributor Author

nhz-io commented Oct 15, 2022

So, the correct behavior would be to treat everything as boolean flags? (even for repeat sequence) for example :

-a1b22

should produce:

{ '2': true, _:[], a: true, b:true }

Is it correct?

(I am making a Go port of minimist and wouldnt want to port buggy behavior as well) Not anymore. I gave up :)

@shadowspawn
Copy link
Collaborator

shadowspawn commented Oct 15, 2022

No, the intended parsing would be to assign the number 22 to the option b.

Working through argument:

  • -a1b22: we have some short options
  • a + 1b22: we have an option a and the rest is not a number, set a to true
  • 1 + b22: we have an option 1, set option 1 to true
  • b + 22: we have an option b and the rest is a number, set option b to 22
-a1b22

should produce:

{ '1': true, _: [], a: true, b: 22 }

@shadowspawn
Copy link
Collaborator

For comparison, yargs also started from the Optimist code and does not have this issue. Here is the commit with the fix adding the anchor to match the start of the string:

@nhz-io
Copy link
Contributor Author

nhz-io commented Oct 15, 2022

Thank you very much for the assist!

@ljharb
Copy link
Member

ljharb commented Oct 15, 2022

Thanks, a PR would be most appreciated.

@nhz-io
Copy link
Contributor Author

nhz-io commented Oct 15, 2022

Now, another discrepancy

minimist

-1
{  '1': true, _: [] }

yargs-parser

-1
{ _: [ -1 ] }

Which one is correct? (I prefer minimist handling over yargs-parser tbh)

@shadowspawn
Copy link
Collaborator

shadowspawn commented Oct 15, 2022

In this case the parsing of -1 as an option is intended behaviour. Utilities can use a digit as a short option. For example:

ls -1

This is tested here:

parse([ '-123', '456' ]),

@nhz-io
Copy link
Contributor Author

nhz-io commented Oct 15, 2022

And another one

minimist doesn't handle correctly:

-a=b=c
{ _: [], a: 'b' }

yargs-parser does handle correctly

-a=b=c
{ _: [], a: 'b=c' }

@nhz-io nhz-io changed the title Ambigous behavior Ambiguous behavior Oct 15, 2022
@shadowspawn
Copy link
Collaborator

Please open -a=b=c as a new issue so it can be tracked separately.

(For interest, I fixed a similar bug in early iteration of parseArgs! pkgjs/parseargs#43)

@nhz-io
Copy link
Contributor Author

nhz-io commented Oct 15, 2022

Another one, this one seems intentional

Both minimist and yargs-parser:

-a-b-c
{ _: [], a: '-b-c' }

And in general any first non-alphanumeric occurrence after alphanumeric will be treated as option value.

@nhz-io
Copy link
Contributor Author

nhz-io commented Oct 15, 2022

And no unicode support either

Both minimist and yargs-parser:

-абв
{ _: [], 'а': 'бв' }

@nhz-io
Copy link
Contributor Author

nhz-io commented Oct 16, 2022

And another edge-case

Both minimist and yargs-parser:

-=1
{ '1': true, _: [], '=': true }

@shadowspawn
Copy link
Collaborator

-a-b-c

I think this is by design.

-абв

This falls under the non-alphanumeric handling of the first case, and probably as intended.

-=1

Not sure this is intentional, but the first character after the - is assumed to be an option. Likewise for -+.

@nhz-io
Copy link
Contributor Author

nhz-io commented Oct 19, 2022

Consider the following case:

console.log(require('minimist')(["--no-foo"], {string: ["foo"]}))

Will output:

{ _: [], foo: false }

Getting boolean, but we have it specified as a string.

Comes from here:

minimist/index.js

Lines 157 to 159 in 62fde7d

} else if ((/^--no-.+/).test(arg)) {
key = arg.match(/^--no-(.+)/)[1];
setArg(key, false, arg);

@shadowspawn
Copy link
Collaborator

I think setting false when --no-foo is specified is probably a feature rather than a bug.

This allows the author to tell the difference between --foo==string and --foo= and --no-foo.

However, I don't think there is a "right" answer as the user did specify string as you noted so there is a counter argument!

In the absence of a clear bug, I support sticking with the existing behaviour. It does not currently appear in the README or the tests.

@nhz-io
Copy link
Contributor Author

nhz-io commented Oct 19, 2022

Another edge-case, looks like a bug:

console.log(require('minimist')(['--bool', '--str', 'foobar'], {
  alias:{str:['bool']}, string:["str"], boolean: true
}))

Results in:

{ _: [ 'foobar' ], bool: [ '', '' ], str: [ '', '' ] }

However, when using explicit assignment, the result is more sensible:

console.log(require('minimist')(['--str=foobar', '--bool'], {
  alias:{str:['bool']}, string:["str"], boolean: true
}))

Results in:

{ _: [], str: [ 'foobar', '' ], bool: [ 'foobar', '' ] }

@shadowspawn
Copy link
Collaborator

I think this is likely as intended:

  • boolean: true means "treat all double hyphenated arguments without equal signs as boolean", so the option does not take an option value from the following argument
  • opts.string means always treat as strings, which in particular disables the automatic conversion of numbers
  • auto arrays

So --str foobar is parsed with --str as a lone option, and the "string" conversion means the value is stored as a string rather than as true.

Several subtle behaviours in combination!

@nhz-io
Copy link
Contributor Author

nhz-io commented Oct 19, 2022

Yes, but there is an inconsistency between two cases. The one with explicit = assignment and the other one with positional even though they technically mean the same thing

@shadowspawn
Copy link
Collaborator

They don't mean the same thing when boolean:true. Quoting the whole README item:

  • opts.boolean - a boolean, string or array of strings to always treat as
    booleans. if true will treat all double hyphenated arguments without equal signs
    as boolean (e.g. affects --foo, not -f or --foo=bar)

@nhz-io
Copy link
Contributor Author

nhz-io commented Oct 20, 2022

Yeah, the way you frame it is correct, but the output shows that in both cases both args are not treated as booleans, both time they are treated as strings, but with inconsistent value depending on how you pass your values

@shadowspawn
Copy link
Collaborator

I'm comfortable with the current behaviour as being reasonable, but it will be a while before I can double-check my understanding and put a story together.

With the input configuration specifying both opts.boolean:true and opts.string the result is almost by definition ambiguous, so that does fit under the issue title of "Ambiguous behaviour", whether or not it is intended. 😄

@nhz-io
Copy link
Contributor Author

nhz-io commented Oct 20, 2022

What i meant by ambiguity is the inconsistent results, whenever the similar arguments are passed.
In both cases the result is an array of strings, but in one case its ['', ''] and in the other its ['foobar', ''] even though both invocation are semantically similar

@shadowspawn
Copy link
Collaborator

A parser has to decide how to process --foo bar. The default approach in minimist is that options are assumed to take an option-argument if one is available, so this is by default parsed like --foo=bar.

The boolean configuration option changes this default behaviour. A boolean option is assumed not to take an option argument, so --foo bar is parsed like bar --foo.

An argument with an explicit = like --foo=bar is parsed as an option and its value, whatever the configuration.

@ljharb
Copy link
Member

ljharb commented Mar 2, 2024

I'm not certain there's anything further to discuss here, so I'll close this. Please file a new issue (one per thing, please) if I'm mistaken!

@ljharb ljharb closed this as completed Mar 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants