OptionParser rewrite with subcommands #9009

RX14 · 2020-04-05T23:47:48Z

I got angry with OptionParser for being a pile of bad code, so I rewrote the implementation. It's simpler now, easier to understand, and no longer O(n^2), so it's probably miles faster. The interface is completely compatible: no changes were required to any existing usages across the stdlib.

While I was rewriting it I had a universe brain moment and realised that I could make it handle subcommands very simply, so I threw that in as a bonus.

Example subcommand usage:

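require "option_parser"

# Declared outside the blocks so the values are visible after parsing.
args = ARGV
foo = nil
bar = z = verbose = false

OptionParser.parse(args) do |opts|
  opts.on("subcommand", "Description") do
    opts.on("--foo arg", "Foo") { |v| foo = v }
    opts.on("--bar", "Bar") { bar = true }
    opts.on("-z", "--baz", "Baz") { z = true }
  end
  opts.on("--verbose", "") { verbose = true }
end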

There are still some tweaks required to get the to_s output to not suck when using subcommands, but I wanted to submit this for comment before I forgot about it tomorrow morning.

Fixes #8937.

RX14 · 2020-04-05T23:48:30Z

spec/std/option_parser_spec.cr

-  it "has required option with = (3) raises" do
-    expect_missing_option ["--flag="], "--flag=FLAG", "--flag"
+  it "has required option with = (3) handles empty" do
+    expect_capture_option ["--flag="], "--flag=FLAG", ""


I changed the behaviour of OptionParser here because I thought this behaviour was incorrect.

oprypin · 2020-04-06T08:59:09Z

I think this fixes #8937

RX14 · 2020-04-06T09:12:01Z

Yeah, me too

I'll add a spec.

straight-shoota · 2020-04-06T10:57:08Z

Awesome!

This might also make it possible to implement #5338 (could be a follow-up though).

RX14 · 2020-04-06T11:11:44Z

Ah, that seems rather difficult with the current code because there's no way to "peek" at the next argument and see if it's valid or not.

I might just get rid of the "iterator" business and use indexing in the array, then.

straight-shoota · 2020-04-06T11:42:07Z

I suppose it might be best to support only --optional-flag=VALUE or --optional-flag (no VALUE), and not --optional-flag VALUE, because the latter is always kind of ambiguous, even with a fixed set of values, and it can't handle arbitrary values at all.

RX14 · 2020-04-06T11:44:51Z

It's definitely doable to have a "next argument doesn't start with -" heuristic.

I switched the code to use array indexing and it's neater anyway.
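
As a rough illustration of the "next argument doesn't start with -" heuristic discussed here, the following is a minimal sketch of peeking by array index. The helper name optional_value is hypothetical and is not part of OptionParser; it only shows the idea, not how the PR implements anything.

```crystal
# Hypothetical helper (not part of OptionParser): returns the value for an
# optional flag located at `index`, or nil when there is no next argument
# or the next argument looks like another flag.
def optional_value(args : Array(String), index : Int32) : String?
  candidate = args[index + 1]?
  if candidate && !candidate.starts_with?('-')
    candidate
  else
    nil
  end
end

p optional_value(["--color", "always"], 0)    # => "always"
p optional_value(["--color", "--verbose"], 0) # => nil
p optional_value(["--color"], 0)              # => nil
```

Note that this heuristic would still misread a value that legitimately begins with a dash, which is the ambiguity raised earlier in the thread.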

Blacksmoke16 · 2020-04-06T13:17:12Z

Thoughts on allowing multiple shortflags to be grouped?

require "option_parser"

OptionParser.parse do |opts|
  opts.on("-f", "Prints foo") { pp "foo" }
  opts.on("-b", "Prints bar") { pp "bar" }
end

Works:
./test -f -b

Does not work:
./test -fb

RX14 · 2020-04-06T13:49:29Z

@Blacksmoke16 I thought about adding that feature, but I'd rather leave it for another PR.

Blacksmoke16 · 2020-04-06T22:07:41Z

Would also close #3357.

Another thought I had, probably better for another PR: what about parsing options with more than one value? An example of this is the --arg name value within https://stedolan.github.io/jq/manual/.

oprypin · 2020-04-06T22:34:29Z

@Blacksmoke16 It is indeed unrelated and is also not part of any spec or common use.

oprypin · 2020-04-06T22:43:14Z

I wrote this reply before realizing that something else was being asked, but

if one wants to support an arbitrary number of values, it should definitely not be done as
--stuff foo bar baz (non-compliant and very ambiguous in terms of what can follow it)
but rather --stuff foo --stuff=bar --stuff=baz,
or, as a shell would let you write, --stuff={foo,bar,baz}.

And this is already supported, just push it to an array:

Code

require "option_parser"

argv = ["--stuff", "foo", "--stuff=bar"]

stuff = [] of String

OptionParser.parse(argv) do |parser|
  parser.on("--stuff=NAME", "") { |name| stuff << name }
end

p! stuff # => ["foo", "bar"]

Blacksmoke16 · 2020-04-06T23:13:49Z

@oprypin I wasn't imagining supporting an arbitrary number of values, just a way to handle a fixed number that is more than one.

Like parser.on("--arg NAME VALUE", "Add an arg") { |name, value| # Do whatever }

oprypin · 2020-04-06T23:15:22Z

I wrote this reply before realizing that something else was being asked, but

straight-shoota · 2020-04-06T23:17:49Z

@Blacksmoke16 IMO that looks like a very uncommon feature. I'd rather not have such specialized things in OptionParser.

Btw. why not just use --arg NAME=VALUE for example? Would be much more convenient IMO.
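
For illustration, the NAME=VALUE approach can be sketched with the existing API by splitting the single value by hand. The placeholder PAIR in the flag definition and the vars hash are assumptions made for this example only.

```crystal
require "option_parser"

vars = {} of String => String

# "PAIR" is just a placeholder name in the flag definition; the NAME=VALUE
# convention is an assumption of this sketch and is split manually.
OptionParser.parse(["--arg", "name=alice", "--arg=greeting=hello"]) do |parser|
  parser.on("--arg PAIR", "Define a variable as NAME=VALUE") do |pair|
    # String#partition splits at the first "=" only.
    name, _, value = pair.partition('=')
    vars[name] = value
  end
end

p vars # => {"name" => "alice", "greeting" => "hello"}
```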

Blacksmoke16 · 2020-04-06T23:55:08Z

Oh I don't really have a use case. Was just something I noticed in jq.

sudo-nice · 2020-04-08T17:45:35Z

Thank you for the great improvement of the OptionParser!
Will those sticky arguments (like command -ufoo -tbar) still be working after the update?

Currently working example

require "option_parser"

ary = [] of String
options = %w[-ufoo -tbar]

OptionParser.parse(options) do |p|
  p.on("-u FOO", "Description") { |o| ary << o }
  p.on("-t FOO", "Description") { |o| ary << o }
end

pp ary # => ["foo", "bar"]

oprypin · 2020-04-08T17:49:26Z

You can just copy paste the file before your example to try.

Yes, it works before and after.

RX14 · 2020-04-09T18:39:32Z

I've just added commits to unregister subcommands when they're called. This means that when you call a subcommand foo, all subcommands are unregistered, so foo foo doesn't handle foo twice.
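
A small sketch of the behaviour described above, assuming the released OptionParser API: the second foo is no longer a registered subcommand, so it is left in args as an ordinary unhandled argument. The counter calls is only there for the demonstration.

```crystal
require "option_parser"

args = ["foo", "foo", "--verbose"]
calls = 0
verbose = false

OptionParser.parse(args) do |parser|
  # A name without a leading dash is treated as a subcommand.
  parser.on("foo", "Run the foo subcommand") { calls += 1 }
  parser.on("--verbose", "Enable verbose output") { verbose = true }
end

p calls   # => 1
p verbose # => true
p args    # => ["foo"]
```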

I've also added two new features: OptionParser#stop and OptionParser#every(&). The latter takes a callback to run on every single argument before it's parsed. This is necessary because performing a map or each on ARGV wouldn't be able to tell the difference between flags and their arguments, whereas every is not passed flag arguments. OptionParser#stop allows you to bail out of parsing at any point. The specs for all of these showcase them well.
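
A minimal sketch of the two features together, written against the name the callback ended up with later in this thread (before_each rather than every). The "END" sentinel is made up for this example and is not an OptionParser convention.

```crystal
require "option_parser"

args = ["--name", "alice", "-v", "END", "ignored"]
seen = [] of String
name = ""
verbose = false

OptionParser.parse(args) do |parser|
  parser.on("--name NAME", "Set the name") { |value| name = value }
  parser.on("-v", "Enable verbose output") { verbose = true }

  # Runs before each argument is parsed; flag values such as "alice"
  # are not passed to it.
  parser.before_each do |arg|
    if arg == "END"
      parser.stop # bail out; nothing after this point is parsed
    else
      seen << arg
    end
  end
end

p seen    # => ["--name", "-v"]
p name    # => "alice"
p verbose # => true
```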

This PR is still WIP because I need to fix the OptionParser documentation, but I'd like to get feedback on the current code before doing that.

straight-shoota

I love it =)

RX14 · 2020-04-10T12:31:25Z

every has been renamed to before_each and documentation is complete. This is ready for final review then squish.

straight-shoota · 2020-04-10T14:49:39Z

I'm not entirely sure about using the same method for defining subcommands and flags when the only difference is whether the string starts with a dash.
Maybe a different method name for subcommands would be better, to more clearly communicate the different concepts. For example opts.subcommand("subcommand", "") do.

RX14 · 2020-04-10T16:25:59Z

I mean, we could have different methods for short and long flags because they have different behaviours too.

I think this is fine.

asterite · 2020-04-14T20:18:29Z

Do you think there's anything wrong with explicitly using clone if you want to reuse an instance?

Yes, it's unexpected. OptionParser defines how options should be parsed. I find it pretty strange that if I use it twice, it behaves differently.

asterite · 2020-04-14T20:19:06Z

Crystal is not a functional programming language, but we tend to use and like purity. That is, methods shouldn't have unexpected side effects.

RX14 · 2020-04-16T17:28:18Z

I made #parse restore state on exit.

I tried going the route of cloning the OptionParser and creating a DSL like @didactic-drunk suggested, but it's not neat to synchronise @stop, the @before_each handlers, etc. between the original instance which is passed to the OptionParser.new do |parser| block and the instance which is passed to parser.subcommand("foo", "") do |parser|. Basically you need to unify all the instance variables between the two except @flags and @handlers, which is much uglier than what I just committed.

didactic-drunk · 2020-04-16T17:46:55Z

Can you have an additional OptionParser per subcommand and start parsing ARGV from the leftovers? Doesn't the parent OptionParser stop when a subcommand is reached (eliminating the @stop problem)? Perhaps @asterite's state suggestion would solve the issue?

def parse(args = ARGV, state = State.new)

straight-shoota

I'd rather merge this without the last commit, but I'm fine with it.
I'm still convinced this is completely unnecessary and more a maintenance burden than anything useful.
I went through 14 pages of search results for OptionParser.new on GitHub and couldn't find a single shard that re-uses an OptionParser instance. Only in two or three cases it would be technically possible from the code to parse multiple times, but that's in CLI builders that wrap OptionParser. I'm sure no application would use it that way.

asterite · 2020-04-16T18:00:47Z

I guess with that proof we can go ahead without the last commit, and add it later if needed.

RX14 · 2020-04-16T19:03:39Z

@didactic-drunk in short, no.

RX14 · 2020-04-16T19:04:27Z

We might as well leave the last commit. Why not?

Sija · 2020-04-16T20:34:46Z

src/option_parser.cr

+  ensure
+    @flags = old_flags.not_nil!
+    @handlers = old_handlers.not_nil!
+    @stop = false
+    @banner = old_banner
+    @unknown_args = old_unknown_args
+    @missing_option = old_missiong_option.not_nil!
+    @invalid_option = old_invalid_option.not_nil!
+    @before_each = old_before_each


Wouldn't it be better to use a begin ... end block instead of using .not_nil!?

RX14 · 2020-04-16

I thought about it but I don't think it's worth the extra indent. I prefer it this way.

oprypin · 2020-04-16

OK, then how about clumping it all into one variable (tuple? record? maybe store the whole state in a sub-record in the first place?)

straight-shoota · 2020-04-17

It should be either as is or without the reset.

Sija · 2020-04-17

@RX14 Using .not_nil! should be treated as a last resort, so if there's a way to avoid it, even at the "cost" of an extra indent, then I'd say it's worth it.

asterite · 2020-04-17

Just put the begin after you save all those variables. I agree with Sija.
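
To make the suggestion concrete, here is a minimal sketch of saving state before begin so that ensure needs no .not_nil!. TinyParser and with_saved_flags are hypothetical stand-ins reduced to a single piece of state; this is not the code that was merged.

```crystal
# Hypothetical stand-in for OptionParser, reduced to one piece of state.
class TinyParser
  getter flags = [] of String

  def with_saved_flags
    # Assigned before `begin`, so the compiler knows `old_flags` is never
    # nil inside `ensure` and no `.not_nil!` is needed.
    old_flags = @flags.dup
    begin
      yield self
    ensure
      @flags = old_flags
    end
  end
end

parser = TinyParser.new
parser.flags << "--verbose"
parser.with_saved_flags { |tiny| tiny.flags << "--temporary" }
p parser.flags # => ["--verbose"]
```

The trade-off mentioned in the thread is visible here: the body gains one level of indentation in exchange for dropping the nil assertions.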

RX14 · 2020-04-17T09:54:04Z

I'm really ambivalent about whether to leave the last commit in or not. Since it's less work to do so, I'll leave it in.

I'll merge this today if there are no more API-level objections.

Sija · 2020-04-17T11:59:38Z

src/option_parser.cr

+    old_handlers = @handlers.clone
+    old_banner = @banner
+    old_unknown_args = @unknown_args
+    old_missiong_option = @missing_option


Suggested change

-    old_missiong_option = @missing_option
+    old_missing_option = @missing_option

waj · 2020-04-17T14:08:36Z

@RX14 thanks for working on this. I agree that the current OptionParser is badly designed and buggy. This implementation is better than the one we currently have, and from that perspective I have no objections to merging this PR.

Now, I wouldn't consider this implementation "good code". I don't need to check usage stats to understand that the approach used for subcommands is not well designed. Mutating the parser is unnecessary, and TBH relying on string comparison with a specific number of spaces from a formatted text to do so makes me cringe.

I think it could actually just store the options in a more structured form, format those options only when needed (on --help), and filter based on those structures, not strings.

Also, I wish the subcommands implementation could work somehow with separate OptionParser instances. That way subcommands could be defined in separate files, or make them join/split to different processes more easily. For example, take a look at how it's currently used for Crystal compiler subcommands, and think how this new implementation would fit there.

I could understand that you don't feel like doing it right now, but I wouldn't say this PR is the ultimate design and good code that we could be looking for.

didactic-drunk · 2020-04-17T23:00:34Z

Also, I wish the subcommands implementation could work somehow with separate OptionParser instances. That way subcommands could be defined in separate files, or make them join/split to different processes more easily.

Should subcommands be marked experimental? Should opts.on be opts.subcommand or some other method to make the meaning clear?

RX14 · 2020-04-18T12:46:02Z

I could understand that you don't feel like doing it right now, but I wouldn't say this PR is the ultimate design and good code that we could be looking for.

On that, I totally agree. The --help text generation needs to be refactored. The rest isn't so bad, but the subcommands implementation details aren't so good either. I still like the API, though I recognise it has a few warts. But this is still better than before, which had an unreadable, slow, buggy, inflexible implementation on top of that.

If you'd like to redesign this a bit more I'd love to see your ideas!

OptionParsers are now reusable

Fixes crystal-lang#8937

RX14 force-pushed the feature/option-parser-rewrite branch from be69622 to dd59847 Compare April 6, 2020 11:53

straight-shoota approved these changes Apr 6, 2020

straight-shoota reviewed Apr 9, 2020

RX14 added kind:feature topic:stdlib labels Apr 9, 2020

RX14 changed the title from "[WIP] OptionParser rewrite with subcommands" to "OptionParser rewrite with subcommands" Apr 10, 2020

RX14 force-pushed the feature/option-parser-rewrite branch from 5aa0451 to a0b5595 Compare April 10, 2020 12:30

RX14 requested a review from straight-shoota April 12, 2020 15:20

straight-shoota approved these changes Apr 16, 2020

Sija reviewed Apr 16, 2020

Sija reviewed Apr 17, 2020

RX14 force-pushed the feature/option-parser-rewrite branch 2 times, most recently from 1ea4f06 to 91b8dda Compare April 18, 2020 12:53

RX14 added 6 commits April 18, 2020 13:55

Simplify OptionParser implementation

6c72b9e

Fixes crystal-lang#8937

Add subcommand support to OptionParser

a8dd378

Add OptionParser#stop to stop parsing

0646856

Add OptionParser callback to run on every argument

62d7f59

Clean up OptionParser documentation

88eee73

Restore OptionParser mutable state on parse end

9d7f128

RX14 force-pushed the feature/option-parser-rewrite branch from 91b8dda to 9d7f128 Compare April 18, 2020 13:00

RX14 merged commit f4f052a into crystal-lang:master Apr 18, 2020

RX14 added this to the 0.35.0 milestone Apr 18, 2020

Sija mentioned this pull request Apr 19, 2020

Refactor preserving state out of OptionParser#parse #9133

Merged

bcardiff mentioned this pull request Jun 26, 2020

The behavior of OptionParser is different between versions 0.34.0 and 0.35.0 and later. #9553

Closed

makenowjust mentioned this pull request Jul 12, 2020

OptionParser: don't call handler if value is given to none value handler #9603

Merged

straight-shoota mentioned this pull request Dec 1, 2020

Handle subcommands (without dashes) in OptionParser #3357

Closed
