Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OptionParser rewrite with subcommands #9009

Merged
merged 6 commits into from
Apr 18, 2020

Conversation

RX14
Copy link
Contributor

@RX14 RX14 commented Apr 5, 2020

I got angry with OptionParser for being a pile of bad code so I rewrote the implementation. It's simpler now, and easier to understand the implementation, and not O(n^2) so probably miles faster. And it has a completely compatible interface, no changes were required to any existing usages across the stdlib.

While I was rewriting it I had a universe brain moment and realised that I could make it handle subcommands very simply, so I threw that in as a bonus.

Example subcommand usage:

OptionParser.parse(args) do |opts|
  opts.on("subcommand", "Description") do
    opts.on("--foo arg", "Foo") { |v| foo = v }
    opts.on("--bar", "Bar") { bar = true }
    opts.on("-z", "--baz", "Baz") { z = true }
  end
  opts.on("--verbose", "") { verbose = true }
end

There's some tweaks still required to get the to_s output to not suck when using subcommands but I wanted to submit this for comment before I forgot about it tomorrow morning.

Fixes #8937.

it "has required option with = (3) raises" do
expect_missing_option ["--flag="], "--flag=FLAG", "--flag"
it "has required option with = (3) handles empty" do
expect_capture_option ["--flag="], "--flag=FLAG", ""
Copy link
Contributor Author

@RX14 RX14 Apr 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the behaviour of OptionParser here because I thought this behaviour was incorrect.

@oprypin
Copy link
Member

oprypin commented Apr 6, 2020

I think this fixes #8937

@RX14
Copy link
Contributor Author

RX14 commented Apr 6, 2020

Yeah, me too

I'll add a spec.

@straight-shoota
Copy link
Member

Awesome!

This might also enable to implement #5338 (could be a follow-up though).

@RX14
Copy link
Contributor Author

RX14 commented Apr 6, 2020

Ah, that seems rather difficult with the current code because there's no way to "peek" at the next argument and see if it's valid or not

I might just get rid of the "iterator" business and use indexing in the array, then.

@straight-shoota
Copy link
Member

I suppose it might be best to support only --optional-flag=VALUE or --optional-flag (no VALUE), an not --optional-flag VALUE because the latter is always kind of ambiguous, even with a fixed set of values and it can't handle arbitrary values at all.

@RX14
Copy link
Contributor Author

RX14 commented Apr 6, 2020

It's definitely doable to have a "next argument doesn't start with -" heuristic.

I switched the code to use array indexing and it's neater anyway.

@RX14 RX14 force-pushed the feature/option-parser-rewrite branch from be69622 to dd59847 Compare April 6, 2020 11:53
@Blacksmoke16
Copy link
Member

Thoughts on allowing multiple shortflags to be grouped?

require "option_parser"

OptionParser.parse do |opts|
  opts.on("-f", "Prints foo") { pp "foo" }
  opts.on("-b", "Prints bar") { pp "bar" }
end

Works:
./test -f -b

Does not work:
./test -fb

@RX14
Copy link
Contributor Author

RX14 commented Apr 6, 2020

@Blacksmoke16 I thought about adding that feature, but I'd rather leave it for another PR.

@Blacksmoke16
Copy link
Member

Would also close #3357.

Another thought I had, probably better for another PR; but what about parsing options with more than one value? An example of this is the --arg name value within https://stedolan.github.io/jq/manual/.

@oprypin
Copy link
Member

oprypin commented Apr 6, 2020

@Blacksmoke16 It is indeed unrelated and is also not part of any spec or common use.

@oprypin
Copy link
Member

oprypin commented Apr 6, 2020

I wrote this reply before realizing that something else was being asked, but

if one wants to support an arbitrary number of values, it should definitely not be done as
--stuff foo bar baz (non-compliant and very ambiguous in terms of what can follow it)
but rather --stuff foo --stuff=bar --stuff=baz,
or, as a shell would let you write, --stuff={foo,bar,baz}.

And this is already supported, just push it to an array:

Code
require "option_parser"

argv = ["--stuff", "foo", "--stuff=bar"]

stuff = [] of String

OptionParser.parse(argv) do |parser|
  parser.on("--stuff=NAME", "") { |name| stuff << name }
end

p! stuff # => ["foo", "bar"]

@Blacksmoke16
Copy link
Member

@oprypin I wasn't imagining supporting an arbitrary number of values, just a way to handle a fixed number that is more than one.

Like parser.on("--arg NAME VALUE", "Add an arg") { |name, value| # Do whatever }

@oprypin
Copy link
Member

oprypin commented Apr 6, 2020

I wrote this reply before realizing that something else was being asked, but

@straight-shoota
Copy link
Member

@Blacksmoke16 IMO that looks like a very uncommon feature. I'd rather not have such specialized things in OptionParser.

Btw. why not just use --arg NAME=VALUE for example? Would be much more convenient IMO.

@Blacksmoke16
Copy link
Member

Oh I don't really have a use case. Was just something I noticed in jq.

@sudo-nice
Copy link

Thank you for the great improvement of the OptionParser!
Will those sticky arguments (like command -ufoo -tbar) still be working after the update?

Currently working example
require "option_parser"

ary = [] of String
options = %w[-ufoo -tbar]

OptionParser.parse(options) do |p|
  p.on("-u FOO", "Description") { |o| ary << o }
  p.on("-t FOO", "Description") { |o| ary << o }
end

pp ary # => ["foo", "bar"]

@oprypin
Copy link
Member

oprypin commented Apr 8, 2020

You can just copy paste the file before your example to try.

Yes, it works before and after.

@RX14
Copy link
Contributor Author

RX14 commented Apr 9, 2020

I've just added commits to unregister subcommands when they're called, this means that when you call a subcommand foo, all subcommands are unregistered, so foo foo doesn't handle foo twice.

I've also added two new features: OptionParser#stop and OptionParser#every(&). The latter takes a callback to run on every single argument before it's parsed. This is neccesary, because performing a map or each on ARGV wouldn't be able to tell the difference between flags and their arguments, whereas every is not passed flag arguments. Then OptionParser#stop allows you to bail out of parsing at any point. Specs for all of these showcase them well.

This PR is still WIP because I need to fix the OptionParser documentation, but I'd like to get feedback on the current code before doing that.

Copy link
Member

@straight-shoota straight-shoota left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love it =)

src/option_parser.cr Show resolved Hide resolved
spec/std/option_parser_spec.cr Outdated Show resolved Hide resolved
@RX14 RX14 changed the title [WIP] OptionParser rewrite with subcommands OptionParser rewrite with subcommands Apr 10, 2020
@RX14 RX14 force-pushed the feature/option-parser-rewrite branch from 5aa0451 to a0b5595 Compare April 10, 2020 12:30
@RX14
Copy link
Contributor Author

RX14 commented Apr 10, 2020

every has been renamed to before_each and documentation is complete. This is ready for final review then squish.

@straight-shoota
Copy link
Member

I'm not entirely sure about using the same method for defining a subcommands and flags when the only difference is whether the string starts with a dash.
Maybe a different method name for sucommand would be better to more clearly communicate the different concepts. For example opts.subcommand("subcommand", "") do.

@RX14
Copy link
Contributor Author

RX14 commented Apr 10, 2020

I mean, we could have different methods for short and long flags because they have differet behaviours too.

I think this is fine.

@asterite
Copy link
Member

Do you think there's anything wrong with explicitly using clone if you want to reuse an instance?

Yes, it's unexpected. OptionParser defines how options should be parsed. I find it pretty strange that if I use it twice, it behaves differently.

@asterite
Copy link
Member

Crystal is not a functional programming language, but we tend to use and like purity. That is, methods shouldn't have unexpected side effects.

@RX14
Copy link
Contributor Author

RX14 commented Apr 16, 2020

I made #parse restore state on exit.

I tried going with the route of cloning the OptionParser and creating a DSL like @didactic-drunk suggested, but it's not neat to synchronise @stop, and the @before_each handlers etc. between the original instance which is passed to the OptionParser.new do |parser| block, and the instance which is passed to parser.subcommand("foo", "") do |parser|. Basically you need to unify all the instance variables between the two except @flags and @handlers, which is much uglier than what I just comitted.

@didactic-drunk
Copy link
Contributor

didactic-drunk commented Apr 16, 2020

Can you have an additional OptionParser per subcommand and start parsing ARGV from the leftovers? Doesn't the parent OptionParser stop when a subcommand is reached (eliminating the @stop problem)? Perhaps @asterite's state suggestion would solve the issue?

def parse(args = ARGV, state = State.new)

Copy link
Member

@straight-shoota straight-shoota left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather merge this without the last commit, but I'm fine with it.
I'm still convinced this is completely unnecessary and more a maintenance burdon than anything useful.
I went through 14 pages of search results for OptionParser.new on GitHub and couldn't find a single shard that re-uses an OptionParser instance. Only in two or three cases it would be technically possible from the code to parse multiple times, but that's in CLI builders that wrap OptionParser. I'm sure no application would use it that way.

@asterite
Copy link
Member

I guess with that proof we can go ahead without the last commit, and add it later if needed.

@RX14
Copy link
Contributor Author

RX14 commented Apr 16, 2020

@didactic-drunk in short, no.

@RX14
Copy link
Contributor Author

RX14 commented Apr 16, 2020

We might as well leave the last commit. Why not?

Comment on lines 428 to 436
ensure
@flags = old_flags.not_nil!
@handlers = old_handlers.not_nil!
@stop = false
@banner = old_banner
@unknown_args = old_unknown_args
@missing_option = old_missiong_option.not_nil!
@invalid_option = old_invalid_option.not_nil!
@before_each = old_before_each
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't be better to use begin ... end block instead of using .not_nil!?

Copy link
Contributor Author

@RX14 RX14 Apr 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about it but I don't think it's worth the extra indent. I prefer it this way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK then how about clumping it all into one variable (tuple? record? maybe store the whole state in a sub-record in the first place?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be either as is or without the reset.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RX14 Using .not_nil! should be treated as a last resort, so if there's a way to avoid it, even at the "cost" of an extra indent, then I'd say it's worth it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just put the begin after you save all those variables. I agree with Sija.

@RX14
Copy link
Contributor Author

RX14 commented Apr 17, 2020

I am really ambivalent about leaving the last commit in or not, since it's less work to do so, I'll leave it in.

I'll merge this today if there's no more API-level objections.

old_handlers = @handlers.clone
old_banner = @banner
old_unknown_args = @unknown_args
old_missiong_option = @missing_option
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
old_missiong_option = @missing_option
old_missing_option = @missing_option

@waj
Copy link
Member

waj commented Apr 17, 2020

@RX14 thanks for working on this. I agree that the current OptionParser is bad designed and buggy. This implementation is better than the one we currently have and from that perspective I have no objections to merge this PR.

Now, I wouldn't consider this implementation "good code". I don't need to check usage stats to understand that the approach used for subcommands is not well designed. Mutating the parser is unnecessary and TBH relying on string comparison with a specific number of spaces from a formatted text to do so, makes me cringe.

I think it could actually just make it store the options more structured, format those options just when it's needed (on --help) and filter based on those structures, not strings.

Also, I wish the subcommands implementation could work somehow with separate OptionParser instances. That way subcommands could be defined in separate files, or make them join/split to different processes more easily. For example, take a look at how it's currently used for Crystal compiler subcommands, and think how this new implementation would fit there.

I could understand that you don't feel like doing it right now, but I wouldn't say this PR is the ultimate design and good code that we could be looking for.

@didactic-drunk
Copy link
Contributor

Also, I wish the subcommands implementation could work somehow with separate OptionParser instances. That way subcommands could be defined in separate files, or make them join/split to different processes more easily.

Should subcommands be marked experimental? Should opts.on be opts.subcommand or some other method to make the meaning clear?

@RX14
Copy link
Contributor Author

RX14 commented Apr 18, 2020

I could understand that you don't feel like doing it right now, but I wouldn't say this PR is the ultimate design and good code that we could be looking for.

On that, I totally agree. The --help text generation needs to be refactored. The rest isn't so bad, but the subcommands implementation details aren't so good either. I still like the API, though I recognise it has a few warts. But this is still better than before, which had an unreadable, slow, buggy, inflexible implementation on top of that.

If you'd like to redesign this a bit more I'd love to see your ideas!

@RX14 RX14 dismissed asterite’s stale review April 18, 2020 12:48

OptionParsers are now reusable

@RX14 RX14 force-pushed the feature/option-parser-rewrite branch 2 times, most recently from 1ea4f06 to 91b8dda Compare April 18, 2020 12:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

OptionParser skips -- regardless of context
10 participants