Simplify docopt #104
I'd be wary of drawing too much from JSON here - it's not an apples to apples comparison. docopt is targeting a much more complex problem and doesn't have a blank slate to work with. The legacy of thirty odd years worth of bizarre, twisted help messages is hard to escape! I've been over the current README from top to bottom and identified a few things I thought could go, only to remember heaps of use cases for everything but the uppercase arg thing mentioned in #50. I beg you - please be careful before deprecating features - docopt is incredibly useful and flexible in its current form. |
Suggestions so far:
|
What about just removing |
I suggest removing |
I think these are different types of features. |
You mean we should pick one of them? |
Sorry I've been absent; I'm in the middle of finals for my last year of undergraduate. Yeah, I think it would be better to have one or the other form for positional args, not both. It seems like |
The only thing stopping me from removing |
So, lately I'm thinking of:
Also, it is not documented, but options' arguments could be arbitrary strings right now, not just
Some more questionable moves:
If you apply all the above changes to the docopt usage-pattern, its grammar becomes context-free, and could be parsed with a normal parser. What do you think? |
/cc @docopt, since the above will likely influence everyone |
BTW, @johari, how is it going with parsing docopt using a parser generator? |
@halst I can write a separate document explaining implementation details of So basically, you don't need to simplify docopt in order to use a parser generator. To be honest, I don't agree with any of the mentioned simplifications. I like I can't see why you want to simplify or remove any part of the language, as they are all based on rightful conventions made by programmers over the years. |
The thing is that right now you can't write a grammar for the language, it's just not possible. As you have shown, you can parse it in several stages, though. If we make those changes it is possible to write a single grammar and declare it as the docopt language. BTW, did you look into Parsing Expression Grammar? If I have to define docopt via grammar I would definitely use PEG, since you don't need tokenization as a separate step. |
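To make the "no separate tokenization" point concrete, here is a minimal sketch of a scannerless, PEG-style recursive-descent parser. The mini-grammar in the comment is a hypothetical toy subset invented for illustration; it is not docopt's actual or proposed grammar, and all function names are assumptions.

```python
import re

# Hypothetical mini-grammar (an illustration, not docopt's real grammar):
#   expr <- seq ('|' seq)*
#   seq  <- atom*
#   atom <- '[' expr ']' / '(' expr ')' / LONG / ARG
#   LONG <- '--' [a-z] [a-z-]*      ARG <- '<' [a-z] [a-z-]* '>'
LONG = re.compile(r"--[a-z][a-z-]*")
ARG = re.compile(r"<[a-z][a-z-]*>")
WS = re.compile(r"\s*")


def skip_ws(text, pos):
    return WS.match(text, pos).end()


def parse_expr(text, pos=0):
    """expr <- seq ('|' seq)*; returns (tree, new_pos)."""
    seq, pos = parse_seq(text, pos)
    alternatives = [seq]
    while text.startswith("|", pos):
        seq, pos = parse_seq(text, pos + 1)
        alternatives.append(seq)
    tree = alternatives[0] if len(alternatives) == 1 else ("either", alternatives)
    return tree, pos


def parse_seq(text, pos):
    """seq <- atom*; stops at '|', a closing bracket, or end of input."""
    items = []
    pos = skip_ws(text, pos)
    while pos < len(text) and text[pos] not in "|])":
        atom, pos = parse_atom(text, pos)
        items.append(atom)
        pos = skip_ws(text, pos)
    return ("seq", items), pos


def parse_atom(text, pos):
    """atom <- '[' expr ']' / '(' expr ')' / LONG / ARG (ordered choice)."""
    for open_, close, kind in (("[", "]", "optional"), ("(", ")", "required")):
        if text.startswith(open_, pos):
            inner, pos = parse_expr(text, pos + 1)
            if not text.startswith(close, pos):
                raise SyntaxError(f"expected {close!r} at offset {pos}")
            return (kind, inner), pos + 1
    for regex, kind in ((LONG, "option"), (ARG, "argument")):
        match = regex.match(text, pos)
        if match:
            return (kind, match.group()), match.end()
    raise SyntaxError(f"unexpected character at offset {pos}")


def parse(text):
    """Parse a whole pattern string; reject any unconsumed input."""
    tree, pos = parse_expr(text)
    if pos != len(text):
        raise SyntaxError(f"trailing input at offset {pos}")
    return tree
```

Each function reads characters straight from the input string, so there is no token stream to build first; for example, `parse("[--verbose] <file>")` yields a nested tuple tree.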
BTW, why do you see "require to write equal sign |
Yep, but I found working with
Having a single grammar for docopt language is really cool (I love it!), but that shouldn't be the main goal for language decisions. I mostly like to consider CLI-developer happiness, and removing Here's an idea: we can have a small Implementing
This actually isn't bad. We should encourage people to do that in our docs. In fact, one of the reasons we parse docopt in two phases is to deal with Removing |
It depends whether we are talking about short-term happiness or long-term one :-). In the short term it will be like "shit, why is docopt not working anymore?!", but in the long term, if docopt can influence the status-quo of command-line interfaces it will make more people happy, they know that Then, if we remove short options from usage-patterns, I would argue that the world would be a better place. Who likes the
It should have used |
Well, we can't have any influence by breaking existing habits. The best way to influence people is to slowly encourage them to do the right thing. For example, I think docopt did the right thing not to support long options with a single "-" prefix (like Also, we don't have any sophisticated error reporting in docopt right now, so we'd just increase the wtf/s ratio if we deprecate common patterns, rendering people confused. Here's an idea, we can have a "strict" option to enforce best practices for people that really care. This can also map to the
I like your
|
Maybe then allow single letter options:
Sounds reasonable to me. And we still get parsability. |
And that is possible only in case |
It is less readable to use short options than long options, but it's more convenient when you have to use those options a lot. My software-design philosophy is that it's worth some extra pain in implementing the infrastructure (docopt), so that the 99% of the usage out there, which includes writing and use of end-user apps, becomes easier or more intuitive. The problem with The problem with Requiring the Another good idea for a "strict" mode would be to require a long-name equivalent for every short option. I also think that listing both, or at least the long-form, in the usage block would be preferable to just listing the short-form of the option there. The Long story short, as nice as it would be to simplify the docopt code, I'm afraid I don't see how to do it in the specification, without pushing out that complexity onto the broader base of end-users out there. |
I agree, but the changes proposed above are only about usage-pattern, nothing will be changed for parsing the ARGV.
This is not a problem since for an option to take several arguments you need to "repeat" it |
Well
My idea is to never break or even never change docopt after 1.0.0. Right now docopt is a long-lasting beta :-). Eventually docopt 1.0.0 will be released and we'll have a solid CLI description language for decades to come. That is why I want to make docopt right, and I would prefer omitting a good feature, than keeping a bad one. |
As you say, requiring |
So to sum up my opinion to date (about usage-pattern-parsing, not argv-parsing):
This way we keep all the functionality, and make docopt parsable. Some of you may think "why should we care so much about parsability?". Well if we change the grammar to be context-free it simplifies not only computer reasoning, but also human reasoning. Context-free grammar allows humans to parse it without keeping any context in their heads. Context like "which option takes argument, which not". When you see |
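A small sketch of the "context" problem described above, assuming a hypothetical usage fragment `-f FILE`. The function name and token shapes are invented for illustration and are not docopt's API. The same two tokens get two different readings depending on information that lives outside the pattern itself:

```python
def read_pattern(tokens, options_with_argument):
    """Interpret usage tokens; `-f FILE` cannot be resolved without context."""
    result = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token in options_with_argument and i + 1 < len(tokens):
            # Context says this option consumes the next token as its value.
            result.append(("option", token, tokens[i + 1]))
            i += 2
        elif token.startswith("-"):
            result.append(("option", token, None))
            i += 1
        else:
            result.append(("argument", token))
            i += 1
    return result


# Same tokens, two readings, depending only on the outside context:
as_option_value = read_pattern(["-f", "FILE"], {"-f"})
as_flag_and_positional = read_pattern(["-f", "FILE"], set())
```

With a context-free pattern syntax, the second parameter would be unnecessary, because the tokens alone would determine the reading.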
👍 Since it's not 1.0 yet, you don't have to think about backwards compatibility too hard. Just document it clearly. I personally liked the |
@ambv thanks for the feedback, appreciated. |
I just pushed an experimental PEG-based parser and thought you guys might want to take a look: Right now it handles most of usage-pattern, but it is not integrated with the rest of the code. It uses parsimonious PEG library. |
I'd prefer |
@jric but it's also totally unlike how POSIX command line utilities have worked for the last 30 years or so. |
+1 to the latest proposition |
Could we still have |
@kblomqvist the problem with But we may come up with a better context-free version than |
I would also allow |
This is tricky, indeed. A thing to consider is how will a user react if With the Maybe a feasible route would be to make docopt generate a concrete grammar from the docstring. In other words, the docstring has to be |
Absolutely agree.
I'm not sure, are you talking about usage-section-parsing or argv-parsing? Just to be sure, the idea was to restrict the usage-section (docstring) to |
Haha, this is what I was suggesting! Perfect. I'm absolutely in favor of simplifying the grammar for usage-section and being more lenient with argv parsing. If the developer cannot write |
Here's a crazy idea: on your Parsimonious branch, you're using a Parsimonious grammar to parse docopt DSL, but then presumably you're still delegating to your existing ad hoc code to do the recognition of command-line input. Have you thought about instead generating a second Parsimonious grammar from that and running it against the command-line input? You could either generate a string and throw it at the |
Okay, I've looked at your code now. Just look how much alike it and parsimonious/expressions.py are! Wow. We even named a bunch of classes almost the same. Unless I'm missing something (which is quite likely), you could delete a whole pile of that code and just write a The one tricky bit is that you have a pre-divided argv coming in; you'd have to mash that back down to a string to have a grammar parse it, so you'd need to decide on some kind of backslashy lossless conversion or something. /idle commentary |
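One possible shape for the "backslashy lossless conversion" mentioned above, as a rough sketch only. The function names and the choice of a space as terminator are assumptions, not anything taken from docopt or Parsimonious. Each argument is escaped and terminated with a single space, so empty arguments and an empty argv survive the round trip:

```python
def join_argv(argv):
    """Escape backslashes and spaces, then end every argument with a space."""
    escaped = (a.replace("\\", "\\\\").replace(" ", "\\ ") for a in argv)
    return "".join(arg + " " for arg in escaped)


def split_argv(line):
    """Inverse of join_argv: cut at unescaped spaces and drop the escapes."""
    args, current = [], []
    chars = iter(line)
    for ch in chars:
        if ch == "\\":
            # join_argv never emits a trailing lone backslash, so next() is safe
            # for any string that join_argv produced.
            current.append(next(chars))
        elif ch == " ":
            args.append("".join(current))
            current = []
        else:
            current.append(ch)
    return args
```

A grammar run over the joined string would then treat an unescaped space as the argument boundary.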
@erikrose that's an interesting idea, I need to think more about that. That could be a nice refactoring. |
I have a proposal on the topic. |
@YorikSar argparse can't handle many of docopt features, such as arbitrary nested patterns. |
Can you please give me one good example of what argparse can not do? |
|
Thanks! |
I've been playing with the latest proposed grammar. There seems to be an ambiguity with short options, consider the following definition:
There is no problem parsing this definition in the usage section. The problem arises while parsing argv: how do we interpret To me, packing short options while allowing only the last one to have an argument is syntactic sugar that brings a lot of problems for almost no gain. IMHO, On a related note, many times I have faced a situation where I wanted something like |
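To illustrate the kind of ambiguity packed short options create, here is a hedged sketch using a hypothetical token `-abc` (not the definition from the comment above, which is not reproduced here). The helper name is invented. The expansion depends entirely on which flags are declared as taking an argument:

```python
def expand_short(token, takes_argument):
    """Expand a stacked token like '-abc' given which flags take a value."""
    result = []
    letters = token[1:]
    for i, letter in enumerate(letters):
        flag = "-" + letter
        if flag in takes_argument:
            # Everything after this flag is its value; None means the value
            # would have to come from the next argv item instead.
            rest = letters[i + 1:]
            result.append((flag, rest or None))
            break
        result.append((flag, True))
    return result


three_flags = expand_short("-abc", set())      # -a -b -c
a_takes_value = expand_short("-abc", {"-a"})   # -a with value "bc"
b_takes_value = expand_short("-abc", {"-b"})   # -a, then -b with value "c"
```

The same argv token yields three different results, which is why the reading cannot be decided from argv alone.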
@keleshev should we plan a deadline for the implementation of this :). Maybe make it more formal and call it docopt 1.0.0 ? |
How about we use braces to enclose short options with arguments (for usage, not argv)? They haven't been put to use yet.
Also I agree about Finally, I removed |
To get on to the next steps of formalization, this grammar parses a strict subset of my previous one (provided I didn't screw up) to create a richer parse tree showing which patterns can come in any order and which can't. This may or may not be useful to implementations, but is certainly good for fuzz testing. A few examples of subtleties (edit: these are
|
Also, not sure if this has been discussed before, but I'd find it more intuitive if Among other things, this makes this nice symmetry:
Also, with this, changing the new conservative grammar to
or even
would make it more minimal without loss of expressive power. |
This will break the following format
|
@TylerTemp That part about the commutation I assume you mean? [Fun fact, BSD commands actually do work that way.] But yeah I see how it is restrictive. Another route is to make it so arguments commute with things in
|
* ...but don't support abbreviations. They are considered a misfeature in the original (Refer: docopt/docopt#104) * ...but keep alias in outputs. I.e. if there's an option: "-o, --output", then provide the matched value for both keys: '{ "--output": value, "-o": value }'
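The "keep alias in outputs" behaviour described in the note above could look roughly like this. It is a sketch with invented names (`with_aliases`, the alias-group tuples), not the referencing project's actual code:

```python
def with_aliases(matched, alias_groups):
    """Copy each matched value to every alias in its group.

    `matched` maps option names to values; `alias_groups` is a list of
    tuples such as ("-o", "--output"). Both shapes are assumptions.
    """
    result = dict(matched)
    for names in alias_groups:
        present = [name for name in names if name in matched]
        if present:
            value = matched[present[0]]
            for name in names:
                result[name] = value
    return result


output = with_aliases({"--output": "out.txt"}, [("-o", "--output")])
```

Here `output` holds the same value under both `"-o"` and `"--output"`.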
@docopt/docopt, which features would you get rid of in docopt? I'm really concerned with 1.0.0 release and I wish to strip features and simplify implementation. Any candidates?