Ignore ellipsis in argument definition? #111

Open
Komnomnomnom opened this issue May 7, 2013 · 10 comments


@Komnomnomnom

docopt already supports syntax like --b=<x,y,z> for options that take multiple arguments, but I'd like to be able to clearly specify options that take a delimited list of undefined length. That is, I'd like to be able to write something like this in the usage pattern:

--b=<x...> or --b=<x,y,...> or --b=<number1,number2,...>.

Right now I just print

--b=<numbers>    Option b with comma separated list of numbers, e.g. 1,2,3,4.

in my usage message but I think it would be clearer if this could be presented to the user in the argument pattern directly.

In [59]: import docopt

In [60]: docopt.__version__
Out[60]: '0.6.1'

In [61]: docopt.docopt("prog Usage: prog --b=<x,y,z>", argv=['--b=1,2,3'])
Out[61]: {'--b': '1,2,3'}

In [62]: docopt.docopt("Usage: prog --b=<x...>", argv=['--b=1,2,3'])
To exit: use 'exit', 'quit', or Ctrl-D.
An exception has occurred, use %tb to see the full traceback.

DocoptExit: Usage: prog --b=<x...>


In [63]: docopt.docopt("Usage: prog --b=<x,y,...>", argv=['--b=1,2,3'])
To exit: use 'exit', 'quit', or Ctrl-D.
An exception has occurred, use %tb to see the full traceback.

DocoptExit: Usage: prog --b=<x,y,...>

I'm aware I could just use two dots (..) and everything would work fine, but I think it would be syntactically cleaner, and desirable, to be able to use ... consistently to denote repetition.
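
A minimal sketch of the workaround described above: declare the option with a single placeholder such as <numbers> and split the string after parsing. The helper name split_list_option is hypothetical (it is not part of docopt), and the dictionary is hard-coded in the shape shown in Out[61] rather than produced by a real docopt call.

```python
def split_list_option(args, name, sep=","):
    """Return an option's delimited value as a list of strings.

    `args` is assumed to be a dict shaped like docopt's result;
    a missing or None value yields an empty list.
    """
    raw = args.get(name)
    if raw is None:
        return []
    return [item.strip() for item in raw.split(sep) if item.strip()]


# Hard-coded stand-in for the dictionary shown in Out[61].
parsed = {"--b": "1,2,3"}
numbers = [int(n) for n in split_list_option(parsed, "--b")]
print(numbers)
```

The list length is not fixed anywhere, so the same code handles 1,2 or 1,2,3,4; only the placeholder text in the usage message changes.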

@keleshev
Member

keleshev commented May 7, 2013

We are in the process of formalising the parser (right now it's a bit ad hoc). So we need to decide which characters will be allowed for which elements.

I can see how special characters could be used in interesting ways, but which exactly?

The first draft grammar (discussed in #4) is even more restrictive (but it shouldn't need to be):

long_option   <- '--' [a-z-]+ ('=' argument)?                    
short_options <- '-' [a-zA-Z0-9]+ argument?
argument      <- '<' [ a-z-]+ '>'
command       <- [a-z-]+

Do you have a specific proposal/idea on which characters should be allowed?
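
To make the restrictiveness concrete, here is a rough sketch that transcribes the draft grammar above into Python regular expressions and checks a few tokens from this thread. This is an illustration only: the regex translation and the classify helper are assumptions, not docopt's actual parser, and a regex does not reproduce PEG ordered-choice semantics.

```python
import re

# Each draft rule transcribed as a regular expression (an approximation).
ARGUMENT = r"<[ a-z-]+>"
PATTERNS = {
    "long_option": re.compile(rf"--[a-z-]+(?:={ARGUMENT})?"),
    "short_options": re.compile(rf"-[a-zA-Z0-9]+(?:{ARGUMENT})?"),
    "argument": re.compile(ARGUMENT),
    "command": re.compile(r"[a-z-]+"),
}


def classify(token):
    """Return the names of the draft rules that match the whole token."""
    return [name for name, rx in PATTERNS.items() if rx.fullmatch(token)]


for token in ["--b=<x>", "<x...>", "<x,y,z>", "--x_i=<value>", "--x-i=<value>"]:
    print(f"{token!r}: {classify(token) or 'rejected'}")
```

Under this reading, <x...> and <x,y,z> are rejected because '.' and ',' fall outside [ a-z-], and --x_i=<value> is rejected because of the underscore.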

@Komnomnomnom
Author

Wow that is really restrictive.

A formal grammar specification would be great but I'd certainly be an advocate for docopt being quite liberal where it can be, and only restrictive where it needs to be. It's hard to foresee all use cases, especially when sometimes (like above) it is desirable for some names to be more descriptive than just a simple stand-in label, e.g. <a,b,c,d> or <a:b:c:d> or <label#NUMBER> or <a..z>.

So to answer your question, I would only disallow characters that would interfere with the parser or make parsing overly cumbersome. I'd have a similar opinion on option and command names.

@keleshev
Member

keleshev commented May 7, 2013

I see how <a,b>, <a:b> and <a...> could be readability improvements, so these characters will get in. However, there are many possibilities for confusion. Say you have <label#NUMBER> and the user enters flower#6 in her shell: #6 will be considered a comment, so I think that # is a bad symbol.

The point I'm trying to make is that docopt shouldn't be liberal in what it accepts :-) Like, we shouldn't allow weird unicode characters, that look similar to ascii; we shouldn't allow characters that lead to confusion.

@Komnomnomnom
Author

Hmm, I think # will only be considered a comment if there is a space before it. Fair enough, it was a poor choice of symbol, but then again IMO that would be between me and the shell, not between me and docopt :-). (BTW, the ones with ,, :, and ... were just examples; there are many other possibilities with other delimiters.)

I'm still of the view, though, that it is not the responsibility of docopt to enforce such things. The robustness principle comes to mind: "Be conservative in what you do, be liberal in what you accept from others". In docopt's case I think the danger is that, by being too prescriptive, you unintentionally forbid a lot of use cases.

Another example: in physics it is common to write x_i, in programming, LaTeX and elsewhere, to mean x subscript i, which in the following example denotes, say, the ith particle in an experiment. Now it would arguably make more sense for my command line tool to be

physics_prog --x_i=<value> --x_j=<value>

than

physics_prog --x-i=<value> --x-j=<value>

My point is, it is impossible to know beforehand all valid use cases and, given docopt's potential application across a wide number of domains, I think it would be a shame if it became overly dogmatic.

@fsaintjacques

-1, keep it simple. I don't like the idea of having ellipsis and/or complex argument expressions.

@keleshev
Member

keleshev commented May 8, 2013

@fsaintjacques those are not expressions, it's just about which characters are allowed inside < and >.

@KangOl

KangOl commented May 9, 2013

I would allow (almost) any printable character (as defined by Python's string module).
That would allow

0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&'()*+,-./:;=?@[\]^_`{|}~
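
For what it's worth, the list above appears to be the non-whitespace part of string.printable with the < and > delimiters taken out. A small sketch that rebuilds it from the standard library constants (the variable names are just for illustration):

```python
import string

# Non-whitespace printable ASCII, in the same order as string.printable.
candidates = string.digits + string.ascii_letters + string.punctuation

# Drop the two characters that delimit an argument name.
allowed = "".join(ch for ch in candidates if ch not in "<>")
print(allowed)
```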

@keleshev
Member

keleshev commented May 9, 2013

I'm not sure about quotes, #, $, *, ?, {, }, (, )—they have special meaning in shells. Or are there any other characters with special meaning?

Also, I remember writing weird things like <host>:<port>; it is undocumented, but allowed in the current implementation, as long as the argument starts with < and ends with >.

@KangOl

KangOl commented May 9, 2013

Their special meaning in shells doesn't matter. I can use these characters when writing a regex for egrep, so why not in a docopt declaration?
Whether the docopt implementation then decides to enforce the argument pattern when parsing the command line is another question.

@jric

jric commented May 9, 2013

+1 for allowing as much as possible in the printable ascii range -- I have no opinion about non-ascii unicode.

Regarding worrying about what the shell accepts, I wouldn't:

  1. The usage message isn't parsed by the shell.
  2. If the user wants to enter something that the shell would intercept, they can always escape it.
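
As a rough illustration of point 2, Python's standard shlex module can show what escaping buys you: a value containing a space and # survives as a single argument once it is quoted. The value and the program name prog are made up for the example, and shlex only approximates a POSIX shell.

```python
import shlex

# A value the shell would otherwise split, treating "#6" as a comment.
value = "flower #6"

quoted = shlex.quote(value)
print(quoted)  # → 'flower #6'

# Splitting the quoted command line the way a POSIX shell would
# gives the original value back as one argument.
argv = shlex.split("prog " + quoted)
print(argv)  # → ['prog', 'flower #6']
```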
