Let optional arguments be followed by required arguments #27

Open
Met48 opened this Issue Jun 21, 2012 · 5 comments

Contributor

Met48 commented Jun 21, 2012

Currently a command like naval_fate ship [<name>] move will not parse correctly when the optional argument is omitted. This also occurs with repeating arguments.

Member

keleshev commented Jun 21, 2012

Yes, I already created a bugfix branch in order to fix this. The problem with [<name>] and <name>... is that they are very greedy—they will try to match no matter what follows them. This problem requires some thought.

Contributor

Met48 commented Jun 24, 2012

@halst I've made progress on fixing this in my experimental branch: see the full diff and file. Some refactoring and test cases are still needed, but the core code is done. It passes all language-agnostic tests and fully resolves this issue.

Any feedback would be appreciated. I realize it is a lot of changes, but I think it's necessary to resolve the issue. The existing implementation places all flow control in the match methods, which makes it difficult to try alternate input interpretations when there's a failure. I tried two other approaches to the issue: state restoration after failures (backtracking) and simultaneous testing of other input interpretations. This NFA solution implements the latter, as I found the code to be a lot cleaner than the backtracking solution.

Member

keleshev commented Jun 24, 2012

I really appreciate your effort; I will need some time to read and understand the code. It would be great if you could write a couple of (github) comments in commits' diffs:

Met48/docopt@fe9f6de
Met48/docopt@d2b293b

Member

keleshev commented Jun 25, 2012

So far I can say that your code works pretty well. The only "funny" behavior I could find was this:

>>> docopt('usage: prog <a>... <b>', '1 2 3')
{'<a>': ['1', '2', '3'],
 '<b>': '3'}

Although the current version of docopt can't do that at all :-)

Contributor

Met48 commented Jun 25, 2012

Thanks for taking a look! It definitely needs test cases, as that behaviour was one of the first I had implemented! I'm working on that though, having just got tox set up and the language-agnostic tests running through pytest.

I can think of a couple of fixes to that issue which I'll look at tonight, the quickest being a deepcopy of collected in traverse.append.
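The aliasing behind that result can be reproduced without docopt. What follows is only a minimal sketch, assuming each branch of the traversal is handed a `collected` list when the path forks; the `fork_shared`, `fork_copied` and `run` helpers are hypothetical and are not taken from the branch.

```python
import copy


def fork_shared(collected):
    # Both branches receive the very same list object.
    return collected, collected


def fork_copied(collected):
    # Each branch receives an independent deep copy.
    return copy.deepcopy(collected), copy.deepcopy(collected)


def run(fork):
    collected = [['1', '2']]   # values gathered so far for <a>
    stay, leave = fork(collected)
    stay[0].append('3')        # branch 1 treats '3' as one more <a>
    return leave               # branch 2 wants '3' to go to <b> instead


shared_result = run(fork_shared)   # branch 2 sees branch 1's append
copied_result = run(fork_copied)   # branch 2 is unaffected
```

Because the values sit in a nested list, a shallow copy would still share the inner list between branches, which is presumably why a deepcopy is the quick fix here.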

I don't really want to comment on the commits since I'm rebasing the branch, so I'll mention the general method here.


For the most part it's a modification of this regular expression matching algorithm. The usage pattern is converted into an NFA, which the traverse function navigates.

This NFA has only two node types, Literal and Split. All containers (like Required)

Literal nodes include Argument and Command. The traverse function calls their next method, which modifies the arguments and collected lists before returning the next node.

Split nodes have two child nodes and allow for multiple execution paths. When the traverse function encounters one, it adds both child nodes to its list of nodes to process. On each loop iteration it processes all nodes in this list simultaneously.
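The lockstep traversal can be sketched roughly as follows. This is not the code from the branch: the constructor signatures, the `value` field used for commands, the `accepts` method, the `MATCH` sentinel, the `expand` helper and the dict-shaped `collected` are all assumptions made for illustration.

```python
import copy

MATCH = object()  # hypothetical sentinel marking the end of the pattern


class Literal:
    """Consumes one argument; a command also requires an exact value."""

    def __init__(self, name, nxt=None, value=None):
        self.name, self.nxt, self.value = name, nxt, value

    def accepts(self, arg):
        return self.value is None or self.value == arg


class Split:
    """Consumes nothing; forks the traversal into two paths."""

    def __init__(self, left=None, right=None):
        self.left, self.right = left, right


def expand(node, collected):
    # Follow Split nodes until only Literal or MATCH nodes remain.
    if isinstance(node, Split):
        return expand(node.left, collected) + expand(node.right, collected)
    return [(node, collected)]


def traverse(start, args):
    # Every live path is a (node, collected) pair; all advance together.
    states = expand(start, {})
    for arg in args:
        next_states = []
        for node, collected in states:
            if isinstance(node, Literal) and node.accepts(arg):
                updated = copy.deepcopy(collected)  # never share between paths
                updated.setdefault(node.name, []).append(arg)
                next_states.extend(expand(node.nxt, updated))
        states = next_states
    for node, collected in states:
        if node is MATCH:
            return collected
    return None


# Hand-built NFA for: ship [<name>] move
move = Literal('move', MATCH, value='move')
name = Literal('<name>', move)
ship = Literal('ship', Split(name, move), value='ship')

without_name = traverse(ship, ['ship', 'move'])
with_name = traverse(ship, ['ship', 'Titanic', 'move'])
```

In this sketch the path that wrongly reads `move` as `<name>` simply runs out of input before reaching `MATCH`, while the path that skipped the optional argument succeeds, so no explicit backtracking is needed.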

The NFA is generated by the assemble methods. Each Container subclass implements this method differently, to connect its child nodes correctly. Required just connects them in sequence, for instance. Either creates Split nodes as necessary so that each child is its own branch.
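As a rough illustration of that assembly step, here is a minimal sketch. It assumes each pattern element has an `assemble(nxt)` method that returns its entry node and wires its exit to `nxt`; that signature, the simplified classes and the `MATCH` sentinel are hypothetical and may differ from the branch.

```python
MATCH = object()  # hypothetical sentinel marking the end of the pattern


class Literal:
    def __init__(self, name, nxt=None):
        self.name, self.nxt = name, nxt


class Split:
    def __init__(self, left, right):
        self.left, self.right = left, right


class Argument:
    def __init__(self, name):
        self.name = name

    def assemble(self, nxt):
        # A leaf becomes a single Literal node pointing at its successor.
        return Literal(self.name, nxt)


class Required:
    def __init__(self, *children):
        self.children = children

    def assemble(self, nxt):
        # Build back to front so each child points at the one after it.
        for child in reversed(self.children):
            nxt = child.assemble(nxt)
        return nxt


class Either:
    def __init__(self, *children):
        self.children = children

    def assemble(self, nxt):
        # Every branch rejoins at the same successor node.
        branches = [child.assemble(nxt) for child in self.children]
        node = branches[-1]
        # Chain binary Split nodes so that each child is its own branch.
        for branch in reversed(branches[:-1]):
            node = Split(branch, node)
        return node


pattern = Required(
    Argument('<a>'),
    Either(Argument('<b>'), Argument('<c>'), Argument('<d>')),
)
start = pattern.assemble(MATCH)
```

With three alternatives, `Either` needs two chained Split nodes here, since each Split has only two children.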
