-
-
Notifications
You must be signed in to change notification settings - Fork 30.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
argparse: mutually exclusive groups full of help-suppressed args can cause AssertionErrors #62090
Comments
When it goes to format a usage message, argparse seems to (correctly) fail to satisfy one of its assertions when all of the following are true:
The cause seems to be that the set of regular expressions that argparse uses to clean up mutually exclusive groups' separators doesn't handle the space that follows what would otherwise be an empty pair of square braces, sort of like this:
A test case is attached. I was able to reproduce this with python-2.7.3-13.fc18.x86_64 on Fedora as well as with commit 83588:e6b962fa44bb in 3.4 mainline. I have a small patch for the latter that I'll submit shortly. Sorry if I missed anything. This is my first bug report against python proper. |
Looks like the text = text.strip() at the end of the set of regex (in _format_actions_usage) needs to be replaced with something that removes all excess spaces, e.g. text = _re.sub( '\s+', ' ', text ).strip() |
Made similar required changes for version 2.7 |
In the test case: class TestMutuallyExclusiveManySuppressed Old usage would be: usage: PROG [-h] [--eggs EGGS] new usage: PROG [-h] [--eggs EGGS] i.e. 2 v 1 space. But extra spaces are not as dramatic a failure as an assertion error. It would also be good to check what happens when there are 2 suppressed groups. If the text before all trimming is: [ -h ] [] () [ --eggs EGGS ] does it reduce to? [-h] [--eggs EGGS] The old code would have left 3 spaces. I can't think of a situation in which a user would want a (generated) usage line with multiple spaces. If some sort of special formatting is needed, there is always the option of specifying an explicit usage line. parser = ArugmentParser(usage='one \ttwo \nthree four') |
A user generated line with multiple spaces may be essential if we are getting the usage data (particularly from) a csv file (that may have random spaces in certain fields) or some other form of stored or auto-generated data. |
I see three solutions -
The first 2 do the same thing most of the time, but may differ if the user somehow inserts spaces into names. The third leaves the extra blanks, but renders them innocuous. I doubt if the asserts were written to catch this problem. They probably were included to verify the usage line had been split up as expected prior to reassembling on multiple lines. As best I can tell test_argparse.py does not test for these spaces. Curiously though a port of argparse to javascript does have a test case with the extra space. |
Garrett and Yogesh, please submit contributor license agreements |
I am not sure if I am missing something. I had filled out the form at http://www.python.org/psf/contrib/contrib-form/ on the day I submitted the patch and even got back an email from Ewa Jodlowska. However, I don't see any "*" after my name. I submitted the same form to contributors@python.org and shot a mail to python-dev mailing list but there is no change. Any suggestions? |
Wait a week and see what happens. |
@terry: Thanks for the info. I seem to have the elusive "*" after my username now. I am not sure how this works, but can you review/test/apply the patch now? |
Argparse is out of my area of competence/experience. I have added a couple of people who have worked on argparse in the past and should be better able to review or suggest another reviewer. |
I'm following a dozen argparse issues with patches. I haven't seen much posting by argparse experts (like bethard, david.murry) since last December. |
@terry and @paul: |
I've been observing the activity on the argparse issues and am appreciating the work, but I don't have time right now to review the patches. I should have more time next month, and expect to get to them then, if no one else gets to them before I do. |
I just submitted a patch to http://bugs.python.org/issue11874 that substantially rewrites _format_actions_usage(). It generates the group and action parts separately, and does not do the kind of cleanup that produces this issue. |
See duplicate bpo-22363 for the traceback, and a workaround. There is also a patch there similar to Garrett’s original fix, but using the RE r'%s *%s ?'. I restored the diff from Garrett’s repository, in case it is still useful. Yogesh: I am not familiar with how the code works, and I struggle to be excited at using regular expressions to build a usage message :). But can you explain why you changed Garrett’s original fix? Does it affect intentional double spaces (or tabs, non-ASCII spaces, etc) in customizable parts of the usage message? |
I'm unable to reproduce this problem on 3.11: >>> import argparse
>>> parser = argparse.ArgumentParser()
>>> group = parser.add_mutually_exclusive_group()
>>> group.add_argument('--spam', help=argparse.SUPPRESS)
_StoreAction(option_strings=['--spam'], dest='spam', nargs=None, const=None, default=None, type=None, choices=None, help='==SUPPRESS==', metavar=None)
>>> parser.add_argument('--' + 'eggs' * 20, dest='eggs')
_StoreAction(option_strings=['--eggseggseggseggseggseggseggseggseggseggseggseggseggseggseggseggseggseggseggseggs'], dest='eggs', nargs=None, const=None, default=None, type=None, choices=None, help=None, metavar=None)
>>> parser.print_usage()
usage: [-h] [--eggseggseggseggseggseggseggseggseggseggseggseggseggseggseggseggseggseggseggseggs EGGS]
>>> |
Rationale ========= argparse performs a complex formatting of the usage for argument grouping and for line wrapping to fit the terminal width. This formatting has been a constant source of bugs for at least 10 years (see linked issues below) where defensive assertion errors are triggered or brackets and paranthesis are not properly handeled. Problem ======= The current implementation of argparse usage formatting relies on regular expressions to group arguments usage only to separate them again later with another set of regular expressions. This is a complex and error prone approach that caused all the issues linked below. Special casing certain argument formats has not solved the problem. The following are some of the most common issues: - empty `metavar` - mutually exclusive groups with `SUPPRESS`ed arguments - metavars with whitespace - metavars with brackets or paranthesis Solution ======== The following two comments summarize the solution: - python#82091 (comment) - python#77048 (comment) Mainly, the solution is to rewrite the usage formatting to avoid the group-then-separate approach. Instead, the usage parts are kept separate and only joined together at the end. This allows for a much simpler implementation that is easier to understand and maintain. It avoids the regular expressions approach and fixes the corresponding issues. This closes the following issues: - Closes python#62090 - Closes python#62549 - Closes python#77048 - Closes python#82091 - Closes python#89743 - Closes python#96310 - Closes python#98666 These PRs become obsolete: - Closes python#15372 - Closes python#96311
Rationale ========= argparse performs a complex formatting of the usage for argument grouping and for line wrapping to fit the terminal width. This formatting has been a constant source of bugs for at least 10 years (see linked issues below) where defensive assertion errors are triggered or brackets and paranthesis are not properly handeled. Problem ======= The current implementation of argparse usage formatting relies on regular expressions to group arguments usage only to separate them again later with another set of regular expressions. This is a complex and error prone approach that caused all the issues linked below. Special casing certain argument formats has not solved the problem. The following are some of the most common issues: - empty `metavar` - mutually exclusive groups with `SUPPRESS`ed arguments - metavars with whitespace - metavars with brackets or paranthesis Solution ======== The following two comments summarize the solution: - #82091 (comment) - #77048 (comment) Mainly, the solution is to rewrite the usage formatting to avoid the group-then-separate approach. Instead, the usage parts are kept separate and only joined together at the end. This allows for a much simpler implementation that is easier to understand and maintain. It avoids the regular expressions approach and fixes the corresponding issues. This closes the following GitHub issues: - #62090 - #62549 - #77048 - #82091 - #89743 - #96310 - #98666 These PRs become obsolete: - #15372 - #96311
The reproducer needs a narrow enough console, or a bigger number for the This was fixed in #96311. |
Rationale ========= argparse performs a complex formatting of the usage for argument grouping and for line wrapping to fit the terminal width. This formatting has been a constant source of bugs for at least 10 years (see linked issues below) where defensive assertion errors are triggered or brackets and paranthesis are not properly handeled. Problem ======= The current implementation of argparse usage formatting relies on regular expressions to group arguments usage only to separate them again later with another set of regular expressions. This is a complex and error prone approach that caused all the issues linked below. Special casing certain argument formats has not solved the problem. The following are some of the most common issues: - empty `metavar` - mutually exclusive groups with `SUPPRESS`ed arguments - metavars with whitespace - metavars with brackets or paranthesis Solution ======== The following two comments summarize the solution: - python#82091 (comment) - python#77048 (comment) Mainly, the solution is to rewrite the usage formatting to avoid the group-then-separate approach. Instead, the usage parts are kept separate and only joined together at the end. This allows for a much simpler implementation that is easier to understand and maintain. It avoids the regular expressions approach and fixes the corresponding issues. This closes the following GitHub issues: - python#62090 - python#62549 - python#77048 - python#82091 - python#89743 - python#96310 - python#98666 These PRs become obsolete: - python#15372 - python#96311
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
Linked PRs
The text was updated successfully, but these errors were encountered: