Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

argparse wrapping fails with metavar="" (no metavar) #82091

Closed
sjfranklin mannequin opened this issue Aug 21, 2019 · 7 comments
Closed

argparse wrapping fails with metavar="" (no metavar) #82091

sjfranklin mannequin opened this issue Aug 21, 2019 · 7 comments
Labels
3.10 only security fixes 3.11 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@sjfranklin
Copy link
Mannequin

sjfranklin mannequin commented Aug 21, 2019

BPO 37910
Nosy @rhettinger, @shihai1991, @sjfranklin, @iritkatriel
PRs
  • bpo-37910: argparse usage wrapping should allow whitespace differences caused by metavar #15372
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2019-08-21.22:08:19.919>
    labels = ['type-bug', 'library', '3.10', '3.11']
    title = 'argparse wrapping fails with metavar="" (no metavar)'
    updated_at = <Date 2021-10-19.09:28:30.797>
    user = 'https://github.com/sjfranklin'

    bugs.python.org fields:

    activity = <Date 2021-10-19.09:28:30.797>
    actor = 'iritkatriel'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2019-08-21.22:08:19.919>
    creator = 'sjfranklin'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 37910
    keywords = ['patch']
    message_count = 6.0
    messages = ['350117', '350169', '354647', '371415', '371446', '404280']
    nosy_count = 5.0
    nosy_names = ['rhettinger', 'paul.j3', 'shihai1991', 'sjfranklin', 'iritkatriel']
    pr_nums = ['15372']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue37910'
    versions = ['Python 3.10', 'Python 3.11']

    @sjfranklin
    Copy link
    Mannequin Author

    sjfranklin mannequin commented Aug 21, 2019

    When argparse wraps the usage text, it can fail its assertion tests with whitespace differences. This can occur when metavar="", needed if a user wishes to avoid having a metavar print. It also could occur if a user specifies any other whitespace.
    Here's a minimum example (depending on $COLUMNS):

    import argparse
    # based on Vajrasky Kok's script in https://bugs.python.org/issue11874
    parser = argparse.ArgumentParser(prog='PROG')
    parser.add_argument('--nil', metavar='', required=True)
    parser.add_argument('--a', metavar='a' * 165)
    parser.parse_args()

    This produces the AssertionError at the bottom of this comment.

    A solution is to have the two asserts ignore whitespace. I'll submit a pull request very shortly for this. (First time so happy for any comments or critiques!)

    A more extensive example:
    import argparse
    # based on Vajrasky Kok's script in https://bugs.python.org/issue11874
    parser = argparse.ArgumentParser(prog='PROG')

    parser.add_argument('--nil', metavar='', required=True)
    parser.add_argument('--Line-Feed', metavar='\n', required=True)
    parser.add_argument('--Tab', metavar='\t', required=True)
    parser.add_argument('--Carriage-Return', metavar='\r', required=True)
    parser.add_argument('--Carriage-Return-and-Line-Feed',
                        metavar='\r\n', required=True)
    parser.add_argument('--vLine-Tabulation', metavar='\v', required=True)
    parser.add_argument('--x0bLine-Tabulation', metavar='\x0b', required=True)
    parser.add_argument('--fForm-Feed', metavar='\f', required=True)
    parser.add_argument('--x0cForm-Feed', metavar='\x0c', required=True)
    parser.add_argument('--File-Separator', metavar='\x1c', required=True)
    parser.add_argument('--Group-Separator', metavar='\x1d', required=True)
    parser.add_argument('--Record-Separator', metavar='\x1e', required=True)
    parser.add_argument('--C1-Control-Code', metavar='\x85', required=True)
    parser.add_argument('--Line-Separator', metavar='\u2028', required=True)
    parser.add_argument('--Paragraph-Separator', metavar='\u2029', required=True)
    parser.add_argument('--a', metavar='a' * 165)
    parser.parse_args()

    This is related to https://bugs.python.org/issue17890 and https://bugs.python.org/issue32867.

    File "/minimum_argparse_bug.py", line 7, in <module>
    parser.parse_args()
    File "/path/to/cpython/Lib/argparse.py", line 1758, in parse_args
    args, argv = self.parse_known_args(args, namespace)
    File "/path/to/cpython/Lib/argparse.py", line 1790, in parse_known_args
    namespace, args = self._parse_known_args(args, namespace)
    File "/path/to/cpython/Lib/argparse.py", line 1996, in _parse_known_args
    start_index = consume_optional(start_index)
    File "/path/to/cpython/Lib/argparse.py", line 1936, in consume_optional
    take_action(action, args, option_string)
    File "/path/to/cpython/Lib/argparse.py", line 1864, in take_action
    action(self, namespace, argument_values, option_string)
    File "/path/to/cpython/Lib/argparse.py", line 1037, in __call__
    parser.print_help()
    File "/path/to/cpython/Lib/argparse.py", line 2483, in print_help
    self._print_message(self.format_help(), file)
    File "/path/to/cpython/Lib/argparse.py", line 2467, in format_help
    return formatter.format_help()
    File "/path/to/cpython/Lib/argparse.py", line 281, in format_help
    help = self._root_section.format_help()
    File "/path/to/cpython/Lib/argparse.py", line 212, in format_help
    item_help = join([func(*args) for func, args in self.items])
    File "/path/to/cpython/Lib/argparse.py", line 212, in <listcomp>
    item_help = join([func(*args) for func, args in self.items])
    File "/path/to/cpython/Lib/argparse.py", line 336, in _format_usage
    assert ' '.join(opt_parts) == opt_usage
    AssertionError

    @sjfranklin sjfranklin mannequin added 3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes type-crash A hard crash of the interpreter, possibly with a core dump stdlib Python modules in the Lib dir labels Aug 21, 2019
    @paulj3
    Copy link
    Mannequin

    paulj3 mannequin commented Aug 22, 2019

    That usage formatting is extremely brittle. It's not just "" metavar that can mess it up. Other 'usual' characters can mess it in the same way.

    The underlying problem is that it formats the whole usage, and if it is too long tries to split it into pieces, and then reassemble it in wrapped lines. The assertion tries to verify that the split was accurate.

    Usage really needs to be rewritten in a way that keeps the individual Action pieces separate until it is ready to assemble them into final lines. Anything else is just bandaids.

    @sjfranklin
    Copy link
    Mannequin Author

    sjfranklin mannequin commented Oct 14, 2019

    Paul, very true. If I find time I may take a look at rewriting it as you suggest.
    For the moment, though, there's a relatively simple change that opens up a range of allowable whitespace characters. It's still a bandage, but it covers decent-sized area and should be easily backported. What do you think of the pull request? It's passed a first review.

    @rhettinger
    Copy link
    Contributor

    Paul, what do you think about the PR?

    @rhettinger rhettinger added 3.10 only security fixes and removed 3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes labels Jun 12, 2020
    @shihai1991
    Copy link
    Member

    LGTM. It's a lightweight patch :)

    @iritkatriel
    Copy link
    Member

    Reproduced on 3.11.

    @iritkatriel iritkatriel added 3.11 only security fixes type-bug An unexpected behavior, bug, or error and removed type-crash A hard crash of the interpreter, possibly with a core dump labels Oct 19, 2021
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    hamdanal added a commit to hamdanal/cpython that referenced this issue May 28, 2023
    Rationale
    =========
    
    argparse performs a complex formatting of the usage for argument grouping
    and for line wrapping to fit the terminal width. This formatting has been
    a constant source of bugs for at least 10 years (see linked issues below)
    where defensive assertion errors are triggered or brackets and paranthesis
    are not properly handeled.
    
    Problem
    =======
    
    The current implementation of argparse usage formatting relies on regular
    expressions to group arguments usage only to separate them again later
    with another set of regular expressions. This is a complex and error prone
    approach that caused all the issues linked below. Special casing certain
    argument formats has not solved the problem. The following are some of
    the most common issues:
    - empty `metavar`
    - mutually exclusive groups with `SUPPRESS`ed arguments
    - metavars with whitespace
    - metavars with brackets or paranthesis
    
    Solution
    ========
    
    The following two comments summarize the solution:
    - python#82091 (comment)
    - python#77048 (comment)
    
    Mainly, the solution is to rewrite the usage formatting to avoid the
    group-then-separate approach. Instead, the usage parts are kept separate
    and only joined together at the end. This allows for a much simpler
    implementation that is easier to understand and maintain. It avoids the
    regular expressions approach and fixes the corresponding issues.
    
    This closes the following issues:
    - Closes python#62090
    - Closes python#62549
    - Closes python#77048
    - Closes python#82091
    - Closes python#89743
    - Closes python#96310
    - Closes python#98666
    
    These PRs become obsolete:
    - Closes python#15372
    - Closes python#96311
    encukou pushed a commit that referenced this issue May 7, 2024
    Rationale
    =========
    
    argparse performs a complex formatting of the usage for argument grouping
    and for line wrapping to fit the terminal width. This formatting has been
    a constant source of bugs for at least 10 years (see linked issues below)
    where defensive assertion errors are triggered or brackets and paranthesis
    are not properly handeled.
    
    Problem
    =======
    
    The current implementation of argparse usage formatting relies on regular
    expressions to group arguments usage only to separate them again later
    with another set of regular expressions. This is a complex and error prone
    approach that caused all the issues linked below. Special casing certain
    argument formats has not solved the problem. The following are some of
    the most common issues:
    - empty `metavar`
    - mutually exclusive groups with `SUPPRESS`ed arguments
    - metavars with whitespace
    - metavars with brackets or paranthesis
    
    Solution
    ========
    
    The following two comments summarize the solution:
    - #82091 (comment)
    - #77048 (comment)
    
    Mainly, the solution is to rewrite the usage formatting to avoid the
    group-then-separate approach. Instead, the usage parts are kept separate
    and only joined together at the end. This allows for a much simpler
    implementation that is easier to understand and maintain. It avoids the
    regular expressions approach and fixes the corresponding issues.
    
    This closes the following GitHub issues:
    -  #62090
    -  #62549
    -  #77048
    -  #82091
    -  #89743
    -  #96310
    -  #98666
    
    These PRs become obsolete:
    -  #15372
    -  #96311
    @encukou
    Copy link
    Member

    encukou commented May 7, 2024

    Fixed in #105039:

    import argparse
    # based on Vajrasky Kok's script in https://bugs.python.org/issue11874
    parser = argparse.ArgumentParser(prog='PROG')
    parser.add_argument('--nil', metavar='', required=True)
    parser.add_argument('--a', metavar='a' * 165)
    parser.parse_args()
    usage: PROG [-h] --nil 
                [--a aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa]
    PROG: error: the following arguments are required: --nil
    

    import argparse
    # based on Vajrasky Kok's script in https://bugs.python.org/issue11874
    parser = argparse.ArgumentParser(prog='PROG')
    parser.add_argument('--nil', metavar='', required=True)
    parser.add_argument('--Line-Feed', metavar='\n', required=True)
    parser.add_argument('--Tab', metavar='\t', required=True)
    parser.add_argument('--Carriage-Return', metavar='\r', required=True)
    parser.add_argument('--Carriage-Return-and-Line-Feed',
                        metavar='\r\n', required=True)
    parser.add_argument('--vLine-Tabulation', metavar='\v', required=True)
    parser.add_argument('--x0bLine-Tabulation', metavar='\x0b', required=True)
    parser.add_argument('--fForm-Feed', metavar='\f', required=True)
    parser.add_argument('--x0cForm-Feed', metavar='\x0c', required=True)
    parser.add_argument('--File-Separator', metavar='\x1c', required=True)
    parser.add_argument('--Group-Separator', metavar='\x1d', required=True)
    parser.add_argument('--Record-Separator', metavar='\x1e', required=True)
    parser.add_argument('--C1-Control-Code', metavar='\x85', required=True)
    parser.add_argument('--Line-Separator', metavar='\u2028', required=True)
    parser.add_argument('--Paragraph-Separator', metavar='\u2029', required=True)
    parser.add_argument('--a', metavar='a' * 165)
    parser.parse_args()
    usage: PROG [-h] --nil  --Line-Feed 
     --Tab 	 --Carriage-Return 
                --Carriage-Return-and-Line-Feed 
     --vLine-Tabulation 
                         --x0bLine-Tabulation 
    
                --fForm-Feed 
                              --x0cForm-Feed 
                                              --File-Separator  --Group-Separator 
                --Record-Separator  --C1-Control-Code 
     --Line-Separator 

                --Paragraph-Separator 

                [--a aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa]
    PROG: error: the following arguments are required: --nil, --Line-Feed, --Tab, --Carriage-Return, --Carriage-Return-and-Line-Feed, --vLine-Tabulation, --x0bLine-Tabulation, --fForm-Feed, --x0cForm-Feed, --File-Separator, --Group-Separator, --Record-Separator, --C1-Control-Code, --Line-Separator, --Paragraph-Separator
    

    @encukou encukou closed this as completed May 7, 2024
    SonicField pushed a commit to SonicField/cpython that referenced this issue May 8, 2024
    Rationale
    =========
    
    argparse performs a complex formatting of the usage for argument grouping
    and for line wrapping to fit the terminal width. This formatting has been
    a constant source of bugs for at least 10 years (see linked issues below)
    where defensive assertion errors are triggered or brackets and paranthesis
    are not properly handeled.
    
    Problem
    =======
    
    The current implementation of argparse usage formatting relies on regular
    expressions to group arguments usage only to separate them again later
    with another set of regular expressions. This is a complex and error prone
    approach that caused all the issues linked below. Special casing certain
    argument formats has not solved the problem. The following are some of
    the most common issues:
    - empty `metavar`
    - mutually exclusive groups with `SUPPRESS`ed arguments
    - metavars with whitespace
    - metavars with brackets or paranthesis
    
    Solution
    ========
    
    The following two comments summarize the solution:
    - python#82091 (comment)
    - python#77048 (comment)
    
    Mainly, the solution is to rewrite the usage formatting to avoid the
    group-then-separate approach. Instead, the usage parts are kept separate
    and only joined together at the end. This allows for a much simpler
    implementation that is easier to understand and maintain. It avoids the
    regular expressions approach and fixes the corresponding issues.
    
    This closes the following GitHub issues:
    -  python#62090
    -  python#62549
    -  python#77048
    -  python#82091
    -  python#89743
    -  python#96310
    -  python#98666
    
    These PRs become obsolete:
    -  python#15372
    -  python#96311
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.10 only security fixes 3.11 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    Status: Doc issues
    Development

    No branches or pull requests

    4 participants