Expected behavior of argparse given quoted strings #85766

Closed
vegarsti mannequin opened this issue Aug 20, 2020 · 9 comments
Labels
3.8 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@vegarsti
Mannequin

vegarsti mannequin commented Aug 20, 2020

BPO 41600
Nosy @rhettinger, @ericvsmith, @vegarsti

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2020-08-20.13:56:57.259>
created_at = <Date 2020-08-20.11:24:14.744>
labels = ['3.8', 'type-bug', 'library', 'invalid']
title = 'Expected behavior of argparse given quoted strings'
updated_at = <Date 2020-08-20.15:46:18.106>
user = 'https://github.com/vegarsti'

bugs.python.org fields:

activity = <Date 2020-08-20.15:46:18.106>
actor = 'vegarsti'
assignee = 'none'
closed = True
closed_date = <Date 2020-08-20.13:56:57.259>
closer = 'eric.smith'
components = ['Library (Lib)']
creation = <Date 2020-08-20.11:24:14.744>
creator = 'vegarsti'
dependencies = []
files = []
hgrepos = []
issue_num = 41600
keywords = []
message_count = 9.0
messages = ['375702', '375705', '375706', '375707', '375710', '375711', '375715', '375716', '375717']
nosy_count = 4.0
nosy_names = ['rhettinger', 'eric.smith', 'paul.j3', 'vegarsti']
pr_nums = []
priority = 'normal'
resolution = 'not a bug'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue41600'
versions = ['Python 3.8']

@vegarsti
Mannequin Author

vegarsti mannequin commented Aug 20, 2020

I'm not sure if this is a bug, but I had a problem when I was trying to use argparse recently, and I was wondering about the expected behavior.

For context: We invoke a Python program from a deployment tool, where we provide input in a text box. We were using argparse to read and parse the input arguments. In our scenario, two named arguments were required, as illustrated in the minimal example below.

# a.py

import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--a", required=True)
parser.add_argument("--b", required=True)
parser.parse_args()

When invoking this program from this deployment tool giving --a=1 --b=2 as input, we got the error message a.py: error: the following arguments are required: --a, --b.

As it turns out, the input was provided in the same way as if you had given the program a quoted string in the shell:

$ python a.py "--a=1 --b=2"
usage: a.py [-h] --a A --b B
a.py: error: the following arguments are required: --a, --b

When given a quoted string like this, sys.argv only has two elements, namely a.py and --a=1 --b=2. This was new to me! But it makes sense.
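To make the sys.argv observation concrete, here is a minimal sketch that starts a child Python process twice, once with the input as a single argument and once with it split in two. The helper name argv_seen_by_child and the inline child program are made up for illustration; they are not part of the original report.

```python
import subprocess
import sys

# Child program (illustrative): print every argument it received after argv[0].
CHILD = "import sys; print(sys.argv[1:])"


def argv_seen_by_child(*args):
    """Run a child Python process with the given arguments and return its output."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD, *args],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Equivalent to: python a.py "--a=1 --b=2"  (one argument containing a space)
quoted = argv_seen_by_child("--a=1 --b=2")
# Equivalent to: python a.py --a=1 --b=2    (two separate arguments)
unquoted = argv_seen_by_child("--a=1", "--b=2")

print(quoted)    # ['--a=1 --b=2']
print(unquoted)  # ['--a=1', '--b=2']
```

The program only ever sees the list it is handed; whether the input gets split is decided by whoever builds that list (the shell, or in this case the deployment tool).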

This was a bit annoying! One way to get around it, which we did indeed implement, is to mutate sys.argv, effectively unpacking the input string such that sys.argv ends up as ["a.py", "--a=1", "--b=2"].

Given that the string contains named arguments, it seems to me that it could be possible, and safe, to unpack this quoted string. Would that make sense? Or am I using it incorrectly? Or is there some other way to provide input such that I don't have to do this hack that I mentioned?

If we make a similar program where the arguments a and b are not named arguments, but rather positional arguments,

# b.py

import argparse

parser = argparse.ArgumentParser()
parser.add_argument("a")
parser.add_argument("b")
parser.parse_args()

and we call the program as before with python b.py "1 2", then a will be set to the string 1 2, whereas b will not be set (and so the program will, of course, exit). This seems entirely reasonable. And perhaps it isn't possible to both get this behaviour, as well as the behaviour that I mentioned above.

@vegarsti vegarsti mannequin added type-feature A feature request or enhancement 3.8 only security fixes stdlib Python modules in the Lib dir labels Aug 20, 2020
@vegarsti
Mannequin Author

vegarsti mannequin commented Aug 20, 2020

For what it's worth, I'd love to work on this if it's something that could be nice to have.

@vegarsti
Mannequin Author

vegarsti mannequin commented Aug 20, 2020

It seems that I mixed something up in the post here. If the quoted string is "--a=1 --b=2", as I said in the post, then the program will only complain about b missing. In this case, it sets a to be 1 --b=2. Whereas if the quoted string is "--a 1 --b 2" (i.e. a space rather than = is used as the separator), then it will say that both a and b are missing.

@vegarsti
Mannequin Author

vegarsti mannequin commented Aug 20, 2020

In fact, what happens in the latter case (i.e. "--a 1 --b 2"), inside the call to _parse_optional, is that it fails to get the optional tuple. And so it continues to this line in argparse.py:

if ' ' in arg_string:

Here it says that if there's a space in the string, it was meant to be a positional, and so the function returns None, causing it to not find the argument.

In conclusion, it seems to me that argparse is not, in fact, meant to handle quoted strings, or rather, strings where there are spaces.
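The two cases can be reproduced without a shell by passing a one-element list to the parser. This is only a sketch: it drops required=True and uses parse_known_args() so that the result can be inspected instead of the parser exiting with an error, which differs from a.py above.

```python
import argparse

# Same options as a.py, but not required, so the outcome can be inspected.
parser = argparse.ArgumentParser()
parser.add_argument("--a")
parser.add_argument("--b")

# Case 1: "=" separator. The string is split once at the first "=",
# so everything after it becomes the value of --a.
ns_eq, extras_eq = parser.parse_known_args(["--a=1 --b=2"])

# Case 2: space separator. The string matches no option and contains a
# space, so it is treated as a positional and ends up unrecognized.
ns_sp, extras_sp = parser.parse_known_args(["--a 1 --b 2"])

print(ns_eq.a, ns_eq.b, extras_eq)
print(ns_sp.a, ns_sp.b, extras_sp)
```

In the first case a is the string 1 --b=2 and b is unset; in the second case neither option is set and the whole string is left over as an unrecognized argument.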

@ericvsmith
Member

This is all working as designed. We do not want to modify argparse to split parameters.

You probably want to split the input with shlex.split(). See https://stackoverflow.com/questions/44945815/how-to-split-a-string-into-command-line-arguments-like-the-shell-in-python

You shouldn't need to mutate sys.argv. You can break the input up into multiple strings with shlex.split() (or whatever you decide to use) and pass those to ArgumentParser.parse_args().
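A minimal sketch of that suggestion, assuming the tool hands over everything as one string. The variable raw and the helper split_packed_args are hypothetical names chosen for illustration; shlex.split() and the args parameter of parse_args() are the standard-library pieces doing the work.

```python
import argparse
import shlex


def split_packed_args(argv):
    """Split each incoming argument with shell-like rules and flatten the result.

    Hypothetical helper. Caveat: an argument that legitimately contains
    spaces would be split as well, so this only suits callers that are
    known to pack several arguments into one string.
    """
    return [token for arg in argv for token in shlex.split(arg)]


parser = argparse.ArgumentParser()
parser.add_argument("--a", required=True)
parser.add_argument("--b", required=True)

# Stand-in for sys.argv[1:] as delivered by the deployment tool.
raw = ["--a=1 --b=2"]

# Pass the split list explicitly instead of mutating sys.argv.
args = parser.parse_args(split_packed_args(raw))
print(args.a, args.b)
```

In a real script, raw would be sys.argv[1:]; sys.argv itself stays untouched.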

@vegarsti
Mannequin Author

vegarsti mannequin commented Aug 20, 2020

I see! Thanks, had not heard about shlex. I also had not realized parse_args takes arguments. Doh. That makes sense.

Thanks a lot!

@ericvsmith ericvsmith added invalid type-bug An unexpected behavior, bug, or error and removed type-feature A feature request or enhancement labels Aug 20, 2020
@paulj3
Mannequin

paulj3 mannequin commented Aug 20, 2020

I'd say the problem is with the deployment tool. Inputs like that should be split regardless of who's doing the commandline parsing. With normal shell input, quotes are used to prevent splitting, or to otherwise prevent substitutions and special character handling.

@ericvsmith
Member

Completely agree with paul j3. The calling tool is breaking the "argv" conventions. If the OP can control the calling tool, it should be fixed there.

@vegarsti
Mannequin Author

vegarsti mannequin commented Aug 20, 2020

Great idea, thanks! It's open source, so I'll see if I can fix it.

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022