shlex doesn't differentiate escaped characters in output #81078
The outputs of the following invocations are exactly the same:
list(shlex.shlex('a ; b', posix=True, punctuation_chars=True))
list(shlex.shlex('a \; b', posix=True, punctuation_chars=True))
They both output the following:
['a', ';', 'b']
This makes it impossible to determine when the user wanted to escape the semi-colon for some reason, such as if they were using find's |
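A minimal script that reproduces the report might look like the following. It uses the same `shlex.shlex` arguments as above; the `tokens` helper name is only for illustration, and the escaped input is written as a raw string so that Python itself does not reinterpret the backslash.

```python
import shlex


def tokens(text):
    # Same arguments as the invocations in the report.
    return list(shlex.shlex(text, posix=True, punctuation_chars=True))


unescaped = tokens("a ; b")
escaped = tokens(r"a \; b")  # raw string: a backslash followed by ';'

print(unescaped)
print(escaped)
print(unescaped == escaped)  # the two token lists compare equal
```

Because the backslash is consumed during tokenization, nothing in the returned list records that the second input contained an escape.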
The goal is to match posix shell semantics. Can you provide a concrete example where shlex.shlex does something different from a posix-compliant shell? With all the escaping, it's going to be tough. Note also that your code raises a DeprecationWarning in 3.7, at least, and will be an error in the future. You should probably use r-strings in your examples. |
The point is that it's not possible to use the output of shlex.shlex to try to match the behaviour of a POSIX-compliant shell by reliably splitting up a user's input into multiple commands. In the first case I presented (no escape character), the user entered two commands. In the second case, the user entered a single command with two arguments. However, there's no way to differentiate the two situations based on the output of shlex.
It's also worth noting that the output is the same with this too:
list(shlex.shlex('a \\; b', posix=True, punctuation_chars=True))
I tested this code on Python 3.6.7 and 3.7.2, and didn't see any deprecation warnings at all. I also checked the history of shlex.py: https://github.com/python/cpython/commits/master/Lib/shlex.py
The last commit was from 2017, and I don't see any usages of DeprecationWarning inside that file. I'm also not sure how r-strings are relevant, as I don't see any regular expressions used inside of the shlex class. |
Run 3.7 with -Wd:
$ python3 -Wd
Python 3.7.3 (default, Mar 29 2019, 13:03:53)
[GCC 7.4.0] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 'a \; b'
<stdin>:1: DeprecationWarning: invalid escape sequence \;
'a \\; b'
>>>
The deprecation relates to invalid escape sequences, not to shlex. My point is just that you should use r'a \; b' or 'a \\; b', and not rely on invalid escape sequences. For one thing, I can never remember how they're interpreted, and had to look it up. r-strings don't have anything to do with regular expressions per se; they're a way of changing how Python interprets strings, no matter what they're used for.
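As a small illustration of the two recommended spellings, the sketch below compares a raw string with a regular string that doubles the backslash. The variable names are only for illustration.

```python
raw = r"a \; b"      # raw string: the backslash is taken literally
doubled = "a \\; b"  # regular string: the backslash is written as \\

print(raw == doubled)  # both literals denote the same string
print(len(raw))        # the backslash counts as a single character
print(list(raw))       # shows each character individually
```

Either form avoids the invalid escape sequence and hands shlex a string that contains one backslash before the semi-colon.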
My question is: can a posix-compliant shell tell the difference? I don't know, it's an honest question. Can you show some shell code where it can tell the difference? |
My apologies, I didn't realise you were talking about the invalid escape sequence. Thanks for letting me know that it's deprecated; I'll definitely keep that in mind going forward.
In a bash shell with the find command available, run the following command:
find . -type f -exec ls {} \;
You should see a list of files. If you run this:
find . -type f -exec ls {} ;
You should see an error message from find: "find: missing argument to `-exec'"
If I pass the first example in this message to shlex, I get no indication that the user attempted to escape the semi-colon in their input. |
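One possible workaround, sketched here under stated assumptions, is to split the raw input on unescaped, unquoted semi-colons before tokenizing each piece. The `split_commands` helper is hypothetical and not part of shlex; it handles only `;` as a separator and ignores other shell operators such as `&&` or `|`.

```python
import shlex


def split_commands(line):
    """Split on ';' that is neither backslash-escaped nor quoted.

    Hypothetical helper for illustration; not part of the shlex module.
    """
    commands, current = [], []
    quote = None  # the quote character currently open, if any
    i = 0
    while i < len(line):
        ch = line[i]
        if quote == "'":
            # Inside single quotes everything is literal until the close.
            current.append(ch)
            if ch == "'":
                quote = None
        elif ch == "\\" and i + 1 < len(line):
            # Keep the backslash and the escaped character together.
            current.append(line[i:i + 2])
            i += 1
        elif quote == '"':
            current.append(ch)
            if ch == '"':
                quote = None
        elif ch in "'\"":
            quote = ch
            current.append(ch)
        elif ch == ";":
            # Unescaped, unquoted separator: close the current command.
            commands.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    commands.append("".join(current))
    # Tokenize each non-empty command with POSIX rules.
    return [shlex.split(c) for c in commands if c.strip()]


print(split_commands(r"find . -type f -exec ls {} \;"))
print(split_commands("find . -type f -exec ls {} ;"))
print(split_commands("a ; b"))
print(split_commands(r"a \; b"))
```

With this approach the escaped semi-colon stays inside its command as an ordinary argument, while the bare one ends the command, which mirrors the difference find reports in the two invocations above.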