Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add shlex.quote #53932

Closed
brandon-rhodes mannequin opened this issue Aug 31, 2010 · 32 comments
Closed

Add shlex.quote #53932

brandon-rhodes mannequin opened this issue Aug 31, 2010 · 32 comments
Assignees
Labels
stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@brandon-rhodes
Copy link
Mannequin

brandon-rhodes mannequin commented Aug 31, 2010

BPO 9723
Nosy @birkenfeld, @vstinner, @ericvsmith, @ezio-melotti, @merwok, @bitdancer, @anacrolix, @IIIIllllIIIIllllIIIIllllIIIIllllIIIIll
Files
  • shlex.quote.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/merwok'
    closed_at = <Date 2011-08-12.16:05:02.948>
    created_at = <Date 2010-08-31.13:40:13.159>
    labels = ['type-feature', 'library']
    title = 'Add shlex.quote'
    updated_at = <Date 2012-04-18.17:13:06.313>
    user = 'https://bugs.python.org/brandon-rhodes'

    bugs.python.org fields:

    activity = <Date 2012-04-18.17:13:06.313>
    actor = 'eric.araujo'
    assignee = 'eric.araujo'
    closed = True
    closed_date = <Date 2011-08-12.16:05:02.948>
    closer = 'eric.araujo'
    components = ['Library (Lib)']
    creation = <Date 2010-08-31.13:40:13.159>
    creator = 'brandon-rhodes'
    dependencies = []
    files = ['22685']
    hgrepos = []
    issue_num = 9723
    keywords = ['patch']
    message_count = 32.0
    messages = ['115263', '115270', '115463', '126674', '126768', '126830', '126850', '126863', '126936', '126939', '126950', '127957', '134638', '134834', '140601', '141350', '141368', '141374', '141379', '141380', '141383', '141386', '141387', '141430', '141439', '141963', '141964', '141997', '142006', '142203', '142204', '158642']
    nosy_count = 10.0
    nosy_names = ['georg.brandl', 'vstinner', 'eric.smith', 'ezio.melotti', 'eric.araujo', 'r.david.murray', 'brandon-rhodes', 'anacrolix', 'xuanji', 'python-dev']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue9723'
    versions = ['Python 3.3']

    @brandon-rhodes
    Copy link
    Mannequin Author

    brandon-rhodes mannequin commented Aug 31, 2010

    The only way to safely build shell command lines from inside of Python — which is necessary when sending commands across SSH, since that behaves like os.system() rather than like subprocess.call() — is to use the wonderful pipes.call() method to turn possibly-dangerous arguments, like filenames that might have spaces, special characters, and embedded "rm -r" calls, into perfectly quoted strings for an "sh"-like shell (say, bash or zsh).

    This call is already recommended on mailing lists, blog posts, and Stack Overflow — and since it doesn't start with a "_", I think its public use is fair game. But the "pipes" documentation itself doesn't officially mention or support it. I think it should be added to the Standard Library documentation for "pipes". So. Yeah.

    @brandon-rhodes brandon-rhodes mannequin added the type-feature A feature request or enhancement label Aug 31, 2010
    @brandon-rhodes brandon-rhodes mannequin added the docs Documentation in the Doc dir label Aug 31, 2010
    @ericvsmith
    Copy link
    Member

    I think you mean pipe.quote in your message, not pipe.call. The subject looks correct.

    I'm not sure pipes is the best place for this, but I agree it should probably be documented in older versions. It seems to me we've had this discussion before about quoting command lines, how it applies differently between Windows and various shells, and which functions to expose. But having said that, I can't find a previous issue that discusses it.

    Not the least of my concerns is that pipes says it's available on Unix systems, despite the fact that I have it on a Windows machine. And I might need the functionality of passing a bash command from a Windows machine to a Unix machine, so we definitely need this cross platform.

    @merwok
    Copy link
    Member

    merwok commented Sep 3, 2010

    Eric referred to this thread: http://mail.python.org/pipermail/stdlib-sig/2010-May/thread.html

    @anacrolix
    Copy link
    Mannequin

    anacrolix mannequin commented Jan 21, 2011

    I agree, I discovered this function (pipes.quote) only through recommendation here:

    http://stackoverflow.com/questions/4748344/whats-the-reverse-of-shlex-split

    I suggest that it be added to shlex, perhaps as shlex.quote. While the quoting style it performs, and the module it's found in are specific to Unix tools, the shlex module is available on non-Unix platforms. To this end, adding the function to shlex would make it available elsewhere, as well as be appropriately shell-related.

    @merwok
    Copy link
    Member

    merwok commented Jan 21, 2011

    Why do you want to move quote from pipes to shlex? The function is available, the issue here is lack of documentation.

    @merwok merwok removed the type-feature A feature request or enhancement label Jan 21, 2011
    @anacrolix
    Copy link
    Mannequin

    anacrolix mannequin commented Jan 22, 2011

    Two reasons: The pipes module is Unix only, but pipes.quote is useful on all platforms. Secondly pipes.quote pertains to shell command-lines, this is also the domain of shlex which already cross platform. In pipes, an import shlex.quote would more than sufficient.

    If this belongs in another separate bug I shall submit one. Please advise.

    @merwok
    Copy link
    Member

    merwok commented Jan 22, 2011

    Even if quote does not start with an underscore, its absence from the docs and from the module’s __all__ make it a private function.

    I propose to make it public as shlex.quote in 3.3 and deprecate pipes.quote for people that relied on it (i.e. move code from pipes/test_pipes to shlex/test_shlex, add doc, write a pipes.quote function that sends a DeprecationWarning and calls shlex.quote).

    @merwok merwok added stdlib Python modules in the Lib dir and removed docs Documentation in the Doc dir labels Jan 22, 2011
    @merwok merwok changed the title pipes.quote() needs to be documented Add shlex.quote Jan 22, 2011
    @merwok merwok added the type-feature A feature request or enhancement label Jan 22, 2011
    @bitdancer
    Copy link
    Member

    Rather than doing a code deprecation you can just do 'from shlex import quote' in pipes (with a comment about backward compatibility). I don't think there is any real harm in leaving that kind of backward compatibility in place indefinitely.

    @merwok
    Copy link
    Member

    merwok commented Jan 24, 2011

    you can just do 'from shlex import quote' in pipes

    We would have had to do that anyway, since pipes needs to use the function. Agreed that a deprecation is not necessary.

    @bitdancer
    Copy link
    Member

    Yes, I know you have to do it anyway, that's why I used the adverb 'just' (as in 'just that, and nothing more') (I note that 'just' as an adverb tends to get overused by English speakers, and so loses some of its semantic value... :)

    @birkenfeld
    Copy link
    Member

    Putting quote() into shlex sounds good to me.

    @merwok
    Copy link
    Member

    merwok commented Feb 5, 2011

    A note from Ian Bicking in http://mail.python.org/pipermail/stdlib-sig/2010-May/000948.html:

    I've had to do this sort of thing for similar reasons, like calling remote
    ssh commands, where a shell command is embedded in a positional argument. I
    implemented my own thing, but the way pipes.quote works would have been
    nicer (the implementation could use a couple regexes instead of iterating
    over strings though).

    @merwok
    Copy link
    Member

    merwok commented Apr 28, 2011

    Someone on Stack Overflow suggested subprocess.list2cmdline. It’s one of those functions undocumented and absent from __all__ that somehow get found and used by some people. So when we make the patch, let’s include links to the new function doc from subprocess and pipes, and reply to the Stack Overflow thread.

    @ezio-melotti
    Copy link
    Member

    See also bpo-11827 about subprocess.list2cmdline.

    @merwok
    Copy link
    Member

    merwok commented Jul 18, 2011

    Here’s the patch, please review.

    @merwok merwok self-assigned this Jul 18, 2011
    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jul 29, 2011

    New changeset 5966eeb0457d by Éric Araujo in branch 'default':
    Add shlex.quote function, to escape filenames and command lines (bpo-9723).
    http://hg.python.org/cpython/rev/5966eeb0457d

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jul 29, 2011

    New changeset 43c41e19527a by Éric Araujo in branch 'default':
    Expand shlex.quote example (bpo-9723)
    http://hg.python.org/cpython/rev/43c41e19527a

    @merwok
    Copy link
    Member

    merwok commented Jul 29, 2011

    I suck at regexes, but the one I came up with lets quote pass the tests that existed for pipes.quote, so I figure it’s good. I did not change the string operations at the end of the function (+ and replace) as it was simple and working.

    In the absence of review or opposition here, I have committed my patch. Feedback welcome.

    @merwok merwok closed this as completed Jul 29, 2011
    @bitdancer
    Copy link
    Member

    Sorry I didn't look at the patch earlier. I thought we were just *moving* pipes.quote. Why is a new regex involved?

    @merwok
    Copy link
    Member

    merwok commented Jul 29, 2011

    Because I choose to follow Ian’s remark in msg127957:

    the implementation could use a couple regexes instead of iterating
    over strings though

    I have not touched the tests, so I felt confident with my regex. FWIW, I’ve also changed the argument name, since the function is not limited to file names (as I hopefully made clear in the doc).

    @bitdancer
    Copy link
    Member

    Well, it's a micro-optimization (it would be interesting to benchmark, but not worth it). I find the original code much more readable than the regex, but it doesn't matter all that much. (And as far as optimization goes, using translate might be even faster.)

    @merwok
    Copy link
    Member

    merwok commented Jul 29, 2011

    str.translate is not an option, as the code does not replace but add characters (quotes).

    Out of sheer curiosity, I may actually benchmark this.

    @bitdancer
    Copy link
    Member

    You aren't using a regex to replace the quotes, either.

    >>> len('abcd'.translate(str.maketrans('', '', string.ascii_letters ))) > 0
    False

    I don't know if this is faster than the corresponding search regex, but depending on how much overhead regex has, it might be. Note that the translate table can be pre-built, just like the compiled regex.

    @ezio-melotti
    Copy link
    Member

    +_find_unsafe = re.compile(r'[^\\w\\d@%_\\-\\+=:,\\./]').search

    \w already includes both \d and _, so (unless you really want to be explicit about it) they are redundant. Also keep in mind that they match non-ASCII letters/numbers on Python 3.
    '+' and '.' don't need to be escaped in a character class (i.e. [...]), because they lose their meta-characters meaning there.
    '-' is correctly escaped there, but it's common practice to place it at the end of the character class, where it doesn't need escaping.

    r'[^\\w\\d@%_\\-\\+=:,\\./]' and r'[^\\w@%+=:,./-]' should therefore be equivalent.

    @merwok
    Copy link
    Member

    merwok commented Jul 30, 2011

    \w already includes both \d and _, so (unless you really want to be
    explicit about it) they are redundant. [snip]
    I just started from the previous list and turned it into a regex. Eliminating redundancy sounds good, I’ll use your version.

    Also keep in mind that they match non-ASCII letters/numbers on
    Python 3.
    This was not covered by the previous tests, but I think it’s fine: the shell is concerned about some metacharacters and spaces, not Unicode letters. To be 100% sure, I will add tests with Unicode characters for pipes.quote in 3.2, and when merging with 3.3 I’ll know if the new code still complies.

    @merwok merwok reopened this Jul 30, 2011
    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Aug 12, 2011

    New changeset 8032ea4c3619 by Éric Araujo in branch '3.2':
    Test pipes.quote with a few non-ASCII characters (see bpo-9723).
    http://hg.python.org/cpython/rev/8032ea4c3619

    New changeset 6ae0345a7e29 by Éric Araujo in branch 'default':
    Avoid unwanted behavior change in shlex.quote (see bpo-9723).
    http://hg.python.org/cpython/rev/6ae0345a7e29

    @merwok
    Copy link
    Member

    merwok commented Aug 12, 2011

    I have restored compatibility (see commit messages).

    @merwok merwok closed this as completed Aug 12, 2011
    @ezio-melotti
    Copy link
    Member

    -_find_unsafe = re.compile(r'[^\\w\\d@%_\\-\\+=:,\\./]').search
    +_find_unsafe = re.compile(r'[^\\w@%\\-\\+=:,\\./]', re.ASCII).search

    FWIW there are still unnecessary escapes before '+' and '.', and possibly '-' ('-' doesn't need escaping only when it's at the end (or beginning) of the regex).

    @anacrolix
    Copy link
    Mannequin

    anacrolix mannequin commented Aug 13, 2011

    Why can't pipes.quote can't be moved to shlex.quote verbatim as I originally proposed?

    Is there justification to also change it as part of the relocation? I think any changes to its behaviour should be a separate issue.

    @merwok
    Copy link
    Member

    merwok commented Aug 16, 2011

    FWIW there are still unnecessary escapes before '+' and '.', and
    possibly '-'

    This is IMO cosmetic and not as “important” as the duplicate characters already implied by the character class. Feel free to change it.

    Why can't pipes.quote can't be moved to shlex.quote verbatim as I
    originally proposed?

    I took the opportunity of changing some convoluted code.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Aug 16, 2011

    New changeset 5d4438001069 by Ezio Melotti in branch 'default':
    bpo-9723: refactor regex.
    http://hg.python.org/cpython/rev/5d4438001069

    @merwok
    Copy link
    Member

    merwok commented Apr 18, 2012

    See bpo-14616 for a doc edition related to this.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants