urlencode doesn't escape slashes #515

Closed
chuwy opened this Issue Nov 20, 2015 · 19 comments

chuwy commented Nov 20, 2015

This was raised in #444, but this code always evaluates to b'/' on Python 3.4.1, so slashes are still not escaped.

@mitsuhiko mitsuhiko closed this in 8189d21 Nov 20, 2015

chuwy commented Nov 20, 2015

Just a heads up @mitsuhiko
I have no idea why, but do_urlencode still doesn't escape slashes, while unicode_urlencode works as expected.

Python 3.4.3 (default, Oct 14 2015, 20:28:29)
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import jinja2
>>> jinja2.utils.unicode_urlencode("http://url.by", for_qs=True)
'http%3A%2F%2Furl.by'
>>> jinja2.filters.do_urlencode("http://url.by")
'http%3A//url.by'
>>> jinja2.__version__
'2.9.dev'
Member

mitsuhiko commented Nov 20, 2015

Because the urlencode filter does not escape slashes. Is there a specific reason why it has to? To clarify this: it only encodes slashes in the value position of passed key/value pairs.
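
For comparison, Python's standard library draws the same line: `urllib.parse.quote` keeps `/` by default, while `urllib.parse.urlencode` escapes it in the value position of key/value pairs. The following is a minimal stdlib-only sketch of that distinction; it is not Jinja's own implementation, and the `next` key is just an illustrative name.

```python
from urllib.parse import quote, urlencode

url = "http://url.by"

# Plain string: "/" is in quote()'s default safe set, so it is kept.
as_path = quote(url)

# Emptying the safe set forces every reserved character to be escaped.
as_component = quote(url, safe="")

# Key/value pairs: urlencode() escapes "/" in the value position.
as_query = urlencode({"next": url})

print(as_path)       # http%3A//url.by
print(as_component)  # http%3A%2F%2Furl.by
print(as_query)      # next=http%3A%2F%2Furl.by
```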

chuwy commented Nov 20, 2015

I believe this is standard behaviour for a function intended for encoding URLs. Tools like http://meyerweb.com/eric/tools/dencoder/ behave this way. I also have at least one tool inside our company that expects URLs passed in GET requests to have escaped slashes.
Also, I don't understand why unicode_urlencode is called inside do_urlencode with for_qs=True.
Maybe I am misunderstanding something.

Member

mitsuhiko commented Nov 20, 2015

Slashes are reserved characters in the path component, and the more common behavior when URL-encoding a value is to encode everything but slashes there. The alternative (encoding slashes as %2f) does not even make sense: most servers outright reject such requests for security reasons, because backend servers typically cannot distinguish %2f from / in the path component, as they operate on decoded octets.

So the only part where a slash actually makes sense encoding is in query strings and this is where the dict based encoder that urlencode has works like that. However even there a slash does not have to be encoded, so there is no reason to force it to be encoded.

The urlencode function should be useful for most people by default, which is why it does not encode a slash. If you have custom requirements, you can override the function in your filter registration.
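
A rough sketch of such an override follows. It assumes the environment exposes a `filters` dict mapping filter names to callables, as `jinja2.Environment` does; `strict_urlencode` and `register_strict_urlencode` are hypothetical names, and a `SimpleNamespace` stands in for the real environment so the snippet runs without Jinja installed.

```python
from types import SimpleNamespace
from urllib.parse import quote


def strict_urlencode(value):
    """Percent-encode every reserved character, including "/"."""
    return quote(str(value), safe="")


def register_strict_urlencode(env):
    """Replace the "urlencode" filter on a Jinja-style environment.

    Assumes `env.filters` is a dict mapping filter names to callables.
    """
    env.filters["urlencode"] = strict_urlencode
    return env


# Stand-in for jinja2.Environment(); with Jinja installed, pass a real one.
env = register_strict_urlencode(SimpleNamespace(filters={}))
print(env.filters["urlencode"]("http://url.by"))  # http%3A%2F%2Furl.by
```

Note that this replacement only handles single values; unlike the built-in filter, it does not turn a dict into a query string.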

chuwy commented Nov 20, 2015

Ok. Thank you Armin.

linwukang commented Jan 19, 2016

Hi, I have a similar problem with rejecting a slash. Here is the code:

jinja2.Template("{{ disks|reject('sameas', '/')|list }}").render(disks=["/", "/mnt/disk0", "/mnt/disk1"])
u"['/', '/mnt/disk0', '/mnt/disk1']"

I want to reject the root disk, but it does not work. How can I solve it?

Member

ThiefMaster commented Jan 19, 2016

>>> import jinja2
>>> jinja2.Template("{{ disks|reject('sameas', '/')|list }}").render(disks=["/", "/mnt/disk0", "/mnt/disk1"])
u"['/mnt/disk0', '/mnt/disk1']"
>>> jinja2.__version__
Out[3]: '2.8'

Works for me.

linwukang commented Jan 19, 2016

It still does not work for me, @ThiefMaster. Could this be a Python issue? Which version of Python did you use?

    Python 2.7.10 (default, May 23 2015, 09:44:00) [MSC v.1500 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import jinja2
    >>> jinja2.Template("{{ disks|reject('sameas', '/')|list }}").render(disks=["/", "/mnt/disk0", "/mnt/disk1"])
    u"['/', '/mnt/disk0', '/mnt/disk1']"
    >>> jinja2.__version__
    '2.8'
Member

ThiefMaster commented Jan 19, 2016

Oh.. sameas uses is (and you cannot expect 'foo' is 'foo' to work). You want equalto which uses ==.
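
The difference can be sketched in plain Python. The `reject` helper below is a hypothetical stand-in for Jinja's filter, with `operator.is_` playing the role of sameas and `operator.eq` the role of equalto; the target string is built at runtime so that it is equal to a list entry without necessarily being the same object.

```python
import operator


def reject(items, test, value):
    """Drop items for which test(item, value) is true (stand-in for Jinja's reject)."""
    return [item for item in items if not test(item, value)]


disks = ["/", "/mnt/disk0", "/mnt/disk1"]

# An equal string built at runtime rather than taken from the list.
target = "".join(["/mnt/", "disk0"])

# Equality compares values, so the matching entry is always rejected.
by_equality = reject(disks, operator.eq, target)

# Identity compares objects; whether anything is rejected depends on
# interpreter details such as string caching, so it cannot be relied on.
by_identity = reject(disks, operator.is_, target)

print(by_equality)  # ['/', '/mnt/disk1']
print(by_identity)
```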

linwukang commented Jan 19, 2016

Yes, it works. I'm not familiar with Jinja2. Thanks, @ThiefMaster

MartinNowak commented Mar 28, 2017

> So the only part where a slash actually makes sense encoding is in query strings and this is where the dict based encoder that urlencode has works like that. However even there a slash does not have to be encoded, so there is no reason to force it to be encoded.

No, for example you need to encode / in usernames and passwords as well.
It's the reason why JS has encodeURI and encodeURIComponent.

Member

mitsuhiko commented Mar 28, 2017

That's fair enough but in practice it's not necessary there either and included credentials are deprecated anyways. Since those are unlikely to be produced within templates it's an edge case that is not really worth considering.

danielkza commented Oct 7, 2017

Ansible uses Jinja, and it's pretty common to handle security credentials when setting up systems. I just hit a case where an automatically generated password contained a slash that urlencode did not escape when generating a database URL, which is pretty unfortunate. While breaking current behavior would be problematic, why not introduce a second filter that does escape the slashes?
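
To illustrate the failure mode with the standard library alone: the user, password, host, and database names below are made up, and `urlsplit` stands in for whatever parser the database client actually uses. An unescaped slash in the password ends the authority part early, so the host is misread.

```python
from urllib.parse import quote, unquote, urlsplit

user = "app"
password = "s3cr/et"  # hypothetical generated password containing a slash

# safe="" escapes "/" too, which the userinfo part of a URL requires.
good_url = "postgresql://{}:{}@db.example.com:5432/appdb".format(
    quote(user, safe=""), quote(password, safe="")
)

# Default quote() keeps "/", so the authority ends at the slash in the password.
bad_url = "postgresql://{}:{}@db.example.com:5432/appdb".format(
    quote(user), quote(password)
)

good = urlsplit(good_url)
bad = urlsplit(bad_url)

print(good.hostname, unquote(good.password))  # db.example.com s3cr/et
print(bad.hostname)                           # app
```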

Member

ThiefMaster commented Oct 7, 2017

Ansible could do that. There is no need for such a change to be in Jinja itself - it is extensible enough to add custom filters or even replace builtin ones.

danielkza commented Oct 7, 2017

@ThiefMaster Are use cases other than constructing HTML templates irrelevant when determining what is useful to be included in Jinja itself? For example, the Saltstack project, with similar purpose to Ansible, also uses Jinja for templating, and would benefit from the same change.

Member

mitsuhiko commented Oct 7, 2017

@danielkza what stops saltstack from providing a filter that does that?

danielkza commented Oct 7, 2017

@mitsuhiko Why does Jinja include any built-in filters then? I can only guess it is because they're useful in multiple use cases. I used Ansible and Salt as two examples of where being able to escape slashes in URLs is desired, and hence, it would be valuable to have it available for everyone.

What about adding a safe argument to urlencode, as Python's urllib.parse.quote (urllib.quote on Python 2) has, so that by default slashes are preserved, but in a way that can be easily overridden?
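
As a sketch of what that proposal might look like, here is a hypothetical filter function (`urlencode_with_safe` is not part of Jinja) that forwards a `safe` argument to the standard library's `quote` and keeps the current slash-preserving default:

```python
from urllib.parse import quote


def urlencode_with_safe(value, safe="/"):
    """Hypothetical filter: percent-encode value, leaving characters in `safe` alone."""
    return quote(str(value), safe=safe)


print(urlencode_with_safe("a/b c"))           # a/b%20c
print(urlencode_with_safe("a/b c", safe=""))  # a%2Fb%20c
```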

Member

mitsuhiko commented Oct 7, 2017

Jinja attempts to provide some commonly used functionality. We have two modes for urlencoding which gets you about 95% there. You can encode entire querystrings by encoding a dict and you can encode to a common set which is valid for paths through urlencode on strings.

We don't do anything other than UTF-8 or that, because where would it stop? There are too many parts of a URL, there are IRIs, and they all have their own kinks. When we are there, why not just also provide a punycode encoder for the netloc?

ahuffman commented Apr 4, 2018

I hit this issue when trying to create a file in a gitlab repository via API calls. The gitlab api requires the slashes to be encoded. To make this work I do: {{ myvar | urlencode | regex_replace('/','%2F') }}. I'm working with Ansible and Jinja2 filters in my playbook tasks. This could be a workaround for those of you hitting this, as I validated it works.
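
In plain Python terms, this workaround amounts to the two-step expression below, which gives the same result as a single `quote` call with an empty safe set. This is a stdlib sketch rather than Ansible's or Jinja's code, and the file path is a made-up example.

```python
from urllib.parse import quote

file_path = "src/app/main.py"  # hypothetical repository file path

# Rough equivalent of: {{ myvar | urlencode | regex_replace('/', '%2F') }}
two_step = quote(file_path).replace("/", "%2F")

# Same result in one call by emptying the safe set.
one_step = quote(file_path, safe="")

print(two_step)  # src%2Fapp%2Fmain.py
```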
