CKAN version
Definitely 2.8, probably 2.9, not sure about master. The bug appears to have been added 5 years ago. Definitely on Python 2, maybe on Python 3.
Describe the bug
https://site/%EF%AC%81?foo=bar&bz=%AC%81
ckan.config.middleware.__init__.py:handle_i18n urlencodes the querystring, but the function it uses is explicitly not for whole querystrings, as it encodes the special characters. (It encodes & and =, so it's for values that go into URLs and querystrings, not the whole querystring.)
This urlencoded querystring gets added back into the current url, at which point we have a current "url" which is potentially not a valid url for anything in our system. As evidence for this, datastore/backend/postgres.py has added a patch to parse the returned url, unquote it, and then reassemble it.
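To illustrate the difference, here is a minimal Python 3 sketch using the standard library's `urllib.parse.quote`. It is not the CKAN code; `quote_pairs` is a hypothetical helper, and the querystring is the one from the example URL above.

```python
from urllib.parse import quote

query_string = "foo=bar&bz=%AC%81"

# Quoting the whole querystring escapes the separators (& and =) and
# double-encodes the percent escapes that were already there.
quoted_whole = quote(query_string, safe="")
print(quoted_whole)  # foo%3Dbar%26bz%3D%25AC%2581


def quote_pairs(pairs):
    """Hypothetical helper: quote each key and value on its own,
    so the & and = separators keep their meaning."""
    return "&".join(
        "{}={}".format(quote(key, safe=""), quote(value, safe=""))
        for key, value in pairs
    )


quoted_pairs = quote_pairs([("foo", "bar"), ("q", "a&b=c")])
print(quoted_pairs)  # foo=bar&q=a%26b%3Dc
```

Quoting per value keeps the result parseable as a querystring, whereas the whole-string result is a single opaque token.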
Because of this, helpers.current_url returns a urldecoded version of the url, which is also not what we want, because that decodes the path component as well.
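A short Python 3 sketch of why decoding the whole url is lossy for the path. The url and the `unquote_query_only` helper are made up for illustration and are not taken from CKAN.

```python
from urllib.parse import unquote, urlsplit, urlunsplit

# Hypothetical url: the path contains an escaped slash (%2F).
url = "/dataset/a%2Fb?q=x%20y"

# Decoding everything turns the escaped slash into a real path separator,
# so the path now has a different number of segments.
decoded_all = unquote(url)
print(decoded_all)  # /dataset/a/b?q=x y


def unquote_query_only(raw_url):
    """Hypothetical helper: decode only the query, leave the path escaped."""
    parts = urlsplit(raw_url)
    return urlunsplit(parts._replace(query=unquote(parts.query)))


decoded_query = unquote_query_only(url)
print(decoded_query)  # /dataset/a%2Fb?q=x y
```

Splitting first and touching only the query is roughly the parse, unquote, reassemble approach described above; whether the query should be decoded at all is a separate question.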
Now, once we're url decoding, we're taking it out of ASCII and putting essentially arbitrary bytes through a unicode encode/decode process, potentially getting invalid UTF-8, or trying to put UTF-8 bytes into an ASCII string, which causes an error.
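Both failure modes can be shown with a small Python 3 sketch. It assumes nothing about CKAN internals; `decode_component` and `fits_in_ascii` are hypothetical helpers, and the inputs are the path and the `bz` value from the example URL.

```python
from urllib.parse import unquote_to_bytes


def decode_component(component):
    """Hypothetical helper: percent-decode one url component and return
    the text, or None when the decoded bytes are not valid UTF-8."""
    raw = unquote_to_bytes(component)
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


def fits_in_ascii(text):
    """Return True if the text can be stored in an ASCII-only string."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


path = decode_component("/%EF%AC%81")  # valid UTF-8: "/" plus U+FB01
bad_value = decode_component("%AC%81")  # not valid UTF-8 on its own

print(repr(path))           # '/\ufb01'
print(bad_value)            # None
print(fits_in_ascii(path))  # False
```

Catching the decode error at the boundary lets the caller answer with a 404 or 400 instead of letting the exception surface as a 500.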
Steps to reproduce
Steps to reproduce the behavior:
Visit https://site/%EF%AC%81 and it returns a 500 error, not a 404. The path is the percent-encoded UTF-8 form of the fi ligature (U+FB01).
Expected behavior
Return a 404, not a server error.