WSGI adapter does not validate if query_string can be decoded to ascii #317

rodrigoffzz · 2022-02-18T21:28:50Z

hi,

on wsgi.py, when running build_environ(), the query_string is not proper validated before being decoded to ascii and it can lead to a general failure:

    def build_environ(self, scope, body):
    #...
        environ = {
            "REQUEST_METHOD": scope["method"],
            "SCRIPT_NAME": scope.get("root_path", "").encode("utf8").decode("latin1"),
            "PATH_INFO": scope["path"].encode("utf8").decode("latin1"),
            "QUERY_STRING": scope["query_string"].decode("ascii"),
            #...
        }

a simple test would be to run the test_wsgi_path_encoding test with 'ç' in the query_string, it fails with:

E     File "/home/user/asgiref/tests/test_wsgi.py", line 81
E       "query_string": b"çççç",
E                               ^
E   SyntaxError: bytes can only contain ASCII literal characters.

I understand that django can handle these requests properly without crashing when running alongside asgiref, but what about when it runs on its own?
do you think this might be a real issue that could be exploited to cause a DoS?

thanks for checking it.

The text was updated successfully, but these errors were encountered:

andrewgodwin · 2022-02-18T22:34:40Z

You are correct, it assumes the query string is ASCII as there is no other sensible assumption we can make when going to WSGI, apart from maybe using latin-1 encoding. HTTP generally doesn't tolerate non-ASCII characters in URLs well anyway.

The error you posted is unrelated - that's a syntax error as you have tried to put non-ASCII characters in a bytestring, which Python doesn't allow. The error you'd get at runtime would be a DecodeError, which would be caught by the server and result in an error page.

Unless you have a concrete suggestion of how we can change this behaviour without breaking anything else, I'm not sure what we can do here.

rodrigoffzz · 2022-02-21T13:16:25Z

hi,
thanks your prompt reply.

I'm really sorry about the error I posted, I was trying to make the report easier and didn't pay attention to the error at all, sorry for that, I was looking for an UnicodeDecodeError and not a SyntaxError, I'm glad you got the idea, thanks for that.

honestly, I don't have a concrete suggestion at this point. I was trying to analyze the code from a security perspective and I decide to raise the discussion here.

as I understand for from your comment, there are other general HTTP guards, that does make sense, that would prevent this issue to be exploited, right? If so, I don't see a clear reason to be changing the code and taking the risk of breaking things around in order to fix the issue. If that is the case, please consider closing this issue.

thanks for your attention.

andrewgodwin · 2022-02-21T17:32:50Z

Yes, I don't believe this can be exploited, even for a DoS attack. Thanks for checking, though!

andrewgodwin closed this as completed Feb 21, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

WSGI adapter does not validate if query_string can be decoded to ascii #317

WSGI adapter does not validate if query_string can be decoded to ascii #317

rodrigoffzz commented Feb 18, 2022

andrewgodwin commented Feb 18, 2022

rodrigoffzz commented Feb 21, 2022

andrewgodwin commented Feb 21, 2022

WSGI adapter does not validate if query_string can be decoded to ascii #317

WSGI adapter does not validate if query_string can be decoded to ascii #317

Comments

rodrigoffzz commented Feb 18, 2022

andrewgodwin commented Feb 18, 2022

rodrigoffzz commented Feb 21, 2022

andrewgodwin commented Feb 21, 2022