Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WSGI adapter does not validate if query_string can be decoded to ascii #317

Closed
rodrigoffzz opened this issue Feb 18, 2022 · 3 comments
Closed

Comments

@rodrigoffzz
Copy link

hi,

on wsgi.py, when running build_environ(), the query_string is not proper validated before being decoded to ascii and it can lead to a general failure:

    def build_environ(self, scope, body):
    #...
        environ = {
            "REQUEST_METHOD": scope["method"],
            "SCRIPT_NAME": scope.get("root_path", "").encode("utf8").decode("latin1"),
            "PATH_INFO": scope["path"].encode("utf8").decode("latin1"),
            "QUERY_STRING": scope["query_string"].decode("ascii"),
            #...
        }

a simple test would be to run the test_wsgi_path_encoding test with 'ç' in the query_string, it fails with:

E     File "/home/user/asgiref/tests/test_wsgi.py", line 81
E       "query_string": b"çççç",
E                               ^
E   SyntaxError: bytes can only contain ASCII literal characters.

I understand that django can handle these requests properly without crashing when running alongside asgiref, but what about when it runs on its own?
do you think this might be a real issue that could be exploited to cause a DoS?

thanks for checking it.

@andrewgodwin
Copy link
Member

You are correct, it assumes the query string is ASCII as there is no other sensible assumption we can make when going to WSGI, apart from maybe using latin-1 encoding. HTTP generally doesn't tolerate non-ASCII characters in URLs well anyway.

The error you posted is unrelated - that's a syntax error as you have tried to put non-ASCII characters in a bytestring, which Python doesn't allow. The error you'd get at runtime would be a DecodeError, which would be caught by the server and result in an error page.

Unless you have a concrete suggestion of how we can change this behaviour without breaking anything else, I'm not sure what we can do here.

@rodrigoffzz
Copy link
Author

hi,
thanks your prompt reply.

I'm really sorry about the error I posted, I was trying to make the report easier and didn't pay attention to the error at all, sorry for that, I was looking for an UnicodeDecodeError and not a SyntaxError, I'm glad you got the idea, thanks for that.

honestly, I don't have a concrete suggestion at this point. I was trying to analyze the code from a security perspective and I decide to raise the discussion here.

as I understand for from your comment, there are other general HTTP guards, that does make sense, that would prevent this issue to be exploited, right? If so, I don't see a clear reason to be changing the code and taking the risk of breaking things around in order to fix the issue. If that is the case, please consider closing this issue.

thanks for your attention.

@andrewgodwin
Copy link
Member

Yes, I don't believe this can be exploited, even for a DoS attack. Thanks for checking, though!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants