Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http.server parse_request() bug and error reporting #72734

Closed
warsaw opened this issue Oct 28, 2016 · 9 comments
Closed

http.server parse_request() bug and error reporting #72734

warsaw opened this issue Oct 28, 2016 · 9 comments
Labels
3.7 only security fixes stdlib Python modules in the Lib dir

Comments

@warsaw
Copy link
Member

warsaw commented Oct 28, 2016

BPO 28548
Nosy @warsaw, @vadmium
PRs
  • [Do Not Merge] Convert Misc/NEWS so that it is managed by towncrier #552
  • Files
  • parse-version.patch
  • parse-version.v2.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2016-11-19.03:06:33.902>
    created_at = <Date 2016-10-28.15:57:09.762>
    labels = ['3.7', 'library']
    title = 'http.server parse_request() bug and error reporting'
    updated_at = <Date 2017-03-31.16:36:25.199>
    user = 'https://github.com/warsaw'

    bugs.python.org fields:

    activity = <Date 2017-03-31.16:36:25.199>
    actor = 'dstufft'
    assignee = 'none'
    closed = True
    closed_date = <Date 2016-11-19.03:06:33.902>
    closer = 'martin.panter'
    components = ['Library (Lib)']
    creation = <Date 2016-10-28.15:57:09.762>
    creator = 'barry'
    dependencies = []
    files = ['45259', '45300']
    hgrepos = []
    issue_num = 28548
    keywords = ['patch']
    message_count = 9.0
    messages = ['279611', '279634', '279637', '279639', '279640', '279646', '279673', '281190', '281194']
    nosy_count = 3.0
    nosy_names = ['barry', 'python-dev', 'martin.panter']
    pr_nums = ['552']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue28548'
    versions = ['Python 3.7']

    @warsaw
    Copy link
    Member Author

    warsaw commented Oct 28, 2016

    This might also affect other Python version; I haven't checked, but I know it affects Python 3.5.

    In Mailman 3, we have a subclass of WSGIRequestHandler for our REST API, and we got a bug report about an error condition returning a 400 in the body of the response, but still returning an implicit 200 header.

    https://gitlab.com/mailman/mailman/issues/288

    This is pretty easily reproducible with the following recipe.

    $ git clone https://gitlab.com/mailman/mailman.git
    $ cd mailman
    $ tox -e py35 --notest -r
    $ .tox/py35/bin/python3.5 /home/barry/projects/mailman/trunk/.tox/py35/bin/runner --runner=rest:0:1 -C /home/barry/projects/mailman/trunk/var/etc/mailman.cfg 

    (Note that you might need to run .tox/py35/bin/mailman info first, and of course you'll have to adjust the directories for your own local fork.)

    Now, in another shell, do the following:

    $ curl -v -i -u restadmin:restpass "http://localhost:8001/3.0/lists/list example.com"

    Note specifically that there is a space right before the "example.com" bit.

    Take note also that we're generating an HTTP/1.1 request as per curl default.

    The response you get is:

    • Trying 127.0.0.1...
    • Connected to localhost (127.0.0.1) port 8001 (#0)
    • Server auth using Basic with user 'restadmin'

    GET /3.0/lists/list example.com HTTP/1.1
    Host: localhost:8001
    Authorization: Basic cmVzdGFkbWluOnJlc3RwYXNz
    User-Agent: curl/7.50.1
    Accept: */*

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd"\>
    <html>
    <head>
    <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
    <title>Error response</title>
    </head>
    <body>
    <h1>Error response</h1>
    <p>Error code: 400</p>
    <p>Message: Bad request syntax ('GET /3.0/lists/list example.com HTTP/1.1').</p>
    <p>Error code explanation: HTTPStatus.BAD_REQUEST - Bad request syntax or unsupported method.</p>
    </body>
    </html>

    • Connection #0 to host localhost left intact

    Notice the lack of response headers, and thus the implicit 200 return even though the proper error code is in the body of the response. Why does this happen?

    Now look at http.server.BaseHTTPRequestHandler. The default_request_version is "HTTP/0.9". Given the request, you'd expect to see the version set to "HTTP/1.1", but that doesn't happen because the extra space messes up the request parsing. parse_request() splits the request line by spaces and when this happens, the wrong number of words shows up. We get 4 words, thus the 'else:' clause in parse_request() gets triggered. So far so good.

    This eventually leads us to send_error() and from there into send_response() with the error code (properly 400) and message. From there we get to .send_response_only() and tracing into this function shows you where things go wrong.

    send_response_only() has an explicit test on self.request_version != 'HTTP/0.9', in which case it adds nothing to the _header_buffer. Well, because the request parsing got the unexpected number of words, in fact request_version *is* the default HTTP/0.9. Thus the error headers are never added.

    There are several problems here. Why are the headers never added when the request is HTTP/0.9? (I haven't read the spec.) Also, since parse_request() isn't setting the request version to 'HTTP/1.1'. It should probably dig out the words[-1] and see if that looks like a version string.

    Clearly the request isn't properly escaped, but http.server should not be sending an implicit 200 when the request is bogus.

    @warsaw warsaw added the stdlib Python modules in the Lib dir label Oct 28, 2016
    @vadmium
    Copy link
    Member

    vadmium commented Oct 28, 2016

    I think you should be able to reproduce this without Mailman or tox, by just running “python -m http.server”.

    The problem is the “HTTP/0.9” protocol that Python is assuming does not include a header section, so there is no place to put a 400 status code or header fields. The HTTP 0.9 response is supposed to only be a HTML body; see <https://www.w3.org/Protocols/HTTP/AsImplemented.html\> and <https://tools.ietf.org/html/rfc1945#section-6\>.

    I think we should drop HTTP 0.9 response support from Python’s HTTP server, as well as the attempted but buggy request support. But there was a bit of resistance; see bpo-10721.

    Another possibility would be to change default_request_version so that error responses are sent as HTTP 1.0. But there may be more fixes needed for this to continue the buggy HTTP 0.9 support; see bpo-26578.

    @vadmium
    Copy link
    Member

    vadmium commented Oct 28, 2016

    Parsing the version from words[-1] rather than words[2] may be a minor improvement however.

    @warsaw
    Copy link
    Member Author

    warsaw commented Oct 28, 2016

    On Oct 28, 2016, at 10:42 PM, Martin Panter wrote:

    I think you should be able to reproduce this without Mailman or tox, by just
    running “python -m http.server”.

    Yep, indeed.

    The problem is the “HTTP/0.9” protocol that Python is assuming does not
    include a header section, so there is no place to put a 400 status code or
    header fields. The HTTP 0.9 response is supposed to only be a HTML body; see
    <https://www.w3.org/Protocols/HTTP/AsImplemented.html\> and
    <https://tools.ietf.org/html/rfc1945#section-6\>.

    I think we should drop HTTP 0.9 response support from Python’s HTTP server,
    as well as the attempted but buggy request support. But there was a bit of
    resistance; see bpo-10721.

    Another possibility would be to change default_request_version so that error
    responses are sent as HTTP 1.0. But there may be more fixes needed for this
    to continue the buggy HTTP 0.9 support; see bpo-26578.

    I think it may indeed make sense to set default_request_version to 1.0. It's
    probably too late to do that for 3.6, but I do think it makes sense for 3.7.
    I'm nosying on the issues you mentioned.

    I've proposed essentially that workaround for Mailman.

    https://gitlab.com/mailman/mailman/issues/288

    We can get away with requiring at least HTTP/1.1 for our REST clients.

    @warsaw
    Copy link
    Member Author

    warsaw commented Oct 28, 2016

    On Oct 28, 2016, at 10:57 PM, Martin Panter wrote:

    Parsing the version from words[-1] rather than words[2] may be a minor
    improvement however.

    Indeed; I thought about that too.

    @vadmium
    Copy link
    Member

    vadmium commented Oct 29, 2016

    Here is a patch to parse the version from the request in more cases. Since it is more of a cosmetic improvement for handling erroneous requests, I would probably only apply it to the next Python version (3.7 atm). Or do you think it should go into bug fix versions as well?

    @warsaw
    Copy link
    Member Author

    warsaw commented Oct 29, 2016

    On Oct 29, 2016, at 02:20 AM, Martin Panter wrote:

    Here is a patch to parse the version from the request in more cases. Since it
    is more of a cosmetic improvement for handling erroneous requests, I would
    probably only apply it to the next Python version (3.7 atm). Or do you think
    it should go into bug fix versions as well?

    That would be a RM decision.

    @vadmium vadmium added the 3.7 only security fixes label Nov 1, 2016
    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Nov 19, 2016

    New changeset 7c98768368cb by Martin Panter in branch 'default':
    Issue bpo-28548: Parse HTTP request version even if too many words received
    https://hg.python.org/cpython/rev/7c98768368cb

    @vadmium
    Copy link
    Member

    vadmium commented Nov 19, 2016

    Okay committed to 3.7 for the moment. I think that is all we can reasonably do unless we drop the pseudo-HTTP-0.9 support.

    @vadmium vadmium closed this as completed Nov 19, 2016
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 only security fixes stdlib Python modules in the Lib dir
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants