Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mongoose accepts requests containing multiple differing Content-Length headers. #2763

Closed
kenballus opened this issue May 23, 2024 · 7 comments

Comments

@kenballus
Copy link

To see this for yourself,

  1. Compile and run the following Mongoose HTTP server:
// Copyright (c) 2020 Cesanta Software Limited
// All rights reserved
// Heavily modified by Ben Kallus.

#include <stdio.h>
#include "mongoose.h"

static int const s_debug_level = MG_LL_INFO;
static char const s_listening_address[] = "http://0.0.0.0:80";

// Handle interrupts, like Ctrl-C
static int s_signo;
static void signal_handler(int signo) {
    s_signo = signo;
}

static size_t base64_encoded_len(size_t const plaintext_len) {
    // Base64-encoding takes 3n bytes to 4n bytes.
    // Then, there's padding, so we add on a correction term.
    // Note that we do not care about '\0' terminators here.
    return 4 * (plaintext_len / 3) + (plaintext_len % 3 != 0 ? 4 : 0);
}

#define INITIAL_BUF_SIZE (65535)

// *buf: A pointer to a heap-allocated buffer                      (might be `realloc`ed)
// *buf_size: The size of buf                                      (gets adjusted if *buf is `realloc`ed)
// *buf_size_remaining: The number of free bytes at the end of buf (gets adjusted if *buf is `realloc`ed)
// src: A buffer that we'll copy into buf
// src_size: The number of bytes to copy from src to buf.
void add_to_buffer(char **const buf_ptr, size_t *const buf_size_ptr, size_t *const buf_size_remaining_ptr, char const *const src, size_t const src_size) {
    char *buf = *buf_ptr;
    size_t buf_size = *buf_size_ptr;
    size_t buf_size_remaining = *buf_size_remaining_ptr;
    while (buf_size_remaining < src_size) {                              // Is there enough room?
        char *const new_buf = realloc(buf, buf_size + INITIAL_BUF_SIZE); // If not, make buf INITIAL_BUF_SIZE bigger.
        if (new_buf == NULL) {                                           // Did it work?
            puts("Out of memory!");
            exit(1);                                                     // If not, fail.
        }
        buf = new_buf;                                                   // Update buf and friends.
        buf_size += INITIAL_BUF_SIZE;
        buf_size_remaining += INITIAL_BUF_SIZE;
    }
    memcpy(buf + (buf_size - buf_size_remaining), src, src_size);        // Copy the data into the buffer
    buf_size_remaining -= src_size;

    *buf_ptr = buf;                                                      // Fill the in-out args
    *buf_size_ptr = buf_size;
    *buf_size_remaining_ptr = buf_size_remaining;
}

static char B64_SCRATCH_SPACE[INITIAL_BUF_SIZE];
static char STRCAT_SCRATCH_SPACE[INITIAL_BUF_SIZE];
static char const PAYLOAD_START[] =              "{\"headers\":[";
static char const PAYLOAD_FIRST_HEADER_START[] = "[\"";
static char const PAYLOAD_HEADER_START[] =       ",[\"";
static char const PAYLOAD_HEADER_MID[] =         "\",\"";
static char const PAYLOAD_HEADER_END[] =         "\"]";
static char const PAYLOAD_BODY_BEGIN[] =         "],\"body\":\"";
static char const PAYLOAD_URI_BEGIN[] =          "\",\"uri\":\"";
static char const PAYLOAD_METHOD_BEGIN[] =       "\",\"method\":\"";
static char const PAYLOAD_VERSION_BEGIN[] =      "\",\"version\":\"";
static char const PAYLOAD_END[] =                "\"}";
static void cb(struct mg_connection *c, int ev, void *ev_data) {
    if (ev != MG_EV_HTTP_MSG) {
        return;
    }
    struct mg_http_message parsed_request = {0};
    parsed_request = *(struct mg_http_message *)ev_data;
    // mg_http_parse((char *)c->recv.buf, c->recv.len, &parsed_request);

    char *buf = (char *)malloc(INITIAL_BUF_SIZE); // The buffer that contains the request.
    size_t buf_size = INITIAL_BUF_SIZE;           // Its size
    size_t buf_size_remaining = INITIAL_BUF_SIZE; // The amount of buf that's left
    add_to_buffer(&buf, &buf_size, &buf_size_remaining, PAYLOAD_START, sizeof(PAYLOAD_START) - 1);

    for (int i = 0; i < MG_MAX_HTTP_HEADERS; i++) {
        if (parsed_request.headers[i].name.buf == NULL && parsed_request.headers[i].value.buf == NULL) { // Empty header signifies end of headers?
            break;
        }
        struct mg_http_header const curr = parsed_request.headers[i];
        if (i == 0) {
            add_to_buffer(&buf, &buf_size, &buf_size_remaining, PAYLOAD_FIRST_HEADER_START, sizeof(PAYLOAD_FIRST_HEADER_START) - 1);
        } else {
            add_to_buffer(&buf, &buf_size, &buf_size_remaining, PAYLOAD_HEADER_START, sizeof(PAYLOAD_HEADER_START) - 1);
        }

        // Name
        if (base64_encoded_len(curr.name.len) + 1 > sizeof(B64_SCRATCH_SPACE)) { // + 1 because mg_base64_encode adds a null
            mg_http_reply(c, 400, "", "Header name too long");
            free(buf);
            return;
        }
        mg_base64_encode((unsigned char *)curr.name.buf, curr.name.len, B64_SCRATCH_SPACE, sizeof(B64_SCRATCH_SPACE));
        add_to_buffer(&buf, &buf_size, &buf_size_remaining, B64_SCRATCH_SPACE, base64_encoded_len(curr.name.len));
        add_to_buffer(&buf, &buf_size, &buf_size_remaining, PAYLOAD_HEADER_MID, sizeof(PAYLOAD_HEADER_MID) - 1);

        // Value
        if (base64_encoded_len(curr.value.len) + 1 > sizeof(B64_SCRATCH_SPACE)) { // + 1 because mg_base64_encode adds a null
            mg_http_reply(c, 400, "", "Header value too long");
            free(buf);
            return;
        }
        mg_base64_encode((unsigned char *)curr.value.buf, curr.value.len, B64_SCRATCH_SPACE, sizeof(B64_SCRATCH_SPACE));
        add_to_buffer(&buf, &buf_size, &buf_size_remaining, B64_SCRATCH_SPACE, base64_encoded_len(curr.value.len));
        add_to_buffer(&buf, &buf_size, &buf_size_remaining, PAYLOAD_HEADER_END, sizeof(PAYLOAD_HEADER_END) - 1);
    }

    // Body
    if (base64_encoded_len(parsed_request.body.len) + 1 > sizeof(B64_SCRATCH_SPACE)) {
        mg_http_reply(c, 400, "", "Request body too large");
        free(buf);
        return;
    }
    add_to_buffer(&buf, &buf_size, &buf_size_remaining, PAYLOAD_BODY_BEGIN, sizeof(PAYLOAD_BODY_BEGIN) - 1);
    mg_base64_encode((unsigned char *)parsed_request.body.buf, parsed_request.body.len, B64_SCRATCH_SPACE, sizeof(B64_SCRATCH_SPACE));
    add_to_buffer(&buf, &buf_size, &buf_size_remaining, B64_SCRATCH_SPACE, base64_encoded_len(parsed_request.body.len));

    // URI
    if (parsed_request.uri.len + 1 + parsed_request.query.len + 1 > sizeof(STRCAT_SCRATCH_SPACE)) {
        mg_http_reply(c, 400, "", "Request URI too long");
        free(buf);
        return;
    }
    size_t uri_len = 0;
    STRCAT_SCRATCH_SPACE[uri_len] = '\0';
    if (parsed_request.uri.len > 0) {
        memcpy(STRCAT_SCRATCH_SPACE, parsed_request.uri.buf, parsed_request.uri.len);
        uri_len += parsed_request.uri.len;
        STRCAT_SCRATCH_SPACE[uri_len] = '\0';
    }
    if (parsed_request.query.len > 0) {
        STRCAT_SCRATCH_SPACE[uri_len] = '?';
        uri_len++;
        memcpy(STRCAT_SCRATCH_SPACE + uri_len, parsed_request.query.buf, parsed_request.query.len);
        uri_len += parsed_request.query.len;
        STRCAT_SCRATCH_SPACE[uri_len] = '\0';
    }
    if (base64_encoded_len(uri_len) + 1 > sizeof(B64_SCRATCH_SPACE)) {
        mg_http_reply(c, 400, "", "Request URI too long");
        free(buf);
        return;
    }
    add_to_buffer(&buf, &buf_size, &buf_size_remaining, PAYLOAD_URI_BEGIN, sizeof(PAYLOAD_URI_BEGIN) - 1);
    mg_base64_encode((unsigned char *)STRCAT_SCRATCH_SPACE, uri_len, B64_SCRATCH_SPACE, sizeof(B64_SCRATCH_SPACE));
    add_to_buffer(&buf, &buf_size, &buf_size_remaining, B64_SCRATCH_SPACE, base64_encoded_len(uri_len));

    // Method
    if (base64_encoded_len(parsed_request.method.len) + 1 > sizeof(B64_SCRATCH_SPACE)) {
        mg_http_reply(c, 400, "", "Request method too long");
        free(buf);
        return;
    }
    add_to_buffer(&buf, &buf_size, &buf_size_remaining, PAYLOAD_METHOD_BEGIN, sizeof(PAYLOAD_METHOD_BEGIN) - 1);
    mg_base64_encode((unsigned char *)parsed_request.method.buf, parsed_request.method.len, B64_SCRATCH_SPACE, sizeof(B64_SCRATCH_SPACE));
    add_to_buffer(&buf, &buf_size, &buf_size_remaining, B64_SCRATCH_SPACE, base64_encoded_len(parsed_request.method.len));

    // Version
    if (base64_encoded_len(parsed_request.method.len) + 1 > sizeof(B64_SCRATCH_SPACE)) {
        mg_http_reply(c, 400, "", "Request method too long");
        free(buf);
        return;
    }
    add_to_buffer(&buf, &buf_size, &buf_size_remaining, PAYLOAD_VERSION_BEGIN, sizeof(PAYLOAD_VERSION_BEGIN) - 1);
    mg_base64_encode((unsigned char *)parsed_request.proto.buf, parsed_request.proto.len, B64_SCRATCH_SPACE, sizeof(B64_SCRATCH_SPACE));
    add_to_buffer(&buf, &buf_size, &buf_size_remaining, B64_SCRATCH_SPACE, base64_encoded_len(parsed_request.proto.len));

    add_to_buffer(&buf, &buf_size, &buf_size_remaining, PAYLOAD_END, sizeof(PAYLOAD_END));

    mg_http_reply(c, 200, "Content-Type: application/json\r\n", "%s", buf);
    free(buf);
}

int main(void) {
    // Initialise stuff
    signal(SIGINT, signal_handler);
    signal(SIGTERM, signal_handler);
    mg_log_set(s_debug_level);

    struct mg_mgr mgr;
    mg_mgr_init(&mgr);

    struct mg_connection *c = mg_http_listen(&mgr, s_listening_address, cb, NULL);
    if (c == NULL) {
        MG_ERROR(("Cannot listen on %s. Use http://ADDR:PORT or :PORT", s_listening_address));
        exit(EXIT_FAILURE);
    }

    // Start infinite event loop
    MG_INFO(("Mongoose version : v%s", MG_VERSION));
    MG_INFO(("Listening on     : %s", s_listening_address));
    while (s_signo == 0) {
        mg_mgr_poll(&mgr, 1000);
    }
    mg_mgr_free(&mgr);
    MG_INFO(("Exiting on signal %d", s_signo));
    return 0;
}
  1. Run the following to send it a request with two Content-Length headers, and observe that Mongoose does not reject the request, and prioritizes the first over the second:
printf 'POST / HTTP/1.1\r\nHost: whatever\r\nContent-Length: 34\r\nContent-Length: 0\r\n\r\nGET / HTTP/1.1\r\nHost: whatever\r\n\r\n' | nc localhost 80

That Mongoose prioritizes the first over the second is clear from the fact that the server responds only once. If the values of the two CL headers are swapped, then it will respond twice. This is unexpected because messages with multiple differeng Content-Length headers are disallowed by the RFCs because of request smuggling concerns.

Ideally, the fix here would be to reject messages that contain multiple Content-Length headers. This is what most HTTP servers do.

Environment

  • mongoose version: master (6bb40e6)
  • Compiler: GCC 13.2.0
  • Platform: AMD64 Linux
@cpq
Copy link
Member

cpq commented Jun 17, 2024

We won't fix this. If the client is broken, we can't help it.

@cpq cpq closed this as completed Jun 17, 2024
@kenballus
Copy link
Author

This isn't about broken clients; it's about broken gateways. If you accept requests with conflicting CL headers, then you don't know which one the gateway prioritizes. Therefore, the only correct thing to do is reject the message.

If Mongoose continues to instead prioritize the last one arbitrarily, then Mongoose is rendered vulnerable to request smuggling when the gateway prioritized the first one, but forwarded both.

I am aware of a few gateway servers that demonstrate this behavior.

@cpq
Copy link
Member

cpq commented Jun 17, 2024

I believe Mongoose always chooses the first content-length.
I really don't see why mongoose needs to add an extra checks in its code to fix broken gateways.

@kenballus
Copy link
Author

I believe Mongoose always chooses the first content-length.

My mistake; I had it flipped in my head. (Of course, the same concerns still apply)

I really don't see why mongoose needs to add an extra checks in its code to fix broken gateways.

This exact same logic applies in the other direction, though.

RFC 7230 says the following w.r.t this issue:

If a message is received that has multiple Content-Length header
fields with field-values consisting of the same decimal value, or a
single Content-Length header field with a field value containing a
list of identical decimal values (e.g., "Content-Length: 42, 42"),
indicating that duplicate Content-Length header fields have been
generated or combined by an upstream message processor, then the
recipient MUST either reject the message as invalid or replace the
duplicated field-values with a single valid Content-Length field
containing that decimal value prior to determining the message body
length or forwarding the message.

Thus, because Mongoose doesn't reject messages with multiple conflicting Content-Length headers, Mongoose is also "broken." The developers of a gateway that forwards such invalid messages could just as easily say "we don't need to add an extra check to account for broken origin servers."

Ultimately, both parties that are at fault, so both should be fixed. The RFCs are pretty clear that these messages are invalid and must be rejected by any recipient, gateway or origin.

@cpq
Copy link
Member

cpq commented Jun 18, 2024

@kenballus That makes sense, thank you - the only part I did not get is this:

Thus, because Mongoose doesn't reject messages with multiple conflicting Content-Length headers, Mongoose is also "broken." The developers of a gateway that forwards such invalid messages could just as easily say "we don't need to add an extra check to account for broken origin servers."

I don't see how a developers of a gateway may see messages from Mongoose with multiple Content-Length. We are talking about inbound messages. Those which Mongoose should reject. So, how exactly gateway development sticks here? If there is an incoming message with multiple headers, it is not Mongoose-generated.

Could you clarify please - maybe a simple case where Mongoose comes broken?

@kenballus
Copy link
Author

You're saying that you don't need to add a check to reject messages with multiple CL headers because developers of gateways should ensure that they don't send messages containing multiple CL headers.

The wrinkle in that argument is that developers of gateways can just as easily say that they don't need to add an extra check to ensure that outbound messages have <=1 CL value because it's the job of origin servers to reject all messages with multiple Content-Length values.

This is circular. Both parties need to ensure that

  1. such messages are both rejected, and
  2. such messages are never produced.

@kenballus
Copy link
Author

Could you clarify please - maybe a simple case where Mongoose comes broken?

I guess I'm not certain what you're asking here. As far as I'm concerned, "broken" is defined by the standards. Maybe we have different working definitions of "broken?"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants