Skip to content

Conversation

@ilevyor
Copy link
Contributor

@ilevyor ilevyor commented Sep 7, 2021

Issue #, if available:

Description of changes:

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@ilevyor ilevyor requested review from TingDaoK and graebm September 7, 2021 16:14
@ilevyor ilevyor requested a review from graebm September 13, 2021 20:05
Comment on lines 83 to 91
struct aws_h1_trailer *aws_h1_trailer_new(
struct aws_allocator *allocator,
const struct aws_http_headers *trailing_headers);

/* Just destroy the chunk (don't fire callback) */
void aws_h1_chunk_destroy(struct aws_h1_chunk *chunk);

void aws_h1_trailer_destroy(struct aws_h1_trailer *trailer);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utterly trivial: put the 2 aws_h1_trailer functions next to each other.

it's nice to group functions for the same "class" together

* Encoder completes/frees/pops front chunk when it's done sending. */
struct aws_linked_list pending_chunk_list;

struct aws_h1_encoder_message *message;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move the ACTUAL struct aws_h1_encoder_message encoder_message; into thread_data

it's really weird having a pointer, that just points to another member in the same class

Comment on lines 636 to 640
* fields necessary for message framing (e.g., Transfer-Encoding and Content-Length),
* routing (e.g., Host), request modifiers (e.g., controls and conditionals in Section 5 of [RFC7231]),
* authentication (e.g., see [RFC7235] and [RFC6265]), response control data (e.g., see Section 7.1 of [RFC7231]),
* or determining how to process the payload (e.g., Content-Encoding, Content-Type, Content-Range, and Trailer)
* must not be used.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trivial: This is a verrry run on sentence, where you've buried the important part "must not be used" at the very end. Keep it simple. Suggestion:

Certain headers are forbidden in the trailer (e.g., Transfer-Encoding, Content-Length, Host). See RFC-7541 Section 4.1.2 for more details.

struct aws_http_connection *aws_http_stream_get_connection(const struct aws_http_stream *stream);

/* Only valid in "request" streams, once response headers start arriving */
/* Only valid in "request" streams, once response headers start arrivi ng */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revert this accidental change

* For client streams, activate() must be called before any chunks are submitted.
*
* // Is this correct?
* // For server streams, the response must be submitted before any chunks.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more or less. I'd suggest:

For server streams, the response must be submitted before the trailer can be added


void aws_h1_trailer_destroy(struct aws_h1_trailer *trailer) {
/* should this check be here? it makes it so that I need to check before calling this function */
AWS_PRECONDITION(trailer);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for destroy() and release() calls, we usually add a check like:

if (trailer == NULL) {
    return;
}

so that they can be called blindly without checking if the ptr is null or not

so like, if destroy() worked that way, you could alter trailer_new() to do all its error handling like:

    if (step 1 goes wrong) {
        goto error;
    }

    if (step 2 goes wrong) {
        goto error;
    }

    if (step 3 goes wrong) {
        goto error;
    }

    ...
    /* success! */
    return trailer;
    
error:
    aws_h1_trailer_destroy(trailer);
    return NULL;
}

AWS_ASSERT(wrote_all);
}

static int s_headers_size(const struct aws_http_headers *headers, size_t *out_size) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have nit-picky feelings about how we can share this code for trailers and traditional headers but it would take too long to type out, let's just talk tomorrow

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can think of ways to share the code, but it involves either scanning it twice, or doing something weird like "visitor pattern" with callbacks.

maybe a big copy/paste is fine. But at least please put the 2 functions 1 after the other in the code. And please give this a clearer name like s_scan_outgoing_trailer()

}
if (!done) {
/* Remain in this state until done writing out CRLF */
/* Remain in this state until we're done writing out body */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trivial: done writing out body -> done writing out trailer

const struct aws_http_header trailer = {
.name = AWS_BYTE_CUR_INIT_FROM_STRING_LITERAL("chunked"),
.value = AWS_BYTE_CUR_INIT_FROM_STRING_LITERAL("trailer"),
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trivial: add another header, since there's only 1 test may as well stress what you can

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, since this was actually broken for a while and the test didn't catch it. I caught it when running through the debugger. This debugger thing is pretty useful.


struct aws_h1_trailer *trailer = aws_mem_calloc(allocator, 1, sizeof(struct aws_h1_trailer));
trailer->allocator = allocator;
if (!trailer) {
Copy link
Contributor

@TingDaoK TingDaoK Sep 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check null before assigning. Otherwise, it will crash.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually. our policy on allocations has changed
you don't need to check the results of aws_mem_calloc() or aws_mem_acquire() anymore. They cannot return NULL anymore. The program will fail if they can't get memory.

So you can remove this error-checking and keep the program simpler

@ilevyor ilevyor requested review from TingDaoK and graebm October 6, 2021 16:58

struct aws_h1_trailer *trailer = aws_mem_calloc(allocator, 1, sizeof(struct aws_h1_trailer));
trailer->allocator = allocator;
if (!trailer) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually. our policy on allocations has changed
you don't need to check the results of aws_mem_calloc() or aws_mem_acquire() anymore. They cannot return NULL anymore. The program will fail if they can't get memory.

So you can remove this error-checking and keep the program simpler

}
trailer->allocator = allocator;

if (aws_byte_buf_init(&trailer->trailer_data, allocator, trailer_size)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

likewise, this can't fail anymore. feel free to comment it like:

aws_byte_buf_init(&trailer->trailer_data, allocator, trailer_size); /* cannot fail */

/* @graebm should I be more discerning about which headers I ban? I figure any header in the enum is used in
* some way for message framing which is banned as per the rfc */
enum aws_http_header_name name_enum = aws_http_str_to_header_name(header.name);
if (name_enum != AWS_HTTP_HEADER_UNKNOWN) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UNKNOWN doesn't mean it's illegal, it just means that it's not one of the few special names we have in this enum

/**
 * Headers that affect internal processing.
 * This is NOT a definitive list of headers.
 */
enum aws_http_header_name {
    AWS_HTTP_HEADER_UNKNOWN, /* Unrecognized value */

}

/* @graebm should I be more discerning about which headers I ban? I figure any header in the enum is used in
* some way for message framing which is banned as per the rfc */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, absolutely, do some homework and ban everything that the RFC says MUST NOT about.

Especially ban "Connection-Length" and "Transfer-Encoding" which can be used for request smuggling

AWS_ASSERT(wrote_all);
}

static int s_headers_size(const struct aws_http_headers *headers, size_t *out_size) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can think of ways to share the code, but it involves either scanning it twice, or doing something weird like "visitor pattern" with callbacks.

maybe a big copy/paste is fine. But at least please put the 2 functions 1 after the other in the code. And please give this a clearer name like s_scan_outgoing_trailer()

void aws_h1_trailer_destroy(struct aws_h1_trailer *trailer) {
/* should this check be here? it makes it so that I need to check before calling this function */
if (trailer == NULL) {
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

YES it is great to do null-checks on destroy() functions.

Imagine a thing class with 3 members inside it: a, b, c and any of those 3 members could fail during creation. If destroy() functions didn't check NULL, then error-handling in the thing_new() function looks like this (a bit annoying to do so many null-checks, and a little error-prone):

error:
    if (thing->a) {
        a_destroy(a);
    }
    if (thing->b) {
        b_destroy(b);
    }
    if (thing->c) {
        c_destroy(c);
    }
    aws_mem_release(alloc, thing)
    return NULL;
}

or you avoid the if-checks by having distinct goto labels like, and tear things down in reverse-init order which is (VERY FRAGILE AND EASY TO SCREW UP):

c_failed:
    b_destroy(thing->b);
b_failed:
    a_destroy(thing->a);
a_failed:
    aws_mem_release(alloc, thing)
    return NULL;
}

but if we know that destroy() functions check for null, then we can just be simple and do like:

error:
    a_destroy(thing->a);
    b_destroy(thing->b);
    c_destroy(thing->c);
    aws_mem_release(alloc, thing);
    return NULL;
}

or even better, just have the thing call its own destructor:

error:
    thing_destroy(thing);
    return NULL;
}

and since I'm documenting every possible error handling strategy, here's the very worst: If, at any step things go wrong, try to clean up each previous step:

struct thing* thing_new(struct allocator *alloc, ...);
    Thing* thing = aws_mem_calloc(...);
    thing->alloc = alloc;

    A* thing->a = a_new(...);
    if (thing->a == NULL) {
        aws_mem_release(alloc, thing);
        return NULL;
    }

    B* thing->b = b_new(...);
    if (thing->b == NULL) {
        a_destroy(thing->a);
        aws_mem_release(alloc, thing);
        return NULL;
    }

    C* thing->c = c_new(...);
    if (thing->c == NULL) {
        a_destroy(thing->a);
        b_destroy(thing->b);
        aws_mem_release(alloc, thing);
        return NULL;
    }
    
    return thing;
}

which is UNGODLY FRAGILE and hard to get right
and that, boys and girls, is why we use goto in C

Copy link

@sdavtaker sdavtaker Oct 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you check for the nulls, just having a destroy_thing function wouldn't do the same as

error:
    thing_destroy(thing);
    return NULL;
}

But allow you to return to the place you called for the destroy? Not looking in this particular code, just in general.
Use case would be to emit log, or put an assert or some other thing after the destroy is executed.

"chunked: trailer"
"\r\n"
"another: test"
"\r\n";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please but the CRLF in the same line as the header? Also, shouldn't there be one more CRLF indicating that the trailer header has ended?

...
"0\r\n"
"chunked: trailer\r\n"
"another: test\r\n"
"\r\n" <-- MISSING ?!?

return AWS_OP_SUCCESS;
}

H1_CLIENT_TEST_CASE(h1_client_request_send_chunked_trailer) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make a helper function that just takes the input-headers, and the expected string, so we can easily write more trailer tests, and stress cases like:

  • forbidden header names
  • empty headers

@ilevyor ilevyor requested a review from graebm October 12, 2021 19:11
@ilevyor ilevyor merged commit 07cac78 into main Oct 25, 2021
@ilevyor ilevyor deleted the chunked_trailer branch October 25, 2021 21:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants