Skip to content

Conversation

@jasonwashburn
Copy link

Changes headers to use a Python dictionary ( Python: Dict[str, str], Rust: HashMap<String, String>) rather than a list of tuples (Python: List[Tuple[str, str]], Rust: Vec<(String, String)>).

Addresses #25

Copy link
Collaborator

@dicej dicej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Thanks for doing this!

One thing just occurred to me: I believe it's possible to use the same header name multiple times in a request, e.g.

Foo: hello
Foo: world

...which is equivalent to Foo: hello, world (see https://stackoverflow.com/a/24502264). Not sure how common that is, nor whether Hyper (which is what Spin uses to handle HTTP) will automatically consolidate such cases for us. Have you happened to test that scenario to see what happens?

@jasonwashburn
Copy link
Author

Hmm... I haven't, but that's easy enough to test, let me see what it does currently. 🤔

@jasonwashburn
Copy link
Author

So it looks like the current behavior of the Python SDK today is to use the last value that was passed in when there are multiple header fields using the same field-name with different field-values. Behavior is the same for both Request and Response headers. See test script and test output below.

from spin_http import Response

def handle_request(request):
    print(request.headers)
    response = "Headers Received: \n" + "\n".join([str(header) for header in request.headers])
    return Response(200,
                    [("test-response-header", "first"), ("test-response-header", "second")],
                    bytes(f"{response}", "utf-8"))
curl -H "test-request-header: first" -H "test-request-header: second" -X GET http://localhost:3000 -v
Note: Unnecessary use of -X or --request, GET is already inferred.
*   Trying 127.0.0.1:3000...
* Connected to localhost (127.0.0.1) port 3000 (#0)
> GET / HTTP/1.1
> Host: localhost:3000
> User-Agent: curl/7.87.0
> Accept: */*
> test-request-header: first
> test-request-header: second
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< test-response-header: second
< content-length: 356
< date: Tue, 18 Apr 2023 00:40:03 GMT
<
Headers Received:
('host', 'localhost:3000')
('user-agent', 'curl/7.87.0')
('accept', '*/*')
('test-request-header', 'second')
('spin-path-info', '/')
('spin-full-url', 'http://localhost:3000/')
('spin-matched-route', '/...')
('spin-base-path', '/')
('spin-raw-component-route', '/...')
('spin-component-route', '')
* Connection #0 to host localhost left intact
('spin-client-addr', '127.0.0.1:50607')% 

@jasonwashburn
Copy link
Author

Not surprisingly, the Rust SDK behaves in the same way.

use anyhow::Result;
use spin_sdk::{
    http::{Request, Response},
    http_component,
};

/// A simple Spin HTTP component.
#[http_component]
fn handle_spin_rust_header_test(req: Request) -> Result<Response> {
    let mut response: String = String::from("Headers Received:\n");
    for (key, value) in req.headers() {
        response.push_str(&format!(
            "{}: {}\n",
            key.to_string(),
            value.to_str().unwrap()
        ));
    }
    Ok(http::Response::builder()
        .status(200)
        .header("test-response-header", "first")
        .header("test-response-header", "second")
        .body(Some(response.into()))?)
}
curl -H "test-request-header: first" -H "test-request-header: second" -X GET http://localhost:3000 -v
Note: Unnecessary use of -X or --request, GET is already inferred.
*   Trying 127.0.0.1:3000...
* Connected to localhost (127.0.0.1) port 3000 (#0)
> GET / HTTP/1.1
> Host: localhost:3000
> User-Agent: curl/7.87.0
> Accept: */*
> test-request-header: first
> test-request-header: second
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< test-response-header: second
< content-length: 290
< date: Tue, 18 Apr 2023 01:05:47 GMT
<
Headers Received:
host: localhost:3000
user-agent: curl/7.87.0
accept: */*
test-request-header: second
spin-path-info: /
spin-full-url: http://localhost:3000/
spin-matched-route: /...
spin-base-path: /
spin-raw-component-route: /...
spin-component-route:
spin-client-addr: 127.0.0.1:56824
* Connection #0 to host localhost left intact

@jasonwashburn
Copy link
Author

Following definitions back from the HeaderMap struct, it does look like there's consideration for duplicate field-values with the same field-name in the http crate as they get stored in HeaderMap.extra_values See src/header/map.rs#L66

From reading the docs, it seems like you should be able to access multiple values using the get_all() method in HeaderMap Docs, though when trying to use it in the Rust SDK I just get a single value (the last one received).

@itowlson
Copy link
Contributor

We use HeaderMap::iter(), which ought to yield multiple header values (https://docs.rs/http/latest/http/header/struct.HeaderMap.html#method.iter), and push those onto the Headers vector without modification or deduplication (https://github.com/fermyon/spin/blob/3d20ffdab31bac660bb0e3b2387f27641e9f6221/crates/trigger-http/src/spin.rs#L147). Odd that they're not coming through.

@itowlson
Copy link
Contributor

Hmm, the Rust SDK internally, when converting from the WIT representation to the http::Request representation (https://github.com/fermyon/spin/blob/3d20ffdab31bac660bb0e3b2387f27641e9f6221/sdk/rust/macro/src/lib.rs#L82), uses http::header::HeaderMap::insert() (https://docs.rs/http/latest/http/header/struct.HeaderMap.html#method.insert), which is documented as "If the map did have this key present, the new value is associated with the key and all previous values are removed." So I think this might be a bug in the Rust SDK. @fibonacci1729 does that sound plausible?

Copy link
Collaborator

@dicej dicej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for digging into the "repeated header names" question, @jasonwashburn and @itowlson. Since it looks like we've got an issue in the Rust SDK on which this SDK is based, I'm fine with merging this now and, if necessary, addressing repeated header names later once things are sorted out on the Rust side.

Copy link

@lann lann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is completely valid to send the same header multiple times. It isn't super common in requests but responses often use multiple set-cookie fields.

The easiest fix here is probably to combine duplicate headers with ", ", as the spec suggests: https://httpwg.org/http-core/draft-ietf-httpbis-semantics-latest.html#fields.order

A recipient MAY combine multiple field lines within a field section that have the same field name into one field line, without changing the semantics of the message, by appending each subsequent field line value to the initial field line value in order, separated by a comma (",") and optional whitespace (OWS, defined in Section 5.6.3). For consistency, use comma SP.

@dicej
Copy link
Collaborator

dicej commented Apr 18, 2023

@lann Based on the discussion above, it does not seem possible to address the repeated header name issue in this SDK until it is first addressed in the Rust SDK. Or are you thinking we should make a change here and now, anticipating that the issue on the Rust side will soon be fixed as well?

@lann
Copy link

lann commented Apr 18, 2023

Ah sorry I (clearly) didn't read the thread closely enough.

are you thinking we should make a change here and now, anticipating that the issue on the Rust side will soon be fixed as well?

That would make sense to me. There is already a PR to fix the Rust SDK: spinframework/spin#1407

@jasonwashburn
Copy link
Author

Nice, I'll take a look at the PR and start making the changes necessary here to work with it.

@jasonwashburn
Copy link
Author

Alright, I've changed it to work with changes made in spinframework/spin#1407. It will now consolidate headers with the same name by appending subsequent values separated by ", ".

Example below:

Python component used for testing:

from spin_http import Response


def handle_request(request):
    response_content = "Headers Received:\n" + \
        "\n".join(str(header) for header in request.headers.items())
    return Response(200,
                    {"content-type": "text/plain"},
                    bytes(f"{response_content}", "utf-8"))

Testing with curl: (Note: learned the hard way that some test clients mutate your headers into this format before they send the request. Curl does not.)

curl -H "key1: value1" -H "key2: value2" -H "key1: value3" -H "key2: value4" -X GET 
curl -H "key1: value1" -H "key2: value2" -H "key1: value3" -H "key2: value4" -X GET http://localhost:3000 -v
Note: Unnecessary use of -X or --request, GET is already inferred.
*   Trying 127.0.0.1:3000...
* Connected to localhost (127.0.0.1) port 3000 (#0)
> GET / HTTP/1.1
> Host: localhost:3000
> User-Agent: curl/7.87.0
> Accept: */*
> key1: value1
> key2: value2
> key1: value3
> key2: value4
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< content-type: text/plain
< content-length: 375
< date: Tue, 25 Apr 2023 11:00:17 GMT
<
Headers Received:
('spin-base-path', '/')
('accept', '*/*')
('user-agent', 'curl/7.87.0')
('host', 'localhost:3000')
('key1', 'value1, value3')
('spin-raw-component-route', '/...')
('spin-client-addr', '127.0.0.1:58046')
('spin-matched-route', '/...')
('key2', 'value2, value4')
('spin-full-url', 'http://localhost:3000/')
('spin-path-info', '/')
* Connection #0 to host localhost left intact
('spin-component-route', '')%      

@jasonwashburn jasonwashburn requested a review from lann April 25, 2023 11:17
@lann lann requested a review from dicej April 25, 2023 13:29
Copy link

@lann lann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm but I'd like @dicej to review too

@lann
Copy link

lann commented Apr 25, 2023

@jasonwashburn Thanks for the PR! The Spin project requires sign-off on all commits which is just an assertion that you are the author/copyright holder of the code.

@jasonwashburn
Copy link
Author

Turns out unit testing with pyO3 isn't particularly straightforward, less so when you throw in cPython for WASM 😬...however, following in the footsteps of what exists currently I've added an endpoint/test for the test in /test that's used in CI. Should be pushing that change shortly.

Copy link
Collaborator

@dicej dicej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for sticking with this, @jasonwashburn!

Looks like we just need signed commits to make CI happy. Could you do something like git rebase --exec 'git commit --amend --no-edit -n -S' -i origin/master to sign everything?

@lann
Copy link

lann commented Apr 26, 2023

The DCO check is about sign off (-s), not signing (-S). I'm not sure whether this repo is enforcing signing.

Signed-off-by: Jason Washburn <wburn@wburn.net>
Signed-off-by: Jason Washburn <wburn@wburn.net>
Signed-off-by: Jason Washburn <wburn@wburn.net>
Signed-off-by: Jason Washburn <wburn@wburn.net>
Signed-off-by: Jason Washburn <wburn@wburn.net>
Signed-off-by: Jason Washburn <wburn@wburn.net>
Signed-off-by: Jason Washburn <wburn@wburn.net>
@jasonwashburn
Copy link
Author

jasonwashburn commented Apr 26, 2023

Looks like its enforcing both the sign off commit message and actual signing, but both are complete now. Added to the test that's running in github actions, though I had to bump the spin version being used to canary from 0.9.0 to get the changes from the fix to the main sdk. If that's going to be an issue I can either back the test/version change out, or we could wait for the next stable spin release to merge it in? Otherwise, I think it should be good to go.

@lann
Copy link

lann commented Apr 26, 2023

I think it is reasonable for PRs against main to be tested against spin main (tracked by canary). It might make sense for the release build to run tests against the minimum spin release it targets.

@dicej dicej merged commit 09d5447 into spinframework:main Apr 26, 2023
@jasonwashburn jasonwashburn deleted the headers-to-dict branch April 26, 2023 13:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants