Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The pseudo-header :status is included in h2 fetch. #2415

Closed
tsctx opened this issue Nov 9, 2023 · 19 comments · Fixed by #2422
Closed

The pseudo-header :status is included in h2 fetch. #2415

tsctx opened this issue Nov 9, 2023 · 19 comments · Fixed by #2422
Labels
bug Something isn't working

Comments

@tsctx
Copy link
Member

tsctx commented Nov 9, 2023

Bug Description

The pseudo-header :status can be obtained by h2 fetch.
A fix for this problem is needed in #2410.

Reproducible By

There is a simple test.

// test.js
(async () => {
  const http2 = require("node:http2");

  const undici = require("undici");

  const serverOptions = {
    key:
      "-----BEGIN RSA PRIVATE KEY-----\r\n" +
      "MIICXAIBAAKBgQCqK4GJEFelA5sWipPsBgONPuwxAL3io9GexKlTv6YPsXkfbgkP\r\n" +
      "a8P6zbzqt4eOVjFtDyqrMkJDmc4gRqGNk3+VzOPqaTLcrvOgDAFfY9eF+Z4KmVut\r\n" +
      "JYTYE/3antTOtcP5yAfVVxkpErvvNhE+0+Vtro6FpfGu93iWGsALtM6p5QIDAQAB\r\n" +
      "AoGAfjyB/pTR/8baAPHSZA/eCKT6ACwlsQCH23jWiBsu7ItiaA9vxzHRk7eWAqUd\r\n" +
      "eVi1l8rdg1xoTc/GWqkq5WTbCqbfBE23UQF65wXcBg+sB7aPcpK48PsnIKM3x0ty\r\n" +
      "Xe33hkSQUn9WzH6TTN0b9pFOHb66bHpmY0bM+3g6xxA4ByECQQDTZuw3opoKgt8t\r\n" +
      "tVNnOeOCgcqwIsZZAv97WhKLPJw1iVj5zSJA+fOnskGHevHBByhTPG2Fxjosv4kB\r\n" +
      "wbFAatktAkEAzhHIjyLWG55LF8V3wqlb9IQj4ufy9Oz+GgUSfU83+w2ZsPRW8ywY\r\n" +
      "Nn40dZe4H3cgjPg2vF2Iw3R0QtafWw0WmQJBAJDfR6IY3Fe3uBoq2S00jqUxLSXn\r\n" +
      "ruXxGAQAhhTFRIeryD3oRkEhCckyb5XYuzP9I7VnA+dZISTpXYrGoiHzF0ECQFGG\r\n" +
      "p5o5Zgu3LG5TKXZSGpc6UBhNO8m6mS0iJJZ+6N7iGpPvlxbNsyab95UuWSHZt38O\r\n" +
      "FFmhh6IMSJLI5nLXUsECQBjEDhjqK533gETR+1pF6qIoKxCYWJHe2s7lRUNnEgty\r\n" +
      "CRlOuuMe4yigwWVHjM4914FujY81uqrsmf36H17hWxo=\r\n" +
      "-----END RSA PRIVATE KEY-----\r\n",
    cert:
      "-----BEGIN CERTIFICATE-----\r\n" +
      "MIIClTCCAf6gAwIBAgIJV+VFRrTLKIw6MA0GCSqGSIb3DQEBBQUAMGkxFDASBgNV\r\n" +
      "BAMTC2V4YW1wbGUub3JnMQswCQYDVQQGEwJVUzERMA8GA1UECBMIVmlyZ2luaWEx\r\n" +
      "EzARBgNVBAcTCkJsYWNrc2J1cmcxDTALBgNVBAoTBFRlc3QxDTALBgNVBAsTBFRl\r\n" +
      "c3QwHhcNMjMxMTA5MDgwOTI2WhcNMjQxMTA4MDgwOTI2WjBpMRQwEgYDVQQDEwtl\r\n" +
      "eGFtcGxlLm9yZzELMAkGA1UEBhMCVVMxETAPBgNVBAgTCFZpcmdpbmlhMRMwEQYD\r\n" +
      "VQQHEwpCbGFja3NidXJnMQ0wCwYDVQQKEwRUZXN0MQ0wCwYDVQQLEwRUZXN0MIGf\r\n" +
      "MA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCqK4GJEFelA5sWipPsBgONPuwxAL3i\r\n" +
      "o9GexKlTv6YPsXkfbgkPa8P6zbzqt4eOVjFtDyqrMkJDmc4gRqGNk3+VzOPqaTLc\r\n" +
      "rvOgDAFfY9eF+Z4KmVutJYTYE/3antTOtcP5yAfVVxkpErvvNhE+0+Vtro6FpfGu\r\n" +
      "93iWGsALtM6p5QIDAQABo0UwQzAMBgNVHRMEBTADAQH/MAsGA1UdDwQEAwIC9DAm\r\n" +
      "BgNVHREEHzAdhhtodHRwOi8vZXhhbXBsZS5vcmcvd2ViaWQjbWUwDQYJKoZIhvcN\r\n" +
      "AQEFBQADgYEAWxD2Q8JRJQC+O7tMSoU5M2Ko5VlCIpgWfqQxrdJ1581izZLtie8Y\r\n" +
      "ccYVYYqoXNnIiE7K/m45UAgN7FSeWGmoCD5GHd8wzOOlmWT0W7+cCufUbfnUsa4g\r\n" +
      "KE7vqFuJjXxtC9iU6sQcYPXk6zVQpCpkUl1Ib5Xsk8Qevk3xvfJR/Vo=\r\n" +
      "-----END CERTIFICATE-----\r\n",
  };

  const server = http2.createSecureServer(serverOptions);

  server.on("stream", (stream, headers) => {
    stream.respond({
      ":status": 200,
    });
    stream.end("test");
  });

  // waiting start server
  await /** @type {Promise<void>} */ (
    new Promise((resolve) => server.listen(3000, "0.0.0.0", () => resolve()))
  );

  const client = new undici.Client(
    `https://localhost:${server.address().port}`,
    {
      connect: {
        rejectUnauthorized: false,
      },
      allowH2: true,
    }
  );

  const response = await undici.fetch(
    `https://localhost:${server.address().port}/`,
    {
      method: "GET",
      dispatcher: client,
    }
  );

  server.close();

  await client.close();

  /**
   * @param {() => any} fn
   * @returns {any}
   */
  function call(fn) {
    try {
      return fn();
    } catch (e) {
      return e;
    }
  }

  /**
   * @param {boolean} cond
   * @param {string | Error} message
   */
  function assert(cond, message) {
    if (!cond) console.error(message);
  }

  assert(
    call(() => response.headers.get(":status")) !== "200",
    "Assertion: It should not be possible to get the pseudo-headers `:status`."
  );

  for (const key of response.headers.keys()) {
    assert(
      !key.startsWith(":"),
      "Assertion: The pseudo-headers `" +
        key +
        "` must not be included in `Headers#keys`."
    );
  }

  console.log([...response.headers]);
})();

Expected Behavior

The above test should succeed.

The pseudo-header :status is not passed to Headers.

Logs & Screenshots

> $ node test.js
(node:7544) [UNDICI-H2] Warning: H2 support is experimental, expect them to change at any time.
(Use `node --trace-warnings ...` to show where the warning was created)
Assertion: It should not be possible to get the pseudo-headers `:status`.
Assertion: The pseudo-headers `:status` must not be included in `Headers#keys`.
[ [ ':status', '200' ], [ 'date', 'Thu, 09 Nov 2023 09:10:56 GMT' ] ]

Environment

Windows 11
Node.js v21.1.0
Undici v5.27.2

Additional context

related to: #2409 #2410

@tsctx tsctx added the bug Something isn't working label Nov 9, 2023
@metcoder95
Copy link
Member

The RFC-7540 states that the pseudo headers are valid header field names, and should always be preserved in a request/response, especially as they belong to the response mandatory headers. Its absence should be considered a malformed response.

For fetch, the standard specifies:

A header name is a byte sequence that matches the field-name token production.

From RFC-9110:

New fields can be introduced without changing the protocol version if their defined semantics allow them to be safely ignored by recipients that do not recognize them; see Section 16.3.

As they are scoped within h2 version, being a different protocol version, the headers in fact are valid, meanwhile they are not in the list of forbidden headers.

Changing slightly what I said in the PR, I'm a little bit on the fence here but more into the fact of keeping them as it does not seems clearly to be violating the spec in any form, unless otherwise it does.

cc: @KhafraDev

@tsctx
Copy link
Member Author

tsctx commented Nov 9, 2023

I meant to reply a little sooner but sorry for the delay.
I never got around to the idea of allowing pseudo-headers. Thanks for the suggestion.
Added implementations that allow pseudo-headers and implementations that do not allow pseudo-headers.
#2417 (allow) #2416 (not allow)

@KhafraDev
Copy link
Member

The RFC (and as such, by the fetch spec) is adamant that colons are not allowed in header names:

https://fetch.spec.whatwg.org/#header-name (which links to https://httpwg.org/specs/rfc9110.html#fields.names)

Tokens are short textual identifiers that do not include whitespace or delimiters.

---

Many HTTP field values are defined using common syntax components, separated by whitespace or specific delimiting
characters. Delimiters are chosen from the set of US-ASCII visual characters not allowed in a token
(DQUOTE and "(),/:;<=>?@[\]{}").

How the fetch spec intends for implementations to deal with these pseudo headers, I couldn't tell you.

@KhafraDev
Copy link
Member

@metcoder95 I'd recommend either making an/searching for an issue in the whatwg repo for fetch, or even going on their matrix channel.

@metcoder95
Copy link
Member

metcoder95 commented Nov 10, 2023

The RFC (and as such, by the fetch spec) is adamant that colons are not allowed in header names:

Yeah, it firmly points to that RFC, which it's of course valid. However, it somehow feels counterintuitive with what the H2 version says, which seems to mark those headers as mandatory part of the response, otherwise should be considered malformed. On the request side its crystal clear, on the response is when the doubts arises.

I'll follow up on this and see what I can find.
@tsctx we will need to wait a little bit longer to have a decision on how to mover forward

@KhafraDev
Copy link
Member

Neither of the PRs addressing this issue are correct.

  • The server uses http/2 and sends pseudo headers to the client.
  • The client parses them normally and appends them to the response's headers.
  • When the user tries to do response.headers.get(':status') the header name is validated, and because of what I mentioned above, throws an error.

This is already fixed with a1342a7.

@KhafraDev KhafraDev reopened this Nov 10, 2023
@KhafraDev
Copy link
Member

Not quite fixed; we are meant to append the headers to the HeaderList (which doesn't include validation), not to the Headers instance directly. The http parser should already handle the validation for us.

@tsctx
Copy link
Member Author

tsctx commented Nov 10, 2023

@KhafraDev

Neither of the PRs addressing this issue are correct.

  • The server uses http/2 and sends pseudo headers to the client.
  • The client parses them normally and appends them to the response's headers.
  • When the user tries to do response.headers.get(':status') the header name is validated, and because of what I mentioned above, throws an error.

This is already fixed with a1342a7.

I believe the following code will throw an error if the HeadersIterator contains :status.

// H2 fetch
const response = await fetch("https://localhost");
for (const key of response.headers.keys()) {
  // Error because key is sometimes `:status`.
  const value = response.headers.get(key)
}

@KhafraDev
Copy link
Member

KhafraDev commented Nov 10, 2023

@metcoder95 according to RFC 9113 "Pseudo-header fields are not HTTP header fields.". I don't think we should be receiving pseudo headers with normal headers, which is definitely causing this. My fix is partially correct, but this is no longer an issue with fetch afterwards.

While we could handle it in fetch, it wouldn't be correct. Fetch is an abstraction on the http specs, it shouldn't be handling this lower-leveled behavior/validation/whatever.

@metcoder95
Copy link
Member

RFC 9113 "Pseudo-header fields are not HTTP header fields.". I don't think we should be receiving pseudo headers with normal headers, which is definitely causing this. My fix is partially correct, but this is no longer an issue with fetch afterwards.

See what you mean, pseudo-headers are not common HTTP header fields and should not be generated by the endpoint beyond the ones specified in the spec.

Quote:

Pseudo-header fields are not HTTP header fields. Endpoints MUST NOT generate pseudo-header fields other than those defined in this document.

I agree, I believe it's not a fetch issue, removing it should be ok then. Fetch should just standardise that usage and handle it accordingly to the fetch spec. #2422 can close it I guess?

@KhafraDev
Copy link
Member

KhafraDev commented Nov 10, 2023

I think those 2 statements are separate - if pseudo-headers aren't header fields, undici is mishandling them (or at least the expectation that headers are actually headers is wrong?). Would it make sense to split them into their own property?

what I'm proposing:

import { Client, request } from 'undici'

const { headers, pseudoHeaders } = await request('https://h2', {
  client: new Client({ allowH2: true })
})

@tsctx
Copy link
Member Author

tsctx commented Nov 11, 2023

I think those 2 statements are separate - if pseudo-headers aren't header fields, undici is mishandling them (or at least the expectation that headers are actually headers is wrong?). Would it make sense to split them into their own property?

what I'm proposing:

import { Client, request } from 'undici'

const { headers, pseudoHeaders } = await request('https://h2', {
  client: new Client({ allowH2: true })
})

This is a good proposal. However, there is one thing that concerns me.
If this were to be introduced, the behavior would be different from Node Http2.

const http2 = require("http2");
const undici = require("undici");

// Node Http2
(async () => {
  // H2 Server URI
  const client = https.connect("https://localhost");

  const req = client.request({ ":method": "GET", ":path": "/"  });

  req.on("response", (headers, flags) => {
    console.log(headers[":status"]); // 200
    console.log(typeof headers[":status"]); // 'number'
  });
  req.on("end", () => { client.destroy(); });
})();

// Current
(async () => {
  const client = new undici.Client({ allowH2: true });
  
  // H2 Server URI
  const req = await undici.request("https://localhost", { client });

  console.log(req.headers[":status"]); // 200
  console.log(typeof req.headers[":status"]); // 'number'
})();

// Proposal
(async () => {
  const client = new undici.Client({ allowH2: true });
  
  // H2 Server URI
  const req = await undici.request("https://localhost", { client });

  console.log(req.headers[":status"]); // undefined
  console.log(typeof req.headers[":status"]); // 'undefined'
  console.log(typeof req.pseudoHeaders[":status"]); // 200
  console.log(typeof req.pseudoHeaders[":status"]); // 'number'
})();

So I don't think you need to change anything except fetch.

@KhafraDev
Copy link
Member

Being different from node's http/2 is fine, it's fundamentally not an issue in fetch, so there's nothing to fix in fetch. The spec states that pseudo headers are not header fields, so they shouldn't be passed as regular headers. IMO undici is doing something wrong if fake headers are being passed as real headers - to quote myself "the expectation that headers are actually headers is wrong".

@tsctx
Copy link
Member Author

tsctx commented Nov 11, 2023

@KhafraDev
Of course.

@metcoder95
Copy link
Member

I agree that diverging from node:http2 is okay, it should not be a blocker.

Continuing with the headers, you are correct in assuming that they are not header fields in the sense that no other fields should follow the syntax, as pseudo headers are proper to the version of the protocol. This means that if the caller, when making a request, attempts to add a new header in the form :myheader, this should be forbidden, as the spec does not allow it and the remote server can consider it as malformed.

The same happens on the server side, endpoints should not add other custom fields with the pseudo-header syntax, but it does not mean that they should not be exposed to the request handler (e.g. :path indicates the resource path the client is trying to access, instead of inline with the string method as it happens for http/1.1).
This goes similar to the response handling by the client; the client should have access to them as are part of the protocol, although they cannot create custom headers are these are not header fields.

As fetch, which you stated pretty well, is an abstraction on top of the http protocol semantics (independently of its version); and has its living spec, the behaviour I would expect if using it should be:

  1. fetch does not allow the caller to append headers with the pseudo-headers syntax
  2. fetch abstracts the response's pseudo-headers (if applicable)
    • response just has a single pseudo-header -- :status. It should be mapped to Response#status
    • As fetch is made on top of Dispatch, if h2, the pseudo-headers should be stripped if any (undici should not remove it, as even fetch is a client of undici, other implementations should preserve it, as are part of the h2 spec).

Happy to hear your thoughts @KhafraDev 🙂

@tsctx
Copy link
Member Author

tsctx commented Nov 12, 2023

Hi @KhafraDev @metcoder95,
Within a few days, we will have implemented the proposed features and sent you a PR.

@KhafraDev
Copy link
Member

they are not header fields in the sense that no other fields should follow the syntax, as pseudo headers are proper to the version of the protocol.

They aren't header fields because colons are forbidden from being in headers fields, I assume this is why they call these fake headers.

response just has a single pseudo-header -- :status. It should be mapped to Response#status

shouldn't undici be doing this already, at a lower level (and if not I'd love to get more info on why it's not)? Then no pseudo headers would need to be passed as headers while making the response's status available still, in a way where fetch doesn't need to be changed.

@metcoder95
Copy link
Member

metcoder95 commented Nov 13, 2023

They aren't header fields because colons are forbidden from being in headers fields, I assume this is why they call these fake headers.

I believe that's one of the reasons 😅
But I have the impression it was to scope the new primitives of the protocol in a form that cannot be altered by custom header fields; some sort of immutable field

shouldn't undici be doing this already, at a lower level (and if not I'd love to get more info on why it's not)? Then no pseudo headers would need to be passed as headers while making the response's status available still, in a way where fetch doesn't need to be changed.

We do, we just don't remove the header from the response's headers; as this can be of interest to some implementers.

When calling dispatch, the onHeaders method gives you the mapped statusCode and the headers at rest. But maybe...

As we do not have a rawHeaders, we can pass the original (the one with the :status) headers as rawHeaders and the sanitized headers as normal ones.
In that way, we respect functionality and split the protocol-scoped headers from the common header fields, wdyt?

@KhafraDev
Copy link
Member

SGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
3 participants