Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parsing a Key - odd failure case #1075

Closed
mnot opened this issue Feb 26, 2020 · 3 comments
Closed

Parsing a Key - odd failure case #1075

mnot opened this issue Feb 26, 2020 · 3 comments

Comments

@mnot
Copy link
Member

mnot commented Feb 26, 2020

Parsing a Key has a bit of an odd behaviour when it encounters a character (after the first) that isn't allowed; it assumed that it's the end of the key, and returns the value so far.

3. While input_string is not empty:
   1. If the first character of input_string is not one of lcalpha, DIGIT, "\_", "-", ".", or "\*", return output_string.

This means that a malformed key will be truncated, and the remaining data will be fed back into the parser.

In the case of Dictionary member parsing, this goes to step 2 of:

   1. Let this_key be the result of running Parsing a Key ({{parse-key}}) with input_string.
   2. If the first character of input_string is "=":
       1. Consume the first character of input_string.
       2. Let member be the result of running Parsing an Item or Inner List ({{parse-item-or-list}}) with input_string.
   3. Otherwise:
      1. Let value be Boolean true.
      2. Let parameters be an empty, ordered map.
      3. Let member be the tuple (value, parameters).

... which means that a malformed key will have the a default value of Boolean True, and then the rest of the key will hit:

   7. Consume the first character of input_string; if it is not ",", fail parsing.

If the key is in Parameters, it treats a malformed key as the end of the parameters, because of this:

2. While input_string is not empty:
   1. If the first character of input_string is not ";", exit the loop.
   2. Consume a ";" character from the beginning of input_string.
   3. Discard any leading SP characters from input_string.
   4. let param_name be the result of running Parsing a Key ({{parse-key}}) with input_string.
   5. Let param_value be Boolean true.
   6. If the first character of input_string is "=":
      1. Consume the "=" character at the beginning of input_string.
      2. Let param_value be the result of running Parsing a Bare Item ({{parse-bare-item}}) with input_string.
   7. Append key param_name with value param_value to parameters. If parameters already contains a name param_name (comparing character-for-character), overwrite its value.

... which puts things in a really weird state, because what happens next depends upon the inputs.

I think the simple thing to do here is to change Parsing a Key to hard fail if the next character isn't =. Something like:

3. While input_string is not empty:
   1. If the first character of input_string is "=", return output_string.
   2. Let char be the result of removing the first character of input_string.
   3. If char is not one of lcalpha, DIGIT, "\_", "-", ".", or "\*", fail parsing.
   4. Otherwise, append char to output_string.

Thoughts? I mean, it works now, but the error you get back is odd, and it could potentially get things into weird states.

@mnot
Copy link
Member Author

mnot commented Feb 26, 2020

Ping @clelland @phluid61 for thoughts.

@phluid61
Copy link
Collaborator

Do you have an example invalid header value that really makes it confusing? Everything I could come up with wasn't that far off the mark, as far as errors from a streaming parser are concerned. Even appending a ; to the key, which is insane, threw errors at the right point ("x; y;=bad" threw parse_key: first character not lcalpha "=" at the start of =bad).

And regarding weird states; an error is an error, isn't it? We're not back-tracking.

The only really weird state I could contrive required putting a ; in the middle of a parameter key, which breaks undetectably. (1; x;y="x;y" means 1; x=?1; y="x;y")

I think the simple thing to do here is to change Parsing a Key to hard fail if the next character isn't =.

Can't do that because of the implicit true, no?

@mnot
Copy link
Member Author

mnot commented Feb 26, 2020

It came up with parsing some Alt-Svc headers; e.g., with parameters like h3-Q43=....

I realised this is the way it is because implicit true while I was eating dinner -- sorry for the noise. Closing.

@mnot mnot closed this as completed Feb 26, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

2 participants