Should structured headers serialize data which cannot be parsed? #1055

Closed
clelland opened this issue Feb 7, 2020 · 6 comments

Comments

@clelland
Contributor

clelland commented Feb 7, 2020

With the issue of rounding coming up again in #1044, @phluid61 brought up that there's a bigger question around all of the serialization algorithms -- is their input specifically defined in terms of the structured data types? (With all of the range restrictions on those types?)

The example at hand is Decimals -- since the "Serialize a Decimal" algorithm states that its input is a decimal, should rounding even be a possible concern? According to 3.3.2, decimals can only have three fractional digits. Unless that algorithm actually accepts arbitrary real numbers, the issue of rounding, or of the integer component being too large, should never come up.
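To make the rounding and range question concrete, here is a minimal Python sketch of a decimal serializer that accepts a generic number rather than an already-valid Decimal. The three-digit fraction comes from 3.3.2 as cited above; the 12-digit integer limit, the round-half-to-even rule, and the name `serialize_decimal` are assumptions for illustration, not something this thread settles.

```python
from decimal import Decimal, ROUND_HALF_EVEN

# Assumed limit on the integer component (12 digits); illustrative only.
_LIMIT = Decimal(10) ** 12


def serialize_decimal(value) -> str:
    """Serialize a generic number as a decimal with at most 3 fractional digits."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float, Decimal)):
        raise ValueError("input is not a number")
    # Go through str() for floats so 1.1 becomes Decimal("1.1"), not its binary expansion.
    d = Decimal(str(value)) if isinstance(value, float) else Decimal(value)
    if not d.is_finite():
        raise ValueError("input is not finite")
    if abs(d) >= _LIMIT:
        raise ValueError("integer component too large")
    # Assumed rounding rule: nearest, ties to even, three fractional digits.
    rounded = d.quantize(Decimal("0.001"), rounding=ROUND_HALF_EVEN)
    # Rounding can carry into a 13th integer digit, so check again.
    if abs(rounded) >= _LIMIT:
        raise ValueError("integer component too large after rounding")
    int_part, frac_part = format(abs(rounded), "f").split(".")
    frac_part = frac_part.rstrip("0") or "0"  # keep at least one fractional digit
    sign = "-" if rounded < 0 else ""
    return f"{sign}{int_part}.{frac_part}"
```

If the input were guaranteed to be a valid Decimal already, both range checks and the `quantize` call would be dead code, which is exactly the ambiguity raised here.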

Similarly, after saying "Given a byte sequence as input_bytes, ...", the "Serializing a Byte Sequence" algorithm includes "If input_bytes is not a sequence of bytes, ...", which seems like an impossible situation.
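In a dynamically typed implementation the "impossible" check does have something to do, as this small Python sketch suggests. The function name and the colon delimiters around the base64 text are assumptions for illustration.

```python
import base64


def serialize_byte_sequence(input_bytes) -> str:
    """Serialize bytes as base64 between delimiters (illustrative sketch)."""
    # This guard only matters if callers can pass arbitrary native values.
    if not isinstance(input_bytes, (bytes, bytearray)):
        raise TypeError("input is not a sequence of bytes")
    encoded = base64.b64encode(bytes(input_bytes)).decode("ascii")
    return ":" + encoded + ":"
```

A caller passing a text string instead of bytes hits the guard; a caller that can only ever pass a Byte Sequence never does.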

Strings, tokens, and keys all do the same, guarding against situations which can't happen if the input is the correct structured data type (as opposed to a generic string in the implementation language).

I don't know whether the spec just needs to be more precise about when it is talking about Structured Data Strings, Tokens, Decimals, etc., or whether the serialization algorithms need to be defined as taking generic strings, numbers, and arrays as input. Maybe we need both, with a section on converting strings to Strings or Tokens, numbers to Decimals, etc., which can happen before or during serialization.
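One way to picture the "convert before serialization" option is a wrapper type that validates a generic string once, so the serializer can trust its input. This is a hypothetical Python sketch: the `Token` class, `serialize_token`, and the exact character set in the regular expression are assumptions, not text from the spec.

```python
import re
from dataclasses import dataclass

# Assumed token grammar: a letter or "*" followed by tchar, ":" or "/".
_TOKEN_RE = re.compile(r"[A-Za-z*][A-Za-z0-9!#$%&'*+\-.^_`|~:/]*")


@dataclass(frozen=True)
class Token:
    """A generic string that has been checked against the token profile."""
    value: str

    def __post_init__(self):
        # Conversion step: string -> Token, failing on anything out of profile.
        if not isinstance(self.value, str) or not _TOKEN_RE.fullmatch(self.value):
            raise ValueError(f"not a valid token: {self.value!r}")


def serialize_token(token: Token) -> str:
    # No guard needed here: a Token instance is valid by construction.
    return token.value
```

With this split, the guards live in the conversion step and the serialization algorithm itself can be stated purely in terms of the structured data type.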

(Or maybe I'm being too strict here, and the threshold for spec text should be 'understandable by a reasonable person'? :) )

@phluid61
Collaborator

phluid61 commented Feb 8, 2020

Some points to consider:

It was decided to not define the structured headers data model(s) as formal types. They are more like a profile applied to generic data types. The informality becomes more apparent when you realise that Integer, Byte Sequence, and Boolean are never actually defined.

The structured headers types are modelled to map natively to JavaScript types, modulo some profiling, not least because one half of the web (the client half) is largely restricted to working in JavaScript. It doesn't have to be stated as an explicit constraint in the spec, but it's been an underlying consideration.

If your serialisation function accepts a native JavaScript value, which is a reasonable assumption on today's web, it behooves us to provide you with a sanitisation/validation algorithm.

What remains to be decided is where and how that algorithm is provided. For now, it's implicitly included in the serialisation algorithm.

This was also discussed in structured-header-tests#32

@clelland
Contributor Author

That's really useful context, @phluid61 -- would a statement like that make sense in the introduction?

That makes the answer to the original question "Yes" :)

In that case, the serialization algorithms should probably be clear that they're not strictly limited to structured data types as input.

@mnot
Member

mnot commented Feb 11, 2020

Sounds reasonable to me; there needs to be some wiggle room here, I think. Perhaps this could be in the Implementation Notes appendix?

mnot added a commit that referenced this issue Feb 18, 2020
@mnot
Member

mnot commented Feb 18, 2020

PTAL - is that enough?

@mnot
Member

mnot commented Feb 19, 2020

As per Kari on-list, we should clarify serialisation of non-ASCII content in tokens, keys, and strings. While we constrain the characters allowed in each, we don't actually talk about character sets.
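For strings, the point can be sketched in Python as below. It assumes the allowed range is printable ASCII (0x20 to 0x7E) and that backslash and double quote are escaped with a backslash; the function name is hypothetical.

```python
def serialize_string(input_string) -> str:
    """Serialize a native string as a quoted string (illustrative sketch)."""
    if not isinstance(input_string, str):
        raise TypeError("input is not a string")
    for ch in input_string:
        # Assumed rule: anything outside printable ASCII fails serialization.
        if not 0x20 <= ord(ch) <= 0x7E:
            raise ValueError(f"character not allowed: {ch!r}")
    # Escape backslashes first so the escapes added for quotes are not doubled.
    escaped = input_string.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'
```

A native string such as "café" passes the type check but fails the character check, which is the case the spec text needs to spell out.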

mnot added a commit that referenced this issue Feb 19, 2020
@mnot
Member

mnot commented Feb 26, 2020

Not hearing any more feedback, closing.

@mnot mnot closed this as completed Feb 26, 2020