httpwg / http-extensions Public
Define id- as a prefix for any Digest alg, not just id-sha-256 and id-sha-512 #885
Comments
@martinthomson can we provide a new, short, separate, I-D defining the I can provide it in a brief time if there's agreement on that. See martinthomson/http-mice#17 that's somewhat related. |
|
Let's add it in a FAQ and close this |
|
@LPardue should we file a new I-D for that and close this issue? Should we reserve the |
|
There's two things at play in my mind:
Point 2 is important not to lose IMO. My suggested course of action is to keep this issue open for the time being, @ioggstream to lead on putting together an independent Internet-Draft that describes "id- prefix for digest algorithms", and @ioggstream to lead on a WIP PR to replace the definition of |
|
I'll accept your suggestion to check the impacts. In general: 1- ok to a new I-D "id- prefix for digest algorithms" |
|
"id-" prefix for digest-algorithms:
IMHO id-sha-256 and id-sha-512 are one selling point for this I-D, as they give implementors the opportunity to fix a flaw in their current implementations (that is, they always send the sha-256 of the unencoded body, even when a Content-Encoding is applied) |
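
To make the difference concrete, here is a minimal Python sketch, assuming gzip as the content coding and the base64 value encoding used by Digest. The helper name `b64_sha256` and the sample body are made up for illustration; only the algorithm names `sha-256` and `id-sha-256` come from the draft.

```python
import base64
import gzip
import hashlib


def b64_sha256(data: bytes) -> str:
    """Return the base64-encoded SHA-256 of data (hypothetical helper)."""
    return base64.b64encode(hashlib.sha256(data).digest()).decode("ascii")


# Unencoded representation data (sample payload, an assumption).
body = b'{"hello": "world"}'

# What goes on the wire when "Content-Encoding: gzip" is applied.
encoded = gzip.compress(body)

digest_values = {
    # sha-256 covers the representation as sent, i.e. the gzip-coded bytes.
    "sha-256": b64_sha256(encoded),
    # id-sha-256 covers the representation with no content coding applied.
    "id-sha-256": b64_sha256(body),
}

for name, value in digest_values.items():
    print(f"{name}={value}")
```

An implementation that gzips the body but still sends the hash of `body` labelled as `sha-256` is the broken case described above; relabelling that same value as `id-sha-256` would make it correct without changing how the hash is computed.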
|
Thanks for doing the work and sorry for the delay. I suspect that given how concise the I-D turned out being, we could incorporate some of that text back into digest without much impact. We should ask the WG whether they think the ability to have |
|
Both solutions are good. The only issue I see is for checksum algorithms starting with |
|
@martinthomson suggests separating those two I-Ds. Seems reasonable to me at this point. cc: @LPardue |
|
I think it would be better if we could find a way to distinguish these cases (digest of selected representation vs digest of unencoded representation) without overloading the digest name - for instance, with a separate set of header fields. |
|
In splitting the work, we can decide to spell the unencoded digest differently. I agree that something like 'Hash: foo' rather than 'Digest' offers some advantages. Allowing structured fields is a non-trivial benefit, for instance. |
|
Splitting WFM. How does this work procedurally? The id- algorithms were in the adopted document; do we need to seek adoption of the new document before removing things in digest? |
|
Any technical change that has the consensus of the working group can be enacted, no matter how disruptive. Of course, more significant changes probably need more thorough checking. Consulting the chairs or sending an email to the list would be where I start. |
|
The id- algorithms are in this document to help current implementations fix their behaviour (sending a digest that does not take content-coding into account) without breaking their current setup (e.g. by just changing the algorithm, or adding the id- one). While defining a new field is fine, for current implementers there is no value in adding this feature in a new field, as they are not using it, since it is not standardized ;) |
|
Do we actually have current implementations that use "id-"? How recent are those? Are they actively maintained? |
|
When reviewing the way Digest was used in various contexts (banking APIs, ...) I found that they were incompatible with RFC3230 mainly because:
I then started this I-D to provide guidance on computing the digest, but I had to find a way forward for those implementations which currently send a content-coded payload together with the digest of the unencoded payload. So your question:
We have broken implementations which compute This I-D is an attempt to make order in the current Digest usage providing implementers with some tool to improve their compliance: in this case, adding an |
|
Using structured fields is a separate issue. So I understand currently the "id-" prefix is not used (yet)? Is it easier to add "id-" as prefix as opposed to using a different field name? (just checking) |
the
Yes, because with Digest, people buy a way to transition and use different algorithms. |
But so would putting the hash into a different header field, no?
That is indeed a problem (similar to gzip-after-etag).
Could you please elaborate on that? |
|
Since Digest is used to convey the integrity of transactions, its values are usually long-term stored as transaction proof/checksum.
|
|
Well, the idea would be not to overload the digest name, but to put the digest into a differently named field which always applies to the identity-encoded content. (I think it's worthwhile to explore this before we start overloading algorithm names with semantics) |
|
I made an assumption earlier that I'd like to challenge. Does the suggestion to "split" actually require a new document? Would one possible option be to define an additional header ( I'm just a little concerned about the amount of activation energy required for a new document and how that might tie up progress on this I-D. |
|
Since this discussion in January, the WG has gradually formed consensus that defining a new Content-Digest header is a reasonable path forward. That leaves me wondering if "Identity-Digest" isn't such a bad idea now. Especially because it makes managing the algorithm repository easier. See #1555 for a good example of unintended consequences of overloading the algorithms with an |
|
@LPardue not sure... this path will lead to an |
|
Right. We typically try to avoid implementers shooting themselves in the foot by stating simple algorithm identifiers that have one meaning. We can't prevent them from doing obviously broken things like you illustrate. However, this issue specifically asks the question about whether we want to make I don't think there is much strong case for an Identity-Content-Digest. For example, if a range request yields a partial response with content coding, there's not much the client can do to remove the coding in order to calculate a digest to verify. |
|
My .02 - overloading the digest algo name means that there will be a bunch of algorithms who share a slab of text about HTTP operation. That's really weird. It also means that prefixes now mean something, which begs the question of what to do the next time you need to define a prefix. It's much cleaner to just define this as a new header with separate semantics. |
|
I'm more comfortable having either a separate header entirely or having parameters on the header objects themselves as opposed to having a semantic layer on the algorithm identifiers. And in completeness, for me personally, it makes the most sense for the |
|
We deprecated algorithm parameters for security reasons: I don't know if it's worth reintroducing them. @LPardue atm to move on I suggest to:
|
|
Sgtm. Let's take this plan to the list with a brief background summary just to ensure visibility. |
|
One clarification question. Related to Justin's. Do we want to define only an identity-content-digest and sidestep the matter of content encoding with identity altogether? |
|
I think from the perspective of most developers it's going to be something like: "I am making a request/sending a response and I have a bunch of bytes I'm going to put into the message body that I want to protect." The average application developer isn't going to think in terms of content encoding, identity, or anything like that. I fully appreciate that the spec needs to be precise about what a digest applies to, but we also have to account for how people will look at implementing this in the wild. |
|
@LPardue I think that ignoring content-encoding is not always feasible (eg. some content codings provide encryption). imho
if you take a closer look, I-C-D implementers have to compute the value before any coding is applied (on requests) and must validate the response after the content coding is removed: this means being aware of content-coding, and probably means computing the value directly in the actual application and not via "interceptors", "sequences" or "api gateway". @jricher It would be great if somebody could provide some pseudo-code implementations to improve the reasoning, eg like in #1555 . wdyt? PS: generally, devs implementing integrity should have basic knowledge of the "equivalence class" inferred by the communication layers. When they don't, bad things happen :) |
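
As one possible illustration of that flow, here is a rough Python sketch. It assumes gzip as the only content coding and a simple `sha-256=<base64>` field value; the field name `Identity-Content-Digest` is the one floated in this thread, while the value syntax and the `send`/`receive` helpers are hypothetical and not taken from any draft.

```python
import base64
import gzip
import hashlib

# Field name as discussed in this thread; the value syntax below is an assumption.
FIELD = "Identity-Content-Digest"


def identity_digest(data: bytes) -> str:
    """Base64 SHA-256 of the bytes with no content coding applied."""
    return base64.b64encode(hashlib.sha256(data).digest()).decode("ascii")


def send(body: bytes):
    """Sender: hash first, then apply the content coding."""
    headers = {
        FIELD: "sha-256=" + identity_digest(body),
        "Content-Encoding": "gzip",
    }
    return headers, gzip.compress(body)


def receive(headers: dict, content: bytes) -> bytes:
    """Receiver: remove the content coding first, then validate the hash."""
    if headers.get("Content-Encoding") == "gzip":
        content = gzip.decompress(content)
    # Split on the first "=" only, since base64 padding also uses "=".
    algorithm, _, expected = headers[FIELD].partition("=")
    if algorithm != "sha-256" or expected != identity_digest(content):
        raise ValueError("identity digest mismatch")
    return content


headers, content = send(b"payload the application wants to protect")
print(receive(headers, content))
```

Both helpers need to know whether a coding is applied, which is why an intermediary that adds or removes `Content-Encoding` without touching the digest leaves the value valid, but an interceptor that only sees the coded bytes cannot compute or check it.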
|
Yeah, I think this is a good illustration of the traps that people can fall into. Devs can be under the illusion that what they put in their application message is what gets serialized on the wire, but then some server configuration or middleware goes and changes it dynamically on the fly. |
|
My experience with implementing Consequently, I wonder whether |
|
I'm going to push back against any attempt to redefine the intentions of Digest. The premise of RFC3230 is clear, even if the execution of communication hasn't been. As mentioned at the last interim, specifications like metalink RFC 6249 use Digest as it was intended to fetch ranges of content across different requests and to validate the sum of the parts, not the parts individually. If older implementations got it wrong, they need to use the new thing that is much more constrained. |
|
The way I think about these different digests is, for a receiver to validate in an order like so • check Content-Digest to prove the HTTP message content is valid. |
|
@LPardue As time goes by, I think we can defer |
|
Works for me, did you already have a WIP change that does that? |
draft-ietf-httpbis-digest-headers introduces id-sha-256 as a separate algorithm to sha-256 for the Digest HTTP header to allow an integrity check that doesn't depend on the content-encoding.
The desire to apply integrity to the content-encoded representation, or the decoded representation, is independent of the choice of integrity algorithm.
How about defining "id-" as a prefix that can be applied to any integrity algorithm to indicate the choice of representation? That avoids the need to duplicate sha-256 and sha-512 entries (and others) in the table of Digest algorithms.
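
A rough sketch of how a receiver might treat "id-" as a generic prefix, in Python. The `BASE_ALGORITHMS` table and both function names are hypothetical; the sketch assumes the prefix is simply stripped to find the underlying hash and that it selects the unencoded representation as input.

```python
import base64
import hashlib

# Hypothetical registry of base algorithms; entries are listed once, with no id- duplicates.
BASE_ALGORITHMS = {
    "sha-256": hashlib.sha256,
    "sha-512": hashlib.sha512,
}


def parse_algorithm(name: str):
    """Split a digest algorithm name into (is_identity, base_name)."""
    if name.startswith("id-"):
        return True, name[len("id-"):]
    return False, name


def compute_digest(name: str, encoded: bytes, unencoded: bytes) -> str:
    """Hash the unencoded bytes for id-* names, the content-coded bytes otherwise."""
    is_identity, base = parse_algorithm(name)
    hasher = BASE_ALGORITHMS[base]  # raises KeyError for unknown algorithms
    data = unencoded if is_identity else encoded
    return base64.b64encode(hasher(data).digest()).decode("ascii")
```

One consequence of this design is that no registered base algorithm could itself have a name starting with `id-`, since the parser would misread it as the prefixed form of something else.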