Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fold fixity, external-body specs into resource management GET, POST, PUT #15

Merged
merged 2 commits into from
Jan 18, 2017

Conversation

barmintor
Copy link
Contributor

  • move Digest, Want-Digest specifications to GET, POST, PUT
  • move Content-Type: message/external-body spec to GET, POST, PUT
  • add description of Expect: 202-digest for persistence fixity check
  • add description of Expect: 202-digest-link for persistence fixity check
  • remove draft specification of fixity resources
  • resolves issue-13

- move Digest, Want-Digest specifications to GET, POST, PUT
- move Content-Type: message/external-body spec to GET, POST, PUT
- add description of Expect: 202-digest for persistence fixity check
- add description of Expect: 202-digest-link for persistence fixity check
- remove draft specification of fixity resources
- resolves issue-13
@barmintor
Copy link
Contributor Author

All right, respec markup should be fixed and I drafted the other Expect token we discussed for @ajs6f use case of linked fixity resources for complex verification. Please review.

<h3>Ignored Expectations and 200 (OK)</h3>
<p>A client aware of <code>202-digest</code> that receives a 200 should assume it is responsible for verifying fixity by downloading the resource.</p>
</section>
<section id="202-digest-link">
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the references to 202-digest-link be marked informative? I don't know we would write a TCK for them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, dunno. But I don't like informative because I do want this as a real part of the spec. Let me think about a test that everyone would believe in.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll face a similar problem with 202-digest, but that is at least buttressed by the PUT+Digest behavior and the simplicity of the check. The linked behavior seems much more arbitrary- I would feel only a little uncomfortable saying "MD5 is required", but a lot uncomfortable saying "framemd5 is required".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't want to require any particular form. We would need a check that is agnostic. Like somehow just to see that some linked resource was retrieved. Maybe there's a "always passes" form and the test can just offer a resource to be linked and then assert that it was retrieved, just thinking out loud.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But as I say elsewhere, I don't know that we really require a check. I would be more comfortable in many ways packing this into the form of the instance-digest, so that the other machinery already in play would work. Why aren't we doing that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought there was consensus that some digest/verification information will not fit into the form of an instance digest, and must be submitted as a related document? It is also worth noting that the instance digest form is in part regulated by an iana algorithm registry via RFC3230.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, sure, the data wouldn't fit, but a link certainly would. An URI is generally not going to be much bigger than a checksum. The registry is a different story, that's severe. Hm. But if Expect: 202-digest is ours to begin with, can't we say that we accept a superset of the registry in that token position?

<code>instance-digest</code> values are of the syntax described in [[!RFC3230]] for the <code>Digest</code> header.
</p>
<p>HTTP/1.1 servers that do not support the 202-digest expectation MUST reject requests with 417 Expectation Failed.</p>
<p>An implementing server MAY allow <code>202-digest</code> without a <code>digest-param</code>. Such servers SHOULD compare the calculated instance digest to a stored original digest. Servers that require a client-proffered instance-digest MUST reject requests without a digest with 417 Expectation Failed.</p>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if a server wants to store more than one type of fixity?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this precludes that, but maybe I don't follow your question?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would the client select one of them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the client is not identifying a digest, and the server supports defaulting, the client has abrogated the selection, no? I think it's a questionable thing to do, but it aligns with some of the use case feedback.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I'm asking if the client wants to use a stored value and there is more than one, which one gets selected (or is that an impl choice)? It's fine for that to be an impl choice, but I think we should be explicit about that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should make it more explicit- I was trying to accommodate deferring to the server. If the client wants to pick one, it should have to explicitly pick it, I think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's cool. Just trying to think of use cases. If I want to explicitly pick from a selection of digests and I want to store them server-side, I should probably just put them in properties or something like that, right? So no prob, but let's just maybe be more explicit about "If the client wants to pick one, it [will] have to explicitly pick it".

digest-link-expectation = "202-digest-link" rel-param
rel-param = ";" "rel" "=" ( token | quoted-string )
</code></pre>
<p>HTTP/1.1 servers that do not support the 202-digest-link expectation MUST reject requests with 417 Expectation Failed.</p>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused-- why would we want servers that don't do this expectation to be considered legitimate Fedora servers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The discussion about the strict optionality of all of the verification behavior besides Want-Digest, etc., makes me feel like the disqualifying part is bad behavior about Expect in general.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand. I'm saying that we should not be making this stuff optional. Fedora servers have to do fixity in the form we have described. (Which is really lightweight and cheap and really not a big deal.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to see this language change to be precise in that the behavior that causes a 417 is that the server does not understand the instance-digest, not the 202-digest-link form. Understanding the link form is not optional. If you would rather I make that edit afterwards, that fine, just say so.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree that the language should be more precise. If the server did not understand the instance-digest (i.e. does not support the provides algorithm), or does not know how to interpret the contents of the provided link when given a 202-digest-link expectation, would it acceptable for it to return 200 (as described in the "Ignored expectations" non-normative section)? Would it be acceptable for it to return a 417? Is either response equally appropriate?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This thread is out of date with #15 (comment), so I think we can let it hang.

</section>
<section id="202-link-digest-preamble" class="informative">
<h3>Complex Fixity and Linked Fixity Resources</h3>
<p>Digital preservationists acknowledge the <a href="http://dericed.com/papers/reconsidering-the-checksum-for-audiovisual-preservation/">limits of simple bit fixity verification</a> to describe the state of some types of resources, eg large media files. Servers that can support more sophisticated assessments of content difference may use the <a href="#202-digest-link">202-digest-link</a> extension to offer more appropriate verification of such resources, with 406 responses accompanied by appropriate explanatory entities.</p>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems pretty cool.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's what we hashed out in IRC, if I remember correctly. I'll try to go back and link the archive to #13 as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I don't remember that, but whatevs-- more importantly, the question you raise about testing makes me wonder if this is such a good form after all. See my question elsewhere on this PR about using a form of instance-digest.

@ajs6f
Copy link
Contributor

ajs6f commented Jan 17, 2017

So by and large I am okay with this PR. I have one qualm that will not let me merge this. I'd like to get rid of 202-digest-link in favor of saying that digest includes the link form. In other words, changing (modulo spacing which got screwed up)

digest-expectation    = "202-digest" [digest-param]
digest-param             = ";" #(instance-digest)

to

digest-expectation    = ("digest" "=" [digest-param]) | ("digest-link" "=" [digest-link])
digest-param            = instance-digest
digest-link                 = URI-reference

where URI-reference is from RFC 2396.

We should specify that for an absent value the behavior is as we've discussed elsewhere. Also, I'm suggesting that we don't give multiple fixities in a single header, which seems very hard to read to me, even though I realize that Expect allows it, but it's not something I feel that strongly about.

<p>GET requests to a LDP-NR with <code>Content-Type: message/external-body</code>, MUST result in an HTTP 3xx redirect message redirecting to the external URL.</p>
<section id="httpGETLDPNR-fixity-expectation">
<h3><code>Expect: 202-digest</code> and <code>Expect: 202-digest-link</code></h3>
<p>GET requests to a LDP-NR SHOULD respond to <code>Expect</code> request headers with a <a href="#202-digest">202-digest</a> expectation. Implementations that do not support this expection MUST reject <a href="#202-digest">202-digest</a> requests with a 417 Expectation Failed. If the LDP-NR is of <code>Content-Type: message/external-body</code>, the server must respond with a 3xx. If the digest parameter does not match the current instance digest of the resource, the request MUST be rejected with a 406 Not Acceptable. If the digest matches, the request MUST be completed with a 202 Accepted.</p>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't 406 Not acceptable being overloaded in this case? What happens if a client includes a Want-Digest and an Accept. Does a 406 mean the digest didn't match, or that the server could not produce a media type accepted by the client?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a good point. Maybe it's better to give 417 for both "I don't understand your kind of fixity" and "I understand your kind of fixity and your fixity is wrong"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think a 417 to all would be more clear: "The fixity check you requested isn't happening, or failed. Check it out". Maybe add that responses SHOULD contain a response body explaining the nature of the expectation failure?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to both of these suggestions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that is explicitly against what Expect and 417 define- it is not for that, it is only for unsupported Expect. In this case, You do not send Want-Digest and an Expect together, nor do you send an Accept header, since the Digest behavior is scoped to the resource identified by the URI. We might be abusing 406, but I think this change would abuse 417.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again from the HTTP RFC

10.4.13 412 Precondition Failed
   The precondition given in one or more of the request-header fields
   evaluated to false when it was tested on the server. This response
   code allows the client to place preconditions on the current resource
   metainformation (header field data) and thus prevent the requested
   method from being applied to a resource other than the one intended.

That's pretty generic... we kind of are talking about "a resource other than the one intended", although they share the same URI... sort of.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although... (continuing to c&p the entire HTTP spec)

10.4.18 417 Expectation Failed
   The expectation given in an Expect request-header field (see section
   14.20) could not be met by this server, or, if the server is a proxy,
   the server has unambiguous evidence that the request could not be met
   by the next-hop server.

I think that's actually a lot more flexible than the idea of it given by the header def'n quoted above. The expectation given in an Expect request-header field ... could not be met by this server... That is pretty much what we are talking about with a failed digest match.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the very least Expect and 417 a little confusing RFC7231. The language regarding 417 ranges from the general "the request's Expect header field could not be met" to the specific "The response chain does not support expectations."

@barmintor you mention in another comment using a 412. That seems to less contradictory than 406, and more helpful than a 417. I like it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@birkland I have to give @ajs6f credit for the 412 suggestion (and, to be fair, for many ideas in this PR)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blushes GO TEAM!

I think we can say that all of us are collectively most comfortable with 412 right now. It's not perfect, but it seems to be as good as we are going to do. I think it does a good enough job of

  1. Distinguishing between "I don't understand the request" and "I understand the request and it failed"
  2. Avoiding any explicit misuse of HTTP semantics

</p>
<p>Implementations MUST support <code>Content-Type: message/external-body</code> extensions for request bodies for HTTP <code>POST</code> that would create LDP-NRs.
This content-type requires a complete <code>Content-Type</code> header that includes the location of the external body, e.g
<code>Content-Type: message/external-body; access-type=URL; URL=\"http://www.example.com/file\"</code>, as defined in [[!rfc2017]].</p>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hang on! I may be missing something here but we seem to have lost the actual behavior in this edit (MUST return a 3xx to the external resource) and the same below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we MUST 3xx on POST/PUT, or on GET? It's in the GET section iirc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, you're right-- cool, I somehow didn't see that.

@barmintor
Copy link
Contributor Author

barmintor commented Jan 17, 2017 via email

@barmintor
Copy link
Contributor Author

barmintor commented Jan 17, 2017 via email

@barmintor
Copy link
Contributor Author

barmintor commented Jan 17, 2017 via email

@barmintor
Copy link
Contributor Author

barmintor commented Jan 17, 2017 via email

@ajs6f
Copy link
Contributor

ajs6f commented Jan 18, 2017

Can I just say right here that I think Github's change to this "review or single comment" thing has NOT been an improvement? Just getting that off my chest.

Okay, let me see if I can coagulate some of our threads:

  1. I think we're all cool for the mo with 412 for "Your fixity is WRONG", 417 for "I don't understand what you mean by that fixity" and 202 for "Your fixity is correct" and 406 for "your Accept header disgusts me to the core of my being".

  2. I vaguely wonder if it would be good ergonomics to offer a way to both do fixity and retrieve the bitstream (get a 200 instead of a 202). But maybe it's just a case of "client can do two requests, it's not that big a deal". I am certainly not going to push this. I will say that I think we can usefully say that servers SHOULD include an informative message for every case. (Obviously, the nature of the information would be very different.)

  3. @barmintor's criticisms of my suggestion about folding things together (digest and link options) are certainly right. Let's definitely do instance digest | digest-link-param . That's probably a little more semantically clear, to boot.

Did I catch every thread? Probably not. Fedora is like a fractal cocoon winding and unraveling through our minds and our networks, never completed and never untangled.

@birkland
Copy link
Contributor

@ajs6f It's a little bit hard to follow at this point, but I think those are the major threads.. maybe merge it in soon and raise individual issues for any loose ends or open questions? That would be a little easier to follow.

@ajs6f
Copy link
Contributor

ajs6f commented Jan 18, 2017

I'm fine with that, if @barmintor agrees and feels that he can make a last commit to match the decisions we've outlined. Let's make some progress!

@ajs6f
Copy link
Contributor

ajs6f commented Jan 18, 2017

Also, do we want an additional param for the link that suggests what to do with it (ie, use it as a framemd5 document). I know we lack a vocabulary to refer to, but it seems like another parameter would be necessary.

Mmmm..... I'd kind of like to leave it be. We can imagine for now that any linked fixity resource will provide of itself, if necessary, some means by which to understand its semantics. For example, a server could natively understand fixity resource with certain mimetypes, or an RDF-encoded fixity resource could refer to some mechanism with an internal link.

The thing I'm trying to understand is the use case for two requests to the same resource, each with the same linked fixity resource, but with two different links for interpreting those fixity resources...

replace 406 on digest verification failure with 412
explicate non-position on conneg and fixity
@ajs6f ajs6f merged commit 525bc6e into fcrepo:master Jan 18, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants