<html>
<head>
<title>SWORD 001: HTTP header fields for packaged content delivery</title>
<style type="text/css">
body {
font-family: sans-serif;
font-size: 12pt;
}
.toc {
font-size: 10pt;
}
.code {
font-family: monospace;
background: #ffffaa;
padding-left: 15px;
padding-right: 15px;
padding-top: 5px;
padding-bottom: 5px;
margin-top: 5px;
margin-bottom: 5px;
font-size: 10pt;
}
.code_inline {
font-family: monospace;
background: #ffffaa;
padding-right: 5px;
padding-left: 5px;
padding-top: 3px;
padding-bottom: 3px;
font-size: 10pt;
}
</style>
</head>
<body>
<h1>SWORD 001: HTTP header fields for packaged content delivery</h1>
<h2>Credits</h2>
<p>
<strong>SWORD 2.0 Technical Lead</strong>: Richard Jones, Cottage Labs
</p>
<p>
<strong>SWORD 2.0 Community Manager</strong>: Stuart Lewis, University of Auckland
</p>
<p>
<strong>SWORD 2.0 Technical Advisory Group</strong><br/>
Julie Allinson, University of York<br/>
Tim Brody, University of Southampton<br/>
David Flanders, JISC<br/>
Graham Klyne, University of Oxford<br/>
Alister Miles, University of Oxford<br/>
Ben O'Steen, Cottage Labs<br/>
Mark MacGillivray, Cottage Labs<br/>
Rob Sanderson, LANL<br/>
Nick Sheppard, Leeds Metropolitan University<br/>
Eddie Shin, MediaShelf<br/>
Ian Stuart, University of Edinburgh<br/>
Ed Summers, Library of Congress<br/>
David Tarrant, University of Southampton<br/>
Graham Triggs, BioMed Central<br/>
Scott Wilson, University of Bolton<br/>
</p>
<p>
<strong>Further acknowledgements of input</strong><br/>
Aaron Birkland (Cornell University), Tim Donohue (DuraSpace), Jim Downing (University of Cambridge), Ross Gardler (OSS Watch), Steve Midgley (US Department of Education), Glen Robson (National Library of Wales), Peter Sefton (University Of Southern Queensland), Adrian Stevenson (UKOLN), Paul Walk (UKOLN), Nigel Ward (University of Queensland)
</p>
<h2>1. Introduction</h2>
<p>
Document management systems often need to move content
between applications and servers as groups of files with associated
metadata. The HTTP and AtomPub protocols are a good match for this
requirement, but do not make certain information easily available to the
systems exchanging data.
This specification describes HTTP headers that
enhance the delivery of packaged content over HTTP from client software to document
servers and similar systems, and that carry authority information
relating to that transfer.
</p>
<p>
The specification has been informed by the JISC funded SWORD series of projects to develop
deposit technology for repository systems. A first version of SWORD addressed simple one-time
fire-and-forget transfer of packaged content from a client environment to a repository.
Subsequently, the need to fully support CRUD against such repositories and other scholarly
systems has become evident, and the HTTP headers presented here have been derived from a
detailed requirements analysis of the sector.
</p>
<p>
During the requirements analysis, existing internet standards were considered. In particular,
extending the existing Accept headers and using Media Features were both
considered in detail, and were found either to be insufficiently flexible or to add
unnecessary complexity to the work.
</p>
<h3>1.1 Introduction to Packaging</h3>
<p>
"Packaging" in the context of this document refers to resources delivered over
HTTP which are comprised of component resources. For example, a ZIP file containing multiple
files can be considered a "package". Further, some ZIP files have a well defined internal
structure which is not accurately portrayed by the <span class="code_inline">Content-Type</span>
header (which would usually be application/zip).
</p>
<p>
The aim of this specification is to enable the client and server to talk meaningfully about
those packaged, compound objects.
</p>
<h2>2. Notational Conventions</h2>
<p>
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [<a href="#rfc2119">RFC2119</a>].
</p>
<p>
The version of BNF used in this document is taken from [<a href="#rfc2616">RFC2616</a>], and
many of the nonterminals used are defined there. Note that the
underlying charset is US-ASCII.
</p>
<h2>3. Packaging</h2>
<p>
The <span class="code_inline">Packaging</span> header applies to resources delivered over
HTTP which are comprised of component resources. It uniquely identifies these well structured
packaged objects, in much the same way that <span class="code_inline">Content-Type</span> identifies
MIME formats.
</p>
<p>
The <span class="code_inline">Packaging</span> header is constructed as follows:
</p>
<div class="code">
Packaging = "Packaging" ":" absoluteURI
</div>
<p>
Note that <span class="code_inline">absoluteURI</span> is defined in
[<a href="#rfc2616">RFC2616</a>].
</p>
<p>
For example:
</p>
<div class="code">
<pre>
POST /collection HTTP/1.1
Host: example.org
Content-Type: application/zip
Content-Length: [content length]
Content-MD5: [md5-digest]
Content-Disposition: attachment; filename=[filename]
Slug: [suggested identifier]
Packaging: http://purl.org/net/terms/package/default

[request entity]
</pre>
</div>
<p>
The <span class="code_inline">Packaging</span> request header SHOULD be used by the client during
HTTP POST or PUT to tell the server which packaging
format was used to construct the content being sent. Servers
SHOULD use this information to unpack the supplied content into its
component parts. If the server does not understand the package
format it MUST either store the content as delivered without
unpacking or respond with <span class="code_inline">415</span> (Unsupported Media Type).
</p>
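<p>
The server-side choice described above can be sketched in Python as follows. This is only an
illustration under stated assumptions: the function name, the set of supported formats, the
<span class="code_inline">store_unknown_as_is</span> switch, the use of status 201 for an accepted
deposit, and the treatment of a missing header are all hypothetical and are not defined by this specification.
</p>
<div class="code">
<pre>
```python
# Hypothetical set of package formats this server knows how to unpack.
# The URI comes from the example above; treating it as supported is an assumption.
SUPPORTED_PACKAGING = {"http://purl.org/net/terms/package/default"}


def handle_packaging(packaging, store_unknown_as_is=True):
    """Decide how to treat a deposit based on its Packaging header value.

    Returns a (status_code, action) pair where action is "unpack",
    "store" or "reject". Status 201 is an illustrative choice.
    """
    if packaging is None:
        # Assumption: without a Packaging header the content is stored as delivered.
        return 201, "store"
    if packaging.strip() in SUPPORTED_PACKAGING:
        # Known format: the server SHOULD unpack into component parts.
        return 201, "unpack"
    if store_unknown_as_is:
        # Unknown format, first permitted option: keep the content unmodified.
        return 201, "store"
    # Unknown format, second permitted option: 415 Unsupported Media Type.
    return 415, "reject"
```
</pre>
</div>
<p>
Whether an unknown format is stored or rejected is a server policy decision; the sketch exposes it
as a flag only to show that both outcomes satisfy the requirement.
</p>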
<p>
The <span class="code_inline">Packaging</span> request header MUST contain a
URI. At this point there is no registry of allowed URIs. MIME media types are
not permitted in this field, and if a MIME media type exists to describe the
packaging format, it SHOULD be used in the corresponding <span class="code_inline">
Content-Type</span> header.
</p>
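<p>
A receiver might apply a loose check that a <span class="code_inline">Packaging</span> value is a
URI rather than a MIME media type. The sketch below is an assumption-laden illustration: the function
name is hypothetical, and the check only looks for a URI scheme using Python's standard
<span class="code_inline">urllib.parse.urlsplit</span>; it does not implement the full
<span class="code_inline">absoluteURI</span> grammar.
</p>
<div class="code">
<pre>
```python
from urllib.parse import urlsplit


def is_valid_packaging_value(value):
    """Loosely check that a Packaging header value looks like an absolute URI."""
    parts = urlsplit(value.strip())
    # An absolute URI carries a scheme ("http", "urn", ...). A MIME media
    # type such as "application/zip" has no scheme and is therefore refused.
    return bool(parts.scheme) and bool(parts.netloc or parts.path)
```
</pre>
</div>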
<h2>4. Accept-Packaging</h2>
<p>
The <span class="code_inline">Accept-Packaging</span> header applies to resources retrieved over
HTTP which are comprised of component resources, and is used to request these well structured packaged
objects from a server. It is
similar to the standard <span class="code_inline">Accept</span> headers provided by HTTP, but
it currently takes only a single URI and does not support q values.
</p>
<p>
The <span class="code_inline">Accept-Packaging</span> header is constructed as follows:
</p>
<div class="code">
Accept-Packaging = "Accept-Packaging" ":" absoluteURI
</div>
<p>
Note that <span class="code_inline">absoluteURI</span> is defined in
[<a href="#rfc2616">RFC2616</a>].
</p>
<p>
For example:
</p>
<div class="code">
<pre>
GET /item/27 HTTP/1.1
Host: example.org
Accept-Packaging: http://purl.org/net/terms/package/default
</pre>
</div>
<p>
The <span class="code_inline">Accept-Packaging</span> request header SHOULD be used by the client
during HTTP GET to indicate to the server the package format that the
content identified by the URI should be returned in. Servers MUST
either return the content in the specified format or return a <span class="code_inline">406 Not
Acceptable</span>.
</p>
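<p>
The rule above amounts to a simple lookup on the server. The following Python sketch illustrates it;
the function name, the <span class="code_inline">available</span> and
<span class="code_inline">default</span> parameters, and the behaviour when the header is absent are
assumptions made for the example and are not defined by this specification.
</p>
<div class="code">
<pre>
```python
def negotiate_packaging(accept_packaging, available, default):
    """Pick the package format for a GET response.

    accept_packaging: value of the Accept-Packaging header, or None if absent.
    available: collection of package format URIs the server can produce.
    default: format used when the client sent no header (an assumption).
    Returns a (status_code, packaging_uri) pair.
    """
    if accept_packaging is None:
        return 200, default
    requested = accept_packaging.strip()
    if requested in available:
        # The server can return the content in the requested format.
        return 200, requested
    # Otherwise the server must answer 406 Not Acceptable.
    return 406, None
```
</pre>
</div>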
<h2>5. On-Behalf-Of</h2>
<p>
The <span class="code_inline">On-Behalf-Of</span> request header MAY be used by the client to
tell the server the identity of the true owner of the content
during PUT, POST, DELETE and GET. Traditional HTTP authentication is
insufficient in this regard, because content may be
delivered machine-to-machine, in which case the authenticating user is the
client rather than the content owner. Servers SHOULD use this
information to determine whether and how they accept the content.
</p>
<div class="code">
On-Behalf-Of = "On-Behalf-Of" ":" (token | quoted-string)
</div>
<p>
For example:
</p>
<div class="code">
<pre>
POST /collection HTTP/1.1
Host: example.org
Authorization: [auth]
Content-Type: application/zip
Content-Length: [content length]
Content-MD5: [md5-digest]
Content-Disposition: attachment; filename=[filename]
Slug: [suggested identifier]
Packaging: http://purl.org/net/terms/package/default
On-Behalf-Of: jbloggs

[request entity]
</pre>
</div>
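<p>
Because the value may be either a token or a quoted-string, a receiver has to handle both forms.
The Python sketch below shows one way this might be done. The function name is hypothetical, and the
two regular expressions are an approximation of the token and quoted-string rules of
[<a href="#rfc2616">RFC2616</a>] rather than a complete implementation of that grammar.
</p>
<div class="code">
<pre>
```python
import re

# Approximate "token": one or more ASCII characters that are not control
# characters, space, or separators. \x3c and \x3e are the angle brackets.
_TOKEN = re.compile(r'^[^\x00-\x20\x7f"(),;:\\/\[\]?={}@\x3c\x3e]+$')
# Approximate "quoted-string": double quotes around plain text or
# backslash-escaped characters.
_QUOTED = re.compile(r'^"((?:[^"\\]|\\.)*)"$')
_QUOTED_PAIR = re.compile(r'\\(.)')


def parse_on_behalf_of(value):
    """Return the identifier carried by an On-Behalf-Of header value.

    Raises ValueError if the value is neither a token nor a quoted-string.
    """
    value = value.strip()
    quoted = _QUOTED.match(value)
    if quoted:
        # Drop the surrounding quotes and resolve backslash escapes.
        return _QUOTED_PAIR.sub(r'\1', quoted.group(1))
    if value.isascii() and _TOKEN.match(value):
        return value
    raise ValueError("malformed On-Behalf-Of value: %r" % value)
```
</pre>
</div>
<p>
With this sketch, <span class="code_inline">jbloggs</span> is accepted as a token, while a name
containing a space has to be sent as a quoted-string such as
<span class="code_inline">"Joe Bloggs"</span>.
</p>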
<h2>6. In-Progress</h2>
<p>
The <span class="code_inline">In-Progress</span> request header MAY be used by the client to
inform the server, during PUT, POST or DELETE, that the current content payload is not yet
complete in some unspecified way. For
example, there may be further content packages that the client
plans to deliver to the server before the full content has been
delivered, or the client may need to carry out other actions
against the server before confirming that the server can proceed to
fully process the content. Exact interpretation of this header is
left to the server, so each server/client pair needs
a common understanding of its meaning, which is beyond
the scope of this document.
</p>
<div class="code">
In-Progress = "In-Progress" ":" ("true" | "false")
</div>
<p>
For example:
</p>
<div class="code">
<pre>
POST /collection HTTP/1.1
Host: example.org
Authorization: Basic ZGFmZnk6c2VjZXJldA==
Content-Type: application/zip
Content-Length: [content length]
Content-MD5: [md5-digest]
Content-Disposition: attachment; filename=[filename]
Slug: [suggested identifier]
Packaging: http://purl.org/net/terms/package/default
On-Behalf-Of: jbloggs
In-Progress: true

[request entity]
</pre>
</div>
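<p>
A receiver could read the header value with a small helper such as the Python sketch below. The
function name and the <span class="code_inline">default</span> used when the header is absent are
assumptions, since this specification does not state a default. Matching is case-insensitive because
literal strings in the BNF of [<a href="#rfc2616">RFC2616</a>] are case-insensitive unless stated otherwise.
</p>
<div class="code">
<pre>
```python
def parse_in_progress(value, default=False):
    """Interpret an In-Progress header value as a boolean.

    value: the raw header value, or None if the header was not sent.
    default: result for a missing header (an assumption, not from the spec).
    Raises ValueError for anything other than "true" or "false".
    """
    if value is None:
        return default
    normalised = value.strip().lower()
    if normalised == "true":
        return True
    if normalised == "false":
        return False
    raise ValueError("In-Progress must be 'true' or 'false', got %r" % value)
```
</pre>
</div>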
<h2>7. Metadata-Relevant</h2>
<p>
The <span class="code_inline">Metadata-Relevant</span> request header MAY be used by the client to
instruct the server to (attempt to) extract metadata from the
supplied content package, during PUT, POST or DELETE. Content
packages commonly contain both file content and metadata about their
contents, and during unpacking servers may process this metadata in a
way which is meaningful to them. If the content package is being
supplied to an HTTP resource which is not interested in metadata, then it may be that the
enclosed information will not be correctly or adequately treated. This
directive allows the client to indicate to the server that there is metadata
contained within the package which may be of interest to related resources (for
example a resource which contains the resource receiving the content),
and that the server should be free to update those resources accordingly.
</p>
<div class="code">
Metadata-Relevant = "Metadata-Relevant" ":" ("true" | "false")
</div>
<p>
For example:
</p>
<div class="code">
<pre>
POST /media-resource/27 HTTP/1.1
Host: example.org
Authorization: [auth]
Content-Type: application/zip
Content-Length: [content length]
Content-MD5: [md5-digest]
Content-Disposition: attachment; filename=[filename]
Slug: [suggested identifier]
Packaging: http://purl.org/net/terms/package/default
On-Behalf-Of: jbloggs
In-Progress: true
Metadata-Relevant: false

[request entity]
</pre>
</div>
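<p>
On the client side, the headers in the example above could be assembled as in the following Python
sketch. The function name, its parameters and their defaults are hypothetical. The
<span class="code_inline">Content-MD5</span> value is encoded as the base64 form of the binary
digest, as [<a href="#rfc2616">RFC2616</a>] describes for that header; this specification does not
itself say which encoding a given server expects, so treat that choice as an assumption.
</p>
<div class="code">
<pre>
```python
import base64
import hashlib


def build_deposit_headers(payload, filename, packaging, on_behalf_of=None,
                          in_progress=False, metadata_relevant=False):
    """Build the request headers for depositing a ZIP package.

    payload is the package as bytes. The defaults for in_progress and
    metadata_relevant are illustrative assumptions.
    """
    digest = hashlib.md5(payload).digest()
    headers = {
        "Content-Type": "application/zip",
        "Content-Length": str(len(payload)),
        "Content-MD5": base64.b64encode(digest).decode("ascii"),
        "Content-Disposition": "attachment; filename=%s" % filename,
        "Packaging": packaging,
        "In-Progress": "true" if in_progress else "false",
        "Metadata-Relevant": "true" if metadata_relevant else "false",
    }
    if on_behalf_of is not None:
        # Only sent when depositing for someone other than the authenticated user.
        headers["On-Behalf-Of"] = on_behalf_of
    return headers
```
</pre>
</div>
<p>
The resulting dictionary can be passed to whichever HTTP client library is in use, together with the
<span class="code_inline">Authorization</span> and <span class="code_inline">Slug</span> headers if
they are needed.
</p>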
<h2>8. References</h2>
<h3>8.1. Normative References</h3>
<p>[<a name="rfc2119">RFC2119</a>] <a href="http://www.ietf.org/rfc/rfc2119.txt">S. Bradner, "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.</a></p>
<p>[<a name="rfc2616">RFC2616</a>] <a href="http://www.ietf.org/rfc/rfc2616.txt">R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, T. Berners-Lee,
"Hypertext Transfer Protocol -- HTTP/1.1",
RFC 2616, June 1999.</a></p>
<h2>Copyright and License Notice</h2>
<p>
Copyright (C) SWORD (2011). All Rights Reserved.
</p>
<p>
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to SWORD, except as needed for the
purpose of developing Internet standards in which case the procedures
for copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.
</p>
<p>
The limited permissions granted above are perpetual and will not be
revoked by SWORD or its successors or assigns.
</p>
<p>
This document and the information contained herein is provided on an
"AS IS" basis and SWORD DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
</p>
</body>
</html>