Option 3: An Alternative Proposal #6

Open
ikreymer opened this Issue May 25, 2017 · 0 comments

ikreymer commented May 25, 2017

Based on all the other comments and thoughts here, I wanted to suggest a new proposal, a variation on Option 1 as well as a few other things.

This proposal is optimized for the Memento Reconstruct and http://oldweb.today use case, and considers individual archives as well as aggregators. Using this system, it should be possible to implement aggregators which efficiently query archives that support raw mementos, while filtering out those that do not. Such an aggregator would be suitable for use with the Memento Reconstruct and http://oldweb.today services.

Values for Prefer/Preference-Applied

  • Prefer: raw - request raw unrewritten content and raw headers, where possible. Hop-by-Hop headers should be prefixed with X-Archive-Orig-

  • Prefer: rewritten - request rewritten content, suitable for displaying to a user (optional, default preference if omitted)

TimeGate

User makes a request with a preference: curl -H "Prefer: raw" -H "Accept-Datetime ..." "http://archive.example.com/timegate/http://example.com/"

  • If the archive can satisfy the preference, 302 Redirect is returned with Preference-Applied, Vary: accept-datetime, prefer

  • If the archive does not support the preference for any URI-R, return 415 Unsupported Media Type, Vary: prefer

  • If the archive does not support the preference for this URI-R only, return 404 Not Found, Vary: prefer

  • If the archive does not yet support this feature, Prefer is not included in Vary (fallback to default Memento behavior)
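
As a rough sketch of the first three rules above, the decision on the archive side could look like the following. The function name, its parameters and the idea of passing in sets of supported preferences are assumptions made for illustration only; they are not part of the proposal or of any existing implementation. The fourth case (an archive without this feature) is not modeled, since such an archive simply ignores Prefer.

```python
def timegate_response(prefer, supported_prefs, available_for_uri_r, memento_url):
    """Return (status, headers) for a TimeGate request.

    prefer              -- value of the Prefer header, or None if omitted
    supported_prefs     -- preferences this archive supports at all (hypothetical input)
    available_for_uri_r -- preferences available for the requested URI-R (hypothetical input)
    memento_url         -- URI-M to redirect to when the preference can be satisfied
    """
    # "rewritten" is the default preference when the header is omitted.
    prefer = prefer or "rewritten"

    if prefer not in supported_prefs:
        # Not supported for any URI-R.
        return 415, {"Vary": "prefer"}

    if prefer not in available_for_uri_r:
        # Supported in general, but not for this URI-R.
        return 404, {"Vary": "prefer"}

    return 302, {
        "Location": memento_url,
        "Preference-Applied": prefer,
        "Vary": "accept-datetime, prefer",
    }
```

For example, an archive that only serves rewritten content would answer a Prefer: raw request with 415, while a request without a Prefer header would be redirected with Preference-Applied: rewritten.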

TimeGate Aggregator

The aggregator TimeGate accepts a Prefer header and passes it on to each individual TimeGate.

  • If an individual TimeGate returns a 2xx or 3xx response with both Vary: Prefer and Preference-Applied present, include the response

  • If an individual TimeGate returns 415, this preference is not supported by this archive, so no need to query it again for this preference.

  • If an individual TimeGate returns 404, this preference is not supported for this URI-R, but include the archive in future queries for other URI-Rs

  • If an individual TimeGate does not yet support this extension, e.g. it returns a valid TimeGate response but with no Preference-Applied or no Vary: Prefer, treat it the same as a 415.

    • Alternative: have a lax and a strict mode that would allow including ambiguous responses, maybe through an additional Prefer: strict or Prefer: lax setting. For simplicity, just default to strict always.
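
The aggregator rules above could be sketched roughly as a classification of each individual TimeGate response, always in strict mode. The function name and the returned labels are made up for this example. How to handle other statuses such as 5xx is not covered by the proposal, so the "error" branch is purely an assumption.

```python
def classify_timegate_response(status, headers):
    """Classify one archive's TimeGate response for a request that carried Prefer.

    Returns one of (hypothetical labels):
      "include"      -- use this response
      "skip_archive" -- do not query this archive again for this preference
      "skip_uri"     -- skip for this URI-R only, keep querying for others
      "error"        -- anything else (not covered by the proposal)
    """
    # Header names are case-insensitive, so normalize them first.
    names = {k.lower(): v for k, v in headers.items()}
    vary = {token.strip().lower() for token in names.get("vary", "").split(",")}

    if status == 415:
        return "skip_archive"
    if status == 404:
        return "skip_uri"
    if 200 <= status < 400:
        if "prefer" in vary and "preference-applied" in names:
            return "include"
        # Valid TimeGate response without the extension headers:
        # strict mode treats it the same as a 415.
        return "skip_archive"
    return "error"
```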

TimeMap

  • A TimeMap should also accept a Prefer header and include only responses that satisfy the preference, e.g. a TimeMap of Prefer: raw should only include URI-Ms that are raw mementos.

  • If a TimeMap does not support a specific preference for any URI-R, it should return 415 Unsupported Media Type and Vary: Prefer

  • If a TimeMap does not support a specific preference for just this URI-R, it should return 404 Not Found and Vary: Prefer

  • If the aggregator is caching the TimeMap, the cache key should include both the preference and the URL.
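
A minimal sketch of such a cache key, assuming the aggregator keeps an in-memory dict. The helper name is hypothetical, and folding an omitted header into "rewritten" is an assumption based on rewritten being the default preference.

```python
def timemap_cache_key(prefer, url):
    """Build a cache key from the preference and the requested URL."""
    # Assumption: an omitted Prefer header shares a cache entry with "rewritten".
    preference = (prefer or "rewritten").strip().lower()
    return (preference, url)


# Usage: raw and rewritten TimeMaps for the same URL are cached separately.
cache = {}
cache[timemap_cache_key("raw", "http://example.com/")] = "raw timemap"
cache[timemap_cache_key(None, "http://example.com/")] = "rewritten timemap"
```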

TimeMap Aggregator

  • A TimeMap Aggregator should pass the Prefer header to the individual TimeMap URLs, and should merge only 200 responses with Preference-Applied and Vary: Prefer.

  • Similar to TimeGate, a 415 error indicates that the archive should not be queried again with this preference, and this result should be cached.

  • Similar to TimeGate considerations, TimeMaps without Vary: Prefer should be considered erroneous and not included, unless supporting a lax and strict option.
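
The merge step could look roughly like this in strict mode. The input shape (a list of tuples with already-parsed mementos) and all names are assumptions for the sake of the example. Sorting the merged result by datetime is also an assumption, not something the proposal requires.

```python
def merge_timemaps(prefer, responses, unsupported):
    """Merge per-archive TimeMap results for one preference.

    responses   -- iterable of (archive, status, headers, mementos), where
                   mementos is a list of (iso_datetime, uri_m) tuples
    unsupported -- set of (archive, preference) pairs, updated in place so
                   the aggregator can skip those archives next time
    """
    merged = []
    for archive, status, headers, mementos in responses:
        names = {k.lower(): v for k, v in headers.items()}
        vary = {token.strip().lower() for token in names.get("vary", "").split(",")}

        if status == 415:
            # Remember: don't query this archive again with this preference.
            unsupported.add((archive, prefer))
        elif status == 200 and "prefer" in vary and "preference-applied" in names:
            merged.extend(mementos)
        # Everything else (404, missing headers, ...) is left out in strict mode.

    # ISO 8601 timestamps in the same timezone sort correctly as strings.
    merged.sort(key=lambda memento: memento[0])
    return merged
```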

Memento

(Optional) Each URI-M includes a Preference-Applied header describing the dimension of rawness (so far, just rewritten or raw). It is the same value that was returned by the TimeGate prior to the redirect, or that was listed in the TimeMap. The URI-M should not include a Vary: Prefer.

Each URI-M can have one and only one Preference-Applied associated with it and it must not change based on any other header.

ikreymer added a commit to ukwa/pywb that referenced this issue Feb 28, 2018

memento prefer header: add support for Prefer header for specifying 'raw' or 'rewritten' mementos (ukwa/ukwa-pywb#12, based on mementoweb/rfc-extensions#6)

- 'enable_prefer: true' in config can be used to enable experimental Memento Prefer behavior
- Prefer header support both redirect and non-redirect style negotiation, extending existing Memento patterns
- Prefer header can be applied both on memento and timegate endpoints
- for redirect style negotiation, Prefer results in a redirect to final memento (if needed), both on Timegate and URL-M (Memento Pattern 2.3)
- for non-redirect style negotiation (Memento Pattern 2.2), Prefer header affects content being served and changes the Content-Location to the canonical representation
- Vary: Prefer and Preference-Applied headers always added to URL-M and Timegate responses