Docs and README Update for 2.0.0 (#277)
* docs and version update:
  - add docs for compatibility features
  - add docs for memento
  - update rewriter docs
  - bump version to 2.0.0, update README, and changelist
Showing 8 changed files with 284 additions and 21 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
APIs | ||
==== | ||
|
||
pywb supports the following APIs: | ||
|
||
.. toctree:: | ||
|
||
cdxserver_api | ||
memento | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,87 @@ | ||
.. _memento-api: | ||
|
||
Memento API | ||
=========== | ||
|
||
pywb supports the Memento Protocol as specified in `RFC 7089 <https://tools.ietf.org/html/rfc7089>`_ and provides API endpoints | ||
for Memento Timemaps and Timegates per collection. | ||
|
||
Memento support is enabled by default and can be controlled via the ``enable_memento: true|false`` setting in ``config.yaml``. | ||
|
||
|
||
TimeMap API | ||
----------- | ||
|
||
The timemap API is available at ``/<coll>/timemap/<type>/<url>`` for any pywb collection ``<coll>`` and any ``<url>`` in that collection. | ||
|
||
The timemap (URL-T) can be provided in several output formats, as specified by the ``<type>`` parameter: | ||
|
||
* ``link`` -- returns an ``application/link-format`` response as required by the `Memento spec <https://tools.ietf.org/html/rfc7089#section-5>`_ | ||
* ``cdxj`` -- returns a timemap in the native CDXJ format. | ||
* ``json`` -- returns the timemap in newline-delimited JSON (NDJSON) format. | ||
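
As a minimal sketch of the URL layout described above, the helper below assembles a timemap URL from its parts. The name ``timemap_url`` and the base address ``http://localhost:8080`` are assumptions for illustration; the helper is not part of pywb.

```python
# Hypothetical helper (not part of pywb): builds a timemap URL of the form
# <base>/<coll>/timemap/<type>/<url>, as described in this section.
VALID_TIMEMAP_TYPES = ("link", "cdxj", "json")


def timemap_url(base, coll, url, output_type="link"):
    """Return the timemap URL for ``url`` in collection ``coll``."""
    if output_type not in VALID_TIMEMAP_TYPES:
        raise ValueError("unsupported timemap type: {0!r}".format(output_type))
    # The target url is appended as-is, mirroring the examples in this document.
    return "{0}/{1}/timemap/{2}/{3}".format(
        base.rstrip("/"), coll, output_type, url
    )


print(timemap_url("http://localhost:8080", "pywb", "http://example.com/"))
```

Validating ``output_type`` up front keeps a typo such as ``links`` from turning into a request for an unrelated path.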
|
||
|
||
Although not required by the Memento spec, the Link output produced by timemap also includes the extra ``collection=`` field, specifying | ||
the collection of each url. This is especially useful when accessing the timemap for the special :ref:`auto-all` to view a timemap across | ||
multiple collections in a single response. | ||
|
||
|
||
The Timemap API is implemented as a subset of the :ref:`cdx-server-api` and should produce the same result as the equivalent CDX server query. | ||
|
||
For example, the timemap query: | ||
``http://localhost:8080/pywb/timemap/link/http://example.com/`` is equivalent to the CDX server query: | ||
``http://localhost:8080/pywb/cdx?url=http://example.com/&output=link`` | ||
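
The equivalence above can be sketched as a plain string rewrite. The function name ``timemap_to_cdx_query`` is hypothetical, and the sketch assumes that every ``<type>`` maps directly onto ``output=`` in the same way as the ``link`` example; it leaves the target url unencoded, as the example does.

```python
import re

# Splits <prefix>/timemap/<type>/<url>; the prefix is everything up to the
# first "/timemap/" segment (server address plus collection name).
_TIMEMAP_RE = re.compile(
    r"^(?P<prefix>.*?)/timemap/(?P<type>link|cdxj|json)/(?P<url>.+)$"
)


def timemap_to_cdx_query(timemap_url):
    """Rewrite a timemap URL into the corresponding CDX server query."""
    match = _TIMEMAP_RE.match(timemap_url)
    if match is None:
        raise ValueError("not a timemap URL: {0!r}".format(timemap_url))
    # Assumption: <type> is passed through unchanged as the output= value.
    return "{prefix}/cdx?url={url}&output={type}".format(**match.groupdict())


print(timemap_to_cdx_query(
    "http://localhost:8080/pywb/timemap/link/http://example.com/"
))
```

A target url that carries its own query string would need percent-encoding before being placed in ``url=``; that step is omitted here.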
|
||
|
||
TimeGate API | ||
------------ | ||
|
||
The TimeGate API for any pywb collection is ``/<coll>/<url>``, e.g. ``/my-coll/http://example.com/`` | ||
|
||
The timegate can be either a non-redirecting timegate (200-style negotiation), which returns the URL-M response directly, or a redirecting timegate (302-style negotiation), which redirects to a URL-M. | ||
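
In either mode, a Memento client negotiates by sending an ``Accept-Datetime`` request header, as defined in RFC 7089. The sketch below only prepares the TimeGate URL and that header; the function name ``timegate_request`` and the server address are assumptions for illustration, and no request is actually sent.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def timegate_request(base, coll, url, when):
    """Return (timegate_url, headers) for a datetime-negotiated request.

    ``when`` must be a timezone-aware datetime; it is converted to GMT
    because RFC 7089 requires the Accept-Datetime value to be in GMT.
    """
    timegate = "{0}/{1}/{2}".format(base.rstrip("/"), coll, url)
    stamp = format_datetime(when.astimezone(timezone.utc), usegmt=True)
    return timegate, {"Accept-Datetime": stamp}


timegate, headers = timegate_request(
    "http://localhost:8080",
    "my-coll",
    "http://example.com/",
    datetime(2007, 5, 31, 20, 35, tzinfo=timezone.utc),
)
print(timegate)
print(headers)
```

The returned pair can be passed to any HTTP client; omitting the header altogether is also valid, in which case the TimeGate is free to choose which memento it selects.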
|
||
.. _memento-no-redirect: | ||
|
||
Non-Redirecting TimeGate (Memento Pattern 2.2) | ||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
||
This behavior is consistent with `Memento Pattern 2.2 <https://tools.ietf.org/html/rfc7089#section-4.2.2>`_ and is the default behavior. | ||
|
||
To avoid an extra redirect, the TimeGate returns the requested memento directly (200-style negotiation) without redirecting to its canonical, timestamped url. | ||
The 'canonical' URL-M is included in the ``Content-Location`` header and should be used to reference the memento in the future. | ||
|
||
|
||
(For HTML mementos, the rewriting system also injects the url and timestamp into the page so that it can be displayed to the user.) This behavior optimizes network traffic by avoiding unneeded redirects. | ||
|
||
|
||
Redirecting TimeGate (Memento Pattern 2.3) | ||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
||
This behavior is consistent with `Memento Pattern 2.3 <https://tools.ietf.org/html/rfc7089#section-4.2.3>`_. | ||
|
||
To enable this behavior, add ``redirect_to_exact: true`` to the config. | ||
|
||
In this mode, the TimeGate always issues a 302 to redirect a request to the "canonical" URL-M memento. The ``Location`` header is always present | ||
with the redirect. | ||
|
||
As this approach always includes a redirect, use of this mode is discouraged when the intent is to render mementos. However, this approach is useful when the goal is to determine the URL-M and to provide backwards compatibility. | ||
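
A client that only wants the canonical URL-M can handle both TimeGate modes with one check: read ``Location`` on a 302 and ``Content-Location`` on a 200. The sketch below assumes the response is already available as a status code plus a plain dict of headers with canonical capitalization; ``resolve_urlm`` is a hypothetical name, and the URL-M in the usage lines is made up for illustration.

```python
def resolve_urlm(status, headers):
    """Return the canonical URL-M reported by a TimeGate response, or None.

    302 (redirecting timegate): the URL-M is in the Location header.
    200 (non-redirecting timegate): the URL-M is in Content-Location.
    """
    if status == 302:
        return headers.get("Location")
    if status == 200:
        return headers.get("Content-Location")
    return None


# Hypothetical URL-M, used only to exercise both branches.
example_urlm = "http://localhost:8080/my-coll/20140101000000/http://example.com/"
print(resolve_urlm(302, {"Location": example_urlm}))
print(resolve_urlm(200, {"Content-Location": example_urlm}))
```

Real HTTP clients usually expose case-insensitive header mappings, so the exact-case lookups here are a simplification.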
|
||
|
||
URL-M Headers | ||
------------- | ||
|
||
When serving a URL-M (any archived url), the following additional headers are included in accordance with the Memento spec: | ||
|
||
* ``Vary: accept-datetime`` is included as required | ||
* ``Link`` header with at least ``original``, ``timegate`` and ``timemap`` relations | ||
* ``Content-Location`` is included if using :ref:`memento-no-redirect` behavior | ||
|
||
(Note: the ``Content-Location`` header may also be included for a fuzzy-matched response, where the actual/canonical url differs from the requested url due to an inexact match.) | ||
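
To check the ``Link`` relations listed above on a client, the header value can be split into targets and ``rel`` values. The parser below is a simplified sketch, not a full Web Linking implementation, and the header value in the usage lines is a made-up example rather than captured pywb output.

```python
import re

# One link-value: <target> followed by zero or more ;name=value parameters.
# Quoted values may contain commas (e.g. datetime="Thu, 31 May 2007 ...").
_LINK_RE = re.compile(
    r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^",;]+))*)'
)
_PARAM_RE = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^",;]+))')


def parse_link_relations(value):
    """Map each rel value in a Link header to its first target URL."""
    relations = {}
    for target, params in _LINK_RE.findall(value):
        for name, quoted, bare in _PARAM_RE.findall(params):
            if name.lower() == "rel":
                # A rel parameter may carry several space-separated values.
                for rel in (quoted or bare).split():
                    relations.setdefault(rel, target)
    return relations


# Made-up header value for illustration.
link_header = (
    '<http://example.com/>; rel="original", '
    '<http://localhost:8080/my-coll/http://example.com/>; rel="timegate", '
    '<http://localhost:8080/my-coll/timemap/link/http://example.com/>; '
    'rel="timemap"; type="application/link-format"'
)
relations = parse_link_relations(link_header)
missing = {"original", "timegate", "timemap"} - set(relations)
print(relations)
print("missing relations:", sorted(missing))
```

Only the first target per relation is kept, which is sufficient for ``original``, ``timegate`` and ``timemap``; relations that legitimately repeat, such as ``memento``, would need a list per key.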
|
||