This project contains the setup for running the pywb web archive replay system against the Internet Archive's web archives.
It is still in an experimental/alpha phase and should be used only for testing replay.
Install the dependencies with:
pip install -r requirements.txt
This installs the latest pywb, uwsgi, and gevent.
Run with:
uwsgi uwsgi.ini
(The current default is gevent+uwsgi, but feel free to modify uwsgi.ini as needed.)
/web/
-> replays from https://web.archive.org/web/
For example, http://localhost:8080/web/20111231161728//example.com/
will replay equivalent content from http://web.archive.org/web/20111231161728/http://www.iana.org/domains/example/
using the pywb replay system.
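As a rough illustration of the URL layout above, the following sketch assembles a local replay URL from a timestamp and a target URL. The helper name `local_replay_url` is hypothetical and not part of pywb; the base address is the `localhost:8080` from the example, and the digits-only timestamp check is an assumption of this sketch.

```python
def local_replay_url(timestamp, target, route="web", base="http://localhost:8080"):
    """Build a replay URL of the form <base>/<route>/<timestamp>/<target>.

    Hypothetical helper for illustration only; not a pywb API.
    """
    # Assumption: timestamps are digits only, up to 14 of them (YYYYMMDDhhmmss).
    if not (timestamp.isdigit() and len(timestamp) <= 14):
        raise ValueError("timestamp must be 1 to 14 digits")
    return f"{base.rstrip('/')}/{route.strip('/')}/{timestamp}/{target}"


print(local_replay_url("20111231161728", "http://example.com/"))
```

Passing `route="ait/all"` would build the corresponding URL for the Archive-It route described further down.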
/ait/
-> replays from http://wayback.archive-it.org/<COLLID>
/ait/all/
-> replays from http://wayback.archive-it.org/all/
<COLLID>
corresponds to a collection from the http://archive-it.org/ service.
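The route-to-source mapping described so far can be summarized in a short sketch. This is not how the project itself is configured; `upstream_prefix` is a hypothetical function, and it assumes the collection ID appears as the path segment right after `/ait/`. The collection ID `1234` in the usage lines is made up.

```python
def upstream_prefix(path):
    """Return the upstream Wayback prefix for a local request path.

    Illustrative only: mirrors the /web/ and /ait/ routes described above.
    """
    parts = path.strip("/").split("/")
    if parts[0] == "web":
        return "https://web.archive.org/web/"
    # Assumption: /ait/<COLLID>/..., where "all" selects every collection.
    if parts[0] == "ait" and len(parts) > 1 and parts[1]:
        return f"http://wayback.archive-it.org/{parts[1]}/"
    raise ValueError(f"no upstream known for path: {path!r}")


print(upstream_prefix("/web/20111231161728/http://example.com/"))
print(upstream_prefix("/ait/all/2011/http://example.com/"))
print(upstream_prefix("/ait/1234/2011/http://example.com/"))  # 1234 is a made-up ID
```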
/item/<ITEMNAME>
-> replays from WARC files stored under http://archive.org/details/<ITEMNAME>
For any public ITEMNAME that has CDX files, this route replays content from that item only.
This will download the item's .idx file locally on first use, and access the .cdx.gz and WARC files remotely.
The item's .idx, .cdx.gz, and WARC files must be accessible for this to work.
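The "download on first use" behavior for the .idx file can be pictured with a small caching sketch. This is an assumption about the general pattern, not the project's actual code: `cached_idx_path` and the `fetch` callable are hypothetical, and the local file naming (`<item>.idx` inside a cache directory) is invented for illustration.

```python
import os
import tempfile


def cached_idx_path(item, cache_dir, fetch):
    """Return the local path of an item's .idx file, fetching it only once.

    `fetch(item)` is a hypothetical callable that returns the .idx bytes.
    """
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, item + ".idx")
    if not os.path.exists(path):
        # First use: retrieve the index and store it locally.
        data = fetch(item)
        with open(path, "wb") as fh:
            fh.write(data)
    return path


# Usage with a stand-in fetcher that records how often it is called.
calls = []

def fake_fetch(item):
    calls.append(item)
    return b"placeholder index data\n"

with tempfile.TemporaryDirectory() as tmp:
    first = cached_idx_path("ITEMNAME", tmp, fake_fetch)
    second = cached_idx_path("ITEMNAME", tmp, fake_fetch)
    fetched_once = first == second and calls == ["ITEMNAME"]

print(fetched_once)
```

In a real setup the fetcher would perform an HTTP request, while the .cdx.gz and WARC files would still be read remotely, as described above.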