General overview

hhockx edited this page Sep 3, 2014 · 1 revision

Wayback is an open source Java application designed to query and access archived web material. It was first released by the Internet Archive in September 2005, based on the (then) Perl-based Internet Archive Wayback Machine, to enable public distribution of the application and increase its maintainability and extensibility. The Open Source Wayback Machine (OSWM) has since been widely used by members of the International Internet Preservation Consortium (IIPC) and has become the de facto rendering software for web archives.

The Internet Archive (IA), also a member of the IIPC, handed over the lead repository of OSWM to the IIPC, which launched the OpenWayback project in October 2013. The objectives are to address common requirements and to set up stable testing and release processes, so that changes from all parties (including the IA) can be tested across a range of deployment contexts in different organisations before they make it into an ‘official’ release.

The Internet Archive continues to develop Wayback. The latest IA Wayback fork can be found at https://github.com/internetarchive/wayback/. The intention is to merge changes made on the IA fork so that the two forks do not diverge significantly.

OpenWayback supports two access or Replay modes: "Archival URL" and "Proxy".

In Archival URL mode, HTML documents returned to users are modified from the original version to provide a replay experience more consistent with viewing the original content. This is accomplished by one of two methods. The first modifies a subset of the HTML tags on the server and inserts JavaScript into the HTML page. The inserted JavaScript executes in the client browser after the page has loaded and modifies the remaining URLs within the HTML page, so that they become appropriate Archival URL requests back to the Wayback application. The second method rewrites all HTML tags within the page on the server, so that embedded URLs point back into the Wayback application.
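The server-side rewriting idea can be illustrated with a minimal Java sketch. It is not OpenWayback's actual implementation: the class name, the replay prefix `http://localhost:8080/wayback/`, the `prefix + 14-digit timestamp + "/" + original URL` layout, and the regex-based attribute matching are all assumptions made for illustration. A real rewriter would use a proper HTML parser and handle many more tags and attributes.

```java
import java.net.URI;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ArchivalUrlSketch {
    // Hypothetical replay prefix; a real deployment configures its own.
    static final String REPLAY_PREFIX = "http://localhost:8080/wayback/";

    // Simplification: only double-quoted href/src attributes are matched.
    static final Pattern URL_ATTR =
            Pattern.compile("(href|src)=\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    // Resolve a (possibly relative) reference against the archived page's
    // original URL, then prepend the replay prefix and capture timestamp.
    // URI.create throws IllegalArgumentException on malformed input.
    static String toArchivalUrl(String timestamp, String pageUrl, String ref) {
        String absolute = URI.create(pageUrl).resolve(ref).toString();
        return REPLAY_PREFIX + timestamp + "/" + absolute;
    }

    // Rewrite every matched attribute so it points back into the replay prefix.
    static String rewriteHtml(String html, String timestamp, String pageUrl) {
        Matcher m = URL_ATTR.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String rewritten = toArchivalUrl(timestamp, pageUrl, m.group(2));
            String attr = m.group(1) + "=\"" + rewritten + "\"";
            m.appendReplacement(out, Matcher.quoteReplacement(attr));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<a href=\"/about\">About</a><img src=\"img/logo.png\">";
        System.out.println(
                rewriteHtml(html, "20140903000000", "http://example.com/dir/page.html"));
    }
}
```

In this sketch, both the absolute path `/about` and the relative path `img/logo.png` are first resolved against the page's original URL, so the browser's follow-up requests carry the full original address and the capture timestamp back to the replay service.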

In Proxy mode, the Wayback Replay UI acts as an HTTP proxy server, allowing users to configure their browsers to proxy all HTTP requests through the Wayback application. No hyperlink alteration is needed; content just works as-is. Any hyperlink references found in replayed documents will automatically be requested through Wayback, including dynamic content.
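From the client side, Proxy mode amounts to sending ordinary HTTP requests for the original URLs through the Wayback host. The following Java sketch shows this with the standard `java.net.Proxy` API instead of a browser setting. The class name, the proxy host `localhost`, and the port `8090` are hypothetical placeholders, not values taken from OpenWayback's configuration.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;

class ProxyReplaySketch {
    // Build an HTTP proxy descriptor pointing at a Wayback instance
    // assumed to be running in Proxy mode on the given host and port.
    static Proxy waybackProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, new InetSocketAddress(host, port));
    }

    // Request the ORIGINAL URL unchanged; the proxy decides what to serve.
    static int fetchStatus(String originalUrl, Proxy proxy) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(originalUrl).toURL().openConnection(proxy);
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        Proxy proxy = waybackProxy("localhost", 8090); // hypothetical host and port
        try {
            System.out.println("Status: " + fetchStatus("http://example.com/", proxy));
        } catch (IOException e) {
            // Expected when no proxy is listening at the placeholder address.
            System.out.println("Could not reach proxy: " + e.getMessage());
        }
    }
}
```

Note that the requested URL is the original one, with no prefix or timestamp added, which is why no hyperlink rewriting is needed in this mode.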

Please see AccessPoint Overview for more details.