Scripts for triaging and submitting URLs for web archiving
Shell
LICENSE
README.md
archiveis-check.sh
archiveis-save.sh
mementoweb-check.sh
wayback-memento-check.sh
wayback-save.sh

web-archive-triage

web-archive-triage is a set of simple shell scripts for triaging a list of URLs for web archiving: checking which URLs already have a snapshot in a web archive, and submitting URLs for saving.

Requirements

Usage

  • wayback-memento-check.sh - takes a list of URLs on standard input and checks for the presence of at least one snapshot in the Internet Archive Wayback Machine via the Memento API.

     ./wayback-memento-check.sh urls-present.txt urls-missing.txt < urls.txt
    
  • wayback-save.sh - takes a list of URLs on standard input and submits them for saving by the Internet Archive Wayback Machine.

     ./wayback-save.sh save-success.txt save-failure.txt < urls-missing.txt
    
  • mementoweb-check.sh - takes a list of URLs on standard input and checks for the presence of at least one snapshot in any web archive known by the mementoweb.org Time Travel service via the Memento API.

     ./mementoweb-check.sh urls-present.txt urls-missing.txt < urls.txt
    
  • archiveis-check.sh - takes a list of URLs on standard input and checks for the presence of at least one snapshot in archive.is via the Memento API.

     ./archiveis-check.sh urls-present.txt urls-missing.txt < urls.txt
    
  • archiveis-save.sh - takes a list of URLs on standard input and submits them for saving by archive.is.

     ./archiveis-save.sh save-success.txt save-failure.txt < urls-missing.txt
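
The check scripts above all follow the same pattern: read URLs from standard input and sort them into a "present" list and a "missing" list. The following is a minimal sketch of that pattern, not the code of the scripts in this repository. It assumes that the two arguments name output files (as the example filenames suggest), and the function names `has_memento` and `triage` are hypothetical. The check shown queries the Wayback Machine's Memento TimeMap endpoint with `curl` and treats any link with a `memento` relation as evidence of a snapshot; the real scripts may query a different endpoint or parse the response differently.

```shell
#!/bin/sh
# Illustrative sketch only; NOT the repository's actual implementation.

# has_memento URL  (hypothetical helper)
# Succeeds if the Wayback Machine TimeMap for URL lists at least one memento.
# Assumption: the link-format TimeMap marks snapshots with rel="... memento".
has_memento() {
    curl -s --max-time 30 "http://web.archive.org/web/timemap/link/$1" \
        | grep -q 'rel="[^"]*memento"'
}

# triage PRESENT_FILE MISSING_FILE [CHECK_COMMAND]  (hypothetical helper)
# Reads one URL per line from standard input. URLs for which CHECK_COMMAND
# succeeds are written to PRESENT_FILE; all others go to MISSING_FILE.
# CHECK_COMMAND defaults to has_memento.
triage() {
    present_file=$1
    missing_file=$2
    check=${3:-has_memento}

    # Start with empty output files (an assumption; appending is also plausible).
    : > "$present_file"
    : > "$missing_file"

    while IFS= read -r url || [ -n "$url" ]; do
        # Skip blank lines.
        [ -z "$url" ] && continue
        # Redirect stdin so the check cannot consume the remaining URLs.
        if "$check" "$url" < /dev/null; then
            printf '%s\n' "$url" >> "$present_file"
        else
            printf '%s\n' "$url" >> "$missing_file"
        fi
    done
}
```

Under these assumptions, `triage urls-present.txt urls-missing.txt < urls.txt` mirrors the invocation of the check scripts. Passing the check as a command name keeps the loop independent of any one archive, so the same loop could be pointed at a different service by swapping in another check function.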
    