YUL Web archiving scripts

YUDL Web archiving

Description

This is a collection of shell scripts that capture and preserve York University and Government of Canada websites. The scripts crawl each site with Heritrix and store the result in the Web ARChive (WARC) format, render PDF and image snapshots with wkhtmltopdf/wkhtmltoimage, and produce a descriptive metadata (MODS) record.

Requirements

Installation

Set up the requirements above, clone the repository, and symlink the shell scripts into a directory that cron can execute from:

git clone https://github.com/yorkulibraries/yul-web-archiving.git
ln -s /path/to/web/archiving/script /path/that/cron/can/execute
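If several scripts need linking at once, the step above could be wrapped in a small helper. This is only a sketch: the function name `link_scripts`, the `yulWA*` glob (inferred from the script name used in the cron example), and the example paths are assumptions, not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: symlink every yulWA* script from a clone into a bin directory.
# Pass an absolute source path so the symlinks resolve from anywhere.
link_scripts() {
  local src="$1" dest="$2" script
  mkdir -p "$dest"
  for script in "$src"/yulWA*; do
    [ -f "$script" ] || continue   # skip when the glob matches nothing
    chmod +x "$script"             # cron can only run executable files
    ln -sf "$script" "$dest/$(basename "$script")"
  done
}

# Example call (both paths are assumptions):
# link_scripts "$HOME/yul-web-archiving" /usr/local/bin
```

Using `ln -sf` keeps the helper safe to re-run after a `git pull`, since existing links are simply replaced.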

Usage

Add the script to cron. Please schedule it at an appropriate, off-peak time so the crawl does not overload anybody's server.

Example:

0 3 * * * bash -c '/usr/local/bin/yulWA-yfile'
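A crawl can run longer than expected, so a scheduled job may start while the previous one is still active. One way to guard against that is a small lock wrapper. The sketch below is an assumption, not something shipped with these scripts: the `run_once` name and the lock path are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run a command only if no earlier run still holds the lock.
run_once() {
  local lock_dir="$1"
  shift
  # mkdir is atomic: it fails if the lock directory already exists.
  if ! mkdir "$lock_dir" 2>/dev/null; then
    echo "previous run still active, skipping" >&2
    return 1
  fi
  "$@"
  local status=$?
  rmdir "$lock_dir"   # release the lock once the command has finished
  return "$status"
}

# Example call (the lock path is an assumption):
# run_once /tmp/yulWA-yfile.lock /usr/local/bin/yulWA-yfile
```

If a run is killed before it releases the lock, the directory has to be removed by hand before the next run will start.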

License

Public Domain

CC0

Thanks

Peter Binkley
