An Awesome List for getting started with web archiving
Centralised repository for WARC usage specifications.
URL canonicalization library for Python and Java
OpenWayback development
Inventory of Web Archiving Training Resources
Common web archive utility code.
IIPC Open Development
Shared config for Travis CI for IIPC.
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
Command line utility for working with CDX files
IIPC Parent POM
Using social media to steer web archiving and curation.
Web access control (exclusion oracle) tools for optional use with the Wayback Machine
Sample Wayback Config using OpenWayback