  • URL canonicalization library for Python and Java

    Java 11 6 Updated Aug 16, 2018
  • Centralised repository for WARC usage specifications.

    HTML 29 10 Updated Jul 17, 2018
  • An Awesome List for getting started with web archiving

    230 31 Updated Jul 6, 2018
  • The OpenWayback Development

    Java 229 141 Apache-2.0 Updated Apr 30, 2018
  • Inventory of Web Archiving Training Resources

    1 Updated Nov 29, 2017
  • Common web archive utility code.

    Java 35 62 Apache-2.0 Updated Aug 9, 2017
  • IIPC Open Development

    4 4 Apache-2.0 Updated Jun 16, 2017
  • Shared config for Travis CI for IIPC.

    Shell 1 3 Apache-2.0 Updated May 3, 2017
  • Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.

    Java 10 570 Updated Mar 9, 2017
  • Command-line utility for working with CDX files

    Java 4 Apache-2.0 Updated Sep 29, 2016
  • IIPC Parent POM

    2 Apache-2.0 Updated May 24, 2016
  • Using social media to steer web archiving and curation.

    JavaScript 12 3 Updated Nov 20, 2015
  • Web access control (exclusion oracle) tools for optional use with the Wayback Machine

    JavaScript 4 8 Apache-2.0 Updated Mar 7, 2014
  • Sample Wayback Config using OpenWayback

    3 7 Updated Feb 7, 2014