web-archiving
Here are 17 public repositories matching this topic...
Chrome debugging protocol client for Java
Updated Feb 21, 2020 - Java
A Splittable Hadoop InputFormat for Concatenated GZIP Files and *.(w)arc.gz
Updated Feb 7, 2018 - Java
HTTP/S proxy server that replays content from a web archive
Updated Apr 13, 2023 - Java
This module builds our Waybacks in the various configurations we require.
Updated Jun 30, 2018 - Java
Stanford's fork of iipc/openwayback, used on our "swap" (Stanford Web Archiving Portal) machines (in use on the swap VM as of 6/2020). See also sul-dlss/swap, which is intended as a replacement.
Updated Jan 6, 2021 - Java
Partition (W)ARC Files by MIME Type and Year
Updated Feb 13, 2017 - Java
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
Updated May 15, 2024 - Java
Updated Jun 14, 2023 - Java