Popular repositories
- solrwayback Public
A search interface and wayback machine for the UKWA Solr based warc-indexer framework.
Repositories
- solrwayback Public
A search interface and wayback machine for the UKWA Solr based warc-indexer framework.
- jwarc-cdx-indexer-workflow Public
Processes all WARC files listed in a text file with JWARC and sends the output to a CDX server (Outback CDX, etc.). If the process is stopped and restarted, it continues from where it left off.
- jwarc Public Forked from iipc/jwarc
Java library for reading and writing WARC files with a typed API
- heritrix3 Public Forked from Landsbokasafn/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- webarchive-discovery Public Forked from ukwa/webarchive-discovery
WARC and ARC indexing and discovery tools.
- cdx-summarize-warc-indexer Public Forked from ymaurer/cdx-summarize-warc-indexer
Summarize Web Archive holdings using an existing SOLR index