Ruby 2 2

Cinch2

A project to add site crawling, file normalization, natural language processing and increased scalability to the current Cinch project. Cinch is a project to develop a bulk download service to a central repository that will maintain original file timestamps, scan for viruses, extract file-level metadata, create file checksums and periodically validat…

Cinch

A project to develop a bulk download service to a central repository that will maintain original file timestamps, scan for viruses, extract file-level metadata, create file checksums and periodically validate those checksums to confirm continued file integrity. Users need only upload a list of URLs to download, and when the process completes they can download…

NCLives

A project to merge and wrap content from the Internet Archive and CONTENTdm, using OAI-PMH harvesting as well as the CONTENTdm API. The goal is to amalgamate disparate content and make it full-text searchable using Apache Solr (the project currently uses the Zend Search Lucene library).

Constraint-Analysis

A simple constraint analysis tool for Archive-It crawls.
