Skip to content

jjjake/collections-cleaners

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 

Repository files navigation

collections-cleaners

This is a collection of scripts in use by the collections team to clean up the collections at the Internet Archive. They require administration privileges and the use of the InternetArchive Python interface script.

These scripts are being published on github to allow commentary and learning materials for the internal maintenance of the Archive's stores of materials.

  • frontpage.sh is a script that goes through all the items in a collection (with a texts mediatype) and ensures the first page of every item's document is the thumbnail, and that the online reader starts on the cover.
  • semi.sh is run against an item and if the "Subject" field is a set of subjects separated by semicolons instead of a set of different Subject settings with each entry, it will split them up and add them in properly.

Releases

No releases published

Packages

 
 
 

Languages

  • Shell 100.0%