WSU-CDSC/microservices

microservices

This repository collects the scripts and microservices in use at WSU Libraries. Usage documentation for each script is linked from its entry in the list.

List of Scripts and Microservices

  • Metadata Creation & Monitoring

  • Cloud Migration Scripts

    • makeaip.rb: A script for generating archival packages from source directories.
    • uploadaip.rb: A script that uploads AIPs generated with makeaip.rb to Backblaze B2. It generates a new JSON file that incorporates the makeaip.rb log and a PREMIS event for the Backblaze upload.
  • Caption Workflow Scripts

    • caption-crunch.sh: A quick-and-dirty loop script that takes an input file containing a list of video links and uses GNU Parallel to run them through the caption process.
    • vid2watson.rb: Converts the audio track of the input file and runs it through the IBM Watson Speech to Text service (the script must be edited to include valid Watson login information). Creates a folder containing the raw JSON output as well as roughly parsed content.
    • watson2vtt.rb: Takes the JSON output of vid2watson.rb and attempts to parse it into a .vtt subtitle file, using the time stamps associated with each identified word.
  • Misc

    • extsurvey.rb: A tool for rapidly surveying directories for file types by extension. It can create a CSV of extension types and counts, a file listing the complete paths of all files with a given extension, and a file listing the complete paths of all files whose extension's total count falls under a certain threshold.
    • make-ead.rb: A tool for generating finding aids by applying WSU's adaptation of the Archivists' Toolkit EAD to HTML stylesheet.
    • ocr_test.rb: A tool for scanning PDFs for OCR text data. Creates a CSV file with the results.
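
The caption workflow described for the Watson-to-VTT script turns word-level time stamps into subtitle cues. The following is a minimal sketch of that idea, not the repository's actual code. It assumes the Watson JSON stores each word as a `[word, start, end]` triple under `results` → `alternatives` → `timestamps`. The helper names (`vtt_time`, `words_to_vtt`, `watson_timestamps`) and the `words_per_cue` grouping are hypothetical.

```ruby
require "json"

# Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm).
def vtt_time(seconds)
  total_ms = (seconds * 1000).round
  hours, rest = total_ms.divmod(3_600_000)
  minutes, rest = rest.divmod(60_000)
  secs, ms = rest.divmod(1000)
  format("%02d:%02d:%02d.%03d", hours, minutes, secs, ms)
end

# Build WebVTT text from [word, start, end] triples.
# Assumption: a fixed number of words per cue; a real script might
# instead split on pauses or sentence boundaries.
def words_to_vtt(timestamps, words_per_cue: 8)
  cues = timestamps.each_slice(words_per_cue).map do |group|
    text = group.map { |word, _start, _finish| word }.join(" ")
    "#{vtt_time(group.first[1])} --> #{vtt_time(group.last[2])}\n#{text}"
  end
  (["WEBVTT"] + cues).join("\n\n") + "\n"
end

# Collect word triples from the assumed Watson JSON layout, using the
# first (best) alternative of each result.
def watson_timestamps(json_text)
  JSON.parse(json_text).fetch("results", []).flat_map do |result|
    result.fetch("alternatives", []).first.to_h.fetch("timestamps", [])
  end
end
```

Under these assumptions, `File.write("out.vtt", words_to_vtt(watson_timestamps(File.read("raw.json"))))` would produce a subtitle file, where both file names are placeholders.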
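
The extension survey described for extsurvey.rb can be illustrated the same way. This sketch is an assumption about the general approach rather than the script itself. The helper names `survey_extensions` and `rare_paths` are hypothetical, extensions are compared case-insensitively, and "under a threshold" is read as strictly fewer files than the threshold.

```ruby
require "find"

# Walk a directory tree and group file paths by lowercased extension.
# Files without an extension are grouped under "(none)".
def survey_extensions(root)
  paths_by_ext = Hash.new { |hash, ext| hash[ext] = [] }
  Find.find(root) do |path|
    next unless File.file?(path)
    ext = File.extname(path).downcase
    ext = "(none)" if ext.empty?
    paths_by_ext[ext] << path
  end
  paths_by_ext
end

# Return the sorted paths of all files whose extension occurs
# fewer than `threshold` times in the survey.
def rare_paths(paths_by_ext, threshold)
  paths_by_ext.select { |_ext, paths| paths.size < threshold }
              .values.flatten.sort
end
```

A CSV of extension types and counts could then be written from the survey with something like `survey.each { |ext, paths| puts "#{ext},#{paths.size}" }`.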