Project to digitize the core components of the M. Watt Espy Papers and make them available for a wide variety of uses

Espy Project

The Digital Archive of Executions in the United States, 1608-2002

Stage One: Planning, Communication, Digitization (April-September)

  • Outline pace for the project in each major area
  • Define contributions from vendor and library departments
  • Become familiar with the major tools
  • Begin digitization of Series 1
Espy Scanning
  • Engage Vendor for Series 1
    • Vendor Consultation
    • Ship card files to vendor
    • Sample scan review
  • Evaluate scanning of Series 2
    • Vendor consultations
    • Workflow testing
    • Item-level differentiation/removing staples
    • Scanning Timeline
  • Review
    • Make minor filename fixes
  • Preprocessing
    • OCR testing
    • ImageMagick scripts for working files
    • Tesseract script for OCR text extraction
    • Python scripts to create CSVs for database seed tables
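
As a rough illustration of the last preprocessing step above, a Python script that builds a seed-table CSV from OCR output might look like the sketch below. The column names (`filename`, `ocr_text`), the one-text-file-per-card layout, and the function names are assumptions made for illustration; they are not the project's actual schema or scripts.

```python
import csv
from pathlib import Path

# Hypothetical column names; the real seed-table schema is not given here.
FIELDNAMES = ["filename", "ocr_text"]


def build_seed_rows(text_dir):
    """Return one row per OCR text file in text_dir, sorted by filename."""
    rows = []
    for path in sorted(Path(text_dir).glob("*.txt")):
        text = path.read_text(encoding="utf-8")
        # Collapse runs of whitespace so each card's text fits in one CSV cell.
        rows.append({"filename": path.stem, "ocr_text": " ".join(text.split())})
    return rows


def write_seed_csv(rows, out_path):
    """Write rows to out_path as a CSV file with a header line."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(rows)
```

Calling `write_seed_csv(build_seed_rows("ocr_output"), "seed.csv")` would then produce a single file for loading into a seed table, assuming a directory named `ocr_output` (also hypothetical) holds the extracted text.
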
Espy Metadata Creation
  • Pre-metadata mission and goal setting
  • Consult with ICPSR about data sharing
    • Approval from ICPSR
  • Begin Data Modeling
    • Vocabulary selections
    • Consult Technical Services
  • Pre-metadata ingest
  • Wireframing for metadata creation system
  • Throw out all the data models
  • Preprocessing of data from The Espy File (provided by ICPSR)
  • Workflow testing
    • Database backup procedures
  • Time trials
Repository Development
  • Rails experimentation, ramp up plan
  • Set up Hydra dev server
  • Frontend wireframing
  • Establish development timeline
  • Beta Metadata Creation Tool Completed!
  • Prototyping date
  • User testing planning
    • Consultation with Public Services
System Support and Implementation
  • Technology demo
  • Library Systems planning/timeline development
  • Initial development server delivered
  • Establish Solr production goal
  • Establish Fedora production goal
Preservation Planning
  • Establish preservation planning team
  • Examining old draft policy
  • Consult with Preservation department
  • Define deliverables and timeline

Stage Two: Metadata Creation and Repository Development

Espy Scanning
  • Review Series 2 files as they return from vendor
  • Continue preprocessing workflow
  • Seed database for metadata creation tool
Repository Development
  • Hyrax testing
  • Workflow modeling
Espy Metadata Creation
  • Evaluate progress
  • Decide on joined or separate OCR cleanup workflow
System Support and Implementation
  • Provision Solr Server
  • Consult with ITS on storage levels
  • Establish requirements and timeline for Fedora Server
Preservation Planning
  • Consult with Systems and ITS
  • Set long-term storage benchmarks