Rosa Archive Development

jabrah edited this page Jan 22, 2015 · 11 revisions

The rosa archive framework supports CRUD access to collections of digital book facsimiles on a file system through an API and a command line tool. The digital book content includes high resolution images and metadata.

The framework will be refactored from the existing modules in the rosa repository.

Requirements

  • Support existing file system content.
  • Ease migration of content from a file system to a repository.
  • Unit test all public methods.
  • Make model, API, and command line tool separate Maven modules.
  • Provide Javadoc for all classes and public methods.

Code organization

The code is organized into a parent Maven module containing submodules.

Group: rosa
Artifact: rosa-archive
Version: 2.0.0-SNAPSHOT
Modules: rosa-archive-model, rosa-archive-core, rosa-archive-tool

Technology

  • Java 7
  • Use Apache Commons IO instead of homegrown utilities.

Archive code modules

Model

Group: rosa
Artifact: rosa-archive-model
Version: 2.0.0-SNAPSHOT

Requirements:

  • Must be GWT compatible: model classes have to implement IsSerializable, etc.
  • Encompass all preserved content
  • Accommodate manuscripts, printed books, etc., especially those from Archaeology of Reading
  • Implement equals/hashCode appropriately
  • Implement toString to help debugging
  • Unit test public methods
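
The model requirements above can be illustrated with a minimal sketch of one model class. The class name `BookImage` and its fields are hypothetical and not taken from the existing rosa code. The sketch uses `java.io.Serializable`, which GWT RPC also accepts, so that it compiles without a GWT dependency; GWT's own `IsSerializable` marker interface would be an alternative. It assumes the usual GWT RPC constraints: a no-argument constructor and non-final fields.

```java
import java.io.Serializable;

// Hypothetical model class; name and fields are illustrative only.
class BookImage implements Serializable {
    private static final long serialVersionUID = 1L;

    // Fields are non-final because GWT RPC does not serialize final fields.
    private String id;
    private int width;
    private int height;

    // GWT RPC serialization expects a no-argument constructor.
    public BookImage() {
    }

    public BookImage(String id, int width, int height) {
        this.id = id;
        this.width = width;
        this.height = height;
    }

    public String getId() {
        return id;
    }

    public int getWidth() {
        return width;
    }

    public int getHeight() {
        return height;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (obj == null || getClass() != obj.getClass()) {
            return false;
        }
        BookImage other = (BookImage) obj;
        if (width != other.width || height != other.height) {
            return false;
        }
        return id == null ? other.id == null : id.equals(other.id);
    }

    @Override
    public int hashCode() {
        // Combines the same fields that equals() compares.
        int result = id == null ? 0 : id.hashCode();
        result = 31 * result + width;
        result = 31 * result + height;
        return result;
    }

    @Override
    public String toString() {
        return "BookImage [id=" + id + ", width=" + width + ", height=" + height + "]";
    }
}
```

Keeping `equals` and `hashCode` on the same set of fields means model objects behave predictably in hash-based collections, and the `toString` output is meant only as a debugging aid.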

Core API

API plus implementation for interacting with content.

Group: rosa
Artifact: rosa-archive-core
Version: 2.0.0-SNAPSHOT

Requirements:

  • Build model from file system or stream
  • Serialize to stream
  • List all content
  • Add content
  • Check the "consistency" of content
  • Verify bit level integrity of data
  • Document all classes and public methods
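
As one way to approach the bit-level integrity requirement above, the following sketch computes a checksum over a stream and compares it with a stored value. The class and method names are hypothetical, and the choice of algorithm is an assumption: this page does not say which checksum the archive stores, so the algorithm name is passed in as a parameter.

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper; not part of the existing rosa API.
class ChecksumVerifier {

    // Reads the whole stream and returns its digest as lowercase hex.
    static String checksum(InputStream in, String algorithm)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance(algorithm);
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            digest.update(buffer, 0, read);
        }
        return toHex(digest.digest());
    }

    // Returns true when the stream's digest matches the stored hex value.
    static boolean verify(InputStream in, String algorithm, String expectedHex)
            throws IOException, NoSuchAlgorithmException {
        return checksum(in, algorithm).equalsIgnoreCase(expectedHex);
    }

    private static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            // Mask to avoid sign extension of negative bytes.
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }
}
```

Working on an `InputStream` rather than a file path keeps the check usable for both file system content and streamed content, in line with the "build model from file system or stream" requirement.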

Tool

Command line tool for handling data in the archive.

Group: rosa
Artifact: rosa-archive-tool
Version: 2.0.0-SNAPSHOT

Requirements:

  • List content
  • Check content
  • Verify content (bit check)
  • Add content
  • Update checksums
  • Guess image order and write it to a file
  • Document the workflow, e.g. how do you add/create content?
  • Run against all existing content
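
The operations listed above could be exposed as subcommands of a single entry point. The sketch below shows only the argument dispatch. The command names, usage string, and class name are assumptions for illustration; the page does not define the tool's actual syntax, and a real implementation would call the core API where the sketch prints a message.

```java
import java.util.Locale;

// Hypothetical command line dispatcher; command names are illustrative.
class ArchiveToolSketch {

    enum Command {
        LIST, CHECK, VERIFY, ADD, UPDATE_CHECKSUMS, GUESS_IMAGE_ORDER
    }

    // Maps a name such as "update-checksums" to UPDATE_CHECKSUMS.
    static Command parseCommand(String name) {
        String key = name.trim().toUpperCase(Locale.ROOT).replace('-', '_');
        try {
            return Command.valueOf(key);
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException("Unknown command: " + name);
        }
    }

    // Returns a process exit code: 0 on success, 1 on a usage error.
    static int run(String[] args) {
        if (args.length < 2) {
            System.err.println("Usage: <command> <archive-path>");
            return 1;
        }
        Command command;
        try {
            command = parseCommand(args[0]);
        } catch (IllegalArgumentException e) {
            System.err.println(e.getMessage());
            return 1;
        }
        // A real tool would invoke the core API here.
        System.out.println(command + " requested for " + args[1]);
        return 0;
    }

    public static void main(String[] args) {
        System.exit(run(args));
    }
}
```

Separating `run` from `main` keeps the dispatch logic unit-testable without terminating the JVM, which fits the requirement to unit test all public methods.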