This repository has been archived by the owner on Jan 5, 2021. It is now read-only.
Nathan Tallman edited this page Mar 14, 2019 · 5 revisions

Import

Files for import into CHO should conform to the following specifications. Each batch of files should be bagged, zipped (uncompressed), and staged for import. These specifications apply to all work types; work types do not have file format requirements.
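As an illustration of the "bagged, zipped (uncompressed)" step, here is a minimal sketch using only the Python standard library. It assumes that "bagged" means a BagIt-style layout (a `data/` payload directory plus `bagit.txt` and a checksum manifest); the exact bag profile CHO expects is not stated here. The function names `write_bag_tag_files` and `zip_uncompressed` and the choice of SHA-256 are illustrative assumptions, not part of CHO.

```python
import hashlib
import zipfile
from pathlib import Path


def write_bag_tag_files(bag_dir: Path) -> Path:
    """Write a BagIt-style manifest and bagit.txt for the payload under data/.

    Assumption: SHA-256 manifests are acceptable; CHO's required algorithm
    is not specified in this page.
    """
    lines = []
    for path in sorted((bag_dir / "data").rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            # Manifest paths are relative to the bag root and use forward slashes.
            lines.append(f"{digest}  {path.relative_to(bag_dir).as_posix()}")
    manifest = bag_dir / "manifest-sha256.txt"
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    return manifest


def zip_uncompressed(bag_dir: Path, zip_path: Path) -> Path:
    """Zip the bag without compression (ZIP_STORED).

    zip_path should live outside bag_dir so the archive does not include itself.
    """
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_STORED) as archive:
        for path in sorted(bag_dir.rglob("*")):
            if path.is_file():
                # Keep the bag directory name as the top-level folder in the zip.
                arcname = Path(bag_dir.name, path.relative_to(bag_dir)).as_posix()
                archive.write(path, arcname=arcname)
    return zip_path
```

Calling `write_bag_tag_files(bag)` followed by `zip_uncompressed(bag, bag.with_suffix(".zip"))` would produce a stored (uncompressed) archive next to the bag directory, ready to copy to the staging area.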

Users will upload a CSV in the GUI to initiate the import. The CSV will include the batchID, which will be used to construct the path to the files for a given work (e.g. //CHO/choStaging/batchID/data/workID). Although the typical user will upload all works to the same collection, this process can also be used to upload works to multiple collections, because the CSV specifies the collection to which each work in the bag belongs.
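The path construction described above can be sketched as follows. The path pattern comes from the example in the text; the CSV column names (`batch_id`, `work_id`, `collection_id`) and the function names are hypothetical, since the actual CSV headers are not given here.

```python
import csv
import io

# Root taken from the example path in the text.
STAGING_ROOT = "//CHO/choStaging"


def work_path(batch_id: str, work_id: str, root: str = STAGING_ROOT) -> str:
    """Build <root>/<batchID>/data/<workID> for one work."""
    return f"{root.rstrip('/')}/{batch_id}/data/{work_id}"


def paths_by_collection(csv_text: str) -> dict:
    """Group work paths by the collection named in each CSV row.

    Column names are assumptions made for this sketch.
    """
    grouped = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        path = work_path(row["batch_id"], row["work_id"])
        grouped.setdefault(row["collection_id"], []).append(path)
    return grouped
```

Because each row carries its own collection, a single bag can feed several collections: rows with different `collection_id` values simply end up under different keys.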

Update

During MVP, when updating existing metadata, the identifier and work type fields cannot be changed.
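The restriction on updates could be checked before any metadata is written. The sketch below is one possible way to express it, not CHO's implementation; the field keys `identifier` and `work_type` and the function name `rejected_changes` are assumptions.

```python
# Fields that cannot change during an update (per the MVP rule above).
# The key names are hypothetical.
IMMUTABLE_FIELDS = ("identifier", "work_type")


def rejected_changes(existing: dict, incoming: dict) -> list:
    """Return the immutable fields whose values differ in the incoming row."""
    return [
        field
        for field in IMMUTABLE_FIELDS
        if field in incoming and incoming[field] != existing.get(field)
    ]
```

An update row would be accepted only when `rejected_changes(existing, incoming)` returns an empty list; otherwise the returned field names can be reported back to the user.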

Notes

  • The staging directory is on Isilon. Users will connect to it via SMB (as a mapped network drive or through Connect to Server) to upload files, and CHO will mount the same directory via NFS to read the files for import. After a user has completed the import process in the CHO GUI, the files will be deleted from the staging directory.
  • Any CSV files in the ingest can be auto-loaded via a rake task run from the command line.
  • Representative file sets are not explicitly included in the CSV metadata and are not assigned an identifier.