Finish remainder of the crawl workflow (SIPs and submission) #11

Open
5 tasks
anjackson opened this issue Nov 17, 2016 · 0 comments

anjackson commented Nov 17, 2016

  • Mint ARKs for each payload file, incrementally.
  • Store full WARC-to-ARK identifier mapping somewhere and update it over time.
  • Build current METS SIP for final package (avoiding hard-coded fields, see below).
  • Submit final SIP to DLS.
  • Verify SIP appears in DLS.
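
A minimal sketch of the first two tasks (incremental minting plus a persistent WARC-to-ARK mapping), assuming the mapping lives in a single JSON file. The function name `update_ark_mapping`, the JSON layout, and the `mint_ark` callable are all hypothetical; the real minting service is not described in this issue.

```python
import json
import os
import tempfile


def update_ark_mapping(mapping_path, warc_paths, mint_ark):
    """Mint ARKs only for WARCs not yet mapped, then persist the mapping.

    mint_ark is a caller-supplied function (hypothetical here) that takes
    a WARC path and returns a newly minted ARK string.
    """
    mapping = {}
    if os.path.exists(mapping_path):
        with open(mapping_path) as f:
            mapping = json.load(f)

    for warc in warc_paths:
        if warc not in mapping:  # incremental: never re-mint an existing ARK
            mapping[warc] = mint_ark(warc)

    # Write to a temporary file in the same directory, then rename, so a
    # crash mid-write cannot leave a truncated mapping behind.
    target_dir = os.path.dirname(os.path.abspath(mapping_path))
    fd, tmp_path = tempfile.mkstemp(dir=target_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(mapping, f, indent=2, sort_keys=True)
    os.replace(tmp_path, mapping_path)
    return mapping
```

Calling this after each crawl checkpoint with the full list of WARCs seen so far would only mint identifiers for the new files, which keeps the stored mapping stable over time.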

There are a number of hard-coded fields in the SIP generation process. The ones in creator.py refer only to the temporary BagIt bag that is used to send the SIP to DLS, so they are probably acceptable. However, the others (in mets.py) become part of the METS in the archival package.

Note that we could get the ClamD version string by sending the string VERSION to its API.
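
As a rough sketch of how that lookup might work, the following sends the VERSION command to a clamd daemon over TCP and splits the reply. It assumes clamd is listening on a TCP socket (port 3310 is clamd's conventional TCP port; the host and port here are assumptions about the deployment), and the function names are hypothetical rather than anything in mets.py.

```python
import socket


def query_clamd_version(host="localhost", port=3310, timeout=5.0):
    """Send VERSION to clamd and return the raw reply string."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # The 'z' prefix asks clamd for a null-terminated command and reply.
        sock.sendall(b"zVERSION\0")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
            if data.endswith(b"\0"):
                break
    reply = b"".join(chunks).rstrip(b"\0")
    return reply.decode("utf-8", errors="replace").strip()


def parse_clamd_version(reply):
    """Split a reply such as 'ClamAV 0.99.2/22567/Thu Nov 17 ...' into parts.

    The signature fields are None when clamd reports only the engine version.
    """
    parts = reply.split("/", 2)
    return {
        "engine": parts[0],
        "signature_version": parts[1] if len(parts) > 1 else None,
        "signature_date": parts[2] if len(parts) > 2 else None,
    }
```

The parsed engine string could then replace the hard-coded ClamD version in the METS, for example via `parse_clamd_version(query_clamd_version())["engine"]`.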

Other fields should probably get picked up from the general configuration file, but note that this does not seem to be picked up correctly outside of the Celery run-time (TBC).
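
One way to sidestep the Celery dependency would be to load the fields explicitly from a file path, so the same call works inside and outside a worker. This is only a sketch: it assumes an INI-style configuration file with a `[sip]` section, and the function name, section name, and file format are all hypothetical, since the actual configuration mechanism is not shown in this issue.

```python
import configparser


def load_sip_fields(config_path, section="sip"):
    """Read the METS/SIP field values from an INI file, failing loudly."""
    parser = configparser.ConfigParser()
    # read() silently skips missing files and returns those it parsed,
    # so check the result rather than falling back to hard-coded values.
    if not parser.read(config_path):
        raise IOError("Could not read configuration file: %s" % config_path)
    if not parser.has_section(section):
        raise KeyError("Missing [%s] section in %s" % (section, config_path))
    # Note: configparser lower-cases option names by default.
    return dict(parser[section])
```

Failing on a missing file or section makes it obvious when the configuration has not been picked up, rather than letting default values end up in the archival METS.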

@anjackson anjackson added this to the 1.0.0 Release milestone Nov 17, 2016
@anjackson anjackson self-assigned this Nov 30, 2016
@anjackson anjackson modified the milestones: 1.1.0 Release, 1.0.0 Release Jan 17, 2017