AIP METS file has 2 different versions of premis #370

Closed
evelynPM opened this issue Dec 7, 2018 · 7 comments
evelynPM commented Dec 7, 2018

Please describe the problem you'd like to be solved.
In the AIP METS file, PREMIS 3.0 is used for the intellectual entity, but PREMIS 2.2 is used everywhere else.
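
One way to confirm the mix is to collect the PREMIS namespaces that actually occur in a METS file. This is a minimal sketch, not Archivematica code: it assumes the standard namespace URIs for PREMIS 2.x and 3.x, and the function name `premis_namespaces` and the trimmed sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by the PREMIS 2.x and PREMIS 3.x XML schemas.
PREMIS_V2_NS = "info:lc/xmlns/premis-v2"
PREMIS_V3_NS = "http://www.loc.gov/premis/v3"


def premis_namespaces(mets_xml: str) -> set:
    """Return the PREMIS namespace URIs used by elements in the document."""
    root = ET.fromstring(mets_xml)
    found = set()
    for elem in root.iter():
        # ElementTree spells qualified tags as "{namespace}localname".
        if isinstance(elem.tag, str) and elem.tag.startswith("{"):
            ns = elem.tag[1:].split("}", 1)[0]
            if ns in (PREMIS_V2_NS, PREMIS_V3_NS):
                found.add(ns)
    return found


# Hypothetical, heavily trimmed METS fragment for illustration only.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:amdSec>
    <premis3:object xmlns:premis3="http://www.loc.gov/premis/v3"/>
    <premis2:event xmlns:premis2="info:lc/xmlns/premis-v2"/>
  </mets:amdSec>
</mets:mets>"""

namespaces = premis_namespaces(SAMPLE)
mixed = len(namespaces) > 1  # True when both PREMIS versions appear
```

A file that uses a single PREMIS version would yield a set with one entry.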

Describe the solution you'd like to see implemented.
There was only one backward-incompatible change between PREMIS 2.2 and PREMIS 3.0. In PREMIS 3.0, eventDetail is no longer a single semantic unit directly under the event: it is wrapped in a new eventDetailInformation container, which is repeatable and extensible, as follows:

eventDetailInformation (O, R)
--eventDetail (O, NR)
--eventDetailExtension (O, R)

Sample eventDetail from PREMIS 2.2:

<premis:eventDetail>program="python"; module="hashlib.sha256()"</premis:eventDetail>

Same sample in PREMIS 3.0:

<premis:eventDetailInformation>
<premis:eventDetail>program="python"; module="hashlib.sha256()"</premis:eventDetail>
</premis:eventDetailInformation>
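
The wrapping shown in the two samples can be sketched in Python with the standard library. This is only an illustration under stated assumptions: the helper name `wrap_event_details` is hypothetical, the event is assumed to be in the PREMIS 3 namespace already, and the `eventType` value in the sample is made up.

```python
import xml.etree.ElementTree as ET

PREMIS_V3_NS = "http://www.loc.gov/premis/v3"


def wrap_event_details(event: ET.Element, ns: str = PREMIS_V3_NS) -> None:
    """Wrap each direct eventDetail child in eventDetailInformation, in place."""
    detail_tag = f"{{{ns}}}eventDetail"
    info_tag = f"{{{ns}}}eventDetailInformation"
    # Iterate over a copy because the child list is modified inside the loop.
    for index, child in enumerate(list(event)):
        if child.tag == detail_tag:
            wrapper = ET.Element(info_tag)
            # Keep the whitespace that followed the original element.
            wrapper.tail = child.tail
            child.tail = None
            event.remove(child)
            wrapper.append(child)
            event.insert(index, wrapper)


# Hypothetical event built around the sample eventDetail from this issue.
ET.register_namespace("premis", PREMIS_V3_NS)
event = ET.fromstring(
    '<premis:event xmlns:premis="http://www.loc.gov/premis/v3">'
    "<premis:eventType>message digest calculation</premis:eventType>"
    '<premis:eventDetail>program="python"; module="hashlib.sha256()"</premis:eventDetail>'
    "</premis:event>"
)
wrap_event_details(event)
print(ET.tostring(event, encoding="unicode"))
```

Replacing each element one-for-one at the same index keeps the order of the event's other children unchanged.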

Describe alternatives you've considered.

Additional context


For Artefactual use:
Please make sure these steps are taken before moving this issue from Review to Verified in Waffle:

  • All PRs related to this issue are properly linked 👍
  • All PRs related to this issue have been merged 👍
  • Test plan for this issue has been implemented and passed 👍
  • Documentation regarding this issue has been written and it has been added to the release notes, if needed 👍
evelynPM added the Severity: high label Dec 11, 2018

evelynPM (Author) commented

I've marked this as severity:high because PREMIS 3.0 came out in November 2015 - i.e. more than three years ago. The quality of Archivematica's metadata is a key to widespread adoption, and this is a relatively minor change that will bring the system up-to-date with this important standard.

sromkey added the Status: refining label Dec 17, 2018
sevein self-assigned this Dec 19, 2018
sromkey added this to the 1.9.0 milestone Dec 19, 2018
sromkey added Status: ready and removed Status: refining labels Dec 19, 2018
ablwr added Status: in progress and removed Status: ready labels Dec 25, 2018

ablwr (Contributor) commented Dec 25, 2018

It looks like there's a little more than just adjusting the eventDetailInformation to get valid PREMIS. Other than updating the namespace and versions everywhere, a couple of other things:

  • premis:objectCharacteristicsExtension should not exist if it is going to be blank. It should only exist if it has content; right now we make a self-closing tag.
  • <relatedObjectIdentification> and <relatedEventIdentification> need updating: they are now <relatedObjectIdentifier> and <relatedEventIdentifier>.

I also get this:

  • Update needed here: "There must be PREMIS elements inside the METS container."
    But I get that with allegedly valid PREMIS in METS from the loc.gov website and there are PREMIS elements inside the METS container, so... 🤷‍♀️ ?

Looks like we'll have to update our tests and their five .xml fixtures too.
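
The two cleanups listed in this comment (the renamed elements and the empty extension) could be sketched roughly as follows. This is an assumption-laden illustration rather than the fix that was merged: `tidy_for_premis3` is a hypothetical helper, the input is assumed to be in the PREMIS 3 namespace already, and the sample document is invented and not schema-complete.

```python
import xml.etree.ElementTree as ET

PREMIS_V3_NS = "http://www.loc.gov/premis/v3"

# PREMIS 2.2 local name -> PREMIS 3.0 local name, as listed in the comment.
RENAMES = {
    "relatedObjectIdentification": "relatedObjectIdentifier",
    "relatedEventIdentification": "relatedEventIdentifier",
}


def tidy_for_premis3(root: ET.Element, ns: str = PREMIS_V3_NS) -> None:
    """Rename the related*Identification elements and drop empty extensions."""
    renames = {f"{{{ns}}}{old}": f"{{{ns}}}{new}" for old, new in RENAMES.items()}
    empty_ext = f"{{{ns}}}objectCharacteristicsExtension"
    # Materialise the traversal first because children are removed below.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag in renames:
                child.tag = renames[child.tag]
            elif (
                child.tag == empty_ext
                and len(child) == 0
                and not (child.text or "").strip()
            ):
                parent.remove(child)


# Hypothetical fragment; real PREMIS objects carry many more children.
doc = ET.fromstring(
    '<premis:object xmlns:premis="http://www.loc.gov/premis/v3">'
    "<premis:objectCharacteristics>"
    "<premis:objectCharacteristicsExtension/>"
    "</premis:objectCharacteristics>"
    "<premis:relationship>"
    "<premis:relatedObjectIdentification/>"
    "<premis:relatedEventIdentification/>"
    "</premis:relationship>"
    "</premis:object>"
)
tidy_for_premis3(doc)
```

An extension element that has child elements or non-whitespace text is left untouched.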

ablwr (Contributor) commented Dec 25, 2018

Another: premis:copyrightApplicableDates expects premis:startDate as its content, even if empty.
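
Taking that observation at face value, a guard like the one below would add the missing child. It is a sketch only: `ensure_start_date` is a hypothetical helper, the sample values are invented, and whether an empty startDate passes schema validation is not checked here.

```python
import xml.etree.ElementTree as ET

PREMIS_V3_NS = "http://www.loc.gov/premis/v3"


def ensure_start_date(root: ET.Element, ns: str = PREMIS_V3_NS) -> None:
    """Give every copyrightApplicableDates element a startDate first child."""
    dates_tag = f"{{{ns}}}copyrightApplicableDates"
    start_tag = f"{{{ns}}}startDate"
    # Materialise the matches first because the tree is modified below.
    for dates in list(root.iter(dates_tag)):
        if dates.find(start_tag) is None:
            # startDate goes first so that it precedes any endDate.
            dates.insert(0, ET.Element(start_tag))


# Hypothetical fragment with an end date but no start date.
rights = ET.fromstring(
    '<premis:copyrightInformation xmlns:premis="http://www.loc.gov/premis/v3">'
    "<premis:copyrightApplicableDates>"
    "<premis:endDate>2020-01-01</premis:endDate>"
    "</premis:copyrightApplicableDates>"
    "</premis:copyrightInformation>"
)
ensure_start_date(rights)
```

Running the helper a second time changes nothing, because an existing startDate is detected and kept.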

ablwr (Contributor) commented Dec 25, 2018

Storage Service's relationship with mets-reader-writer will also have to be updated. It's writing our pointer files, which are currently coming out wrong, and I believe this also affects the ability to re-ingest.

sromkey added triage-release-1.9 and removed Severity: high labels Jan 8, 2019
sromkey removed this from the 1.9.0 milestone Jan 14, 2019

sevein (Contributor) commented Jan 18, 2019

The PREMIS3 update is now part of AM dev/issue-24-handle-old-aips (#24), where we're making some additional changes to reingest old METS documents; PREMIS3 is only one of the changes needed. It's still a work in progress. We've been successfully generating METS with PREMIS3 and reingesting it too, but old METS documents are causing some issues.

sevein added Status: review and removed Status: in progress labels May 10, 2019

evelynPM (Author) commented Jul 8, 2019

Regarding the comment "premis:objectCharacteristicsExtension should not exist if it is going to be blank. It should only exist if it has content; right now we make a self-closing tag." @ablwr would you mind filing that as a separate issue and describing the circumstances under which an empty premis:objectCharacteristicsExtension semantic unit would be created?

evelynPM removed the Status: review label Jul 8, 2019
evelynPM closed this as completed Jul 8, 2019

ablwr (Contributor) commented Jul 9, 2019

@evelynPM I tried to recreate this issue, but I think it was resolved in the update. I can see an empty self-closing <premis:objectCharacteristicsExtension/> in 1.9x (viewing a processed SampleTransfers/Images), but in 1.10x it is either present and full or not present at all, as expected.
