Recommendation for storing and versioning AIPs without the use of BagIt #83
Comments
Hello, I do not quite understand the use cases that are meant to be supported. In my view, AIPs live independently of SIPs but might be affected by them, just as they might be affected by other operations such as metadata enrichment, file format conversion, or even redaction or destruction by retention processes. Note also that AIPs might be created by SIPs or by operations, such as when creating an AIC (an AIP relative to a collection or a case, with only metadata). Also, I am not very comfortable with this level of complexity; versioning should be an accessory level that you could take or leave without much change to the overall format of the AIP. As such, I would like to suggest the following:
Given this, I would like to suggest the following layout of an AIP with two versions, where the first version was created by a SIP and the second version is just an update of the descriptive metadata.
I don't have strong feelings, and I don't have skin in the game. I'm not quite able to make a like-for-like comparison because one or two points in @luis100's response aren't clear to me. I think you're suggesting no BagIt and no use of TAR for OCFL versions. I'm inclined to agree about BagIt; I'm not sure we gain much from its use, and much of the metadata is redundant. I agree that using TAR archives in versions is perhaps a bit messy and obscures which content or metadata changed.
Storage and processing impact is an implementation detail to some degree. Institutional policy, budget and choice will also be factors. Making them optional appears to be a sensible decision.
Note that the OCFL format does not belong to the AIP. This is just one possible way to store the original SIP (in this example as v00000) if you want to keep it, and the versioned AIPs are separate instances of AIPs. We moved away from integrating the versions into the AIP since E-ARK3. Packaging as TAR, ZIP, etc. is a technical implementation detail that depends on the storage system and requirements. It is adequate if the packages need to be transferred, and it may be a good approach if you have a tape system where the AIPs are stored for the long term. However, if the AIPs are still being updated, the continuous re-packaging causes a lot of processing and redundancy. The question here was about the use of BagIt to wrap E-ARK AIPs. In E-ARK, the manifest is included in the METS, but BagIt has a simpler, non-XML format (the payload manifest) for this purpose.
The AIP working group discussed the use of BagIt and recommends taking it out of the main recommendation for AIP packaging. Instead, it would be moved to an appendix explaining how to wrap E-ARK information packages using BagIt. As optional BagIt packaging is also relevant for the SIP, it should be added to the CSIP rather than to the AIP. This decision is independent of the use of OCFL, which will be dealt with in a separate issue. The suggestion is:
Board members' acknowledgment of the issue:
Voting: Tick the box in front of your name to say yes to the suggestion.
6 DILCIS Board members have acknowledged the issue. The BagIt recommendation for packaging will be moved to a CSIP appendix.
For the versioning of AIPs the plan is to recommend the use of OCFL.
Assuming the following structure for an original submission information package example.sip.001.tar stored as version v00000 and an AIP urn+uuid+81bd3aa2-7350-44f6-ad54-d8181858605a.tar stored as version v00001:
The inventory.json could look as follows:
Note that there is an overlap of fixity information which is provided in the METS already.
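As a rough illustration of the kind of inventory discussed here, the following Python sketch builds an OCFL-style inventory for the two container files. It is not the inventory from the original post. The object id, timestamps, messages and file contents are placeholders invented for the example, the helper build_inventory is hypothetical, and the sketch assumes that each version's state lists only the container stored in that version. The top-level keys (id, type, digestAlgorithm, head, manifest, versions) follow the OCFL 1.0 inventory layout.

```python
import hashlib
import json


def build_inventory(object_id, versions):
    """Build an OCFL-style inventory dict.

    versions: list of (version_name, created, message, {logical_path: bytes}).
    Hypothetical helper for illustration, not part of any OCFL library.
    """
    manifest = {}
    version_blocks = {}
    for name, created, message, files in versions:
        state = {}
        for logical_path, content in files.items():
            digest = hashlib.sha512(content).hexdigest()
            # Record a content path only the first time a digest is seen,
            # so identical content is not stored twice.
            if digest not in manifest:
                manifest[digest] = [f"{name}/content/{logical_path}"]
            state.setdefault(digest, []).append(logical_path)
        version_blocks[name] = {
            "created": created,
            "message": message,
            "state": state,
        }
    return {
        "id": object_id,
        "type": "https://ocfl.io/1.0/spec/#inventory",
        "digestAlgorithm": "sha512",
        "head": versions[-1][0],
        "manifest": manifest,
        "versions": version_blocks,
    }


AIP_TAR = "urn+uuid+81bd3aa2-7350-44f6-ad54-d8181858605a.tar"

# Version names follow the thread as written; they are not checked here
# against OCFL's version numbering rules. Contents, timestamps and the
# object id are placeholders, so the digests are not real package digests.
inventory = build_inventory(
    "urn:uuid:81bd3aa2-7350-44f6-ad54-d8181858605a",  # assumed object id
    [
        ("v00000", "2021-01-01T00:00:00Z", "Original SIP",
         {"example.sip.001.tar": b"placeholder SIP bytes"}),
        ("v00001", "2021-01-02T00:00:00Z", "AIP (version 1)",
         {AIP_TAR: b"placeholder AIP bytes"}),
    ],
)

print(json.dumps(inventory, indent=2))
```

The SHA-512 digests in the manifest are computed over the container files themselves, which is where the overlap with the fixity information inside each package's METS arises.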
The question for voting is whether the container files example.sip.001.tar for the original SIP and urn+uuid+81bd3aa2-7350-44f6-ad54-d8181858605a.tar for the AIP should be wrapped in a BagIt container, for example:
Note that this way fixity information would possibly be provided in up to four layers:
To reduce complexity and redundancy, the proposal is to store the E-ARK information packages as TAR files instead of wrapping them as BagIt containers as shown in the example above.
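For readers unfamiliar with what the wrapping step adds, here is a minimal Python sketch of placing a single container file into a bag, using only the standard library. It is an illustration under stated assumptions, not the example from the original post: the helper wrap_in_bag and the bag directory name are hypothetical, the TAR file is a placeholder with dummy bytes, and SHA-256 is chosen arbitrarily as the manifest algorithm. The layout (bagit.txt, a data/ payload directory, and a manifest-sha256.txt with one "checksum path" line per payload file) follows BagIt 1.0 (RFC 8493).

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large TAR files need not fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()


def wrap_in_bag(container: Path, bag_dir: Path) -> Path:
    """Copy one container file into a minimal bag; return the manifest path.

    Hypothetical helper for illustration only.
    """
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True)
    payload = data_dir / container.name
    shutil.copyfile(container, payload)

    # Bag declaration required by BagIt 1.0.
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )

    # Payload manifest: one line per payload file, path relative to the bag.
    manifest = bag_dir / "manifest-sha256.txt"
    manifest.write_text(
        f"{sha256_of(payload)}  data/{container.name}\n", encoding="utf-8"
    )
    return manifest


# Placeholder input: dummy bytes standing in for a real SIP container.
work = Path(tempfile.mkdtemp())
sip = work / "example.sip.001.tar"
sip.write_bytes(b"placeholder, not a real TAR archive")

manifest_path = wrap_in_bag(sip, work / "example.sip.001_bag")
manifest_text = manifest_path.read_text(encoding="utf-8")
print(manifest_text)
```

The payload manifest produced here is one more checksum over the same TAR file, in addition to whatever digest a storage layer such as OCFL records and the checksums already carried in the package's METS.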
The E-ARK AIP container file urn+uuid+81bd3aa2-7350-44f6-ad54-d8181858605a.tar would then have the following form, for example:
The suggestion is:
As part of the general AIP recommendations, the proposal is to store the E-ARK information packages as TAR files instead of wrapping them as BagIt containers.