Closed
Labels
OCUL: AM-Dataverse
Description
Problem to be solved.
In the Dataverse project with OCUL, we are identifying a number of issues with the current proof-of-concept (PoC) integration. Many of these need further analysis before we can sort out which are defects and which are enhancement ideas for future projects.
This is an umbrella issue for now, while we carry out that analysis.
Describe the solution you'd like to see implemented.
From Meghan, a number of data-mapping questions identified so far:
- Not all of the DDI fields are mapped to METS (missing publication date, abstract/description, keywords, subjects). Note that these are required fields in Dataverse. Can we extend the current mapping?
  - TODO: analyze the current data mapping in the wiki
- A large number of datasets deposited prior to version 4.x have handles rather than DOIs (see this DV dataset and JSON, for example). Will we be able to capture datasets with handles properly in the METS file under IDNo (pulling persistentURL and protocol from the JSON)?
- I set up a Storage Service (SS) space/location for the production version of Dataverse, so that I could attempt to retrieve and preserve the QUBS dataset. It fails pretty early on due to a checksum error, but I'm guessing that the prod version may not be up to date with demodv.scholarsportal.info?
- The METS contains at least two dmdSec IDs (or more if there are derivatives). For some reason (and this is documented in the PoC), the dmdSecs in the structMap section are not separated between the objects and metadata folders; they are both put on the first line. Is it possible to parse the dmdSec/structMap relationship in more detail for subfolders in a transfer?
  - TODO: revisit the data mapping on the wiki
- For the Bala Parental and Statistical Software datasets, the objects folder contains zips that hold the original files and DV-generated derivatives, along with metadata files. These zips are also duplicated. See, for example, “MusiciansANOVA.zip” and “MusiciansANOVA.zip-2018-07-19T19_28_21.949093_00_00” in the StatisticalSoftware AIP. It seems redundant to include both zips in the AIP.
  - FIXED: this is no longer happening. For any 'bundles', the Storage Service now extracts the zips, and only the contents are included in the transfer to Archivematica.
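
As a starting point for the handle question in the list above, here is a minimal sketch of how the identifier type might be derived from the dataset JSON. It assumes the JSON exposes `protocol`, `authority`, `identifier`, and `persistentUrl` keys, possibly nested under a top-level `data` object; those key names should be checked against the actual Dataverse output. The function name and the returned dictionary shape are hypothetical and are not the current Archivematica mapping.

```python
def describe_persistent_id(dataset_json: dict) -> dict:
    """Classify a dataset's persistent identifier as a handle or a DOI.

    Assumed (unverified) input keys: protocol, authority, identifier,
    persistentUrl, optionally wrapped in a top-level "data" object.
    """
    # Accept either the full API response or just the inner dataset object.
    data = dataset_json.get("data", dataset_json)

    protocol = (data.get("protocol") or "").lower()
    persistent_url = data.get("persistentUrl") or ""

    # Fall back to the resolver host name if no protocol is given.
    if not protocol:
        if "hdl.handle.net" in persistent_url:
            protocol = "hdl"
        elif "doi.org" in persistent_url:
            protocol = "doi"

    authority = data.get("authority")
    identifier = data.get("identifier")

    # Build "protocol:authority/identifier" when all parts are present;
    # otherwise keep the persistent URL as the identifier value.
    if protocol and authority and identifier:
        value = f"{protocol}:{authority}/{identifier}"
    else:
        value = persistent_url

    # Hypothetical labels for the agency attribute on IDNo.
    agency = {"hdl": "handle", "doi": "DOI"}.get(protocol, "unknown")
    return {"agency": agency, "value": value, "url": persistent_url}


# Placeholder values only, not a real dataset.
example = {
    "data": {
        "protocol": "hdl",
        "authority": "00000",
        "identifier": "EXAMPLE",
        "persistentUrl": "https://hdl.handle.net/00000/EXAMPLE",
    }
}
print(describe_persistent_id(example))
```

Keeping the agency separate from the value would let the METS writer record handles and DOIs under the same IDNo element without guessing the type from the string.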