4593 nesstar ddi import #5170

Merged
merged 15 commits on Nov 30, 2018
21 changes: 21 additions & 0 deletions doc/sphinx-guides/source/api/native-api.rst
@@ -185,6 +185,27 @@ Before calling the API, make sure the data files referenced by the ``POST``\ ed
* A Dataverse server can import datasets with a valid PID that uses a different protocol or authority than said server is configured for. However, the server will not update the PID metadata on subsequent update and publish actions.


Import a Dataset into a Dataverse with a DDI file
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. note:: This action requires a Dataverse account with superuser permissions.

To import a dataset with an existing persistent identifier (PID), you must provide the PID as a query parameter in the URL. The following command imports a dataset with the PID ``$PERSISTENT_IDENTIFIER`` into Dataverse and then releases it::

curl -H "X-Dataverse-key: $API_TOKEN" -X POST "$SERVER_URL/api/dataverses/$DV_ALIAS/datasets/:importddi?pid=$PERSISTENT_IDENTIFIER&release=yes" --upload-file ddi_dataset.xml
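
On success, the endpoint returns the database ID and persistent identifier of the imported dataset. A sketch of the response, assuming the usual ``{"status": ..., "data": ...}`` envelope (the ``id`` and PID values here are made up; ``releaseCompleted`` is only present when ``release=yes``)::

{
  "status": "OK",
  "data": {
    "id": 35,
    "persistentId": "doi:10.5072/FK2/EXAMPLE",
    "releaseCompleted": true
  }
}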

The optional ``pid`` parameter holds a persistent identifier (such as a DOI or Handle). The import will fail if the provided PID fails validation.

The optional ``release`` parameter tells Dataverse to immediately publish the dataset. If the parameter is changed to ``no``, the imported dataset will remain in ``DRAFT`` status.
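
For example, the same request with ``release=no`` leaves the imported dataset as a draft (a sketch reusing the placeholders above)::

curl -H "X-Dataverse-key: $API_TOKEN" -X POST "$SERVER_URL/api/dataverses/$DV_ALIAS/datasets/:importddi?pid=$PERSISTENT_IDENTIFIER&release=no" --upload-file ddi_dataset.xml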

The file must be a valid DDI XML file.
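
For illustration, a minimal DDI Codebook 2.5 document has roughly this shape (a sketch only, with a made-up title and abstract; it is not a complete, validated instance)::

<codeBook xmlns="ddi:codebook:2_5">
  <stdyDscr>
    <citation>
      <titlStmt>
        <titl>Example Study Title</titl>
      </titlStmt>
    </citation>
    <stdyInfo>
      <abstract>A short abstract describing the study.</abstract>
    </stdyInfo>
  </stdyDscr>
</codeBook>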

.. warning::

* This API does not handle files related to the DDI file.
* A Dataverse server can import datasets with a valid PID that uses a different protocol or authority than said server is configured for. However, the server will not update the PID metadata on subsequent update and publish actions.


Publish a Dataverse
~~~~~~~~~~~~~~~~~~~

73 changes: 73 additions & 0 deletions src/main/java/edu/harvard/iq/dataverse/api/Dataverses.java
@@ -7,6 +7,8 @@
import edu.harvard.iq.dataverse.Dataverse;
import edu.harvard.iq.dataverse.DataverseFacet;
import edu.harvard.iq.dataverse.DataverseContact;
import edu.harvard.iq.dataverse.api.imports.ImportException;
import edu.harvard.iq.dataverse.api.imports.ImportServiceBean;
import edu.harvard.iq.dataverse.authorization.DataverseRole;
import edu.harvard.iq.dataverse.DvObject;
import edu.harvard.iq.dataverse.GlobalId;
@@ -109,6 +111,9 @@ public class Dataverses extends AbstractApiBean {

@EJB
ExplicitGroupServiceBean explicitGroupSvc;

@EJB
ImportServiceBean importService;
// @EJB
// SystemConfig systemConfig;

@@ -303,6 +308,74 @@ public Response importDataset(String jsonBody, @PathParam("identifier") String p
}
}

// TODO: decide whether to merge importddi with the import endpoint just below (XML and JSON on the same API, instead of two separate APIs)
@POST
@Path("{identifier}/datasets/:importddi")
public Response importDatasetDdi(String xml, @PathParam("identifier") String parentIdtf, @QueryParam("pid") String pidParam, @QueryParam("release") String releaseParam) throws ImportException {
Contributor Author

Shall I merge this with https://github.com/IQSS/dataverse/blob/develop/src/main/java/edu/harvard/iq/dataverse/api/Dataverses.java#L240 so that there is only one API? I used nearly the same structure.

    try {
        User u = findUserOrDie();
        if (!u.isSuperuser()) {
            return error(Status.FORBIDDEN, "Not a superuser");
        }
        Dataverse owner = findDataverseOrDie(parentIdtf);
        Dataset ds = null;
        try {
            ds = jsonParser().parseDataset(importService.ddiToJson(xml));
        } catch (JsonParseException jpe) {
            return badRequest("Error parsing data as JSON: " + jpe.getMessage());
        }
        ds.setOwner(owner);
        if (nonEmpty(pidParam)) {
            if (!GlobalId.verifyImportCharacters(pidParam)) {
                return badRequest("PID parameter contains characters that are not allowed by the Dataverse application. On import, the PID must only contain characters specified in this regex: " + BundleUtil.getStringFromBundle("pid.allowedCharacters"));
            }
            Optional<GlobalId> maybePid = GlobalId.parse(pidParam);
            if (maybePid.isPresent()) {
                ds.setGlobalId(maybePid.get());
            } else {
                // unparsable PID passed. Terminate.
                return badRequest("Cannot parse the PID parameter '" + pidParam + "'. Make sure it is in valid form - see Dataverse Native API documentation.");
            }
        }

        boolean shouldRelease = StringUtil.isTrue(releaseParam);
        DataverseRequest request = createDataverseRequest(u);

        Dataset managedDs = null;
        if (nonEmpty(pidParam)) {
            managedDs = execCommand(new ImportDatasetCommand(ds, request));
        } else {
            managedDs = execCommand(new CreateNewDatasetCommand(ds, request));
        }

        JsonObjectBuilder responseBld = Json.createObjectBuilder()
                .add("id", managedDs.getId())
                .add("persistentId", managedDs.getGlobalIdString());

        if (shouldRelease) {
            DatasetVersion latestVersion = ds.getLatestVersion();
            latestVersion.setVersionState(DatasetVersion.VersionState.RELEASED);
            latestVersion.setVersionNumber(1L);
            latestVersion.setMinorVersionNumber(0L);
            // note: these only refresh timestamps that are already set
            if (latestVersion.getCreateTime() != null) {
                latestVersion.setCreateTime(new Date());
            }
            if (latestVersion.getLastUpdateTime() != null) {
                latestVersion.setLastUpdateTime(new Date());
            }
            PublishDatasetResult res = execCommand(new PublishDatasetCommand(managedDs, request, false, shouldRelease));
            responseBld.add("releaseCompleted", res.isCompleted());
        }

        return created("/datasets/" + managedDs.getId(), responseBld);

    } catch (WrappedResponse ex) {
        return ex.getResponse();
    }
}

private Dataset parseDataset(String datasetJson) throws WrappedResponse {
try (StringReader rdr = new StringReader(datasetJson)) {
return jsonParser().parseDataset(Json.createReader(rdr).readObject());
src/main/java/edu/harvard/iq/dataverse/api/imports/ImportDDIServiceBean.java
@@ -93,6 +93,7 @@ public class ImportDDIServiceBean {
public static final String NOTE_TYPE_REPLICATION_FOR = "DVN:REPLICATION_FOR";
private static final String HARVESTED_FILE_STORAGE_PREFIX = "http://";
private XMLInputFactory xmlInputFactory = null;
private static final Logger logger = Logger.getLogger(ImportDDIServiceBean.class.getName());

@EJB CustomFieldServiceBean customFieldService;

@@ -129,6 +130,7 @@ public Map<String, String> mapDDI(ImportType importType, String xmlToParse, Data
StringReader reader = new StringReader(xmlToParse);
XMLStreamReader xmlr = null;
XMLInputFactory xmlFactory = javax.xml.stream.XMLInputFactory.newInstance();
xmlFactory.setProperty("javax.xml.stream.isCoalescing", true); // allows the parsing of a CDATA segment into a single event
xmlr = xmlFactory.createXMLStreamReader(reader);
processDDI(importType, xmlr, datasetDTO, filesMap);

@@ -200,9 +202,13 @@ private void processDDI(ImportType importType, XMLStreamReader xmlr, DatasetDTO

}
}

if (isHarvestImport(importType)) {
    datasetDTO.getDatasetVersion().setVersionState(VersionState.RELEASED);
} else {
    datasetDTO.getDatasetVersion().setVersionState(VersionState.DRAFT);
}


@@ -410,9 +416,7 @@ else if (xmlr.getLocalName().equals("relStdy")) {
// rp.setText( (String) rpFromDDI );
}
publications.add(set);
if (publications.size()>0) {
getCitation(dvDTO).addField(FieldDTO.createMultipleCompoundFieldDTO(DatasetFieldConstant.publication, publications));
}


} else if (xmlr.getLocalName().equals("otherRefs")) {

@@ -422,7 +426,9 @@

}
} else if (event == XMLStreamConstants.END_ELEMENT) {

if (publications.size()>0) {
getCitation(dvDTO).addField(FieldDTO.createMultipleCompoundFieldDTO(DatasetFieldConstant.publication, publications));
}
if (xmlr.getLocalName().equals("othrStdyMat")) {
return;
}
@@ -484,7 +490,8 @@ private void processStdyInfo(XMLStreamReader xmlr, DatasetVersionDTO dvDTO) thro
} else if (xmlr.getLocalName().equals("abstract")) {
HashSet<FieldDTO> set = new HashSet<>();
addToSet(set,"dsDescriptionDate", xmlr.getAttributeValue(null, "date"));
addToSet(set,"dsDescriptionValue", parseText(xmlr, "abstract"));
Map<String, String> dsDescriptionDetails = parseCompoundText(xmlr, "abstract");
addToSet(set,"dsDescriptionValue", dsDescriptionDetails.get("name"));
if (!set.isEmpty()) {
descriptions.add(set);
}
@@ -741,7 +748,8 @@ private void processMethod(XMLStreamReader xmlr, DatasetVersionDTO dvDTO ) throw
if (NOTE_TYPE_EXTENDED_METADATA.equalsIgnoreCase(noteType) ) {
processCustomField(xmlr, dvDTO);
} else {
addNote("Subject: Study Level Error Note, Notes: "+ parseText( xmlr,"notes" ) +";", dvDTO);
processNotes(xmlr, dvDTO);
// addNote("Subject: Study Level Error Note, Notes: "+ parseText( xmlr,"notes" ) +";", dvDTO);

}
} else if (xmlr.getLocalName().equals("anlyInfo")) {
@@ -897,6 +905,7 @@ private void processDataColl(XMLStreamReader xmlr, DatasetVersionDTO dvDTO) thro
String collMode = "";
String timeMeth = "";
String weight = "";
String dataCollector = "";

for (int event = xmlr.next(); event != XMLStreamConstants.END_DOCUMENT; event = xmlr.next()) {
if (event == XMLStreamConstants.START_ELEMENT) {
@@ -911,7 +920,14 @@ private void processDataColl(XMLStreamReader xmlr, DatasetVersionDTO dvDTO) thro
}
//socialScience.getFields().add(FieldDTO.createPrimitiveFieldDTO("timeMethod", parseText( xmlr, "timeMeth" )));
} else if (xmlr.getLocalName().equals("dataCollector")) {
socialScience.getFields().add(FieldDTO.createPrimitiveFieldDTO("dataCollector", parseText( xmlr, "dataCollector" )));
// socialScience.getFields().add(FieldDTO.createPrimitiveFieldDTO("dataCollector", parseText( xmlr, "dataCollector" )));
// accumulate repeated dataCollector elements into a single comma-separated value
String thisValue = parseText(xmlr, "dataCollector");
if (!StringUtil.isEmpty(thisValue)) {
    if (!"".equals(dataCollector)) {
        dataCollector = dataCollector.concat(", ");
    }
    dataCollector = dataCollector.concat(thisValue);
}
// frequencyOfDataCollection
} else if (xmlr.getLocalName().equals("frequenc")) {
socialScience.getFields().add(FieldDTO.createPrimitiveFieldDTO("frequencyOfDataCollection", parseText( xmlr, "frequenc" )));
@@ -968,6 +984,9 @@ private void processDataColl(XMLStreamReader xmlr, DatasetVersionDTO dvDTO) thro
if (!StringUtil.isEmpty(weight)) {
socialScience.getFields().add(FieldDTO.createPrimitiveFieldDTO("weighting", weight));
}
if (!StringUtil.isEmpty(dataCollector)) {
socialScience.getFields().add(FieldDTO.createPrimitiveFieldDTO("dataCollector", dataCollector));
}
return;
}
}
@@ -1049,7 +1068,7 @@ private void processVerStmt(ImportType importType, XMLStreamReader xmlr, Dataset
if (isNewImport(importType)) {
// If this is a new, Draft version, versionNumber and minor versionNumber are null.
dvDTO.setVersionState(VersionState.DRAFT);
}
}
}

private void processDataAccs(XMLStreamReader xmlr, DatasetVersionDTO dvDTO) throws XMLStreamException {
@@ -1632,7 +1651,8 @@ private void addToSet(HashSet<FieldDTO> set, String typeName, String value ) {
set.add(FieldDTO.createPrimitiveFieldDTO(typeName, value));
}
}


// TODO: determine what is going on here?
private void processOtherMat(XMLStreamReader xmlr, DatasetDTO datasetDTO) throws XMLStreamException {
FileMetadataDTO fmdDTO = new FileMetadataDTO();

src/main/java/edu/harvard/iq/dataverse/api/imports/ImportServiceBean.java
@@ -383,6 +383,23 @@ public Dataset doImportHarvestedDataset(DataverseRequest dataverseRequest, Harve
}
return importedDataset;
}

public JsonObject ddiToJson(String xmlToParse) throws ImportException {
    DatasetDTO dsDTO = null;

    try {
        dsDTO = importDDIService.doImport(ImportType.IMPORT, xmlToParse);
    } catch (XMLStreamException e) {
        throw new ImportException("XMLStreamException: " + e);
    }
    // convert the DTO to JSON
    Gson gson = new GsonBuilder().setPrettyPrinting().create();
    String json = gson.toJson(dsDTO);
    JsonReader jsonReader = Json.createReader(new StringReader(json));
    JsonObject obj = jsonReader.readObject();

    return obj;
}

public JsonObjectBuilder doImport(DataverseRequest dataverseRequest, Dataverse owner, String xmlToParse, String fileName, ImportType importType, PrintWriter cleanupLog) throws ImportException, IOException {

src/main/java/edu/harvard/iq/dataverse/api/imports/ImportUtil.java
@@ -12,7 +12,9 @@
public interface ImportUtil {
public enum ImportType{
/** ? */
NEW,
NEW,
/** TODO: had to add a separate type because the otherMat tag was causing problems; will discuss it in the pull request **/
IMPORT,

/** Data is harvested from another Dataverse instance */
HARVEST