GDCC/8605-add-archival-status-support #8696

Merged

Changes from all commits (25 commits):
de62791 Archival status success/pending/failure/null support (qqmyers, May 13, 2022)
8c82c61 flyway to update existing (qqmyers, May 13, 2022)
b354bc3 fix typos/mistakes (qqmyers, May 13, 2022)
9c9ac65 basic status logging in existing archivers (qqmyers, May 13, 2022)
221ca0b API docs (qqmyers, May 13, 2022)
8902d9a Merge remote-tracking branch 'IQSS/develop' into GDCC/8605-add-archiv… (qqmyers, May 24, 2022)
a37922b Merge remote-tracking branch 'IQSS/develop' into GDCC/8605-add-archiv… (qqmyers, May 26, 2022)
cefa12c rename flyway (qqmyers, May 26, 2022)
e1c62af Merge remote-tracking branch 'IQSS/develop' into GDCC/8605-add-archiv… (qqmyers, May 27, 2022)
d2bf93c Merge remote-tracking branch 'IQSS/develop' into GDCC/8605-add-archiv… (qqmyers, Jun 26, 2022)
ae1c97c Merge remote-tracking branch 'IQSS/develop' into GDCC/8605-add-archiv… (qqmyers, Jul 14, 2022)
d3a7b04 update flyway naming (qqmyers, Jul 14, 2022)
5295bcd Merge remote-tracking branch 'IQSS/develop' into GDCC/8605-add-archiv… (qqmyers, Jul 15, 2022)
9223e7d updates per review (qqmyers, Jul 15, 2022)
f5396d8 swap native update (qqmyers, Jul 15, 2022)
986f9ff Merge remote-tracking branch 'IQSS/develop' into (qqmyers, Jul 18, 2022)
8750e62 missed logger.fine (qqmyers, Jul 18, 2022)
5d617f0 test tweak (qqmyers, Jul 19, 2022)
8fcb59c fix jsonpath (qqmyers, Jul 19, 2022)
d2d817e fix URLs (qqmyers, Jul 19, 2022)
6a70d42 add content type on set (qqmyers, Jul 19, 2022)
e498417 application/json (qqmyers, Jul 19, 2022)
8a99685 in docs, show verbs for clarity, s/Json/JSON/ #8605 (pdurbin, Jul 19, 2022)
7362e1c lower logging #8605 (pdurbin, Jul 19, 2022)
7410c5b format urls in docs (qqmyers, Jul 21, 2022)
57 changes: 56 additions & 1 deletion doc/sphinx-guides/source/api/native-api.rst
@@ -1861,7 +1861,7 @@ The API call requires a JSON body that includes the embargo's end date (dateAvailable)
Remove an Embargo on Files in a Dataset
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- /api/datasets/$dataset-id/files/actions/:unset-embargo can be used to remove an embargo on one or more files in a dataset. Embargoes can be removed from files that are only in a draft dataset version (and are not in any previously published version) by anyone who can edit the dataset. The same API call can be used by a superuser to remove embargoes from files that have already been released as part of a previously published dataset version.
+ ``/api/datasets/$dataset-id/files/actions/:unset-embargo`` can be used to remove an embargo on one or more files in a dataset. Embargoes can be removed from files that are only in a draft dataset version (and are not in any previously published version) by anyone who can edit the dataset. The same API call can be used by a superuser to remove embargoes from files that have already been released as part of a previously published dataset version.

The API call requires a JSON body that includes the list of the fileIds that the embargo should be removed from. All files listed must be in the specified dataset. For example:

@@ -1873,6 +1873,61 @@ The API call requires a JSON body that includes the list of the fileIds that the
export JSON='{"fileIds":[300,301]}'

curl -H "X-Dataverse-key: $API_TOKEN" -H "Content-Type:application/json" "$SERVER_URL/api/datasets/:persistentId/files/actions/:unset-embargo?persistentId=$PERSISTENT_IDENTIFIER" -d "$JSON"


Get the Archival Status of a Dataset By Version
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Archiving is an optional feature that may be configured for a Dataverse instance. When it is enabled, this API call can be used to retrieve the archival status of a dataset version. Note that this call requires "superuser" credentials.

``GET /api/datasets/$dataset-id/$version/archivalStatus`` returns the archival status of the specified dataset version.

The response is a JSON object containing a "status", which may be "success", "pending", or "failure", and a "message", which is archive-system specific. For "success", the message should provide an identifier or link to the archival copy. For example:

.. code-block:: bash

export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
export SERVER_URL=https://demo.dataverse.org
export PERSISTENT_IDENTIFIER=doi:10.5072/FK2/7U7YBV
export VERSION=1.0

curl -H "X-Dataverse-key: $API_TOKEN" -H "Accept:application/json" "$SERVER_URL/api/datasets/:persistentId/$VERSION/archivalStatus?persistentId=$PERSISTENT_IDENTIFIER"

Set the Archival Status of a Dataset By Version
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Archiving is an optional feature that may be configured for a Dataverse instance. When it is enabled, this API call can be used to set the status. Note that this call is intended for use by the archival system and requires "superuser" credentials.

``PUT /api/datasets/$dataset-id/$version/archivalStatus`` sets the archival status of the specified dataset version.

The body is a JSON object that must contain a "status", which may be "success", "pending", or "failure", and a "message", which is archive-system specific. For "success", the message should provide an identifier or link to the archival copy. For example:

.. code-block:: bash

export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
export SERVER_URL=https://demo.dataverse.org
export PERSISTENT_IDENTIFIER=doi:10.5072/FK2/7U7YBV
export VERSION=1.0
export JSON='{"status":"failure","message":"Something went wrong"}'

curl -H "X-Dataverse-key: $API_TOKEN" -H "Content-Type:application/json" -X PUT "$SERVER_URL/api/datasets/:persistentId/$VERSION/archivalStatus?persistentId=$PERSISTENT_IDENTIFIER" -d "$JSON"

Delete the Archival Status of a Dataset By Version
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Archiving is an optional feature that may be configured for a Dataverse instance. When it is enabled, this API call can be used to delete the status. Note that this call is intended for use by the archival system and requires "superuser" credentials.

``DELETE /api/datasets/$dataset-id/$version/archivalStatus`` deletes the archival status of the specified dataset version.

.. code-block:: bash

export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
export SERVER_URL=https://demo.dataverse.org
export PERSISTENT_IDENTIFIER=doi:10.5072/FK2/7U7YBV
export VERSION=1.0

curl -H "X-Dataverse-key: $API_TOKEN" -X DELETE "$SERVER_URL/api/datasets/:persistentId/$VERSION/archivalStatus?persistentId=$PERSISTENT_IDENTIFIER"


Files
-----
48 changes: 47 additions & 1 deletion src/main/java/edu/harvard/iq/dataverse/DatasetVersion.java
@@ -6,11 +6,11 @@
import edu.harvard.iq.dataverse.branding.BrandingUtil;
import edu.harvard.iq.dataverse.dataset.DatasetUtil;
import edu.harvard.iq.dataverse.license.License;
import edu.harvard.iq.dataverse.util.BundleUtil;
import edu.harvard.iq.dataverse.util.FileUtil;
import edu.harvard.iq.dataverse.util.StringUtil;
import edu.harvard.iq.dataverse.util.SystemConfig;
import edu.harvard.iq.dataverse.util.DateUtil;
import edu.harvard.iq.dataverse.util.json.JsonUtil;
import edu.harvard.iq.dataverse.util.json.NullSafeJsonBuilder;
import edu.harvard.iq.dataverse.workflows.WorkflowComment;
import java.io.Serializable;
@@ -27,6 +27,7 @@
import javax.json.Json;
import javax.json.JsonArray;
import javax.json.JsonArrayBuilder;
import javax.json.JsonObject;
import javax.json.JsonObjectBuilder;
import javax.persistence.CascadeType;
import javax.persistence.Column;
@@ -94,6 +95,14 @@ public enum VersionState {
public static final int ARCHIVE_NOTE_MAX_LENGTH = 1000;
public static final int VERSION_NOTE_MAX_LENGTH = 1000;

//Archival copies: Status message required components
public static final String ARCHIVAL_STATUS = "status";
public static final String ARCHIVAL_STATUS_MESSAGE = "message";
//Archival Copies: Allowed Statuses
public static final String ARCHIVAL_STATUS_PENDING = "pending";
public static final String ARCHIVAL_STATUS_SUCCESS = "success";
public static final String ARCHIVAL_STATUS_FAILURE = "failure";

@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
private Long id;
@@ -152,6 +161,11 @@ public enum VersionState {
// removed pending further investigation (v4.13)
private String archiveNote;

// Originally a simple string indicating the location of the archival copy. As
// of v5.12, repurposed to hold a more general JSON archival status (failure,
// pending, success) and message (serialized as a string). The archival copy
// location is now expected as the contents of the message for the status
// 'success'. See the /api/datasets/{id}/{version}/archivalStatus API calls for more details.
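// An illustrative stored value (the URL is hypothetical):
// {"status":"success","message":"https://archive.example.edu/bags/doi-10.5072-FK2-7U7YBV_v1.0"}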
@Column(nullable=true, columnDefinition = "TEXT")
private String archivalCopyLocation;

@@ -180,6 +194,8 @@ public enum VersionState {
@Transient
private DatasetVersionDifference dvd;

@Transient
private JsonObject archivalStatus;

public Long getId() {
return this.id;
@@ -319,9 +335,39 @@ public void setArchiveNote(String note) {
public String getArchivalCopyLocation() {
return archivalCopyLocation;
}

public String getArchivalCopyLocationStatus() {
populateArchivalStatus(false);

if (archivalStatus != null) {
return archivalStatus.getString(ARCHIVAL_STATUS);
}
return null;
}

public String getArchivalCopyLocationMessage() {
populateArchivalStatus(false);
if (archivalStatus != null) {
return archivalStatus.getString(ARCHIVAL_STATUS_MESSAGE);
}
return null;
}

private void populateArchivalStatus(boolean force) {
// Parse and cache the stored JSON status; re-parse when forced (e.g. after a set)
if (archivalStatus == null || force) {
if (archivalCopyLocation != null) {
try {
archivalStatus = JsonUtil.getJsonObject(archivalCopyLocation);
} catch (Exception e) {
logger.warning("DatasetVersion id: " + id + " has a non-JsonObject value, parsing error: " + e.getMessage());
logger.fine(archivalCopyLocation);
}
}
}
}

public void setArchivalCopyLocation(String location) {
this.archivalCopyLocation = location;
populateArchivalStatus(true);
}

public String getDeaccessionLink() {
src/main/java/edu/harvard/iq/dataverse/DatasetVersionServiceBean.java
@@ -1187,4 +1187,12 @@ private DatasetVersion getPreviousVersionWithUnf(DatasetVersion datasetVersion)
return null;
}

/**
* Merges the passed datasetversion to the persistence context.
* @param ver the DatasetVersion whose new state we want to persist.
* @return The managed entity representing {@code ver}.
*/
public DatasetVersion merge( DatasetVersion ver ) {
return em.merge(ver);
}
Comment on lines +1195 to +1197

Member: I'm surprised this merge method doesn't already exist on DatasetVersionServiceBean.java. Is it because most changes to versions happen through commands? Is it because once a version is published there's no need to go back and change the version (except for deaccessioning, I guess, which is a command)? I don't think it's bad to add this method, but I wonder why we're only adding it now.

Member Author (qqmyers): Yeah - I think everything uses a Command of some sort. I was also surprised that it didn't exist, as the dataset service has a merge() and the file service has several methods that don't do much more than a merge.

} // end class
108 changes: 106 additions & 2 deletions src/main/java/edu/harvard/iq/dataverse/api/Datasets.java
@@ -87,6 +87,7 @@
import edu.harvard.iq.dataverse.util.json.JSONLDUtil;
import edu.harvard.iq.dataverse.util.json.JsonLDTerm;
import edu.harvard.iq.dataverse.util.json.JsonParseException;
import edu.harvard.iq.dataverse.util.json.JsonUtil;
import edu.harvard.iq.dataverse.search.IndexServiceBean;
import static edu.harvard.iq.dataverse.util.json.JsonPrinter.*;
import static edu.harvard.iq.dataverse.util.json.NullSafeJsonBuilder.jsonObjectBuilder;
@@ -216,6 +217,9 @@ public class Datasets extends AbstractApiBean {
@Inject
DataverseRoleServiceBean dataverseRoleService;

@EJB
DatasetVersionServiceBean datasetversionService;

/**
* Used to consolidate the way we parse and handle dataset versions.
* @param <T>
@@ -2259,7 +2263,7 @@ public Response completeMPUpload(String partETagBody, @QueryParam("globalid") St
eTagList.add(new PartETag(Integer.parseInt(partNo), object.getString(partNo)));
}
for(PartETag et: eTagList) {
logger.info("Part: " + et.getPartNumber() + " : " + et.getETag());
logger.fine("Part: " + et.getPartNumber() + " : " + et.getETag());
}
} catch (JsonException je) {
logger.info("Unable to parse eTags from: " + partETagBody);
@@ -2524,7 +2528,7 @@ public Command<DatasetVersion> handleLatestPublished() {
if ( dsv == null || dsv.getId() == null ) {
throw new WrappedResponse( notFound("Dataset version " + versionNumber + " of dataset " + ds.getId() + " not found") );
}
- if (dsv.isReleased()) {
+ if (dsv.isReleased() && uriInfo != null) {
MakeDataCountLoggingServiceBean.MakeDataCountEntry entry = new MakeDataCountEntry(uriInfo, headers, dvRequestService, ds);
mdcLogService.logEntry(entry);
}
@@ -3282,4 +3286,104 @@ public Response getCurationStates() throws WrappedResponse {
csvSB.append("\n");
return ok(csvSB.toString(), MediaType.valueOf(FileUtil.MIME_TYPE_CSV), "datasets.status.csv");
}

// APIs to manage archival status

@GET
@Produces(MediaType.APPLICATION_JSON)
@Path("/{id}/{version}/archivalStatus")
public Response getDatasetVersionArchivalStatus(@PathParam("id") String datasetId,
@PathParam("version") String versionNumber, @Context UriInfo uriInfo, @Context HttpHeaders headers) {

try {
AuthenticatedUser au = findAuthenticatedUserOrDie();
if (!au.isSuperuser()) {
return error(Response.Status.FORBIDDEN, "Superusers only.");
}
DataverseRequest req = createDataverseRequest(au);
DatasetVersion dsv = getDatasetVersionOrDie(req, versionNumber, findDatasetOrDie(datasetId), uriInfo,
headers);

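// A null location means no archival status has been recorded for this version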
if (dsv.getArchivalCopyLocation() == null) {
return error(Status.NO_CONTENT, "This dataset version has not been archived");
} else {
JsonObject status = JsonUtil.getJsonObject(dsv.getArchivalCopyLocation());
return ok(status);
}
} catch (WrappedResponse wr) {
return wr.getResponse();
}
}

@PUT
@Consumes(MediaType.APPLICATION_JSON)
@Path("/{id}/{version}/archivalStatus")
public Response setDatasetVersionArchivalStatus(@PathParam("id") String datasetId,
@PathParam("version") String versionNumber, JsonObject update, @Context UriInfo uriInfo,
@Context HttpHeaders headers) {

logger.fine(JsonUtil.prettyPrint(update));
try {
AuthenticatedUser au = findAuthenticatedUserOrDie();

if (!au.isSuperuser()) {
return error(Response.Status.FORBIDDEN, "Superusers only.");
}

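// Validate the payload: both "status" and "message" keys are required, and
// "status" must be one of pending/success/failure; anything else falls
// through to the 400 "Unacceptable status format" response below.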
if (update.containsKey(DatasetVersion.ARCHIVAL_STATUS) && update.containsKey(DatasetVersion.ARCHIVAL_STATUS_MESSAGE)) {
String status = update.getString(DatasetVersion.ARCHIVAL_STATUS);
if (status.equals(DatasetVersion.ARCHIVAL_STATUS_PENDING) || status.equals(DatasetVersion.ARCHIVAL_STATUS_FAILURE)
|| status.equals(DatasetVersion.ARCHIVAL_STATUS_SUCCESS)) {

DataverseRequest req = createDataverseRequest(au);
DatasetVersion dsv = getDatasetVersionOrDie(req, versionNumber, findDatasetOrDie(datasetId),
uriInfo, headers);

if (dsv == null) {
return error(Status.NOT_FOUND, "Dataset version not found");
}

dsv.setArchivalCopyLocation(JsonUtil.prettyPrint(update));
dsv = datasetversionService.merge(dsv);
logger.fine("location now: " + dsv.getArchivalCopyLocation());
logger.fine("status now: " + dsv.getArchivalCopyLocationStatus());
logger.fine("message now: " + dsv.getArchivalCopyLocationMessage());

return ok("Status updated");
}
}
} catch (WrappedResponse wr) {
return wr.getResponse();
}

return error(Status.BAD_REQUEST, "Unacceptable status format");
}

@DELETE
@Produces(MediaType.APPLICATION_JSON)
@Path("/{id}/{version}/archivalStatus")
public Response deleteDatasetVersionArchivalStatus(@PathParam("id") String datasetId,
@PathParam("version") String versionNumber, @Context UriInfo uriInfo, @Context HttpHeaders headers) {

try {
AuthenticatedUser au = findAuthenticatedUserOrDie();
if (!au.isSuperuser()) {
return error(Response.Status.FORBIDDEN, "Superusers only.");
}

DataverseRequest req = createDataverseRequest(au);
DatasetVersion dsv = getDatasetVersionOrDie(req, versionNumber, findDatasetOrDie(datasetId), uriInfo,
headers);
if (dsv == null) {
return error(Status.NOT_FOUND, "Dataset version not found");
}
dsv.setArchivalCopyLocation(null);
dsv = datasetversionService.merge(dsv);

return ok("Status deleted");

} catch (WrappedResponse wr) {
return wr.getResponse();
}
}
}
src/main/java/edu/harvard/iq/dataverse/engine/command/impl/DuraCloudSubmitToArchiveCommand.java
@@ -23,6 +23,9 @@
import java.util.Map;
import java.util.logging.Logger;

import javax.json.Json;
import javax.json.JsonObjectBuilder;

import org.apache.commons.codec.binary.Hex;
import org.duracloud.client.ContentStore;
import org.duracloud.client.ContentStoreManager;
@@ -88,6 +91,11 @@ public WorkflowStepResult performArchiveSubmission(DatasetVersion dv, ApiToken t
.replace('.', '-').toLowerCase() + "_v" + dv.getFriendlyVersionNumber();

ContentStore store;
// Set a failure status that will be updated if we succeed
JsonObjectBuilder statusObject = Json.createObjectBuilder();
statusObject.add(DatasetVersion.ARCHIVAL_STATUS, DatasetVersion.ARCHIVAL_STATUS_FAILURE);
statusObject.add(DatasetVersion.ARCHIVAL_STATUS_MESSAGE, "Bag not transferred");

try {
/*
* If there is a failure in creating a space, it is likely that a prior version
@@ -194,7 +202,9 @@ public void run() {
sb.append("/duradmin/spaces/sm/");
sb.append(store.getStoreId());
sb.append("/" + spaceName + "/" + fileName);
- dv.setArchivalCopyLocation(sb.toString());
+ statusObject.add(DatasetVersion.ARCHIVAL_STATUS, DatasetVersion.ARCHIVAL_STATUS_SUCCESS);
+ statusObject.add(DatasetVersion.ARCHIVAL_STATUS_MESSAGE, sb.toString());

logger.fine("DuraCloud Submission step complete: " + sb.toString());
} catch (ContentStoreException | IOException e) {
// TODO Auto-generated catch block
@@ -217,6 +227,9 @@ public void run() {
} catch (NoSuchAlgorithmException e) {
logger.severe("MD5 MessageDigest not available!");
}
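// Persist whatever status was reached: the initial "failure" entries remain
// unless the submission succeeded and overwrote them above.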
finally {
dv.setArchivalCopyLocation(statusObject.build().toString());
}
} else {
logger.warning(
"DuraCloud Submision Workflow aborted: Dataset locked for finalizePublication, or because file validation failed");