Allow publishing conditional on indexing being done, add /timestamps #7640

Merged
merged 4 commits into from Mar 8, 2021
13 changes: 12 additions & 1 deletion doc/sphinx-guides/source/api/native-api.rst
@@ -1020,6 +1020,8 @@ When publishing a dataset it's good to be aware of the Dataverse Software's vers

If this is the first version of the dataset, its version number will be set to ``1.0``. Otherwise, the new dataset version number is determined by the most recent version number and the ``type`` parameter. Passing ``type=minor`` increases the minor version number (2.3 is updated to 2.4). Passing ``type=major`` increases the major version number (2.3 is updated to 3.0). (Superusers can pass ``type=updatecurrent`` to update metadata without changing the version number.)
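The version arithmetic described above can be sketched as follows. This is an illustration only, not the Dataverse implementation: the class and method names (``VersionBump``, ``next``) are hypothetical, and ``updatecurrent`` is modeled simply as leaving the number unchanged.

```java
// Hypothetical helper illustrating the documented version arithmetic.
// This is not code from the Dataverse code base.
final class VersionBump {

    /**
     * Returns the version number that publishing would assign.
     * Pass null for current when the dataset has never been published.
     */
    static String next(String current, String type) {
        if (current == null) {
            return "1.0"; // the first published version is always 1.0
        }
        String[] parts = current.split("\\.");
        int major = Integer.parseInt(parts[0]);
        int minor = Integer.parseInt(parts[1]);
        switch (type) {
            case "major":
                return (major + 1) + ".0";        // minor number resets to 0
            case "minor":
                return major + "." + (minor + 1);
            case "updatecurrent":
                return current;                   // metadata update, number unchanged
            default:
                throw new IllegalArgumentException("Unknown type: " + type);
        }
    }
}
```

For example, ``VersionBump.next("2.3", "minor")`` yields ``2.4`` and ``VersionBump.next("2.3", "major")`` yields ``3.0``, matching the cases listed above.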

This call also supports an optional boolean query parameter, ``assureIsIndexed``. If it is set to ``true``, the call fails with a 409 ("CONFLICT") response when the dataset is awaiting re-indexing. Indexing that occurs during publishing can cause the publish request to fail after a 202 response has already been returned. Setting this parameter lets the caller wait until indexing has completed and so avoid that failure. It is most useful when edits are made immediately before publication.
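For callers that do not use ``curl``, a request carrying ``assureIsIndexed`` might be built as in the sketch below, which uses the JDK's ``java.net.http`` API (Java 11 or later). The class name is hypothetical, the server URL, dataset id and token are placeholders, and the ``X-Dataverse-key`` header name is an assumption that is not stated in this section.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch only: builds the publish request but does not send it.
final class PublishRequestSketch {

    static HttpRequest build(String serverUrl, String datasetId, String apiToken) {
        // Both query parameters go in the URL; "type" is required by the endpoint.
        URI uri = URI.create(serverUrl + "/api/datasets/" + datasetId
                + "/actions/:publish?type=major&assureIsIndexed=true");
        return HttpRequest.newBuilder(uri)
                // Assumed header name for the API token.
                .header("X-Dataverse-key", apiToken)
                // POST with an empty body; GET is deprecated for publishing.
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }
}
```

The resulting request can be passed to ``HttpClient.send`` and the status code inspected for 200, 202 or 409.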

.. note:: See :ref:`curl-examples-and-environment-variables` if you are unfamiliar with the use of ``export`` below.

.. code-block:: bash
@@ -1039,7 +1041,7 @@ The fully expanded example above (without environment variables) looks like this

The quotes around the URL are required because there is more than one query parameter separated by an ampersand (``&``), which has special meaning to Unix shells such as Bash. Putting the ``&`` in quotes ensures that "type" is interpreted as one of the query parameters.

You should expect JSON output and a 200 ("OK") response in most cases. If you receive a 202 ("ACCEPTED") response, this is normal for installations that have workflows configured. Workflows are described in the :doc:`/developers/workflows` section of the Developer Guide.
You should expect JSON output and a 200 ("OK") response in most cases. If you receive a 202 ("ACCEPTED") response, this is normal for installations that have workflows configured. Workflows are described in the :doc:`/developers/workflows` section of the Developer Guide. A 409 ("CONFLICT") response is also possible if you set ``assureIsIndexed=true``. (In this case, you can repeat the call until a 200 or 202 response is returned.)
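Repeating the call on a 409 response could look like the following sketch. The HTTP call itself is abstracted behind an ``IntSupplier`` that returns the status code, so the class name, the attempt limit and the wait interval are all assumptions chosen for illustration.

```java
import java.util.function.IntSupplier;

// Sketch only: retries a publish call while the server answers 409 (CONFLICT).
final class PublishRetrySketch {

    /**
     * Invokes publishCall until it returns something other than 409 or
     * maxAttempts calls have been made. Returns the last status code seen.
     */
    static int publishWithRetry(IntSupplier publishCall, int maxAttempts, long waitMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        int status = publishCall.getAsInt();
        for (int attempt = 1; attempt < maxAttempts && status == 409; attempt++) {
            try {
                Thread.sleep(waitMillis); // give indexing time to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return status; // stop retrying if the thread is interrupted
            }
            status = publishCall.getAsInt();
        }
        return status;
    }
}
```

Bounding the number of attempts keeps a client from looping forever if indexing never completes; a caller would treat a final 409 as a failure to report.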

.. note:: POST should be used to publish a dataset. GET is supported for backward compatibility but is deprecated and may be removed: https://github.com/IQSS/dataverse/issues/2431

@@ -1742,6 +1744,15 @@ Configure a Dataset to Use a Specific File Store

``/api/datasets/$dataset-id/storageDriver`` can be used to check, configure or reset the designated file store (storage driver) for a dataset. Please see the :doc:`/admin/dataverses-datasets` section of the guide for more information on this API.

View the Timestamps on a Dataset
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``/api/datasets/$dataset-id/timestamps`` can be used to view timestamps associated with various events in the dataset's lifecycle. For published datasets, this API call provides the ``createTime`` and, when they are set, the ``publicationTime``, ``lastMetadataExportTime`` and ``lastMajorVersionReleaseTime``. It also provides two booleans, ``hasStaleIndex`` and ``hasStalePermissionIndex``, which, if false, indicate that the indexed information Dataverse displays for the dataset is up to date. The response is ``application/json`` with the timestamps included in the returned ``data`` object.

When called by a user who can view the draft version of the dataset, additional timestamps are reported: ``lastUpdateTime``, ``lastIndexTime``, ``lastPermissionUpdateTime``, ``lastPermissionIndexTime``, and ``globalIdCreateTime``.

One use case for this API call is allowing an external application to poll and wait until changes being made by the Dataverse software or another external tool are complete before continuing its own processing.
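The polling use case could be sketched as below. The ``BooleanSupplier`` is assumed to wrap a GET on the timestamps endpoint and return true while ``hasStaleIndex`` or ``hasStalePermissionIndex`` is true; the HTTP call and JSON parsing are left out, and the class and method names are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.BooleanSupplier;

// Sketch only: waits for a dataset's indexes to catch up, with a deadline.
final class IndexWaitSketch {

    /**
     * Polls indexIsStale until it reports false or the timeout elapses.
     * Returns true if the index became current before the deadline.
     */
    static boolean waitUntilIndexed(BooleanSupplier indexIsStale, Duration timeout, Duration interval) {
        Instant deadline = Instant.now().plus(timeout);
        while (indexIsStale.getAsBoolean()) {
            if (!Instant.now().isBefore(deadline)) {
                return false; // deadline reached while still stale
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false; // treat interruption as giving up
            }
        }
        return true;
    }
}
```

A deadline rather than a fixed attempt count is used here so that the total wait is predictable regardless of how long each request takes.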

Files
-----

102 changes: 99 additions & 3 deletions src/main/java/edu/harvard/iq/dataverse/api/Datasets.java
@@ -112,6 +112,8 @@
import java.io.StringReader;
import java.sql.Timestamp;
import java.text.MessageFormat;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Date;
@@ -969,12 +971,12 @@ private String validateDatasetFieldValues(List<DatasetField> fields) {
@Deprecated
public Response publishDataseUsingGetDeprecated( @PathParam("id") String id, @QueryParam("type") String type ) {
logger.info("publishDataseUsingGetDeprecated called on id " + id + ". Encourage use of POST rather than GET, which is deprecated.");
return publishDataset(id, type);
return publishDataset(id, type, false);
}

@POST
@Path("{id}/actions/:publish")
public Response publishDataset(@PathParam("id") String id, @QueryParam("type") String type) {
public Response publishDataset(@PathParam("id") String id, @QueryParam("type") String type, @QueryParam("assureIsIndexed") boolean mustBeIndexed) {
try {
if (type == null) {
return error(Response.Status.BAD_REQUEST, "Missing 'type' parameter (either 'major','minor', or 'updatecurrent').");
@@ -1002,6 +1004,29 @@ public Response publishDataset(@PathParam("id") String id, @QueryParam("type") S
}

Dataset ds = findDatasetOrDie(id);
if (mustBeIndexed) {
logger.fine("IT: " + ds.getIndexTime());
logger.fine("MT: " + ds.getModificationTime());
logger.fine("PIT: " + ds.getPermissionIndexTime());
logger.fine("PMT: " + ds.getPermissionModificationTime());
if (ds.getIndexTime() != null && ds.getModificationTime() != null) {
logger.fine("ITMT: " + (ds.getIndexTime().compareTo(ds.getModificationTime()) <= 0));
}
/*
* Some calls, such as the /datasets/actions/:import* commands, do not set the
* modification or permission modification times. The checks here try to
* determine whether indexing or permission indexing could be pending: if the
* relevant modification time is set, the matching index time must also be set
* and must be after the modification time. If the modification time is set and
* the index time is null or is not after the modification time, the
* 409/conflict error is returned.
*/
if ((ds.getModificationTime()!=null && (ds.getIndexTime() == null || (ds.getIndexTime().compareTo(ds.getModificationTime()) <= 0))) ||
(ds.getPermissionModificationTime()!=null && (ds.getPermissionIndexTime() == null || (ds.getPermissionIndexTime().compareTo(ds.getPermissionModificationTime()) <= 0)))) {
return error(Response.Status.CONFLICT, "Dataset is awaiting indexing");
}
}
if (updateCurrent) {
/*
* Note: The code here mirrors that in the
@@ -2324,5 +2349,76 @@ public Response resetFileStore(@PathParam("identifier") String dvIdtf,
datasetService.merge(dataset);
return ok("Storage reset to default: " + DataAccess.DEFAULT_STORAGE_DRIVER_IDENTIFIER);
}
}

@GET
@Path("{identifier}/timestamps")
@Produces(MediaType.APPLICATION_JSON)
public Response getTimestamps(@PathParam("identifier") String id) {

Dataset dataset = null;
DateTimeFormatter formatter = DateTimeFormatter.ISO_LOCAL_DATE_TIME;
try {
dataset = findDatasetOrDie(id);
User u = findUserOrDie();
Set<Permission> perms = new HashSet<Permission>();
perms.add(Permission.ViewUnpublishedDataset);
boolean canSeeDraft = permissionSvc.hasPermissionsFor(u, dataset, perms);
JsonObjectBuilder timestamps = Json.createObjectBuilder();
logger.fine("CSD: " + canSeeDraft);
logger.fine("IT: " + dataset.getIndexTime());
logger.fine("MT: " + dataset.getModificationTime());
logger.fine("PIT: " + dataset.getPermissionIndexTime());
logger.fine("PMT: " + dataset.getPermissionModificationTime());
// Basic info if the dataset is released or the user can see the draft
if (dataset.isReleased() || canSeeDraft) {
timestamps.add("createTime", formatter.format(dataset.getCreateDate().toLocalDateTime()));
if (dataset.getPublicationDate() != null) {
timestamps.add("publicationTime", formatter.format(dataset.getPublicationDate().toLocalDateTime()));
}

if (dataset.getLastExportTime() != null) {
timestamps.add("lastMetadataExportTime",
formatter.format(dataset.getLastExportTime().toInstant().atZone(ZoneId.systemDefault())));
}

if (dataset.getMostRecentMajorVersionReleaseDate() != null) {
timestamps.add("lastMajorVersionReleaseTime", formatter.format(
dataset.getMostRecentMajorVersionReleaseDate().toInstant().atZone(ZoneId.systemDefault())));
}
// If the modification/permission modification time is set and the matching
// index time is null or is not after it, the relevant index is stale
timestamps.add("hasStaleIndex",
dataset.getModificationTime() != null && (dataset.getIndexTime() == null
|| dataset.getIndexTime().compareTo(dataset.getModificationTime()) <= 0));
// Compare the permission index time against the permission modification time
timestamps.add("hasStalePermissionIndex",
dataset.getPermissionModificationTime() != null && (dataset.getPermissionIndexTime() == null
|| dataset.getPermissionIndexTime().compareTo(dataset.getPermissionModificationTime()) <= 0));
}
// More detail if you can see a draft
if (canSeeDraft) {
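// The modification time is not set by every call (e.g. the :import* commands)
if (dataset.getModificationTime() != null) {
timestamps.add("lastUpdateTime", formatter.format(dataset.getModificationTime().toLocalDateTime()));
}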
if (dataset.getIndexTime() != null) {
timestamps.add("lastIndexTime", formatter.format(dataset.getIndexTime().toLocalDateTime()));
}
if (dataset.getPermissionModificationTime() != null) {
timestamps.add("lastPermissionUpdateTime",
formatter.format(dataset.getPermissionModificationTime().toLocalDateTime()));
}
if (dataset.getPermissionIndexTime() != null) {
timestamps.add("lastPermissionIndexTime",
formatter.format(dataset.getPermissionIndexTime().toLocalDateTime()));
}
if (dataset.getGlobalIdCreateTime() != null) {
timestamps.add("globalIdCreateTime", formatter
.format(dataset.getGlobalIdCreateTime().toInstant().atZone(ZoneId.systemDefault())));
}

}
return ok(timestamps);
} catch (WrappedResponse wr) {
return wr.getResponse();
}
}
}
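
The staleness condition appears in both `publishDataset` and `getTimestamps`. As a review-style suggestion rather than part of this PR, it could be factored into a single helper along the lines of the sketch below. The class and method names are hypothetical; only the comparison mirrors the diff above.

```java
import java.sql.Timestamp;

// Hypothetical helper, not part of the PR: one place for the staleness rule.
final class IndexStaleness {

    /**
     * An index is stale when a modification time is set and the index time
     * is either missing or not after that modification time.
     */
    static boolean isStale(Timestamp modificationTime, Timestamp indexTime) {
        return modificationTime != null
                && (indexTime == null || indexTime.compareTo(modificationTime) <= 0);
    }
}
```

The metadata index would be checked with `isStale(ds.getModificationTime(), ds.getIndexTime())` and the permission index with `isStale(ds.getPermissionModificationTime(), ds.getPermissionIndexTime())`, which makes it harder to pair a permission timestamp with a non-permission one by mistake.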