Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DS-3851 Endpoint to interact with the workflow #2312

Merged
merged 46 commits into from Mar 27, 2019

Conversation

Projects
None yet
6 participants
@abollini
Copy link
Member

abollini commented Jan 2, 2019

This PR enables the configurable workflow and implement the workflowItem endpoint providing the ability to claim, unclaim and approve tasks transforming an inprogress submission in a "final item".

It contains the improvements from the first review round done "early" on the 4Science#37

Integration Tests included

Out of scope of this PR:

  • indexing of the in progress submissions in SOLR as needed by the MyDSpace UI implementation (a separate PR will be created for that)
  • messaging between Submitter and Validators (i.e. Librarians), TBD
  • complete removal of the Basic Workflow (it will be done in one or multiple PRs, see for example #2268)
  • revisit the current workflow/task REST Contract in a more HATEOAS way
  • exposing the workflow configuration, allow configuration of Workflow Roles by collection

@abollini abollini added the REST API v7 label Jan 2, 2019

@benbosman
Copy link
Member

benbosman left a comment

Thanks for the PR.
I've limited my review to the basics only so far. A more elaborate review would first require the ITs to be included.

I do have some general questions to understand the approach:

  • How do you restrict a reviewer (or submitter) from modifying metadata not in the forms. Will they no longer have WRITE access on the item, but only on the workflowitem, or will the item REST at /api/core/items/<:uuid> verify whether the item is in submission/workflow? (I also included an inline question about this)
  • Are admins now able to execute workflow tasks independent of whether they are in the workflow group? Or is this still only possible if the user is in the pool or has claimed the task?
  • How are tasks claimed/returned to the pool
  • How are the potential next actions of a task communicated to REST (e.g. accept, reject, edit metadata, …)
  • Does this already solve any of the open issues of #2187

I will review again once the ITs are ready

@tdonohue
Copy link
Member

tdonohue left a comment

@abollini : I gave this a review today. It looks reasonable overall, but I have some inline comments/questions below.

I also had a general question. I noticed some changes in 4Science#37 are missing from this PR. Is that on purpose? Namely, I notice that the new org.dspace.app.rest.model.patch.LateObjectEvaluator class has been dropped. This is disappointing because we need that new class in order to upgrade to the latest version of Spring & Spring Boot (as Spring has dropped their org.springframework.data.rest.webmvc.json.patch.LateObjectEvaluator). See comments in ticket https://jira.duraspace.org/browse/DS-3802 for more information.

@abollini

This comment has been minimized.

Copy link
Member Author

abollini commented Feb 14, 2019

About the other general questions raised by @benbosman

How do you restrict a reviewer (or submitter) from modifying metadata not in the forms. Will they no longer have WRITE access on the item, but only on the workflowitem, or will the item REST at /api/core/items/<:uuid> verify whether the item is in submission/workflow? (I also included an inline question about this)

Nothing has been added here the security is currently delegated to the dspace-api layer

Are admins now able to execute workflow tasks independent of whether they are in the workflow group? Or is this still only possible if the user is in the pool or has claimed the task?

Nothing has been added here pros or contro the above scenarios. The XmlWorkflow framework should be responsible to deal with a consistent execution of the workflow

How are tasks claimed/returned to the pool

to claim: https://github.com/DSpace/Rest7Contract/blob/master/polltasks.md#post-method-single-resource-level

to unclaim (see also discussion in the inline comments): https://github.com/DSpace/Rest7Contract/blob/master/claimedtasks.md#delete-method

How are the potential next actions of a task communicated to REST (e.g. accept, reject, edit metadata, …)

There are currently no built-in solution for that. The available actions are defined in the contract https://github.com/DSpace/Rest7Contract/blob/master/claimedtasks.md#post-method-single-resource-level

I agree that is something that should be improved in a next PR

Does this already solve any of the open issues of #2187

This PR is only about DS-3851, the other JIRA tickets will be managed separately

@benbosman
Copy link
Member

benbosman left a comment

Thanks for the update @abollini

Continuing my general questions (I removed what is resolved):

How do you restrict a reviewer (or submitter) from modifying metadata not in the forms. Will they no longer have WRITE access on the item, but only on the workflowitem, or will the item REST at /api/core/items/<:uuid> verify whether the item is in submission/workflow? (I also included an inline question about this)

Nothing has been added here the security is currently delegated to the dspace-api layer

Is there any functionality for this in the dspace-api layer?

Are admins now able to execute workflow tasks independent of whether they are in the workflow group? Or is this still only possible if the user is in the pool or has claimed the task?

Nothing has been added here pros or contro the above scenarios. The XmlWorkflow framework should be responsible to deal with a consistent execution of the workflow

AFAIK the XmlWorkflow framework will verify whether you have permissions, implying admins will most likely be able to perform these changes. But I assume this will be easier to verify once the ITs for performing the workflow steps are ready

How are tasks claimed/returned to the pool

to claim: https://github.com/DSpace/Rest7Contract/blob/master/polltasks.md#post-method-single-resource-level

Where is this implemented? I can't find this in the current codebase.
I also noticed there's a typo in the REST contract, it's named polltasks instead of pooltasks

How are the potential next actions of a task communicated to REST (e.g. accept, reject, edit metadata, …)

There are currently no built-in solution for that. The available actions are defined in the contract https://github.com/DSpace/Rest7Contract/blob/master/claimedtasks.md#post-method-single-resource-level
I agree that is something that should be improved in a next PR

Good to see you agree that this is a problem with the current REST Contract.
The list of available next actions shouldn't be defined in the REST Contract, but should be made available using a REST endpoint, asking REST to describe the current task. This will need to be solved in both the Contract and the implementation.
However, the current implementation will still have quite a few issues until this is resolved. There's no error handling if an incorrect action is used, the UI can't identify the applicable actions to display unless it's hardcoded, ….

I couldn't fine any ITs on progressing through the workflow yet, but I also couldn't find any code to claim a task.

I also noticed Travis failed on quite a few ITs

@abollini

This comment has been minimized.

Copy link
Member Author

abollini commented Mar 7, 2019

@benbosman I have further extended the test to demonstrate that never the admin can manipulate other tasks 000c47e

I have not a definitive answer for your question

But my question here was whether metadata updates to /api/core/items are forbidden for workflow and submission items.

but IMHO this is not in the scope of this PR. We should discuss it as a separate ticket (feel free to create it) or as part of the PR related to the item endpoint. The #2313 is indeed a better place for such discussion

In regards to the strategy used to index workflowitem, pool and claimed tasks having them as separate document in SOLR make much easier and natural faceting and access to the tasks. Keep in mind that a single workflow could also lead - as we use the configurable workflow - to multiple parallel tasks, this for instance will make impossible to create a facet for the mydspace based on a broad concept of "workflow step" (the type of task that you have) when you have a single SOLR document with two values for such field one related to the reviewer1 and one related to the reviewer2

@tdonohue
Copy link
Member

tdonohue left a comment

Added a few inline comments to BrowseableObject, as on second glance, some of its required methods are unclear to me.

@tdonohue

This comment has been minimized.

Copy link
Member

tdonohue commented Mar 14, 2019

Notes from today's DSpace 7 meeting (well, the post-meeting):

  • We decided to move forward with this PR's implementation direction for the Preview Release.
    • Future discussion around possibly rethinking WorkspaceItem and WorkflowItem (as concepts, since both might be better as Item states, instead of separate objects) will be tabled, as those are larger changes/discussions. I will create a DSpace 8 wiki page to begin gathering notes for this future discussion.
  • Discussed possibly renaming BrowsableObject interface to IndexableObject (as they are all objects which can be indexed).
  • Discussed possibly renaming a few of the Constants. WORKFLOW_POOL should likely be TASK_POOL. WORKFLOW_CLAIMED should likely be TASK_CLAIMED.
  • All agreed that this PR should NOT be indexing individual Item objects when indexing either a WorkspaceItem or a WorkflowItem. Andrea said he doesn't think this is happening, but if it is, it should be considered a bug.
@abollini

This comment has been minimized.

Copy link
Member Author

abollini commented Mar 14, 2019

I have applied the latest missing feedback as detailed in the inline comment's reply. @tdonohue the last three points of your general comment are most suitable to be managed in the MyDSpace PR (renamed the BrowsableObject interface, the Constants values and verify the absence of a SOLR document about an item when it is actually a workspaceitem or workflowitem)

@abollini

This comment has been minimized.

Copy link
Member Author

abollini commented Mar 14, 2019

BTW, I have already checked the code in the MyDSpace PR and if the item is an actual workspaceitem (or workflowitem) NO solr doc are created related to the item itself but only a solr doc related to the workspaceitem or to the workflowitem (plus in such case solr docs related to each task associated). In all this case not solr doc with search.resourcetype:2 is present

@abollini

This comment has been minimized.

Copy link
Member Author

abollini commented Mar 15, 2019

The previous Travis failure
https://travis-ci.org/DSpace/DSpace/builds/506542668?utm_source=github_status&utm_medium=notification
was due to a merge conflict not identified by git resulting in a compilation failure in the "merged PR". As you can see here
https://travis-ci.org/4Science/DSpace/builds/506542653
the PR branch was compiling successful

Anyway, I have manually merged the latest code in the master here and manually solved the conflict so Travis should soon confirm that the PR can be applied without any issue

@KevinVdV

This comment has been minimized.

Copy link
Member

KevinVdV commented Mar 19, 2019

Took a look and I noticed that issues found in my review of the code were cleaned up. So I took the next step & set the branch up locally & performed some actual testing on the default workflow endpoints, this resulted in the following issues:

POST /api/workflow/workflowitems

  • Issue 1: Attempt to send a correct workspaceItem url but with a workspaceItem identifier that doesn't exist I get a 500 status code and a NullPointerException. I would rather expect a 422 to be returned. (Up for debate which status code is to be returned)
  • Issue 2: Attempt to send a POST with empty body this results in a "java.lang.ArrayIndexOutOfBoundsException", I would have expected an "Malformed body" exception (or something similar).

GET /api/workflow/pooltasks/<:id> AND /api/workflow/claimedtasks/<:id>

This endpoint remains unprotected, anybody with the identifier of a pooled task can access the data of this task (including the metadata of the workflowItem). The only people who should be able to do a get like this are:

  • the task owner
  • an admin

GET /api/workflow/pooltasks/search/findByUser AND /api/workflow/claimedtasks/search/findByUser

Same as above, this is also unprotected. It should be either admin OR the person from which you pass along the uuid.

DELETE /api/workflow/claimedtasks/<:id>

  • Issue 1: When you pass along a nonexisting claimed task you get a 500 status code with a "NullPointerException", one would expect a 404 here.
  • Issue 2: Anybody with WRITE rights on the item can return a claimed task back to the pool. This should really be limited to the owner of the task and perhaps the admins, else a 403 response.

POST /api/workflow/claimedtasks

  • Issue 1: When you pass along a non existing claimed task identifier, a 500 is returned with a NullPointerException, one would expect a 422 here.
  • Issue 2: When you perform a POST with parameters that don't make sense (I tested with submit_rejectt) a 204 is returned, leading the end user to believe that the post was successful. Even though nothing happenend. One would expect a 400 if the parameters for that particular task don't make sense (instead of proving a false sense of success)
@abollini

This comment has been minimized.

Copy link
Member Author

abollini commented Mar 21, 2019

Hi @KevinVdV
thanks for your test and findings!
I agree with your analysis, I can provide the fix and the test for such use cases in a couple of days, I assume on Monday/Tuesday at latest.

To speedup a bit the process, as maybe we will like to discuss a bit about the exact return code for the different scenarios, I like to suggest to accept this PR as is. In such case, I will open a ticket with the above issues that will be fixed asap, before the preview release. In this way we can work in parallel on this fix and the further refactoring discussed for the MyDSpace PR.

@benbosman
Copy link
Member

benbosman left a comment

I've verified the status of all review feedback so far, and I'm summarizing my point of view here:

Regarding the approach of creating the BrowsableObject:

  • We've agreed to implement the discovery indexing as tasks with an associated item rather than items with an assigned claimed task as part of 4Science#65
  • We've agreed to rename BrowsableObject to IndexableObject in Java as part of 4Science#65
  • We've agreed to rename "resultObject" to "indexableObject" in the discover REST result as part of 4Science#65

I'm also ok with moving the security fix to ensure the metadata updates to /api/core/items are forbidden for workflow and submission items to another PR or a JIRA ticket, but it is something which has to be solved since it is a severe security issue.

I don't see it as a security risk that admins now can perform workflow tasks even without being part of the workflow. They would be able to add themselves to the workflow group and perform the tasks anyway. So I'm fine with leaving that as is

What I still see as feedback to be processed in this PR is the feedback from
#2312 (comment)

@abollini

This comment has been minimized.

Copy link
Member Author

abollini commented Mar 21, 2019

I add some information about the schedule here as I'm going to miss the weekly meeting

Option 1) we wait for the feedback provided by @KevinVdV #2312 (comment) to be solved. This would mean that this PR will be ready to be merged on Tuesday 26th. On Wednesday 27th the PR of the mydspace (currently 4Science#65 ) will be recreated against the official DSpace master including the changes noted by Art

Option 2) we merge this PR now. The PR for the mydspace will be recreated against the official DSpace master including the changes noted by Art by the end of the week so that it will be ready to be further reviewed/merged on Monday 25th. On Tuesday 26th, regardless to the merge status of the mydspace PR, a new PR will be created to deal with the latest feedback from @KevinVdV

@tdonohue

This comment has been minimized.

Copy link
Member

tdonohue commented Mar 21, 2019

As discussed in today's DSpace 7 meeting, we are proposing Option 1 from those listed by @abollini in the previous comment. We'd like to minimize any security issues that make it into master.

@abollini

This comment has been minimized.

Copy link
Member Author

abollini commented Mar 26, 2019

@tdonohue @benbosman as agreed comments from @KevinVdV #2312 (comment) have been resolved.
To speedup the review Integration Tests have been introduced for all the listed scenarios

POST /api/workflow/workflowitems

Issue 1: Attempt to send a correct workspaceItem url but with a workspaceItem identifier that doesn't exist I get a 500 status code and a NullPointerException. I would rather expect a 422 to be returned. (Up for debate which status code is to be returned)
Issue 2: Attempt to send a POST with empty body this results in a "java.lang.ArrayIndexOutOfBoundsException", I would have expected an "Malformed body" exception (or something similar).

https://github.com/DSpace/DSpace/pull/2312/files#diff-c881cb09fde7833cded0ffdff8d91f98R728

GET /api/workflow/pooltasks/<:id> AND /api/workflow/claimedtasks/<:id>

This endpoint remains unprotected, anybody with the identifier of a pooled task can access the data of this task (including the metadata of the workflowItem). The only people who should be able to do a get like this are:

the task owner
an admin

https://github.com/DSpace/DSpace/pull/2312/files#diff-c417cf9dd7bee44c0e9b199e85704428R123
https://github.com/DSpace/DSpace/pull/2312/files#diff-c417cf9dd7bee44c0e9b199e85704428R165

https://github.com/DSpace/DSpace/pull/2312/files#diff-c417cf9dd7bee44c0e9b199e85704428R655
https://github.com/DSpace/DSpace/pull/2312/files#diff-c417cf9dd7bee44c0e9b199e85704428R709

GET /api/workflow/pooltasks/search/findByUser AND /api/workflow/claimedtasks/search/findByUser

Same as above, this is also unprotected. It should be either admin OR the person from which you pass along the uuid.

https://github.com/DSpace/DSpace/pull/2312/files#diff-c417cf9dd7bee44c0e9b199e85704428R379
https://github.com/DSpace/DSpace/pull/2312/files#diff-c417cf9dd7bee44c0e9b199e85704428R455

https://github.com/DSpace/DSpace/pull/2312/files#diff-c417cf9dd7bee44c0e9b199e85704428R1141
https://github.com/DSpace/DSpace/pull/2312/files#diff-c417cf9dd7bee44c0e9b199e85704428R1217

DELETE /api/workflow/claimedtasks/<:id>

Issue 1: When you pass along a nonexisting claimed task you get a 500 status code with a "NullPointerException", one would expect a 404 here.
Issue 2: Anybody with WRITE rights on the item can return a claimed task back to the pool. This should really be limited to the owner of the task and perhaps the admins, else a 403 response.

https://github.com/DSpace/DSpace/pull/2312/files#diff-c417cf9dd7bee44c0e9b199e85704428R836
https://github.com/DSpace/DSpace/pull/2312/files#diff-c417cf9dd7bee44c0e9b199e85704428R897
https://github.com/DSpace/DSpace/pull/2312/files#diff-c417cf9dd7bee44c0e9b199e85704428R938
https://github.com/DSpace/DSpace/pull/2312/files#diff-c417cf9dd7bee44c0e9b199e85704428R990

POST /api/workflow/claimedtasks

Issue 1: When you pass along a non existing claimed task identifier, a 500 is returned with a NullPointerException, one would expect a 422 here.
Issue 2: When you perform a POST with parameters that don't make sense (I tested with submit_rejectt) a 204 is returned, leading the end user to believe that the post was successful. Even though nothing happenend. One would expect a 400 if the parameters for that particular task don't make sense (instead of proving a false sense of success)

https://github.com/DSpace/DSpace/pull/2312/files#diff-c417cf9dd7bee44c0e9b199e85704428R1548
https://github.com/DSpace/DSpace/pull/2312/files#diff-c417cf9dd7bee44c0e9b199e85704428R1601

@KevinVdV

This comment has been minimized.

Copy link
Member

KevinVdV commented Mar 27, 2019

The fixes above check out, but I found another small bug, but that can be fixed in the MyDspace PR.

When you pass along a non existing pooled task identifier to claim a pooled task you get a 500 with a NullPointer, should be a 404 with a message stating that the task could not be found.

@benbosman
Copy link
Member

benbosman left a comment

My only open concerns for this PR were to solve #2312 (comment)

Kevin reported that was solved, so the remainder can be handled in the MyDSpace PR

@tdonohue
Copy link
Member

tdonohue left a comment

👍 I reviewed this again today, and it looks good to me. However, I found a minor misspelling in Constants, which is duplicated into three @PreAuthorize annotations. After this is corrected, feel free to merge @abollini !

@abollini

This comment has been minimized.

Copy link
Member Author

abollini commented Mar 27, 2019

Ok I will wait for travis to confirm that everything is fine

@abollini abollini merged commit d25463f into DSpace:master Mar 27, 2019

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage increased (+1.3%) to 31.185%
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.