IQSS/9126- Fix workflow token access #9129

qqmyers · 2022-11-03T21:46:08Z

What this PR does / why we need it: This PR removes some of the custom permissions checking in the Access class which had been stopping workflow token access to these endpoints.

Which issue(s) this PR closes:

Closes #9126

Special notes for your reviewer: I've tried to comment some in the PR - the logic here was pretty convoluted and I think had some dead code (which I noted back when embargoes went in). Hopefully cleaner now but would be good to have someone check my assumptions about the old code as noted in the comments.

Suggestions on how to test this:
New functionality: it is easiest to have DANS test it. You need to run an async workflow and use the workflow token to access the download file endpoints.
Regression: this could, but shouldn't, break any access by Guest/PrivateUrl/Authenticated Users with appropriate permissions (DownloadFile for published restricted or embargoed files, ViewUnpublishedDataset for unpublished files) to the main file, aux files, etc.
[from Leonid:] For regression, I just want to second the above about retesting the less trivial access cases extra carefully. In addition to the ones listed above, maybe attempt a zip download, via the API, of a number of files, only some of which the user is authorized to access.

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

Is there a release notes update needed for this change?: as a workflow related bug fix, probably not?

Additional documentation:

qqmyers · 2022-11-03T21:47:00Z

src/main/java/edu/harvard/iq/dataverse/api/Access.java

    //@EJB

    // TODO: 
    // versions? -- L.A. 4.0 beta 10
    @Path("datafile/bundle/{fileId}")
    @GET
    @Produces({"application/zip"})
-    public BundleDownloadInstance datafileBundle(@PathParam("fileId") String fileId, @QueryParam("fileMetadataId") Long fileMetadataId,@QueryParam("gbrecs") boolean gbrecs, @QueryParam("key") String apiToken, @Context UriInfo uriInfo, @Context HttpHeaders headers, @Context HttpServletResponse response) /*throws NotFoundException, ServiceUnavailableException, PermissionDeniedException, AuthorizationRequiredException*/ {
+    public BundleDownloadInstance datafileBundle(@PathParam("fileId") String fileId, @QueryParam("fileMetadataId") Long fileMetadataId,@QueryParam("gbrecs") boolean gbrecs, @Context UriInfo uriInfo, @Context HttpHeaders headers, @Context HttpServletResponse response) /*throws NotFoundException, ServiceUnavailableException, PermissionDeniedException, AuthorizationRequiredException*/ {


Most API calls don't declare the key as a QueryParam, so I've removed those parameters in this class.

Note to self: we need to retire this "bundle access" method.

qqmyers · 2022-11-03T21:47:50Z

src/main/java/edu/harvard/iq/dataverse/api/Access.java


        if (gbrecs != true && df.isReleased()){
            // Write Guestbook record if not done previously and file is released
-            User apiTokenUser = findAPITokenUser(apiToken);
+            User apiTokenUser = findAPITokenUser();


This calls findUserOrDie which will retrieve the key param or api token header, or the workflow token header.
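
To illustrate the lookup described above, here is a minimal, self-contained sketch of resolving a credential from either a `key` query parameter, an API token header, or a workflow token header. It is not the actual `findUserOrDie` code: the class `CredentialLookupSketch`, its nested types, the header names, and the precedence order are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.Optional;

// Illustrative sketch only; this is not the AbstractApiBean implementation.
final class CredentialLookupSketch {

    // Assumed names: the comment above only says "key param", "api token header"
    // and "workflow token header".
    static final String KEY_PARAM = "key";
    static final String API_TOKEN_HEADER = "X-Dataverse-key";
    static final String WORKFLOW_TOKEN_HEADER = "X-Dataverse-invocationID";

    enum Kind { API_TOKEN, WORKFLOW_TOKEN }

    static final class Credential {
        final Kind kind;
        final String value;

        Credential(Kind kind, String value) {
            this.kind = kind;
            this.value = value;
        }
    }

    // Returns the first credential found, or empty when the caller should be
    // treated as a guest. The precedence order here is an assumption.
    static Optional<Credential> resolve(Map<String, String> queryParams,
                                        Map<String, String> headers) {
        String apiToken = headers.get(API_TOKEN_HEADER);
        if (apiToken == null) {
            apiToken = queryParams.get(KEY_PARAM);
        }
        if (apiToken != null) {
            return Optional.of(new Credential(Kind.API_TOKEN, apiToken));
        }
        String workflowToken = headers.get(WORKFLOW_TOKEN_HEADER);
        if (workflowToken != null) {
            return Optional.of(new Credential(Kind.WORKFLOW_TOKEN, workflowToken));
        }
        return Optional.empty();
    }
}
```

Because the endpoint no longer takes the key as a method argument, any of the three sources can satisfy the same call.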

qqmyers · 2022-11-03T21:48:19Z

src/main/java/edu/harvard/iq/dataverse/api/Access.java

        // This will throw a ForbiddenException if access isn't authorized: 
-        checkAuthorization(df, apiToken);
+        checkAuthorization(df);


See the comments in this method.

coveralls · 2022-11-03T21:48:55Z

Coverage increased (+0.06%) to 20.054% when pulling 49ab161 on GlobalDataverseCommunityConsortium:IQSS/9126-allow_workflow_tokens_in_access_api into 6c87b39 on IQSS:develop.

qqmyers · 2022-11-03T21:49:01Z

src/main/java/edu/harvard/iq/dataverse/api/Access.java

@@ -859,15 +833,15 @@ public void write(OutputStream os) throws IOException,
                        logger.fine("token: " + fileIdParams[i]);
                        Long fileId = null;
                        try {
-                            fileId = new Long(fileIdParams[i]);
+                            fileId = Long.parseLong(fileIdParams[i]);


Unrelated: replaced the deprecated `new Long(String)` constructor with `Long.parseLong`.
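
For context, the `Long(String)` constructor has been deprecated since Java 9, and `Long.parseLong` is the usual replacement. A small sketch of parsing one comma-separated file id token the way the hunk above does; the helper `parseFileId` and the choice to return `null` on bad input are hypothetical, not taken from Access.java.

```java
// Illustrative sketch; parseFileId is a hypothetical helper, not part of Access.java.
final class FileIdParsingSketch {

    // Returns the parsed id, or null when the token is not a valid long.
    static Long parseFileId(String token) {
        try {
            // parseLong returns a primitive long, which is autoboxed on return.
            return Long.parseLong(token);
        } catch (NumberFormatException e) {
            return null;
        }
    }
}
```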

qqmyers · 2022-11-03T21:49:38Z

src/main/java/edu/harvard/iq/dataverse/api/Access.java

-
-        if (isAccessAuthorized(dataFile, getRequestApiKey())) {
+        //Already have access
+        if (isAccessAuthorized(dataFile)) {


Added a comment. My understanding is that in a request-access call, already having access is a BAD_REQUEST.

qqmyers · 2022-11-03T21:51:40Z

src/main/java/edu/harvard/iq/dataverse/api/Access.java

         */

-        if (session != null) {
+        User apiTokenUser = null;


The logic used to handle three cases separately (session/logged-in user, guest user, and apitoken user), mostly so that fine logging could indicate which branch was being used. One tricky thing is that both session and apitoken calls could return a GuestUser, and this wasn't broken out.

I like how the code has been reorganized and simplified. And the logic seems correct. But could you please clarify the "this wasn't broken out" part above? Aside from the old code being convoluted and unwieldy, was there actually a problem with the old logic? Was there a situation where the guest user returned by the apitoken call would overwrite an authenticated user from the session? I've stared at the old code for a while now and I'm not seeing that, but figured I would ask to confirm.

(We may have assumed in the past that nobody would be calling the api with both a session and a token in real life).

I don't think it was broken, so no real issue. I think my comment related to the old line 1877 and the code after it - there was no distinction between guest coming back from a session or from the apitoken logic. In refactoring, I had to make sure that both were still handled, e.g. in line 1790 below.

I was looking at that code after line 1877. It was just meaningless, wasn't it? I.e., it doesn't seem to be doing anything that line 1861 hasn't already attempted.
It just looks like the old code may have been written with the assumption that session == null when it's a direct API call, and of course it should never be null.

qqmyers · 2022-11-03T21:53:47Z

src/main/java/edu/harvard/iq/dataverse/api/Access.java

         */

-        if (session != null) {
+        User apiTokenUser = null;
+        //If we get a non-GuestUser from findUserOrDie, use it. Otherwise, check the session


The logic now looks for an apitoken authenticated user and uses it if it exists. If not, and a session user exists, we use that. If the apitoken method indicates a GuestUser, we will use that if there's no session.
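
The precedence described above can be sketched as a small standalone function. This is an illustration under assumptions, not the code in Access.java: the `User` and `Kind` types and the `select` method are hypothetical stand-ins for the real user classes.

```java
// Illustrative sketch; User, Kind and select are hypothetical stand-ins.
final class UserSelectionSketch {

    enum Kind { AUTHENTICATED, GUEST }

    static final class User {
        final Kind kind;
        final String name;

        User(Kind kind, String name) {
            this.kind = kind;
            this.name = name;
        }
    }

    // tokenUser: result of the token lookup (may be a guest, or null on failure).
    // sessionUser: user from the session (may be a guest, or null).
    static User select(User tokenUser, User sessionUser) {
        // 1. An authenticated token user always wins.
        if (tokenUser != null && tokenUser.kind != Kind.GUEST) {
            return tokenUser;
        }
        // 2. Otherwise prefer whatever the session holds.
        if (sessionUser != null) {
            return sessionUser;
        }
        // 3. No session: fall back to the token-side guest (or null if none).
        return tokenUser;
    }
}
```

A `null` result corresponds to the "no user at all" case, where there is nothing more to check.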

qqmyers · 2022-11-03T21:55:16Z

src/main/java/edu/harvard/iq/dataverse/api/Access.java

+        //If we get a non-GuestUser from findUserOrDie, use it. Otherwise, check the session
+        try {
+            logger.fine("calling apiTokenUser = findUserOrDie()...");
+            apiTokenUser = findUserOrDie();


With the rearchitecting, I assume this call will end up handling both session and token logins and this part of the code will be able to be replaced. Until then, this is the only part of the api that was already designed to be usable with sessions or tokens.

qqmyers · 2022-11-03T21:56:02Z

src/main/java/edu/harvard/iq/dataverse/api/Access.java

+        //If we don't have a user, nothing more to do. (Note session could have returned GuestUser)
+        if (user == null && apiTokenUser == null) {
+            logger.warning("Unable to find a user via session or with a token.");
+            return false;


The code had/has a check for both being null at the end, but there's no reason to proceed further if we have no user.

qqmyers · 2022-11-03T21:58:59Z

src/main/java/edu/harvard/iq/dataverse/api/Access.java

+            dvr = createDataverseRequest(apiTokenUser);
+        } else {
+            // used in JSF context, user may be Guest
+            dvr = dvRequestService.getDataverseRequest();


Simplifying the code below by getting the appropriate request ready now. This means we can't use the permissionService.on() shortcut call, but it also means that session and token users are handled in the same single lines below.

qqmyers · 2022-11-03T22:00:26Z

src/main/java/edu/harvard/iq/dataverse/api/Access.java

+            // used in JSF context, user may be Guest
+            dvr = dvRequestService.getDataverseRequest();
+        }
+        if (!published) { // and restricted or embargoed (implied by earlier processing)


Since published and not restricted/embargoed is handled above, the main split now is whether the file is published or not. If it's published, the only case left is restricted/embargoed. If it's unpublished, the restricted/embargoed and not restricted/embargoed cases are handled the same way.
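
As a rough model of the branching described above, the decision reduces to three cases. The sketch below is an assumption-laden simplification, not the method in Access.java: the permission names come from this PR, but the `Permission` enum, the boolean flags, and the permission sets passed in are hypothetical stand-ins for the real permission service calls.

```java
import java.util.Set;

// Illustrative sketch; the enum and method are hypothetical stand-ins.
final class AccessDecisionSketch {

    enum Permission { ViewUnpublishedDataset, DownloadFile }

    // onDataset: permissions the requester holds on the owning dataset.
    // onFile: permissions the requester holds on the file itself.
    static boolean isAccessAuthorized(boolean published,
                                      boolean restrictedOrEmbargoed,
                                      Set<Permission> onDataset,
                                      Set<Permission> onFile) {
        // Published and open: anyone may download.
        if (published && !restrictedOrEmbargoed) {
            return true;
        }
        // Unpublished: restricted or not, the dataset-level permission decides.
        if (!published) {
            return onDataset.contains(Permission.ViewUnpublishedDataset);
        }
        // Published and restricted/embargoed: the file-level permission decides.
        return onFile.contains(Permission.DownloadFile);
    }
}
```

Since the same requester context feeds both permission checks, session users, token users, and guests all go through the same two lines.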

qqmyers · 2022-11-03T22:01:11Z

src/main/java/edu/harvard/iq/dataverse/api/Access.java

-            // last option - guest user in either contexts
-            // Guset user is impled by the code above.
-            if ( permissionService.requestOn(dvRequestService.getDataverseRequest(), df.getOwner()).has(Permission.ViewUnpublishedDataset) ) {
+            if (permissionService.requestOn(dvr, df.getOwner()).has(Permission.ViewUnpublishedDataset)) {


Given the changes above, this line now handles all three cases: authenticated session user, token user, and guest.

qqmyers · 2022-11-03T22:02:13Z

src/main/java/edu/harvard/iq/dataverse/api/Access.java

-                }
-            } else {
-                logger.log(Level.FINE, "API token-based auth: User {0} is not authorized to access the datafile.", user.getIdentifier());
+            if (permissionService.requestOn(dvr, df).has(Permission.DownloadFile)) {


Same here: all three cases. Also note that in the old code I think the lines after #1955 were unreachable, repeating a case handled earlier.

qqmyers · 2022-11-03T22:03:58Z

src/main/java/edu/harvard/iq/dataverse/api/Access.java

+        try {
+            logger.fine("calling apiTokenUser = findUserOrDie()...");
+            apiTokenUser = findUserOrDie();
+            if(apiTokenUser instanceof GuestUser) {


As noted, the idea here is to keep a guest user returned from findUserOrDie (which happens when there is no key/token, and which we want if there's no session) from overriding an authenticated session user. This could probably be cleaned up further, but I'm again assuming the rearchitecture work will end up doing that in the AbstractApiBean soon. (Hopefully the rest of the cleanup here, in addition to solving the workflow token issue, will make it easier for this class to use that new method.)

mreekie · 2022-12-14T21:20:03Z

added to sprint Dec 15, 2022

landreev

I'm glad to see this logic streamlined and simplified.
I asked a couple of simple questions, and there's a trivial typo, but I'm generally ready to move it into qa.
I appreciate the comments explaining what's going on. My only question is, is there a reason not to incorporate at least some of them into the code?

mreekie · 2023-01-11T20:45:51Z

QA is left:
Size for this sprint: 3

Refactored permissions checks and fixed workflow token access

7d9327e

qqmyers added the GDCC: DANS related to GDCC work for DANS label Nov 3, 2022

Merge remote-tracking branch 'IQSS/develop' into IQSS/9126-allow_work…

09e4f09

…flow_tokens_in_access_api

qqmyers added the Size: 10 A percentage of a sprint. 7 hours. label Dec 13, 2022

Merge remote-tracking branch 'IQSS/develop' into IQSS/9126-allow_work…

bbe1c58

…flow_tokens_in_access_api

mreekie added this to Ready for Review in IQSS/dataverse (TO BE RETIRED / DELETED in favor of project 34) via automation Dec 14, 2022

landreev self-assigned this Dec 19, 2022

landreev self-requested a review December 19, 2022 23:14

landreev moved this from Ready for Review to Review 🔎 in IQSS/dataverse (TO BE RETIRED / DELETED in favor of project 34) Dec 20, 2022

landreev reviewed Dec 21, 2022

landreev reviewed Dec 22, 2022

qqmyers added 2 commits December 22, 2022 11:56

typo

e4efe4a

include comments from PR

49ab161

landreev approved these changes Dec 22, 2022

IQSS/dataverse (TO BE RETIRED / DELETED in favor of project 34) automation moved this from Review 🔎 to QA ✅ Dec 22, 2022

kcondon assigned kcondon and unassigned landreev Jan 4, 2023

kcondon merged commit 4960f03 into IQSS:develop Jan 18, 2023

IQSS/dataverse (TO BE RETIRED / DELETED in favor of project 34) automation moved this from QA ✅ to Done 🚀 Jan 18, 2023

pdurbin added this to the 5.13 milestone Jan 23, 2023

mikeaintworkin mentioned this pull request Jan 29, 2023

Retrieve a list of the issues & PRs in the global backlog and include Status thisaintwork/experimentsWithGithub#10

Closed
