signed URLs with private url users, autogenerate API token #10098

qqmyers
Member

@qqmyers qqmyers commented Nov 3, 2023

What this PR does / why we need it: This PR addresses issues related to using signedUrls as a PrivateUrlUser and, as part of refactoring/reducing duplicate code, also adds generating an API token if needed to launch a config tool.

For PrivateUrlUsers, signedUrls weren't working for a couple of reasons:

  • The signedUrl generator was not being given a username for PrivateUrlUsers which is required in Dataverse
  • PrivateUrlUsers have names of the form "PREFIX + datasetId" where PREFIX was '#'. Using this name in a signedUrl is problematic as '#' indicates the end of a URL (and start of an anchor) unless escaped.
  • The SignedUrl validation mechanism assumed only AuthenticatedUsers were valid and did not have logic to look up the tokens for PrivateUrlUsers.
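
The '#' problem in the second bullet can be reproduced with the JDK's own URI parser. The following is a minimal sketch, not code from this PR: the class name, the example URLs, and the dataset id 42 are assumptions chosen for illustration.

```java
import java.net.URI;

class HashInSignedUrl {
    // Returns the query string a standard URI parser extracts from the URL.
    static String queryOf(String url) {
        return URI.create(url).getQuery();
    }

    // Returns the fragment (the part after an unescaped '#'), or null if there is none.
    static String fragmentOf(String url) {
        return URI.create(url).getFragment();
    }

    public static void main(String[] args) {
        // Hypothetical signed URLs for a PrivateUrlUser on dataset 42.
        String withHash = "http://localhost:8080/api/v1/access/datafile/8?user=#42&method=GET";
        String withBang = "http://localhost:8080/api/v1/access/datafile/8?user=!42&method=GET";

        System.out.println("query with '#':    " + queryOf(withHash));
        System.out.println("fragment with '#': " + fragmentOf(withHash));
        System.out.println("query with '!':    " + queryOf(withBang));
        System.out.println("fragment with '!': " + fragmentOf(withBang));
    }
}
```

With an unescaped '#', the parser treats everything after it as a fragment, so the `user` value comes out empty and the remaining parameters are no longer part of the query. With '!' the query stays intact.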

Which issue(s) this PR closes:

Special notes for your reviewer:
Re: the PrivateUrlUser PREFIX - I chose to change this to '!' rather than use a character that has to be escaped (given that other rounds of escaping/unescaping may be happening). The only place this PREFIX was recorded is in the roleassignment table, and a flyway update script is given. I don't think these were exposed anywhere - perhaps in some api call? Alternatively, the username used for SignedUrl purposes could be different from what is used in the db, but that seemed less straightforward. (One could go without a PREFIX in signedUrl usernames, but it seems useful to be able to distinguish PrivateUrlUsers without having to see if the rest of the name can be parsed as a datasetId.)
W.r.t. #10045/autogenerating api keys - we do this in some places now; this PR adds it for dataset-level config tools. With this PR enabling signedUrls for PrivateUrlUsers and a planned PR to convert the existing Previewers to use signedUrls, it seems like we're close to a model where we could keep the API key hidden unless the user wanted to use it explicitly. That may be a good way to address the concerns in #9898.

Suggestions on how to test this:
Configure a file previewer to use signedUrls and verify that the signedUrls provided work on a draft file (or restricted file) for a normal user and for a PrivateUrlUser, both in the file page and when launched as a separate page.
The signedUrls can be verified manually or by using one of the previewers now enabled for signedUrls (new PR being added to https://github.com/gdcc/dataverse-previewers). Manual verification involves looking at the URL used to launch the previewer in the browser console/network tab, getting the callback parameter, base64 decoding it, using that signedUrl in the browser to retrieve the params and allowedApiCall signedUrls, and then verifying that they also work. (I can do a quick walkthrough for anyone.)
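
The "get the callback parameter and base64 decode it" step of the manual verification can be sketched as follows. This is an illustrative helper, not part of Dataverse: the class and method names and the example launch URL are hypothetical, and it assumes the callback value uses the standard Base64 alphabet and is not additionally percent-encoded.

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class CallbackDecoder {
    // Pulls the "callback" query parameter out of a tool launch URL
    // and Base64-decodes it into the signed URL it carries.
    static String decodeCallback(String launchUrl) {
        String query = URI.create(launchUrl).getRawQuery();
        if (query != null) {
            for (String pair : query.split("&")) {
                // Split on the first '=' only: Base64 padding also uses '='.
                int eq = pair.indexOf('=');
                if (eq > 0 && pair.substring(0, eq).equals("callback")) {
                    byte[] decoded = Base64.getDecoder().decode(pair.substring(eq + 1));
                    return new String(decoded, StandardCharsets.UTF_8);
                }
            }
        }
        throw new IllegalArgumentException("no callback parameter found");
    }

    public static void main(String[] args) {
        // Build a hypothetical launch URL by encoding a placeholder signed URL.
        String signedUrl = "http://localhost:8080/api/v1/files/8/metadata/7/toolparams/2?user=dataverseAdmin&method=GET";
        String encoded = Base64.getEncoder()
                .encodeToString(signedUrl.getBytes(StandardCharsets.UTF_8));
        String launchUrl = "http://localhost:8080/previewer.html?callback=" + encoded + "&locale=en";

        System.out.println(decodeCallback(launchUrl));
    }
}
```

The decoded value is the signedUrl to open in the browser for the next step of the check.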
Re: #10045 - this is needed for TurboCurator, and using the TurboCurator configuration currently set up on the beta machine would be easiest (i.e. use the same configuration JSON as there to register the TurboCurator tool on your test instance). As with previewers, manually verifying that the signedUrls returned work would be enough. (AFAIK, TurboCurator is not yet publicly visible.)

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

Is there a release notes update needed for this change?: I've included some notes - not sure what rises to the level of being called out. I also went ahead and made the notes reference new Previewers that can use signedUrls. While those aren't in this PR and aren't part of a 6.1 release per se, they will work with 6.1. (The PrivateUrl fix here means that 6.1 is the first version where they will work for PrivateUrl users; they should work for normal authenticated users in 5.14/6.0 as well.)

Additional documentation:

https://dataverse-guide--10098.org.readthedocs.build/en/10098/api/external-tools.html

@qqmyers qqmyers added Size: 3 A percentage of a sprint. 2.1 hours. Size: 10 A percentage of a sprint. 7 hours. and removed Size: 3 A percentage of a sprint. 2.1 hours. labels Nov 3, 2023
@qqmyers qqmyers added this to Ready for Review ⏩ in IQSS/dataverse (TO BE RETIRED / DELETED in favor of project 34) via automation Nov 3, 2023
@qqmyers qqmyers added this to the 6.1 milestone Nov 3, 2023
@pdurbin
Member

pdurbin commented Nov 11, 2023

@pdurbin pdurbin moved this from Ready for Review ⏩ to In Review 🔎 in IQSS/dataverse (TO BE RETIRED / DELETED in favor of project 34) Nov 21, 2023
@pdurbin pdurbin self-assigned this Nov 21, 2023
@coveralls

coveralls commented Nov 21, 2023

Coverage Status

coverage: 20.007% (+0.004%) from 20.003%
when pulling 5511946 on QualitativeDataRepository:IQSS/10093-signedUrls_with_privateUrlUsers
into 454e0bb on IQSS:develop.

Member

@pdurbin pdurbin left a comment

The release note is a little confusing. Otherwise, the code is looking good. I haven't tested it but API tests are passing.

Comment on lines +969 to +972
if (user instanceof AuthenticatedUser) {
apiToken = getValidApiTokenForAuthenticatedUser((AuthenticatedUser) user);
} else if (user instanceof PrivateUrlUser) {
PrivateUrlUser privateUrlUser = (PrivateUrlUser) user;
Member

Netbeans is suggesting the fancy new Java 14 instanceof automatic casting feature ( https://blogs.oracle.com/javamagazine/post/pattern-matching-for-instanceof-in-java-14 ), but I'll leave it alone.
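
For readers unfamiliar with the feature mentioned above, here is a minimal sketch of pattern matching for `instanceof`. It requires Java 16 or later, where the feature became final (it was a preview in Java 14). The `User`, `AuthenticatedUser`, and `PrivateUrlUser` types below are simplified stand-ins with hypothetical fields, not the real Dataverse classes.

```java
class InstanceofPatternSketch {
    // Simplified stand-ins for the Dataverse user types (hypothetical fields).
    interface User {}

    static class AuthenticatedUser implements User {
        final String identifier;
        AuthenticatedUser(String identifier) { this.identifier = identifier; }
    }

    static class PrivateUrlUser implements User {
        final long datasetId;
        PrivateUrlUser(long datasetId) { this.datasetId = datasetId; }
    }

    static String describe(User user) {
        // The binding variables (au, pu) replace the explicit casts
        // that would otherwise follow each instanceof check.
        if (user instanceof AuthenticatedUser au) {
            return "authenticated:" + au.identifier;
        } else if (user instanceof PrivateUrlUser pu) {
            return "privateUrl:" + pu.datasetId;
        }
        return "other";
    }

    public static void main(String[] args) {
        System.out.println(describe(new AuthenticatedUser("dataverseAdmin")));
        System.out.println(describe(new PrivateUrlUser(42L)));
    }
}
```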

@@ -12,7 +12,7 @@
*/
public class PrivateUrlUser implements User {

- public static final String PREFIX = "#";
+ public static final String PREFIX = "!";
Member

I don't remember why I picked # but I think it's fine to change it (I'm glad there's a flyway script).

Comment on lines 64 to 67
if(!userId.startsWith(PrivateUrlUser.PREFIX)) {
targetUser = authSvc.getAuthenticatedUser(userId);
userApiToken = authSvc.findApiTokenByUser((AuthenticatedUser)targetUser);
} else {
Member

Indentation is off. I'll push a fix.
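
The prefix check in the snippet under review relies on PrivateUrlUser names having the form PREFIX + datasetId. A minimal sketch of that convention, assuming the new '!' prefix from this PR, is shown below. The class and helper names are hypothetical, and the lookup of the user's token that the real validation code performs is not shown.

```java
class PrivateUrlPrefixSketch {
    // Mirrors the new PREFIX value from this PR; everything else here is hypothetical.
    static final String PREFIX = "!";

    // True when the user id follows the PrivateUrlUser naming convention.
    static boolean isPrivateUrlUser(String userId) {
        return userId != null && userId.startsWith(PREFIX);
    }

    // Extracts the dataset id from a name of the form PREFIX + datasetId.
    static long datasetIdOf(String userId) {
        if (!isPrivateUrlUser(userId)) {
            throw new IllegalArgumentException("not a PrivateUrlUser id: " + userId);
        }
        return Long.parseLong(userId.substring(PREFIX.length()));
    }

    public static void main(String[] args) {
        System.out.println(isPrivateUrlUser("dataverseAdmin"));
        System.out.println(isPrivateUrlUser("!42"));
        System.out.println(datasetIdOf("!42"));
    }
}
```

Keeping the prefix lets the validation code branch on the name alone, without first trying to parse the whole name as a dataset id.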

@@ -0,0 +1,5 @@
A new version of the standard Dataverse Previewers from https://github.com/gdcc/dataverse-previewers is available. The new version supports the use of signedUrls rather than API keys when previewing restricted files (including files in draft dataset versions). Upgrading is highly recommended.
Member

I'm a little confused by this. When Dataverse 6.1 is released, will Dataverse Previewers 1.4 be released?

Member Author

That is my intent. Feel free to hedge.

@pdurbin pdurbin changed the title from "IQSS/10093 signed urls with private url users" to "signed URLs with private url users, autogenerate API token" Nov 21, 2023
@pdurbin
Member

pdurbin commented Nov 21, 2023

The signedUrls can be verified manually or by using one of the previewers now enabled for signedUrls (new PR being added to https://github.com/gdcc/dataverse-previewers).

Has this new PR been added yet? I can't find it.

@pdurbin
Member

pdurbin commented Nov 21, 2023

@qqmyers can you please provide a tool and manifest to test with? I tried simply modifying the existing text previewer but it doesn't work. Here's the manifest I used:

curl -X POST -H 'Content-type: application/json' http://localhost:8080/api/admin/externalTools -d \
'{
  "displayName":"Signed URL Read Text",
  "description":"Read the text file.",
  "toolName":"textPreviewer",
  "scope":"file",
  "types":["preview"],
  "toolUrl":"http://localhost:8080/dataexplore/dataverse-previewers/previewers/v1.3/TextPreview.html",
  "toolParameters": {
      "queryParameters":[
        {"fileid":"{fileId}"},
        {"siteUrl":"{siteUrl}"},
        {"datasetid":"{datasetId}"},
        {"datasetversion":"{datasetVersion}"},
        {"locale":"{localeCode}"}
      ]
    },
  "allowedApiCalls": [
    {
      "name":"retrieveDataFile",
      "httpMethod":"GET",
      "urlTemplate":"/api/v1/access/datafile/{fileId}",
      "timeOut":270
    }
  ],
  "contentType":"text/plain"
}'

The URL in my browser (if it helps): http://localhost:8080/dataexplore/dataverse-previewers/previewers/v1.3/TextPreview.html?callback=aHR0cDovL2xvY2FsaG9zdDo4MDgwL2FwaS92MS9maWxlcy84L21ldGFkYXRhLzcvdG9vbHBhcmFtcy8yP3VudGlsPTIwMjMtMTEtMjFUMjA6Mjc6NTMuOTU3JnVzZXI9ZGF0YXZlcnNlQWRtaW4mbWV0aG9kPUdFVCZ0b2tlbj01NTJiYzZkNmUwZTk1NTNmNWI4ZmI2YTU4NTg5MWYyZmE1ZTdiM2JiMDlhN2JmZThmNjUyMmNmZmJmYmI2OWJjYjAyNzdjYjEzZGQ1NGUyYzBlODYwY2E4ZDM4ZWM5ZjVkMDc0MWJmNDUwMWNhYzVjZWUyZmY4MzU5ZDQwMjZiMQ==&locale=en

What I see in the browser:

Screenshot 2023-11-21 at 3 27 57 PM

@qqmyers
Member Author

qqmyers commented Nov 22, 2023

The new previewers aren't out yet, so using the manual method described in how to test would be the best option. It looks like you have a valid config - the response you get has the base64 encoded callback param, which, when decoded, shows:
http://localhost:8080/api/v1/files/8/metadata/7/toolparams/2?until=2023-11-21T20:27:53.957&user=dataverseAdmin&method=GET&token=552bc6d6e0e9553f5b8fb6a585891f2fa5e7b3bb09a7bfe8f6522cffbfbb69bcb0277cb13dd54e2c0e860ca8d38ec9f5d0741bf4501cac5cee2ff8359d4026b1
If you had used that URL before it timed out, you would have received JSON containing the specific signed URL you requested. That URL should also work. The PR is primarily about making sure this also works with a PrivateUrl user; it should also cover the dataset-level config tool issue (though I'm not sure I tested that).
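
To make the structure of the decoded URL above easier to inspect, the following sketch splits its query string into the `until`, `user`, `method`, and `token` parameters and checks whether `until` has passed. It is an illustration only: the class and method names are hypothetical, the time zone of `until` is not stated in this thread (the caller supplies a comparable local time), and the sketch does not attempt to validate the token.

```java
import java.net.URI;
import java.time.LocalDateTime;
import java.util.LinkedHashMap;
import java.util.Map;

class SignedUrlParams {
    // Splits the query string of a signed URL into name/value pairs.
    static Map<String, String> parse(String signedUrl) {
        Map<String, String> params = new LinkedHashMap<>();
        String query = URI.create(signedUrl).getQuery();
        if (query == null) {
            return params;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                params.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return params;
    }

    // True when "now" is later than the URL's "until" timestamp.
    static boolean isExpired(String signedUrl, LocalDateTime now) {
        String until = parse(signedUrl).get("until");
        if (until == null) {
            throw new IllegalArgumentException("signed URL has no 'until' parameter");
        }
        return now.isAfter(LocalDateTime.parse(until));
    }

    public static void main(String[] args) {
        // Same shape as the decoded URL in this thread; the token is a placeholder.
        String url = "http://localhost:8080/api/v1/files/8/metadata/7/toolparams/2"
                + "?until=2023-11-21T20:27:53.957&user=dataverseAdmin&method=GET&token=PLACEHOLDER";
        System.out.println(parse(url));
        System.out.println(isExpired(url, LocalDateTime.now()));
    }
}
```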

Member

@pdurbin pdurbin left a comment

Seems to work fine, including creating an API token. Merging.

IQSS/dataverse (TO BE RETIRED / DELETED in favor of project 34) automation moved this from In Review 🔎 to Ready for QA ⏩ Nov 27, 2023
@pdurbin pdurbin merged commit 410eb45 into IQSS:develop Nov 27, 2023
12 checks passed
IQSS/dataverse (TO BE RETIRED / DELETED in favor of project 34) automation moved this from Ready for QA ⏩ to Done 🚀 Nov 27, 2023
Labels
Size: 10 A percentage of a sprint. 7 hours.
Projects
No open projects
3 participants