Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8191 require toa or req access #8308

Merged
merged 98 commits into from
Apr 20, 2022
Merged

Conversation

sekmiller
Copy link
Contributor

@sekmiller sekmiller commented Dec 13, 2021

What this PR does / why we need it:
For datasets that have restricted files we want to require that Request Access is enabled or Terms of Access are provided. For newly created datasets we are setting Request Access to true. For any existing datasets with restricted files we are putting up a banner with the new requirements and a link to the edit terms page. In all cases where there are restricted files we are disabling the applicable save button until the requirement is met. Also for those existing datasets in draft status with restricted files and out of compliance terms we are disabling the Publish/Submit button until the dataset is in compliance.

Which issue(s) this PR closes:

Closes #8191 Require Selection of enable access request or adding Terms of Access for restricted files.

Special notes for your reviewer: There was an existing validator on Terms of Use and Access that really wasn't doing anything that I enabled for this. I also tried to cache the tests for has restricted files and has valid terms so that these aren't recalculated too often. Some changes had to be made to various apis and constructors because the terms of use and access validator relies on knowing its dataset version (for getting "has restricted files") and to implement the new default value (true) of RequestAccess

Suggestions on how to test this: Since any new datasets created in this branch will be in compliance you may have to search for old datasets with restricted files that don't have Request Access or TOA. Here's a query I used to find prior datasets:

select max(v.id), dvo.id from datasetversion v
join termsofuseandaccess ua on ua.id = v.termsofuseandaccess_id
join filemetadata fm on v.id = fm.datasetversion_id
join datafile f on f.id = fm.datafile_id
join dvobject dvo on v.dataset_id = dvo.id
where ua.fileaccessrequest = false and ua.termsofaccess isnull
and f.restricted = true
group by v.dataset_id, dvo.id

You could also just use an update query on a compliant dataset to set termsofuseandaccess.requestaccess to false and
termsofuseandaccess.termsofaccess to blank or null.

There were also some resfactoring changes to simplify code related to popups - so it would be good to test delete and other functionality from the edit file page.

Does this PR introduce a user interface change? If mockups are available, please link/include them here:
We have met with the design team about the UI changes.

Screen Shot 2021-12-13 at 1 53 28 PM

Screen Shot 2021-12-13 at 10 07 19 AM

Is there a release notes update needed for this change?: We need to highlight that the new default is for datasets to allow access requests and what will happen to existing noncompliant datasets. I will write a first draft.

Additional documentation: For the Harvard Dataverse we'd like to get in touch with the owners of noncompliant datasets with an email that includes a link to editing the Terms. I will work on that. (starting with that query above)

@djbrooke
Copy link
Contributor

Thanks @sekmiller! Is this ready for review?

@coveralls
Copy link

coveralls commented Dec 14, 2021

Coverage Status

Coverage decreased (-0.03%) to 18.972% when pulling 3eb6f4b on 8191-require-TOA-or-req-access into 396b7f7 on develop.

@sekmiller
Copy link
Contributor Author

This query will create a list of email addresses and links to datasets for all users with an edit role on a dataset that has restricted files but does not allow access requests and does not have an entry for Terms of Access
BulkEmailQuery.txt

@sekmiller
Copy link
Contributor Author

The query will yield results like this. Note if there are several contributors for a given dataset there will be a line in the results table for each. Likewise if a contributor is responsible for multiple non-compliant datasets there will be a line for each with their email address and a link to the non-compliant dataset:

https://docs.google.com/spreadsheets/d/1xGqKutRtXuSKdPbTicaUqhr0rqsLq-1TGxHsk4Ntk2U/edit?usp=sharing

@djbrooke djbrooke self-assigned this Dec 14, 2021
@djbrooke
Copy link
Contributor

@sekmiller thanks for inviting me to the party. I've updated the release notes and I added the query to the scripts folder - let me know if you need anything else here!

@sekmiller
Copy link
Contributor Author

sekmiller commented Apr 13, 2022

thanks to Kevin for running the query to see which email addresses/users are associated with what number of non-compliant datasets. There are something like 470 users involved. The vast majority are in the low single digits when it comes to number of datasets. (more than 1/2 have only one, and nearly 400 are at 1, 2, or 3). At the upper end there are 5 with 100 or more datasets with the big winner being dataverseadmi-at-iq.harvard.edu with nearly 1000 datasets.
Full results attached:

Kevin suggested that I not make this list public. he has a good point, sorry.

@sekmiller sekmiller removed their assignment Apr 15, 2022
@pdurbin
Copy link
Member

pdurbin commented Apr 15, 2022

It looks like the latest Jenkins run failed but the one before that (83) is showing a couple SWORD tests failing:

  • edu.harvard.iq.dataverse.api.SwordIT.testXmlExampleInGuides
  • edu.harvard.iq.dataverse.api.SwordIT.testLicenses

Details at https://jenkins.dataverse.org/job/IQSS-Dataverse-Develop-PR/job/PR-8308/83/testReport/

Anything to worry about? Fixed already?

@sekmiller sekmiller self-assigned this Apr 19, 2022
@sekmiller sekmiller removed their assignment Apr 19, 2022
@sekmiller
Copy link
Contributor Author

Fixed the test Phil was alluding to above. Terms in incoming sword datasets weren't being associated with their dataset versions and that was breaking the version validation.

@kcondon kcondon self-assigned this Apr 19, 2022
@kcondon kcondon merged commit b864d1a into develop Apr 20, 2022
@kcondon kcondon deleted the 8191-require-TOA-or-req-access branch April 20, 2022 16:20
@pdurbin pdurbin added this to the 5.11 milestone May 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Require selection of enable access request or adding Terms of Access for restricted files
8 participants