2122 querying root dataverse contents (and other permission performance boosts) #4883

Merged
merged 36 commits into develop from 2122-querying-root-dataverse-contents on Dec 11, 2018

Conversation

oscardssmith
Contributor

This speeds up querying root dataverse contents by adding fast paths for non-users and superusers, who no longer get per-object permission checks, and by speeding up PermissionServiceBean with shortcut "has" methods that are more efficient than the methods that find all permissions for a user. I also removed redundant recursive calls to groupsFor, which speeds things up further.

These changes give the query acceptable performance for non-users and superusers, and bring the time for a regular user from ????????? down to 26 minutes (on my laptop). A similar speedup probably applies to most other calls into PermissionServiceBean, which may significantly speed up other, unrelated parts of the code.

The main thing this PR still needs is a set of integration tests for permissions involving groups and nested groups with inherited permissions, to make sure there are no functional regressions.

In addition, caching recent groupsFor calls should be able to bring the remaining case - users who are not superusers - down to a reasonable level.
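To make the description above concrete, here is a minimal, hypothetical sketch of the two ideas: fast paths that skip permission lookups entirely, and a short-circuiting "has" check instead of computing a user's full permission set first. All names and types below are illustrative stand-ins, not the actual PermissionServiceBean code.

```java
import java.util.List;
import java.util.Set;

class PermissionFastPathSketch {

    enum Permission { ViewUnpublishedDataverse, EditDataverse, DownloadFile }

    record User(boolean superuser, boolean authenticated) {}
    record RoleAssignment(Set<Permission> granted) {}

    // Shortcut "has" check: stops at the first assignment that grants the permission,
    // instead of first unioning every permission the user holds on the object.
    static boolean has(User user, List<RoleAssignment> assignments, Permission p) {
        if (user.superuser()) {
            return true;               // fast path: superusers skip the lookup entirely
        }
        if (!user.authenticated()) {
            return false;              // fast path: guests only ever see published content
        }
        return assignments.stream().anyMatch(ra -> ra.granted().contains(p));
    }

    public static void main(String[] args) {
        System.out.println(has(new User(true, true), List.of(), Permission.EditDataverse));   // true
        System.out.println(has(new User(false, false), List.of(), Permission.EditDataverse)); // false
    }
}
```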

List<Integer> roles = em.createNativeQuery(powerfull_roles).getResultList();
String x = "select id from dataverserole where (permissionbits&12)!=0";
return null;
}
Contributor Author

note to self: delete this method
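For readers puzzling over the (permissionbits&12)!=0 expression in the query above: each role stores its permissions as a bitmask, and 12 (binary 1100) selects the permissions whose enum ordinals are 2 and 3. A small self-contained sketch of the bit arithmetic, with a made-up enum ordering (the real Permission ordinals may differ):

```java
public class PermissionBitsSketch {

    // Hypothetical ordering; only the bit arithmetic is the point here.
    enum Permission { ViewUnpublishedDataverse, AddDataset, PublishDataset, ManagePermissions }

    static long bitFor(Permission p) {
        return 1L << p.ordinal();
    }

    public static void main(String[] args) {
        // Ordinals 2 and 3 -> bits 4 and 8 -> mask 12, matching the native query.
        long mask = bitFor(Permission.PublishDataset) | bitFor(Permission.ManagePermissions);
        System.out.println(mask); // 12

        long roleBits = bitFor(Permission.AddDataset) | bitFor(Permission.PublishDataset);
        System.out.println((roleBits & mask) != 0); // true: this role grants one of the "powerful" permissions
    }
}
```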

@coveralls

coveralls commented Jul 23, 2018

Coverage increased (+0.004%) to 17.681% when pulling 09b0e69 on 2122-querying-root-dataverse-contents into 2a02d7c on develop.

Contributor

@matthew-a-dunlap left a comment

The structure of the caching makes sense to me and seems reasonable

import java.util.concurrent.ConcurrentHashMap;
import java.util.logging.Logger;

public class TimeoutCache<K, V> implements Map<K, V> {
Contributor

It would be good to have a bit more explanation as to the use of this cache in a comment

Contributor Author

I figured that could wait until we made a final decision to use it.
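For context on the class being discussed, here is a minimal sketch of a time-limited cache of this general shape. The class name and the Map-like role come from the diff; the field names, the lifetimeMillis parameter, and the lazy eviction on read are assumptions, not the actual implementation.

```java
import java.util.concurrent.ConcurrentHashMap;

class TimeoutCacheSketch<K, V> {

    private record Entry<T>(T value, long expiresAtMillis) {}

    private final ConcurrentHashMap<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long lifetimeMillis;

    TimeoutCacheSketch(long lifetimeMillis) {
        this.lifetimeMillis = lifetimeMillis;
    }

    public V get(K key) {
        Entry<V> e = entries.get(key);
        if (e == null || e.expiresAtMillis() < System.currentTimeMillis()) {
            entries.remove(key);   // lazily drop expired entries on read
            return null;
        }
        return e.value();
    }

    public void put(K key, V value) {
        entries.put(key, new Entry<>(value, System.currentTimeMillis() + lifetimeMillis));
    }
}
```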

if (d instanceof DataFile && p.contains(Permission.DownloadFile)) {
// unrestricted files that are part of a release dataset
// automatically get download permission for everybody:
// -- L.A. 4.0 beta12
Contributor

Stale comments?

Contributor Author

don't think they're stale

Contributor

Probably just take out the L.A. 4.0 tag or rewrite it... it makes it seem like the code is ancient, yet it's new.

final Set<Permission> groupPremissions = permissionsForSingleRoleAssignee(g,dvo);
permissions.addAll(groupPremissions);
for (Group g : groupService.groupsFor(req, dvo)) {
permissionsForSingleRoleAssignee(g, dvo, permissions);
Contributor

I find it kind of confusing that in some versions of "permissionsForSingleRoleAssignee" you have to assign the return value to the permissions, while in other cases it is assigned to the passed-in object. Maybe it's better for them to all be uniform? Or maybe change the naming?

Contributor Author

Yeah, I just realized I introduced two different permissionsForSingleRoleAssignee variants that pass sets with completely different meanings. Will fix.
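To spell out the two conventions being discussed (with hypothetical signatures, not the real PermissionServiceBean methods): one variant returns a fresh set that the caller merges, the other fills a set the caller passes in.

```java
import java.util.EnumSet;
import java.util.Set;

class CallingConventionSketch {

    enum Permission { ViewUnpublishedDataset, EditDataset }

    // Variant 1: returns a new set; the caller is responsible for merging it.
    static Set<Permission> permissionsForSingleRoleAssignee(String assignee) {
        return EnumSet.of(Permission.ViewUnpublishedDataset);
    }

    // Variant 2: mutates the set passed in (an "out" parameter) and returns nothing.
    static void permissionsForSingleRoleAssignee(String assignee, Set<Permission> out) {
        out.add(Permission.ViewUnpublishedDataset);
    }

    public static void main(String[] args) {
        Set<Permission> permissions = EnumSet.noneOf(Permission.class);
        permissions.addAll(permissionsForSingleRoleAssignee("group"));  // style 1
        permissionsForSingleRoleAssignee("group", permissions);         // style 2
        System.out.println(permissions);
    }
}
```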

Contributor

@benjamin-martinez left a comment

Aside from the things that Matthew touched on, I think it looks good. I'm not going to give it approval because I can only follow the Java.



private static final Set<Permission> PERMISSIONS_FOR_AUTHENTICATED_USERS_ONLY
= EnumSet.copyOf(Arrays.asList(Permission.values()).stream()
Contributor

I'm assuming this was done on purpose/is good syntax.
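The idiom in the snippet above - streaming over all enum values, filtering, and wrapping the survivors in an EnumSet - is valid, with one caveat worth noting: EnumSet.copyOf(Collection) throws if the filtered collection ends up empty. A self-contained sketch of the pattern (the filter predicate here is made up; the real one in the diff is not shown):

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.Set;
import java.util.stream.Collectors;

class EnumSetIdiomSketch {

    enum Permission {
        ViewUnpublishedDataset(true), DownloadFile(false), EditDataset(true);

        private final boolean authenticatedOnly;

        Permission(boolean authenticatedOnly) { this.authenticatedOnly = authenticatedOnly; }

        boolean authenticatedOnly() { return authenticatedOnly; }
    }

    // Same shape as the constant in the diff: filter the full set of values,
    // then copy the survivors into an EnumSet for cheap membership checks.
    private static final Set<Permission> AUTHENTICATED_USERS_ONLY =
            EnumSet.copyOf(Arrays.asList(Permission.values()).stream()
                    .filter(Permission::authenticatedOnly)
                    .collect(Collectors.toList()));

    public static void main(String[] args) {
        System.out.println(AUTHENTICATED_USERS_ONLY); // [ViewUnpublishedDataset, EditDataset]
    }
}
```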

User user = req.getUser();

// quick cases
if (user.isSuperuser()) {
Contributor Author

Should a required.isEmpty() check also go here?
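A tiny sketch of how the isEmpty() guard asked about above could sit alongside the existing quick cases (hypothetical method and types; both early returns avoid touching role assignments or groups at all):

```java
import java.util.EnumSet;
import java.util.Set;

class QuickCaseSketch {

    enum Permission { ViewUnpublishedDataverse, EditDataverse }

    record User(boolean superuser) {}

    static boolean hasAll(User user, Set<Permission> required, Set<Permission> granted) {
        if (required.isEmpty()) {
            return true;            // nothing is required, so skip the lookup entirely
        }
        if (user.superuser()) {
            return true;            // superusers pass every permission check
        }
        return granted.containsAll(required);
    }

    public static void main(String[] args) {
        Set<Permission> none = EnumSet.noneOf(Permission.class);
        System.out.println(hasAll(new User(false), none, none)); // true, via the isEmpty fast path
    }
}
```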

…ent, or even no statements at all, when the closure seed is empty
Contributor

@landreev left a comment

I left a comment in the issue.
The code looks good to me - as much as I can tell from just looking at it. Considering how many changes to the logic there are in this PR, we probably need to come up with some comprehensive test for it; I don't think it's possible to confirm that it's all still doing what we want otherwise.

…ad of .tsv;

(the test was originally meant to test "normal", non-tabular files; there is a separate
test for replacing tabular ones. but now that we support ingest on tab-delimited
files, this test was essentially duplicating the other one. ref #2122)
@landreev
Contributor

landreev commented Oct 16, 2018

Thanks @pameyer, for the detailed report.
I checked in some fixes for all of the test failures below - EXCEPT ONE. (will add more info in a sec)

on ac11957:
```
Failed tests: testDeleteDatasetWhileFileIngesting(edu.harvard.iq.dataverse.api.DatasetsIT): Expected status code <201> doesn't match actual status code <500>.(..)
testMoveDataverse(edu.harvard.iq.dataverse.api.DataversesIT): Expected status code <200> doesn't match actual status code <500>.(..)
testJsonParserWithDirectoryLabels(edu.harvard.iq.dataverse.api.FileMetadataIT): Expected status code <201> doesn't match actual status code <500>.(..)
testJsonParserWithDirectoryLabels(edu.harvard.iq.dataverse.api.FileMetadataIT): Expected status code <200> doesn't match actual status code <404>.(..)
test_001_AddFileGood(edu.harvard.iq.dataverse.api.FilesIT): Expected status code <201> doesn't match actual status code <500>.(..)
testAddFileBadJson(edu.harvard.iq.dataverse.api.FilesIT): Expected status code <201> doesn't match actual status code <500>.(..)
test_005_AddFileBadPermissions(edu.harvard.iq.dataverse.api.FilesIT): Expected status code <201> doesn't match actual status code <500>.(..)
test_006_ReplaceFileGood(edu.harvard.iq.dataverse.api.FilesIT): Expected status code <201> doesn't match actual status code <500>.(..)
test_006_ReplaceFileGoodTabular(edu.harvard.iq.dataverse.api.FilesIT): Expected status code <201> doesn't match actual status code <500>.(..)
testForceReplace(edu.harvard.iq.dataverse.api.FilesIT): Expected status code <201> doesn't match actual status code <500>.(..)
test_007_ReplaceFileUnpublishedAndBadIds(edu.harvard.iq.dataverse.api.FilesIT): Expected status code <201> doesn't match actual status code <500>.(..)
test_008_ReplaceFileAlreadyDeleted(edu.harvard.iq.dataverse.api.FilesIT): Expected status code <201> doesn't match actual status code <500>.(..)
testReplaceFileBadJson(edu.harvard.iq.dataverse.api.FilesIT): Expected status code <201> doesn't match actual status code <500>.(..)
testAddTinyFile(edu.harvard.iq.dataverse.api.FilesIT): Expected status code <201> doesn't match actual status code <500>.(..)
testRestrictFile(edu.harvard.iq.dataverse.api.FilesIT): Expected status code <201> doesn't match actual status code <500>.(..)
testRestrictAddedFile(edu.harvard.iq.dataverse.api.FilesIT): Expected status code <201> doesn't match actual status code <500>.(..)
testAccessFacet(edu.harvard.iq.dataverse.api.FilesIT): Expected status code <201> doesn't match actual status code <500>.(..)
testCuratorSendsCommentsToAuthor(edu.harvard.iq.dataverse.api.InReviewWorkflowIT): XML path message doesn't match.(..)
testSearchPermisions(edu.harvard.iq.dataverse.api.SearchIT): Expected status code <201> doesn't match actual status code <500>.(..)
testCreateDataverseCreateDatasetUploadFileDownloadFileEditTitle(edu.harvard.iq.dataverse.api.SwordIT): Expected status code <400> doesn't match actual status code <500>.(..)
testCreateAndDeleteDatasetInRoot(edu.harvard.iq.dataverse.api.SwordIT): Expected status code <201> doesn't match actual status code <500>.(..)
testCreateDatasetPublishDestroy(edu.harvard.iq.dataverse.api.SwordIT): Expected status code <201> doesn't match actual status code <500>.(..)

Tests in error:
testCreateDataset(edu.harvard.iq.dataverse.api.DatasetsIT): Failed to parse the JSON document
testAddUpdateDatasetViaNativeAPI(edu.harvard.iq.dataverse.api.DatasetsIT): Failed to parse the JSON document
testSequentialNumberAsIdentifierGenerationStyle(edu.harvard.iq.dataverse.api.DatasetsIT): Failed to parse the JSON document
testPrivateUrl(edu.harvard.iq.dataverse.api.DatasetsIT): Failed to parse the JSON document
testFileChecksum(edu.harvard.iq.dataverse.api.DatasetsIT): Failed to parse the JSON document
testSearchCitation(edu.harvard.iq.dataverse.api.SearchIT): Failed to parse the JSON document
testDatasetThumbnail(edu.harvard.iq.dataverse.api.SearchIT): Failed to parse the JSON document
testIdentifier(edu.harvard.iq.dataverse.api.SearchIT): Failed to parse the JSON document
testDeleteFiles(edu.harvard.iq.dataverse.api.SwordIT): Failed to parse the XML document

Tests run: 74, Failures: 22, Errors: 9, Skipped: 0
```

@oscardssmith
Contributor Author

Honestly, ~90% of the changes here could be removed while maintaining the performance improvement. I would do it, but I no longer have permissions.

@landreev
Contributor

@oscardssmith
Hi Oscar,
Thanks for the offer of cleanup.
When you say that "90% of the changes here could be removed" - was that the plan for what Michael was going to do with this branch? Michael didn't have time to work on it, so the PR was sitting there with his name on it; and I felt increasingly bad about not being able to merge it, so I finally picked it up.
But note that the cleanup was not the most pressing issue with this branch; it was put on hold because of all the failing tests (that Pete reported in early Sep., above).
I spent some time looking into it; most of the errors and 500s were caused by just two issues with two named queries.
Now all of the tests from that list are passing, except for this one: testCreateDatasetPublishDestroy in SwordIT - and this one appears to be a real problem.
It's no longer throwing a 500, as in the original report (that was fixed by one of my fixes). It gets further and bombs on line 594, when the test tries to list the contents of a dataverse that it has created, populated and published. So the error is right in the API for which this issue was opened:
/api/dataverses/.../contents
And it actually appears to be a permission issue.
Once again, it's a published dataverse, with a published dataset, so everybody should be able to see it.
If I list its contents as the superuser, like this:

 curl "http://localhost:8080/api/dataverses/dv56233a79/contents?key=SUPERUSER_KEY"

I get back its contents correctly. However, if I try it as the user who created the dataverse and the dataset (which is what's happening in the failing test), I get nothing:

curl "http://localhost:8080/api/dataverses/dv56233a79/contents?key=USER_WHO_CREATED_THE_DATAVERSE"
{"status":"OK","data":[]}

(Same thing when I try it without any API key.)
Once again, this is the API for which the issue was originally opened, and it doesn't appear to be working.
The entries in the RoleAssignment table appear to be correct for the dataverse and dataset in question. But both are published - so they should be visible to everybody anyway.

I can't immediately tell what's going on. I was hoping it was some kind of obvious conflict with something that was introduced into the branch since you worked on it, as it was synced up with develop; but I'm not seeing anything obvious.

Do you have any pointers?
If we have already taken you off the dev group on GitHub, I can definitely add you back. To be clear, no pressure/expectations for you to actually work on this (obviously). But if you have anything to suggest or contribute, it'll be appreciated.

I'm sure I can figure it out. But I'm still hoping it may be something simple you could point out immediately...

Hope you're well!

@pdurbin
Member

pdurbin commented Oct 16, 2018

@landreev just checking that you know about pull request #4998, which includes commits from @oscardssmith (which are also in this pull request, #4883) followed by commits by @michbarsinai.

@landreev
Contributor

Oh, yes - what is the relationship between this branch and #4998? Does it have everything this branch has? Should we abandon this branch?
I'll test the functionality in #4998 and see if the API in question is working there (but my understanding was that that PR was for cleanup only, and should not affect how it actually works...?)

@oscardssmith
Contributor Author

What I would recommend is abandoning both this and Michael's PRs, then cherry-picking the docs changes and the change to how the group transitive closure is computed (including removing the double recursion), and leaving the rest. Those changes account for almost all of the functional improvements.
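For reference, a sketch of what computing the group transitive closure with a work queue (rather than nested recursion) can look like, so that each group is expanded at most once. The group names and the parentsOf lookup are hypothetical stand-ins for the real groupsFor machinery:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class GroupClosureSketch {

    static Set<String> closure(Set<String> seed, Map<String, Set<String>> parentsOf) {
        Set<String> closed = new HashSet<>(seed);
        Deque<String> queue = new ArrayDeque<>(seed);
        while (!queue.isEmpty()) {
            String g = queue.pop();
            for (String parent : parentsOf.getOrDefault(g, Set.of())) {
                if (closed.add(parent)) {   // each group is expanded at most once
                    queue.push(parent);
                }
            }
        }
        return closed;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> parents = Map.of(
                "explicit-group-a", Set.of("explicit-group-b"),
                "explicit-group-b", Set.of("explicit-group-c"));
        System.out.println(closure(Set.of("explicit-group-a"), parents));
        // [explicit-group-a, explicit-group-b, explicit-group-c] (order may vary)
    }
}
```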

@oscardssmith
Contributor Author

The rewritten method for child permissions might be nice too, as that gives a notable speedup for that one case.

@landreev
Contributor

@oscardssmith I'd be ok with cherry-picking the changes that we need from the branches. But do you have any ideas/suggestions for the fact that the code in the branch just doesn't seem to be working?
It's not just the issue of cleaning up/getting rid of the code we don't need. There's something wrong with the core code that we actually want to keep, in its current state.

I tried building @michbarsinai's branch (#4998), but it appears to be even more broken. There I can't even initialize a new database - trying to create the top-level root dataverse with /api/dataverses in setup-all fails with an internal server error. The stack trace suggests that it's also permission-service related.

@michbarsinai
Member

michbarsinai commented Oct 18, 2018 via email

The branches are identical up to commit 05b704b (August 10). #4998 still has those breaking @NotNull annotations, which may cause the "more broken" state.

-- Michael

@landreev
Contributor

landreev commented Nov 7, 2018

@michbarsinai For the record, no, this appears to have nothing to do with those NotNull annotations. Adding that fix to the "cleanup" branch doesn't change the behavior in question - you still can't create the initial, root dataverse with this build. I'm assuming it's because somewhere in the code something expects the root dataverse to always be present in order to perform any permission checks? I'm back to debugging the original, "un-cleaned-up" branch, but will take another look at this too.

…(), the method that serves ListDataverseContentCommand and the /dataverses/*/content api. (#2122)
@kcondon kcondon merged commit 881694b into develop Dec 11, 2018
@kcondon kcondon deleted the 2122-querying-root-dataverse-contents branch December 11, 2018 22:10
@oscardssmith
Contributor Author

Thanks for getting this across the finish line, @landreev. Do we have a final measurement for getting the children of the root of the Harvard Dataverse with this? Also, has anyone checked the effect on re-indexing?

@pdurbin
Member

pdurbin commented Dec 12, 2018

Now that this pull request has been merged, we can't run the API tests on the phoenix server: #5393

@kcondon
Contributor

kcondon commented Dec 12, 2018

Looks like this is failing, using the builtin secret key. Maybe that was overlooked?
curl -H "Content-type:application/json" -d @data/role-admin.json http://localhost:8080/api/admin/roles/

@landreev
Contributor

@oscardssmith
Yes, I'm glad we were able to move this PR finally; thanks for all your work on it again.
There is unfortunately another problem we have somehow introduced (above), with tests failing on the dev. branch since this was merged. The entire restassured test set and the setup scripts were definitely working for me two weeks ago, when I was actively working on the branch; so it must be a conflict introduced in one of the most recent syncs with the develop branch... Will be looking into it today.

But, answering your question: yes, I ran some performance tests. I made a script for recursive crawling and ran it on the entire prod. dataverse, as different users, and on individual dataverses. I'll post the numbers; they seem sensible, but most importantly it is something you can do now - whereas before it was just not practical at all to crawl the entire root dataverse holdings.

(I'll post the numbers in the issue, not here.)

I haven't tested the effects on reindexing; I probably should, and will. But I honestly don't expect much of an effect there. Full reindexing is always done as the superuser, and there is no recursive crawling involved (instead of going through dataverses recursively, it just gets a linear list of all the datasets from the database and goes through them). I also believe we have some evidence that most of the time there is actually spent doing the indexing itself... But as I said, I'll test it, to know for sure.

@landreev
Contributor

@oscardssmith (I'm being reminded that we are actually indexing the permissions - so it's not just the cost of getting to the object.) I still believe most of the time spent there is that of indexing the metadata. But as I said, I'll test and report.
There were some performance anomalies there - some datasets taking too long for the amount of indexable content in them. So who knows - definitely worth checking.

@landreev
Contributor

@pdurbin et al - there's a chance last night's test failure was a result of a Jenkins and/or phoenix fluke; I've added a comment in #5393.
