Error submitting or deleting items - URI too long when user is in a large number of groups #9544

Closed
ddean4040 opened this issue May 6, 2024 · 6 comments · Fixed by #9548
Labels: bug, component: submission, high priority, performance / caching, testathon

@ddean4040

Describe the bug
When submitting or deleting an item, a user who belongs to a large number of groups but is not an admin will get an error message and the operation will fail.

The REST service logs:

2024-05-01 14:32:57,281 INFO  ... ... org.dspace.app.rest.utils.DSpaceAPIRequestLoggingFilter @ Before request [GET /server/api/discover/search] originated from /mydspace?configuration=workspace
2024-05-01 14:32:57,300 ERROR ... ... org.dspace.app.rest.exception.DSpaceApiExceptionControllerAdvice @ An exception has occurred (status:500)
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr:8983/solr/search: Expected mime type application/octet-stream but got text/html. <h1>Bad Message 414</h1><pre>reason: URI Too Long</pre>
        at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:635) ~[solr-solrj-8.11.2.jar:8.11.2 ... - mdrob - 2022-06-13 11:27:56]
        at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:266) ~[solr-solrj-8.11.2.jar:8.11.2 ... - mdrob - 2022-06-13 11:27:56]
        at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:248) ~[solr-solrj-8.11.2.jar:8.11.2 ... - mdrob - 2022-06-13 11:27:56]
        at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:214) ~[solr-solrj-8.11.2.jar:8.11.2 ... - mdrob - 2022-06-13 11:27:56]
        at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:1003) ~[solr-solrj-8.11.2.jar:8.11.2 ... - mdrob - 2022-06-13 11:27:56]
        at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:1018) ~[solr-solrj-8.11.2.jar:8.11.2 ... - mdrob - 2022-06-13 11:27:56]
        at org.dspace.content.EntityTypeServiceImpl.getSubmitAuthorizedTypes(EntityTypeServiceImpl.java:151) ~[dspace-api-7.6.1.jar:7.6.1]

It looks like the query in EntityTypeServiceImpl.java is using GET instead of POST (it does not specify solrSearchCore.REQUEST_METHOD). We were able to work around the issue for now by increasing the max header size in Solr.

This is on DSpace 7.6.1.

To Reproduce
Steps to reproduce the behavior:

  1. Add a user to a large number of groups (our user is in 550 groups) but not the Administrator group
  2. Try to submit or delete an item

Expected behavior
Operation completes successfully.

Related work
Looks similar to this old bug: #6732

ddean4040 added the bug and needs triage labels May 6, 2024

tdonohue (Member) commented May 6, 2024

This bug sounds similar to #9201 (possibly the same issue?) and both might be caused by #9164. We have a possible fix for #9164 in #9508 (and the PRs linked from that PR).

tdonohue added the component: submission label May 6, 2024

ddean4040 (Author) commented

The URLs that threw 500s when our user ran into this issue were:

  • /server/api/core/entitytypes/search/findAllByAuthorizedExternalSource
  • /server/api/core/entitytypes/search/findAllByAuthorizedCollection

I haven't had any reports of slowness in the workflow tasks view, but I'll ask around.

tdonohue (Member) commented May 6, 2024

@ddean4040 : Thanks for the additional details. Based on those details, I've determined this is a new issue (not a duplicate).

The core problem appears to be in EntityTypeServiceImpl.getSubmitAuthorizedTypes() (which is used by both of the endpoints you mentioned). That method appears to append every group you are a member of to a (potentially very large) Solr query. See the code at: https://github.com/DSpace/DSpace/blob/main/dspace-api/src/main/java/org/dspace/content/EntityTypeServiceImpl.java#L134-L139

So, this appears to be a very inefficient query, especially if you are a member of a large number of groups.
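
To get a feel for how quickly such a query grows, here is a minimal Java sketch. It assumes one OR clause per group with a UUID-sized identifier, in the shape `submit:(e<user> OR g<group> ...)`. That clause shape, the class name `GroupQueryLength`, and the 8192-byte limit are illustrative assumptions, not values taken from the DSpace source or from any particular Solr configuration.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

class GroupQueryLength {
    // Assumed request header limit, for comparison only; check your own Solr/Jetty settings.
    static final int ASSUMED_HEADER_LIMIT = 8192;

    // Hypothetical helper: one clause for the user plus one OR clause per group.
    static String buildQuery(int groupCount) {
        StringBuilder query = new StringBuilder("submit:(e").append(UUID.randomUUID());
        for (int i = 0; i < groupCount; i++) {
            query.append(" OR g").append(UUID.randomUUID());
        }
        return query.append(")").toString();
    }

    // Length of the query once URL-encoded, as it would appear in a GET query string.
    static int encodedLength(String query) {
        return URLEncoder.encode(query, StandardCharsets.UTF_8).length();
    }

    public static void main(String[] args) {
        for (int groups : new int[] {10, 100, 550}) {
            int length = encodedLength(buildQuery(groups));
            System.out.println(groups + " groups -> " + length + " encoded characters"
                + (length > ASSUMED_HEADER_LIMIT ? " (over the assumed limit)" : ""));
        }
    }
}
```

With 550 groups, the encoded query alone comes to well over 20,000 characters under these assumptions, so a GET request can be rejected with a 414 before Solr ever parses the query.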

Pulling this over to our 8.0 board in search of a volunteer.

tdonohue added the high priority, performance / caching, and testathon labels and removed the needs triage label May 6, 2024

tdonohue (Member) commented May 6, 2024

@ddean4040 : I just realized your analysis in the description is 100% correct. This problem should be fixed if we send this query to Solr via a POST instead of a GET. In other areas of DSpace, we are primarily using POST queries.

I'll create a small PR which should fix this.
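
The effect of switching methods can be sketched without a running Solr. The example below uses only the JDK's `java.net.http` classes to build the same query as a GET and as a POST, then compares the request URIs. The `/select` path, the class name `GetVersusPost`, and the clause shape are assumptions for illustration; this is not the actual DSpace patch.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

class GetVersusPost {
    // Host and core come from the log above; the /select handler path is an assumption.
    static final String SELECT_URL = "http://solr:8983/solr/search/select";

    // Hypothetical query with one OR clause per group.
    static String longQuery(int groupCount) {
        StringBuilder query = new StringBuilder("submit:(e").append(UUID.randomUUID());
        for (int i = 0; i < groupCount; i++) {
            query.append(" OR g").append(UUID.randomUUID());
        }
        return query.append(")").toString();
    }

    static String formEncode(String query) {
        return "q=" + URLEncoder.encode(query, StandardCharsets.UTF_8);
    }

    // GET: every parameter ends up in the request URI.
    static HttpRequest asGet(String query) {
        return HttpRequest.newBuilder(URI.create(SELECT_URL + "?" + formEncode(query)))
            .GET()
            .build();
    }

    // POST: parameters travel in the form-encoded body, so the URI stays short.
    static HttpRequest asPost(String query) {
        return HttpRequest.newBuilder(URI.create(SELECT_URL))
            .header("Content-Type", "application/x-www-form-urlencoded")
            .POST(HttpRequest.BodyPublishers.ofString(formEncode(query)))
            .build();
    }

    public static void main(String[] args) {
        String query = longQuery(550);
        // No request is sent; the sketch only inspects the built requests.
        System.out.println("GET URI length:  " + asGet(query).uri().toString().length());
        System.out.println("POST URI length: " + asPost(query).uri().toString().length());
    }
}
```

In SolrJ itself, the request method can be chosen per call with the `SolrClient.query(params, SolrRequest.METHOD.POST)` overload, which moves the parameters out of the URI in the same way.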

tdonohue (Member) commented May 6, 2024

Small PR which should fix the issue: #9548. @ddean4040 : If you have a chance to verify/test this small fix, please do let us know your findings.

tdonohue self-assigned this May 6, 2024

alanorth (Contributor) commented May 7, 2024

I hadn't noticed these, but we do have a large repository with a number of users who belong to many groups. Sure enough, I see over a thousand of these errors in the past month or so (logs are zstd compressed):

$ zstdgrep -a 'URI Too Long' log/dspace.log-2024-04-* | wc -l
1423

I have tested the patch and left comments there.


BTW, there is a related "too many boolean clauses" error in Solr from other aspects of the UI, for example community and collection lists. I have one user who has 1,805 "OR" clauses!

tdonohue added this to the 7.6.2 milestone May 7, 2024