Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DS-3543 shows pagination bug with withdrawn items #2406

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

abollini
Copy link
Member

This PR provides the alternative fix agreed in #2120

The first commit extends the current IT about items pagination to show the bug
https://travis-ci.org/4Science/DSpace/builds/524414847
the second one is the fix

@abollini abollini added quick win Pull request is small in size & should be easy to review and/or merge interface: REST API v7+ REST API for v7 and later (dspace-server-webapp module) labels Apr 25, 2019
Copy link
Member

@KevinVdV KevinVdV left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The total from the "findAll()" rest call can still differ from the from the amount of items that are actually returned.

The query that we execute to retrieve all items right now is:

SELECT uuid FROM item WHERE in_archive=true OR withdrawn=true; (OK)

But the query to retrieve the total count is

SELECT COUNT(*) FROM item;

When there are items in the workflow / workspace the total count will be higher. I think we need a countTotalUnfiltered() method.

@abollini
Copy link
Member Author

abollini commented May 9, 2019

good catch @KevinVdV , thanks to check that. I will fix that change the IT to identify this issue (it should be sufficient to add some workspaceitem to the test env). I will ping you once done

@abollini
Copy link
Member Author

@KevinVdV after further consideration, for consistency reason, I think we should introduce an additional findAllUnfilteredAndInprogress method that really return all the items that should be exposed over the REST endpoint. The inprogress items indeed are exposed on the items endpoint if you know the uuid (and linked from the workspaceitem / workflowitem).
Probably, as I can see usual use cases where we are only interested to "final" items, we should add support for some search methods for them. Do you agree?

@benbosman
Copy link
Member

@abollini I would assume that if e.g. /rest/api/core/items/517f13ab-3105-42d3-bf55-271876fb726e exists, it would be exposed in /rest/api/core/items even if it's in the submission. I wouldn't see a reason for GET /rest/api/core/items to filter items even they can be requested when entering the exact ID.

On the other hand, the GET /rest/api/core/items should perhaps not even exist at all. It's an unsorted list of all items, and it has risks of being incorrectly used (e.g. for harvesting the repository), but no good use cases. It also can't be safely used to iterate all items because it will change if items are created/deleted while paging. I think the best use case for it is currently the ITs, but they can be replaced with direct calls to the ItemService.

I must admit that these methods can also get quite confusing when we have:

  • findAll: which is in fact not all, but only archived
  • findAllUnfiltered: which is in fact still filtered, but includes archived or withdrawn
  • countTotal: which is the real total, and makes sense
  • countNotArchivedItems: which omits the withdrawn items as well

Although a cleanup of all these names is out of scope for this PR, I would prefer to either:

  • Make sure GET /rest/api/core/items contains all items (there's no service for this yet)
  • Remove GET /rest/api/core/items (but that would probably require quite some work)

@abollini
Copy link
Member Author

thanks @benbosman for your feedback, I will proceed with the option

Make sure GET /rest/api/core/items contains all items (there's no service for this yet)

The items endpoint is just for administrators and it is convenient for the Integration Tests for sure as "double check" maybe there are maintenance use case that we don't see yet. I will also create a PR to update the contract to make this explicit

@dimitrispie
Copy link

thanks @benbosman for your feedback, I will proceed with the option

Make sure GET /rest/api/core/items contains all items (there's no service for this yet)

The items endpoint is just for administrators and it is convenient for the Integration Tests for sure as "double check" maybe there are maintenance use case that we don't see yet. I will also create a PR to update the contract to make this explicit

Have you considered the performance of such request, i.e. returning all items?

@tdonohue
Copy link
Member

tdonohue commented Jan 6, 2020

This may have been resolved (in a different manner) by #2580, as withdrawn & private items are no longer returned in default searches/counts...they are now a separate search configuration which is only accessible to Admins.

@tdonohue tdonohue changed the base branch from master to main July 13, 2020 21:51
@github-actions github-actions bot added the merge conflict PR has a merge conflict that needs resolution label Aug 26, 2020
@tdonohue tdonohue added the needs discussion Ticket or PR needs discussion before it can be moved forward. label Jul 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interface: REST API v7+ REST API for v7 and later (dspace-server-webapp module) merge conflict PR has a merge conflict that needs resolution needs discussion Ticket or PR needs discussion before it can be moved forward. quick win Pull request is small in size & should be easy to review and/or merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants