Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dataset files API extension for file counts #9853

Merged

Conversation

GPortas
Copy link
Contributor

@GPortas GPortas commented Aug 30, 2023

What this PR does / why we need it:

Includes new endpoint (/api/datasets/{id}/versions/{versionId}/files/counts) for obtaining file counts based on different criteria (Total count, per content type, per access status and per category name).

Includes new endpoint (/api/files/{id}/metadata/categories) for updating file categories (by name) for an existing file. If the specified categories do not exist, they will be created.

Which issue(s) this PR closes:

Closes #9834

Special notes for your reviewer:

Updating categories can also be done with the existing metadata update endpoint, but the new endpoint has been created to be more practical when it is only necessary to update categories and not other metadata fields.

Suggestions on how to test this:

Upload test files to a dataset version, categorize and embargo some, and ensure the counts are correct by calling the new counts endpoint:

curl -H "X-Dataverse-Key: <YOUR_API_KEY>" -X GET "http://localhost:8080/api/v1/datasets/:persistentId/versions/:latest/files/counts?persistentId=<DATASET_PID>"

For the new categories endpoint, call it and ensure categories are updated on the UI.

curl -H "X-Dataverse-key:<YOUR_API_KEY>" -X POST \
    -F 'jsonData={"categories":["Category1","Category2"]}' \
    "http://localhost:8080/api/files/<FILE_ID>/metadata/categories"

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

No

Is there a release notes update needed for this change?:

Yes

Additional documentation:

None

@GPortas GPortas added this to Ready for Review ⏩ in IQSS/dataverse (TO BE RETIRED / DELETED in favor of project 34) via automation Aug 30, 2023
@GPortas GPortas moved this from Ready for Review ⏩ to ▶ SPRINT READY in IQSS/dataverse (TO BE RETIRED / DELETED in favor of project 34) Aug 30, 2023
@GPortas GPortas added the Status: Waiting issues in need of input from someone currently unavailable label Aug 30, 2023
@GPortas GPortas added this to the 6.1 milestone Aug 30, 2023
@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@scolapasta scolapasta moved this from ▶ SPRINT READY to Ready for Review ⏩ in IQSS/dataverse (TO BE RETIRED / DELETED in favor of project 34) Sep 11, 2023
@scolapasta scolapasta removed the Status: Waiting issues in need of input from someone currently unavailable label Sep 11, 2023
@scolapasta
Copy link
Contributor

scolapasta commented Sep 11, 2023

@GPortas I'll move this milestone 6.1 issue to "ready for Review", but I'm assigning you first so that you can handle the 6.0 merge and address any EE10 issues.

…averse into 9834-files-api-extension-file-counts
@GPortas GPortas removed their assignment Sep 12, 2023
@github-actions

This comment has been minimized.

…averse into 9834-files-api-extension-file-counts
…S/dataverse into 9851-datafile-payload-extension
@github-actions
Copy link

📦 Pushed preview images as

ghcr.io/gdcc/dataverse:9834-files-api-extension-file-counts
ghcr.io/gdcc/configbaker:9834-files-api-extension-file-counts

🚢 See on GHCR. Use by referencing with full name as printed above, mind the registry name.

@landreev landreev moved this from Ready for Review ⏩ to In Review 🔎 in IQSS/dataverse (TO BE RETIRED / DELETED in favor of project 34) Sep 26, 2023
@landreev landreev self-requested a review September 26, 2023 14:51
Copy link
Contributor

@landreev landreev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

IQSS/dataverse (TO BE RETIRED / DELETED in favor of project 34) automation moved this from In Review 🔎 to Ready for QA ⏩ Sep 26, 2023
@kcondon kcondon self-assigned this Sep 28, 2023
@kcondon
Copy link
Contributor

kcondon commented Sep 28, 2023

@GPortas This looks good. One thing I noticed is the tabular tags are not counted under categories. Is this expected? Categories currently are file tags, which are separate from tabular file tags. It is confusing, they seem the same but kind of mean different things. File tags are categories, sure, tabular tags were meant to indicate what type of data as a clue to future processing apps/functionality but never really was used that way. They look similar in the UI. Otherwise, this can be merged.

Screen Shot 2023-09-28 at 2 53 47 PM Screen Shot 2023-09-28 at 2 54 04 PM

Extend DataFile API payload and new endpoints for Files (getHasBeenDeleted) and Access (userFileAccessRequested)
Base automatically changed from 9785-files-api-extension-search to 9714-files-api-extension-filters September 28, 2023 20:46
@GPortas GPortas added the SPA These changes are required for the Dataverse SPA label Sep 29, 2023
@GPortas GPortas self-assigned this Oct 3, 2023
@GPortas
Copy link
Contributor Author

GPortas commented Oct 3, 2023

@kcondon

I had that same doubt when I developed these changes. If I'm not mistaken, tabular tags are not a very relevant feature within the UI, that is why they do not have an associated filter, as is the case for the "normal" tags.

Since the SPA is not going to change this behavior, I have not considered filtering by tabular tags via API. In any case, if at some point we find that filtering by tabular tags is useful for use cases other than SPA, we can add this option to the endpoint.

@kcondon
Copy link
Contributor

kcondon commented Oct 3, 2023

@GPortas I understand, this would be new behavior. However, I would say it isn't implemented in the UI currently not because it is not relevant but because tabular tags, though essentially just file tags/categories, is implemented separately, in a way that was often overlooked when implementing tag-related features. I can check with others -Leonid was of the opinion on counts that it should be counted. I have not yet asked others on UI. This is in part a policy question -would we ever change the behavior of the current implementation if it had a bug? I could see in many cases leaving it unless it was important or easy to change.

OK, I've done some digging and saw this: #8523 I'd reported this and it was included in a general code refactoring/design cleanup that never happened, #7082 (comment) So, this sounds like the intention was there but maybe needs more work/design so out of scope here.

@kcondon kcondon assigned kcondon and unassigned kcondon and GPortas Oct 3, 2023
@kcondon kcondon merged commit 68baacc into 9714-files-api-extension-filters Oct 3, 2023
12 of 14 checks passed
IQSS/dataverse (TO BE RETIRED / DELETED in favor of project 34) automation moved this from Ready for QA ⏩ to Done 🚀 Oct 3, 2023
@kcondon kcondon deleted the 9834-files-api-extension-file-counts branch October 3, 2023 13:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Size: 3 A percentage of a sprint. 2.1 hours. SPA These changes are required for the Dataverse SPA
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet

4 participants