Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CVOC : allow flexible params in retrievalUri (Ontoportal integration) #10404

Open
wants to merge 4 commits into
base: develop
Choose a base branch
from

Conversation

luddaniel
Copy link
Contributor

@qqmyers Following agreements on next steps of #10145 here is a way to add flexibility in retrievalUri.
We can now add named fields (child or siblings) of term-uri-field and capacity to encodeUri value when usefull.
Example below :

cvoc_conf.json contains :

"field-name": "keyword",
"term-uri-field": "keywordTermURL",
"retrieval-uri": "https://data.agroportal.lirmm.fr/ontologies/{keywordVocabulary}/classes/{encodeUrl:keywordTermURL}",
"managed-fields": {
        "vocabularyName": "keywordVocabulary",
        "termName": "keywordValue",
        "vocabularyUri": "keywordVocabularyURI"
    },

Resulting in retrieval-uri = https://data.agroportal.lirmm.fr/ontologies/AGROVOC/classes/http%3A%2F%2Faims.fao.org%2Faos%2Fagrovoc%2Fc_331529

This code was tested with a classic skomos https://skosmos.loterre.fr/rest/v1/data?uri={0} and it's still working.

Refactoring has been made regarding the use of registerExternalTerm function around dataset imports / updates methods where it was used while parsing json fields meaning imcomplete related fields. Now registerExternalTerm is called after dataset metadata are ready to be commited and directly as step of Command of dataset create and update.

@coveralls
Copy link

coveralls commented Mar 21, 2024

Coverage Status

Changes unknown
when pulling 2774bc9 on Recherche-Data-Gouv:9276-allow-flexible-params-in-retrievaluri-cvoc
into ** on IQSS:develop**.

@luddaniel luddaniel added this to 🚧 Dev by Recherche Data Gouv in Recherche Data Gouv (formerly Data INRAE) Mar 22, 2024
Copy link
Member

@qqmyers qqmyers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Nice to have the register part out of the DatasetPage and in the commands (where it can be run with the SPA).

It would be good to add a release note (could be a combined one for this and #10331). It would also be helpful to have some testing instructions and/or a test: nominally existing scripts should still work, so those could be tested with the examples in the https://github.com/gdcc/dataverse-external-vocab-support/ repo. If there isn't an ontoportal example yet, perhaps just verifying that the substitution works in replaceRetrievalUriParam in a unit test would be helpful.

I'll also note that this is matched by a docs update at gdcc/dataverse-external-vocab-support#20.

@qqmyers qqmyers added the Size: 3 A percentage of a sprint. 2.1 hours. label Mar 22, 2024
@luddaniel
Copy link
Contributor Author

luddaniel commented Apr 15, 2024

Hello @qqmyers Thanks for the review,

For testing / unit tests / integration tests, we decided to go for unit tests of methods of DatasetFieldServiceBean, coming in another PR. Other kind of testing looks complicated for now. I'll talk to you on slack for this topic.

@luddaniel
Copy link
Contributor Author

Decided with @qqmyers, the unit tests for all PR for Ontoportal / CVOC will be pushed after the merging of PR as it will require merged codes.

@luddaniel luddaniel requested a review from qqmyers April 18, 2024 14:48
@qqmyers
Copy link
Member

qqmyers commented Apr 22, 2024

@luddaniel - There's 1 test failing with this PR - a 500 error returned instead of a 200 response at edu.harvard.iq.dataverse.api.EditDDIIT.testUpdateVariableMetadata(EditDDIIT.java:153) which is caused ultimately by an attempt to save a DatasetVersion that doesn't have a createTime set. From the lengthy stack trace, it looks like that is due to the call to ctxt.dsField().registerExternalVocabValues(df). That calls em.flush() if an external val is found - perhaps that or something else in the call is causing the DatasetVersion itself to be flushed to the db. (I also see @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW) on registerExternalTerm().)

This all happens before the date gets set in UpdateDatasetVersionCommand line ~261 (setting the last update time also sets the createTime if it was null) so anything causing the new DatasetVersion to be written to the db would trigger the problem since createTime is defined as non-nullable.

If that's all correct, I think editing/perhaps having to add a CVOC value to a published version (causing a new version to be created) in the UI would trigger the issue. Assuming you can repeat the problem, it might be that we no longer need the flush(), or that setting the date can be moved earlier in the method, etc.


#### Flexible params in retrievalUri

You can now use `managed-fields` field names `term-uri-field` field name as parameters of `retrieval-uri`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change:
You can now use managed-fields field names as well as the term-uri-field field name as parameters in the 'retrieval-uri' when configuring an external vocabulary service. '{0}' as an alternative to using the 'term-uri-field' name is still supported for backward compatibility.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Size: 3 A percentage of a sprint. 2.1 hours.
Projects
Status: In Review 🔎
Recherche Data Gouv (formerly Data IN...
  
🚧 Dev by Recherche Data Gouv
Development

Successfully merging this pull request may close these issues.

None yet

3 participants