-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactoring authority support / value pairs / controlled vocabularies #128
Conversation
@abollini : I'm not sure I agree with the decision to rename I also don't understand why we couldn't let value-pairs (like |
There are vocabularies that can offer different textual description of the same record all linked to the same entry (it can be a name authority that know the different variants used by the same person, or a vocabulary that manage synonyms, etc). The If we add a fake ID to the dropdown vocabulary it would become more tricky for the rest client to understand which id are true and which one generated by the system for uniformity. That said, I would like to keep this difference but I'm ok with renaming suggestions in whatever you find more understable including entries (terms, choices) if you feel comfortable with it |
hi @tdonohue I have updated the vocabulary to reflect the re-wording discussed on slack |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@abollini : I gave this a more thorough review. Overall, the concepts look fine, but I still have several inline questions about expected behavior of these endpoints. Nothing major honestly, but a lot of minor cleanup or clarifications are needed.
@tdonohue the contract now should reflect the agreed changes, all your feedback should have been included |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me now. Thanks @abollini !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've included some inline comments, but there's also quite some English language cleanup necessary. Some of the sentences are really hard to understand given the current wording.
* name: see id | ||
* scrollable: if true mean that it is possible to scroll all the entries in the authority without providing a filter parameter, see (vocabulary entries)[vocabularies.md#vocabulary-entries] | ||
* hierarchical: if true means that the vocabulary expose a tree structure where some entries are parent of others | ||
* preloadLevel: for hierarchical vocabularies express the preference to preload the tree at a specific level of depth (0 only the top nodes are shown, 1 also their children are preloaded and so on) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how is this intended to be used? This doesn't exist in DSpace 6
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it will be used by the Angular UI to open an hierarchical vocabulary with a number of level pre-expanded. This was highly demanded by users in an internal UX test. It is not strictly necessary so if needed we can remove that from the contract and the implementation (that already exists, see screens here DSpace/DSpace#2743 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so if I understand it correctly, there are requests for this, and screenshots, but no implementation or config yet?
In that case I think it's best to remove it since we don't want to add more features making the config even more complicated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the implementation is already here because we have implemented it firstly for DSpace-CRIS and in https://jira.lyrasis.org/browse/DS-4475 we have noted that explicitly as this allow to keep the effort for DSpace 7 limited.
About the configuration it is just a single line, with a proper default, similar to existing configuration for the DSpaceControlledVocabulary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see in https://github.com/DSpace/DSpace/pull/2743/files#diff-a60e58808bf42c545ed0dc04a4d6d5f9 there's indeed an open PR to get this in
I'll leave this for @tdonohue whether we allow more features outside the DSpace 6 scope
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll allow this into the contract (especially since an implementation PR exists as of now in DSpace/DSpace#2792). But, I'll want to review the implementation to see the impact. I'm generally OK with it though, as it seems small.
vocabularies.md
Outdated
|
||
The supported parameters are: | ||
* page, size [see pagination](README.md#Pagination) | ||
* metadata: the metadata field name (e.g. "dc.type") for which the controlled vocabulary is used: mandatory. The entries returned by a vocabulary may be different for different metadata fields. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When can they be different for separate metadata fields but still using the same vocabulary ID? Can you provide a sample?
Otherwise I'd recommend to remove this parameter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is used by the SOLRAuthority https://github.com/DSpace/DSpace/blob/30ad7676ef92df507cffb7a2040f5ef3ee5e56b9/dspace-api/src/main/java/org/dspace/content/authority/SolrAuthority.java#L79 and related to the previous discussion with Tim see #128 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this seems to be confusing to Tim and me, I'd suggest to add an example to the contract. This doesn't have to be a full JSON example, but simply state that for the vocabulary-name solr
(or what the actual name is), the behavior differs based on the metadata and collection because it's a single vocabulary whose data is retrieved via the dspace.cfg choices.plugin.dc.contributor.author = SolrAuthorAuthority
(if that's how this field is determined)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess to have included it here 0c57822#diff-1a0f138d702d2d84a2c28904e36e8150R271
or have I misunderstood your request?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if I understand it correctly, there are already separate endpoints per solr field. The fact that this vocabulary is named SolrAuthorAuthority
does determine which entries to return (and the collection or metadata field still won't impact it)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not exactly. It is named SolrAuthorAuthority because this is the name used in our dspace.cfg examples. It can be configured over more than one metadata and as you can see in the above link to the existing sourcecode (unchanged from previous version) the SolrAuthority will use the metadata information to query the authority core
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just added the below comment which applies here as well. metadata
needs to be a required field because of value-pairs (which require it). So, I think we'll have to keep this as-is for now in the contract. If need be, we can always merge this as-is & I will gladly add more text to describe why both collection & metadata are required at this time -- the best example of both is value-pairs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd recommend to add in at least some detail on this line explaining this is currently only applicable for the solr authority, since it's still unclear from this contract.
Value pairs don't differ per metadata field, only per vocabulary-name
That should be a really small change, and this is very obviously an unclear area
vocabularies.md
Outdated
The supported parameters are: | ||
* page, size [see pagination](README.md#Pagination) | ||
* metadata: the metadata field name (e.g. "dc.type") for which the controlled vocabulary is used: mandatory. The entries returned by a vocabulary may be different for different metadata fields. | ||
* collection: the uuid of the collection where the item is being submitted: mandatory. The entries returned by a vocabulary may differ per collection based on your submission configuration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When can they be different for separate collections fields but still using the same vocabulary ID? Can you provide a sample?
Otherwise I'd recommend to remove this parameter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dc.type can be configured as a controlled vocabulary in the submission form for a specific collection (form definition) and as free text for another submission. Moreover, the collection parameter is part of the choice interface so some implementation could also return different entries depending on it similar to how the SOLRAuthority is currently using the metadata parameter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on #128 (comment), I assume this currently also only returns different entries for the same vocabulary-name for solr. If that example is included as I requested in #128 (comment), it should be much clearer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see #128 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@benbosman : I initially had the same confusion over the collection
param until I looked at the current (DSpace 6 and below) code implementation. It is possible to configure in input-forms.xml (now submission-forms.xml) a value-pairs that is collection specific (as any submission field can be made to only appear for specific collections, or even have different configurations for different collections -- e.g. one collection could have a "dc.type" that uses value-pairs, while in another "dc.type" can be configured to be free text, or a different set of value-pairs). So, this means value-pairs really are both metadata field & collection specific....you need both to load the proper set of value-pairs.
You'll see in this prior discussion I was the one who suggested the above wording. So, I think that, for now both metadata and collection must be required to work with our current code (especially for value-pairs). In the future though, I fully agree this is a bit odd, and we may want to revisit this in a future release (not 7.0, more like 8.0 unless we find a need to change it earlier).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing this out, https://jira.lyrasis.org/browse/DS-4371 is indeed another example clarifying how the mandatory metadata and collection fields can be misinterpreted, this time the vocabulary was determined based on the parameters instead of the vocabulary name.
Perhaps the parameters should only be set mandatory for authorities (a smaller change compared to completely removing them), since the others are not supposed to use them. That makes their use clearer. Angular is of course still free to include them for non-authorities, assuming REST will ignore them.
By using that in-between step, it's also easier to redesign the authority core to expect a separate name in e.g. 7.1 and remove the parameters completely
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@abollini : I'm OK with @benbosman's suggestion here. If it's easier for now to just make these params only mandatory for authorities, then that seems like a step in the right direction. I also agree we can just have the Angular team pass these params the time (if desired), but in the REST API they can be ignored except when needed for authorities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unfortunately, I don't see that as a simplification nor as a step forward the right solution. All the three things value-pairs, vocabularies and authorities need to be managed in the same way. Add an exception here is not the way to go imho. If we don't want to remove the collection/metadata from the authority interfaces we need to keep them also for the other "things" making sure that they are used in a consistent ways (i.e. they match with the specified vocabulary name to avoid the issue DS-4371. This is exactly what is done here in the current implementation https://github.com/DSpace/DSpace/pull/2792/files#diff-151c8c35ac6f56dc3482b82c56b0474cR85 see also the corresponding test https://github.com/DSpace/DSpace/pull/2792/files#diff-379fecc95ee8bd1f6f91349f0cd7e9d3R193-R206 ). Anyway, I'm trying to implement the change to remove completely these params in a parallel branch so that you can check exactly how much changes are needed (it seems less than what I initially suspected)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here the result of my investigation: 4Science/DSpace#157 it looks to me as a better, cleaner approac. Do you agree? should I remove completely the metadata & collection parameters? if you like that I will apply the related change also to this contract 4Science/DSpace#157
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, since that change looks simple enough. I'd rather we just remove the metadata/collection parameters. If the three of us are confused by them, they will be even more confusing to others. So, if we can easily remove them now, I'm in favor of doing so.
} | ||
``` | ||
|
||
Attributes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wasn't it planned to also contain a boolean authority?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems possible to avoid to introduce such attribute because the authority will be exposed only where it is liked to store it. In the srsc authority for instance where we need the "id" to allow the hierarchical browsing the authority is NOT included in the vocabularyEntry see
https://github.com/DSpace/Rest7Contract/pull/128/files#diff-1a0f138d702d2d84a2c28904e36e8150R215
https://github.com/DSpace/Rest7Contract/pull/128/files#diff-1a0f138d702d2d84a2c28904e36e8150R224
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For storage that would indeed work, but how can the submission UI determine when to display the lookup button if it's only determined in the returned values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the job of the submission-form definition endpoint. Looking to that I get the chance to update the attribute name to reflect the new wording (controlledVocabulary instead than authority)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, the UI should use the type and the controlledVocabulary attribute in the formField to determine which widget is needed (no real changes on this aspect compared to what is already implemented in the master).
It can be a dropdown, a lookup button or a suggest
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the plan for determining the UI would be:
- If in the submission form type=dropdown => the vocabulary as a dropdown (and hierarchical is not supported)
- Otherwise, if in vocabulary definition the hierarchical=true => render as a hierarchical browse and suggest
- Otherwise if there's a vocabulary => render as a suggest
- Support for lookup is not exposed yet => needs to be handled as a separate ticket?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added more details about the submission forms, no changes are proposed here is just to document what exists f7076e5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for documenting this functionality. Since this is new in 7.0 and was not documented yet, I wasn't aware of that implementation
I didn't verify whether it supports https://github.com/DSpace/DSpace/blob/master/dspace/config/dspace.cfg#L1507 as well, or only adds the lookup buttons based on the submission-forms. But that's an implementation detail, and not a contract detail.
With this info, I assume the plan for determining the UI would be:
- If in the submission form type=dropdown => the vocabulary as a dropdown (and hierarchical is not supported)
- If in the submission form type=lookup or lookup-name => render as a lookup with detailed metadata similar to https://wiki.lyrasis.org/display/DSDOC6x/ORCID+Integration#ORCIDIntegration-SubmissionofnewDSpaceitems-Authorlookup
- Otherwise, if in vocabulary definition the hierarchical=true => render as a hierarchical browse and suggest
- Otherwise if there's a vocabulary => render as a suggest
Since this is complicated set of rules, I would recommend this is documented
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that we need to provide advice to the UI / client on how this information should be used. I've added suggestions inline in this Contract for how to document what @benbosman noted above. See this inline comment: https://github.com/DSpace/Rest7Contract/pull/128/files#r447073406
hi @benbosman I have added more information and example as by your feedback. Thanks |
we are reviewing internally the implementation and discovered an inconsistency, the attribute used to hold the array in the json response for linked entries must reflect the subpath see e0de465 |
Thanks @abollini for the updates, I did update the threads and mark as resolved where I didn't see any further issues |
@abollini : it looks like this is nearly ready to go. Just needs minor updates per Ben's feedback from June 18: #128 (review) |
@benbosman and @abollini : I've added some inline comments above to try and help this along. For these outstanding questions though, they all seem minor & I think the approach in this Contract is "good enough" for now. It gives us what we wanted (which is better backwards compatibility to 6.x) without too much additional major refactoring of authority / value pairs / controlled vocab. (Larger refactors would have to wait for 8.0, when we have more of a chance to rethink these three similar concepts.) So, if you both agree, I'd like to move this forward as it currently stands. If we need to improve some of the examples, I'd volunteer to do this. I also think some of these questions might be answered in the new implementation PR at DSpace/DSpace#2792 |
Thanks for the update. There are still a few things which need documentation, but I think I know what the intended behavior is now. |
…a parameters to query a vocabulary
@tdonohue @benbosman I have simplified the contract removing the collection and metadata params as discussed here #128 (comment) this is now also aligned with the REST API PR DSpace/DSpace#2792 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes look good to me
This PR try to formalize what have been discussed in the wiki https://wiki.lyrasis.org/display/DSPACE/4Science+proposal+to+improve+authority+support+in+DSpace7
to be sure that we have a common understanding to use as base of the implementation rework.
Of course some changes could become necessary (and will be proposed) during the implementation if we found any unforeseen technical issues.
Please note that compared to the latest name/wording proposed on the wiki I have only changes the subpath from
api/integration/vocabularies/<:vocabulary-name>/entries
to
api/integration/vocabularies/<:vocabulary-name>/suggestions
to make clear that the "objects" returned by this endpoint are not necessary resources available in the other endpoint
/api/integration/vocabularyEntries/<:vocabulary-name>:<entry-id>
the suggestions related to "value-pairs" are an example of such case as they don't have a stable ID (and no authority will be stored in the metadata)