-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DS-4281: Metadata suggestions in the live import #83
Conversation
Thank you @benbosman for adding this! |
metadata-suggestions.md
Outdated
The changes embedded are the suggested metadata changes to be applied to the given item | ||
|
||
### Changes for a single entry | ||
**/api/integration/metadata-suggestions/<:suggestion-name>/entryValues/<:entry-id>/changes** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This endpoint seems odd to me too... if you have the workspaceitem
or workflowitem
passed into /api/integration/metadata-suggestions/<:suggestion-name>/entryValues/<:entry-id>
, couldn't you just return the "changes" from that endpoint? I'm not sure this changes
subpath is necessary, plus it doesn't seem to follow our REST Best Practices. It's not really a "subresource" of the entry
(represented by entry-id
). It's more like a different "view" or "filter" of that entry. I'm not sure what to suggest here though yet, as I'm still trying to understand the parent endpoints here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes
are currently assumed not to be embedded in the /api/integration/metadata-suggestions/<:suggestion-name>/entries
endpoint for performance reasons (although it's hard to estimate the performance impact prior to the implementation).
In order to ensure this scales well, it was introduced as a sub-resource. The changes are calculated based on the combination of the information in the entry and in the item
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@benbosman : To clarify, my original statement here, I wasn't suggesting to embed changes
...instead, I was suggesting that this specific endpoint might return a different response based on parameters. So for example:
GET /api/integration/metadata-suggestions/<:suggestion-name>/entryValues/<:entry-id>
would return the full external entry (which would seem to be nearly identical to what/api/integration/externalsources/<:source-name>/entryValue/<:entry-id>
might return (if the backends were identical)).GET /api/integration/metadata-suggestions/<:suggestion-name>/entryValues/<:entry-id>?workspaceitem=1111-2222-3333
would return the difference (i.e. the "changes") between<:entry-id>
and thatWorkSpaceItem
So, this would mean you'd no longer need a subresource called changes
(which as I said, doesn't seem like a normal subresource anyways).
It's just a different idea of how to possibly approach this specific endpoint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the features of both endpoints are quite different, it doesn't make much sense to ask for the metadata of e.g. PubMed's live import using /api/integration/externalsources/pubmed/entryValue/31680121 especially since there's a reasonable chance /api/integration/externalsources/pubmed doesn't exist (is not configured)
It would be similar to updating the collection metadata using /api/core/items/<:collectionID> just because items also have metadata, even though there's no item with that ID
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This changes
subresource still bugs me, as it doesn't seem very RESTful. I understand the purpose here, and it seems like the goal is to simplify the client side processing, so the client side doesn't need to compare the response of the /metadata-suggestions
request to the current metadata in the WorkspaceItem/WorkflowItem
.
However, at the same time, I'm not sure the Server Side (REST API) will always know whether the /changes
should return an add
or replace
or delete
operation. For example, if the /metadata-suggestion
returns one dc.title
while the WorkspaceItem has a different dc.title
, is that an add
operation (to end up with two alternative titles) or a replace
operation (to end up with one title)? Would the Server Side be able to make a proper /changes
suggestion without input from an End User?
I'm starting to wonder if it's easier to drop this /changes
endpoint altogether, and do the comparison on the Client Side (as these are simple string comparisons). In the above scenario, I could imagine a UI that simply shows that there's a difference in the values of dc.title
to let the End User select the one they prefer, or even allow them to select both. Based on the User's selection, the UI would generate either a noop, an add
or a replace
using the existing Item PATCH metadata functionality.
This is also a topic that we could bring to the DSpace 7 Working Group meeting -- as this feature is not specific to Entities in any way. Maybe others see a different way of achieving this endpoint in a RESTful fashion.
@benbosman : Revisiting this Contract, I'm looking for DSpace 6.x examples where we had Metadata suggestions using the Live Import framework (I'm not familiar with how this currently works, and I didn't realize it already supported Metadata suggestions -- for some reason I thought it was a full import in 6.x). Do you have examples or a demo from 6.x that we can base this Contract from? I'm still finding the use case differences between "Live Import" and PR#74 (Import from External Sources) to not be clear. I know you've tried to explain the differences, but I think it'd help me to step back here and walk through use cases (based on how this worked in DSpace 6.x). |
@tdonohue in DSpace 6, the live import starts from an existing item as well. It offers a search feature, which returns a list of matching articles in PubMed. The UI displays a preview with the title and first authors. This is a completely different feature compared to #74, with the overlap being: Since the features are quite different, it doesn't make much sense to ask for the metadata of e.g. PubMed's live import using /api/integration/externalsources/pubmed/entryValue/31680121 especially since there's a reasonable chance /api/integration/externalsources/pubmed doesn't exist (is not configured) |
@benbosman : I feel we need to discuss this endpoint further. I'm worried about the complexity of having to configure multiple "external APIs" multiple times for similar import mechanisms. I'm having a hard time following the logic that an import using "live imports" into an Item is vastly different from an import using "external sources" into an Entity (also an Item). It seems like, at a minimum, these two import mechanisms should share some underlying code or configuration -- so I'd like to better understand the plans for doing so. For example, just a few questions include:
Just for clarity, I (personally) don't like to see integrations/configurations that sound duplicative -- this sounds quite duplicative as we are talking about using the same external APIs for importing data, both triggerable in the submission UI, but in two different setups/configurations. I'd like to tease this apart further before I can approve it, as I still don't know how to describe the differences here to other DSpace users (and if I'm confused, I expect others will be confused as well) I'd request we bring this for discussion in either an Entities WG or a DSpace 7 WG meeting. |
@benbosman can you please consider this on this PR? #88 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@benbosman : Updating my feedback on this PR, after more thought (and since we last discussed this in a meeting).
- Overall, I (now) understand the reasons why we'd want this new
/metadata-suggestions
endpoint. The overall goal of this endpoint is to port the "Live import" feature (from 6.x) to our new REST API. As that feature is primarily oriented towards metadata enhancement (and as we discussed in a meeting "live import" is an odd name for this feature), I agree that "metadata-suggestions" is a better name. - However, I'm still not yet convinced that the
/changes
subresource is something we should implement. See this comment: https://github.com/DSpace/Rest7Contract/pull/83/files#r343257121
If the /changes
subresource was removed, I think the rest of this Contract is mostly straightforward. However, I think this also may take away some of the usefulness of these metadata suggestions. I've been trying to locate other examples of REST APIs that provide these sorts of "suggestions" for changes/enhancements, but haven't found any.
I'm not sure of a better way to move this forward. We could discuss this more though in either the DSpace 7 Entities WG or the DSpace 7 WG to see if we can come up with ideas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been following closely the enhances and comments in this PR.
I'm ok with the current status and I don't have any strong opinion about the use of metadata PATCH for informing which metadata to be changed if it will be done on the client of server side. So, whatever decided with @tdonohue, I will also be ok with it.
@tdonohue it would help to clarify if instead of: a different end-point was proposed ( |
I talked with @benbosman and Lieven about this further today. I suggested the following:
|
Changing endpoint
Changing endpoint
@paulo-graca I've updated the endpoint based on your suggestion @tdonohue I've discussed the proposal with Art, and have changed it to be more focussed towards what the difference is while still ensuring the REST does the heavy lifting. With this implementation, the REST can suggest multiple ways to change the metadata, and the user can determine the most applicable solution (while REST can still suggest a sensible default as the first option). This new format will still keep it easy for Angular to build the actual PATCH based on the user's selected action I've also adjusted the workspaceitem POST to be in line with https://github.com/DSpace/Rest7Contract/blob/master/items.md#creating-an-archived-item-from-an-external-source |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@benbosman : I like these new changes! I think this is looking good. I only have minor feedback/tweaks to suggest before I think this can be merged.
@paulo-graca : If you have a chance, could you give this a quick look again to be sure you approve of the new direction here? (I know you previously approved, but as there's been some significant rework, it'd be good to ensure it still looks good to you.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Thanks for the updates, @benbosman . This looks good to me overall. I did clarify one minor request (which would be nice to still fix) in this comment. But, I approve of this direction.
Thanks @tdonohue, I've applied that change |
Merging since this has 2 approvals |
https://jira.duraspace.org/browse/DS-4281
This Rest Contract will introduce the live import from https://wiki.duraspace.org/dsdoc6x/using-dspace/ingesting-content-and-metadata/submission-user-interface/2016-framework-for-live-import-from-external-sources
This will allow the user to choose an external source to import from, retrieve suggestions, and apply the metadata changes once the user has confirmed the suggested changes are applicable.
This Rest Contract is related to #74
It ensures a submission can be started based on an external source.
This uses a POST on the /api/submission/workspaceitems, with a URI-list (with the collection and external source).
The item metadata is imported in this case