Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Entities projections #165

Closed
wants to merge 3 commits into from

Conversation

benbosman
Copy link
Member

This is the REST contract for the changes requested in DSpace/dspace-angular#1148

It contains details of the new Projection type (for JSON properties), the 3 new Projections, and the new search method, all requested in this issue:

  • A projection for items, to indicate whether the item is related to the given item with the given relationship type
  • A projection for relationships, to indicate on which side of the relationship the given item resides
  • A projection for relationshiptypes, to indicate on which side of the relationshiptype the given entity type resides
  • A search endpoint for relationshiptypes, to search by entity type

@benbosman benbosman self-assigned this Jun 28, 2021
@benbosman benbosman requested a review from tdonohue June 28, 2021 15:34
@benbosman benbosman added this to Needs Reviewers Assigned in DSpace 7.0 (final) release via automation Jun 28, 2021
@benbosman benbosman added this to the 7.0 milestone Jun 28, 2021
Copy link
Member

@tdonohue tdonohue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benbosman : Overall, the concept here sounds good. I agree that these sound like scenarios where we have a new type of projection -- one that adds information to the request, rather than removing information from the request.

I added a few minor suggestions for renaming below. Nothing major.

@abollini : when you return, this might be good to glance at, just to be certain you have no objections to this new type of Projection.

The REST request doesn't have to be on the `relationship` endpoint directly, it can be on another endpoint which simply embeds relationships.

### Verify whether a given item is the left or right
**?projection=CheckSideItemInRelationship&checkSideItemInRelationship=<:item-uuid>**
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name of this projection might be simplified to something like CheckRelatedItemSide or CheckSideOfRelatedItem. That's a bit more similar to the projection name off the Items endpoint.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that simplifies the name if you know in advance this is specific to a Relationship.
But projections can be applied even if the relationship is embedded. This implies that if you'd be requesting an item, embedding the relationships, and asking to add a projection for the side of the item in the embedded relationships, the name may become a bit vague.
Do you agree to keep the name, or do you expect the name you suggested will be clear enough to indicate it adds the information to a Relationship?

The REST request doesn't have to be on the `relationshiptype` endpoint directly, it can be on another endpoint which simply embeds relationship types.

### Verify whether a given entity type is the left or right
**?projection=CheckSideEntityInRelationshipType&checkSideEntityInRelationshipType=<:entitytype>**
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also might suggest simplifying the name here to something like CheckRelatedTypeSide or CheckSideOfRelatedType...based on the name of the projection you suggest for the relationships endpoint.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that simplifies the name if you know in advance this is specific to a RelationshipType.
Would you want to keep the name, or do you expect the name you suggested will be clear enough to indicate it adds the information to a RelationshipType?

@tdonohue tdonohue requested a review from abollini June 29, 2021 17:28
@tdonohue tdonohue moved this from Needs Reviewers Assigned to Under Review in DSpace 7.0 (final) release Jul 1, 2021
@tdonohue tdonohue removed this from Under Review in DSpace 7.0 (final) release Jul 28, 2021
@tdonohue tdonohue added this to Needs Reviewers Assigned in DSpace 7.1 release via automation Jul 28, 2021
@tdonohue tdonohue modified the milestones: 7.0, 7.1 Jul 28, 2021
@tdonohue tdonohue moved this from Needs Reviewers Assigned to Under Review in DSpace 7.1 release Jul 29, 2021
@tdonohue
Copy link
Member

tdonohue commented Aug 5, 2021

@corrad82-4s: Please give this a quick review when you have the chance.

The REST request doesn't have to be on the `relationshiptype` endpoint directly, it can be on another endpoint which simply embeds relationship types.

### Verify whether a given entity type is the left or right
**?projection=CheckSideEntityInRelationshipType&checkSideEntityInRelationshipType=<:entitytype>**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a quick, and minor, question: why in this projection entitytype name is used as possible value for checkSiedEntityInRelationshipType, while on above call /api/core/relationshiptypes/search/byEntityTypeId the entity type id is used?
If possible, I'd use id or entity type name in both calls.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are two equivalent requests, depending on whether you currently have the name of the id of the given entity type.
I don't know for sure whether we need both in Angular, but they can both be useful. Because there's little extra effort in implementing both, I just added them

Response:
* The response will contain an extra JSON property `relatedItems` in the item object
* If there's no relationship to the given item, it will be an empty array: `relatedItems:[]`
* If there is a relationship to the given item, it will contain the related item: `relatedItems:['f727a6a4-5541-4148-ad0a-1ab9596d981f']`
Copy link
Member

@tdonohue tdonohue Aug 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benbosman : I discussed this today with @abollini , and he noted that the design here might negatively impact object caching (in the UI or any client) because it accidentally breaks REST principles & how Projections are meant to behave.

As we know, Projections can add/remove properties to any Resource endpoint. However, per REST principles, the value of those properties should NOT change (unless the entire Resource has been updated). So, as an example, it might be perfectly reasonable to have a default request return 10 properties, while a projection returns an extra 2 properties (12 total)...but, those extra 2 properties should keep consistent values.

Unfortunately, it's that latter point that this REST Contract accidentally breaks, as the value of relatedItems property may change based on the parameters passed to the Projection. So, for example, four similar requests may return different values like this:

  1. /api/core/items/[uuid]?projection=CheckRelatedItem&checkRelatedItem=[unrelated-item-uuid]
    • Would return an empty array of related items: relatedItems:[]. This value may end up in a client side cache for the Item.
  2. /api/core/items/[uuid]?projection=CheckRelatedItem&checkRelatedItem=[one-related-item-uuid]
    • Would return a value in that array of related items, relatedItems:[one-related-item-uuid]. Now, the Resource property has changed, and the client may refresh the cache as it appears the Item has been updated.
  3. /api/core/items/[uuid]?projection=CheckRelatedItem&checkRelatedItem=[another-related-item-uuid]
    • Would return a different value in that array of related items, relatedItems:[another-related-item-uuid]. Now, the Resource property has changed AGAIN, and the client may refresh the cache as it appears the Item has been updated.
  4. /api/core/items/[uuid]?projection=CheckRelatedItem&checkRelatedItem=[one-related-item-uuid],[another-related-item-uuid]
    • Would return yet another different value for relatedItems, this time both related Item UUIDs.

So, in this way, we can see that the Projection may be accidentally making the Resource appear as though it has been modified between different requests. This obviously isn't the intention of this new Projection. The changing of values though may confuse clients into thinking the Item resource has changed when a projection is applied, and this is what can make caching more difficult (both in our UI and in other clients).

Andrea mentioned he hasn't had a chance to dig further into a better solution. I also don't have a better solution yet -- but I plan to dig around more today. One brainstorm that Andrea had was to consider whether we could do something with _links...where we might have a different link for each relationship type. E.g., maybe separate isPublicationOfAuthor and isProjectOfAuthor _links on a Person Entity resource. Those links might be used to embed more information about related items when needed & quickly determine which relationships an Entity has. However, it's unclear (to me) if that'd help solve this same use case entirely or not.

In the meantime, I've added this to next week's agenda to discuss more. I'll also dig around to see if I can come up with other ideas.

Copy link
Member

@tdonohue tdonohue Aug 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benbosman : In reviewing the Spring Data REST documentation around Projections, it appears to me that the Projection we've designed here is seemingly not fully in line with their definition of valid projections. My apologies in not realizing this sooner...as I should have reviewed those docs.

As Spring Data REST documents, there are three types of valid projections:

  1. Projections that limit properties (display only parts of the Resource): https://docs.spring.io/spring-data/rest/docs/current/reference/html/#projections-excerpts.projections This is the type we've already been using.
  2. Projections that "bring in hidden data" (or enhance the data coming from the Resource): https://docs.spring.io/spring-data/rest/docs/current/reference/html/#projections-excerpts.projections.hidden-data
  3. Projections which "excerpt commonly accessed data" from related resources: https://docs.spring.io/spring-data/rest/docs/current/reference/html/#projections-excerpts.excerpting-commonly-accessed-data

The projection defined in this contract is most similar to the third one. We're trying to excerpt basic information about the Relationship resource in the response from the Item resource. However, the big difference here is that, in Spring Data Rest they are noting this data should be an EXCERPT from the related resource (and not a "search" or "lookup" across related resources).

So, in my opinion, we'd better align with Spring Data Rest via one of the following:

  1. Either we look at using basic "embed" options (which we already use to return related resources elsewhere).
    • For instance, instead of... ?projection=CheckRelatedItem&checkRelatedItem=<:other-item>
    • Could we do... ?embed=relatedItems (where we have a new _link called relatedItems which returns all Items linked to the current Item via a relationship)
    • And also... ?embed=isPublicationOfAuthor (where we have a new _link called isPublicationOfAuthor which returns only items linked to the current Item via an isPublicationOfAuthor relationship)
  2. OR, we create a new projection which returns an EXCERPT from the given Relationship resource.
    • For instance, instead of... ?projection=CheckRelatedItem&checkRelatedItem=<:other-item>
    • Could we do...?projection=relatedItems which returns an extra JSON property relatedItems that lists ALL the Item UUIDs linked to this current Item via any type of Relationship. This would imply a relatedItems link should also exist where more information can be gathered about those Items if the excerpt (in this case the list of UUIDs) isn't enough info.
    • And also... ?projection=isPublicationOfAuthor which returns an extra JSON property isPublicationOfAuthor that lists ALL the Item UUIDs linked to this current Item via an isPublicationOfAuthor relationship. This would imply an isPublicationOfAuthor link should also exist where more information can be gathered about those Items if the excerpt (in this case the list of UUIDs) isn't enough info.

The current approach in this Contract is most similar to 2 above. But, the key difference is that currently we are not returning a full excerpt (i.e. returning all Item UUIDs which match). In Spring Data Rest, there's no concept of a Projection which includes parameters to limit results or only return some matching values. It seems, to me, that this type of Projection is meant to be more of an all or nothing. If it's enabled, it returns all values of that JSON property. If it's disabled, it doesn't return the JSON property at all.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @tdonohue

This was a complication we were aware of before starting the development.
The solution we worked out is to ensure that objects which are retrieved with these extra projections are cached separately. This has been implemented in a way to limit the cache impact only to the actual REST objects which have the extra information

If you'd e.g. retrieve an item with various embeds, and the CheckSideItemInRelationship, only the HAL links to the Relationship objects will contain the projection (knowing that those should be cached separately)
The HAL link to the item itself, the bundles, … will remain unchanged.

This solution has the advantages:

  • It will use the cached version of the rest resource as much as possible
  • It will only use a separate version with the extra information where it is actually applicable and requested (a very small footprint)
  • It can even still cache that extra information

And specifically for your latest comment:

  • The projections we've used to far are actually not similar to any of those, they fall much further from the standards
  • These new projections are also none of the 3 you mentioned, but are calculated data. It's probably best we get the current work ready for a PR, so it's easier to understand how this differs, and why it doesn't fit the examples you've posted here

Copy link
Member

@tdonohue tdonohue Aug 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benbosman : Those last statements are worrisome, as that means this "projection" would fall out of the standard definition of a Projection. Please remember, our goals are to always align with Spring Data REST (to help ensure our REST API is based on standards & best practices). So, if this "projection" doesn't align with Spring Data REST then it no longer sounds like a valid Projection.

In terms of caching, it's good to hear you've worked around the potential caching issues in the UI layer, but this unfortunately doesn't solve the potential caching issues for other clients which attempt to use this projection. It's sounding like, since we've built a projection that doesn't align with Spring Data REST, we are now having to expect clients are "smarter" about how they cache these objects.

I'd still encourage us to look back at Spring Data REST Projections. I'm worried we've accidentally stepped out of bounds of what Projections should do, and it is forcing us to do more work in the client side to ensure everything still works correctly.

I'll leave this on the agenda for this week's meeting, as I think we need to find a way back to using valid Spring Data REST Projections in this implementation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking back on the initial use cases here, DSpace/dspace-angular#1148 (comment), I'm realizing we might also want to consider adding a new type of search configuration for this scenario.

For instance, what if we had a /api/core/items/search/relatedItems?uuid=<:item-uuid> search option which only returned Items which were related to a given Item (UUID) via any relationship. We could also easily add a filter for relationshipType to only related items with this relationship type. If needed, we could have a negation search, e.g. unrelatedItems, which found all Items that are NOT already related to the given Item, perhaps filterable by entityType.

Other endpoints in this PR we could analyze closer to see if a search could be reasonable, or if a projection is really necessary. The answer may be different for different endpoints. (I've mostly concentrated on this endpoint as it seems the most problematic at a glance.)

Copy link
Member

@tdonohue tdonohue Aug 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of the discussion today, and the constructive feedback from @artlowel, I realized that there might be a more simplistic way to make this endpoint a bit more RESTful, without major redesign. However, I'd appreciate feedback on this idea, as it's still "stretching" the definition of a Projection, but still seems to be RESTful in my opinion.

Simply put, I'd recommend we slightly redesign the endpoint as follows.

We'd make a very minor parameter change to (isRelatedItem instead of checkRelatedItem)...
?projection=CheckRelatedItem&isRelatedItem=<:item-uuid>

Then, we'd change the response to instead be a dynamically-named extra JSON property isRelatedItem_<:item-uuid>: [true|false]

E.g. ?projection=CheckRelatedItem&isRelatedItem=f727a6a4-5541-4148-ad0a-1ab9596d981f

Would include in the response this JSON property...

isRelatedItem_f727a6a4-5541-4148-ad0a-1ab9596d981f: true

Specific relationship types could also be handled, but they'd use a different param (the relationship type name) which results in a different JSON property...

E.g. ?projection=CheckRelatedItem&isPublicationOfAuthor=f727a6a4-5541-4148-ad0a-1ab9596d981f

Would return a property...

isPublicationOfAuthor_f727a6a4-5541-4148-ad0a-1ab9596d981f: false

(If the different param is too hard, we could still go with the approach of passing something like isRelatedItem=isPublicationOfAuthor=f727a6a4-5541-4148-ad0a-1ab9596d981f. It'd just result in a slightly different "rule" for the boolean property that comes back)

In other words, the returned boolean property always uses the format [param-name]_[uuid]: [true/false]. If multiple UUIDs are passed in, then multiple JSON properties are returned.

The reason this is more RESTful is that the extra JSON property's value is static. It never will change values between requests (whereas relatedItems does change in the current design).

However, I'll admit it's a slightly "odd" looking design, because the JSON Property itself changes names based on the UUID passed in.

That said, this tweak seems (to me) to align better with REST principles, and it likely shouldn't involve a massive redesign. (However the UI will need to know how to construct the dynamic property name, as it no longer can just look at the relatedItems property).

@artlowel or @benbosman , thoughts on whether this (or something similar) would be a reasonable compromise here? To me, this seems like a (potentially simple) change that would fix my major concern with the current design.

(If this is reasonable, a similar design pattern could be applied to the other endpoints in this PR... the returned property MUST always include the UUID to ensure it has a unique property name, and it always has a true/false value.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that can work, but if we add dynamic properties to Item, we lose compile-time type checking of Item properties. We'd have to add a catch all property at the bottom for everything that isn't explicitly specified.

I'd prefer it if the dynamic properties are all moved to one property of item. e.g.

"relatedItems": {
  "isPublicationOfAuthor_0c021c60-a516-4b49-b4a0-0432df0ddfcd": true,
  "isPublicationOfAuthor_f727a6a4-5541-4148-ad0a-1ab9596d981f": false
}

If all the dynamic properties are grouped together, and they're all booleans, we can write something like:

relatedItems: {
  [key: string]: boolean
}

That way it doesn't affect the rest of the item model.

It's a little more clunky compared to @benbosman's original proposal, because we'd have to parse those keys to get the uuids, but it's workable from the perspective of the UI.

I don't know about the feasibility on the rest side though

Copy link
Member

@tdonohue tdonohue Aug 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@artlowel : I like that suggestion! I also think it somewhat mirrors our approach to the metadata field, which acts as a grouping of all metadata about an object (as metadata also can be somewhat "dynamic", in that you don't know what fields will be on each object, and projections should be allowed to similarly hide/show additional metadata fields).

In other words, we already have this concept for metadata...

"metadata": {
   [key: string]: array-of-values
}

So, it makes perfect sense to mirror that approach for relatedItems...

"relatedItems": {
    [key: string]: boolean
}

I agree that it might seem slightly more "clunky" than the initial approach of @benbosman. But, as you can see, this new design better aligns with our existing approach for the metadata section. This also means that someone can more easily show/hide individual relationship keys (e.g. isPublicationOfAuthor_0c021c60-a516-4b49-b4a0-0432df0ddfcd) via projections, in the same manner that one can show/hide individual metadata keys (e.g. dc.description.provenance which is hidden by default) via projections and/or access permissions.

@abollini
Copy link
Member

I suggest to make a step back and look to the original problem.
As @tdonohue notes a search endpoint can solve our issue without introducing new concepts that have unclear implication. I'm worried for instance about the loss of compile time checking on the angular side noted in the @artlowel suggestion but also on unverified performance implication in my idea about split the relationships over multiple links.
The search method can accept a filter for the relationship type, the side, a fixed uuid for the item on the specified side and optionally a list of uuid to verify on the other side. With a such endpoint once you have a list of items you can verify with a single call which of these items are already related to the target item and disable the add button.

@mwoodiupui
Copy link
Member

An off-the-wall view of the discussion: during today's meeting the word "projection" prompted me to think of this in relational terms. And to me, it seems that an operator which adds information from another relation is not a projection at all, but a join. Looking at it that way, a second lookup makes much sense.

I am not sure whether this observation is applicable or useful, but I wanted to share it in case it is.

@tdonohue
Copy link
Member

tdonohue commented Sep 2, 2021

Closing this Rest Contract in favor of #170 (a non-projection based approach). That said, if we find that approach falls short in significant ways, we can always revisit this idea to implement via a custom type of projection

@tdonohue tdonohue closed this Sep 2, 2021
DSpace 7.1 release automation moved this from Under Review to Done Sep 2, 2021
@tdonohue tdonohue removed this from Done in DSpace 7.1 release Sep 2, 2021
@tdonohue tdonohue removed this from the 7.1 milestone Sep 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants