New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Entities projections #165
Entities projections #165
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@benbosman : Overall, the concept here sounds good. I agree that these sound like scenarios where we have a new type of projection -- one that adds information to the request, rather than removing information from the request.
I added a few minor suggestions for renaming below. Nothing major.
@abollini : when you return, this might be good to glance at, just to be certain you have no objections to this new type of Projection.
The REST request doesn't have to be on the `relationship` endpoint directly, it can be on another endpoint which simply embeds relationships. | ||
|
||
### Verify whether a given item is the left or right | ||
**?projection=CheckSideItemInRelationship&checkSideItemInRelationship=<:item-uuid>** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name of this projection might be simplified to something like CheckRelatedItemSide
or CheckSideOfRelatedItem
. That's a bit more similar to the projection name off the Items endpoint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that simplifies the name if you know in advance this is specific to a Relationship.
But projections can be applied even if the relationship is embedded. This implies that if you'd be requesting an item, embedding the relationships, and asking to add a projection for the side of the item in the embedded relationships, the name may become a bit vague.
Do you agree to keep the name, or do you expect the name you suggested will be clear enough to indicate it adds the information to a Relationship?
The REST request doesn't have to be on the `relationshiptype` endpoint directly, it can be on another endpoint which simply embeds relationship types. | ||
|
||
### Verify whether a given entity type is the left or right | ||
**?projection=CheckSideEntityInRelationshipType&checkSideEntityInRelationshipType=<:entitytype>** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also might suggest simplifying the name here to something like CheckRelatedTypeSide
or CheckSideOfRelatedType
...based on the name of the projection you suggest for the relationships endpoint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that simplifies the name if you know in advance this is specific to a RelationshipType.
Would you want to keep the name, or do you expect the name you suggested will be clear enough to indicate it adds the information to a RelationshipType?
@corrad82-4s: Please give this a quick review when you have the chance. |
The REST request doesn't have to be on the `relationshiptype` endpoint directly, it can be on another endpoint which simply embeds relationship types. | ||
|
||
### Verify whether a given entity type is the left or right | ||
**?projection=CheckSideEntityInRelationshipType&checkSideEntityInRelationshipType=<:entitytype>** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a quick, and minor, question: why in this projection entitytype name is used as possible value for checkSiedEntityInRelationshipType, while on above call /api/core/relationshiptypes/search/byEntityTypeId the entity type id is used?
If possible, I'd use id or entity type name in both calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are two equivalent requests, depending on whether you currently have the name of the id of the given entity type.
I don't know for sure whether we need both in Angular, but they can both be useful. Because there's little extra effort in implementing both, I just added them
Response: | ||
* The response will contain an extra JSON property `relatedItems` in the item object | ||
* If there's no relationship to the given item, it will be an empty array: `relatedItems:[]` | ||
* If there is a relationship to the given item, it will contain the related item: `relatedItems:['f727a6a4-5541-4148-ad0a-1ab9596d981f']` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@benbosman : I discussed this today with @abollini , and he noted that the design here might negatively impact object caching (in the UI or any client) because it accidentally breaks REST principles & how Projections are meant to behave.
As we know, Projections can add/remove properties to any Resource endpoint. However, per REST principles, the value of those properties should NOT change (unless the entire Resource has been updated). So, as an example, it might be perfectly reasonable to have a default request return 10 properties, while a projection returns an extra 2 properties (12 total)...but, those extra 2 properties should keep consistent values.
Unfortunately, it's that latter point that this REST Contract accidentally breaks, as the value of relatedItems
property may change based on the parameters passed to the Projection. So, for example, four similar requests may return different values like this:
/api/core/items/[uuid]?projection=CheckRelatedItem&checkRelatedItem=[unrelated-item-uuid]
- Would return an empty array of related items:
relatedItems:[]
. This value may end up in a client side cache for the Item.
- Would return an empty array of related items:
/api/core/items/[uuid]?projection=CheckRelatedItem&checkRelatedItem=[one-related-item-uuid]
- Would return a value in that array of related items,
relatedItems:[one-related-item-uuid]
. Now, the Resource property has changed, and the client may refresh the cache as it appears the Item has been updated.
- Would return a value in that array of related items,
/api/core/items/[uuid]?projection=CheckRelatedItem&checkRelatedItem=[another-related-item-uuid]
- Would return a different value in that array of related items,
relatedItems:[another-related-item-uuid]
. Now, the Resource property has changed AGAIN, and the client may refresh the cache as it appears the Item has been updated.
- Would return a different value in that array of related items,
/api/core/items/[uuid]?projection=CheckRelatedItem&checkRelatedItem=[one-related-item-uuid],[another-related-item-uuid]
- Would return yet another different value for
relatedItems
, this time both related Item UUIDs.
- Would return yet another different value for
So, in this way, we can see that the Projection may be accidentally making the Resource appear as though it has been modified between different requests. This obviously isn't the intention of this new Projection. The changing of values though may confuse clients into thinking the Item resource has changed when a projection is applied, and this is what can make caching more difficult (both in our UI and in other clients).
Andrea mentioned he hasn't had a chance to dig further into a better solution. I also don't have a better solution yet -- but I plan to dig around more today. One brainstorm that Andrea had was to consider whether we could do something with _links
...where we might have a different link for each relationship type. E.g., maybe separate isPublicationOfAuthor
and isProjectOfAuthor
_links
on a Person Entity resource. Those links might be used to embed more information about related items when needed & quickly determine which relationships an Entity has. However, it's unclear (to me) if that'd help solve this same use case entirely or not.
In the meantime, I've added this to next week's agenda to discuss more. I'll also dig around to see if I can come up with other ideas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@benbosman : In reviewing the Spring Data REST documentation around Projections, it appears to me that the Projection we've designed here is seemingly not fully in line with their definition of valid projections. My apologies in not realizing this sooner...as I should have reviewed those docs.
As Spring Data REST documents, there are three types of valid projections:
- Projections that limit properties (display only parts of the Resource): https://docs.spring.io/spring-data/rest/docs/current/reference/html/#projections-excerpts.projections This is the type we've already been using.
- Projections that "bring in hidden data" (or enhance the data coming from the Resource): https://docs.spring.io/spring-data/rest/docs/current/reference/html/#projections-excerpts.projections.hidden-data
- Projections which "excerpt commonly accessed data" from related resources: https://docs.spring.io/spring-data/rest/docs/current/reference/html/#projections-excerpts.excerpting-commonly-accessed-data
The projection defined in this contract is most similar to the third one. We're trying to excerpt basic information about the Relationship resource in the response from the Item resource. However, the big difference here is that, in Spring Data Rest they are noting this data should be an EXCERPT from the related resource (and not a "search" or "lookup" across related resources).
So, in my opinion, we'd better align with Spring Data Rest via one of the following:
- Either we look at using basic "embed" options (which we already use to return related resources elsewhere).
- For instance, instead of...
?projection=CheckRelatedItem&checkRelatedItem=<:other-item>
- Could we do...
?embed=relatedItems
(where we have a new_link
calledrelatedItems
which returns all Items linked to the current Item via a relationship) - And also...
?embed=isPublicationOfAuthor
(where we have a new_link
calledisPublicationOfAuthor
which returns only items linked to the current Item via anisPublicationOfAuthor
relationship)
- For instance, instead of...
- OR, we create a new projection which returns an EXCERPT from the given Relationship resource.
- For instance, instead of...
?projection=CheckRelatedItem&checkRelatedItem=<:other-item>
- Could we do...
?projection=relatedItems
which returns an extra JSON propertyrelatedItems
that lists ALL the Item UUIDs linked to this current Item via any type of Relationship. This would imply arelatedItems
link should also exist where more information can be gathered about those Items if the excerpt (in this case the list of UUIDs) isn't enough info. - And also...
?projection=isPublicationOfAuthor
which returns an extra JSON propertyisPublicationOfAuthor
that lists ALL the Item UUIDs linked to this current Item via anisPublicationOfAuthor
relationship. This would imply anisPublicationOfAuthor
link should also exist where more information can be gathered about those Items if the excerpt (in this case the list of UUIDs) isn't enough info.
- For instance, instead of...
The current approach in this Contract is most similar to 2 above. But, the key difference is that currently we are not returning a full excerpt (i.e. returning all Item UUIDs which match). In Spring Data Rest, there's no concept of a Projection which includes parameters to limit results or only return some matching values. It seems, to me, that this type of Projection is meant to be more of an all or nothing. If it's enabled, it returns all values of that JSON property. If it's disabled, it doesn't return the JSON property at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @tdonohue
This was a complication we were aware of before starting the development.
The solution we worked out is to ensure that objects which are retrieved with these extra projections are cached separately. This has been implemented in a way to limit the cache impact only to the actual REST objects which have the extra information
If you'd e.g. retrieve an item with various embeds, and the CheckSideItemInRelationship, only the HAL links to the Relationship objects will contain the projection (knowing that those should be cached separately)
The HAL link to the item itself, the bundles, … will remain unchanged.
This solution has the advantages:
- It will use the cached version of the rest resource as much as possible
- It will only use a separate version with the extra information where it is actually applicable and requested (a very small footprint)
- It can even still cache that extra information
And specifically for your latest comment:
- The projections we've used to far are actually not similar to any of those, they fall much further from the standards
- These new projections are also none of the 3 you mentioned, but are calculated data. It's probably best we get the current work ready for a PR, so it's easier to understand how this differs, and why it doesn't fit the examples you've posted here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@benbosman : Those last statements are worrisome, as that means this "projection" would fall out of the standard definition of a Projection. Please remember, our goals are to always align with Spring Data REST (to help ensure our REST API is based on standards & best practices). So, if this "projection" doesn't align with Spring Data REST then it no longer sounds like a valid Projection.
In terms of caching, it's good to hear you've worked around the potential caching issues in the UI layer, but this unfortunately doesn't solve the potential caching issues for other clients which attempt to use this projection. It's sounding like, since we've built a projection that doesn't align with Spring Data REST, we are now having to expect clients are "smarter" about how they cache these objects.
I'd still encourage us to look back at Spring Data REST Projections. I'm worried we've accidentally stepped out of bounds of what Projections should do, and it is forcing us to do more work in the client side to ensure everything still works correctly.
I'll leave this on the agenda for this week's meeting, as I think we need to find a way back to using valid Spring Data REST Projections in this implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking back on the initial use cases here, DSpace/dspace-angular#1148 (comment), I'm realizing we might also want to consider adding a new type of search configuration for this scenario.
For instance, what if we had a /api/core/items/search/relatedItems?uuid=<:item-uuid>
search option which only returned Items which were related to a given Item (UUID) via any relationship. We could also easily add a filter for relationshipType
to only related items with this relationship type. If needed, we could have a negation search, e.g. unrelatedItems
, which found all Items that are NOT already related to the given Item, perhaps filterable by entityType.
Other endpoints in this PR we could analyze closer to see if a search could be reasonable, or if a projection is really necessary. The answer may be different for different endpoints. (I've mostly concentrated on this endpoint as it seems the most problematic at a glance.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of the discussion today, and the constructive feedback from @artlowel, I realized that there might be a more simplistic way to make this endpoint a bit more RESTful, without major redesign. However, I'd appreciate feedback on this idea, as it's still "stretching" the definition of a Projection, but still seems to be RESTful in my opinion.
Simply put, I'd recommend we slightly redesign the endpoint as follows.
We'd make a very minor parameter change to (isRelatedItem instead of checkRelatedItem)...
?projection=CheckRelatedItem&isRelatedItem=<:item-uuid>
Then, we'd change the response to instead be a dynamically-named extra JSON property isRelatedItem_<:item-uuid>: [true|false]
E.g. ?projection=CheckRelatedItem&isRelatedItem=f727a6a4-5541-4148-ad0a-1ab9596d981f
Would include in the response this JSON property...
isRelatedItem_f727a6a4-5541-4148-ad0a-1ab9596d981f: true
Specific relationship types could also be handled, but they'd use a different param (the relationship type name) which results in a different JSON property...
E.g. ?projection=CheckRelatedItem&isPublicationOfAuthor=f727a6a4-5541-4148-ad0a-1ab9596d981f
Would return a property...
isPublicationOfAuthor_f727a6a4-5541-4148-ad0a-1ab9596d981f: false
(If the different param is too hard, we could still go with the approach of passing something like isRelatedItem=isPublicationOfAuthor=f727a6a4-5541-4148-ad0a-1ab9596d981f
. It'd just result in a slightly different "rule" for the boolean property that comes back)
In other words, the returned boolean property always uses the format [param-name]_[uuid]: [true/false]
. If multiple UUIDs are passed in, then multiple JSON properties are returned.
The reason this is more RESTful is that the extra JSON property's value is static. It never will change values between requests (whereas relatedItems
does change in the current design).
However, I'll admit it's a slightly "odd" looking design, because the JSON Property itself changes names based on the UUID passed in.
That said, this tweak seems (to me) to align better with REST principles, and it likely shouldn't involve a massive redesign. (However the UI will need to know how to construct the dynamic property name, as it no longer can just look at the relatedItems
property).
@artlowel or @benbosman , thoughts on whether this (or something similar) would be a reasonable compromise here? To me, this seems like a (potentially simple) change that would fix my major concern with the current design.
(If this is reasonable, a similar design pattern could be applied to the other endpoints in this PR... the returned property MUST always include the UUID to ensure it has a unique property name, and it always has a true/false value.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that can work, but if we add dynamic properties to Item, we lose compile-time type checking of Item properties. We'd have to add a catch all property at the bottom for everything that isn't explicitly specified.
I'd prefer it if the dynamic properties are all moved to one property of item. e.g.
"relatedItems": {
"isPublicationOfAuthor_0c021c60-a516-4b49-b4a0-0432df0ddfcd": true,
"isPublicationOfAuthor_f727a6a4-5541-4148-ad0a-1ab9596d981f": false
}
If all the dynamic properties are grouped together, and they're all booleans, we can write something like:
relatedItems: {
[key: string]: boolean
}
That way it doesn't affect the rest of the item model.
It's a little more clunky compared to @benbosman's original proposal, because we'd have to parse those keys to get the uuids, but it's workable from the perspective of the UI.
I don't know about the feasibility on the rest side though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@artlowel : I like that suggestion! I also think it somewhat mirrors our approach to the metadata
field, which acts as a grouping of all metadata about an object (as metadata also can be somewhat "dynamic", in that you don't know what fields will be on each object, and projections should be allowed to similarly hide/show additional metadata fields).
In other words, we already have this concept for metadata...
"metadata": {
[key: string]: array-of-values
}
So, it makes perfect sense to mirror that approach for relatedItems...
"relatedItems": {
[key: string]: boolean
}
I agree that it might seem slightly more "clunky" than the initial approach of @benbosman. But, as you can see, this new design better aligns with our existing approach for the metadata
section. This also means that someone can more easily show/hide individual relationship key
s (e.g. isPublicationOfAuthor_0c021c60-a516-4b49-b4a0-0432df0ddfcd
) via projections, in the same manner that one can show/hide individual metadata key
s (e.g. dc.description.provenance
which is hidden by default) via projections and/or access permissions.
I suggest to make a step back and look to the original problem. |
An off-the-wall view of the discussion: during today's meeting the word "projection" prompted me to think of this in relational terms. And to me, it seems that an operator which adds information from another relation is not a projection at all, but a join. Looking at it that way, a second lookup makes much sense. I am not sure whether this observation is applicable or useful, but I wanted to share it in case it is. |
Closing this Rest Contract in favor of #170 (a non-projection based approach). That said, if we find that approach falls short in significant ways, we can always revisit this idea to implement via a custom type of projection |
This is the REST contract for the changes requested in DSpace/dspace-angular#1148
It contains details of the new Projection type (for JSON properties), the 3 new Projections, and the new search method, all requested in this issue: