Expressing sequence in complex object children #1181

Closed
mjordan opened this issue Jun 22, 2019 · 20 comments

@mjordan
Contributor

mjordan commented Jun 22, 2019

I couldn't find any discussion of this in the CLAW issues, so I thought I'd create one. Now that we will need to start thinking about modeling complex objects, we need to decide how we're going to express sequence/order of children.

I've looked at all of the ontologies listed in the list of ontologies we're already using, and none, as far as I can tell, contain any properties describing sequence or order. Googling for an ontology that expresses sequence is an exercise in frustration.

We have isSequenceNumber and isSequenceNumberOf_some_pid from Islandora 7.x, for example:

  • For children of compound objects: <islandora:isSequenceNumberOfbcp_5454>1</islandora:isSequenceNumberOfbcp_5454>
  • For pages: <islandora:isSequenceNumber>8</islandora:isSequenceNumber>, plus <islandora:isPageNumber>3</islandora:isPageNumber>

I wonder if we need a separate property for pages? The fact that they are pages could be indicated in some other way, e.g., by using a model tag of "Page" for example.

It's my understanding that the isSequenceNumberOf_xxx pattern is to allow a 7.x object to be a child of multiple compound parents. (I wonder how many compound children in the 7.x wild actually have multiple parents. 🤔) The disadvantage of this pattern in 7.x is that every object with this class of property has a unique value in its RELS-EXT, which, if indexed in Solr, can substantially increase the size of the index. We should think of other ways of supporting this requirement in Islandora 8. Maybe a custom complex field that can be repeatable. This is probably feasible, but we still have the problem of how to serialize its value into RDF.
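To make the 7.x pattern concrete, here is a minimal Python sketch that pulls per-parent sequence numbers out of a RELS-EXT-style fragment. The namespace URI, the wrapper element, the second parent PID, and the helper name `sequence_numbers` are assumptions for illustration; only the `isSequenceNumberOf<pid>` element naming comes from the example above.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the islandora prefix; adjust to match your RELS-EXT.
ISLANDORA_NS = "http://islandora.ca/ontology/relsext#"
PREFIX = "isSequenceNumberOf"

# Hypothetical fragment: one child that belongs to two compound parents.
RELS_EXT = f"""
<Description xmlns:islandora="{ISLANDORA_NS}">
  <islandora:isSequenceNumberOfbcp_5454>1</islandora:isSequenceNumberOfbcp_5454>
  <islandora:isSequenceNumberOfbcp_9000>7</islandora:isSequenceNumberOfbcp_9000>
</Description>
"""


def sequence_numbers(xml_text):
    """Map each escaped parent PID (e.g. 'bcp_5454') to this child's sequence number."""
    result = {}
    for el in ET.fromstring(xml_text):
        # ElementTree renders qualified names as '{namespace}localname'.
        ns, _, local = el.tag[1:].partition("}")
        if ns == ISLANDORA_NS and local.startswith(PREFIX):
            result[local[len(PREFIX):]] = int(el.text)
    return result


print(sequence_numbers(RELS_EXT))
```

Each parent contributes its own element name, which is the source of the index growth described above: the set of distinct property names scales with the number of compound parents.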

I assume that in Islandora 8 the sequence property will be a node field, like field_member_of currently is, e.g. field_is_sequence_number.

@seth-shaw-unlv
Contributor

I haven't found an ontology to go along with it, but I've been relying on the Weight module for storing sequence values. See my local field and field storage configs.

You could make a compound field to store a Parent and Weight pair, but serializing that requires implementing hook_jsonld_alter_normalized_array to find those field instances and produce the JSON-LD serialization you want.

@whikloj
Member

whikloj commented Jun 24, 2019

Multi-parent scenarios always seem to be the cause of bloat and complexity. We use compound now but I am hoping to possibly dump it or I'd vote for pursuing a single parent and multi-parent path. Then we can let people deal with the complexity if they need to. But I would stick to single-parent scenarios as we currently don't have any multi-parent scenarios outside of collection membership. @jonathangreen makes heavy use of the compound solution pack in 7.x, so he might have some thoughts.

@seth-shaw-unlv
Contributor

seth-shaw-unlv commented Jun 24, 2019

Yeah, I'm happy with the single parent solution, myself. After building a few custom fields, I'm happy to avoid making another.

Anyway, I did some more poking around this morning and found the Collections Ontology that supports a few different ways of ordering a collection, but more importantly, the "index" predicate can be used with the weight field. It is a good thing that namespaces don't have to be dereferenceable, because "http://purl.org/co" throws a 500.

@jonathangreen
Contributor

I am happy with a single parent solution, especially for compound. We extensively use compound, but restrict that to items only having a single compound parent. I agree that multi-parent is difficult to deal with in many ways and I would rather avoid the complexity.

I also agree that belonging to multiple collections is a multi-parent situation that we would need.

@mjordan
Contributor Author

mjordan commented Jun 24, 2019

I'd like to confirm that no one is currently using Compound to manage children that have multiple parents. If we find that no one is, I'd be 👍 to not address that need now. I'll post to the email list.

@seth-shaw-unlv
Contributor

seth-shaw-unlv commented Jun 24, 2019

Looking at this a bit more, I'm wondering if we should be using two separate fields (both predicates come from the W3C SimplePartWhole):

  • One reference field would be multi-parent partOf where we can add the item to any number of unordered sets.
  • Another reference field would be a single parent partOf_directly field to show it is part of a larger object (potentially ordered set).

Obviously we would want to run all this by MIG (tagging @rosiel).

@mjordan
Contributor Author

mjordan commented Jun 24, 2019

Post to islandora and islandora-dev lists is at https://groups.google.com/d/msg/islandora/n0_d72aPeiQ/4QJtRUAvCQAJ. I've asked for feedback by July 5.

@bryjbrown
Member

bryjbrown commented Jun 24, 2019

What about using rdf:Seq? It can be used not only to group a compound's references to its children, but also express the sequence. You could have a compound parent with an rdf:Seq ordered list of its children, and this would still allow for multiple compound parents to reference the same child since the reference is stored on the parent and not the child. You could even express it with the same predicate used for collection ownership, but with a collection having an rdf:Bag of children and a compound parent having an rdf:Seq.
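As a rough illustration of the rdf:Seq idea, the Python sketch below emits N-Triples for a parent whose children are listed in order. It simplifies by typing the parent itself as the `rdf:Seq` rather than hanging a separate container off a membership predicate; the function name and the example URIs are hypothetical.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def seq_triples(parent, children):
    """Return N-Triples lines describing `parent` as an rdf:Seq of `children`."""
    lines = [f"<{parent}> <{RDF}type> <{RDF}Seq> ."]
    # rdf:_1, rdf:_2, ... are the container membership properties (1-based).
    for position, child in enumerate(children, start=1):
        lines.append(f"<{parent}> <{RDF}_{position}> <{child}> .")
    return lines


# Hypothetical URIs, for illustration only.
for line in seq_triples(
    "http://example.org/node/1",
    ["http://example.org/node/2", "http://example.org/node/3"],
):
    print(line)
```

Because every statement has the parent as its subject, the same child can appear at different positions under different parents without any change to the child.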

@seth-shaw-unlv
Contributor

@bryjbrown possibly; although I'm not seeing an elegant implementation because the sequence value is expressed as a predicate of the parent, rather than as a property of the child.

It looks like, to me, implementation would probably require a 'child/order' pair property on the parent object with a hook_jsonld_alter_normalized_array implementation to transform it into the sequence property and child reference.

@t4k

t4k commented Jun 24, 2019

I agree that multiple parents is too complex for basic functionality.

I was starting to look into this topic a couple of weeks ago, and one of the bits I found was this part of the PCDM: https://github.com/duraspace/pcdm/wiki#ordering-extension

The way ordering is laid out might be overly complex, though, with only "previous" and "next" being linked on children to other children…

@mjordan
Contributor Author

mjordan commented Jun 24, 2019

@seth-shaw-unlv makes a good point - the property we're discussing is of the child, not of the parent. In Islandora, parents of all sorts do not know directly about their children; children however know about their parents. We should be careful to preserve the directionality of this relationship and not become susceptible to the "many members" issue.

@dannylamb
Contributor

🔥 multi-parent membership 🔥

Won't somebody please think of the breadcrumbs!

@whikloj
Member

whikloj commented Jun 26, 2019

I'd note that the many-members issue is related to Fedora, so I am not sure that it has any impact here.

I'd also note that if you place the references to children and their order in the parent, then for different parents you can easily have different orderings.

Just a thought.

@mjordan
Contributor Author

mjordan commented Jun 26, 2019

> I'd note that the many-members issue is related to Fedora, so I am not sure that it has any impact here.

True, I didn't mean to introduce a red herring, but it seemed relevant at the time I wrote it.

@dannylamb
Contributor

So the PCDM approach looks solid to me except for one thing. It makes you use proxies. It's for multiple orderings, and I get that, but it would be cool just to put prev/next links on the children themselves instead of a proxy if you only have one ordering. We could easily alter all that stuff into the jsonld (and link headers for the REST API!), and only resort to proxies if you wanted multiple orderings or used something like entityqueue.
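For the single-ordering case, prev/next links can be derived from stored weights. The Python sketch below assumes each child carries one integer weight; the function name and the dictionary shapes are hypothetical, not part of any Islandora API.

```python
def prev_next_links(weights):
    """Given {child_id: weight}, return {child_id: {"prev": ..., "next": ...}}.

    The first child has prev=None and the last child has next=None.
    """
    # Sort by weight, breaking ties by id so the result is deterministic.
    ordered = sorted(weights, key=lambda child: (weights[child], child))
    links = {}
    for i, child in enumerate(ordered):
        links[child] = {
            "prev": ordered[i - 1] if i > 0 else None,
            "next": ordered[i + 1] if i + 1 < len(ordered) else None,
        }
    return links


print(prev_next_links({"page_b": 2, "page_a": 1, "page_c": 3}))
```

The links are computed at serialization time, so nothing has to be rewritten on the siblings when one child's weight changes.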

@rangel35

iana has first, last, prev, next if that helps

@rosiel
Member

rosiel commented Jul 1, 2019

@whikloj's approach is 100% what the user should interact with. You go to a parent to order its children. Store this in Drupal however the heck you want.

But it is impossible to say "'Purple Rain' is on 'My Mixtape' and is track 4" in RDF without some sort of proxy structure. That's why PCDM does its proxyIn/proxyFor thing.

However, it makes a choice (and I don't know why, but it's probably easier for computers?) to only use previous/next pointers, not ordinals (i.e. sequenceNumbers, i.e. the integer 4 representing that this is the 4th track. Or for you CS folks, the integer 3.). Both methods have their challenges for maintaining the list's integrity, because you can make loops with prev/next, and with just sequenceNumbers you could have 3 things at '1' and the only other sibling at '100' or some mess like that.
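Both failure modes are easy to check for. The Python sketch below is illustrative only: the function names are made up, and `ordinal_problems` assumes the intended numbering is 1..n with no repeats.

```python
def has_loop(next_links, start):
    """Return True if following `next` pointers from `start` revisits a node."""
    seen = set()
    node = start
    while node is not None:
        if node in seen:
            return True
        seen.add(node)
        node = next_links.get(node)
    return False


def ordinal_problems(sequence_numbers):
    """Return (duplicates, gaps) for a {child: sequence_number} mapping.

    Assumes the intended numbering is 1..n with no repeats.
    """
    values = list(sequence_numbers.values())
    duplicates = sorted({v for v in values if values.count(v) > 1})
    expected = set(range(1, len(values) + 1))
    gaps = sorted(expected - set(values))
    return duplicates, gaps


# Three siblings at 1 and one at 100, as in the example above.
print(ordinal_problems({"a": 1, "b": 1, "c": 1, "d": 100}))
# A two-node cycle built from next pointers.
print(has_loop({"a": "b", "b": "a"}, "a"))
```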

Is the sequence number within a complex object an important piece of metadata? When we 'bag' an object or otherwise remove it from its context in the graph, does it matter what order it was in, in its various compound objects? Maybe it matters - example: sequenceNumber '2' means this image is the 'back' of the postcard. Maybe it doesn't matter - example: some curator made a list of photographs of the same object, and this happens to be the second image at the time of export.

Maybe (philosophizing?) things like pages in a book, or front/back of a postcard, have 'inherent order' that is part of the structure of the object as we understand it. A front of a postcard can only be the front of one postcard, and it is crucial to understand this image that it is part of a postcard. Likewise, when we consider a page within a book, we're not expecting people to rip it out and replace it into a different book (not without fundamentally changing its relationship to the first book). A book of pages, as we model it digitally, is a sequence. (oh fuck what if you have inclusions or things on one page that need to be opened? Maybe we have 'alternate views' of those pages...) Where was I going with this? Inherent order means one parent. Its order is a property of the child.

:page4 isMemberOfSequence :mybook.
:page4 isSequenceNumber 4.

Mixtapes, curated exhibits, etc. are lists with assigned order. It's a context in which the object is present, and maybe order matters. We already have 'collections' to model parents and children with no assigned ordering. But I think we might need/want something in the middle - a way to make your own 'ordered collections'. Here I kind of want to say PCDM.

:page4 isPartOf :myExhibit .
:page4 isPartOf :anotherExhibit .

:myExhibit iana:first :p1 .
:p1 a ore:Proxy .
:p1 proxyFor :page4 .
:p1 proxyIn :myExhibit .
:p1 iana:next :p2 .
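The proxy pattern above can be generated mechanically from an ordered list of members. This Python sketch is a rough illustration: the proxy naming scheme (`<collection>_p<n>`) and the function name are invented, the `ore:` and `iana:` prefixes follow the PCDM ordering extension as I understand it, and prefix declarations are omitted.

```python
def proxy_triples(collection, members):
    """Return Turtle-like statements ordering `members` in `collection` via proxies."""
    # One proxy per member per collection; the naming scheme is hypothetical.
    proxies = [f":{collection}_p{i}" for i in range(1, len(members) + 1)]
    lines = []
    if proxies:
        lines.append(f":{collection} iana:first {proxies[0]} .")
        lines.append(f":{collection} iana:last {proxies[-1]} .")
    for i, (proxy, member) in enumerate(zip(proxies, members)):
        lines.append(f"{proxy} a ore:Proxy .")
        lines.append(f"{proxy} ore:proxyFor :{member} .")
        lines.append(f"{proxy} ore:proxyIn :{collection} .")
        if i > 0:
            lines.append(f"{proxy} iana:prev {proxies[i - 1]} .")
        if i + 1 < len(proxies):
            lines.append(f"{proxy} iana:next {proxies[i + 1]} .")
    return lines


for line in proxy_triples("myExhibit", ["page4", "page9"]):
    print(line)
```

Calling it again with a different collection name and a different member order yields an independent set of proxies, which is how the same object can sit at different positions in different exhibits.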

What if we don't have to make a separate "compound" type - we just have collections that may be ordered or not?

Your mixtape? it's a collection with order. Your OAI-PMH harvest target? a collection without order. Don't store the order on the children, as it's not inherent to them.

Paged content? Go ahead and store the order on the children because by this model they are only in one paged Parent.

@mjordan
Contributor Author

mjordan commented Aug 19, 2019

In prep for the upcoming sprint, I am linking to the discussion about this, including a use case for multiple parents, on the islandora-dev email list: https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/islandora-dev/YnLtvFjBINo/DxjYjaBlAgAJ .

@seth-shaw-unlv
Contributor

seth-shaw-unlv commented Oct 16, 2019

Noting here that islandora_defaults now includes field_weight with a mapping to co:index as of Islandora/islandora_defaults@b8c4d51 which is used in conjunction with field_member_of.

However, we may want to keep this issue open for reference should someone want to implement multi-orders using proxy ordering as suggested above.
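As a rough picture of how a single-parent weight is consumed, the Python sketch below orders the children of one parent by weight. Plain dictionaries stand in for Drupal nodes here; the `id` key and the function name are assumptions, while `field_member_of` and `field_weight` are the field names mentioned in this thread.

```python
def children_in_order(nodes, parent_id):
    """Return ids of nodes whose field_member_of is parent_id, ordered by field_weight."""
    children = [n for n in nodes if n.get("field_member_of") == parent_id]
    # Nodes without a weight sort last; ties fall back to the node id.
    children.sort(
        key=lambda n: (
            n.get("field_weight") is None,
            n.get("field_weight") or 0,
            n["id"],
        )
    )
    return [n["id"] for n in children]


# Hypothetical data: node 1 is the parent, node 4 belongs to another parent.
nodes = [
    {"id": 3, "field_member_of": 1, "field_weight": 2},
    {"id": 2, "field_member_of": 1, "field_weight": 1},
    {"id": 4, "field_member_of": 9, "field_weight": 1},
    {"id": 5, "field_member_of": 1},
]
print(children_in_order(nodes, 1))
```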

@dannylamb added this to the 1.x milestone Jan 30, 2020
@mjordan
Contributor Author

mjordan commented Jun 10, 2020

@dannylamb can we close this?
