Multiplicity and Collections #92

Closed
azaroth42 opened this issue Nov 3, 2015 · 13 comments

@azaroth42
Collaborator

It was brought up at TPAC that there's quite a lot of semantic overlap between Composite and Collection, and between List and OrderedCollection. They all use the notion of items and the same basic construction of a wrapper around an rdf:List or repeated predicate. We could simplify this situation by dropping Composite and List in favor of the AS Collections. Further, we could require OrderedCollection and if you don't care about the order, then you won't care that there is an order. In assessing the impact, Benjamin and Rob could not find a use case, or think of any situation, where this was damaging.

Proposal:

  • Delete oa:Composite and oa:List, and just use as:OrderedCollection.
  • Make oa:Choice a subClassOf as:OrderedCollection
  • [Propose to ActivityStreams to drop non OrderedCollections as unnecessary]

This is also related to a second issue regarding the Selector (or indeed Specifier) workflow, where Choice/List are overkill for the intended functionality.

@azaroth42 azaroth42 self-assigned this Nov 3, 2015
@azaroth42 azaroth42 changed the title from "Simplification: Multiplicity and Collections" to "Multiplicity and Collections" Nov 4, 2015
@jjett

jjett commented Nov 5, 2015

Is it possible to delay discussion of this issue until December? I'm about to start my quals here and don't have time to formulate my thoughts fully, but I am -0.5 for this idea. as:OrderedCollection seems fine for oa:List and oa:Choice but I'm not as confident that it's truly fit for oa:Composite. What is really needed for composite is a property that articulates when something is a mashup or amalgam of things, i.e., what I need is a way to say that I have steel and not that I have a pot of iron bars and a pot of bonemeal.

The composite idea encompasses the idea of annotating a collage of images and not a list of images.

I'll follow-up on this with a lengthier email around Thanksgiving time here in the States.

Regards,

Jacob

@azaroth42
Collaborator Author

If you're okay with List and Choice we could go ahead with that part and save the discussion of Composite until you have time.

What would be most useful is concrete use cases in which List could not fulfill the role of Composite and implementations would be meaningfully impacted. Remember that inferencing and reasoning are not critical requirements, so please try to come up with UCs that aren't based on them. For example, in what way does the collage of images require there to be no order at all (especially given that there will be an order inherent in the serialization), rather than just ignoring that order? In JSON-LD, our primary serialization, both are going to look identical after all:

{
  "type": "Composite",
  "items": [ "iron", "iron", "iron", "bonemeal"]  // an @set
}
{
  "type": "List",
  "items": ["iron", "iron", "iron", "bonemeal"] // an @list
}
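
A minimal Python sketch of this point, assuming a consumer that wants to honour the @set versus @list distinction even though both arrays serialize identically. The function name `same_items` and the decision to compare a Composite as a multiset are assumptions made for illustration; a strict RDF reading of @set would also collapse the repeated "iron" values.

```python
from collections import Counter


def same_items(a: dict, b: dict) -> bool:
    """Compare two multiplicity constructs of the same type.

    Assumption: a "Composite" is unordered (compared as a multiset),
    while any other type is ordered (compared as a sequence).
    """
    if a["type"] != b["type"]:
        return False
    if a["type"] == "Composite":
        return Counter(a["items"]) == Counter(b["items"])
    return a["items"] == b["items"]


composite_1 = {"type": "Composite", "items": ["iron", "iron", "iron", "bonemeal"]}
composite_2 = {"type": "Composite", "items": ["bonemeal", "iron", "iron", "iron"]}
list_1 = {"type": "List", "items": ["iron", "iron", "iron", "bonemeal"]}
list_2 = {"type": "List", "items": ["bonemeal", "iron", "iron", "iron"]}

print(same_items(composite_1, composite_2))  # True: order is ignored
print(same_items(list_1, list_2))            # False: order matters
```

A consumer that does not care about order can simply apply the first comparison to a List as well, which is the "just ignore that order" option.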

@jjett

jjett commented Nov 5, 2015

I think your example (like a lot of examples lately) leans too much on document pattern and not enough on what kinds of behavior we expect out of the consuming applications. Imagine the following case:

{
  "type": "Sum",
  "items": ["iron", "iron", "iron", "bonemeal"] // add these together
}
{
  "type": "Print",
  "items": ["iron", "iron", "iron", "bonemeal"] // render these on-screen
}

It looks the same as the as:OrderedCollection pattern... The moral of the story is that we can get a huge amount of mileage out of simply varying @type. There are many kinds of containers which fulfill different roles and demand that different things be done with their contents. A grave problem with the activity streams approach is that it assumes that only one thing can be done with a list (as though we don't use them in actionable fashions everywhere).
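
A minimal sketch of that idea in Python, assuming a consumer that selects its behaviour from the container's type rather than from its shape. The handler functions and what "Sum" and "Print" actually do here are hypothetical stand-ins, not behaviour defined by any specification.

```python
from typing import Callable, Dict, List


def handle_sum(items: List[str]) -> str:
    # Hypothetical "Sum" behaviour: combine the items into one thing.
    return "+".join(items)


def handle_print(items: List[str]) -> str:
    # Hypothetical "Print" behaviour: render the items one per line.
    return "\n".join(items)


# Same document pattern, different behaviour, keyed only on the type.
HANDLERS: Dict[str, Callable[[List[str]], str]] = {
    "Sum": handle_sum,
    "Print": handle_print,
}


def process(container: dict) -> str:
    """Dispatch on the container's type to decide what to do with its items."""
    kind = container["type"]
    if kind not in HANDLERS:
        raise ValueError(f"no behaviour defined for type {kind!r}")
    return HANDLERS[kind](container["items"])


items = ["iron", "iron", "iron", "bonemeal"]
print(process({"type": "Sum", "items": items}))
print(process({"type": "Print", "items": items}))
```

The two inputs differ only in their type value, yet the consumer does different things with them; a type with no registered handler is rejected rather than guessed at.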

Of potential importance: set =/= aggregation; list = set; collection = aggregation (ditto for composite). More on this in a couple of weeks.

Closing thought, we've already removed inferencing and reasoning from the playing board (which begs the question of why bother with rdf at all), are we also to discount our expectations for what consuming applications are actually supposed to do with these documents? Or to put it another way, if the goal is to produce a general document standard for annotation-flavored documents then why not simply work directly in the json serialization format? (or xml or html or...).

It's not actually the case that we have no commitments at all to inferencing and reasoning...

@iherman
Member

iherman commented Nov 6, 2015

On 5 Nov 2015, at 22:46, Jacob notifications@github.com wrote:
Closing thought, we've already removed inferencing and reasoning from the playing board (which begs the question of why bother with rdf at all),

I have said that several times: as far as I am concerned, the value of RDF is not primarily its inference possibilities. The value is an easy integration of different data coming from different sources and origins, into one big happy graph. In practice, many of the RDF tools out there do not even do inference and reasoning, even the most basic RDFS reasoning (let alone OWL), by default, only via additional tools and libraries (if any)...

@akuckartz

@azaroth42 wrote

Further, we could require OrderedCollection and if you don't care about the order, then you won't care that there is an order.

If there is no order then it should not be an OrderedCollection.

@jjett

jjett commented Nov 20, 2015

This may be TL;DR... however, regarding the differences between composites and lists. Conceptually, ontologically, there is a fundamental difference between these things. One is an aggregation and the other is a set. Because they are fundamentally different things we should (and indeed must) expect different kinds of behaviors from them.

In the case of composite we are doing two things:

  1. We are naming some new thing (a content object thing) that comprises multiple other things (themselves content objects).
  2. We expect that a software agent will appropriately combine them into a single web resource which an end user encounters.

One example of a composite is an aggregation of 120 digitized pages. Notionally we might call it a book. My expectation is that the software agent that consumes this composite understands that it must string the pages together (perhaps through some page-turner application or perhaps by forging them into a single pdf or other epub format file) into some singular thing that the end user encounters.

Another example is the digital emblematica one that emerged from the 2nd round of OAC work - that of juxtaposing images. The expectation our end users had was that the thing they were annotating, the actual target of the annotation, was a collage comprising a pair of emblem icons arranged according to their wishes. The use case for the annotation model is to provide enough information so that the annotation's target can be faithfully reproduced in any arbitrary setting that has access to the underlying images from which the collage is formed. Again, something new and neither set- nor list-like is formed.

Using this information we can actually describe two kinds of Composites: Unordered Networks of Content Objects [UNCO] (e.g., collections, aggregations, etc.) and Ordered Hierarchies of Content Objects [OHCO] (aka Documents -- e.g., Books, Videos, Poems, etc.). The difference between them is the nature of the relationships that obtain among their parts.

In the case of things like collections, there is always at least one feature which forms an axis along which all members of the collection are equivalent. Take for example the Charles W. Cushman Photograph Collection[1]. The following two equivalency relationships always obtain between any two members of the collection: they always have the same creator (Charles W. Cushman) and they are always of the same kind of thing (they are photographs). These relationships obtain regardless of the order of the individuals and no amount of reshuffling will affect them.

The collection's existence as a Composite is completely independent of the order of its members and any such ordering or reordering of them is merely presentational in nature (and can be handled entirely on the application end). The aggregated members collectively form a contextual mass, the many, many relationships (besides the equivalency relationships) producing some additive content beyond that which the user experiences when encountering any one of the photographs individually. This additive content is recorded through the auspices of collection-level metadata and through the relationships that obtain among collection members by virtue of their inclusion in the collection.

If this was the only kind of Composite that we needed to record information about then the Activity Streams model might be adequate even though it conflates UNCOs with Lists. Unfortunately there is a paucity of pertinent information that would let us determine when we are looking at one or the other. It lacks rather important things like motivations and expectations. I don't really know what I'm supposed to do with an as:Collection.

Of course, UNCOs like collections aren't the only kind of composites we annotate. There are also OHCOs. We might argue that we have existing container models for a lot of these but a great failing is that we don't really have one that lets us make OHCOs out of anything on the fly (like constructing impromptu collages from existing images). But this kind of behavioral expectation is wholly unsuited for as:Collection which "represents ordered or unordered sets of Object or Link instances." In no way would we be safe in assuming that we could successfully tell a software agent that it should take the things contained within the collection to forge some new entity from them. It (as:Collection/as:OrderedCollection) cannot communicate that the thing contained within is greater than the sum of its apparent parts.

Lists lack these kinds of expectations. They do not represent some new singular content-bearing thing. Rather they are either some kind of ordered presentational framework (e.g., present these things in this arrangement) or an ordered execution framework (e.g., do these things in the following sequence). They are sets and not aggregations.

What this implies to me, with regards to the annotation model is that as:Collection is not sufficient to meet the needs of the model. I'm not even certain that they actually meet the model's needs vis-a-vis ordered lists. As @tilgovi and @shepazu have noted a few times some normative work is needed. Unfortunately wrapping selectors or wrapping content objects inside of an as:Collection or as:OrderedCollection wrapper doesn't actually tell me what to do with them. We are not making any ontological commitments with regards to what they do or what they mean. This is a problem because composite both means something very different from list and behaves very differently from list.

The good news is, we probably don't need to do this work on our end. Like Specific Resources and Specifiers, the whole issue of kinds of containers and what kinds of behaviors they should instigate in software agents can be spun out to another working group (and why this wasn't one of the first semantic web working groups ever formed I do not understand, but for some reason it's 2015 and we have 0 agreements on containers... :( ). As @tilgovi mentioned in a past post and as the workset example elsewhere demonstrates, we have a lot of uses for containers. We expect them to fulfill a lot of different roles and lend themselves to various specialized behaviors. These uses and expectations extend well beyond annotations (or activity streams for that matter).

-1 for using as:Collection or as:OrderedCollection in place of oa:Composite. They are both wholly inadequate mechanically and ontologically for fulfilling the role needed by composite use cases.

-0 for using as:Collection in place of oa:List (or even rdf:List). Poor label selection. :( When things are lists or sets we should just call them that. Collections are aggregations which are an entirely different animal altogether.

My alternate suggestion for the multiplicity objects issue is 'do nothing, will not fix' until such a time as the W3C forms a working group that confronts containers directly.

Apologies for the wordiness and any lack of clarity in this post.

[1] http://webapp1.dlib.indiana.edu/cushman/index.jsp

Regards,

Jacob


Jacob Jett
Research Assistant
Center for Informatics Research in Science and Scholarship
The Graduate School of Library and Information Science
University of Illinois at Urbana-Champaign
501 E. Daniel Street, MC-493, Champaign, IL 61820-6211 USA
(217) 244-2164
jjett2@illinois.edu

@iherman
Member

iherman commented Nov 21, 2015

@jjett, let me try to summarize your reasoning, just to be on the safe side...

  • The (conceptual) aggregation of web resources, to be used as a single target or body in our model is complex, insofar as, at least in some cases, the real nature of the aggregation is difficult to express semantically and ontologically.
  • The current AS structures are not rich enough to express that complexity
  • Therefore why bother using the AS structures, they are inadequate anyway (I paraphrase:-)

I would agree with your first statement; the difference between lists and sets is a clear example. (Actually, as a side note, it is interesting to see that for a long time programming languages ignored the necessity of having sets in their palette of datatypes and they all relied on lists; they have then been added later to languages like Python or, finally, to ES6…).

I must admit I do not have a very clear view on whether using AS would lead to something really much better than what we have in the model document right now under the multiplicity constructs (most of the use cases for composite structures in this working group came from you:-).

In any case, I strongly agree with your conclusion that this Working Group should not even attempt to provide a generic model that covers all the complexity in this area. Whether this is done in another group, or whether the community at large comes up with some structures, I cannot say, but we should limit ourselves to our scope.

Oh and, of course, I am all in favour of choosing a "won't fix" action if it is reasonable. Call me lazy:-)

Ivan

@BigBlueHat
Member

Anyone want to do a quick cost analysis on removing (vs. keeping) the multiplicity stuff we have now vs. switching to AS's *Collection terms?

The area where I see this being the most risky (and valuable) is storing selectors--especially Choice--but even here, I see UAs doing whatever they think is best despite what they're given.

For instance, you could tell a UI to choose between two selectors, but if the UA tries one and it fails, it's going to try the other one (or that's how I'd write it). If you didn't define it as a Choice (but just put them in the target array), a UA would try them both and Do The Right Thing (in its eyes only) to produce what the user expects--i.e. the best selection that UA can give them.
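
A minimal sketch of that fallback behaviour in Python, assuming selectors are modelled as callables that return a selection or `None` on failure. The names `exact_quote` and `resolve_choice` are hypothetical, and real selectors are considerably richer than a substring check.

```python
from typing import Callable, Optional, Sequence

# A selector takes the document text and returns the selected text, or None.
Selector = Callable[[str], Optional[str]]


def exact_quote(quote: str) -> Selector:
    """Hypothetical selector: succeeds only if the quote appears verbatim."""
    def select(document: str) -> Optional[str]:
        return quote if quote in document else None
    return select


def resolve_choice(selectors: Sequence[Selector], document: str) -> Optional[str]:
    """Try each selector in order and return the first successful selection."""
    for select in selectors:
        result = select(document)
        if result is not None:
            return result
    return None  # every alternative failed


document = "Annotations can target a collage of images."
choice = [exact_quote("a list of images"), exact_quote("a collage of images")]
print(resolve_choice(choice, document))  # a collage of images
```

The same loop works whether the alternatives arrived wrapped in a Choice or as a plain array, which is the behaviour a UA is likely to implement either way.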

So. If there's some other scenario where these things must be used--and for which a UA wouldn't just "figure it out" (as best it can)...then we should spec to those use cases. If not, then we should at least mark this section as "at risk"...or replace with something simpler (as is the purpose of this issue).

@jjett you're the one with the greatest interest in these things (afaik). Would you mind connecting the current multiplicity bits to some use cases (maybe in the wiki?) and explain why the other options aren't useful to you? I'm thinking something similar to what we did for the Roles work which originally started on the wiki.

Thanks in advance! 😉

@azaroth42
Collaborator Author

I think the analysis comes out very good :)
Given the recently updated AS: https://www.w3.org/TR/activitystreams-core/#collections and the current multiplicity constructs, the mapping is very easy:

  • Choice becomes a subClassOf OrderedCollection
  • List is replaced by OrderedCollection
  • Drop Composite as unnecessary and either overly complex or out of scope

We have to specify OrderedCollection and paging for the protocol. In order to do that we have to have them referenced from the model. Following our principles, we should not invent something new when there's an existing term, and we should have only one way to do something. Currently there are multiple ways to create an ordered list.

So merging them costs the model nothing, whereas keeping both has a cost.

In terms of implementation, I know of several implementations of Choice. The Protocol will provide implementations of List/OrderedCollection, and the IDPF have an explicit requirement for it. In the model, the only distinction needed (if we accept #93 to cover execution order) at the Annotation level is whether the targets or bodies are:

  • separate individuals (by way of multiple hasTarget/hasBody rels)
  • all required together (by way of as:OrderedCollection)
  • or whether any one of them is required (by way of oa:Choice)
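
The three cases can be sketched in Python as follows, assuming a client that knows which targets it managed to resolve. The function name `usable_targets` and the label "separate" (standing in for multiple hasTarget/hasBody relationships) are placeholders for illustration, not terms from the model.

```python
from typing import List, Set


def usable_targets(kind: str, items: List[str], resolved: Set[str]) -> List[str]:
    """Return the targets a client can present, given what it resolved.

    kind is one of:
      "separate"          - independent targets, each usable on its own
      "OrderedCollection" - all items are required together
      "Choice"            - any one item is sufficient, tried in order
    """
    if kind == "separate":
        return [item for item in items if item in resolved]
    if kind == "OrderedCollection":
        # All or nothing: a single missing item makes the target unusable.
        return list(items) if all(item in resolved for item in items) else []
    if kind == "Choice":
        for item in items:
            if item in resolved:
                return [item]
        return []
    raise ValueError(f"unknown construct {kind!r}")


items = ["page1", "page2", "page3"]
resolved = {"page2", "page3"}
print(usable_targets("separate", items, resolved))           # ['page2', 'page3']
print(usable_targets("OrderedCollection", items, resolved))  # []
print(usable_targets("Choice", items, resolved))             # ['page2']
```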

I would actually be okay to drop the second bullet and Composite/List completely, given #93. I don't think we'll get 2 independent implementations that do anything meaningful to exit CR, so we could just drop them now and leave it up to other standards/ontologies/systems to determine.

To respond to @jjett:

  • I disagree that there is any implication that a consuming software agent will combine resources in a Composite in any way that they might not also do for a List.
  • An unordered set of pages is not a book, which has order. A Book is at least better represented as a List, if not a more complex structure. Also, we're not providing a Book structure ontology!
  • There's no reason why a collage or a collection should NOT have an order: there's no additional cost given that the serialization will imply an order anyway. In some situations (z-axis ordering, relevance, etc) it may even be essential.
  • We should avoid epistemological debates about identity. Whether a collection exists outside of the Annotation ontology is entirely irrelevant to an Annotation discussion. Other ontologies can describe those things, and the entities can be annotated. The use cases for including them in the Annotation Ontology must be directly related to the Annotation, not complex descriptions of the target. The ordered presentational or execution framework is exactly what we need in this context.

@jjett

jjett commented Jan 19, 2016

@azaroth42 I see the points you're making. I agree with your final one that identity debates are not something we need to have here. +1 for dropping Composite from the model. Counter proposal for Choice and List.

Remove Choice and List from the model and refer implementers to Activity Streams' ordered collections via non-normative best practice documentation, as both Choice and List seem out of scope for annotations per se.

One of the issues I'm stumbling with is consistency of approach. For Composites you say, "I disagree that there is any implication that a consuming software agent will combine resources in a Composite in any way that they might not also do for a List." However your expectation for Choice is precisely the opposite. Thus the need to differentiate it from a List. I hate to sound unreasonable but this approach only meets the needs of certain community members. Since, on the whole, I think this is a Containers issue, I think the whole matter is out of scope for us and better left for a group dedicated to looking at Containers and combinatorial expectations.

@BigBlueHat
Member

This just came up on #145. I recall agreeing to drop Composite...and I thought we'd done that on a call. 😕 Anyone want to clear this one up?

@jjett

jjett commented Jan 27, 2016

@BigBlueHat This has not yet been addressed on a call. On the whole, these may not be one-call issues. They were a pretty major face-to-face meeting issue for the community group. (It might have taken 2 face-to-faces to reach the stage it's at now.)

On the whole these are not easy issues and there is a lot of overlap/intersection with standards and vocabulary activities going on elsewhere. It might be worth considering subtracting all of this from the model/vocab, deferring it until the new charter period, and applying some stopgap best practices until we can see how the community actually deals with this issue.

@iherman
Member

iherman commented Feb 12, 2016

Closed by resolution

'we close issues #50, #92, #145 with the principle that, whenever we can, we use ordered list only. Exceptions should be subjects of specific issues.' at telco 2016-02-12.

See: http://www.w3.org/2016/02/12-annotation-irc#T16-54-02
