General strategy for (not) applying domainIncludes and rangeIncludes #59

Closed
aisaac opened this issue Nov 3, 2018 · 19 comments

Comments

@aisaac
Collaborator

aisaac commented Nov 3, 2018

Out of the initial points in #43, we have agreed on having domainIncludes and rangeIncludes, and we've agreed on a definition for them, but have we agreed to apply them to all properties that have a domain and a non-literal range, without any exception?
I don't think we have formally voted on the third point. I have voiced concerns about it in various calls and issues and am reiterating them here.
To make them more concrete I have started an analysis of individual properties and noted my feeling about the suggestions to:

  • replace rdfs:domain by domainIncludes everywhere
  • replace rdfs:range by rangeIncludes for every non-literal range, and keep rdfs:range when the range is rdfs:Literal.

It's at https://docs.google.com/spreadsheets/d/1cWKBvCzgXEq4_Fq2mKI3werooqJo8HB-DU5Y1s-fKH4/

I'm curious to see whether I'm the only one to feel uncomfortable with what the current 'blanket strategy' implies for some of these properties.

@aisaac
Collaborator Author

aisaac commented Nov 3, 2018

@tombaker should this issue have the ISO_18836-2, as #43 has?

@kcoyle
Collaborator

kcoyle commented Nov 8, 2018

Antoine, when you say "skeptical for domain" are you saying that you don't think this is the correct domain, or that domainIncludes doesn't work here? It seems that you are reviewing the choice of domains and ranges, and if so I would say that is out of scope for this work on the ISO document.

@aisaac
Collaborator Author

aisaac commented Nov 8, 2018

@kcoyle when I've used the wording "skeptical for domain" for a row, I meant that I have doubts about relaxing the domain in that line to domainIncludes (while keeping the same class as the object of the new domainIncludes axiom)

@tombaker
Collaborator

tombaker commented Nov 13, 2018

@aisaac In general: Your spreadsheet lists the first fifteen properties in alphabetical order. Is this a coincidence, or are you hinting that we should review range/rangeIncludes for all 56 properties?

On the specifics:

  • alternative is a subproperty of title, which has a range of literal. We actually had a long discussion back in 2007 about whether limiting Title to literals was too limiting. The counter-example was a title in Japanese: one "title" but with multiple ways of expressing that title as a literal (hiragana, kanji, romaji, etc). We considered coining something like a title_r property (title as a resource) but decided against it at the time, without precluding that we could coin it in the future or that another initiative could do so.
  • date as literal. At the time, Date was being used overwhelmingly with literals structured according to one of the date standards and we wanted to strengthen that expectation with the formal range of literal. At the same time, Coverage explicitly allowed named periods, and the applicability of Date versus Coverage in specific cases looked like a grey area.
  • domain of Collection for accrual* properties: I would see no harm in leaving these as formal domains, since the properties clearly apply to things that can be inferred to be collections. On the other hand, do we have any evidence that people are using these domains to make inferences? If not, domainIncludes is just as good.
  • I agree that Bibliographic Citation is potentially an edge case but think that to make an informed decision, we would need to look at some usage data.

In general, I see more value in deciding on some simple principles and applying them consistently than in taking a more nuanced approach and applying the changes on a case-by-case basis.

The approach I favor is:

  1. change all range axioms to rangeIncludes except for properties with range of rdfs:Literal
  2. change all domain axioms to domainIncludes.

For the sake of consistency, I would rather change all of the domain axioms to domainIncludes - or leave them all as they are. Note that there are only five properties with domains: the three collection accrual properties, Bibliographic Citation (again), and Medium (domain Physical Resource). Relaxing the axiom for Medium would have the side effect of removing an impediment, noted by @rguenther52 in Issue #26 on July 11, to using the property Medium with digital files.
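
The two-step approach above can be sketched as a mechanical rewrite over a list of axioms. This is only an illustration, not the procedure DCMI actually used: the function name `relax_axioms`, the tuple layout, and the `dcam:` prefix for domainIncludes/rangeIncludes are assumptions, and the four sample axioms are a hand-picked subset rather than the full vocabulary.

```python
# Minimal sketch of the "blanket" strategy, assuming axioms are held as
# (property, predicate, class) tuples. The dcam: prefix is an assumption.

def relax_axioms(axioms):
    """Rewrite domain/range axioms according to the two rules above."""
    relaxed = []
    for prop, predicate, cls in axioms:
        if predicate == "rdfs:domain":
            # Rule 2: every domain axiom becomes domainIncludes.
            predicate = "dcam:domainIncludes"
        elif predicate == "rdfs:range" and cls != "rdfs:Literal":
            # Rule 1: non-literal ranges become rangeIncludes;
            # a range of rdfs:Literal is left untouched.
            predicate = "dcam:rangeIncludes"
        relaxed.append((prop, predicate, cls))
    return relaxed


# Illustrative sample only, not the complete set of DCMI axioms.
sample = [
    ("dct:medium", "rdfs:domain", "dct:PhysicalResource"),
    ("dct:accrualMethod", "rdfs:domain", "dcmitype:Collection"),
    ("dct:title", "rdfs:range", "rdfs:Literal"),
    ("dct:creator", "rdfs:range", "dct:Agent"),
]

for axiom in relax_axioms(sample):
    print(axiom)
```

Under this sketch, dct:title keeps its rdfs:range because its range is rdfs:Literal, while the other three axioms are relaxed.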

@tombaker
Collaborator

tombaker commented Nov 13, 2018

PROPOSAL 1 - "status quo, based on decisions to date"

  1. change all range axioms to rangeIncludes except for properties with range of rdfs:Literal
  2. change all domain axioms to domainIncludes.

Please vote. If you can live with a proposal but do not favor it, use the "Confused" emoji.

@tombaker
Collaborator

tombaker commented Nov 13, 2018

PROPOSAL 2 - "change some literal ranges to rangeIncludes and change some domainIncludes back to domain, on a case-by-case basis"

  1. change most range axioms to rangeIncludes, with some exceptions (proposal required)
  2. change some domain axioms to domainIncludes (proposal required)

Please vote. If you can live with a proposal but do not favor it, use the "Confused" emoji.

@aisaac
Collaborator Author

aisaac commented Nov 13, 2018

@tombaker indeed I was hinting that we may have to review all 56 properties. But to test this hypothesis I don't think we need to discuss all of them now. So I've stopped arbitrarily after the first 15 properties in alphabetical order.

@aisaac
Collaborator Author

aisaac commented Nov 13, 2018

@tombaker thanks for the explanations re. the range of literal. My concern is not so much what the motivation was at the time, but rather whether we should still stick to it, especially at a time when we would relax everything else...

@aisaac
Collaborator Author

aisaac commented Nov 13, 2018

Why not divide and conquer? If there are 5 domains only then we can probably reach an agreement quickly about them and deal with ranges in a separate decision:

PROPOSAL (DOMAINS): keep Collection as rdfs:domain for accrualMethod, accrualPolicy and accrualPeriodicity; keep BibliographicResource as rdfs:domain for bibliographicCitation; and replace rdfs:domain with domainIncludes for medium.

Rationale: as said on my sheet, I can live with the domains kept as such for the collection accrual properties. The domain of bibliographicCitation doesn't bother me. There seems to be much to gain from not constraining medium to PhysicalResource if it's desirable to use it for digital resources.

@tombaker
Collaborator

tombaker commented Nov 14, 2018

@aisaac Purely in ontological terms, I'll admit that your proposal actually seems better than applying domainIncludes across the board. I hesitate because it slightly complicates the story by supporting two notions of domain, adding to the conceptual overhead for users, though admittedly the users of DCMIMT most likely to notice will also be the most likely to understand that distinction. Moreover, given the current state of decisions, we already support two notions of range: rdfs:range for literal ranges and rangeIncludes for all others.

Deciding to keep domain for four out of the five properties would mean superseding our unanimous decision of 2 November to assign domainIncludes. I see no actual harm in standing by our decisions to date with regard to domain (the "status quo" above). However, you have persuaded me enough to slightly prefer your solution if others agree.

I have changed my thumbs-up vote on the status quo (above) to a neutral stance (the "confused" emoji), added a thumbs-up to your proposal, and would like to hear other views.

@kcoyle
Collaborator

kcoyle commented Nov 14, 2018

My objection to keeping those as "domain" is that few of them seem useful as classes. For example, the only property with a domain of BibliographicResource is bibliographicCitation. The class adds nothing to its meaning, IMO. That leaves Collection as a class, and I just can't see any gain in having this one exception to the rule. I agree with Tom that exceptions are likely to cause confusion, so there would need to be a significant gain to justify this. If we had evidence of widespread use of these properties I might reconsider, but I would still need convincing.

@tombaker
Collaborator

See https://doodle.com/poll/ay5xcqqhwsnmna5n to schedule a call specifically about this next week (Nov 19 to 23).

@aisaac
Collaborator Author

aisaac commented Nov 15, 2018

@kcoyle what do you mean by "That leaves Collection as a class, and I just can't see any gain in having this one exception to the rule"? Collection is used in 3 domain statements, which is the majority of domain statements. Arguably this is in fact the one most important case.

I personally agree with you that we could do without BibliographicResource. But I am not aiming at questioning the relevance of classes here. I assume they're all extremely relevant except for the ones where we're making decisions that make them less relevant ;-) (i.e. the rangeIncludes has made the 'lumpy' classes formally useless).

@tombaker
Collaborator

tombaker commented Nov 15, 2018

@aisaac I find myself agreeing that rdfs:domain dct:Collection is the "right thing" here. I do not think it is actually wrong to say domainIncludes dct:Collection, but if the accrual properties are found in data, it can be useful to infer that the thing described is a collection in the very broad way defined by DCMI, even if it is found in data that asserts it, more specifically, to be part of, say, a herbarium or fungarium (both of which are types of collection).

So while I would argue for, say, domainIncludes herbarium, because herbarium would not cover the whole intended domain, this case is different: the properties are so clearly about things that are collections in the broadest sense that domain Collection could add useful information to a description in the form of inferences, especially if the data does not explicitly assert the described resource to be a member of a collection, or if the data consumer does not recognize the hypothetical kew:fungarium to be a collection.
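
The inference described here is the standard RDFS domain rule (rdfs2): if a property has rdfs:domain C, any subject using that property is entailed to be of type C. The sketch below illustrates it with the hypothetical kew:fungarium case. The function name `infer_types`, the `ex:` resources, and the plain-tuple representation are assumptions made for illustration; a real reasoner would work on an RDF graph.

```python
# Minimal sketch of RDFS rule rdfs2, assuming triples are plain tuples.
# With domainIncludes there is no formal entailment, which is modelled
# here by passing an empty table of domain axioms.

def infer_types(triples, domain_axioms):
    """Return rdf:type triples entailed by rdfs:domain axioms."""
    inferred = set()
    for subject, predicate, _obj in triples:
        cls = domain_axioms.get(predicate)
        if cls is not None:
            inferred.add((subject, "rdf:type", cls))
    return inferred


# Hypothetical data: typed only with a class the consumer may not know.
data = [
    ("ex:mycologyHoldings", "rdf:type", "kew:fungarium"),
    ("ex:mycologyHoldings", "dct:accrualPolicy", "ex:activePolicy"),
]

with_domain = {"dct:accrualPolicy": "dcmitype:Collection"}
with_domain_includes = {}  # domainIncludes licenses no inference

print(infer_types(data, with_domain))
print(infer_types(data, with_domain_includes))
```

With rdfs:domain, a consumer that has never heard of kew:fungarium still learns that the resource is a Collection; with domainIncludes, nothing is inferred and the class serves only as a hint to human readers.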

As for bibliographic citation, inferring that the thing described is a Bibliographic Resource is fine and good, but also kind of obvious and perhaps not as useful as an inference.

@kcoyle
Collaborator

kcoyle commented Nov 15, 2018

@aisaac A couple of things. First, Collection is listed under /dcmiType/ and does not appear in the regular list of classes, which is already a kind of exception.

Second, if you look at the stats that Osma and Joachim came up with, two of the 'accrualX' are used zero times and one is used 2 times. That is out of millions of triples. So making an exception for rarely used terms in the dcmiType vocabulary just doesn't make sense to me, and does not offset the confusion that an exception could cause.

The big question (that I don't believe we have actually decided) is what will the RDF version of the vocabulary have for domains and ranges? I suspect that we have at least two kinds of users: those who are using RDF/OWL tools like Protege and PoolParty (who I'll call the STRICT users), and those who are creating RDF-like data but not doing any formal validation against the RDF property and class definitions (the NON-STRICT users). The STRICT users will not be happy with '*Includes' if we change the RDF definition. The NON-STRICT users who are doing their own validation (or none at all) may not notice a difference.

@tombaker
Collaborator

@kcoyle I did indeed argue for consistency! As things stand, however, we are already keeping range (for literal ranges) and rangeIncludes, so in that sense it wouldn't be inconsistent to keep both variants of domain. That Collection appears in the type vocabulary is a good point - maybe that's why it seems more natural to see it used to infer rdf:type Collection triples.

@tombaker
Collaborator

tombaker commented Nov 20, 2018

APPROVED

Keep rdfs:domain for accrual* properties and bibliographicCitation

domainIncludes for medium

@tombaker
Collaborator

tombaker commented Nov 20, 2018

APPROVED

Leave rdfs:range for properties with literal range for now.

Revisit later (i.e., after publication of ISO) on a case-by-case basis.
