Coining and applying dcam:rangeIncludes #43

Open
tombaker opened this issue Jul 19, 2018 · 47 comments

Comments

@tombaker
Collaborator

tombaker commented Jul 19, 2018

Issue 32, "Broken ranges?", has generated some great discussion, covering among other things the history of how rdfs:domain and rdfs:range ended up being defined as they currently are. This was also discussed in the June 26 telecon. It looks like we have converged on the following approach:

  • Coin http://purl.org/dc/dcam/rangeIncludes, e.g., as defined in the current ISO draft:

    Class of which a value described by the term may be an instance.

  • Coin http://purl.org/dc/dcam/domainIncludes:

    Class of which a resource described by the term may be an instance.

  • Substitute dcam:rangeIncludes for rdfs:range (and dcam:domainIncludes for rdfs:domain) for all properties except those with a range of rdfs:Literal (which will continue to use rdfs:range).

Note that the definition of dcam:rangeIncludes proposed here is only slightly different from that of rdfs:range ("class of which a value described by the term is an instance"). For comparison, see the Schema.org definitions [1,2].

Note that the definitions of rangeIncludes and domainIncludes would be included in ISO 15836, but as part of Section 3.1 ("Terms and definitions"), where they would be defined in words only (i.e., no URI), and not in Section 3.3 ("DCMI properties"), which lists properties of the /terms/ namespace with their URIs.

2018-07-20 edit: This posting conflates several issues that I would like to split out into separate questions below.

[1] https://meta.schema.org/rangeIncludes
[2] https://meta.schema.org/domainIncludes

@kcoyle
Collaborator

kcoyle commented Jul 19, 2018

Using the DCAM namespace essentially makes this a modification to DCAM. Should we look at what it does to the DCAM model? Should we update the DCAM documentation to include this?

@tombaker
Collaborator Author

@kcoyle Instead of starting an entirely new namespace, I suggest we simply rename Section 8 of DCMIMT so that it does not imply that these are "Terms related to the DCMI Abstract Model", then annotate or even deprecate the DCAM specification itself, though that is a different discussion.

@tombaker
Collaborator Author

tombaker commented Jul 20, 2018

STRAW POLL ON USING rangeIncludes

Do you agree with changing rdfs:range statements into rangeIncludes statements for properties that are intended to be used with non-literal values? (And rdfs:domain => domainIncludes.)

@tombaker
Collaborator Author

tombaker commented Jul 20, 2018

STRAW POLL ON LITERAL RANGES

Do you agree with not changing rdfs:range rdfs:Literal statements?

@tombaker
Collaborator Author

tombaker commented Jul 20, 2018

STRAW POLL ON DEFINITION OF rangeIncludes

Do you agree with definitions along the following lines (possibly modified in light of Antoine's suggestion):

  • rangeIncludes (the definition currently used in DCMIMT and in the ISO draft):

    Class of which a value described by the term may be an instance.

  • domainIncludes:

    Class of which a resource described by the term may be an instance.

@tombaker
Collaborator Author

STRAW POLL ON ALTERNATIVE DEFINITION OF rangeIncludes

Do you agree with definitions based on Schema.org:

  • rangeIncludes:

    Relates a property to a class that constitutes (one of) the expected type(s) for values of the property.

  • domainIncludes:

    Relates a property to a class that is (one of) the type(s) the property is expected to be used on.

@tombaker
Collaborator Author

STRAW POLL ON DEFINITION OF range

The current ISO draft defines range as:

class of which a value described by the term is an instance

Antoine suggests that rdfs:range actually means something more like:

class of which any object of the statement using the property is an instance

Is Antoine's definition an improvement?

@sruehle
Collaborator

sruehle commented Jul 20, 2018

rdfs:range with rdfs:Literal statements

As far as I can see, the properties used with rdfs:Literal are the date properties and the titles. In both cases I would prefer rangeIncludes, because with rangeIncludes we would recommend using a literal value but also allow people to use URIs if necessary.
Geologic time scales are an example of dates used with URIs. If I want to say that stones were created in the Pleistocene, I would prefer to use a URI.
And in some cases (e.g. art history or anthropology) you may need to say more about a title, e.g. the time it was valid, the context in which it was used, etc.
If you think that these examples are too sophisticated and out of scope for the dcterms vocabulary, I can live with rdfs:range. But if we want to allow more complex data modelling with dcterms, I think rangeIncludes makes more sense.

@sruehle
Collaborator

sruehle commented Jul 20, 2018

Definition of rangeIncludes

I prefer Antoine's definition. So the definition will be
"Class of which any object of the statement using the property may be an instance" ?

@kcoyle
Collaborator

kcoyle commented Jul 20, 2018

The statements that begin "class of which" are very hard to parse - I have to read them multiple times and still they don't stick. (I think this is an English problem - I can see how it might work ok in other languages.) We need something more like the schema.org statement which is direct, not contorted.

Other suggestions:
rangeIncludes: The value of this property is expected to be an instance of this class or one of these classes.
domainIncludes: Use of this property implies that the subject of the triple is expected to be an instance of this class or one of these classes.

@aisaac
Collaborator

aisaac commented Jul 25, 2018 via email

@kcoyle
Collaborator

kcoyle commented Jul 25, 2018

Thanks, @aisaac. I'm going to add something here which we can decide to ignore... the use of rangeIncludes/domainIncludes to my mind is making a false equivalence or extension of rdfs:range/domain. In fact, what we are doing is the opposite of RDF:
RDF domain determines the class type of a subject in a triple via inference
RDF range defines the class type of the value via inference

Neither of these acts as a constraint on the domain or range of a property, and neither is advice to the metadata creator (although we almost always read it that way).

What schema.org does is it uses xIncludes to advise metadata creators on best practices for properties. It also treats those choices as OR rather than UNION. (I may have that wrong, my formal logic is minimal.) This is really unrelated to the semantics of RDF domain and range.

I say this because I think we want to be clear that we are providing something very different from RDF's domain and range. I would like to see the RDF world adopt something other than xIncludes, and I think we should think about this as we do more work on application profiles. I would like to have properties that have the meaning of "suggested value types" "suggested class inclusion" or something like that, and that are clearly differentiated from domain and range. Among other reasons, there may be cases in which you want to define both (dubious? thinking ...)

All of this to say that I do want us to use "suggested" or "expectation" in our definitions because that is the actual meaning of the property. We do not want to imply that any inferencing on these values would produce the kind of semantics that one would expect with domain and range, including their use in query (SPARQL).

@sarahartmann
Collaborator

All of this to say that I do want us to use "suggested" or "expectation" in our definitions because that is the actual meaning of the property.

+1

@kcoyle
Collaborator

kcoyle commented Sep 7, 2018

Relates a property to a class that is an expected type for the value of the property.

@kcoyle
Collaborator

kcoyle commented Sep 7, 2018

rangeIncludes: The value of this property may be an instance of this class
domainIncludes: Use of this property implies that the subject of the triple may be an instance of this class

It is recommended that values of this property be members of this class.

@tombaker
Collaborator Author

tombaker commented Sep 7, 2018

rangeIncludes: Instances of this class are appropriate as values for the property.

@aisaac
Collaborator

aisaac commented Sep 7, 2018

Trying to keep the notion of expectation in @kcoyle's comment above, but fitting it to the case of several rangeIncludes:
Relates a property to a class that is one of the possible types expected for the value of the property.

@osma
Collaborator

osma commented Sep 7, 2018

My rewording of @aisaac's above proposal:

Relates a property to a class whose instances are appropriate values for the property.

@aisaac
Collaborator

aisaac commented Sep 7, 2018

@osma I still would fight for some idea of 'expectation' to be included.
And 'appropriate' is maybe too strong. I mean, it sounds ok in the affirmative sense, but if one turns it into the negative form, the sentence then sounds very strong to my ears. I.e., instances of classes that are not in the rangeIncludes could then be said to be 'not appropriate', and that can be interpreted as 'not allowed'. In fact that would be even stronger than what our original rdfs:range statements formally amount to (since rdfs:range is formally not a very strong constraint).

@osma
Collaborator

osma commented Sep 7, 2018

@aisaac I don't quite follow...you can't just turn the parts of a statement like this into their negatives and retain the meaning. Saying that "X is an appropriate value type for the property Y" does not entail that "everything not X is not an appropriate value type for Y", just like "Socrates is human" does not entail "everything that is not Socrates is not human".

I'm thinking about this from an Open World Assumption perspective, where each statement represents a part of the truth, but does not preclude other similar statements from being true as well (as long as they don't contradict each other). So each rangeIncludes statement should be true on its own, irrespective of other rangeIncludes statements that may apply to the same property. Maybe the wording I suggested does not convey this in an ideal way, but that's what I was aiming at.

Regarding rdfs:range, I consider it quite strong, despite it not being intended for validation. With "Y rdfs:range X", every value that is used with the property Y can be inferred to be an instance of X. This usually means that there can be only one value for rdfs:range for a particular property. For example, with dct:creator rdfs:range ex:Person, ex:Organization, this would mean that every creator is both a person and an organization. I would like to define rangeIncludes and domainIncludes in a way that permits them having multiple values (as separate, independent statements) without causing nonsensical situations like that.
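
To make the contrast concrete, here is a minimal sketch of range inference (the RDFS entailment pattern usually labelled rdfs3) over plain Python tuples. It is only an illustration, not a real reasoner: the function name, the ex: identifiers, and the sample data are assumptions made up for this example, and no RDF library is involved.

```python
# Illustrative only: triples are (subject, predicate, object) tuples of strings.
# ex:Person, ex:Organization, ex:report and ex:acme are hypothetical names.
RDFS_RANGE = "rdfs:range"
RDF_TYPE = "rdf:type"
RANGE_INCLUDES = "dcam:rangeIncludes"


def infer_range_types(triples):
    """Return the rdf:type triples that rdfs:range statements license."""
    ranges = [(s, o) for (s, p, o) in triples if p == RDFS_RANGE]
    inferred = set()
    for (s, p, o) in triples:
        for (prop, cls) in ranges:
            if p == prop:
                # Every value of the property is inferred to be an instance of cls.
                inferred.add((o, RDF_TYPE, cls))
    return inferred


schema_with_range = {
    ("dct:creator", RDFS_RANGE, "ex:Person"),
    ("dct:creator", RDFS_RANGE, "ex:Organization"),
}
schema_with_hints = {
    ("dct:creator", RANGE_INCLUDES, "ex:Person"),
    ("dct:creator", RANGE_INCLUDES, "ex:Organization"),
}
data = {("ex:report", "dct:creator", "ex:acme")}

# Two rdfs:range statements: ex:acme is inferred to be BOTH a person and an organization.
strict = infer_range_types(schema_with_range | data)
# Two rangeIncludes statements: nothing is inferred about ex:acme.
advisory = infer_range_types(schema_with_hints | data)
```

With two rdfs:range statements the value ends up typed as both classes, whereas the two rangeIncludes statements stay independent and license no inference at all.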

@aisaac
Collaborator

aisaac commented Sep 7, 2018

@osma I beg to disagree in this specific case. Having a definition reading "Here's a property P. Class C is appropriate for P" is quite different from "Here's Socrates. Socrates is a human".
I agree with the open world reading, but intuitively the first sentence is going to cast a big shadow of doubt on the appropriateness of everything that is not C. And intuition matters here: see how the intuitive reading of the wording for rdfs:range misled many into thinking it is stronger than it is in reality.

@osma
Collaborator

osma commented Sep 7, 2018

@aisaac Point taken. I guess we have to find another formulation that cannot be misinterpreted that way. How about:

Relates a property to a class whose instances may be used as values for the property.

@aisaac
Collaborator

aisaac commented Sep 7, 2018

@osma I miss very much the notion of 'expectation' or 'suggestion' in this. 'May' could be anything...
How about:
Relates a property to one of the suggested classes whose instances may be used as values for the property.
This would also capture the 'OR' semantics of several rangeIncludes statements for one property.
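
As a rough illustration of that 'OR' reading, the sketch below treats several rangeIncludes classes as alternatives and only produces advice, never an error. The function names, the ex: identifiers, and the idea of such a checking tool are all hypothetical; nothing here is defined by DCMI or Schema.org.

```python
# Hypothetical advisory check: several suggested classes are read as alternatives (OR).
def matches_suggestion(value_types, suggested_classes):
    """True if the value is an instance of at least one suggested class."""
    return bool(set(value_types) & set(suggested_classes))


def advise(prop, value, value_types, suggested_classes):
    """Return a warning string or None; never raises, since a hint is not a constraint."""
    if matches_suggestion(value_types, suggested_classes):
        return None
    return f"{value} is not an instance of any class suggested for {prop}"


suggested = {"ex:Person", "ex:Organization"}
ok = advise("dct:creator", "ex:alice", {"ex:Person"}, suggested)
odd = advise("dct:creator", "ex:hal9000", {"ex:Software"}, suggested)
```

Here `ok` is None because one alternative matches, while `odd` carries a note for a human reader and the data itself remains usable.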

@kcoyle
Collaborator

kcoyle commented Sep 8, 2018

I like Tom's suggestion: rangeIncludes: Instances of this class are appropriate as values for the property

And we could substitute "suggested" for "appropriate". I agree with Antoine that "appropriate" has the sense that anything else is "inappropriate" which is very close to "not valid" in meaning. Not being suggested just means that it isn't preferred. In fact, we could use "preferred":

rangeIncludes: Instances of this class are preferred as values for the property
(Or: Preferred values of this property are ones that are instances of this class.) (But I think Tom's is better, with "preferred" substituted.)

I'm not happy with the "relates" statements. Every property "relates" so that isn't necessary.

@tombaker
Collaborator Author

tombaker commented Nov 2, 2018

    Relates a property to a class whose instances may be used as
    values for the property.

    Relates a property to one of the suggested classes whose
    instances may be used as values for the property.

    Instances of this class are appropriate as values for the property.

    Instances of this class are preferred as values for the property.

    Relates a property to a class that constitutes (one of) the expected
    type(s) for values of the property. (Schema.org)

    Instances of this class are suggested as values for the property.

    Instances of this class are one of the [appropriate|preferred|suggested]
    type(s) for values of the property.

@tombaker
Collaborator Author

tombaker commented Nov 2, 2018

PROPOSED

  Suggested

@tombaker
Collaborator Author

tombaker commented Nov 2, 2018

PROPOSED

Instances of this class are one of the suggested
types for values of the property.

@tombaker
Collaborator Author

tombaker commented Nov 2, 2018

PROPOSED

   Values of the property may be instances of this class.

@tombaker
Collaborator Author

tombaker commented Nov 2, 2018

PROPOSED

   It is suggested that values of this property be instances of this class

@tombaker
Collaborator Author

tombaker commented Nov 2, 2018

PROPOSED

    Instances of this class are suggested as values for the property.

@tombaker
Collaborator Author

tombaker commented Nov 2, 2018

PROPOSED

  A suggested class for values of this property.

@tombaker
Collaborator Author

tombaker commented Nov 2, 2018

PROPOSED

 A suggested class for subjects of this property.

@tombaker
Collaborator Author

tombaker commented Nov 2, 2018

Resolved!

@aisaac
Collaborator

aisaac commented Nov 3, 2018

This was not in fact closed, and I'm not sure it should be. Out of the initial points, we have agreed on having domainIncludes and rangeIncludes, and we've agreed on a definition for them, but have we agreed to apply them to all properties that have a domain and a non-literal range, without any exception?
I have voiced some concerns about the third point and am reiterating them here.
But for the sake of making progress on issues and clarifying the discussion, maybe we can discuss this in a separate issue, which I'm going to create.

@tombaker
Collaborator Author

tombaker commented Nov 8, 2018

Closed - discussion continues in Issue #59

@tombaker
Collaborator Author

@niklasl Am reopening this issue in light of schemaorg/schemaorg#3442, where @hoijui suggests mapping schema:rangeIncludes to dcam:rangeIncludes (and ditto for schema:domainIncludes) via owl:equivalentProperty and @danbri points out that they have subtly different meanings.

@kcoyle
Collaborator

kcoyle commented Jan 12, 2024

I commented there, and in fact I don't think that the differences are all that subtle. schema.org defines domains and ranges for its xIncludes properties that are quite specific to schema. DC's properties have no domains or ranges defined, making them wide open in terms of those aspects. In RDF terms, that's really apples and oranges.

As I say there:

(I do think it unwise to adopt vocabulary property names into different vocabularies but with different meanings (even if subtle). The difference in namespace is obvious to machines but we humans tend to jump to conclusions based on human-understandable text.)

@hoijui

hoijui commented Jan 13, 2024

My main motivation is to help ontology designers (like me) as much as possible in figuring out which one they should use,
and secondarily, if possible, to connect the two in whatever machine-readable way makes the most sense.

As I understand it right now (after @kcoyle's explanations), the dcmi versions make sense for pretty much anyone, while the schema versions make sense only for ontologies that use schema as their base ... right?
So would it make sense for the schema versions to be defined as rdfs:subPropertyOf the dcmi versions?
Or should they simply be marked as "unequal" in some machine-readable way, and then annotated with an extra comment (for humans), explaining when to use which?

@niklasl
Contributor

niklasl commented Jan 13, 2024

Yes, this difference is crucial. The DCAM variants are intentionally unconstrained, so they cannot be subproperties of the more constrained Schema.org variants.

It would be great if Schema.org defined those to be rdfs:subPropertyOf the DCAM variants, to aid in ontology interoperability (for both humans and machines).
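
The direction of that alignment can be sketched with the RDFS subproperty pattern (usually labelled rdfs7): statements made with the subproperty entail the same statements with the superproperty, but not the reverse. This is a simplified, single-pass illustration over tuples; the function name and the choice of sample statements are assumptions for the example, and the subPropertyOf triple is the proposal under discussion, not something either vocabulary currently declares.

```python
# Illustrative only: triples are (subject, predicate, object) tuples of strings.
SUBPROP = "rdfs:subPropertyOf"


def infer_superproperty_statements(triples):
    """Apply the subproperty pattern once: if P subPropertyOf Q and (s P o), then (s Q o)."""
    subprops = [(s, o) for (s, p, o) in triples if p == SUBPROP]
    inferred = set()
    for (s, p, o) in triples:
        for (sub, sup) in subprops:
            if p == sub:
                inferred.add((s, sup, o))
    return inferred


# Proposed (hypothetical) alignment: the Schema.org property is the narrower one.
alignment = {("schema:rangeIncludes", SUBPROP, "dcam:rangeIncludes")}
schema_stmt = {("schema:author", "schema:rangeIncludes", "schema:Person")}
dcam_stmt = {("dct:creator", "dcam:rangeIncludes", "dct:Agent")}

# A Schema.org statement also counts as a DCAM statement...
from_schema = infer_superproperty_statements(alignment | schema_stmt)
# ...but a DCAM statement does not become a Schema.org statement.
from_dcam = infer_superproperty_statements(alignment | dcam_stmt)
```

Under this alignment the unconstrained DCAM property never inherits the Schema.org-specific expectations, which is the asymmetry argued for above.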

Process-wise it may be considered "backwards", but such is occasionally the case (and in the future more commonly so, I hope). An "application ontology" defines properties for specific needs. Sometimes those needs would be great to generalize just a tiny bit, defined in a "shared ontology", which the application-centric one can then, as a supportive act of the general case, formally align to. (I believe this is quite like what we hope for with openWEMI, for instance.)

@kcoyle
Collaborator

kcoyle commented Jan 13, 2024

Now I have to rebut my own comment because openWEMI is using the same class names as FRBR. I still think that will work because FRBR classes are not actually in use. The words, especially "Work", are common and not exclusively property names, like "rangeIncludes". However, I still think that thought must be given to how humans AND machines act on defined elements, especially since there is an explosion of terms being defined in the RDF space.

@philbarker

I commented there, and in fact I don't think that the differences are all that subtle. schema.org defines domains and ranges for its xIncludes properties that are quite specific to schema.

But the schema:Property and schema:Class classes that are the suggested domain and range of their ___Includes properties are declared as equivalent to rdf:Property and rdfs:Class, respectively. So

schema:rangeIncludes  
    schema:domainIncludes schema:Property ;
    schema:rangeIncludes schema:Class .

is not particularly limiting. I find it hard to think of uses of the dct: properties that wouldn't honour that suggestion.

That said, I wouldn't mind the superabundance of caution that the rdfs:subPropertyOf provides, because many things that I find hard to think of turn out to exist.

@hoijui

hoijui commented Jan 15, 2024

Both rdfs:range and rdfs:domain already have rdfs:Class as their rdfs:range. I would assume it thus makes sense to have this limitation for their *Includes counterparts as well.

@tombaker
Collaborator Author

@hoijui To use a turn of phrase from Alistair Miles, I think of rdfs:range and rdfs:domain as granting a "license to infer" that the object (or subject) of a triple using the given property is an instance of the given range class (or domain class). Defining schema:rangeIncludes and schema:domainIncludes as having an rdfs:range of rdfs:Class would likewise give "license to infer" that the object is an instance of a class.

But what if a URI used as the object of a rangeIncludes assertion were not explicitly defined (or perhaps even intended) by its owner to be a class? Would we want to license that inference anyway?

Put another way, would we really want to implicitly limit rangeIncludes to be used with classes? Could a SKOS concept serve as the object just as well?

I see utility in a notion of rangeIncludes defined as something like a type hint in Python. In Python, a type hint conveys intention in a way that is helpful to a reader of some code or user of an IDE. But while type hints can be enforced by tools, Python itself does not actually block a user from assigning an integer to a variable type-hinted to be a string. I like the idea of having such type hints for RDF too.
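
A minimal sketch of that Python behaviour, using only the standard library; the function name `describe` and the sample values are made up for illustration.

```python
from typing import get_type_hints


def describe(title: str) -> str:
    """The hint says `title` should be a string, but nothing enforces it at run time."""
    return f"title={title!r}"


# The hint is recorded and visible to tools such as IDEs and static type checkers...
hinted = get_type_hints(describe)

# ...but the interpreter itself accepts a value that contradicts it.
as_hinted = describe("Dublin Core")
not_as_hinted = describe(42)
```

A static checker would be expected to flag the second call, yet the program runs either way, which is the advisory quality being suggested for rangeIncludes.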

@kcoyle
Collaborator

kcoyle commented Jan 15, 2024

@tombaker @hoijui I can understand Tom's view that "limiting" to classes (aka URIs) is a limitation, although his argument could also be read as "no problem, because anything can be a class" and "this only matters when doing inferencing."

But I am still bothered by the use of the rangeIncludes/domainIncludes properties, which to me imply a kind of formality related to the RDF concept of range (which does require a class value).
I think it is significant that DCTerms definition for rangeIncludes is:

A suggested class for values of this property.

schema's is:

Relates a property to a class that constitutes (one of) the expected type(s) for values of the property.

The former reads like a note for humans, the latter is written as a formal statement. How does the human-facing definition interact with the RDF formal definition? I think the schema's definition can be read as potentially driving an application, whereas the DCTerms definition reads like a note for humans. It may be subtle, but there is a difference.

Schema is an "application vocabulary" as @niklasl calls it, and a vocabulary that is very self-contained, defining its own elements for Class and Property (as @philbarker shows). The "pseudo-formality" of schema:rangeIncludes and schema:domainIncludes may allow or imply some kind of processing on values for those creating schema metadata.

I see DCTerms as being less prescriptive. If, in most cases, DCTerms is silent on a formal constraint on values, then I would prefer that this be expressed in something that a person would read as a note without the baggage of the schema usage.

Given that most users of these vocabularies will not see the definitions for rangeIncludes and domainIncludes in schema and DCTerms, they would logically assume that they are the same. Because they are not, I think that using the same property names is confusing. Actually, that is precisely why we are here.

@danbri

danbri commented Jan 15, 2024 via email

@hoijui

hoijui commented Jan 15, 2024

Uff...
I am a software dev, and in my mind, RDF and ontologies are like DB schemas for distributed data. I know that is not true: it is not what they are meant for, nor what they formally are, and so on. It is just what I want to have a solution for at the end of the day, when walking away from this.
... The world is much simpler in my mind because of this.

Is it even an option to change the names of the two DC properties (as I understand it, some of you think that might be a good way to go)?

@tombaker
Collaborator Author

@hoijui As a point of procedure: to follow our namespace policy, we would not change the names of the two DC properties. However, we could in principle create new properties under a different name, use the new properties instead of the existing properties, and mark the existing properties as deprecated.


10 participants