What parts of term definitions are normative? #35

Closed
baskaufs opened this issue Apr 16, 2016 · 6 comments
Comments

@baskaufs

baskaufs commented Apr 16, 2016

Currently, Darwin Core and Audubon Core have vocabulary list documents that are designated as "Type 1" (a.k.a. normative) based on the document typing system established in the old draft Documentation Specification. The new draft specification has done away with that system and allows human-readable documents to declare particular sections or components (such as figures) to be normative or non-normative.

Based on the current practice, everything contained in the term list document (https://github.com/tdwg/dwc/blob/master/rdf/dwctermshistory.rdf and http://terms.tdwg.org/wiki/Audubon_Core_Term_List) should be considered normative, including examples and informative comments included in the term metadata. I believe that there has been some consensus in previous discussion that those examples and informative comments should not be considered normative, and that only the URI and definition should be considered normative (don't know about the label).

Should we specify this as a general practice? It should be possible to make this convention clear in the text of each human-readable term list. Since the specification as currently written would end the practice of having an RDF document be the normative document for a standard, the question of how to express which RDF triples are normative would be moot.

see section 3.3.3.1 of the draft documentation specification

@tucotuco
Member

Could a distinction be made that a machine-readable document can be
normative, but if so, its entire content outside of commented text must be
normative, while human-readable documents can contain a mix, with anything
not explicitly stated as normative being non-normative?

Why? Because I am not sure that the following statement from Section 3 is
necessarily true:

"Determining what is necessary to comply with a standard is necessarily a
human activity. Therefore, the normative content of the standard should be
contained exclusively in human-readable documents."

It may be a human activity to define what is normative, but at some point,
or at some level, machines probably need to be able to test whether an
artifact claiming to be compliant is so.
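
As a rough illustration of the kind of machine test meant here, the sketch below checks whether a record uses only term URIs from a list that a standard declares normative. The term set is a small illustrative subset, and the function name and the example.org URI are hypothetical, not anything defined by the specification.

```python
# Illustrative subset of term URIs that a standard might declare normative.
# (Assumption: a real check would load the full term list from the standard.)
NORMATIVE_TERMS = {
    "http://rs.tdwg.org/dwc/terms/scientificName",
    "http://rs.tdwg.org/dwc/terms/eventDate",
}


def unknown_terms(record):
    """Return, sorted, the keys of `record` that are not normative term URIs."""
    return sorted(key for key in record if key not in NORMATIVE_TERMS)


# A record that mixes a recognized term with a made-up one.
record = {
    "http://rs.tdwg.org/dwc/terms/scientificName": "Puma concolor",
    "http://example.org/terms/notARealTerm": "x",
}

print(unknown_terms(record))
```

A check like this only tests whether the right identifiers were used. Whether the values are used in the sense the definitions intend still has to be judged by a person.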

In the alternative scenario, what will have to happen to make existing
standards compliant? What will be their status until they are compliant?

@ramorrismorris
Contributor

@tucotuco asks: "In the alternative scenario, what will have to happen to make existing standards compliant? What will be their status until they are compliant?" It seems that the W3C has specifications that answer these questions, but such specs do not seem to be public...?

@baskaufs
Author

baskaufs commented May 3, 2016

I think that my belief that the normative content of vocabulary standards should be human readable comes from what I've come to understand by studying what is and isn't possible to achieve with machine processing. The RDF 1.0 Semantics document [1] sums the situation up like this:

Exactly what is considered to be the 'meaning' of an assertion in RDF or RDFS in some broad sense may depend on many factors, including social conventions, comments in natural language or links to other content-bearing documents. Much of this meaning will be inaccessible to machine processing...

The chief utility of a formal semantic theory is not to provide any deep analysis of the nature of the things being described by the language or to suggest any particular processing model, but rather to provide a technical way to determine when inference processes are valid, i.e. when they preserve truth.

We can add a lot of OWL markup to term definitions, but that is basically going to accomplish one of two things:

  1. entail other triples
  2. destroy the consistency of the graph if the terms are used in the "wrong" ways

Neither of these things is ever going to make the machine client actually "understand" what the terms mean. The actual "meaning" of the terms is going to be encapsulated by the human-readable text found in the rdfs:comment values, and those comments are just going to be the same text that is in the human-readable HTML web page or PDF version of the vocabulary document.

So I suppose we could declare that the triples containing rdfs:comment as a predicate are normative. But what would that accomplish? Make people struggle to look at Turtle or XML files to figure out what is going on? A machine client is going to get absolutely nothing from that triple.

I suppose this is a philosophical discussion that is way beyond the scope of an issue tracker comment. But certainly in the case of Darwin Core, aside from a few subproperty declarations, there is nothing in the RDF that would help a machine client understand what terms mean - it's only the human-readable literals that provide the meaning.

[1] http://www.w3.org/TR/2004/REC-rdf-mt-20040210/#intro
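
To make the point about subproperty declarations concrete, here is a minimal sketch of the RDFS subproperty rule applied to a toy graph. It uses plain tuples rather than an RDF library, and every `ex:` name is hypothetical. The reasoner can entail one new triple from the subproperty declaration, but the `rdfs:comment` literal passes through as an uninterpreted string.

```python
SUBPROPERTY_OF = "rdfs:subPropertyOf"


def entail_subproperties(triples):
    """Apply the RDFS subproperty rule repeatedly until no new triples appear.

    Rule: if (p subPropertyOf q) and (s p o) are in the graph, add (s q o).
    """
    graph = set(triples)
    changed = True
    while changed:
        changed = False
        # Collect every (subproperty, superproperty) pair currently declared.
        pairs = {(s, o) for s, p, o in graph if p == SUBPROPERTY_OF}
        for s, p, o in list(graph):
            for sub, sup in pairs:
                if p == sub and (s, sup, o) not in graph:
                    graph.add((s, sup, o))
                    changed = True
    return graph


# Toy graph: all ex: terms are invented for this example.
triples = [
    ("ex:hasCollector", SUBPROPERTY_OF, "ex:hasAgent"),
    ("ex:hasCollector", "rdfs:comment", "The person who collected the specimen."),
    ("ex:specimen1", "ex:hasCollector", "ex:alice"),
]

inferred = entail_subproperties(triples)
new_triples = inferred - set(triples)
print(new_triples)
```

The only thing the machine gains is the entailed `ex:hasAgent` triple. Nothing in the procedure touches the text of the comment, which is where the definition actually lives.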

@nfranz

nfranz commented May 3, 2016

I am not a computer scientist, but this nevertheless seems too strongly
worded to me.

Certain types of information, I suppose by virtue of being encoded by
humans in the right ways, are amenable to knowledge representation and
reasoning. Logic reasoners can process this information to infer additional
(implied) knowledge. Is that not a more productive way to set the bar for
machine processing?

Best, Nico

@baskaufs
Author

baskaufs commented May 3, 2016

I have a response to @nfranz's comment, but I'm thinking this comment box isn't the place for it. If I have time, I'll put it in a blog post and link to it.

@baskaufs
Author

Here is some text from the 2016-05-04 meeting notes about this issue:

There are essentially three separate but related processes going on here:

  1. A documentation process (described by the documentation specification), which includes demarcating what is normative and non-normative in standards documents, how version information is recorded, and how versions are connected to each other and to their current resources. It does not stipulate what should be normative or not, nor how versioning should be managed.
  2. A vocabulary maintenance process (described by the vocabulary maintenance specification) which includes decision-making about whether changes should be made to vocabularies and terms within them. This specification would presumably trigger varying levels of oversight depending on the extent to which the proposed changes would affect stability and interoperability of the vocabulary.
  3. A vocabulary management process (possibly described by some document, but if so, not one that is included as part of a standard) that would include practical aspects of managing documents, endpoints, GitHub repos, etc., and that would involve generating new versions and representations of documents, and releases of standards “bundles” of documents. The changes that take place would be documented according to the documentation process (#1) and in some cases triggered by the vocabulary maintenance process (#2), but the management of those changes would be dictated by practicalities, not by prescribed rules.

Maintaining separation among these three processes would make completing the two standards tractable. The complications involved in any one of these three processes would not necessarily impede description of the other two.

Given that understanding of the situation, these would be the implications for Issue #35 (normative parts of term definitions):

Section 3.2.1 of the current draft documentation specification says that authors of descriptive documents must indicate which parts of the document are normative and which (if any) are not. Vocabulary descriptions are specified as a special category of descriptive documents, so the same thing applies to them. Machine-readable representations of descriptive documents provide metadata about the documents, but do not include the full content of the document, so issues of normative vs. non-normative content do not apply to them. Machine-readable representations of vocabularies (in the form of terms and term lists) should include what is essentially the same information as is included in the human-readable representations.

Because human- and machine-readable vocabulary representations are effectively identical, whatever designation of normative vs. non-normative is made in the human-readable representation should also be made in the machine-readable representation. For example, if the human-readable vocabulary term list document states that the definitions are normative but that the comments are not, then the machine-readable description of the term list should include the same statement in an rdfs:comment value.
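
As a rough sketch of what that could look like, the snippet below builds a Turtle description of a term list whose rdfs:comment repeats the normativity statement from the human-readable document. The term list URI, the wording of the note, and the function name are all placeholders for illustration; they are not taken from the specification or its examples.

```python
def term_list_turtle(term_list_uri, normative_note):
    """Build a small Turtle document attaching a normativity note to a term list."""
    # Escape backslashes and double quotes so the note is a valid Turtle literal.
    escaped = normative_note.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n"
        "\n"
        f"<{term_list_uri}>\n"
        f'    rdfs:comment "{escaped}"@en .\n'
    )


# Placeholder URI and wording, chosen only for this example.
turtle = term_list_turtle(
    "http://example.org/termlist/",
    "Term definitions are normative; comments and examples are not.",
)
print(turtle)
```

The statement is still only readable by a person, but it travels with the machine-readable representation, so both forms say the same thing about what is normative.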

I’ve added section 4.4.2.1 and an example in 4.4.2.2 to clarify this. I have also removed the text from section 3 that declared that normative content is found only in human-readable documents. I think that these actions address this issue. The spec defines "normative" and "non-normative", but telling authors what should and should not be normative is basically out of scope for the Documentation Spec.
