TTL version of the vocabulary needs improvement #84

reckart · 2019-03-11T17:32:37Z

It seems to me as if the TTL version (or maybe all LD versions) of the LAPPS vocabulary could use some refactoring.

My understanding is that these should represent a schema (based on OWL and/or RDFS). As such, the LAPPS types would be classes (rdfs:Class or owl:Class) and their attributes would be properties (rdf:Property, owl:DatatypeProperty or owl:ObjectProperty).

Let's take http://vocab.lappsgrid.org/Token as an example. The current TTL file says:

<http://vocab.lappsgrid.org/Token>
        a                owl:Class , rdfs:Class , rdfs:Resource ;
        rdfs:comment     "A string of one or more characters that serves as an indivisible unit for the purposes of morpho-syntactic labeling (part of speech tagging)." ;
        rdfs:subClassOf  <http://vocab.lappsgrid.org/Region> , <http://vocab.lappsgrid.org/Token> , <http://vocab.lappsgrid.org/Annotation> , <http://vocab.lappsgrid.org/Thing> ;
       <http://vocab.lappsgrid.org/Token#pos>
                "String or URI" .

<http://vocab.lappsgrid.org/Token#pos>
        a             owl:DatatypeProperty ;
        rdfs:comment  "Part-of-speech tag associated with the token." .

The inheritance information is highly redundant. The triple <http://vocab.lappsgrid.org/Token> <http://vocab.lappsgrid.org/Token#pos> "String or URI" does not express in RDFS or OWL that Token has an attribute called pos which can take a String or URI.

I believe a better representation would be e.g.

<http://vocab.lappsgrid.org/Token>
        a                owl:Class ;
        rdfs:comment     "A string of one or more characters that serves as an indivisible unit for the purposes of morpho-syntactic labeling (part of speech tagging)." ;
        rdfs:subClassOf  <http://vocab.lappsgrid.org/Region> .

<http://vocab.lappsgrid.org/Token#pos>
        a             owl:DatatypeProperty ;
        rdfs:comment  "Part-of-speech tag associated with the token." ;
        rdfs:domain <http://vocab.lappsgrid.org/Token> ;
        rdfs:range xsd:string .

I removed the redundant (inferable) information from the a and rdfs:subClassOf statements and rendered the value type information as rdfs:range.

However, one small problem remains: this does not express that the range can be a "String or URI". Specifying multiple types as the range indicates an intersection of the types (which would be empty in this case), not a disjunction. That is why I only put the "more generic" type xsd:string here.
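
One possible way to express the "String or URI" disjunction, assuming OWL 2 is acceptable for the vocabulary, is an anonymous data range built with owl:unionOf. The following is only a sketch and not something the current generator is known to produce. The prefix declarations are added so the snippet parses on its own, and whether downstream consumers of the vocabulary handle data range unions is an open assumption.

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

<http://vocab.lappsgrid.org/Token#pos>
        a             owl:DatatypeProperty ;
        rdfs:comment  "Part-of-speech tag associated with the token." ;
        rdfs:domain   <http://vocab.lappsgrid.org/Token> ;
        # Anonymous OWL 2 data range: a value is valid if it is
        # an xsd:string OR an xsd:anyURI (a union, not an intersection).
        rdfs:range    [ a            rdfs:Datatype ;
                        owl:unionOf  ( xsd:string xsd:anyURI ) ] .
```

Because the union is a single rdfs:range value, it avoids the empty intersection that two separate rdfs:range statements would produce. Plain RDFS tooling would see only a blank node here, so this trades RDFS-only compatibility for precision.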

ksuderman · 2019-03-12T00:50:39Z

The RDF, OWL, JSONLD, and TTL files are generated by Apache Jena from the same data model, and I notice that the OWL, JSONLD, and TTL files all contain redundant inheritance declarations while the RDF file does not. The only difference in how the files are generated is the value of the RDFFormat parameter passed to the RDFDataMgr.write() method; OntClass.setSuperClass(Resource) is called only once. We are using an old version of Jena, so hopefully simply updating the dependency will correct this.

The code that generates the property definitions is just plain buggy.

Both issues will be fixed in lappsgrid-incubator/vocabulary-dsl#10

ksuderman · 2019-03-17T17:38:10Z

@reckart I have deployed a test version to http://vocab.lappsgrid.org/1.3.0-SNAPSHOT for comment and review. In particular the RDF files are at http://vocab.lappsgrid.org/1.3.0-SNAPSHOT/lapps-vocabulary.ttl (et al).

All of the generated RDF files had the same redundant information, because the default Jena model uses a Reasoner that generates every triple it can infer. The redundant triples are removed by specifying a model that does not do inferencing.

The domain and range of properties should now be specified correctly.

Note: There are two definitions for Morphology included (http://vocab.lappsgrid.org/1.3.0-SNAPSHOT/Morphology and http://vocab.lappsgrid.org/1.3.0-SNAPSHOT/Token#morph). These are included only to test the schema generation and file deployment and do not represent how the WSEV may eventually represent morphological annotations.

ksuderman · 2019-03-17T20:00:12Z

NOTE: Updated URLs now contain -SNAPSHOT

ksuderman mentioned this issue Mar 12, 2019

Fix RDF/OWL/TTL property definitions. lappsgrid-incubator/vocabulary-dsl#10

Open
