This issue was moved to a discussion.

Proposal: add properties that represent taxon relationships #225

Closed
camwebb opened this issue Jul 1, 2020 · 50 comments · Fixed by #226
Labels
enhancement New feature or request

Comments

@camwebb
Member

camwebb commented Jul 1, 2020

(Following from a short discussion at today's 2020-06-30 video call...)

In TCS1, the TDWG TaxonConcept ontology, and our draft Terms document, the only way to express the relationship between two taxon name usages (or taxon concepts) is via the creation of an instance of the TaxonRelationshipAssertion class. While this reification of the relationship is the only way to attach an accordingTo (at least in a table or in RDF), it is cumbersome (see, e.g., this figure). The accordingTo is also redundant in the case of a dataset representing a taxonomic monograph, where all such assertions are being made by a single author in a single publication. For simpler modelling of TNU relationships, I suggest that some users would value a standardized alternative in which the relationship types were themselves properties of a TNU. For example:

:TNU1 tcs2:is_congruent_to :TNU2 .
:TNU1 tcs2:includes :TNU3 .

These new properties would be added to the properties section of the TCS2 Terms document.
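
To make the contrast concrete, here is a minimal sketch of one relationship in both forms. It assumes the draft property names subjectTaxonConcept, relationshipType and objectTaxonConcept for the reified form; the namespace IRIs, the ex: resources and the controlled term are placeholders made up for illustration, not published TDWG identifiers.

```turtle
# Sketch only: both namespace IRIs below are hypothetical placeholders.
@prefix tcs2: <http://example.org/tcs2/> .
@prefix ex:   <http://example.org/data/> .

# Direct form: the relationship type is itself a property of the TNU.
ex:TNU1 tcs2:is_congruent_to ex:TNU2 .

# Reified form: one extra node per relationship, which gives
# accordingTo something to attach to.
ex:assertion1 a tcs2:TaxonRelationshipAssertion ;
    tcs2:subjectTaxonConcept ex:TNU1 ;
    tcs2:relationshipType ex:is_congruent_to ;   # hypothetical controlled term
    tcs2:objectTaxonConcept ex:TNU2 ;
    tcs2:accordingTo ex:publication1 .           # hypothetical publication
```

The direct form is one triple; the reified form is five, but only the reified form can say who asserted the relationship.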

@camwebb camwebb added the enhancement New feature or request label Jul 1, 2020
@mdoering

mdoering commented Jul 1, 2020

It opens up the standard for alternative representations. This shifts the problem to consumers that now have to check different places for the same information. I personally think this is a bigger problem than for the producer, so I would try to avoid this as much as possible. It is also an issue you find in DwC, ABCD and TCS1 where there are parsed and unparsed strings for the same thing. The more options a standard has, the less of a standard it is :)

@camwebb
Member Author

camwebb commented Jul 1, 2020

@mdoering Good point. I concede that for standardization of data transfer, having a single way to 'officially' do something is best.

@camwebb camwebb closed this as completed Jul 28, 2020
@nielsklazenga nielsklazenga transferred this issue from tdwg/tnc Jul 1, 2023
@nielsklazenga nielsklazenga reopened this Jul 21, 2023
@nielsklazenga nielsklazenga changed the title Proposal: add properties that represent TNU relationships Proposal: add properties that represent taxon relationships Jul 21, 2023
@nielsklazenga
Member

nielsklazenga commented Jul 21, 2023

I transferred this issue from the TNC repo a few weeks ago, because I think we should consider making the relationship types TCS properties, rather than controlled terms in an external vocabulary. I actually think we should do it, but I have been flip-flopping around on this issue a bit (privately; publicly I think this is the first time I am saying something about it). At the moment I even think we can do without the Taxon Relationship class altogether, but we might need it again when we want to add additional properties like whether the statement reflects the ostensive or intensional component of the relationship. On the other hand, that might be the time to add the class, rather than now.

Having the relationship types as TCS properties enables us to do things like this:

<https://id.biodiversity.org.au/instance/apni/545068> a tcs:TaxonConcept ;
  dcterms:title "Euphrasia gibbsiae sec. Barker 1982" ;
  tcs:accordingTo <https://www.jstor.org/stable/23873848> ;
  tcs:acceptedName <https://www.ipni.org/n/802545-1> ;
  tcs:synonym <https://www.ipni.org/n/802619-1> ;
  tcs:intersects <https://id.biodiversity.org.au/instance/apni/713514> .

<https://id.biodiversity.org.au/instance/apni/713514> a tcs:TaxonConcept ;
  dcterms:title "Euphrasia gibbsiae sec. Curtis 1967" ;
  tcs:accordingTo <https://id.biodiversity.org.au/reference/apni/23028> ;
  tcs:acceptedName <https://www.ipni.org/n/802545-1> .

... and, for a third-party taxon relationship assertion, in the body of an annotation or a nano-publication:

_:someone_elses_assertion a rdf:Statement ;
  rdf:subject <https://id.biodiversity.org.au/instance/apni/545068> ;
  rdf:predicate tcs:includes ;
  rdf:object <https://id.biodiversity.org.au/instance/apni/713514> .

I think I might be committing RDF sacrilege with the latter example, but this is just how I think it could work and someone else can correct me.

I think doing this makes TCS 2 simpler and more flexible and more like a proper TDWG vocabulary standard, i.e. a set of terms and definitions.

I acknowledge that it is rather late in the game to make significant changes, so if we cannot quickly agree on this we will table it, but if we can we might as well get it right from the start, so let's at least talk about it. At least at the moment I think this is right. Defining taxon relationship assertions as "Things" has always bothered me for some reason – and we have had some issues with naming the class (#43) – and now I think that might be because they are mere data structures and should not have a domain-specific class. Relationships between objects are generally recorded using properties. I also think it is not that much of a change and possibly a big improvement.

@camwebb
Member Author

camwebb commented Aug 4, 2023

I've always disliked the existing implementation of TC Relationships with the 'From' and the 'To' TCs, and agree that in simple cases making the relationship types into different properties would be cleaner and simpler. But... in our case, almost always we want to encode the 'according to' of the relationship - who asserted it. Which would mean reifying the statement every single time. I think the essence of the issue is this: will there almost always be metadata about the TC relationship that needs to be encoded in the data transmission model? If so, we should probably stick with what we have.

The problem is a generic one with modelling in RDF. Graph DBs (and their ASCII encodings) allow edges to have properties, which permits a more 'natural' graph. In RDF you always need an intermediate node to hang edge metadata on. I'm sure there has been extensive discussion of the pros and cons of writing a schema such that reification must always be used (for full data transmission). In practice, I haven't seen reification used as a core technology.

@nielsklazenga
Member

nielsklazenga commented Aug 5, 2023

Hi @camwebb, have a look at the examples I put in https://github.com/tdwg/tcs2/tree/examples-stash/docs/examples/taxon-relationships-redone. As this branch will disappear at some point, I will repeat some here for the sake of longevity.

So, a Taxon Concept with horizontal relationships between taxa looks like this:

@prefix tcs: <http://rs.tdwg.org/tcs/terms/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

[] a tcs:TaxonConcept ;
    dcterms:title "Campylium sec. Koperski et al. 2000" ;
    tcs:accordingTo <https://tropicos.org/reference/9022656> ;
    tcs:acceptedName <https://tropicos.org/name/35000208> ;
    tcs:synonym <https://tropicos.org/name/35101378> ,
            <https://tropicos.org/name/35101264> ;
    tcs:isCongruentWith [ a tcs:TaxonConcept ;
            dcterms:title "Campylium sec. Corley et al. 1981" ;
            tcs:accordingTo <https://tropicos.org/reference/9004554> ;
            tcs:acceptedName <https://tropicos.org/name/35000208> ] ;
    tcs:includes [ a tcs:TaxonConcept ;
            dcterms:title "Campylium sec. Smith 1980" ;
            tcs:accordingTo <https://tropicos.org/reference/9007023> ;
            tcs:acceptedName <https://tropicos.org/name/35000208> ] ;
    tcs:partiallyOverlaps [ a tcs:TaxonConcept ;
            dcterms:title "Chrysohypnum sec. Moenkemeyer 1927" ;
            tcs:accordingTo <https://tropicos.org/reference/9021085> ;
            tcs:acceptedName <https://tropicos.org/name/35000256> ] ,
        [ a tcs:TaxonConcept ;
            dcterms:title "Drepanocladus sec. Moenkemeyer 1927" ;
            tcs:accordingTo <https://tropicos.org/reference/9021085> ;
            tcs:acceptedName <https://tropicos.org/name/35000413> ] ;
    tcs:intersects [ a tcs:TaxonConcept ;
            dcterms:title "Campylium sec. Frahm & Frey 1992" ;
            tcs:accordingTo <https://tropicos.org/reference/9011497> ;
            tcs:acceptedName <https://tropicos.org/name/35000208> ] ;
    rdfs:comment """Weltweit über 30 Arten, in Europa 9 Arten (incl. *C. 
            laxifolium Engelmark & Hedenäs 1990) – Hedenäs (1997a) akzeptiert 
            auch die Gattungen Campyliadelphus und Campylophyllum und stellt 
            Campylium polygamum zur Gattung Drepanocladus.""" .

One of these relationships, with a comment, looks like this:

@prefix tcs: <http://rs.tdwg.org/tcs/terms/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

[] a rdf:Statement ;
    rdf:subject _:tc-2000-Campylium ;
    rdf:predicate tcs:includes ;
    rdf:object _:tc-1980-Campylium ;
    dcterms:source <https://tropicos.org/reference/9022656> ;
    rdfs:comment """excl. *C. longicuspis (Lindb. & Arnell) Hedenäs (syn. 
            *Amblystegium l. Lindb. & Arnell, *Drepanocladus l. (Lindb. & 
            Arnell) Broth., Pseudocalliergon l. (Lindb. & Arnell) Loeske), da 
            Hedenäs diese Kombination erst 1989 schuf. Es bleibt unklar zu 
            welcher Gattung SMITH dieses auf den Britischen Inseln nicht 
            nachgewiesene Taxon stellt""" .

I came across the RDF Reification Vocabulary serendipitously when I happened to read the RDF Schema specification, but it turns out @baskaufs had already suggested using it two years ago, in a presentation at TDWG 2021, to deal with dwc:ResourceRelationship statements in RDF (see here and here; I think the 'having your cake and eating it...' in the title is really apt).

Regarding your statement that you 'haven't seen reification used as a core technology', I think that is not true, as you are using reification whether you use the Vocabulary or not, so we were already using reification for the relationships between taxa. Especially when serialising as tabular data – and storing data in a relational database – reification is used all the time to resolve many-to-many relationships. Also, serialisation should not be a concern of a Vocabulary Standard, which is one of the reasons why we are having so much trouble with the dwc:ResourceRelationship (and by extension the tcs:TaxonRelationship) and it is also why we want to free TCS from its XML Schema. Your data might force you to use reification, but TCS should not.

I am not opposed to still having a TaxonRelationshipAssertion or TaxonConceptAlignment class in TCS, quite the opposite, as it has a special purpose for us. I think it is akin to creating a special Extension for the Darwin Core Archive, rather than using the ResourceRelationship extension (which is why I am in two minds about this, as that comes into Application Profile territory). This is something I want to discuss in the TG (in the first instance). It is the subjectTaxonConcept, relationshipType and objectTaxonConcept properties I really want to get rid of, as they are boilerplate (and syntactic sugar), which has no place in a domain Standard like TCS, I think.

@nielsklazenga
Member

nielsklazenga commented Aug 7, 2023

TCS 1 has TaxonRelationshipAssertion as a top level element (same level as TaxonConcept and TaxonName), so removing it would be a change to the Standard. Which makes the decision easy, I think.

With that, the proposal is simply as follows:

  1. Instead of having a separate controlled (SKOS) vocabulary for relationship types, define the relationship types as properties in TCS.
  2. Instead of defining the syntax properties subjectTaxonConcept, relationshipType and objectTaxonConcept in TCS, borrow the subject, predicate and object properties from the RDF namespace.

The first dot point is the original GitHub issue (#225), although I think that proposed to have both the SKOS vocabulary and the TCS properties. This brings TCS 2 closer to TCS 1, as in TCS 1 the taxon relationships could also be embedded in the Taxon Concept.

Additionally, rather than defining a TCS property relationshipAccordingTo, I would like to borrow dcterms:source from Dublin Core (@baskaufs uses prov:wasDerivedFrom in his Resource Relationship examples). I would also like to borrow dcterms:creator and dcterms:created for when there is no source as such.

So, the example before becomes:

@prefix tcs: <http://rs.tdwg.org/tcs/terms/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

_:tc-2000-Campylium a tcs:TaxonConcept ;
    dcterms:title "Campylium sec. Koperski et al. 2000" ;
    tcs:accordingTo <https://tropicos.org/reference/9022656> ;
    tcs:acceptedName <https://tropicos.org/name/35000208> ;
    tcs:synonym <https://tropicos.org/name/35101378> ,
            <https://tropicos.org/name/35101264> ;
    rdfs:comment """Weltweit über 30 Arten, in Europa 9 Arten (incl. *C. 
            laxifolium Engelmark & Hedenäs 1990) – Hedenäs (1997a) akzeptiert 
            auch die Gattungen Campyliadelphus und Campylophyllum und stellt 
            Campylium polygamum zur Gattung Drepanocladus.""" .

[] a tcs:TaxonRelationshipAssertion ;
    rdf:subject _:tc-2000-Campylium ;
    rdf:predicate tcs:includes ;
    rdf:object [ a tcs:TaxonConcept ;
            dcterms:title "Campylium sec. Smith 1980" ;
            tcs:accordingTo <https://tropicos.org/reference/9007023> ;
            tcs:acceptedName <https://tropicos.org/name/35000208> ] ;
    dcterms:source <https://tropicos.org/reference/9022656> ;
    rdfs:comment """excl. *C. longicuspis (Lindb. & Arnell) Hedenäs (syn. 
            *Amblystegium l. Lindb. & Arnell, *Drepanocladus l. (Lindb. & 
            Arnell) Broth., Pseudocalliergon l. (Lindb. & Arnell) Loeske), da 
            Hedenäs diese Kombination erst 1989 schuf. Es bleibt unklar zu 
            welcher Gattung SMITH dieses auf den Britischen Inseln nicht 
            nachgewiesene Taxon stellt""" .

...and assertions without a source look like:

@prefix tcs: <http://rs.tdwg.org/tcs/terms/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

[] a tcs:TaxonRelationshipAssertion ;
    rdf:subject <https://avibase.ca/82745BAA> ;
    rdf:predicate tcs:includes ;
    rdf:object <https://avibase.ca/F558C7F9> ;
    dcterms:creator <https://orcid.org/0000-0003-2224-6821> ;
    dcterms:created "2023-07-31" .

[] a tcs:TaxonRelationshipAssertion ;
    rdf:subject <https://avibase.ca/82745BAA> ;
    rdf:predicate tcs:includes ;
    rdf:object <https://avibase.ca/A091D50A> ;
    dcterms:creator <https://orcid.org/0000-0003-2224-6821> ;
    dcterms:created "2023-07-31" .

[] a tcs:TaxonRelationshipAssertion ;
    rdf:subject <https://avibase.ca/F558C7F9> ;
    rdf:predicate tcs:isDisjointWith ;
    rdf:object <https://avibase.ca/A091D50A> ;
    dcterms:creator <https://orcid.org/0000-0003-2224-6821> ;
    dcterms:created "2023-07-31" .

@baskaufs

baskaufs commented Aug 7, 2023

Just to follow up on what @nielsklazenga said, Darwin Core basically uses the reification model in the dwc:ResourceRelationship class. When we wrote the DwC RDF Guide, we deferred on how to represent those terms in RDF and as I suggested in that talk Niels mentioned, using the RDF subject, predicate, and object reification properties would be a way to express resource relationships without minting new DwC dwciri: properties that mean the same thing.

If Darwin Core were to adopt what I proposed, then following the approach that Niels suggests in TCS2 would result in consistent modeling across TDWG, which makes more sense to me than to develop yet another system that does the same thing as the ResourceRelationship terms and the RDF reification vocabulary.

I get what @camwebb said about reification not often being used as a core technology. But for better or worse, TCS2 is basically being defined using RDF, so using terms from the core RDF vocabulary to define the relationships makes sense to me and should be understandable by people outside of our community. That would not stop anyone in their own graph database from collapsing down the relationships as direct edges with properties, since Niels is proposing minting properties for the relationships (tcs:includes, tcs:isDisjointWith, etc.).
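
As a rough illustration of that last point, the sketch below reuses the Avibase identifiers from Niels's example above. The reified statement is what would be exchanged; the final triple is the direct edge a consumer could choose to materialise in their own store. Note that in RDF a reified statement does not by itself assert the triple it describes, so the direct edge is a separate, optional addition.

```turtle
@prefix tcs: <http://rs.tdwg.org/tcs/terms/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

# Reified form: records who made the assertion and when.
[] a rdf:Statement ;
    rdf:subject <https://avibase.ca/82745BAA> ;
    rdf:predicate tcs:includes ;
    rdf:object <https://avibase.ca/F558C7F9> ;
    dcterms:creator <https://orcid.org/0000-0003-2224-6821> ;
    dcterms:created "2023-07-31" .

# Direct edge: possible only because tcs:includes is minted as a property.
# A consumer who accepts the assertion could add this triple locally.
<https://avibase.ca/82745BAA> tcs:includes <https://avibase.ca/F558C7F9> .
```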

@ghwhitbread
Collaborator

ghwhitbread commented Aug 8, 2023

@nielsklazenga , let's try and avoid reification wherever possible (blank nodes too).

My preference is to continue with an extensible, generic Relationship class, sub-typed as necessary, that can be bundled or used stand-alone without reification.

A bodgied (ChatGPT) example for illustration, no attempt to be TCS or RDF compliant.

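@prefix tcs: <http://rs.tdwg.org/tcs/terms/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

#### Define classes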

tcs:TaxonConcept rdf:type rdfs:Class ;
    rdfs:label "Taxon Concept" .

#### Generic Relationship Class

tcs:Relationship rdf:type rdfs:Class ;
    rdfs:label "Generic Relationship" .

tcs:CongruentRelationship rdf:type rdfs:Class ;
    rdfs:subClassOf tcs:Relationship .

tcs:IncludedInRelationship rdf:type rdfs:Class ;
    rdfs:subClassOf tcs:Relationship .

tcs:RelationshipMetadata rdf:type rdfs:Class ;
    rdfs:label "Relationship Metadata" .

#### sub-typing hasRelationship if desired

tcs:hasCongruentRelationship rdf:type rdf:Property ;
    rdfs:subPropertyOf tcs:hasRelationship ;
    rdfs:domain tcs:TaxonConcept ;
    rdfs:range tcs:CongruentRelationship .


# Taxon Concepts and Relationships

<https://example.org/taxon-concept-1> a tcs:TaxonConcept ;
    dcterms:title "Campylium sec. Koperski et al. 2000" ;
    tcs:accordingTo <https://tropicos.org/reference/9022656> ;
    tcs:acceptedName <https://tropicos.org/name/35000208> ;
    tcs:hasCongruentRelationship [
        a tcs:CongruentRelationship ;
        tcs:object <https://example.org/related-taxon-concept-1> ;
        tcs:relationshipMetadata [
            a tcs:RelationshipMetadata ;
            rdfs:label "Congruence Relationship Metadata 1" ;
            dcterms:description "Metadata for the congruence relationship." ;
            # ... other metadata properties ...
        ]
    ] ;
    tcs:hasRelationship [
        a tcs:IncludedInRelationship ;
        tcs:object <https://example.org/related-taxon-concept-2> ;
        tcs:relationshipMetadata [
            a tcs:RelationshipMetadata ;
            rdfs:label "Included In Relationship Metadata 1" ;
            dcterms:description "Metadata for the included-in relationship." ;
            # ... other metadata properties ...
        ]
    ] ;
    # ... other properties ...
    .

<https://example.org/related-taxon-concept-1> a tcs:TaxonConcept ;
    dcterms:title "Campylium sec. Corley et al. 1981" ;
    tcs:accordingTo <https://tropicos.org/reference/9004554> ;
    tcs:acceptedName <https://tropicos.org/name/35000208> ;
    # ... other properties ...
    .

<https://example.org/related-taxon-concept-2> a tcs:TaxonConcept ;
    dcterms:title "Campylium sec. Smith 1980" ;
    tcs:accordingTo <https://tropicos.org/reference/9007023> ;
    tcs:acceptedName <https://tropicos.org/name/35000208> ;
    # ... other properties ...
    .

<https://example.org/taxon-concept-3> a tcs:TaxonConcept ;
    dcterms:title "Chrysohypnum sec. Moenkemeyer 1927" ;
    tcs:accordingTo <https://tropicos.org/reference/9021085> ;
    tcs:acceptedName <https://tropicos.org/name/35000256> ;
    # ... other properties ...
    .

<https://example.org/taxon-concept-4> a tcs:TaxonConcept ;
    dcterms:title "Drepanocladus sec. Moenkemeyer 1927" ;
    tcs:accordingTo <https://tropicos.org/reference/9021085> ;
    tcs:acceptedName <https://tropicos.org/name/35000413> ;
    # ... other properties ...
    .

#### Relationship Statements

<https://example.org/relationship-statement-1> a tcs:Relationship ;
    tcs:subject <https://example.org/taxon-concept-1> ;
    tcs:predicate tcs:includes ;
    tcs:object <https://example.org/related-taxon-concept-2> ;
    dcterms:source <https://tropicos.org/reference/9022656> ;
    rdfs:comment """excl. *C. longicuspis (Lindb. & Arnell) Hedenäs (syn. 
            *Amblystegium l. Lindb. & Arnell, *Drepanocladus l. (Lindb. & 
            Arnell) Broth., Pseudocalliergon l. (Lindb. & Arnell) Loeske), da 
            Hedenäs diese Kombination erst 1989 schuf. Es bleibt unklar zu 
            welcher Gattung SMITH dieses auf den Britischen Inseln nicht 
            nachgewiesene Taxon stellt""" .

@nielsklazenga
Member

nielsklazenga commented Aug 9, 2023

Note that my proposal (or @camwebb's proposal as adulterated by me) does not introduce reification to TCS or add more reification to TCS. Whenever there is a relationship object, there is reification, no matter what namespace the properties are in. In fact this proposal reduces the amount of reification (and my original proposal removed it altogether) by not hard-coding it into TCS, but leaving it to the data and/or the implementation whether reification is needed.

The proposal also does not add any new terms to TCS. In fact, it reduces the number of terms defined in TCS by borrowing syntax terms from other specifications that are specifically designed for that.

A "generic Relationship class" is already in Darwin Core in the form of the dwc:ResourceRelationship, so would not be part of TCS. This was my original proposal. I do not think there is such a thing as a relationship object without reification.

So, the only question (regarding relationship classes) in the purview of the TCS 2 Task Group is whether or not we keep the TaxonRelationshipAssertion class (ignoring the issue of the name of the class for now). As TaxonRelationshipAssertion is a top-level element in TCS 1 and since we have translated those top-level elements to classes in TCS 2, there needs to be unanimous consent to remove it, so I think we have no choice but to keep it, although I personally think this sort of stuff does not belong in a Standard but in an Application Profile.

We can discuss whether we define the properties of the TaxonRelationshipAssertion class in TCS (like we're doing now) or borrow them from RDF and Dublin Core, but this is also not entirely up to us, as the final decision is not ours and I think people generally will prefer a TDWG wide solution rather than have different standards do different things. I think we should do whatever gives us the highest chance of ratification, so all our work has not been for nothing.

@ghwhitbread
Collaborator

There is a big difference between using reification as a principle in concept modelling and the very specific use case in RDF for making statements about statements to capture edge metadata. A model that supports reified content does not need reification in RDF. This might sound trivial, but it has implications for what can be delivered using the model and the ease with which data can be queried and processed. Not least is @camwebb 's comment about best practices and then the model’s standing.

The advantage of the TaxonRelationshipAssertion model is that it can be extended with controlled vocabulary amendments rather than ontology change. I don’t imagine there will be support for 25 additional data properties for the “taxon” relationships we publish already.

@nielsklazenga
Member

nielsklazenga commented Aug 9, 2023

I do not think it gets any more general than RDF and your example looks just as RDF-y as mine but okay, so if we have the TaxonRelationshipAssertion class you do not want to use the RDF properties on it. My argument is that the TCS TaxonRelationshipAssertion is the same thing as the Darwin Core ResourceRelationship class, so I think it should be dealt with the same way. I think that makes it not our concern but that of the Darwin Core Maintenance Group or the TAG and I think especially the latter group is much better placed to discuss this.

Also, the authors of the Darwin Core RDF Guide, the Darwin Core and RDF/OWL Task Groups, when they could not immediately find a good way to deal with the ResourceRelationship class in RDF (they actually say that 'it is not clear whether dwc:ResourceRelationship instances make sense in the context of RDF' 1), refrained from minting dwciri equivalents for its properties. That is exactly what we would be doing by leaving the TaxonRelationshipAssertion class out of TCS for now, kicking the can down the road (or avoiding decisions that are very hard to roll back later). We are already doing the same thing with Character Circumscription and Specimen Circumscription. It's also what we have to do anyway if we cannot find agreement.

Anyway, the reification people are worried about is only in the first example, and I made very clear that it was just an example to show that we do not need a "Relationship" class (I also said that 'I think I might be committing RDF sacrilege...'). So, I am not sure how a dislike for RDF reification can be used to oppose my proposal. It actually sort of makes my point (or misses the point), especially as now the argument seems to be that reification is okay as long as you do not do it in RDF, so you can just use the dwc:ResourceRelationship and all I was doing was trying to describe that in RDF. If I had known how to do it correctly, I could also have used an anno:Annotation or np:NanoPublication. The point is that all we are talking about here is syntax—or data structures—and I do not think we should define syntax. That's what I meant by:

not hard-coding it into TCS, but leaving it to the data and/or the implementation whether reification is needed.

The inline examples in the Term List, my tcs:RelationshipAssertion and @ghwhitbread 's tcs:Relationship example do exactly the same reification as @baskaufs 's and my rdf:Statement examples—they just obfuscate it—and require the reification to be hard-coded into TCS.

TCS has neither a model nor an ontology. The TCS Term List is a vocabulary just like a controlled vocabulary and the process for adding a property to the Term List or a controlled term to a controlled vocabulary is exactly the same. Topological relationships are properties in things like GeoSPARQL and SKOS, so it makes sense to do the same in TCS. Idiosyncrasy is not exactly a desirable feature in a standard.
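
To illustrate, here is a hedged sketch of the same relationship type minted either as a property or as a controlled term, followed by SKOS's own use of a property for a relationship between two resources. The tcsrel: and ex: namespaces are hypothetical, and the labels, domain and range are assumptions for illustration rather than agreed TCS definitions.

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix tcs:  <http://rs.tdwg.org/tcs/terms/> .
# Hypothetical namespaces, for illustration only.
@prefix tcsrel: <http://example.org/taxon-relationship-types/> .
@prefix ex:     <http://example.org/data/> .

# Option 1: the relationship type as a property in the Term List.
tcs:includes a rdf:Property ;
    rdfs:label "includes"@en ;
    rdfs:domain tcs:TaxonConcept ;   # assumed domain
    rdfs:range tcs:TaxonConcept .    # assumed range

# Option 2: the relationship type as a term in a controlled vocabulary.
tcsrel:includes a skos:Concept ;
    skos:prefLabel "includes"@en ;
    skos:inScheme tcsrel: .

# SKOS itself relates resources with a property, not a relationship object.
ex:conceptA skos:broader ex:conceptB .
```

Either option amounts to adding one term with a label and a definition; the difference is whether the term can be used directly as a predicate.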

@nielsklazenga
Member

BTW, @camwebb has gone bush and we will not hear from him before August 21st.

@baskaufs

I will just verify what @nielsklazenga said about properties and controlled values. If they are part of a standard, they (as well as classes) are all vocabulary terms and subject to exactly the same change process.

@ghwhitbread I do not understand your point. You are OK with reification as a principle in concept modelling, but not with using the RDF reification terms to describe it. What is the issue? The reification terms are very lightweight and carry no significant semantics (except that they describe relationships). I don't see the point of minting our own terms that mean the same thing as ones that are already well known, which seems to be what you are doing in the Relationship Statements part of your example.

One of the roles of the TAG is to promote interoperability and to prevent TDWG standards from duplicating each other's work. What TCS is trying to do in describing taxon relationships is entirely analogous to what Darwin Core already does with ResourceRelationship terms. As Niels said, the DwC RDF Guide avoided creating dwciri: analogs of ResourceRelationship terms. That was largely because the ResourceRelationship terms seemed like a specific case of a more general pattern that should not have a purpose-built solution.

As TAG convener, what I would like to see us settle on is a TDWG-wide consensus on how we can describe these kinds of relationships in RDF, particularly as we move toward JSON-LD as a machine-readable way to expose data whose semantics are clear. It seems to me that the RDF reification terms serve as a simple and well-known way to do that, so that's why I'd like to understand better what your objection to them is.

@nielsklazenga
Member

nielsklazenga commented Aug 11, 2023

Getting back to something that @ghwhitbread said earlier:

...let's try and avoid reification wherever possible (blank nodes too).

I think the bit in parentheses is actually the most important. The most important reason to have a class for these relationships in TCS is that we need IDs on these things. In the discussion in issue #43 the name we ended up with—before I started to muck around with it in the examples—was 'TaxonConceptAlignment'. I think a TaxonConceptAlignment (or ...Mapping) class would be very useful to have. In fact, it is kind of essential, as the definition of TaxonConcept relies on it.

I still do not want to mint any of its properties in TCS, as it is still a Resource Relationship, even though it is a specific type of Resource Relationship that is very important to us.

The reason, by the way, why I went back to TaxonRelationship is that I wanted to use the relationships more widely than only for concept mappings. Especially the intersects relationship I have been using to deal with references that are cited in a Treatment (e.g., https://github.com/tdwg/tcs2/blob/examples-stash/docs/examples/col-3YVTF/col-3YVTF.ttl). This is not necessary anymore if we turn the relationships from controlled values into properties, so we can use them outside a relationship object. In general, I think relationship types should be properties: all our other relationships are, as are most relationships in other systems.

Further on the blank nodes, I think we need to reconsider how we deal with this in the examples. In an earlier meeting, we decided that we should not make up IRIs if they do not already exist. However, after looking at online taxonomic systems to find examples (for the Term List and the Implementation Experience Report), I now think that the next challenge to overcome is the lack of citable identifiers for all our objects and that it is often not clear what object an identifier applies to. I think (now) it is important to set a good example and have IRIs on all the objects we define in TCS at least. We will talk more about this in our next meeting (I am planning one for the end of August).

@nielsklazenga
Member

nielsklazenga commented Aug 12, 2023

#I❤reification:

@prefix tcs: <http://rs.tdwg.org/tcs/terms/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

<https://eg.tdwg.org/tcs/tc/2000-Amblystegium> a tcs:TaxonConcept ;
    dcterms:title "Amblystegium sec. Koperski et al. 2000" ;
    tcs:accordingTo <https://tropicos.org/reference/9022656> ;
    tcs:acceptedName <https://tropicos.org/name/35000040> ;
    tcs:synonym <https://tropicos.org/name/35000612> ;
    tcs:includes <https://eg.tdwg.org/tcs/tc/1981-Amblystegium> ,
            <https://eg.tdwg.org/tcs/tc/1992-Amblystegium> ,
            <https://eg.tdwg.org/tcs/tc/1992-Hygroamblystegium> ,
            <https://eg.tdwg.org/tcs/tc/1927-Hygroamblystegium> ;
    tcs:partiallyOverlaps <https://eg.tdwg.org/tcs/tc/1992-Amblystegiella> ,
            <https://eg.tdwg.org/tcs/tc/1992-Campylium> ,
            <https://eg.tdwg.org/tcs/tc/1927-Amblystegium> ,
            <https://eg.tdwg.org/tcs/tc/1927-Chrysohypnum> ,
            <https://eg.tdwg.org/tcs/tc/1980-Amblystegium> ,
            <https://eg.tdwg.org/tcs/tc/1980-Platydictya> ;
    rdfs:comment """Weltweit etwa 35 Arten, in Europa 8 Arten. Abweichend von 
            Corley & al. wird in Übereinstimmung mit Anderson & al. (1990), 
            Ignatov & Afonina (1992), Blockeel & Long (1998) sowie Crosby & al. 
            (1999) Leptodictyum als Gattung akzeptiert (s. auch Hedenäs 
            1990a)""" .

<https://eg.tdwg.org/tcs/tca/6> a tcs:TaxonConceptAlignment ;
    rdf:subject <https://eg.tdwg.org/tcs/tc/2000-Amblystegium> ;
    rdf:predicate tcs:includes ;
    rdf:object <https://eg.tdwg.org/tcs/tc/1981-Amblystegium> ;
    dcterms:source <https://tropicos.org/reference/9022656> ;
    rdfs:comment "incl. Leptodictyum riparium (Amblystegium r.)" .

<https://eg.tdwg.org/tcs/tca/7> a tcs:TaxonConceptAlignment ;
    rdf:subject <https://eg.tdwg.org/tcs/tc/2000-Amblystegium> ;
    rdf:predicate tcs:includes ;
    rdf:object <https://eg.tdwg.org/tcs/tc/1992-Amblystegium> ;
    dcterms:source <https://tropicos.org/reference/9022656> ;
    rdfs:comment """excl. Amblystegium confervoides (Amblystegiella c.), A. 
            subtile (Amblystegiella s.), A. fluviatile (Hygroamblystegium f.), 
            A. tenax (Hygroamblystegium t.), A. radicale (Campylium r.)""" .

<https://eg.tdwg.org/tcs/tca/8> a tcs:TaxonConceptAlignment ;
    rdf:subject <https://eg.tdwg.org/tcs/tc/2000-Amblystegium> ;
    rdf:predicate tcs:includes ;
    rdf:object <https://eg.tdwg.org/tcs/tc/1992-Hygroamblystegium> ;
    dcterms:source <https://tropicos.org/reference/9022656> .

<https://eg.tdwg.org/tcs/tca/9> a tcs:TaxonConceptAlignment ;
    rdf:subject <https://eg.tdwg.org/tcs/tc/2000-Amblystegium> ;
    rdf:predicate tcs:includes ;
    rdf:object <https://eg.tdwg.org/tcs/tc/1927-Hygroamblystegium> ;
    dcterms:source <https://tropicos.org/reference/9022656> .

<https://eg.tdwg.org/tcs/tca/10> a tcs:TaxonConceptAlignment ;
    rdf:subject <https://eg.tdwg.org/tcs/tc/2000-Amblystegium> ;
    rdf:predicate tcs:partiallyOverlaps ;
    rdf:object <https://eg.tdwg.org/tcs/tc/1992-Amblystegiella> ;
    dcterms:source <https://tropicos.org/reference/9022656> ;
    rdfs:comment """incl. Amblystegium confervoides (Amblystegiella c.), 
            Amblystegium subtile (Amblystegiella s.)""" .

<https://eg.tdwg.org/tcs/tca/11> a tcs:TaxonConceptAlignment ;
    rdf:subject <https://eg.tdwg.org/tcs/tc/2000-Amblystegium> ;
    rdf:predicate tcs:partiallyOverlaps ;
    rdf:object <https://eg.tdwg.org/tcs/tc/1992-Campylium> ;
    dcterms:source <https://tropicos.org/reference/9022656> .

<https://eg.tdwg.org/tcs/tca/12> a tcs:TaxonConceptAlignment ;
    rdf:subject <https://eg.tdwg.org/tcs/tc/2000-Amblystegium> ;
    rdf:predicate tcs:partiallyOverlaps ;
    rdf:object <https://eg.tdwg.org/tcs/tc/1927-Amblystegium> ;
    dcterms:source <https://tropicos.org/reference/9022656> ;
    rdfs:comment """incl. Conardia compacta (Amblystegium c.), *Isopterygiopsis 
            alpicola (Lind. & Arnell) Hedenäs (Amblystegium a.), Leptodictyum 
            riparium (Amblystegium r.), Platydictya jungermannioides 
            (Amblystegium sprucei), Pseudoleskeella catenulata p.p. 
            (Amblystegium ursorum) sowie die aussereuropäische Art *Homomallium 
            adnatum (Hedw.) Broth. (Amblystegium a.); excl. Amblystegium 
            fluviatile (Hygroamblystegium f.), A. tenax (Hygroamblystegium t.), 
            A. radicale (Chrysohypnum hygrophilum).""" .

<https://eg.tdwg.org/tcs/tca/13> a tcs:TaxonConceptAlignment ;
    rdf:subject <https://eg.tdwg.org/tcs/tc/2000-Amblystegium> ;
    rdf:predicate tcs:partiallyOverlaps ;
    rdf:object <https://eg.tdwg.org/tcs/tc/1927-Chrysohypnum> ;
    dcterms:source <https://tropicos.org/reference/9022656> ;
    rdfs:comment "incl. Amblystegium radicale (Chrysohypnum hygrophilum)" .

<https://eg.tdwg.org/tcs/tca/14> a tcs:TaxonConceptAlignment ;
    rdf:subject <https://eg.tdwg.org/tcs/tc/2000-Amblystegium> ;
    rdf:predicate tcs:partiallyOverlaps ;
    rdf:object <https://eg.tdwg.org/tcs/tc/1980-Amblystegium> ;
    dcterms:source <https://tropicos.org/reference/9022656> ;
    rdfs:comment """incl. Conardia compacta (Amblystegium c.), Leptodictyum 
            riparium (Amblystegium r.); excl. Amblystegium confervoides 
            (Platydictya c.)""" .

<https://eg.tdwg.org/tcs/tca/15> a tcs:TaxonConceptAlignment ;
    rdf:subject <https://eg.tdwg.org/tcs/tc/2000-Amblystegium> ;
    rdf:predicate tcs:partiallyOverlaps ;
    rdf:object <https://eg.tdwg.org/tcs/tc/1980-Platydictya> ;
    dcterms:source <https://tropicos.org/reference/9022656> ;
    rdfs:comment "incl. Amblystegium confervoides (Platydictya c.)" .

@ghwhitbread
Collaborator

@nielsklazenga. These are not examples of reification. Using the rdf:Statement vocabulary to build relationships, yes, but not reification.

@ghwhitbread
Collaborator

@baskaufs,
I was just adding support for the case against turning the different types of inter-taxonConcept relationships into properties with a range of taxonConcept. In the systems I work with there are many more such relationships than the five that come from RCC-5, and I imagine that it will be far easier to add values to a controlled vocabulary than it will be to gather support for an additional 25 properties, even if the process is the same. When it comes to RDF, sub-typing a hasRelationshipAssertion property on “predicate” would be my preferred option.

My example was written by ChatGPT on request, to illustrate an alternative to @nielsklazenga's example using a taxonConceptRelationship entity. It, not I, decided against using the rdf:Statement properties, replacing them with tcs: equivalents. When I asked it why, it provided a dozen reasons why this was a better idea.

Reification in RDF is a technique for adding metadata to a statement. It is not about building alignments between resources so that we can unambiguously describe these relationships in detail using any number of additional properties. The whole idea behind the Semantic approach is to make it possible to say what we mean. I’m a great believer in patterns. Here, we have a taxonConceptRelationship pattern that can be used to simplify the documentation of biodiversity data. I don’t think I object to it re-using tdwg:ResourceRelationship, but the RDF specification does provide us with the means for implementing associations in more meaningful ways than things and stuff while still preserving those underlying semantics. I thought that was what you were advocating.

@nielsklazenga
Member

nielsklazenga commented Aug 16, 2023

Okay, cool. If you can live with the examples, I think we are all happy (but we will have a meeting next week to confirm). I had never looked up the meaning of 'reification'; I just knew that the rdf:Statement vocabulary is called the 'Reification vocabulary'. In knowledge representation, 'reification' simply means 'turning a statement into an object', which is exactly what I was doing, but there are a lot of other meanings.

Also, your own description:

Reification in RDF is a technique for adding metadata to a statement.

...that is exactly what I have been doing. The only metadata I want to add is an ID and a comment (and a source, so it can be used stand-alone).

My point was (and still is) that TCS should be unopinionated about syntax and enable people to describe relationships with or without a Relationship object (which I think you might have forgotten to tell ChatGPT). I have come across many examples where a Relationship object is not necessary and just overcomplicates things. There are also examples, like the one above, where a Relationship object is necessary but you still want to see the relationships on the Taxon Concept (as they appear in the publication).

I do not think subtyping is something we are allowed to do in TDWG standards. Also, I would like to avoid Relationship objects as much as possible, not create a plethora of different types of them. Relationships are between taxa, or between names, or between taxa and names. Relationship objects do not come into their definition; they are just the vehicles that are sometimes needed to transfer the data.

What makes it difficult to add new terms to a TDWG standard (and also sort of the point of having standards) is that terms need to have definitions and there needs to be community consensus on those definitions and on having the terms in the first place. This is not going to be any easier for controlled terms than for properties. Also note that your example has a property, a type, and a controlled term for each relationship. That can only make it harder.

@EndCoop

EndCoop commented Aug 17, 2023

I know that TCS2 isn't RDF but all the examples in this issue are very RDF-like so...

  • there are examples above with rdfs:comment as a property of blank nodes. Does this make sense?
  • there are examples above that have rdfs:comment as properties of tcs:TaxonConceptAlignment (what happened to tcs:TaxonRelationship?) that don't appear to be comments on these resources. For example:

<https://eg.tdwg.org/tcs/tca/15> a tcs:TaxonConceptAlignment ;
    rdf:subject <https://eg.tdwg.org/tcs/tc/2000-Amblystegium> ;
    rdf:predicate tcs:partiallyOverlaps ;
    rdf:object <https://eg.tdwg.org/tcs/tc/1980-Platydictya> ;
    dcterms:source <https://tropicos.org/reference/9022656> ;
    rdfs:comment "incl. Amblystegium confervoides (Platydictya c.)" .

Which resource includes Amblystegium confervoides (Platydictya c.)? Shouldn't this comment be on the subject or the object resource and not on the tcs:TaxonConceptAlignment?

  • the rdf:subject, rdf:predicate and rdf:object properties have rdfs:domain of rdf:Statement, doesn't this mean that the example I quoted above is declaring https://eg.tdwg.org/tcs/tca/15 as an instance of rdf:Statement? It is declared as an instance of tcs:TaxonConceptAlignment. Is this always the case? Or does it just have to be the case whenever the reification properties are used?

I might be getting this wrong but it seems to me that either:

  • the taxon concept alignments are just edges in the graph and therefore TCS2 should provide them as properties of tcs:TaxonConcept, and the reification vocabulary is sufficient for attribution of the statement (and there isn't much else to say about them), or
  • it is useful to be able to describe taxon concept alignments (via a range of properties) and they need to be created as objects and therefore the reification vocabulary isn't applicable

Another interesting thing about the examples given in this discussion, is that tcs:synonym statements do not have an attribution (and the reification vocabulary would presumably be used to provide it). Presumably the intention is that the tcs:accordingTo of the tcs:TaxonConcept is the source of the synonym but there is no rdf statement to this effect.

Probably I'm making a false dichotomy, but is TCS2 concerned with describing taxon concept alignments or merely asserting them? I feel like it is more useful to be able to describe the relationships and not just assert them.

@nielsklazenga
Member

nielsklazenga commented Aug 17, 2023

Hi @EndCoop,

I know that TCS2 isn't RDF but all the examples in this issue are very RDF-like so...

The examples are in TuRTLe, which is a serialisation of RDF, which is there to describe Resources. We aim to be RDF compliant though, as perhaps the most important reason for us to do this work is that the Darwin Core Taxon does not work in the context of RDF. We will provide other serialisations, but it is easier to convert everything at once than to manage multiple versions of examples while we are making them. I think TuRTLe is the best format to use for inline examples, as it is very readable and you can make comments in it, which you cannot do in JSON for example (but we will have everything in JSON/JSON-LD as well).

  • there are examples above with rdfs:comment as a property of blank nodes. Does this make sense?

Not in my examples, I hope. It is not so much that it does not make sense, but we try to avoid it. We will be discussing this at the meeting next week, as so far the aim has been to avoid creating dummy IRIs, rather than blank nodes, and I think we should reconsider that. The thing is that blank nodes can be nested in another object in TuRTLe, while IRI nodes cannot. I think this is the reason why there was a preference for blank nodes in examples.

  • there are examples above that have rdfs:comment as properties of tcs:TaxonConceptAlignment (what happened to tcs:TaxonRelationship?) that don't appear to be comments on these resources. For example:

<https://eg.tdwg.org/tcs/tca/15> a tcs:TaxonConceptAlignment ;
    rdf:subject <https://eg.tdwg.org/tcs/tc/2000-Amblystegium> ;
    rdf:predicate tcs:partiallyOverlaps ;
    rdf:object <https://eg.tdwg.org/tcs/tc/1980-Platydictya> ;
    dcterms:source <https://tropicos.org/reference/9022656> ;
    rdfs:comment "incl. Amblystegium confervoides (Platydictya c.)" .

Which resource includes Amblystegium confervoides (Platydictya c.)? Shouldn't this comment be on the subject or the object resource and not on the tcs:TaxonConceptAlignment?

  1. As to what happened to tcs:TaxonRelationship, TaxonConceptAlignment is the name we ended up with in our discussions in class: TaxonConceptMapping #43 . As I already said in the discussion above, I do not want a generic TaxonRelationship object, as for that we should use dwc:ResourceRelationship. I am happy to have a TaxonConceptAlignment object though, as that is a very specific thing to TCS and has an extra assertion. Also, if we delete the class altogether, we need to change the definition of TaxonConcept, which was a hard-fought result, so I do not want to go back to that. So, it is just a matter of choosing the right name.
  2. The rdfs:comment is on the relationship. The statement is Amblystegium sec. Koperski et al. >< Platydictya sec. Smith 1980 and the comment indicates that Smith 1980 includes Amblystegium confervoides in Platydictya, which is why the concepts overlap. I copied it straight from the publication. It might be a bit hard to see without the context (it is actually not super obvious when I have the book right in front of me either). The full example also includes:
<https://eg.tdwg.org/tcs/tc/2000-Platydictya> a tcs:TaxonConcept ;
    dcterms:title "Platydictya sec. Koperski et al. 2000" ;
    tcs:accordingTo <https://tropicos.org/reference/9022656> ;
    tcs:acceptedName <https://tropicos.org/name/35000970> ;
    tcs:isCongruentWith <https://eg.tdwg.org/tcs/tc/1981-Platydictya> ;
    tcs:isIncludedIn <https://eg.tdwg.org/tcs/tc/1992-Amblystegiella> ,
            <https://eg.tdwg.org/tcs/tc/1927-Amblystegium> ,
            <https://eg.tdwg.org/tcs/tc/1927-Amblystegium-sect-Serpoleskea> ,
            <https://eg.tdwg.org/tcs/tc/1980-Platydictya> ;
    rdfs:comment "Weltweit 7 Arten, in Europa 1 Art." .

and:

<https://eg.tdwg.org/tcs/tca/20> a tcs:TaxonConceptAlignment ;
    rdf:subject <https://eg.tdwg.org/tcs/tc/2000-Platydictya> ;
    rdf:predicate tcs:isIncludedIn ;
    rdf:object <https://eg.tdwg.org/tcs/tc/1980-Platydictya> ;
    dcterms:source <https://tropicos.org/reference/9022656> ;
    rdfs:comment "incl. Amblystegium confervoides (Platydictya c.)" .
  • the rdf:subject, rdf:predicate and rdf:object properties have rdfs:domain of rdf:Statement, doesn't this mean that the example I quoted above is declaring https://eg.tdwg.org/tcs/tca/15 as an instance of rdf:Statement? It is declared as an instance of tcs:TaxonConceptAlignment. Is this always the case? Or does it just have to be the case whenever the reification properties are used?

If the domain statements were normative (which they are not), using rdf:subject, rdf:predicate and rdf:object properties on an instance of tcs:TaxonConceptAlignment would entail that tcs:TaxonConceptAlignment is an rdfs:subClassOf rdf:Statement and <https://eg.tdwg.org/tcs/tca/15> would be an instance of (have rdf:type) both rdf:Statement and tcs:TaxonConceptAlignment. I can live with that.

I might be getting this wrong but it seems to me that either:

  • the taxon concept alignments are just edges in the graph and therefore TCS2 should provide them as properties of tcs:TaxonConcept, and the reification vocabulary is sufficient for attribution of the statement (and there isn't much else to say about them), or

Exactly! That is the proposal.

  • it is useful to be able to describe taxon concept alignments (via a range of properties) and they need to be created as objects and therefore the reification vocabulary isn't applicable

It is not so much an either–or case, but more that there are situations in which you want to do the one thing and other situations in which you want to do the other (and situations where you want to do both) and I think this is exactly what makes the reification vocabulary so useful, as all it does is turn a statement into an object (so that you can hang extra properties off it). In many graph databases, edges do have properties, just like nodes. In RDF you do the same thing through reification.

EDIT. Actually, the scenario in the first dot point does not require reification, as you can just use rdf:value and then you can put some extra properties on (@camwebb , that is how you put properties on an edge in RDF). It is the scenario in the second dot point that you need reification for. Use of any "relationship" object (with an ID and a type) is reification, no matter how hard you try to obfuscate it.
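
One possible reading of the rdf:value remark, sketched in Turtle. This is an interpretation for illustration, not a pattern that TCS prescribes; the IRIs and the comment are reused from the Amblystegium example earlier in the thread.

```turtle
@prefix tcs:     <http://rs.tdwg.org/tcs/terms/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .

# The object of tcs:includes is a blank node; rdf:value points at the
# actual taxon concept and the other properties describe the "edge".
<https://eg.tdwg.org/tcs/tc/2000-Amblystegium> tcs:includes [
        rdf:value <https://eg.tdwg.org/tcs/tc/1981-Amblystegium> ;
        dcterms:source <https://tropicos.org/reference/9022656> ;
        rdfs:comment "incl. Leptodictyum riparium (Amblystegium r.)"
    ] .
```

Note that in this form the direct object of tcs:includes is no longer the taxon concept itself, so consumers have to follow rdf:value to reach it.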

TCS has to support different scenarios, not one particular pattern that some people prefer. So, if people object to what I did in the examples (and @baskaufs did in his), it logically follows that they should agree with my proposal to not put it in TCS. Under this proposal people can make the pattern @ghwhitbread had in his ChatGPT example as well: they just need to find another namespace to define all those extra terms. I for one like to see my data without all the ornaments.

Another interesting thing about the examples given in this discussion, is that tcs:synonym statements do not have an attribution (and the reification vocabulary would presumably be used to provide it). Presumably the intention is that the tcs:accordingTo of the tcs:TaxonConcept is the source of the synonym but there is no rdf statement to this effect.

tcs:synonym is a property of the Taxon Concept, just like tcs:acceptedName. The tcs:accordingTo is the source of the tcs:TaxonConcept for which tcs:acceptedName gives the accepted name and tcs:synonym the synonyms. There is never a reason to reify these statements, but if you do, the objects would be instances of rdf:Statement and/or dwc:ResourceRelationship (or a dwc:Taxon), not tcs:TaxonConceptAlignment, as tcs:acceptedName and tcs:synonym are relationships between a Taxon Concept and a Taxon Name, not between Taxon Concepts. What is interesting to me is that it is nomenclatural business rules that dictate whether a name is the accepted name or a synonym, yet nobody expects separate attribution for an accepted name.

That being said, having acceptedName, synonym (relationships between taxa and names) and parent (relationships between taxa) as properties of Taxon Concept and basionym, replacedSynonym, conservedAgainst and spellingCorrectionOf (relationships between names) as properties of Taxon Name is the main reason for me to say that the topological relationships between taxa should also be properties and not controlled values.

With topological relationships between taxa there is always an assertion that is not the taxon concept itself and that assertion can, although not necessarily has to, have a different source (accordingTo) than the (subject) taxon concept, which is why you will almost always want to reify those statements (so you can give them an ID). This is not the case for the other relationships I mentioned in the paragraph above.

Probably I'm making a false dichotomy, but is TCS2 concerned with describing taxon concept alignments or merely asserting them? I feel like it is more useful to be able to describe the relationships and not just assert them.

I think we want to do both and it is the ability to reify statements that lets us do that.

@baskaufs

  • the rdf:subject, rdf:predicate and rdf:object properties have rdfs:domain of rdf:Statement, doesn't this mean that the example I quoted above is declaring https://eg.tdwg.org/tcs/tca/15 as an instance of rdf:Statement? It is declared as an instance of tcs:TaxonConceptAlignment. Is this always the case? Or does it just have to be the case whenever the reification properties are used?

If the domain statements were normative (which they are not), using rdf:subject, rdf:predicate and rdf:object properties on an instance of tcs:TaxonConceptAlignment would entail that tcs:TaxonConceptAlignment is an rdfs:subClassOf rdf:Statement and <https://eg.tdwg.org/tcs/tca/15> would be an instance of (have rdf:type) both rdf:Statement and tcs:TaxonConceptAlignment. I can live with that.

Using rdf:subject, rdf:predicate and rdf:object properties would entail that the subject is an instance of rdf:Statement. But I do not think it would entail that tcs:TaxonConceptAlignment is an rdfs:subClassOf rdf:Statement. That is, if I only asserted

<https://eg.tdwg.org/tcs/tca/15> a tcs:TaxonConceptAlignment.

it would not automatically entail

<https://eg.tdwg.org/tcs/tca/15> a rdf:Statement.

as would be the case if we declared

tcs:TaxonConceptAlignment rdfs:subClassOf rdf:Statement .

This is a subtle point, because as a practical matter we would probably always be using the RDF reification properties and the subject would then basically always also be an instance of rdf:Statement. But my reason for making this point is that I don't think we are "breaking the rules" by minting terms at the basic bag of terms level that generate entailments. That is, using the term tcs:TaxonConceptAlignment doesn't itself generate entailments.
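
The distinction can be sketched in Turtle as follows. The domain axiom is the one declared in the RDF Schema vocabulary; the commented-out triples show what an RDFS reasoner would and would not derive, and are annotations rather than output from any particular tool.

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tcs:  <http://rs.tdwg.org/tcs/terms/> .

# Axiom from the RDF Schema vocabulary.
rdf:subject rdfs:domain rdf:Statement .

# Asserted data.
<https://eg.tdwg.org/tcs/tca/15> a tcs:TaxonConceptAlignment ;
    rdf:subject <https://eg.tdwg.org/tcs/tc/2000-Amblystegium> .

# Entailed by the RDFS domain rule (rdfs2), for this resource only:
#   <https://eg.tdwg.org/tcs/tca/15> a rdf:Statement .

# Not entailed; this would have to be declared explicitly:
#   tcs:TaxonConceptAlignment rdfs:subClassOf rdf:Statement .
```

An instance of tcs:TaxonConceptAlignment that does not use any of the reification properties would therefore not be inferred to be an rdf:Statement.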

@nielsklazenga
Member

nielsklazenga commented Aug 17, 2023

This was the text I Turtled, by the way (as well as I can do in Markdown):

Amblystegium Schimp.

Bryol. Eur. 6: 45. 1853 <1>

= Hygroamblystegium Loeske, nom. cons.

Weltweit etwa 35 Arten, in Europa 8 Arten. Abweichend von Corley & al. wird in Übereinstimmung mit Anderson & al. (1990), Ignatov & Afonina (1992), Blockeel & Long (1998) sowie Crosby & al. (1999) Leptodictyum als Gattung akzeptiert (s. auch Hedenäs 1990a)

< Amblystegium Schimp. sec. Corley & al. (1981/1991)

incl. Leptodictyum riparium (Amblystegium r.)

> Amblystegium Schimp. sec. Frahm & Frey (1992)

excl. Amblystegium confervoides (Amblystegiella c.), A. subtile (Amblystegiella s.), A. fluviatile (Hygroamblystegium f.), A. tenax (Hygroamblystegium t.), A. radicale (Campylium r.)

> Hygroamblystegium Loeske sec. Frahm & Frey (1992)

> Hygroamblystegium Loeske sec. Mönkemeyer (1927)

>< Amblystegiella Loeske sec. Frahm & Frey (1992)

incl. Amblystegium confervoides (Amblystegiella c.), Amblystegium subtile (Amblystegiella s.)

>< Campylium (Sull.) Mitt. sec. Frahm & Frey (1992)

incl. Amblystegium radicale (Campylium r.)

>< Amblystegium Schimp. sec. Mönkemeyer (1927)

incl. Conardia compacta (Amblystegium c.), *Isopterygiopsis alpicola (Lind. & Arnell) Hedenäs (Amblystegium a.), Leptodictyum riparium (Amblystegium r.), Platydictya jungermannioides (Amblystegium sprucei), Pseudoleskeella catenulata p.p. (Amblystegium ursorum) sowie die aussereuropäische Art *Homomallium adnatum (Hedw.) Broth. (Amblystegium a.); excl. Amblystegium fluviatile (Hygroamblystegium f.), A. tenax (Hygroamblystegium t.), A. radicale (Chrysohypnum hygrophilum)

>< Chrysohypnum (Hampe) G.Roth sec. Mönkemeyer (1927)

incl. Amblystegium radicale (Chrysohypnum hygrophilum)

>< Amblystegium Schimp. sec. Smith (1980)

incl. Conardia compacta (Amblystegium c.), Leptodictyum riparium (Amblystegium r.); excl. Amblystegium confervoides (Platydictya c.)

>< Platydictya Berk. sec. Smith (1980)

incl. Amblystegium confervoides (Platydictya c.)

@nielsklazenga
Member

@baskaufs

Using rdf:subject, rdf:predicate and rdf:object properties would entail that the subject is an instance of rdf:Statement. But I do not think it would entail that tcs:TaxonConceptAlignment is an rdfs:subClassOf rdf:Statement.

Even better!

@EndCoop

EndCoop commented Aug 18, 2023

Thanks @nielsklazenga, I might be a botanist but I do recognise a turtle when I see one... (bad joke to keep the discussion amicable). My point about RDF was to preempt objections of the form "TCS isn't RDF" and "TCS doesn't have a model". Call a spade a spade, if the examples are RDF and the proposed properties are borrowed from RDF Schema, then we are indeed talking about RDF and RDF is a model. But this isn't the main point.

The point (as I understand it) is to decide between:

  • using tcs:TaxonConceptAlignment class to model these relationships,
  • using tcs:intersects (etc) properties on the tcs:TaxonConcept class to state these relationships and using reification to enable description of these relationships (including source attribution)
  • doing both

It seems like there is at least some agreement that doing both isn't the best option for a standard. This is my opinion.

If the relationship class approach is taken, then subclasses or a controlled type vocabulary can be used. If the relationship properties approach is taken then the properties are the types. I don't think this difference is likely to be decisive.

More important is the discussion around how much we want TCS to be able to describe the relationships. The relationship property approach coupled with reification allows relationship statements to be described but you have to use blank nodes, skolemization, or create other identifiers for these statements so that you can give them other properties. This is where the question of rdfs:domain of the reification vocabulary terms comes in. @nielsklazenga says that these aren't normative so it doesn't matter (it's not against the rules but is it a good idea?). I see two potential issues here:
(1) there is a subtle, but possibly important, distinction between a relationship and the statement of a relationship. I think there are things you can say about a relationship that don't really make sense to say about the statement of that relationship. It is important that properties are pointed at the right object. This is why I pointed at an rdfs:comment that appears to address the incorrect object. @nielsklazenga says that it is a comment on the relationship but in his explanation it sounds a lot like it is a statement about the object. But it might just be that this is a confusing example and it is indeed a property of the relationship. So would it make sense as a property of the statement of that relationship? I think not.
(2) how is this going to work outside of RDF? Okay I'm circling back, but if we are going with TCS isn't RDF and doesn't have a model what would all of this look like in a non-RDF application?

Ultimately we could generalise everything until we had one class for things with extensive properties to describe the thing and one property for relationships between things with extensive properties to describe the relationship. But we don't do this and I'm not attempting a reductio ad absurdum argument. My opinion is that we have enough to say about taxon concept alignments to make it worthwhile having a class for these things.

We can skip the discussion of tcs:synonym which looks an awful lot like a taxon concept alignment. Although I will clarify, implicit in "there is never a reason to reify these statements" is that there is never a reason to explicitly attribute these statements because the implicit attribution through the taxon concept accordingTo is always sufficient. Is it?

Thanks @baskaufs, I agree that there is no suggestion that using the RDF reification vocabulary would entail that a tcs:TaxonConceptAlignment is a subclass of rdf:Statement. As I understand it, an individual can be a member of any number of classes without entailing any relationship between those classes.

I asked the question about tcs:TaxonRelationship because that is the term that appears in the current term list. It might help discussion if the term list is updated to include all such decisions.

@nielsklazenga
Member

nielsklazenga commented Aug 18, 2023

@EndCoop , I welcomed your questions, which is why I took so much time to respond to them, so no extra effort needed to stay amicable, we are just having a conversation (but I understand feelings sometimes do run high, so it is appreciated). As you know, I am a botanist too.

Yes, the TaxonRelationship / TaxonConceptAlignment thing will be confusing for people who are just joining the discussion. I did not want to update the term list before we have completed this discussion, but it was up to date until I started this thread and I will update it when we have reached a conclusion. TaxonRelationship and TaxonRelationshipAssertion are the terms used in TCS 1. When we discussed this in #43, I did not like the 'Relationship' part of the term, as we had just moved parent out of it, and after some discussion 'TaxonConceptAlignment' (@ghwhitbread 's suggestion actually) was what we came up with. Then, when I started writing examples, I needed a more general term, especially since we added the intersects relationship, which can be used for things that are definitely not alignments. I have used it, for example, for any references that are cited in a treatment. However, I started this discussion with saying that we should not have a TaxonRelationship class, because it is the same thing as the dwc:ResourceRelationship. If we define a class in TCS, it has to be something that is very specific to TCS, which is why I picked up TaxonConceptAlignment again (I think I referred to #43 then too).

Yes, the point is that the topological relationship terms—not only intersects, but also isCongruentWith, includes, isIncludedIn, partiallyOverlaps and isDisjointFrom—should be properties, so they can be used outside a "Relationship" object (and also because relationships should just be properties).

I am ambivalent about having a TaxonConceptAlignment (or whatever we want to call it) class, but I would use it if it were available and I think we can get it ratified (while a bespoke TaxonRelationship class with bespoke properties we cannot).

The "to reify or not to reify" question is a big red herring. My proposal was (and still is) to define the relationships in TCS, but not have TCS tell people how to implement relationships. The reification I used in my examples was just a way to deal with the dwc:ResourceRelationship (which is one way of implementing relationships) in RDF. I happened to stumble on the same solution @baskaufs came up with two years ago (I did not know that when I started the thread), but was not really advocating the RDF Reification Vocabulary in particular, just something that is out there already. If people do not like it, they can do it differently. Either way, it has got nothing to do with TCS.

tcs:synonym might look like a taxon concept alignment (to some people), but it is most definitely not. A synonym, like the accepted name, is a Taxon Name, not another Taxon Concept, and the SKOS analogue for tcs:synonym is skosxl:hiddenLabel, not one of SKOS's semantic properties that would be the analogues of the topological relationship properties in TCS (and which have a skos:Concept rather than a skosxl:Label as object). This is actually the biggest problem in TCS 1 that we set out to fix. TCS 1 has the has synonym (pardon the spelling if I did not get it right) relationship, but says it should not be used, as it is a mixed concept. We fixed this by splitting it into tcs:synonym (for proper synonyms) and tcs:intersects (for taxon relationships). So, if people confuddle concepts and labels and talk about synonyms more loosely than nomenclatural hardliners like me do, they are talking about tcs:intersects, not tcs:synonym (I think that is in the term list).
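
The SKOS analogy can be sketched in Turtle roughly as follows. The `ex:` resources are hypothetical. Pairing tcs:acceptedName with skosxl:prefLabel is an assumption made here for symmetry (only the tcs:synonym and skosxl:hiddenLabel pairing is stated above), and skos:related stands in for "one of SKOS's semantic properties" without implying a mapping to any particular TCS term.

```turtle
@prefix skos:   <http://www.w3.org/2004/02/skos/core#> .
@prefix skosxl: <http://www.w3.org/2008/05/skos-xl#> .
@prefix ex:     <https://example.org/> .   # hypothetical namespace

# Concept -> label links: the kind of link tcs:synonym is
# (Taxon Concept -> Taxon Name).
ex:conceptA a skos:Concept ;
    skosxl:prefLabel   ex:labelAccepted ;   # cf. tcs:acceptedName (assumed pairing)
    skosxl:hiddenLabel ex:labelSynonym .    # cf. tcs:synonym

ex:labelAccepted a skosxl:Label ;
    skosxl:literalForm "Amblystegium" .

ex:labelSynonym a skosxl:Label ;
    skosxl:literalForm "Hygroamblystegium" .

# Concept -> concept link: the kind of link the topological
# relationship properties are (Taxon Concept -> Taxon Concept).
ex:conceptB a skos:Concept .
ex:conceptA skos:related ex:conceptB .
```

The point of the sketch is only the difference in the type of the object: a label (name) in the first case, another concept in the second.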

@nielsklazenga
Member

nielsklazenga commented Aug 19, 2023

@EndCoop:

This is why I pointed at an rdfs:comment that appears to address the incorrect object. @nielsklazenga says that it is a comment on the relationship but in his explanation it sounds a lot like it is a statement about the object. But it might just be that this is a confusing example and it is indeed a property of the relationship. So would it make sense as a property of the statement of that relationship?

I have the same issue. The rdfs:comment in the tcs:TaxonConceptAlignment (or dwc:ResourceRelationship, or rdf:Statement) is about the object in relationship to the subject, so it is an explanation for why the subject and object have the particular relationship. So, indeed not a property of the relationship itself, but of the assertion (or statement) of the relationship. This is one reason I would always add 'Assertion' to the name of the object (or class rather).

This, however, is about syntax and data structures, which is not the concern of TDWG Vocabulary Standards, but of Application Profiles (Vocabulary Maintenance Specification). That is at least the argument I am pursuing and the fact that there is so much discussion about the boilerplate stuff in the examples only strengthens that argument.

@EndCoop

EndCoop commented Aug 21, 2023

tcs:synonym might look like a taxon concept alignment (to some people), but it is most definitely not. A synonym, like the accepted name, is a Taxon Name, not another Taxon Concept, and the SKOS analogue for tcs:synonym is skosxl:hiddenLabel, not one of SKOS's semantic properties that would be the analogues of the topological relationship properties in TCS (and which have a skos:Concept rather than a skosxl:Label as object). This is actually the biggest problem in TCS 1 that we set out to fix. TCS 1 has the has synonym (pardon the spelling if I did not get it right) relationship, but says it should not be used, as it is a mixed concept. We fixed this by splitting it into tcs:synonym (for proper synonyms) and tcs:intersects (for taxon relationships). So, if people confuddle concepts and labels and talk about synonyms more loosely than nomenclatural hardliners like me do, they are talking about tcs:intersects, not tcs:synonym (I think that is in the term list).

Ok, got it, tcs:synonym is strictly nomenclatural. So for inferencing: a tcs:synonym only means that a name applies to a tcs:TaxonConcept and that it isn't the tcs:acceptedName. It does not imply any topological relationship between taxon concepts. Such relationships must be stated separately using a tcs relationship property or tcs relationship class. We would need two statements in tcs for the same statement in the original source (i.e. the nomenclatural part and the "conceptual" part). This is fine, there are two meanings and separation of concerns is generally a good thing.

There is another implication of this. And it also relates to a comment in #205 that "synonymy statements should not have an accordingTo they are not assertions". Borrowing an example from the ICN (2018, 52.2 Ex. 2). Johann Link illegitimately published Picea excelsa Link (nom. illeg. superfl.). A TCS version of Link's taxon concept would have to have the tcs:acceptedName of Picea abies (L.) H.Karst. because as a nomenclatural fact this is the correct name for the taxon. This is a bit odd because the tcs:acceptedName for Link's concept doesn't appear in the tcs:accordingTo reference for the concept and didn't, in fact, exist as a combination until 40 years after Link's paper was published. This is fine, if tcs:acceptedName is strictly nomenclatural, but how do we use TCS to reflect what Link said in his paper? More tcs properties, ones that do have an accordingTo? Or just more "comments".

Why is this relevant to this discussion? Because if we want to be able to reflect the nomenclatural details as recorded in the source, we need attribution for them. It might be that implicit attribution to the accordingTo for the concept is sufficient (and would therefore be part of the definition of these properties), or we're going to be doing a whole lot of reification (at least in RDF implementations). This is important because tcs:intersects as a property (and the other topological relationships) still needs to be attributed to a source, even if the source is the same as the subject taxon concept, unless the source is the exchange file itself. So, the examples given at the top of this discussion:

Having the relationship types as TCS properties enables us to do things like this:

<https://id.biodiversity.org.au/instance/apni/545068> a tcs:TaxonConcept ;
  dcterms:title "Euphrasia gibbsiae sec. Barker 1982" ;
  tcs:accordingTo <https://www.jstor.org/stable/23873848> ;
  tcs:acceptedName <https://www.ipni.org/n/802545-1> ;
  tcs:synonym <https://www.ipni.org/n/802619-1> ;
  tcs:intersects <https://id.biodiversity.org.au/instance/apni/713514> .

<https://id.biodiversity.org.au/instance/apni/713514> a tcs:TaxonConcept ;
  dcterms:title "Euphrasia gibbsiae sec. Curtis 1967" ;
  tcs:accordingTo <https://id.biodiversity.org.au/reference/apni/23028> ;
  tcs:acceptedName <https://www.ipni.org/n/802545-1> .

... and, for a third-party taxon relationship assertion, in the body of an annotation or a nano-publication:

_:someone_elses_assertion a rdf:Statement ;
  rdf:subject <https://id.biodiversity.org.au/instance/apni/545068> ;
  rdf:predicate tcs:includes ;
  rdf:object <https://id.biodiversity.org.au/instance/apni/713514> .

In the first case we have an implicitly attributed tcs:intersects statement. But tcs:synonym is not similarly attributed to the source because it is strictly nomenclatural and does not have an accordingTo at all. In the second case, the tcs:includes could have an explicit attribution for the taxon concept alignment relationship (the example doesn't include one, but having the source is part of the point).

I have a couple of questions on these examples:

  • Are the four triples in the second example actually enough for the relationship statement to be made? I think not, but I'm not completely sure.
  • If we were to merge these two sets of triples into one file, do we now have the same statement attributed to more than one source: one implicit and one (in the example unsourced) explicit?

@nielsklazenga
Member

nielsklazenga commented Aug 21, 2023

Hi @EndCoop,

  • The tcs:acceptedName of 'Picea excelsa sec. Link 1841' is 'Picea excelsa'. A name does not have to be legitimate to be an accepted name, it does not even have to be a name (according to the Code):
[] a tcs:TaxonConcept ;
    dcterms:title "Picea excelsa sec. Link 1841" ;
    tcs:accordingTo [ dcterms:bibliographicCitation "Linnaea 15: 517. 1841" ] ;
    tcs:acceptedName <https://www.ipni.org/n/262647-1> .

<https://www.ipni.org/n/262647-1> a tcs:TaxonName ;
    tcs:taxonNameString "Picea excelsa" ;
    dwc:scientificNameAuthorship "Link" ;
    tcs:nomenclaturalStatus <http://rs.gbif.org/vocabulary/gbif/nomenclatural_status/illegitimum> .

@EndCoop

EndCoop commented Aug 21, 2023

Thanks @nielsklazenga, I think I'm beginning to grasp the tcs (not-a-model) model. The proposal is that the topological relationship properties have an implicit (from context) tcs:accordingTo in the same way that tcs:acceptedName and tcs:synonym do. This is not what I understood when you said that tcs:synonym statements should not have an accordingTo over in #205. But it makes a lot more sense (otherwise, if you follow your "nomenclatural business rules", you might end up with the silly result I pointed at in my previous comment).

The RDF application is still a bit unclear to me. A triple doesn't have a context. So I'm having trouble seeing how to express a "third-party" taxon concept alignment. Using the initial example:
_:someone_elses_assertion a rdf:Statement ; rdf:subject <https://id.biodiversity.org.au/instance/apni/545068> ; rdf:predicate tcs:includes ; rdf:object <https://id.biodiversity.org.au/instance/apni/713514> .

In order to actually make the statement you also need the triple (because the reification above doesn't actually make the statement that it is reifying):
<https://id.biodiversity.org.au/instance/apni/545068> tcs:includes <https://id.biodiversity.org.au/instance/apni/713514> .

In RDF this is indistinguishable from any other identical triple regardless of context. So how do we know where this assertion actually comes from? Maybe I'm talking about "syntactic sugar" and "ornaments" and should save it for an RDF "application profile"? On the other hand, if we stick with the existing class then two individuals with the same subject, predicate, and object can be stated with their own source attribution.

@nielsklazenga
Member

nielsklazenga commented Aug 21, 2023

Hi @EndCoop, just looking at #205 now, that is something that came up in a meeting we had and we have not done anything with (because I was not going to). The "Traditional Synonymy" term originally comes from the Berlin Model and its relationships include things like "pro parte synonym" and "misapplication". The Berlin Model also has "partial synonym" (no clue how that differs from "pro parte synonym") and the NSL has "pro parte misapplication", and both also include "heterotypic synonym" and "homotypic synonym" in this list. This is because the Berlin Model does not deal in taxon concepts but in Potential Taxa, which are similar to (standalone) Instances in the NSL.

If we have all relationships as properties, so no need for a relationshipType property, we do not need a traditionalSynonymyRelationshipType either, as we could just add proParteSynonym and misapplication properties. As they are not in TCS 1 (or any current standard), we cannot do that right from the start, but the properties can be proposed later. The big problem is going to be that terms need definitions and I do not think there are definitions for these terms that everyone is going to agree on (I am not the problem here, as I do not care what they mean, as I will never use them).

I would not get hung up on the RDF too much. We just do it as a language to describe our data and in particular create the examples. For me, for the examples, it is only important that it is valid (which is a pretty low bar in RDF), makes sense in the context and that it is easily read and understood. It is not a suggestion for implementation. In my head, I am thinking more of it like JSON objects, which I then translate to TuRTLe. I actually started with (and preferred) JSON-LD, where you can restrict the RDF to a context file and be done with it, but because of its terseness, readability, ease of writing (especially since I now know there are validators available online) and it allowing comments, I now prefer TuRTLe. It does not have to be RDF for me, but the nice thing of RDF is that it is—to a large extent—self-documenting, so there is a lot less explaining to do.

A "statement" is not the same thing as an RDF triple. The statement is in the data, and depending on your needs, you can either write that as a triple or reify it to an addressable object. As you have seen, I am not above doing both for the same statement.

@nielsklazenga
Member

nielsklazenga commented Aug 21, 2023

Or you can do it like so, sans reification:

@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix this: <https://eg.tdwg.org/tcs/ex/a4229ec1c282644f> .
@prefix sub: <https://eg.tdwg.org/tcs/ex/a4229ec1c282644f#> .
@prefix np: <http://www.nanopub.org/nschema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix orcid: <https://orcid.org/> .
@prefix tcs: <http://rs.tdwg.org/tcs/terms/> .
@prefix prov: <http://www.w3.org/ns/prov#> .

sub:Head {
    this: a np:Nanopublication ;
        np:hasAssertion sub:assertion ;
        np:hasProvenance sub:provenance ;
        np:hasPublicationInfo sub:pubinfo .
}

sub:assertion {
    <https://eg.tdwg.org/tcs/tc/2012-Athyriaceae> tcs:isIncludedIn <https://eg.tdwg.org/tcs/tc/2006-Woodsiaceae> . 
}

sub:provenance {
    sub:assertion prov:hadPrimarySource <https://doi.org/10.1002/tax.613003> .
}

sub:pubinfo {
    this: dcterms:date "2012" ;
        dcterms:creator <https://orcid.org/0000-0002-6605-1770> ,
                <https://orcid.org/0000-0003-1568-150X> ,
                <https://www.ipni.org/a/20030162-1> ,
                <https://www.ipni.org/a/14414-1> ,
                <https://orcid.org/0000-0003-3891-9904> ,
                <https://orcid.org/0000-0002-9776-6736> .
}

(This is TriG, not TuRTLe, as you cannot do named graphs in TuRTLe; no idea if the syntax is valid)
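To show how the named-graph approach carries attribution without a relationship object, here is a minimal Python sketch that models quads as 4-tuples (subject, predicate, object, graph) and looks up the primary source of whichever graph contains a given triple. The function name `primary_sources` and the tuple representation are assumptions for illustration only; a real implementation would use an RDF library with named-graph support.

```python
def primary_sources(quads, triple):
    """Return prov:hadPrimarySource values of every graph that contains `triple`."""
    # Graphs in which the triple is asserted.
    graphs = {g for (s, p, o, g) in quads if (s, p, o) == triple}
    # Provenance statements whose subject is one of those graphs.
    return {o for (s, p, o, g) in quads
            if s in graphs and p == "prov:hadPrimarySource"}


# The assertion and provenance graphs from the nanopublication example.
alignment = (
    "https://eg.tdwg.org/tcs/tc/2012-Athyriaceae",
    "tcs:isIncludedIn",
    "https://eg.tdwg.org/tcs/tc/2006-Woodsiaceae",
)
quads = {
    alignment + ("sub:assertion",),
    ("sub:assertion", "prov:hadPrimarySource",
     "https://doi.org/10.1002/tax.613003", "sub:provenance"),
}

if __name__ == "__main__":
    print(primary_sources(quads, alignment))
```

Because the source hangs off the graph rather than the triple, the same alignment asserted in a second graph would keep its own, separate source.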

@ghwhitbread
Collaborator

A "statement" is not the same thing as an RDF triple. The statement is in the data, and depending on your needs, you can either write that as a triple or reify it to an addressable object. As you have seen, I am not above doing both for the same statement.

An implication of two ways of describing relationships is two vocabularies, or a vocabulary linking parallel terms: one set for predicates in direct triples and another for the reified predicates in their equivalent Statements.

@EndCoop

EndCoop commented Aug 22, 2023

A "statement" is not the same thing as an RDF triple. The statement is in the data, and depending on your needs, you can either write that as a triple or reify it to an addressable object. As you have seen, I am not above doing both for the same statement.

Wait, what? This feels like gaslighting. https://www.w3.org/TR/rdf11-concepts/#resources-and-statements seems to describe correspondence between statements and triples. I suppose that this doesn't mean that the triple actually is the statement but "An RDF triple encodes a statement—a simple logical expression, or claim about the world.".

The point I was attempting to make is that you can't conclude that:
ex:subject ex:predicate ex:object .
from :

_:xxx rdf:type rdf:Statement .
_:xxx rdf:subject ex:subject .
_:xxx rdf:predicate ex:predicate .
_:xxx rdf:object ex:object .

So if you want your dataset to say ex:subject ex:predicate ex:object . then you have to include ex:subject ex:predicate ex:object . in your dataset.
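A minimal Python sketch of this point, treating a dataset as a plain set of (subject, predicate, object) string tuples. The helper names `reify` and `materialize` are hypothetical, and the set-of-tuples model is an assumption for illustration, not how any particular RDF library works.

```python
def reify(node, s, p, o):
    """Return the four triples that describe the statement (s, p, o) at `node`."""
    return {
        (node, "rdf:type", "rdf:Statement"),
        (node, "rdf:subject", s),
        (node, "rdf:predicate", p),
        (node, "rdf:object", o),
    }


def materialize(triples):
    """Build the direct triples described by rdf:Statement nodes.

    This is an explicit choice made by the data user, not an RDF entailment.
    """
    nodes = {s for (s, p, o) in triples
             if p == "rdf:type" and o == "rdf:Statement"}
    direct = set()
    for node in nodes:
        parts = {p: o for (s, p, o) in triples if s == node}
        try:
            direct.add((parts["rdf:subject"],
                        parts["rdf:predicate"],
                        parts["rdf:object"]))
        except KeyError:
            continue  # incomplete reification: nothing to materialize
    return direct


dataset = reify("_:xxx", "ex:subject", "ex:predicate", "ex:object")
wanted = ("ex:subject", "ex:predicate", "ex:object")

if __name__ == "__main__":
    print(wanted in dataset)               # the reification alone does not contain it
    print(wanted in materialize(dataset))  # it appears only after an explicit step
```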

Now, if we have the taxon concept alignment relationships as properties of tcs:TaxonConcept then, as you said above, the source of a taxon concept alignment stated within the context of a taxon concept is the tcs:accordingTo for that taxon concept (and you said the same for tcs:synonym). So the three triples:

ex:taxon-concept-1 a tcs:TaxonConcept ;
tcs:accordingTo ex:source-1 ;
tcs:includes ex:taxon-concept-2 .

are considered enough to say that the source of ex:taxon-concept-1 tcs:includes ex:taxon-concept-2 . is ex:source-1 (via the according to on the taxon concept).

What happens when we add a third party assertion

_:xxx rdf:type rdf:Statement .
_:xxx rdf:subject ex:taxon-concept-1 .
_:xxx rdf:predicate tcs:includes .
_:xxx rdf:object ex:taxon-concept-2 .
_:xxx dcterms:source ex:source-2 .

is that we now have to look in two places for the source of ex:taxon-concept-1 tcs:includes ex:taxon-concept-2 . Seems like a problem.

Note that the reified statement for the third party assertion didn't need to repeat the triple ex:taxon-concept-1 tcs:includes ex:taxon-concept-2 . because it was already made "within" taxon-concept-1. But what if it wasn't? What if the accordingTo of ex:taxon-concept-1 didn't say anything about taxon concept alignments? Then either you couldn't conclude that ex:taxon-concept-1 tcs:includes ex:taxon-concept-2 . from the dataset containing the third party assertion or it would have to be explicitly added to the dataset, this time "outside" taxon-concept-1 as the triple ex:taxon-concept-1 tcs:includes ex:taxon-concept-2 . but now we don't want to conclude that the source of ex:taxon-concept-1 tcs:includes ex:taxon-concept-2 . is ex:source-1 (via the according to on the taxon concept). This is inconsistent.
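The "two places" problem can be made concrete with a small Python sketch over a set of (subject, predicate, object) string tuples. The function name `sources_of` and the "implicit"/"explicit" labels are hypothetical; the implicit rule coded here is the one described above (a direct triple is attributed to the subject's tcs:accordingTo), taken as an assumption rather than as settled TCS semantics.

```python
def sources_of(triples, s, p, o):
    """Collect every source attributable to the statement (s, p, o)."""
    found = set()
    # Place 1: the direct triple, attributed via the subject's tcs:accordingTo.
    if (s, p, o) in triples:
        found |= {("implicit", src) for (s2, p2, src) in triples
                  if s2 == s and p2 == "tcs:accordingTo"}
    # Place 2: reified statements about the same triple, with dcterms:source.
    nodes = {n for (n, p2, o2) in triples
             if p2 == "rdf:type" and o2 == "rdf:Statement"}
    for node in nodes:
        described = {(node, "rdf:subject", s),
                     (node, "rdf:predicate", p),
                     (node, "rdf:object", o)}
        if described <= triples:
            found |= {("explicit", src) for (n, q, src) in triples
                      if n == node and q == "dcterms:source"}
    return found


data = {
    ("ex:taxon-concept-1", "rdf:type", "tcs:TaxonConcept"),
    ("ex:taxon-concept-1", "tcs:accordingTo", "ex:source-1"),
    ("ex:taxon-concept-1", "tcs:includes", "ex:taxon-concept-2"),
    ("_:xxx", "rdf:type", "rdf:Statement"),
    ("_:xxx", "rdf:subject", "ex:taxon-concept-1"),
    ("_:xxx", "rdf:predicate", "tcs:includes"),
    ("_:xxx", "rdf:object", "ex:taxon-concept-2"),
    ("_:xxx", "dcterms:source", "ex:source-2"),
}

if __name__ == "__main__":
    print(sources_of(data, "ex:taxon-concept-1", "tcs:includes",
                     "ex:taxon-concept-2"))
```

Note that the implicit branch fires for any direct triple in the data, including one added only to back a third-party assertion, which is exactly the inconsistency described above.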

The inconsistency comes about because of this:

  • The attribution for tcs:synonym, like for any property on a TaxonConcept, including tcs:intersects, is the tcs:accordingTo. This has got nothing to do with the property and everything with the context in which it is used. If you reify an accepted name or synonym statement (so, take it out of its context), you'd have to give the tcs:accordingTo of the Taxon Concept as the dcterms:source of the statement.

but if this isn't true, you always have to use reification to provide the source of a taxon concept alignment assertion (even if it is in the "context" of the subject taxon concept).

A simpler solution might be to use a tcs:TaxonConceptAlignment class and not put taxon concept alignment properties onto tcs:TaxonConcept .

@EndCoop

EndCoop commented Aug 22, 2023

Or you can do it like so, sans reification:

Isn't this just another way to do (concept) reification (not RDF reification)? Now all you're doing is creating a named graph to hang properties on, instead of using the reification vocabulary to hang properties off a statement. I think we have established that to talk about taxon concept alignments (relationships) we need to reify them in some form. So syntax (Turtle vs. TriG) or choice of RDF reification vocabulary, blank nodes, named graphs or quoted triples isn't really the point. I think that taxon concept alignments are special enough and interesting enough for TCS to include a class for them. I don't see the advantage in dumping the class. I don't see the advantage in having two ways to do the same thing by having a class and having new properties.

@nielsklazenga
Member

nielsklazenga commented Aug 22, 2023

Yes, that is the point. It does not matter (for TCS) what the RDF looks like. TCS has to allow different representations with the same semantics.

And yes, I agree that the Taxon Concept Alignments are special enough to have a class for them in TCS (that is still the proposal), but it should not be the only way these relationships can be represented. Also, we should not define properties for this class, but let people (or communities) choose how they want to implement it, or whether to use it at all. In the Nanopublication example above, for example, it is better not to have a TaxonConceptAlignment object.

@nielsklazenga
Member

nielsklazenga commented Aug 22, 2023

Okay, I think that is more than enough about the niceties or not-so-niceties of RDF. Can we get back to TCS and the proposal at hand now?

All the problems people are having with my examples were with the Relationship object in these examples (whether it be called rdf:Statement, dwc:ResourceRelationship, tcs:TaxonRelationship or tcs:TaxonConceptAlignment). My proposal was to not define a Relationship class and to define the relationships as properties. So, logically, these objections to the examples should translate to support for the proposal.

Every relationship, in TCS and everywhere else, boils down to and has to be expressible as:

[A] <is related to> [B]

The problem here is that we often need an ID and/or extra metadata on the relationship (or on the assertion of the relationship). This problem is not unique to TCS and therefore not a concern of TCS. For me, reification is the solution (or 'a' solution, as I provided another one as well), not the problem, and the RDF reification vocabulary was exactly what I was looking for. If people think that reification is bad for some reason, they will have to find a different solution (for themselves), but that solution cannot be defining a reification vocabulary in TCS. TCS does not define boilerplate.

We cannot have a Relationship object that is essentially the same as the Darwin Core ResourceRelationship. The Darwin Core and RDF/OWL Task Groups, in the Darwin Core RDF Guide, decided not to define dwciri: analogues for the properties of the dwc:ResourceRelationship, because it was 'not clear whether dwc:ResourceRelationship instances make sense in the context of RDF'. We cannot just ignore that and define the same properties in tcs:. No matter whether people like it or not, TCS is a TDWG standard, Darwin Core is TDWG's "general purpose" standard and tcs: is to the Darwin Core Taxon class what dwciri: is to the rest of Darwin Core. If something does not go for Darwin Core, it does not go for TCS either. If people do not agree with the Darwin Core RDF Guide, that is a discussion to have in the TAG, not the TCS 2 Task Group.

@baskaufs

I am hesitant to post more RDF-related stuff here, because it is probably detracting from the main issue. But I did want to make a comment about one of @EndCoop's earlier statements:

The point I was attempting to make is that you can't conclude that: ex:subject ex:predicate ex:object . from :

_:xxx rdf:type rdf:Statement .
_:xxx rdf:subject ex:subject .
_:xxx rdf:predicate ex:predicate .
_:xxx rdf:object ex:object .

So if you want your dataset to say ex:subject ex:predicate ex:object . then you have to include ex:subject ex:predicate ex:object . in your dataset.

@EndCoop says "you can't conclude that: ex:subject ex:predicate ex:object .", but what does that mean? If it means "it isn't an automatic entailment", that is true, because I don't think there are any semantics of the RDF reification properties that would cause that. But I also think that the intention of having those properties is to imply that there is an ex:subject ex:predicate ex:object . triple, whether it is materialized or not. The advantage of NOT forcing the semantics is that it gives a user of the data the option of whether or not to assert the implied triple. Having such a triple might be convenient for some users who want to traverse the direct link between the subject and object nodes through querying or something, but other users may not care and choose not to include the direct triple.

Although the RDF reification properties might not get used much, there are similar strategies that are. The SKOS-XL extension allows a user to describe lexical entities and to link them in a way that can't be done with literal strings. Although there aren't any semantics of the SKOS-XL properties that entail the simpler SKOS label relationships, one can certainly generate the simple relationships based on understanding what the SKOS-XL properties mean. Example from the Getty Thesaurus of Geographic Names:

<http://vocab.getty.edu/tgn/7013959> skosxl:prefLabel <http://vocab.getty.edu/tgn/term/323840-en>.
<http://vocab.getty.edu/tgn/term/323840-en> skosxl:literalForm "Nashville"@en.
...more statements describing the provenance and relationships of http://vocab.getty.edu/tgn/term/323840-en

collapses to

<http://vocab.getty.edu/tgn/7013959> skos:prefLabel "Nashville"@en.

If I understand the Getty's process correctly, they generate the SKOS-XL triples from their database, then generate the SKOS label triples from the SKOS-XL. Those SKOS label triples aren't entailed, but they are implied by the SKOS model and are generated for the convenience of allowing for simpler queries. Having the ability to instantiate skosxl:Label resources enables the linking by label relations in a way analogous to the way we might want to link taxon relationship assertions.
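The collapse described above can be sketched in a few lines of Python over (subject, predicate, object) string tuples. The function name `collapse_pref_labels` is hypothetical and the tuple model is an assumption for illustration; it only mirrors the pattern of following skosxl:prefLabel and then skosxl:literalForm, and says nothing about how the Getty actually generates its data.

```python
def collapse_pref_labels(triples):
    """Derive skos:prefLabel triples by chaining skosxl:prefLabel and skosxl:literalForm."""
    derived = set()
    for (concept, p, label) in triples:
        if p != "skosxl:prefLabel":
            continue
        # Follow the label resource to its literal form(s).
        for (subject, q, literal) in triples:
            if subject == label and q == "skosxl:literalForm":
                derived.add((concept, "skos:prefLabel", literal))
    return derived


tgn = {
    ("http://vocab.getty.edu/tgn/7013959", "skosxl:prefLabel",
     "http://vocab.getty.edu/tgn/term/323840-en"),
    ("http://vocab.getty.edu/tgn/term/323840-en", "skosxl:literalForm",
     '"Nashville"@en'),
}

if __name__ == "__main__":
    print(collapse_pref_labels(tgn))
```

The same shape of function would collapse a relationship object into a single direct triple between two taxon concepts, keeping the richer object available for provenance.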

Another similar "real life" example is Wikidata's creation of statement instances, which allow references to be linked to those statements. There are also direct ("truthy") statements that are simpler, but don't allow for linking provenance information. The direct property statements, e.g.

wd:Q2 wdt:P2 wd:Q3.

are constructed for preferred and normal ranked statements, but are not for deprecated ranked statements -- as in my previous examples, this is a choice made based on how convenient you want to make searching. In the case of deprecated statements, they don't want to make them easier to find.

My point is that although the simpler triples may not be entailed by the more complex sets of triples, a user who understands the model may choose to materialize them for convenience.

As Niels has said, what we are doing here is analogous to the dwc:ResourceRelationship terms. I am hoping that we can settle on a TDWG-wide approach to model these kinds of relationships as linked data. The RDF reification terms provide a relatively simple and well-known possibility for doing that. The possibility to collapse the more complex expressions of these relationships into simple links between the subject and object resources (via a single triple using a property for the relationship) for the purposes of simplifying queries is appealing.

@EndCoop

EndCoop commented Aug 23, 2023

I am hesitant to post more RDF-related stuff here, because it is probably detracting from the main issue.

Point taken. (fwiw yes I was talking about entailment. I thought that skosxl does involve entailment, at least that is how I read https://www.w3.org/TR/skos-reference/#xl Appendix B.3.2. and the example 83)

@nielsklazenga
Member

nielsklazenga commented Aug 23, 2023

@baskaufs and @EndCoop, you can discuss RDF as much as you want now. The discussion is to the point, just not directly what I wanted to get out of the issue, but I've got that now. This issue is linked to #43, so if people want to discuss anything related to Relationship objects, they might as well do it here.

@nielsklazenga
Member

nielsklazenga commented Aug 24, 2023

@baskaufs:

I am hoping that we can settle on a TDWG-wide approach to model these kinds of relationships as linked data.

I think that is what we all want. Is this something the TAG could pick up (after we've put out TCS 2)? I think we probably have as strong a use case here as you are going to get (a relationship assertion for which we absolutely need an ID and a Blamable).

@nielsklazenga
Member

Okay, here is the new term list with what we discussed this morning (it was morning in Australia) implemented: https://github.com/tdwg/tcs2/tree/dev-20230824/docs/tcs-terms. I have kept it in a development branch, as there will probably be errors and I have to go over the text to see if the change requires any more changes to notes. It occurred to me today that it would probably have been good if I had created the branch right at the start... but then we would have missed all this (↑).

Not having a TaxonRelationship class causes a bit of a problem for the definition of TaxonConcept, as that referred to it. I have provisionally fixed it, but it can probably be improved. It would be nice if we could have a super-property but we cannot do that in TDWG standards.

@nielsklazenga
Member

Not having a TaxonRelationship class causes a bit of a problem for the definition of TaxonConcept, as that referred to it. I have provisionally fixed it, but it can probably be improved. It would be nice if we could have a super-property but we cannot do that in TDWG standards.

I have now informally referred to them as 'concept mapping relationships' (in analogy to the skos:mappingRelation property) and in the notes for TaxonConcept indicated what the concept mapping properties are.

@baskaufs

@baskaufs:

I am hoping that we can settle on a TDWG-wide approach to model these kinds of relationships as linked data.

I think that is what we all want. Is this something the TAG could pick up (after we've put out TCS 2)? I think we probably have as strong a use case here as you are going to get (a relationship assertion for which we absolutely need an ID and a Blamable).

Yes, it's on my todo list. I've got several other things going on with the TAG right now but I hope to bring it up and this discussion will be very helpful background.

@baskaufs

I am hesitant to post more RDF-related stuff here, because it is probably detracting from the main issue.

Point taken. (fwiw yes I was talking about entailment. I thought that skosxl does involve entailment, at least that is how I read https://www.w3.org/TR/skos-reference/#xl Appendix B.3.2. and the example 83)

Since Niels has given his blessing to continuing to talk about RDF, I'll respond.

Yes, you are right. I forgot about those entailment assertions in the SKOS model. I guess what I was thinking that a generic OWL reasoner would not materialize the simple triples based on the OWL properties of the terms -- you'd need to have a SKOS-aware system in order for those entailments to take effect (effectively making the choice to apply the entailments). But yes, following the model, they are entailed.

This is making my brain hurt... :-)

@nielsklazenga
Member

Yes, it's on my todo list. I've got several other things going on with the TAG right now but I hope to bring it up and this discussion will be very helpful background.

Thanks @baskaufs . I will put something about it in the Implementation Experience Report as well (so you can at least wait for that).

@EndCoop

EndCoop commented Aug 25, 2023

This is making my brain hurt... :-)

Yeah, sorry, I'm not sure I have been making my thoughts very clear but thank you for taking the time to look at them and reply @baskaufs

There is still a part of removing the tcs:TaxonRelationshipAssertion class and replacing its type vocabulary with object properties on the tcs:TaxonConcept that makes inference about the source of assertions tricky. But I concede that the entailment part is specific to RDF implementations and so outside the scope of TCS. Similarly, if property assertions using TCS are (mostly) attributable to the file/datasource, like they are with Darwin Core, then figuring out where property assertions come from is trivial. Presumably, as a TDWG standard, TCS is intended for use like this. So implementations that require more complex attribution of sources are likewise outside the scope of TCS.

@nielsklazenga
Member

nielsklazenga commented Aug 26, 2023

@EndCoop, what has happened here in no way stops us from having a Taxon Relationship or Mapping object some day but, because of the lack of clarity around the use of the dwc:ResourceRelationship in linked data, I think it is unwise to define it right now. What defining the relationship terms as properties does do, however, is allow them to be used with or without a relationship object, and thus make TCS (in particular these mapping relationship terms) more versatile. It also leaves the decision on which relationships can just be stated and which require a relationship object to the implementer and out of TCS.

Also, regarding attribution, tcs:accordingTo and tcs:namePublishedIn provide attribution of course, but that is not why they are in TCS. Without the accordingTo we do not know, or have a chance of knowing, what the definition of a taxonomic group—or the meaning of a name—is and a TaxonName with the same taxonNameString but a different namePublishedIn is a different name, so they are part of the data we record. Attribution in itself is not a concern of TCS. People can use Nanopublications or Web Annotations for that.

@camwebb
Member Author

camwebb commented Aug 28, 2023

Thanks all for the virtual meeting/discussion last week. @nielsklazenga, I like what you've done in the dev terms list. But a question on process: we left this matter (taxonRelationship with controlled vocab, as in TCS1, vs. new predicates) undecided at the meeting. Do you want a vote, or is there now a general sense that the predicate terms are better?

@nielsklazenga
Member

nielsklazenga commented Aug 28, 2023

Thanks @camwebb, no, let's circle back to this later. By the way, @baskaufs (or yourself) might have to correct me, but I do not think there is anything that has happened here that prevents people from using these terms as controlled terms.

@prefix tcs: <http://rs.tdwg.org/tcs/terms/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

tcs:isCongruentWith a rdf:Property, skos:Concept ;
	skos:prefLabel "is congruent with" ;
	skos:altLabel "equals" ;
	rdfs:comment """The subject and object taxon concepts have a congruent 
			taxonomic meaning, i.e. there is no conflict between the concepts""" ;
	skos:definition """The subject and object taxon concepts have a congruent 
			taxonomic meaning, i.e. there is no conflict between the concepts""" ;
	rdfs:isDefinedBy <http://rs.tdwg.org/tcs/terms> ;
	skos:inScheme <http://rs.tdwg.org/tcs/terms> .

tcs:includes a rdf:Property, skos:Concept ;
	skos:prefLabel "includes" ;
	skos:altLabel "has proper subset" ;
	rdfs:comment """The subject taxon concept has a more inclusive 
			taxonomic meaning than the object taxon concept""" ;
	skos:definition """The subject taxon concept has a more inclusive 
			taxonomic meaning than the object taxon concept""" ;
	rdfs:isDefinedBy <http://rs.tdwg.org/tcs/terms> ;
	skos:inScheme <http://rs.tdwg.org/tcs/terms> .

tcs:isIncludedIn a rdf:Property, skos:Concept ;
	skos:prefLabel "is included in" ;
	skos:altLabel "is proper subset of" ;
	rdfs:comment """The subject taxon concept has a less inclusive 
			taxonomic meaning than the object taxon concept""" ;
	skos:definition """The subject taxon concept has a less inclusive 
			taxonomic meaning than the object taxon concept""" ;
	rdfs:isDefinedBy <http://rs.tdwg.org/tcs/terms> ;
	skos:inScheme <http://rs.tdwg.org/tcs/terms> .

tcs:partiallyOverlaps a rdf:Property, skos:Concept ;
	skos:prefLabel "partially overlaps" ;
	skos:altLabel "overlaps" ;
	rdfs:comment """The subject and object taxon concepts have partially 
			overlapping taxonomic meanings, i.e. they have some members 
			in common, but each concept in addition has members that are 
			not included in the other concept""" ;
	skos:definition """The subject and object taxon concepts have partially 
			overlapping taxonomic meanings, i.e. they have some members 
			in common, but each concept in addition has members that are 
			not included in the other concept""" ;
	rdfs:isDefinedBy <http://rs.tdwg.org/tcs/terms> ;
	skos:inScheme <http://rs.tdwg.org/tcs/terms> .

tcs:isDisjointFrom a rdf:Property, skos:Concept ;
	skos:prefLabel "is disjoint from" ;
	skos:altLabel "excludes" ;
	rdfs:comment """The subject and object taxon concepts have non-overlapping
			taxonomic meanings, i.e. they do not have any members in common""" ;
	skos:definition """The subject and object taxon concepts have non-overlapping
			taxonomic meanings, i.e. they do not have any members in common""" ;
	rdfs:isDefinedBy <http://rs.tdwg.org/tcs/terms> ;
	skos:inScheme <http://rs.tdwg.org/tcs/terms> .

tcs:intersects a rdf:Property, skos:Concept ;
	skos:prefLabel "intersects" ;
	rdfs:comment """The taxonomic meanings of the subject and object taxon 
			concepts intersect, i.e. they have at least one member in common""" ;
	skos:definition """The taxonomic meanings of the subject and object taxon 
			concepts intersect, i.e. they have at least one member in common""" ;
	rdfs:isDefinedBy <http://rs.tdwg.org/tcs/terms> ;
	skos:inScheme <http://rs.tdwg.org/tcs/terms> .
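
Because each term above is typed as both rdf:Property and skos:Concept, the same IRI could in principle appear either as a predicate or as the value of a type property. The sketch below illustrates that dual use. The `ex:` namespace and the `ex:relationshipType`, `ex:subjectTaxonConcept` and `ex:objectTaxonConcept` properties are hypothetical stand-ins, not terms defined in TCS or Darwin Core.

```turtle
@prefix tcs: <http://rs.tdwg.org/tcs/terms/> .
@prefix ex:  <http://example.org/> .

# As a property: the term is the predicate of the triple.
ex:TNU1 tcs:partiallyOverlaps ex:TNU2 .

# As a controlled term: the term is the object of a type property.
# ex:relationshipType and the two ex:...TaxonConcept properties are
# hypothetical; no such properties are defined in this thread.
ex:rel1 ex:subjectTaxonConcept ex:TNU1 ;
        ex:relationshipType    tcs:partiallyOverlaps ;
        ex:objectTaxonConcept  ex:TNU2 .
```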

@nielsklazenga
Member

nielsklazenga commented Aug 29, 2023

@camwebb, sorry, I got your question wrong. The fate of a TaxonRelationship class and relationshipType property is tied to the discussions around the use of the dwc:ResourceRelationship class and its properties in linked data, so we have to wait for that. I want both, by the way. Having a property that can only take these six terms as its object (while still being able to use the terms elsewhere) is very appealing.

I am personally happy to define a TaxonConceptMapping (or ...Alignment) class right now, even though it is not in TCS 1, using the properties of the RDF reification vocabulary, which we would not have to define ourselves. I would have been equally happy with the properties of the dwc:ResourceRelationship class, had they been available. But that is what people were protesting against. So I guess people just have to make up their minds about what they want, or align it with what is actually achievable at the moment.
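
A rough sketch of what such a class might look like with the RDF reification vocabulary (rdf:subject, rdf:predicate, rdf:object) follows. The class is placed in a made-up `ex:` namespace precisely because it is not defined in TCS, and making it a subclass of rdf:Statement, as well as attaching tcs:accordingTo to the mapping, are assumptions for illustration rather than anything agreed in this thread.

```turtle
@prefix tcs:  <http://rs.tdwg.org/tcs/terms/> .
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <http://example.org/> .

# Hypothetical class, not part of TCS. Subclassing rdf:Statement is an
# assumption that lets it reuse the reification properties unchanged.
ex:TaxonConceptMapping a rdfs:Class ;
    rdfs:subClassOf rdf:Statement .

# One mapping: "ex:TNU1 is included in ex:TNU2", according to ex:Reference1.
# All ex: identifiers are made up for this example.
ex:mapping1 a ex:TaxonConceptMapping ;
    rdf:subject     ex:TNU1 ;
    rdf:predicate   tcs:isIncludedIn ;
    rdf:object      ex:TNU2 ;
    tcs:accordingTo ex:Reference1 .
```

Note that a reified statement does not by itself assert the triple `ex:TNU1 tcs:isIncludedIn ex:TNU2`; a consumer that wants the direct statement would have to derive it from the mapping.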

@tdwg tdwg locked and limited conversation to collaborators Jan 11, 2024
@nielsklazenga nielsklazenga converted this issue into discussion #227 Jan 11, 2024

