Do we need a new property dcat:seriesMember defined as dcat:inSeries's inverse? #1335

Closed
riccardoAlbertoni opened this issue Mar 22, 2021 · 7 comments · Fixed by #1461

Comments

@riccardoAlbertoni
Contributor

Do we need a new property (let's name it dcat:seriesMember) defined as dcat:inSeries's inverse?

Requests for dcat:inSeries's inverse to connect the dataset series to its children have been discussed in issue #1307.

Part of the previous discussion is reported below.

Originally posted by @riannella in #1307 (comment)

In our sector, for example, a Regulator would say the more important relationship was from the series to the parts.
Let's have both.

Most of the concerns about adding this property center on keeping the metadata consistent when a dataset is added to the series.

Originally posted by @jakubklimek in #1307 (comment)

@dr-shorthair @riannella From the point of view of an application developer, let me again state that I would strongly prefer to standardize only one of the two mutually inverse properties.
Otherwise, compliant applications and queries need to always account for the possibility that in one data catalog, the datasets are connected to the series using one property, in another using the second, which makes them more complex with each inverse property pair.

Originally posted by @agreiner in #1307 (comment)

I would suggest keeping only the inSeries property so that one needn't worry about updating metadata for the series every time a new dataset is added to that series.

Keeping the metadata of the series consistent while adding new datasets seems to be required anyway by the assumption of "upstream inheritance" discussed in #1273.

Originally posted by @riccardoAlbertoni in #1307 (comment)

I suspect the metadata of the dataset series needs to be updated anyway. Especially if we keep the upstream inheritance explained in https://raw.githack.com/w3c/dxwg/dcat-dataseries-issue1272/dcat/index.html#dataset-series where it is said: "The update date (dcterms:modified) of the dataset series should correspond to the latest publication or update date of the child datasets."
Also, we are having a parallel discussion on whether to support inverse properties with the "lightweight approach" adopted by PROV-O (https://www.w3.org/TR/prov-o/#inverse-names) (we discussed this in tonight's call, see meeting minutes).
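
The "upstream inheritance" rule quoted above can be illustrated with a minimal sketch in Python. It assumes each child dataset carries an issued date and an optional modified date; the function name and the sample dates are hypothetical and only serve to show that the series' dcterms:modified has to be recomputed whenever a child is added or updated.

```python
from datetime import date

def series_modified(children):
    """Return the latest issued-or-modified date across the child datasets.

    `children` is an iterable of (issued, modified) pairs; either value
    may be None. Returns None when no date is available at all.
    """
    dates = [d for child in children for d in child if d is not None]
    return max(dates, default=None)

# Hypothetical children of a series: (issued, modified)
children = [
    (date(2021, 1, 15), None),
    (date(2021, 2, 15), date(2021, 3, 1)),
    (date(2021, 3, 15), None),
]
latest = series_modified(children)  # value to publish as the series' dcterms:modified
```

So the series description is touched on every addition regardless of which direction the membership link is asserted in.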

@riannella

Yes.
(Also, it is important to note that it is not mandatory to "maintain" both properties.)

@agreiner
Contributor

The problem for developers with having both properties is that one doesn't know whether the creator of a dataset's metadata chose to list all the datasets in the series in the metadata for the series itself, or to assert membership in the series in the metadata for each dataset. Querying for those relationships would become treacherous, because one would have to detect both techniques and also deal with the more common case of neither being used. In this case, the flexibility in maintenance creates difficulty for consumers.
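
A minimal sketch of what this means for a consumer, with triples modelled as plain Python tuples rather than a real RDF library. The identifiers (s88, m1, m2) and the function name are hypothetical, and dcat:seriesMember is only the name proposed in this issue.

```python
IN_SERIES = "dcat:inSeries"
SERIES_MEMBER = "dcat:seriesMember"  # proposed name only (assumption)

def members_of(triples, series):
    """Collect members of `series`, whichever direction the publisher used."""
    members = set()
    for s, p, o in triples:
        if p == IN_SERIES and o == series:
            members.add(s)   # membership asserted on the dataset
        elif p == SERIES_MEMBER and s == series:
            members.add(o)   # membership asserted on the series
    return members

# Two hypothetical catalogs describing the same series in opposite directions
catalog_a = {("m1", IN_SERIES, "s88"), ("m2", IN_SERIES, "s88")}
catalog_b = {("s88", SERIES_MEMBER, "m1"), ("s88", SERIES_MEMBER, "m2")}

same = members_of(catalog_a, "s88") == members_of(catalog_b, "s88")
```

Every consumer has to carry both branches; with a single standardized property only one of them would be needed.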

@dr-shorthair
Contributor

I tend to agree with @agreiner.

@riannella wants to keep track of members from the context of the series. But this is easily done with just the one predicate. When he adds a new member <m99> to a series <s88>, he adds this triple to the graph:

<m99> dcat:inSeries <s88> . 

Then he can easily find all the members of the series with a simple query in that direction, e.g. in SPARQL:

SELECT ?s ?m 
WHERE { ?m dcat:inSeries ?s }

Doesn't that satisfy the requirement, @riannella?

@riannella

riannella commented Mar 24, 2021

@dr-shorthair That is a possible solution, but from this:

This document does not prescribe any particular method of deploying data catalogs expressed in DCAT. DCAT information can be presented in many forms including RDF accessible via SPARQL endpoints, embedded in HTML pages as [HTML-RDFa], or serialized as RDF/XML [RDF-SYNTAX-GRAMMAR], [N3], [Turtle], [JSON-LD] or other formats.

it cannot be the solution.

@agreiner If someone has a Series and links it to 10 datasets, and only 9 of these datasets link back to the Series (for whatever reason), how does this create difficulty (from a specification point of view)?

@dr-shorthair
Contributor

Which aspect of that statement are you interested in here, @riannella? Is it just the different serializations?
That is not important - I used TTL because it is easier to type into a GitHub issue. No problem if you prefer one of the other serializations.

The point I was trying to make is that if dcat:seriesMember owl:inverseOf dcat:inSeries . then

<m99> dcat:inSeries <s88> . 

provides exactly the same information as

<s88> dcat:seriesMember <m99> . 

and it is just as easy to add one of these to the graph as the other. And SPARQL works both ways just as easily. So we might as well standardize on just one direction, so that we avoid the inconsistency situation that you describe altogether. (I think this is what @agreiner and @jakubklimek are saying too.)
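
The equivalence can be sketched in a few lines of Python, again with triples as plain tuples rather than a real RDF library. The helper name is hypothetical, and dcat:seriesMember is only the proposed name; the point is that either set of triples can be derived mechanically from the other.

```python
def invert(triples, prop, inverse_prop):
    """Rewrite every (s, prop, o) triple as (o, inverse_prop, s)."""
    return {(o, inverse_prop, s) for s, p, o in triples if p == prop}

member_side = {("m99", "dcat:inSeries", "s88")}
series_side = {("s88", "dcat:seriesMember", "m99")}  # proposed property (assumption)

# Deriving one direction from the other yields exactly the same statements
derived_series_side = invert(member_side, "dcat:inSeries", "dcat:seriesMember")
derived_member_side = invert(series_side, "dcat:seriesMember", "dcat:inSeries")
```

Since nothing is gained by storing both directions, a publisher who stores only one cannot end up with the 10-versus-9 mismatch described earlier.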

@riannella

riannella commented Mar 24, 2021

The SPARQL bit. (I could describe my DCAT using RDFa in HTML files.)

I am slightly confused now. You say we have both properties (dcat:seriesMember owl:inverseOf dcat:inSeries .), which is great. Then you say only one assertion needs to be added to the graph (assuming every implementation uses a graph). Let's keep going: then you say that, because we only need one assertion (and infer the other), we only need one property. But you started off with two properties to make the point that you only need one... see, I am confused :-)

@dr-shorthair
Contributor

I am slightly confused now...you say we have both properties (dcat:seriesMember owl:inverseOf dcat:inSeries .)

I'm saying that if we notionally define the inverse pair like that, then we discover that we don't actually need both of them to represent the same knowledge. So it is best to include only one, and explain to people who think they need the inverse why they actually don't.

If you are not implementing a graph or triple-store, then you must be mapping to RDF on the interface, in which case the same argument still applies. Nested Turtle or RDF/XML or any other serialization is just syntax - it still implies a set of triples. And in that case it doesn't really matter which direction you implement.

I am a relatively recent convert to this POV, because I wasn't thinking clearly before :-)

5 participants