RDF-star for recording commit deltas #13

Open
pfps opened this issue May 31, 2023 · 12 comments
Labels
use case Issue to record discussion on a use case

Comments

@pfps
Contributor

pfps commented May 31, 2023

See https://github.com/w3c/rdf-ucr/wiki/RDF-star-for-recording-commit-deltas-to-an-RDF-graph for the current status of this use case.

Taken from https://w3c.github.io/rdf-star/UCR/rdf-star-ucr.html#annotate-commit-deltas

I want to annotate commit deltas to an RDF graph, e.g.:

r:47e1cf2 a :Commit ; 
     :delete <<:bob :age 23>> ;
     :add <<:bob :age 24>>, <<:bob :gender :male>> .

So that a triple can be searched for across commit history in SPARQL-star.
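As a rough illustration of the lookup this use case asks for, the following Python sketch models commits as plain records with `add` and `delete` sets of triples and scans the history for a given triple. The record layout and the function name `commits_touching` are assumptions made for illustration only; they stand in for an RDF-star store queried with SPARQL-star, neither of which is shown here.

```python
from typing import NamedTuple

# A triple is modeled as a (subject, predicate, object) tuple of strings.
Triple = tuple[str, str, str]

class Commit(NamedTuple):
    id: str
    add: frozenset[Triple]
    delete: frozenset[Triple]

def commits_touching(history, triple):
    """Return (commit id, action) pairs for every commit that adds or deletes the triple."""
    hits = []
    for commit in history:
        if triple in commit.delete:
            hits.append((commit.id, "delete"))
        if triple in commit.add:
            hits.append((commit.id, "add"))
    return hits

# The commit from the example above, written as a hypothetical in-memory record.
history = [
    Commit(
        "r:47e1cf2",
        add=frozenset({(":bob", ":age", "24"), (":bob", ":gender", ":male")}),
        delete=frozenset({(":bob", ":age", "23")}),
    ),
]

print(commits_touching(history, (":bob", ":age", "23")))
```

Calling `commits_touching` with a triple that never appears in any delta returns an empty list.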

@pfps added the "use case" label (Issue to record discussion on a use case) on May 31, 2023
@TallTed

This comment was marked as resolved.

@rat10

rat10 commented Jun 8, 2023

The second block of the second example

r:47a54ad a :Commit ; 
     :graph r:geneology;
     :time "2002-06-07T09:00:00"^^xsd:dateTime;
     :add <<a:bob b:gender b:male>> b:certainty 0.1 >>.

lacks an opening << at the beginning of the quoted triple under :add.

The labeling of namespaces and bnodes in the third block is confusing and makes the example harder to read. How about the following:

r:47a54ae a :Commit ; 
     :graph r:geneology;
     :time "2002-06-07T09:00:01"^^xsd:dateTime;
     :add << <<a:bob a:gender a:male>> b:support _:x >> ,
          <<_:x b:source b:news-of-the-world >> ,
          <<_:x b:date "1999-04-01"^^xsd:date >> ,
          << <<a:bob a:gender a:male>> b:support _:y >> ,
          <<_:y b:source b:weekly-world-news >> ,
          <<_:y b:date "2001-08-09"^^xsd:date >> .

@rat10

rat10 commented Jun 8, 2023

The wiki page section "Representing the use case without quoted triples" compares quoted triples to standard reification. Named graphs would be another candidate; shouldn't they be included in the comparison as well? Following hints from the RDF 1.1 WG Note "On Semantics of RDF Datasets", a property could be defined that gives such commit delta graphs a semantics tailored exactly to their needs, e.g. by making use of the semantics specified in the RDF-star CG report.

@pfps
Contributor Author

pfps commented Jun 8, 2023

@TallTed Done, thanks.

@pfps
Contributor Author

pfps commented Jun 8, 2023

@rat10 Done, thanks.

@pfps
Contributor Author

pfps commented Jun 8, 2023

@rat10 Using named graphs might be possible. But how could the semantics of named graphs be changed to make blank node identity important when querying?

@pfps
Contributor Author

pfps commented Jun 8, 2023

@rat10 I added some discussion of named graphs in the wiki page but it doesn't discuss semantics, except to say that it is not so important because of the presence of named graphs.

@rat10

rat10 commented Jun 9, 2023

> @rat10 Using named graphs might be possible. But how could the semantics of named graphs be changed to make blank node identity important when querying?

I'm not sure I understand your question correctly, but maybe I do. Then my answer would be:

  • it's complicated because blank nodes are
  • any good answer will have to resort to some best practice w.r.t. blank node usage, e.g. don't use them for things that have some identifying 'character' and importance, use them only for structure (lists, n-ary relations etc)
  • based on such good practice, work with Concise Bounded Descriptions (CBD). The meaning of a blank node is defined by all statements in which it occurs. Only in special circumstances, guaranteed by out-of-band means, will it be possible to faithfully capture the meaning of a blank node without documenting the whole CBD (e.g. Dydra's graph store provides such means in a very straightforward manner, but out-of-band).
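The CBD idea in the last bullet can be sketched in a few lines of Python. This is a simplified illustration, not the full Concise Bounded Description definition: it follows blank-node objects recursively but leaves out the reification handling that the CBD definition also includes. Triples are modeled as string tuples, and the node `ex:claim` and the helper names are hypothetical.

```python
def is_blank(term):
    """Blank nodes are modeled as strings with the '_:' prefix."""
    return term.startswith("_:")

def concise_bounded_description(graph, start):
    """Collect triples whose subject is `start`, following blank-node objects recursively."""
    description = set()
    pending = [start]
    visited = set()
    while pending:
        node = pending.pop()
        if node in visited:
            continue
        visited.add(node)
        for s, p, o in graph:
            if s == node:
                description.add((s, p, o))
                # A blank node has no identity of its own, so its statements are pulled in too.
                if is_blank(o):
                    pending.append(o)
    return description

graph = {
    ("ex:claim", "b:support", "_:x"),
    ("_:x", "b:source", "b:news-of-the-world"),
    ("_:x", "b:date", "1999-04-01"),
    ("ex:other", "b:support", "_:y"),
}

cbd = concise_bounded_description(graph, "ex:claim")
print(sorted(cbd))
```

The description of `ex:claim` includes both statements about `_:x` but nothing about `ex:other` or `_:y`.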

Or maybe punt on the issue and just provide a means to declare whether UNION or MERGE semantics are to be applied to the graph, analogous to other features like referential opacity etc.?
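To make the UNION/MERGE distinction concrete, here is a minimal Python sketch under the same tuple-based modeling assumption. Under UNION, a blank node label shared by two graphs denotes one node; under MERGE, blank nodes are standardized apart first. The suffix-based renaming is a simplification that assumes the renamed labels do not already occur in either graph.

```python
def union(g1, g2):
    """UNION: blank node labels are shared, so '_:x' in both graphs is one node."""
    return set(g1) | set(g2)

def rename_blank_nodes(graph, suffix):
    """Give every blank node in `graph` a label specific to that graph."""
    def rename(term):
        return f"{term}_{suffix}" if term.startswith("_:") else term
    return {(rename(s), p, rename(o)) for s, p, o in graph}

def merge(g1, g2):
    """MERGE: blank nodes are standardized apart before the graphs are combined."""
    return rename_blank_nodes(g1, "g1") | rename_blank_nodes(g2, "g2")

def blank_nodes(graph):
    """Return the set of blank node labels used in subject or object position."""
    return {t for s, _, o in graph for t in (s, o) if t.startswith("_:")}

g1 = {("_:x", "b:source", "b:news-of-the-world")}
g2 = {("_:x", "b:date", "1999-04-01")}

print(blank_nodes(union(g1, g2)))
print(blank_nodes(merge(g1, g2)))
```

With UNION the source and the date end up attached to the same node; with MERGE they describe two unrelated nodes, which is why the choice matters for commit deltas that mention blank nodes.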

@pfps
Contributor Author

pfps commented Jun 16, 2023

@rat10 The latter, or maybe even just basing the UNION/MERGE decision on the source of the blank nodes.

@JervenBolleman

How does the demand for stable blank node identifiers work when dealing with data in multiple documents? And why should this identity be preserved? I am missing the "why" for this in the wiki write-up (as it is at the time of commenting).

This seems like an auto-skolemization add-on feature to deal with blank node issues. I think that the core use case can be supported without this.

@marcelotto

I'm working on an RDF-star-based VCS for RDF, where I'm faced with this use case. For the deltas, I'm actually not linking to the triples directly, but to RDF Triple Compounds instead, which is why I'm also dealing with the use case proposed in #26.

The blank node identity problem is at least partially abstracted away by identifying the compounds with hash IDs computed over the RDF Dataset Canonicalization form of their subgraphs. This still leaves the problem of finding the correct compound when different blank node identifiers are used, but at least it can be detected whether the same changes with different blank nodes were already applied.
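The effect of hashing a canonicalized subgraph can be illustrated with a small Python sketch. Note that this is not the RDF Dataset Canonicalization algorithm and not the approach of the VCS mentioned above: it is a brute-force stand-in that tries every relabeling of the blank nodes and keeps the smallest result, which is only practical for subgraphs with a handful of blank nodes. The function names `canonical_form` and `compound_id` are hypothetical.

```python
import hashlib
from itertools import permutations

def is_blank(term):
    return term.startswith("_:")

def canonical_form(graph):
    """Smallest sorted triple list over all relabelings of the blank nodes."""
    blanks = sorted({t for s, _, o in graph for t in (s, o) if is_blank(t)})
    names = [f"_:c{i}" for i in range(len(blanks))]
    candidates = []
    for perm in permutations(names):
        mapping = dict(zip(blanks, perm))
        relabeled = sorted(
            tuple(mapping.get(term, term) for term in triple) for triple in graph
        )
        candidates.append(relabeled)
    return min(candidates)

def compound_id(graph):
    """Hash of the canonical form; equal for graphs that differ only in blank node labels."""
    return hashlib.sha256(repr(canonical_form(graph)).encode("utf-8")).hexdigest()

g_a = {
    ("_:x", "b:source", "b:news-of-the-world"),
    ("_:y", "b:source", "b:weekly-world-news"),
}
g_b = {
    ("_:q", "b:source", "b:news-of-the-world"),
    ("_:p", "b:source", "b:weekly-world-news"),
}

print(compound_id(g_a) == compound_id(g_b))
```

Because the two graphs differ only in their blank node labels, they receive the same ID, which is the kind of check that lets a system notice that the same change was already applied under different labels.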

@marcelotto

The VCS for RDF mentioned above has now been released. See https://ontogen.io/ for more.


5 participants