The Member extraction algorithm #71

Closed
5 tasks done
pietercolpaert opened this issue Apr 29, 2023 · 7 comments
@pietercolpaert
Member

pietercolpaert commented Apr 29, 2023

This rather large issue proposes to:

  1. clearly define the tree:Member class,
  2. clarify the explanation of tree:member: it refers to a topic, not to a tree:Member,
  3. define the member extraction algorithm as part of the spec,
  4. clarify the triggers for an HTTP request,
  5. introduce named graph support.

Related: SEMICeu/LinkedDataEventStreams#37

Pull request: #78

Follow the discussions and presentations on the mailing list: https://www.w3.org/community/treecg/

@pietercolpaert pietercolpaert changed the title tree:member points to the primary topic of a member, not to the member itself The Member extraction algorithm May 8, 2023
@xdxxxdx
Contributor

xdxxxdx commented May 9, 2023

Hello @pietercolpaert,
Member dereferencing
Please explain why the members need to be dereferenced here.
Thanks

@pietercolpaert
Member Author

Hello @pietercolpaert, Member dereferencing: Please explain why the members need to be dereferenced here. Thanks

Take for example the case of Marine Regions: https://marineregions.org/feed

In their implementation, they only provide a list of members and when each of them changed. If you want the contents of the members, you need to dereference them. I’d like to add a property that indicates to the client that one extra HTTP request per member will be needed.

@pietercolpaert pietercolpaert self-assigned this May 25, 2023
@pietercolpaert
Member Author

pietercolpaert commented May 25, 2023

After the W3C TREE CG meeting of 2023-05-24:

  1. A tree:Member is a set of triples. This member is contained in a collection. The set of triples that are part of the member is defined by the member extraction algorithm.
  2. The explanation of tree:member was already quite okay; the spec just needs some editing work. tree:member refers to the primary topic, except in the case of point 5.
  3. The member extraction algorithm - see below
  4. Some discussions arose on this one and this will be continued in the call of 2023-06-07
    • Instead of just a dereferenceMember boolean flag, we could also think about a dereferencePath that indicates a property path to a named node that needs to be dereferenced if you want to get to a complete member. The dereferenceMember flag could then be realized with an empty list as the object of tree:dereferencePath.
    • @bergos commented: Is this needed at all? Can’t we use, on the one hand, the ideas behind CBD and, on the other hand, the SHACL shape to understand whether the triple set is complete?
    • @pietercolpaert’s reply to this: I see some problems with that approach when properties are optional, but I am preparing an example of this for the next call so we can continue this discussion.
  5. Instead of a boolean property, we are opting for typing and introducing a class: tree:NamedGraphCollection that adds explicit semantics to the named graph wrt the member extraction algorithm.
  6. To be discussed on a next call
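
The dereferencePath idea from point 4 could be sketched roughly as follows. This is only an illustration under assumptions, not something the spec defines: the path is modelled as a plain array of predicate IRIs (a sequence path), quads are plain objects shaped like RDF/JS quads, and `nodesToDereference` and the `ex:details` predicate are hypothetical names. With an empty path the member itself is returned, which corresponds to the dereferenceMember flag.

```javascript
// Hypothetical helper: find the named nodes a client would have to dereference
// to complete a member, given a sequence path of predicate IRIs.
function nodesToDereference(quads, memberIri, path) {
  // Start from the member itself, so an empty path returns [memberIri].
  let frontier = [{ termType: 'NamedNode', value: memberIri }];
  for (const predicate of path) {
    const values = new Set(frontier.map((term) => term.value));
    frontier = quads
      .filter((q) => values.has(q.subject.value) && q.predicate.value === predicate)
      .map((q) => q.object);
  }
  // Only named nodes can be dereferenced over HTTP.
  const iris = frontier
    .filter((term) => term.termType === 'NamedNode')
    .map((term) => term.value);
  return [...new Set(iris)];
}

// Example data (assumed): the member points to a separate details document.
const nn = (value) => ({ termType: 'NamedNode', value });
const quads = [
  {
    subject: nn('http://example.org/m1'),
    predicate: nn('http://example.org/details'),
    object: nn('http://example.org/m1/details'),
  },
];

const emptyPath = nodesToDereference(quads, 'http://example.org/m1', []);
const onePath = nodesToDereference(quads, 'http://example.org/m1', [
  'http://example.org/details',
]);
console.log(emptyPath, onePath);
```

A client would then issue one HTTP request per returned IRI before running the member extraction on the combined triples.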

The base member extraction algorithm

Find all triples with the member URI as the subject, then repeat this for every named node and blank node found in the object position, skipping subjects that have already been processed and skipping other members of the collection.

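// `triples` is assumed to be an array of RDF/JS quads parsed from the current page.
const subjects = getMemberUris(triples);
const processedSubjects = new Set(); // passed to every call, as in the original sketch
const members = [];
for (const s of subjects) {
  members.push(extractMember(triples, s, processedSubjects, subjects));
}

// Assumption: the member URIs are the objects of the tree:member triples.
function getMemberUris(T) {
  return new Set(
    T.filter((t) => t.predicate.value === 'https://w3id.org/tree#member')
      .map((t) => t.object.value)
  );
}

// Recursive function
function extractMember(T, s, processedSubjects, subjects) {
  processedSubjects.add(s); // This prevents cycles
  let member = [];
  for (const t of T) {
    if (t.subject.value === s) {
      member.push(t);
      if (t.object.termType !== 'Literal' &&
          !processedSubjects.has(t.object.value) &&
          !subjects.has(t.object.value)) {
        // concat returns a new array, so the result has to be assigned
        member = member.concat(extractMember(T, t.object.value, processedSubjects, subjects));
      }
    }
  }
  return member;
}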
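
The traversal described above can also be written iteratively over a subject index, which avoids rescanning all triples for every node. The following is a minimal sketch under assumptions: quads are plain objects shaped like RDF/JS quads, `indexBySubject` and `extractMemberIndexed` are hypothetical names, the example IRIs are made up, and the set of processed subjects is kept per member, which is a choice of this sketch rather than something decided in this issue.

```javascript
// Group quads by subject value so each lookup is a Map access instead of a full scan.
function indexBySubject(quads) {
  const index = new Map();
  for (const q of quads) {
    if (!index.has(q.subject.value)) index.set(q.subject.value, []);
    index.get(q.subject.value).push(q);
  }
  return index;
}

// Collect the quads of one member, following named and blank node objects,
// but never traversing into other members of the collection.
function extractMemberIndexed(index, start, memberIris) {
  const processed = new Set([start]); // prevents cycles
  const stack = [start];
  const member = [];
  while (stack.length > 0) {
    const s = stack.pop();
    for (const q of index.get(s) || []) {
      member.push(q);
      const o = q.object;
      if (o.termType !== 'Literal' && !processed.has(o.value) && !memberIris.has(o.value)) {
        processed.add(o.value);
        stack.push(o.value);
      }
    }
  }
  return member;
}

// Example data (assumed): two members, m1 links to an author and to m2.
const EX = 'http://example.org/';
const nn = (value) => ({ termType: 'NamedNode', value });
const lit = (value) => ({ termType: 'Literal', value });
const quads = [
  { subject: nn(EX + 'm1'), predicate: nn(EX + 'author'), object: nn(EX + 'a1') },
  { subject: nn(EX + 'm1'), predicate: nn(EX + 'related'), object: nn(EX + 'm2') },
  { subject: nn(EX + 'a1'), predicate: nn(EX + 'name'), object: lit('Alice') },
  { subject: nn(EX + 'm2'), predicate: nn(EX + 'title'), object: lit('Second member') },
];

const memberIris = new Set([EX + 'm1', EX + 'm2']);
const member1 = extractMemberIndexed(indexBySubject(quads), EX + 'm1', memberIris);
console.log(member1.length); // 3
```

The link from m1 to m2 is kept as a triple of m1, but the triples that describe m2 are left to m2's own extraction.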

pietercolpaert added a commit that referenced this issue Jun 2, 2023
@bergos
Contributor

bergos commented Jun 5, 2023

I created an example where CBD and the SHACL shape would extract the same triples. Below is the data, but you can also play with it on the SHACL Playground.

Data

@prefix ex: <http://example.org/>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.

ex:resource1
  ex:property1 [
    ex:property2 ex:resource2
  ].

ex:resource2
  ex:property3 "test".

Shape

@prefix ex: <http://example.org/>.
@prefix sh: <http://www.w3.org/ns/shacl#>.

ex:resource1Shape a sh:NodeShape;
  sh:name "resource 1 shape";
  sh:targetNode ex:resource1;
  sh:property [
    sh:name "property 1";
    sh:path ex:property1;
    sh:node ex:property2Shape
  ].

ex:property2Shape a sh:NodeShape;
  sh:name "property 2 shape";
  sh:property [
    sh:name "property 2";
    sh:path ex:property2
  ].

Extract

<http://example.org/resource1>
  <http://example.org/property1> [
      <http://example.org/property2> <http://example.org/resource2>
    ].

CBD

CBD stops after the triple with ex:resource2 as an object because named node objects are not traversed.
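
As a rough illustration of that stopping rule, a simplified CBD can be sketched as below. Assumptions: quads are plain objects shaped like RDF/JS quads, `conciseBoundedDescription` is a hypothetical name, the blank node label `b0` is made up, and the reification part of CBD is left out.

```javascript
// Simplified CBD: all statements about the start node plus, recursively,
// all statements about blank nodes that occur as objects. Reification is ignored.
function conciseBoundedDescription(quads, start) {
  const result = [];
  const visited = new Set([start]);
  const queue = [start];
  while (queue.length > 0) {
    const s = queue.shift();
    for (const q of quads) {
      if (q.subject.value !== s) continue;
      result.push(q);
      // Only blank node objects are traversed; named nodes end the description.
      if (q.object.termType === 'BlankNode' && !visited.has(q.object.value)) {
        visited.add(q.object.value);
        queue.push(q.object.value);
      }
    }
  }
  return result;
}

// The data from this comment, written as plain quad objects.
const EX = 'http://example.org/';
const nn = (value) => ({ termType: 'NamedNode', value });
const bn = (value) => ({ termType: 'BlankNode', value });
const lit = (value) => ({ termType: 'Literal', value });
const quads = [
  { subject: nn(EX + 'resource1'), predicate: nn(EX + 'property1'), object: bn('b0') },
  { subject: bn('b0'), predicate: nn(EX + 'property2'), object: nn(EX + 'resource2') },
  { subject: nn(EX + 'resource2'), predicate: nn(EX + 'property3'), object: lit('test') },
];

const cbd = conciseBoundedDescription(quads, EX + 'resource1');
console.log(cbd.length); // 2
```

The triple about ex:resource2 is not part of the result, which matches the extract shown above.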

Shape

The SHACL approach requires understanding sh:property, sh:path, and sh:node. Supporting constraints as well would make things more complicated, so I think constraints should be explicitly excluded from the logic.
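
A shape-guided extraction that only looks at sh:property, sh:path, and sh:node could be sketched as follows. Assumptions: the shape is modelled as a plain object (`{ properties: [{ path, node }] }`) instead of a parsed SHACL graph, only single-predicate paths are handled, `extractWithShape` is a hypothetical name, and the blank node label `b0` is made up. Constraints are ignored entirely.

```javascript
// Follow only the predicates listed in the shape; recurse when a property
// refers to a nested node shape (the sh:node case).
function extractWithShape(quads, start, shape, seen = new Set()) {
  if (seen.has(start)) return []; // guard against cycles in the data
  seen.add(start);
  let result = [];
  for (const property of shape.properties) {
    for (const q of quads) {
      if (q.subject.value !== start || q.predicate.value !== property.path) continue;
      result.push(q);
      if (property.node && q.object.termType !== 'Literal') {
        result = result.concat(extractWithShape(quads, q.object.value, property.node, seen));
      }
    }
  }
  return result;
}

// The data and shapes from this comment, written as plain objects.
const EX = 'http://example.org/';
const nn = (value) => ({ termType: 'NamedNode', value });
const bn = (value) => ({ termType: 'BlankNode', value });
const lit = (value) => ({ termType: 'Literal', value });
const quads = [
  { subject: nn(EX + 'resource1'), predicate: nn(EX + 'property1'), object: bn('b0') },
  { subject: bn('b0'), predicate: nn(EX + 'property2'), object: nn(EX + 'resource2') },
  { subject: nn(EX + 'resource2'), predicate: nn(EX + 'property3'), object: lit('test') },
];
const property2Shape = { properties: [{ path: EX + 'property2' }] };
const resource1Shape = { properties: [{ path: EX + 'property1', node: property2Shape }] };

const extract = extractWithShape(quads, EX + 'resource1', resource1Shape);
console.log(extract.length); // 2
```

For this data the result contains the same two triples as the CBD extract, because the shape does not describe ex:property3.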

@pietercolpaert
Member Author

From the TREE CG Call:

Still to be motivated: if we choose CBD by default, what are the use cases for any specializations?

@pietercolpaert
Member Author

Over July and August, it became clear that this is the way forward on this issue. I will adapt the description a bit.

@pietercolpaert
Member Author

This has been published in the latest version.
