Relation between fields and collections #3
Comments
@thomas-delva could you add the same examples as the ones used for collections, but with fields? That will make them easier to compare. In any case, I think we don't want to end up with a single solution; we want both to offer the same coverage.
Below is a fields version of the examples in the gather map slides, for easier comparison.

Simple example

Data:
{ "a": "1",
  "b": [ "1", "2", "3" ] }

Logical source + fields:
<LS> a rml:LogicalSource ;
  rml:iterator "$" ;
  rml:field [
    rml:name "a_field" ;
    rml:reference "$.a" ] ;
  rml:field [
    rml:name "b_field" ;
    rml:reference "$.b.*" ] .

Intermediate representation:

Object map:
... rr:objectMap [
  rml:gather ( [ rml:reference "b_field" ] ) ;
  rml:gatherAs rdf:List ;
  rml:gatherBy "it"  # can be implicit: one level higher than b_field
                     # "it" refers to the iterator, i.e., the "field" one level above b_field
] .

Output:
... ( "1" "2" "3" )

Relational databases

Input == intermediate representation:
Logical source + fields:
<LS> a rml:LogicalSource ;
  rml:field [
    rml:name "bookid_field" ;
    rml:reference "BOOKID" ] ;
  rml:field [
    rml:name "id_field" ;
    rml:reference "ID" ] .

Triples map:
<TM> a rr:TriplesMap ;
  rml:logicalSource <LS> ;
  rr:subjectMap [ rr:template "http://ex.com/book{bookid_field}" ] ;
  rr:predicateObjectMap [
    rr:predicate :writtenBy ;
    rr:objectMap [
      rml:gather ( [ rr:template "http://ex.com/author{id_field}" ] ) ;
      rml:gatherAs rdf:List ;
      rml:gatherBy "bookid_field"
    ] ] .

Output:
:book1 :writtenBy ( :author1 ) .
:book2 :writtenBy ( :author2 :author3 ) .

Nested iteration

Here, fields can be declared once and then used to generate collections from different iteration levels (compare the two predicate-object maps).

Data:
{ "id": "id",
  "a": [ [ "1", "2", "3" ],
         [ "4", "5", "6" ] ] }

Logical source + fields:
<LS> a rml:LogicalSource ;
  rml:iterator "$" ;
  rml:field [
    rml:name "id_field" ;
    rml:reference "$.id" ] ;
  rml:field [
    rml:name "a_outer_field" ;
    rml:reference "$.a.*" ;
    rml:field [
      rml:name "a_inner_field" ;
      rml:reference "$.*" ] ] .

Intermediate representation:

Triples map:
<TM> a rr:TriplesMap ;
  rml:logicalSource <LS> ;
  rr:subjectMap [ rr:template "http://ex.com/{id_field}" ] ;
  rr:predicateObjectMap [
    rr:predicate :a_values_grouped ;
    rr:objectMap [
      rml:gather ( [ rml:reference "a_inner_field" ] ) ;
      rml:gatherAs rdf:List ;
      rml:gatherBy "a_outer_field"  # can be implicit: one level higher than a_inner_field
    ] ] ;
  rr:predicateObjectMap [
    rr:predicate :a_values_all ;
    rr:objectMap [
      rml:gather ( [ rml:reference "a_inner_field" ] ) ;
      rml:gatherAs rdf:List ;
      rml:gatherBy "it"  # "it" refers to the iterator
    ] ] .

Output:
:id :a_values_grouped ( "1" "2" "3" ), ( "4" "5" "6" ) ;
    :a_values_all ( "1" "2" "3" "4" "5" "6" ) .

Nested gather maps

Data (same as previous):
{ "id": "id",
  "a": [ [ "1", "2", "3" ],
         [ "4", "5", "6" ] ] }

Logical source + fields (same as previous):
<LS> a rml:LogicalSource ;
  rml:iterator "$" ;
  rml:field [
    rml:name "id_field" ;
    rml:reference "$.id" ] ;
  rml:field [
    rml:name "a_outer_field" ;
    rml:reference "$.a.*" ;
    rml:field [
      rml:name "a_inner_field" ;
      rml:reference "$.*" ] ] .

Intermediate representation (same as previous):

Object map:
... rr:objectMap [
  rr:termType rr:BlankNode ;
  rml:gather ( [
    rr:termType rr:BlankNode ;
    rml:gather ( [ rml:reference "a_inner_field" ] ) ;
    rml:gatherAs rdf:List ;
    rml:gatherBy "a_outer_field"  # can be implicit: one level higher than a_inner_field
  ] ) ;
  rml:gatherAs rdf:List ;
  rml:gatherBy "it"  # can be implicit: one level higher than a_outer_field
                     # "it" refers to the iterator, i.e., the "field" one level above a_outer_field
] ;

Output:
( ( "1" "2" "3" ) ( "4" "5" "6" ) )

Multiple term maps in gather map

Data:
{ "a": "1",
  "b": [ "1", "2", "3" ],
  "c": [ "4", "5", "6" ] }

Logical source + fields:
<LS> a rml:LogicalSource ;
  rml:iterator "$" ;
  rml:field [
    rml:name "a_field" ;
    rml:reference "$.a" ] ;
  rml:field [
    rml:name "b_field" ;
    rml:reference "$.b.*" ] ;
  rml:field [
    rml:name "c_field" ;
    rml:reference "$.c.*" ] .

Intermediate representation:

Object map:
... rr:objectMap [
  rml:gather ( [ rml:reference "b_field" ] [ rml:reference "c_field" ] ) ;
  rml:gatherAs rdf:List ;
  rml:gatherBy "it" ;  # can be implicit: one level higher than b_field
                       # "it" refers to the iterator, i.e., the "field" one level above b_field
  rml:strategy rml:Append ;  # default strategy
] .

Output:
... ( "1" "2" "3" "4" "5" "6" )
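
To make the effect of rml:gatherBy in the nested examples above easier to check by hand, here is a minimal Python sketch. It is not an RML processor, and the row layout is an assumption: the intermediate representation is modelled as one row per innermost value, each a_outer_field iteration is identified by its array index, and the single iteration of "$" is given the hypothetical key 0 under "it". The helper name gather is illustrative only.

```python
import json

# Data from the "Nested iteration" example in this thread.
data = json.loads('{"id": "id", "a": [["1", "2", "3"], ["4", "5", "6"]]}')

# Assumed intermediate representation: one row per innermost value.
# "it" identifies the iteration of "$" (there is only one, keyed 0 here),
# and "a_outer_field" is identified by its index in the outer array.
rows = [
    {"it": 0, "id_field": data["id"], "a_outer_field": outer_index, "a_inner_field": value}
    for outer_index, inner_values in enumerate(data["a"])
    for value in inner_values
]


def gather(rows, field, gather_by):
    """Collect the values of `field` into one list per distinct `gather_by` value.

    Source order is preserved, both between and within the lists.
    """
    groups = {}
    for row in rows:
        groups.setdefault(row[gather_by], []).append(row[field])
    return list(groups.values())


# gatherBy "a_outer_field": one list per outer array element.
grouped = gather(rows, "a_inner_field", "a_outer_field")
# gatherBy "it": a single list for the whole iteration.
all_values = gather(rows, "a_inner_field", "it")

print(grouped)
print(all_values)
```

Under these assumptions, grouped corresponds to the two lists of :a_values_grouped and all_values to the single list of :a_values_all.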
Fields and collections both deal with multi-valued source data, so they should be aligned.
Currently, Franck and Christophe define the gather map as a way to generate collections; we should see how it works if fields are used instead of references: https://docs.google.com/presentation/d/1QYSyuzvN4xO3mC6FTja2RLsZS2JCZLqmt53DXE4KyxM/
In the fields paper, a group-by approach was proposed to generate collections, where field values are grouped by equal values of other fields. We should probably see how this compares with the gathering approach:
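
As a rough sketch of what grouping by equal field values amounts to, the Python below groups the ID values of some rows by their BOOKID value. The rows are an assumption, chosen only to be consistent with the relational output shown in this thread (book1 with author1, book2 with author2 and author3); the helper names group_by and as_turtle are hypothetical and do not come from the fields paper or the gather map slides.

```python
from collections import defaultdict

# Assumed input rows, consistent with the relational example's output.
rows = [
    {"BOOKID": 1, "ID": 1},
    {"BOOKID": 2, "ID": 2},
    {"BOOKID": 2, "ID": 3},
]


def group_by(rows, key_field, value_field):
    """Group the values of `value_field` by equal values of `key_field`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_field]].append(row[value_field])
    return dict(groups)


def as_turtle(book, authors):
    """Render one group as a Turtle statement with an RDF collection."""
    members = " ".join(f":author{author}" for author in authors)
    return f":book{book} :writtenBy ( {members} ) ."


authors_by_book = group_by(rows, "BOOKID", "ID")
statements = [as_turtle(book, authors) for book, authors in authors_by_book.items()]

for statement in statements:
    print(statement)
```

In this flat case, grouping by equal BOOKID values yields the same lists as rml:gatherBy "bookid_field" in the relational example; whether the two approaches also coincide for nested fields is the open question here.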