From c06e7943d077fb0e1c95cbf8b26149851e2c81c1 Mon Sep 17 00:00:00 2001 From: David Chaves Date: Tue, 7 Oct 2025 11:59:32 +0200 Subject: [PATCH 01/14] adding base IRI --- spec/section/introduction.md | 25 --------------- spec/section/tooling.md | 59 ++++++++++++++++++++++++++++++++++-- 2 files changed, 56 insertions(+), 28 deletions(-) diff --git a/spec/section/introduction.md b/spec/section/introduction.md index c080b9e8..e69de29b 100644 --- a/spec/section/introduction.md +++ b/spec/section/introduction.md @@ -1,25 +0,0 @@ -# Base IRIs -The base IRI of the [=mapping document=] is used to resolve relative [=IRIs=] in the RML document following the specification of the Turtle serialisaiton. - -## Base IRI for mapping rules - -The [=base IRI=] of the [=Triples Map=] is used in resolving relative [=IRIs=] produced by the [=RML mapping=]. - - -
-# Triples Map that has a declared base IRI
-<#TriplesMap>
-    a rml:TriplesMap;
-    rml:baseIri  .
-
- -The [=base IRI=] MUST be a valid [=IRI=]. It SHOULD NOT contain question mark ("`?`") or hash ("`#`") characters and SHOULD end in a slash ("`/`") character. - -To obtain an absolute [=IRI=] from a relative [=IRI=], the term generation rules of RML use simple string concatenation, rather than the more complex algorithm for resolution of relative URIs defined in Section 5.2 of [RFC3986]. This ensures that the original database value can be reconstructed from the generated absolute [=IRI=]. Both algorithms are equivalent if all of the following are true: - - 1. The base IRI does not contain question marks or hashes, - 2. the base IRI ends in a slash, - 3. the relative [=IRI=] does not start with a slash, and - 4. the relative [=IRI=] does not contain any "`.`" or "`..`" path segments. - - diff --git a/spec/section/tooling.md b/spec/section/tooling.md index f4d4fff3..52348b81 100644 --- a/spec/section/tooling.md +++ b/spec/section/tooling.md @@ -16,14 +16,65 @@ or offer any other means of providing access to the [=output dataset=]. An [=RML processor=] also has access to an execution environment consisting of: * A [=logical source=] -* A base IRI used in resolving relative [=IRIs=] produced by the RML mapping. +* A [=base IRI=] How the [=logical source=] is accessed, or how users are authenticated against the database, is outside of the scope of this document. -The [=base IRI=] MUST be a valid [=IRI=]. -It SHOULD NOT contain question mark ("`?`") or hash ("`#`") characters and +#### Base IRI +A [=base IRI=] is used in resolving relative [=IRIs=] produced by the RML mapping. The [=base IRI=] MUST be +defined within the mapping document for each [=Triples Map=] or as execution environment for the [=mapping document=]. + + + + + +The [=base IRI=] MUST be a valid [=IRI=]. It SHOULD NOT contain question mark ("`?`") or hash ("`#`") characters and SHOULD end in a slash ("`/`") character. 
+ + ### RML Validator An RML data validator is a system that takes as its input From a21adb3fdbe042fa91c985aeb1e7d4e6ccb7c622 Mon Sep 17 00:00:00 2001 From: David Chaves Date: Tue, 7 Oct 2025 12:17:45 +0200 Subject: [PATCH 02/14] adding default base IRI --- spec/section/tooling.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/spec/section/tooling.md b/spec/section/tooling.md index 52348b81..14472026 100644 --- a/spec/section/tooling.md +++ b/spec/section/tooling.md @@ -91,7 +91,8 @@ Both algorithms are equivalent if all of the following are true: 4. the relative [=IRI=] does not contain any “.” or “..” path segments. - +If no specific [=base IRI=] is provided and the [=mapping process=] needs to generate absolute [=IRIs=] from relative ones, +the default base IRI to be used by the [=RML Processor=] MUST be http://example.org/. ### RML Validator From b93ef2cf41a6576045a7e8e1a855fb107d856453 Mon Sep 17 00:00:00 2001 From: David Chaves Date: Wed, 8 Oct 2025 12:37:49 +0200 Subject: [PATCH 03/14] URIs, UnsafeIR and UnsafeURI --- spec/dev.html | 4 +- spec/section/graphmap.md | 2 +- spec/section/output.md | 2 +- spec/section/overview.md | 2 +- .../{rdfTerminology.md => terminology.md} | 21 +++- spec/section/termmap.md | 114 ++++++++++++++---- spec/section/tooling.md | 5 +- spec/section/xsdTerminology.md | 6 - 8 files changed, 115 insertions(+), 41 deletions(-) rename spec/section/{rdfTerminology.md => terminology.md} (71%) delete mode 100644 spec/section/xsdTerminology.md diff --git a/spec/dev.html b/spec/dev.html index 4d58917b..b7fbeb49 100644 --- a/spec/dev.html +++ b/spec/dev.html @@ -194,9 +194,9 @@
-
+
-
+ diff --git a/spec/section/graphmap.md b/spec/section/graphmap.md index 1c43a94c..3c264c58 100644 --- a/spec/section/graphmap.md +++ b/spec/section/graphmap.md @@ -7,7 +7,7 @@ Any [=subject map=] or [=predicate-object map=] MUST have zero or more associate 1. using the `rml:graphMap` property, whose value MUST be a [=graph map=], 2. using the [=constant shortcut property=] `rml:graph`. -[=Graph maps=] are themselves [=term maps=]. When [=RDF triples are generated=], the set of target graphs is determined by taking into account any [=graph maps=] associated with the [=subject map=] or [=predicate-object map=]. +[=Graph maps=] are themselves [=term maps=]. When [=RDF triples=] are generated, the set of target graphs is determined by taking into account any [=graph maps=] associated with the [=subject map=] or [=predicate-object map=]. If a [=graph map=] generates the special IRI `rml:defaultGraph`, then the target graph is the [=default graph=] of the [=output dataset=]. diff --git a/spec/section/output.md b/spec/section/output.md index 1fd747d6..50cf9d63 100644 --- a/spec/section/output.md +++ b/spec/section/output.md @@ -1,6 +1,6 @@ # The Output Dataset -The output dataset of an [=RML mapping=] is an [=RDF dataset=] that contains the [=generated RDF triples=] for each of the [=triples maps=] of the [=RML mapping=]. The [=output dataset=] MUST NOT contain any other [=RDF triples=] or [=named graphs=] besides these. However, [=RML processors=] MAY provide access to datasets that contain additional triples or graphs beyond those in the [=output dataset=], such as inferred triples or provenance information. +The output dataset of an [=RML mapping=] is an [=RDF dataset=] that contains the generated [=RDF triples=] for each of the [=triples maps=] of the [=RML mapping=]. The [=output dataset=] MUST NOT contain any other [=RDF triples=] or [=named graphs=] besides these. 
However, [=RML processors=] MAY provide access to datasets that contain additional triples or graphs beyond those in the [=output dataset=], such as inferred triples or provenance information. Conforming [=RML processors=] MAY rename [=blank nodes=] when providing access to the [=output dataset=]. This means that client applications may see actual [=blank node identifiers=] that differ from those produced by the [=RML mapping=]. Client applications SHOULD NOT rely on the specific text of the blank node identifier for any purpose. diff --git a/spec/section/overview.md b/spec/section/overview.md index 3f46258e..44183567 100644 --- a/spec/section/overview.md +++ b/spec/section/overview.md @@ -72,7 +72,7 @@ Input source: album.json "Description": "A collection of stunning cityscape images.", "CreatedDate": "2023-10-01", "DateFormat": "date", - "Author": "John Doe", + "Author": "Zoë Krüger", "Images": [ { "ID": 116, diff --git a/spec/section/rdfTerminology.md b/spec/section/terminology.md similarity index 71% rename from spec/section/rdfTerminology.md rename to spec/section/terminology.md index eaa3e3d2..6b4028cf 100644 --- a/spec/section/rdfTerminology.md +++ b/spec/section/terminology.md @@ -1,4 +1,6 @@ -# RDF Terminology +# Terminology + +## RDF Terminology This section lists some terms normatively defined in [[RDF11-CONCEPTS]] and used in RML: @@ -24,3 +26,20 @@ This section lists some terms normatively defined in [[RDF11-CONCEPTS]] and used - property - resource - subject + + +# XML Schema Definition Language (XSD) Terminology + +This section lists some terms normatively defined in [[XMLSCHEMA11-2]] and used in RML: + +- XSD Datatype +- Canonical mapping + + +# Uniform Resource Identifier Terminology + +This section lists some terms normatively defined in [[RFC3986]] and used in RML: + +- URI +- relative URIs +- Percent-encode \ No newline at end of file diff --git a/spec/section/termmap.md b/spec/section/termmap.md index 7fa8dbf3..7e238334 100644 --- 
a/spec/section/termmap.md +++ b/spec/section/termmap.md @@ -1,6 +1,6 @@ # Term Maps -An RDF term is either an [=IRI=], or a [=blank node=], or a [=literal=]. +An RDF term is either an [=IRI=], a [=URI=], a [=blank node=], or a [=literal=]. A term map (`rml:TermMap`) is a rule that defines how to generate an [=RDF term=] from a [=logical iteration=]. The result of the execution of that rule is the generated RDF term. @@ -23,19 +23,21 @@ if it is a rule that specifies how the [=RDF triple=]'s [=named graph=] is gener A [=term map=] generates different types of [=RDF terms=] depending on the position of the [=term map=] in the [=RDF triple=]: * a [=subject map=] (`rml:SubjectMap`) -is a rule that MUST generate either an [=IRI=] or a [=blank node=]; -* a [=predicate map=] (`rml:PredicateMap`) -is a rule that MUST generate an [=IRI=]; +is a rule that MUST generate either an [=IRI=], a [=URI=] or a [=blank node=]; +* a [=predicate map=] (`rml:PredicateMap`) +is a rule that MUST generate an [=IRI=] or a [=URI=]; * an [=object map=] (`rml:ObjectMap`) -is a rule that MUST generate an [=IRI=], a [=blank node=] or a [=literal=]; +is a rule that MUST generate an [=IRI=], a [=URI=], a [=blank node=] or a [=literal=]; * a [=graph map=] (`rml:GraphMap`) -is a rule that SHOULD generate an [=IRI=]. +is a rule that SHOULD generate an [=IRI=] or a [=URI=]. A [=term map=] MUST have * 0 or 1 [=datatype map=] or 0 or 1 [=language map=]; * 0 or 1 [=term type=]. + + ### Constant RDF Terms (`rml:constant`) A constant-valued term map is a term map that ignores the [=logical iteration=] and always generates the same [=RDF term=]. A [=constant-valued term map=] is a [=constant-valued expression map=], and is thus represented by a resource that has exactly one `rml:constant` property. The [=constant expression=] MUST be a valid [=RDF term=]. 
@@ -123,8 +125,8 @@ If the [=term type=] of the [=template-valued term map=] is `rml:IRI`, then a [= The IRI-safe version of a string is obtained by applying the following transformation to any character that is not in the [`iunreserved` production](http://tools.ietf.org/html/rfc3987#section-2.2) in [[RFC3987]]: -1. Convert the character to a sequence of one or more octets using [UTF-8](http://tools.ietf.org/html/rfc3629) [[RFC3629]] -2. [Percent-encode](http://tools.ietf.org/html/rfc3986#section-2.1) each octet [[RFC3986]] +1. Convert the character to a sequence of one or more octets using [UTF-8](http://tools.ietf.org/html/rfc3629) in [[RFC3629]] +2. [=Percent-encode=] each octet The following table shows examples of strings and their IRI-safe versions: @@ -137,13 +139,14 @@ The following table shows examples of strings and their IRI-safe versions: | ~A_17.1-2 | ~A_17.1-2 | @@ -195,7 +198,7 @@ Using the [=logical iteration=], the [=template value=] of the [=subject map=] w @@ -204,6 +207,63 @@ The space character is not in the [`iunreserved` production](http://tools.ietf.o + + + + + -## IRIs, Literal, Blank Nodes (rml:termType) +## IRIs, URIs, Literal, Blank Nodes (rml:termType) The term type of a [=reference-valued term map=] or [=template-valued term map=] -determines the kind of [=generated RDF term=] ([=IRIs=], [=blank nodes=] or [=literals=]). +determines the kind of [=generated RDF term=] ([=IRIs=], [=URI=], [=blank nodes=] or [=literals=]). If the term map has an optional `rml:termType` property, then its [=term type=] is the value of that property. 
The value MUST be an [=IRI=] and MUST be one of the following options: -* If the term map is a [=subject map=]: `rml:IRI` or `rml:BlankNode` -* If the term map is a [=predicate map=]: `rml:IRI` -* If the term map is an [=object map=]: `rml:IRI`, `rml:BlankNode`, or `rml:Literal` -* If the term map is a [=graph map=]: `rml:IRI` +* If the term map is a [=subject map=]: `rml:IRI`, `rml:URI` or `rml:BlankNode` +* If the term map is a [=predicate map=]: `rml:IRI`, `rml:URI` +* If the term map is an [=object map=]: `rml:IRI`, `rml:URI`, `rml:BlankNode`, or `rml:Literal` +* If the term map is a [=graph map=]: `rml:IRI`, `rml:URI` + + ### Default Term Types If the [=term map=] does not have a `rml:termType` property, then its [=term type=] is: * `rml:IRI`, if it is a [=subject map=], [=predicate map=] or [=graph map=] -* `rml:Literal`, if it is an [=object map=] -and at least one of the following conditions is true: -* It is a [=reference-valued term map=]. - * It has a `rml:languageMap` property (and thus a specified [=language tag=]). - * It has a `rml:datatypeMap` property (and thus a specified [=datatype=]). -* `rml:IRI`, otherwise. +* `rml:Literal`, if it is an [=object map=] and at least one of the following conditions is true (`rml:IRI`, otherwise): + * It is a [=reference-valued term map=]. + * It has a `rml:languageMap` property (and thus a specified [=language tag=]). + * It has a `rml:datatypeMap` property (and thus a specified [=datatype=]). + ### Explicitly Defined Term Types To change the default [=term type=] of a [=subject map=] or [=graph map=] -to a [=blank node=], the [=term type=] MUST be explicitly defined to be a `rml:BlankNode`. +to a [=blank node=], the [=term type=] MUST be explicitly defined to be a `rml:BlankNode` or `rml:URI`. 
To change the default [=term type=] of an [=object map=], the [=term type=] MUST be explicitly defined: -* If the [=term type=] is `rml:IRI`, an [=IRI=] will be generated; +* If the [=term type=] is `rml:IRI`, an [=IRI=] will be generated; +* If the [=term type=] is `rml:URI`, a [=URI=] will be generated; * If the [=term type=] is `rml:BlankNode`, a [=blank node=] will be generated. If the [=term type=] is explicitly defined to be a `rml:BlankNode`, diff --git a/spec/section/tooling.md b/spec/section/tooling.md index 14472026..f6c14fd9 100644 --- a/spec/section/tooling.md +++ b/spec/section/tooling.md @@ -79,9 +79,8 @@ SHOULD end in a slash ("`/`") character.
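
Reviewer note: the base IRI text touched by this series says that RML resolves relative IRIs by simple string concatenation rather than by the RFC 3986 Section 5.2 algorithm, and that the two agree when four conditions hold. The following is a minimal Python sketch of that comparison, not part of the specification. The helper names `concat_resolve` and `meets_equivalence_conditions` are hypothetical, and Python's `urllib.parse.urljoin` is used here only as a stand-in for RFC 3986 style resolution.

```python
from urllib.parse import urljoin

# Default base IRI named in PATCH 02/14 of this series.
DEFAULT_BASE_IRI = "http://example.org/"


def concat_resolve(base_iri: str, relative_iri: str) -> str:
    """Hypothetical helper: resolve by plain string concatenation."""
    return base_iri + relative_iri


def meets_equivalence_conditions(base_iri: str, relative_iri: str) -> bool:
    """Hypothetical helper: check the four conditions listed in the patch."""
    segments = relative_iri.split("/")
    return (
        "?" not in base_iri and "#" not in base_iri   # 1. no '?' or '#' in base
        and base_iri.endswith("/")                    # 2. base ends in a slash
        and not relative_iri.startswith("/")          # 3. relative has no leading slash
        and "." not in segments                       # 4. no '.' path segments
        and ".." not in segments                      #    and no '..' path segments
    )


if __name__ == "__main__":
    cases = [
        (DEFAULT_BASE_IRI, "album/1"),           # all four conditions hold
        ("http://example.org/base", "a"),        # base lacks the trailing slash
        ("http://example.org/data/", "/a"),      # relative starts with a slash
        ("http://example.org/data/", "../a"),    # relative has a '..' segment
    ]
    for base, rel in cases:
        print(
            meets_equivalence_conditions(base, rel),
            concat_resolve(base, rel),
            urljoin(base, rel),
        )
```

In the first case both approaches produce the same IRI. In the other three, concatenation keeps the relative part verbatim, while `urljoin` rewrites the path. That is the behaviour the spec text relies on when it says the original value can be reconstructed from the generated absolute IRI.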