Proposal: find a transitional solution before updating langString #13
Comments
Let us say a full RDF environment like RDFLib is updated to 1.2. This means the internal However, what should an RDF Turtle (or RDF/XML or JSON-LD) do? Should it generate A more radical approach may be (but I'm not sure) not to consider (I'm not sure about RDF/XML. Maybe it should be forgotten…) Indeed, if we do this, the WG would have to update the RDF family of specs only (and Turtle would only have to deal with editorial errata), and the only extra syntax to be updated is RDFa. SPARQL and SHACL, which is based on the Turtle syntax, might possibly choose to do an update later (or choose not to do it). (I am not sure about R2RML and CSVW.) I am not sure that our I18N friends, like @r12a or @aphillips, would like that, though. Even if it is used as syntactic sugar for a few RDF serializations, it may leak out to other usages… (I am not convinced about this approach either at this point, I am just musing…) ✝︎ Turtle is too 'close' to N-Triples, and N-Triples is really the "dump" format for RDF triples, so I do not think that syntactic sugar in Turtle would be a good idea. |
I don't think they will, and let's face it, neither you nor I would like it very much either ;-)
Exactly! |
Actually, my proposal above is two-fold, and maybe it was a mistake to merge both aspects. So, forgetting about the controversial
Since we only have 3 proposals on the table, if we ban
|
-x-dir-XXX
as a transitional hack
Hm. The combination of However... if RDF 1.2 is defined by this WG, it still needs to update all the others... (although the time pressure is different). In some sense, the WG would become some sort of a maintainer of all things RDF... Not sure whether this is good or bad. |
I agree, unless we change the media types (e.g. So I really like your idea of making the 1.2 serializations "compatible" with the 1.1 family of syntaxes – only with a different interpretation. That's what worked for deprecating plain literals. Piggy-backing on the
Yet another proposal
Consider the following Turtle/SPARQL literal: In RDF 1.2, we would decide that As for the interpretation of the
Both options are slightly "impure", but I believe this is the price RDF has to pay for not having included base direction in the first place... |
I like this last proposal, or at least the direction it is going in. It helps simplify the RDF Literal definition and better leverages the datatype element. Arguably, this would have been a better way for RDF 1.1 to have gone when introducing langString. It also allows JSON-LD to make use of type maps and could lead to the deprecation of |
The goal is not to deprecate |
On the I would still opt for your second option:
i.e., to properly update RDF concepts but not to deprecate However, @gkellogg @pchampin how would that work with indexing? I thought the JSON-LD 1.1 type maps are for object types (i.e., real RDF types) and not for datatypes... (From the current JSON-LD draft, in https://w3c.github.io/json-ld-syntax/#node-type-indexing
|
The way I see it, JSON-LD would encode those literals as Symmetrically, the RDF to JSON-LD algorithm should convert any Of course, some other implementations may still produce a value object of the form |
B.t.w.... the spec for the This means we can do something like:
|
Actually, the API implementation of expansion works (compaction seems to require a minor tweak for term selection) with both node objects and data objects. We could change the name of the Node Type Indexing section and add some minor text to include this (we need to do something if we want to exclude value objects in the API). Try, for example this playground link. In retrospect, I’m not sure why the syntax document restricted type indexing to node objects. |
@iherman If BCP47 evolves so much as to allow |
@iherman @pchampin That's right. The syntax of BCP47, in any of its iterations, has never allowed any characters except a-z, 0-9, and hyphen. Future iterations are not envisioned, but certainly it would be a breaking change (and extremely remarkable) to add more characters to what's permitted and still somehow be BCP47. However, underscores are used in some systems for locale identifiers and most implementations of language/locale mapping are at least a little permissive about exchanging one for the other (because developers can't remember which one to use). |
@gkellogg Your example in the playground is strange, but I followed the idea and built another one. I'm not very comfortable with it though. I consider the One thing that type maps won't allow me to do is to add direction to some of the values: in my example above, I would like to be able to write:
"name_map": {
  "fr": "Lyon",
  "ar": { "@value": "ليون", "@direction": "rtl" }
}
I don't want to have to put the direction in the key (as in |
The problem with the expanded value object version is that there's no way to use maps, which I think might be important here.
You'd need to add a term such as "ar-rtl", if the direction is conveyed in a type map. Language maps may be best, if we adopt the |
Well, fortunately, RDF is not (at least, not normatively). The abstract syntax specifies that "the language tag MUST be well-formed according to section 2.2.9 of [BCP47]", and concrete syntaxes only accept dashes between letters and digits. I quickly tried a few implementations, none of them let me use underscore in language tags... |
@pchampin Agreed. @gkellogg I agree with @pchampin that the direction doesn't want to participate in the map/language negotiation. I'll point out that string-meta has a section on this. One of the things about appending gunk to the end of the language tags that is not "default ignorable" is that it can interfere with BCP47's prefix matching language negotiation heuristic. Re-separating the direction from the language tag helps prevent |
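To make the prefix-matching concern a bit more concrete, here is a minimal sketch of RFC 4647 "basic filtering" (a language range matches a tag when the two are equal, or when the range is a prefix of the tag that ends at a hyphen). The function name `basic_filter_matches` is made up for this illustration, and which side of the comparison ends up carrying the direction subtag in practice is an assumption on my part, not something stated in the thread.

```python
def basic_filter_matches(language_range: str, tag: str) -> bool:
    """Sketch of RFC 4647 basic filtering (case-insensitive prefix match).

    A range matches a tag if they are equal, or if the range is a prefix
    of the tag and the next character in the tag is a hyphen.
    """
    r = language_range.lower()
    t = tag.lower()
    if r == "*":  # the wildcard range matches every tag
        return True
    return t == r or t.startswith(r + "-")


# A plain range still matches a tag that carries the private subtag...
plain_range_vs_decorated_tag = basic_filter_matches("ar", "ar-x-dir-rtl")

# ...but a range that carries the subtag no longer matches plain or
# regional tags for the same language.
decorated_range_vs_plain_tag = basic_filter_matches("ar-x-dir-rtl", "ar")
decorated_range_vs_regional_tag = basic_filter_matches("ar-x-dir-rtl", "ar-EG")
```

In this sketch the first comparison succeeds and the other two fail, which is one way the appended subtag could get in the way of negotiation when it is treated as part of the language tag.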
If we go for a discrete direction, we'll need to update the JSON-LD language map, which is currently restricted to having values which are plain strings, to allow either plain strings or value objects with no conflicting |
@gkellogg Yes: that's definitely a problem. See my link to "section on this" for thoughts on this (which includes improvements to language maps having to do with language tag handling as well). |
I have added this to the charter as an alternative. |
I'm trying here to flesh out my argument (made at the end of this week's telco) that we could use the -x-dir-XXX private subtag, not to solve the problem in the long term, but as a means to ensure a smooth transition between the current state of the standards and the future state where base direction is cleanly integrated. Doing this, we could limit the charter of the new WG to a smaller set of specifications (RDF concepts, semantics and concrete syntaxes, basically) and leave it to other WGs to update the rest of the specifications (possibly together with other changes).
I understand that many people are pessimistic about letting the genie out of the bottle, fearing that once this private subtag is in use, it will spread and pollute even clean standards (such as HTML). And honestly, I share those concerns. But I think/hope that we can contain this risk – let the genie out "on parole", if you like. The alternative (updating all the specs at once) seems equally risky.
The core idea is that, in the future RDF model (I'll call it RDF 1.2 for convenience), we update langString as described in langString.html, but in the abstract syntax, we forbid the language tag to contain -x-dir-ltr or -x-dir-rtl. Instead, whenever those subtags are encountered (either in a concrete syntax or programmatically), they MUST be stripped out of the language tag and interpreted as the base direction¹.
This means that a Turtle 1.1 file (or SPARQL 1.1 query) may contain "مرحبا world, how are you?"@en-x-dir-ltr, but any RDF 1.2 implementation will convert it automatically to "مرحبا world, how are you?"@en^ltr. The RDF 1.2 family of concrete syntaxes would accept the private subtag during parsing (for backward compatibility), possibly with a warning, but it would be illegal to use it when serializing.
When serializing from an RDF 1.2 implementation to a 1.1 family of languages (e.g. SPARQL 1.1 results), implementations MAY encode the direction information using the private subtag, in order to preserve that information. Thanks to the principles above, an RDF 1.2 store and an RDF 1.2 client may communicate using SPARQL 1.1, but will not spread the private subtag further.
Once all specs (and corresponding implementations) are updated, only old static files may still contain the private subtag – and raise warnings whenever parsed by RDF 1.2 implementations.
¹ What happens when the subtag and an explicit base direction are both provided needs to be decided... I think it should be an error.
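The stripping rule and the reverse encoding could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names `split_direction` and `to_rdf11_tag` are hypothetical, the private subtag is assumed to appear at the very end of the language tag, and a conflict between the subtag and an explicit direction is treated as an error, which is the footnote's suggestion rather than a settled decision.

```python
import re
from typing import Optional, Tuple

# Assumption: the transitional subtag is always the final part of the tag.
_DIR_SUBTAG = re.compile(r"-x-dir-(ltr|rtl)$", re.IGNORECASE)


def split_direction(
    lang_tag: str, explicit_direction: Optional[str] = None
) -> Tuple[str, Optional[str]]:
    """Strip a trailing -x-dir-ltr / -x-dir-rtl subtag from a language tag.

    Returns (clean_language_tag, base_direction). Raising on a conflict
    with an explicit direction is an assumption taken from the footnote.
    """
    match = _DIR_SUBTAG.search(lang_tag)
    if match is None:
        # No private subtag: keep the tag and any explicit direction as-is.
        return lang_tag, explicit_direction
    if explicit_direction is not None:
        raise ValueError("both a direction subtag and an explicit direction given")
    return lang_tag[: match.start()], match.group(1).lower()


def to_rdf11_tag(lang_tag: str, direction: Optional[str]) -> str:
    """Encode a base direction back into the private subtag, for output
    to a 1.1-family syntax that has no other place to carry it."""
    if direction is None:
        return lang_tag
    if direction not in ("ltr", "rtl"):
        raise ValueError(f"unknown base direction: {direction!r}")
    return f"{lang_tag}-x-dir-{direction}"


# Parsing a 1.1 literal's tag, then re-encoding it for a 1.1 consumer.
lang, base_dir = split_direction("en-x-dir-ltr")
legacy_tag = to_rdf11_tag(lang, base_dir)
```

With this shape, an RDF 1.2 implementation would only ever hold the clean tag plus a separate direction internally, and the private subtag would reappear only at the boundary with a 1.1 syntax.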