Round tripping language tags case #73
Comments
Alternate choice: leave no expectation of case-preservation when converting ShExC<->ShExJ PROPOSE:
"ISO": indeed eg This won't help round-tripping but at least will enforce determinism.
If I understand, the proposal is to:
e.g. Does this derive from the BCP47 grammar or some other text in the doc? Can you make a PR on the spec value constraints section (and maybe value set parsing) to make this concrete? I'd propose @gkellogg and @ericprud as reviewers.
Your description is correct (but there's also a dash between the two sequences). I don't think this normalization has any bearing on validation, since validation must be case-insensitive.
I wasn't worried about the validation, just exactly how to specify the canonical form. I guess you have something in mind like:
Where in BCP47 do the capitalization rules come from? Can we justify the rules above?
This is wrong. This would be ambiguous for eg x-whatever. The rules require capitalization in eg x-what-Ever and x-what-EV but require nothing in x-whatever or x-what-everything or x-what-eve. Just refer to sec 2.1.1. IMHO you don't need to restate the rules, just give some examples
I think I misread what "joined" means. I still think you don't need to restate the rules, but if you want to do it, please change "set of sequences" to "sequence of strings" |
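For anyone who wants to experiment with the examples above, here is a minimal Python sketch of the case convention as I read BCP47 (RFC 5646) sec 2.1.1: everything lowercase, except two-letter subtags (uppercase) and four-letter subtags (title case) that are neither first in the tag nor directly after a singleton. The function name canonical_case is made up for illustration. It is not taken from the spec or from any ShEx implementation, and it does not check that the tag is well-formed.

```python
def canonical_case(tag: str) -> str:
    """Apply the RFC 5646 sec 2.1.1 case convention to a language tag.

    Hypothetical helper for illustration; it only adjusts case and
    does not validate the tag against the BCP47 grammar.
    """
    result = []
    previous = None
    for index, subtag in enumerate(tag.split("-")):
        formatted = subtag.lower()
        # A singleton is a one-character subtag such as "x" or "u".
        after_singleton = previous is not None and len(previous) == 1
        if index > 0 and not after_singleton:
            if len(formatted) == 2:
                formatted = formatted.upper()       # region-style, e.g. GB
            elif len(formatted) == 4:
                formatted = formatted.capitalize()  # script-style, e.g. Latn
        result.append(formatted)
        previous = subtag
    return "-".join(result)


for example in ["en-gb", "az-latn-x-latn", "x-whatever", "x-what-ever", "x-what-ev"]:
    print(example, "->", canonical_case(example))
```

Because validation compares language tags case-insensitively, a normalization like this would only change how tags are serialized, not which data conforms.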
Related: #71
Apart from value set values with language tags, ShExC, ShExJ and ShExR can be exactly round-tripped, cf. schema tests. Because language-tagged literals are expressed as JSON-LD object literals and RDF parsers are not responsible for preserving upper/lower case in literal language tags, a ShExC schema:
would be translated to ShExR:
An RDF parser is allowed to parse that as en-gb, so it would round-trip to ShExC:
This doesn't affect semantics of validation but it can be a pain for folks who like to follow ISO language code rules where regions should be upper case, i.e. en-GB. (This has little impact as no one uses ShExR anyways.) Round-tripping between ShExC and ShExJ (as JSON) is unaffected by this.
PROPOSE: