Feedback while working on a JSON Schema for WP #287
Ya, I asked this initially in #228. I probably should have left it open, since the table restructuring didn't solve it except in the sense it requires an object.
Thanks for doing that. I'll try to answer some of these issues; we will have to work out the exact editorial changes together.
I think that for this, as well as for other issues, what we have to check is whether the structured data tool for schema.org accepts a structure or not. For this specific issue,
is accepted (a pleasant surprise; I expected the opposite :-). This means that we should say somewhere in the spec that a simple string is acceptable in those positions (and must be converted into Persons when converting into WebIDL).
I think these two terms should indeed be required.
I am not sure I understand. We do say "one or more" which, for me, means that an array can be accepted as a value. Is this what you mean?
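The string-to-Person conversion mentioned above could be sketched roughly as follows. This is only an illustration, not the spec's algorithm: the function name `normalize_creators` and the `type`/`name` keys are assumptions made for the example.

```python
def normalize_creators(value):
    """Return a list of Person-like dicts from a string, an object, or an array of either.

    Hypothetical helper: the key names "type" and "name" are assumptions.
    """
    # A single value is treated as a one-item array.
    items = value if isinstance(value, list) else [value]
    people = []
    for item in items:
        if isinstance(item, str):
            # A bare string is promoted to a Person carrying that name.
            people.append({"type": "Person", "name": item})
        elif isinstance(item, dict):
            # Objects are copied through unchanged.
            people.append(dict(item))
        else:
            raise TypeError(f"unsupported creator value: {item!r}")
    return people


print(normalize_creators("Jane Doe"))
print(normalize_creators(["Jane Doe", {"type": "Person", "name": "John Doe"}]))
```

With this shape, downstream processing only ever has to deal with a list of objects, whatever form the author chose.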
In general, every time we expect a string (names is the typical example) the usage of
Correct, we will have to have a proper multilingual example that we can use.
You mean every time we expect a string we can accept an array of strings (or localized strings)? Yes.
Yes.
I am a bit wary of repeating the specification of, say, the accessibility terms in our spec; we have a reference to those.
I think yes. Cc @mattgarrish
I'll update the JSON Schema based on my own personal take on what should be required/allowed. Once that's done, I'll post an update here to document some of the choices I made in order to discuss them together. Update: Done.
The updated JSON Schema should now cover every term in our manifest: https://github.com/w3c/wpub/blob/master/schema/publication.schema.json
To finalize this schema, I had to make a few decisions regarding what's allowed or not.
1. Localizable Strings
All localizable strings can be expressed using:
This schema is applied to:
To illustrate, here are three examples using
As a string
As an object
As an array of objects
The following example is rejected by the JSON Schema:
A mixed array containing both strings and objects would be rejected as well. All objects in the array MUST be unique.
2. Creators
All creators allow:
Whenever an object is used,
To illustrate, here are four examples using
As a string
As an object
As an array of strings
As an array of objects
Mixed arrays containing both strings and objects are also allowed. All objects in the array MUST be unique but the JSON Schema would validate the same creator expressed using both a string and an object.
3. Accessibility
I've applied strict validation on the various accessibility terms by forcing them to be either a string or an array and using an
4. Relative URIs
In JSON Schema, it's possible to validate that a string is a URI using
But since we allow relative URIs, I had to drop this validation to allow any string. I wonder if this might also be a problem for other situations as well.
Brilliant! Before commenting below: just as for the json-ld context, it would be good to find a final place for the schema, so that tools could use that. I realize that may mean rewriting the schemas, since you used the cross-referencing possibility. I just wonder whether it becomes too unwieldy if all schema files are folded into one. That may avoid forcing schema validation libraries to make HTTP requests all the time (tools could store a copy locally). But that is something we can handle later.
What is the rationale for not allowing an array of strings or an array with a mixture of strings and objects? I understand it is more complex, but the restriction sounds a bit contrived for an author. If the default language for the document is English, ie, the
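To make the relative URI problem concrete, here is a small sketch using Python's standard `urllib.parse`. The helper names and the `example.org` URLs are made up for the illustration; the point is only that a relative reference has no scheme, so an absolute-URI check rejects it until it is resolved against a base such as the manifest's own URL.

```python
from urllib.parse import urljoin, urlsplit


def is_absolute_uri(value: str) -> bool:
    """Rough check: an absolute URI carries a scheme such as "https"."""
    return bool(urlsplit(value).scheme)


def resolve(base: str, reference: str) -> str:
    """Resolve a possibly relative reference against a base URL."""
    return urljoin(base, reference)


# Hypothetical manifest location and resource reference.
manifest_url = "https://example.org/book/manifest.json"
print(is_absolute_uri("chapter1.html"))
print(is_absolute_uri(resolve(manifest_url, "chapter1.html")))
```

Recent JSON Schema drafts also define a `uri-reference` format next to `uri`, which may be worth checking against the validators in use.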
seems redundant, and this looks more natural:
I do not see the problem on the implementation level either.
Right, I agree with this. All the more reason to allow the same flexibility for localizable strings.
Yes, this is unfortunate, but we cannot help this.
If you have more than one string, you get in a weird situation where it's impossible to know which language you should apply:
I think it's safer to require all arrays in localizable strings to use objects instead. It's also much easier to test uniqueness when an array is all strings or all objects.
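As a rough sketch of the rule argued for here (a string, an object, or an array made only of unique objects), assuming hypothetical `value` and `language` keys on each object; this is not the actual schema, just the same constraint written out in Python:

```python
import json


def check_localizable(value):
    """Return None if the value is accepted, otherwise a short reason string."""
    if isinstance(value, (str, dict)):
        return None
    if isinstance(value, list):
        # Arrays of strings and mixed arrays are both rejected.
        if not all(isinstance(item, dict) for item in value):
            return "arrays must contain only objects"
        seen = set()
        for item in value:
            # Serialize with sorted keys so that key order does not matter.
            key = json.dumps(item, sort_keys=True)
            if key in seen:
                return "objects in the array must be unique"
            seen.add(key)
        return None
    return "expected a string, an object, or an array of objects"


print(check_localizable("Moby Dick"))
print(check_localizable(["Moby Dick", {"value": "Moby Dick", "language": "fr"}]))
```

Because every array item is an object, the uniqueness test is a plain comparison of serialized objects, with no string-versus-object cases to reconcile.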
Creators are quite different from
Yikes. I'm always in favour of DRY (don't repeat yourself) and references helped a lot while writing this schema.
FYI, the JSON Schema for a publication is already available at https://w3c.github.io/wpub/schema/publication.schema.json I've tested examples in various validation tools and they could all support the various references that are already used in the current version.
I do not see the problem. This is obviously bad authoring, but the rules are clear (and the same as in HTML): if the language is set (in inLanguage), then it applies to a string unless the string sets the language for itself. The second string will be set to English in this case (ie, if inLanguage is set). It is an authoring error. So?
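The inheritance rule described here can be sketched as follows. The function name and the `value`/`language` keys are assumptions for the illustration, and `default_language` stands in for whatever `inLanguage` declares.

```python
def with_language(value, default_language=None):
    """Attach a language to every string, falling back to the document default.

    Hypothetical helper: accepts a string, an object, or an array of either.
    """
    items = value if isinstance(value, list) else [value]
    resolved = []
    for item in items:
        if isinstance(item, str):
            # A bare string inherits the default language (if any).
            resolved.append({"value": item, "language": default_language})
        else:
            # An object keeps its own language when it sets one.
            resolved.append({
                "value": item["value"],
                "language": item.get("language", default_language),
            })
    return resolved


print(with_language(["Moby Dick", {"value": "Moby Dick ou le Cachalot", "language": "fr"}], "en"))
```

Two bare strings in the same array both end up tagged with the default language, which is the situation the rest of this thread argues about.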
If this is the only reason then, I am sorry, but I do not agree.
Yikes indeed, I understand. The issue I see, however, is that schema validation becomes impossible offline, although I think using this validation and building it into the processing would be extremely useful. Is it necessary to use absolute URLs? If the 'main entry point' to the schema used only relative URLs, one could make a copy of the whole collection. (Apologies, I am not very familiar with all the details of JSON schemas.)
This goes beyond authoring, it will affect the processing of the manifest as well. When processing our WebIDL, we can't allow two different values for the same language on a localizable string.
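A processing-side check for this constraint might look like the sketch below. It assumes entries that already carry a language, in a hypothetical `{"value": ..., "language": ...}` shape, and simply reports every language that ends up with more than one distinct value.

```python
from collections import defaultdict


def conflicting_languages(entries):
    """Map each language that has several distinct values to those values."""
    by_language = defaultdict(set)
    for entry in entries:
        by_language[entry["language"]].add(entry["value"])
    # Only languages with more than one distinct value are conflicts.
    return {
        language: sorted(values)
        for language, values in by_language.items()
        if len(values) > 1
    }


titles = [
    {"value": "Moby Dick", "language": "en"},
    {"value": "The Whale", "language": "en"},
    {"value": "Moby Dick ou le Cachalot", "language": "fr"},
]
print(conflicting_languages(titles))
```

Such a check works on the processed result, so it applies the same way whether the manifest author wrote strings or objects.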
I must admit I really do not understand the problem.
I do not see any downsides that would justify imposing an unnecessary restriction.
I am not 100% sure of the validity of this statement, but even if it is true, it is completely orthogonal. An implementation would have to make such a check on the result, regardless of whether the original manifest author used objects or strings.
@iherman the flexibility you're talking about comes at a very high cost:
This is not limited to
As a UA, what should be displayed as the title of the publication if I have three strings that are all tagged as being in English?
@HadrienGardeur I believe the term "much more complex" is... overstating it. I guess we will have to agree that we disagree on that one, and let the rest of the group make the decision.
This discussion was somehow continued at #288 (comment). I'm going to create an alternate schema for localizable strings based on language maps that I won't reference from the main JSON Schema for now. Update: https://github.com/w3c/wpub/blob/master/schema/localizable-map.schema.json
@HadrienGardeur administrative comment: I think we should have a separate issue on whether or not to use language maps instead of what we have now. It would be clearer to handle it as an issue of its own. Other topics were split off into other issues, too (eg, #290), and are in the process of being closed. Are there other detailed issues in this one that should be kept open separately? We could then close this one...
I think we need to explicitly state it in the spec and include examples for every format whenever a term allows a combination of string/array/object.
@HadrienGardeur I would leave that to @mattgarrish; maybe a general statement making it clear that a string can be used instead of an array of strings, or in place of an object, etc, would be better than making the text even more difficult to read. I agree that the examples should reflect these more clearly.
I've updated the JSON Schema to use relative references. It should now be possible to embed it in other projects like your proof of concept @iherman.
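For readers unfamiliar with language maps, the idea is an object keyed by language tag, as in JSON-LD. The sketch below converts a list of tagged values into such a map; it is an assumption-laden illustration (hypothetical `value`/`language` keys, hypothetical function name), not a description of what `localizable-map.schema.json` actually contains.

```python
def to_language_map(entries):
    """Build a {language: value} map, refusing two values for one language."""
    language_map = {}
    for entry in entries:
        language = entry["language"]
        if language in language_map:
            # A map has one slot per language, so duplicates cannot be expressed.
            raise ValueError(f"duplicate value for language {language!r}")
        language_map[language] = entry["value"]
    return language_map


print(to_language_map([
    {"value": "Moby Dick", "language": "en"},
    {"value": "Moby Dick ou le Cachalot", "language": "fr"},
]))
```

Since object keys are unique, a map rules out two values for the same language by construction rather than by a separate uniqueness check.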
Since we won't have any calls for the next two weeks, I've started working on a JSON Schema for WP.
A first draft is available at https://github.com/w3c/wpub/blob/master/schema/publication.schema.json
While working on this schema, I've had to go through the specification again and it highlighted several issues with the current draft.
1. Creators
The JSON serialization for creators feels very much underdefined: there's a good list of roles that I can use for my JSON Schema, but everything else is unclear.
name, @type) or have a strong preference for specific terms?
2. Item-specific language
I find that part of the spec every bit as confusing as creators:
@value + @language instead of a string? An explicit list would help a lot.
3. String/Array/Object
Overall, it's fairly difficult to know right now where we allow strings, arrays or objects.
For example:
name (string for the simple case, object for a single localized value, array of objects for multiple localizations)?
4. Accessibility
There are many different values for the various accessibility terms that we support, but they're not listed in our specification.
Should we validate these values at all?
For languages (both inLanguage and @language), readingProgression and inDirection, I've already handled the validation of their values.