Do we need a URI datatype? #226

Closed
mlagally opened this issue Sep 6, 2018 · 9 comments

@mlagally
Contributor

mlagally commented Sep 6, 2018

We should consider adding a URI conforming to https://tools.ietf.org/html/rfc3986 to the set of supported data types.

@handrews

handrews commented Sep 7, 2018

In JSON Schema this is handled with the format keyword, with values uri, uri-reference, iri, and iri-reference. While the rules around handling format in JSON Schema are a little weird (because it's potentially expensive to validate fully and reliably), the TD could impose additional processing rules such as "The uri etc. formats MUST be validated", or "... MUST be validated for http://, https://, etc. URI schemes".
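To illustrate the kind of processing rule suggested above, here is a minimal sketch using only the Python standard library. The function name `check_value`, the `STRICT_SCHEMES` policy, and the sample values are assumptions made for this example; they come from neither JSON Schema nor the TD specification, and `urlsplit` is a lenient parser, not a full RFC 3986 validator.

```python
from urllib.parse import urlsplit

# Hypothetical data schema: "format" refines a plain string type.
schema = {"type": "string", "format": "uri"}

# Assumed policy for this sketch: only these schemes get a stricter check.
STRICT_SCHEMES = {"http", "https"}


def check_value(value, schema, strict_schemes=STRICT_SCHEMES):
    """Return True if value passes the type check and the assumed format policy."""
    if schema.get("type") == "string" and not isinstance(value, str):
        return False
    if schema.get("format") != "uri":
        return True  # other formats are not checked in this sketch
    try:
        parts = urlsplit(value)
    except ValueError:
        return False
    if not parts.scheme:
        return False  # a "uri" (unlike a "uri-reference") needs a scheme
    if parts.scheme in strict_schemes:
        return bool(parts.netloc)  # http(s) URIs must name an authority
    return True  # unknown schemes are accepted without deeper checks


print(check_value("https://example.com/things/lamp", schema))
print(check_value("/things/lamp", schema))
```

The relative reference `/things/lamp` fails here because it has no scheme; under `uri-reference` it would be acceptable, which this sketch does not model.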

@mkovatsc
Contributor

Yes, currently the anyURI type (inherited from XML Schema via Linked Data definitions) is used in the model for href (Linked Data) Properties outside the data schemas. There we can have these explicit processing rules. So the common cases can be handled inexpensively.

Within data schemas, they would be modelled as a string type together with format.
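As a rough sketch of the modelling described here, the fragment below builds a Thing Description excerpt as a Python dictionary. The property name `firmwareSource` and the example URL are hypothetical, and the use of `format` inside a TD data schema is exactly what this issue is discussing, so treat it as an assumption and not as settled specification text.

```python
import json

# Hypothetical TD fragment; the property name and URL are illustrative only.
td_fragment = {
    "properties": {
        "firmwareSource": {
            # Inside the data schema: a plain string refined by "format".
            "type": "string",
            "format": "uri",
            # Outside the data schema: href is a Linked Data property (anyURI).
            "forms": [
                {"href": "https://device.example.com/properties/firmwareSource"}
            ],
        }
    }
}

print(json.dumps(td_fragment, indent=2))
```

The two URI-valued spots are handled differently: `href` is typed by the TD model itself, while the property value is only described as a string carrying a `format` hint.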

@sebastiankb
Contributor

sebastiankb commented Nov 15, 2018

What is the conclusion of this issue? The format term should also be specified in the JSON Schema model, right?

@mlagally
Contributor Author

> Yes, currently the anyURI type (inherited from XML Schema via Linked Data definitions) is used in the model for href (Linked Data) Properties outside the data schemas. There we can have these explicit processing rules. So the common cases can be handled inexpensively.
>
> Within data schemas, they would be modelled as a string type together with format.

Not sure I understand this comment with respect to the "format", since there is no "format" of a data schema.
The current types for data schemas (and thus for properties) in the TD spec are more limited than the types in JSON-Schema, which include a uri among other useful types, such as date-time and email.
See https://json-schema.org/understanding-json-schema/reference/string.html

We should consider adding these types too, since they are convenient and enable additional validation of a TD.

@handrews

@mlagally the JSON Schema type keyword does not include anything like uri, date-time, email, etc. These are all handled through the format keyword, which for all current documented cases works alongside "type": "string" (numeric formats have been proposed, but none have gotten enough support to be added to the spec yet).

Validation for the format keyword is optional, which is a confusing situation we are trying to at least partially address in the next draft. Although, given how late this draft already is, we might punt the format stuff over to the following draft.

The reason it is optional is that validating some formats is somewhere between expensive and impossible. For example, I don't think I've ever found a definitive way to tell if an email address is syntactically correct, as there's such a confusing history of standards vs actual behavior over the past several decades. URIs are also a challenge, as many of the details are scheme-specific, and new schemes can be created by anyone at any time. There are a lot of examples where it's easy to tell that something is not a valid URI, but being absolutely certain that it is valid is sometimes very hard depending on what the scheme requires.
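The asymmetry described above (rejecting is easy, confirming is hard) can be sketched with a cheap negative check. The helper name `definitely_not_a_uri` is made up for this example; it only tests the generic RFC 3986 scheme syntax and a few forbidden characters, so a `False` result means "not ruled out", never "valid".

```python
import re

# scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ), followed by ":"
# (generic syntax from RFC 3986, section 3.1)
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*:")


def definitely_not_a_uri(value: str) -> bool:
    """True when value cannot be an RFC 3986 URI; False means 'not ruled out'."""
    if SCHEME_RE.match(value) is None:
        return True  # no scheme at the start
    # Spaces, control characters, and non-ASCII must be percent-encoded in a URI.
    return any(ch <= " " or ch >= "\x7f" for ch in value)


print(definitely_not_a_uri("no scheme here"))
print(definitely_not_a_uri("http://example.com/a b"))
print(definitely_not_a_uri("mailto:@@@"))
```

The last call is not rejected even though `@@@` is very unlikely to be a usable mailto address: deciding that requires scheme-specific rules that a generic check does not know about.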

@handrews

Of course, the TD is free to impose clearer requirements on how format is processed. We are working on JSON Schema features to help make this more predictable as well.

@sebastiankb added the "Needs discussion" label on Jan 9, 2019
@egekorkan
Contributor

From Princeton Testfest:
We agreed to add the format keyword to the TD. For the implementation report, @mlagally and @sebastiankb will implement TDs.

@sebastiankb added the "Work In Progress" label and removed the "Needs discussion" label on Feb 18, 2019
@sebastiankb
Contributor

done

@sebastiankb added the "Needs review" and "Work In Progress" labels and removed the "Work In Progress" and "Needs review" labels on Feb 20, 2019
@sebastiankb
Contributor

The description of format will be extended so that custom values can also be assigned.

@sebastiankb self-assigned this on Feb 27, 2019
@sebastiankb added the "Needs review" label and removed the "Work In Progress" label on Feb 28, 2019