URIs should be IRIs #32

Closed
carueda opened this Issue Oct 30, 2016 · 5 comments

@carueda
Member

carueda commented Oct 30, 2016

Backend counterpart of mmisw/orr-portal#48

(Summary of changes in RDF 1.1, which uses IRIs - https://www.w3.org/TR/rdf11-new/#identifiers)

@carueda carueda added this to the Feb-2017 milestone Oct 30, 2016

@carueda carueda self-assigned this Oct 30, 2016

carueda Nov 10, 2016

Member

@graybeal perhaps I haven't spent enough time on this yet, and it would probably have to wait until some time next year. Would you be able to provide some concrete technical guidance, perhaps with assistance from your groups? I've looked at RFC 3987 (some excerpts below), but I'm still finding it confusing in terms of the various pieces involved (browser, HTTP servers, and the actual underlying orr-ont code along with the Jena and OWLAPI libraries it uses...). Perhaps it doesn't have to be a complete set of adjustments at once; we could take some incremental steps toward fully supporting IRIs.

For example, since "every URI is by definition an IRI" (p. 11), an initial step could be a systematic renaming of all parameters and documentation from "URI" to "IRI", along with appropriate clarification in texts/tooltips and documentation that only URIs are actually supported for now (but that we are working toward fully supporting IRIs, etc.). This would be "low-hanging fruit" in terms of at least showing some progress in this regard sooner rather than later ...


from https://www.ietf.org/rfc/rfc3987.txt:

IRIs are meant to replace URIs in identifying resources for protocols, formats, and software components that use a UCS-based character repertoire. These protocols and components may never need to use URIs directly, especially when the resource identifier is used simply for identification purposes. However, when the resource identifier is used for resource retrieval, it is in many cases necessary to determine the associated URI, because currently most retrieval mechanisms are only defined for URIs. In this case, IRIs can serve as presentation elements for URI protocol elements. An example would be an address bar in a Web user agent.

[7..] This informative section provides guidelines for supporting IRIs in the same software components and operations that currently process URIs: Software interfaces that handle URIs, software that allows users to enter URIs, software that creates or generates URIs, software that displays URIs, formats and protocols that transport URIs, and software that interprets URIs. These may all require modification before functioning properly with IRIs.
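To make the "determine the associated URI" step in the excerpt concrete, here is a minimal sketch in Python of mapping an IRI to a URI by percent-encoding its non-ASCII characters as UTF-8 octets. The function name `iri_to_uri` and the example.org IRIs are hypothetical, and the sketch assumes the input is already a well-formed IRI. It deliberately skips the special treatment of internationalized host names, so it is an illustration rather than a complete RFC 3987 implementation.

```python
from urllib.parse import quote

# ASCII characters that may appear literally in a URI (RFC 3986 reserved
# characters plus "%", so existing percent-escapes are not encoded twice).
# Unreserved characters (letters, digits, "-", ".", "_", "~") are always
# left alone by quote().
_URI_SAFE = ":/?#[]@!$&'()*+,;=%"


def iri_to_uri(iri: str) -> str:
    """Percent-encode every character outside the URI repertoire as UTF-8.

    Simplification (assumption): the host part is treated like any other
    component; no IDNA/punycode conversion is attempted.
    """
    return quote(iri, safe=_URI_SAFE, encoding="utf-8")


if __name__ == "__main__":
    # A plain URI passes through unchanged; an IRI gets its non-ASCII
    # characters escaped.
    print(iri_to_uri("http://example.org/ont/a?b=c#d"))
    print(iri_to_uri("http://example.org/ont/café"))  # http://example.org/ont/caf%C3%A9
```

Because an input that is already a URI comes back unchanged, a step like this could in principle be applied uniformly to whatever identifier a user supplies.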

@carueda carueda changed the title from URIs should be IRIs #48 to URIs should be IRIs Feb 1, 2017

carueda Feb 1, 2017

Member

@graybeal Any comments on my proposal here?

graybeal Feb 1, 2017

Member

Well, I've perhaps prematurely taken you up on renaming the documentation; most of it now uses the term IRIs.

I was under the impression that making it possible to use IRIs in software would be basically a matter of using IRI-compatible libraries, and that most libraries at this stage (that we'd be likely to use, anyway) would be IRI-compatible. As I understood it last time I looked (several years ago), IRIs are basically just a different definition of the legal strings allowed, in order to support internationalization.

So I'm a little surprised if there is a big cost here. Will check a little with the team here.

carueda Feb 16, 2017

Member

Agreed, hopefully no "big cost", but I'd like to get whatever input you can gather from your team in terms of any gotchas and such. Yes, the libraries used by ORR have long supported IRIs, but my concern is more at the level of how such IRIs are handled by browsers and client applications in terms of the corresponding encoding for transfer to the backend, and how they should be decoded in the backend; how to properly handle special IRI symbols for display purposes; etc.
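As a rough illustration of the encoding/decoding concern, the following Python sketch shows one way an IRI could travel from a client to a backend as a query parameter and be recovered intact. Everything here is an assumption for illustration: the endpoint URL, the parameter name `iri`, and the helper names are hypothetical and do not describe the actual ORR API.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_request_url(endpoint: str, iri: str) -> str:
    """Client side: percent-encode the IRI (as UTF-8) into a query string."""
    return endpoint + "?" + urlencode({"iri": iri})


def extract_iri(request_url: str) -> str:
    """Backend side: decode the query parameter back into the original IRI."""
    query = urlsplit(request_url).query
    return parse_qs(query)["iri"][0]


if __name__ == "__main__":
    iri = "http://example.org/ont/café#Température"
    url = build_request_url("https://example.org/api", iri)
    print(url)               # pure ASCII on the wire
    print(extract_iri(url))  # the original IRI, suitable for display
```

The point of the sketch is that the wire form stays ASCII-only while the decoded value keeps the original characters, so the backend can hand the decoded string to its RDF libraries and the frontend can display it as entered.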

@carueda carueda modified the milestones: Feb-2017, Mar-2017 Mar 3, 2017

@carueda carueda modified the milestones: June-2017, Mar-2017 May 30, 2017

carueda May 31, 2017

Member

(just a personal note)

ORR uses both the Apache Jena and OWL-API libraries.

Jena: Although described as "incomplete", https://jena.apache.org/documentation/notes/iri.html offers an acceptable description of how to deal with IRIs. In particular, Jena seems to pay great attention to validation.

OWL-API: There's some Javadoc for their IRI class but no other specific IRI documentation AFAICT (https://github.com/owlcs/owlapi/wiki). In particular, I'm not seeing any explicit methods to validate a string as representing an IRI.

@carueda carueda added the in progress label May 31, 2017

@carueda carueda closed this in 6c8af15 May 31, 2017

carueda added a commit that referenced this issue May 31, 2017

Merge pull request #49 from mmisw/32_iri
resolve #32 "URIs should be IRIs"

@carueda carueda removed the in progress label May 31, 2017
