Relative IRIs may clash with terms #49

Closed
niklasl opened this Issue Jan 10, 2012 · 5 comments

niklasl commented Jan 10, 2012

Although we've removed @base, the issue remains that relative IRIs (which still seem to be supported) and terms may clash. For instance, consider this data (assuming it is located at http://example.org/):

{
  "@context": {"homepage": "http://xmlns.com/foaf/0.1/homepage"},
  "@id": "homepage#me",
  "homepage": {"@id": "homepage"}
}

Currently, the triple for this data would become:

<http://example.org/homepage#me>
    <http://xmlns.com/foaf/0.1/homepage> <http://xmlns.com/foaf/0.1/homepage> .

Not the intended:

<http://example.org/homepage#me>
    <http://xmlns.com/foaf/0.1/homepage> <http://example.org/homepage> .

This is because the @id given for homepage would be found among the terms and not be resolved against "http://example.org/".
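
The clash can be reproduced with a minimal sketch. The function `expand_id` and its term-first lookup order are assumptions made for illustration, not the actual processor algorithm; only the context and the base IRI are taken from the example above.

```python
from urllib.parse import urljoin

BASE = "http://example.org/"
CONTEXT = {"homepage": "http://xmlns.com/foaf/0.1/homepage"}


def expand_id(value, context, base):
    """Hypothetical expansion: try the value as a term first and fall
    back to relative IRI resolution only when no term matches."""
    if value in context:
        return context[value]
    return urljoin(base, value)


subject = expand_id("homepage#me", CONTEXT, BASE)
obj = expand_id("homepage", CONTEXT, BASE)
print(subject)  # http://example.org/homepage#me
print(obj)      # http://xmlns.com/foaf/0.1/homepage
```

Because "homepage" is a key in the context, it never reaches the `urljoin` call, which is what would have produced http://example.org/homepage.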

There are various ways of resolving this:

  1. Only allow full IRIs. That will probably be problematic since there are many good use cases for relative IRIs, e.g. for referencing contexts, and if nothing else they are most certainly common practice for linking on the web.
  2. Disallow relative IRIs that might be confused with terms, e.g. by requiring non-full IRI paths to start with either "/" (i.e. be absolute) or "./" (which the URI spec already requires if the first segment contains a colon, to disambiguate from a scheme). This could work but shares some of the problems of option 1 (since it isn't uncommon for relative links to be given as only the leaf segment of a path).
  3. Don't allow terms or CURIEs as values in @id (including for coerced values). That's problematic since there are many needs for these, both in contexts and in certain instance data cases, e.g. for referencing classes and properties (and since @type is basically defined as {"@id": "rdf:type", "@type": "@id"}). We could have two different keywords: one taking TERMorCURIEorAbsIRI values (mainly used in coercion, possibly named @term), and one for only IRIs or perhaps CURIEorIRI. (See RDFa 1.1 for the definitions of these.)
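
Option 2 could be checked roughly as follows. This is only a sketch: the function name `is_allowed_id` is hypothetical, and treating anything that begins with scheme-like syntax as acceptable is a simplifying assumption (a CURIE such as "foaf:homepage" is syntactically indistinguishable from an IRI with a scheme at this level).

```python
import re

# RFC 3986 scheme syntax: ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ), then ":"
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


def is_allowed_id(value):
    """Hypothetical option 2 check: accept values that carry a scheme
    (or prefix) and relative paths that start with "/" or "./"."""
    if _SCHEME.match(value):
        return True
    return value.startswith("/") or value.startswith("./")


for candidate in ["homepage", "./homepage", "/homepage",
                  "http://example.org/homepage"]:
    print(candidate, is_allowed_id(candidate))
```

Under this rule the bare value "homepage" is rejected as an IRI, so it can only ever be read as a term, while "./homepage" stays unambiguous.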

lanthaler commented Jan 12, 2012

I don't think option 1 is viable, since it's often difficult for a service to figure out its own base IRI. For example, it would make serving a collection of JSON-LD documents over a CDN impossible.

Option 2 seems like a good idea because it makes the intent very clear. I, for one, would have classified the above example as correct; I would have seen that as my intent. But that's probably just because of the strange value "homepage".

Option 3 could work but, as you said, is problematic.

In summary, I tend to support option 2. Relative IRIs also seem to be problematic in another case, see issue #56; I think option 2 could solve that problem as well. The only open problem would then be the use of an unspecified prefix, which would be interpreted as a scheme, but I can't see anything we could do about that.

Btw, I think you misread the URI spec, if I'm not completely off:

relative-ref  = relative-part ...
relative-part = ... / path-noscheme / ...
path-noscheme = segment-nz-nc *( "/" segment )
segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims / "@" )
              ; non-zero-length segment without any colon ":"
unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"

So it can start with any character. The first segment just can't contain a colon.
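
As a rough illustration of the grammar above, the first-segment rule can be written as a regular expression. The helper name `valid_first_segment` is hypothetical, and the pattern covers only the ASCII URI grammar of RFC 3986, not the additional characters that IRIs permit.

```python
import re

# segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims / "@" )
SEGMENT_NZ_NC = re.compile(
    r"(?:[A-Za-z0-9\-._~!$&'()*+,;=@]|%[0-9A-Fa-f]{2})+"
)


def valid_first_segment(reference):
    """Check whether the first path segment of a relative reference
    is a non-empty segment that contains no colon."""
    first = reference.split("/", 1)[0]
    return SEGMENT_NZ_NC.fullmatch(first) is not None


print(valid_first_segment("homepage"))  # True
print(valid_first_segment("a:b"))       # False
print(valid_first_segment("./a:b"))     # True
```

The last call shows why a leading "./" helps: the first segment becomes ".", so the colon ends up in a later segment where it is allowed.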

dlongley commented Jan 12, 2012

+1 for option 2.

lanthaler commented Jan 23, 2012

Issue #46 might be related to this, especially the need for IRI normalization.

lanthaler commented Jan 24, 2012

2012-01-24 Telecon:

RESOLVED: The lexical space for keys in JSON-LD key-value statements is if a term: NCName, if a prefix: NCName for the prefix, otherwise the lexical space for an IRI.

RESOLVED: The lexical space for keys in JSON-LD Context key-value statements is if a term: NCName, if a prefix: NCName for the prefix, otherwise the lexical space for an absolute IRI.
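
One way to read the first resolution is as a purely lexical classification of keys. The sketch below is an interpretation, not text from the resolution: `classify_key` is a hypothetical helper, the NCName pattern is an ASCII-only approximation (real NCNames allow many more Unicode characters), and requiring a prefix to be defined in the context is an added assumption to tell prefixes apart from IRI schemes.

```python
import re

# ASCII-only approximation of an XML NCName (assumption for brevity).
NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def classify_key(key, context):
    """Classify a key as a term, a prefixed name, or an IRI."""
    if NCNAME.fullmatch(key):
        return "term"
    prefix, sep, _ = key.partition(":")
    if sep and NCNAME.fullmatch(prefix) and prefix in context:
        return "prefixed"
    return "iri"


ctx = {"foaf": "http://xmlns.com/foaf/0.1/"}
print(classify_key("homepage", ctx))       # term
print(classify_key("foaf:homepage", ctx))  # prefixed
print(classify_key("http://xmlns.com/foaf/0.1/homepage", ctx))  # iri
```

Note that a single-segment relative IRI such as "homepage" still falls into the term space under this reading, which is the ambiguity the issue describes.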

gkellogg added a commit that referenced this issue Jan 27, 2012

Add definitions for absolute IRI and relative IRI.
Add requirements that keys expand to absolute IRIs to be processed.
(It's an open issue of what to do with the value of keys that do not expand to absolute IRIs).
This partially addresses issue #49 and the resolution of 1/24/2012 http://json-ld.org/minutes/2012-01-24/#resolution-1

gkellogg commented Apr 9, 2012

Also completed in d3c4996.

@gkellogg gkellogg closed this Apr 9, 2012