what is null base URL for an embedded json-ld? #103

Closed
iherman opened this issue Dec 6, 2018 · 12 comments

Comments

@iherman
Member

iherman commented Dec 6, 2018

This issue came up in the Publishing Working Group; see https://github.com/w3c/wpub/issues/374. The question is what exactly the JSON-LD processor's behavior should be if, for some reason, the baseURI value for the <script> element is null. An example of such a situation is in https://github.com/w3c/wpub/issues/374#issuecomment-444537196, brought up by @danielweck.

@danielweck
Member

In the JSON-LD processing model, what happens when a relative "path" cannot be resolved to an absolute URL due to unsuitable baseURI (e.g. null, or data: URL)?

  1. complete failure (i.e. abort the processing algorithm)
    or:
  2. skip the unresolved URL (i.e. ignore the referenced resource), and continue processing.

https://github.com/w3c/wpub/issues/374#issuecomment-444798500

@ajs6f
Member

ajs6f commented Dec 6, 2018

I believe that a null base URI is explicitly allowed, at least in our editor's draft.

@ajs6f
Member

ajs6f commented Dec 6, 2018

My reading of what I believe to be the relevant piece of RFC makes a null base URI have all components undefined except an empty path.

@gkellogg
Member

gkellogg commented Dec 6, 2018

This is quite similar to what would happen to another RDF serialization, such as RDFa or Turtle, if there were no base IRI established. In those cases, processors will not generate absolute IRIs, which is invalid, but potentially useful. (My own processors allow this, unless set to a validate mode). Typically, as these are web specs, documents are considered to have a location which is the fallback for the base IRI. But, in some cases they don't.

The IRI Expansion Algorithm has two cases: if we are doing document relative expansion, or not. As @ajs6f points out, in the document relative case, RFC3986 applies, but the result will end up being a still relative IRI (provided that the input is not already an absolute IRI). The fall-through case is the same if there is no vocabulary mapping, and the relative IRI is returned.
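As an illustration of the document-relative case described above (not part of the original comment), here is a minimal Python sketch. It is not the JSON-LD IRI Expansion Algorithm itself: `expand_document_relative` is a hypothetical helper, the example URLs are made up, and the standard library's `urljoin` stands in for RFC 3986 reference resolution.

```python
from typing import Optional
from urllib.parse import urljoin, urlsplit


def is_absolute(iri: str) -> bool:
    """Treat any IRI that carries a scheme as absolute (a simplification)."""
    return bool(urlsplit(iri).scheme)


def expand_document_relative(value: str, base: Optional[str]) -> str:
    """Hypothetical helper: resolve `value` against `base` when possible.

    With no base IRI there is nothing to resolve against, so a relative
    value is returned unchanged and stays relative.
    """
    if is_absolute(value) or base is None:
        return value
    return urljoin(base, value)


# Assumed example inputs, not taken from the thread.
with_base = expand_document_relative(
    "chapter1.html", "https://example.org/pub/manifest.jsonld"
)
without_base = expand_document_relative("chapter1.html", None)
```

Under these assumptions `with_base` is an absolute IRI, while `without_base` is still the relative string `chapter1.html`, which matches the "still relative IRI" outcome described for expansion.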

Note that the Deserialize JSON-LD to RDF Algorithm will not emit invalid triples, which would include any that have relative IRIs, as they are not considered valid RDF. However, implementations may have leeway in what they actually do if not in strict validation mode. Otherwise, we're restricted by the RDF data model, which requires that IRIs be absolute (see RDF 1.1 Concepts):

An RDF triple consists of three components:

  • the subject, which is an IRI or a blank node
  • the predicate, which is an IRI
  • the object, which is an IRI, a literal or a blank node

An IRI (Internationalized Resource Identifier) within an RDF graph is a Unicode string [UNICODE] that conforms to the syntax defined in RFC 3987 [RFC3987].

IRIs in the RDF abstract syntax must be absolute, and may contain a fragment identifier.
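To make the effect of that constraint concrete (again an illustration, not part of the original comment), the sketch below filters out triples containing relative IRIs. It is a rough approximation under stated assumptions, not the Deserialize JSON-LD to RDF Algorithm: terms are plain strings, blank nodes are assumed to start with `_:`, literals are assumed to start with a double quote, and `drop_relative_triples` is a hypothetical name.

```python
from typing import Iterable, List, Tuple
from urllib.parse import urlsplit

Triple = Tuple[str, str, str]


def is_valid_term(term: str) -> bool:
    """Accept blank nodes, literals, and IRIs that carry a scheme."""
    if term.startswith("_:") or term.startswith('"'):
        return True  # assumed string encodings for blank nodes and literals
    return bool(urlsplit(term).scheme)


def drop_relative_triples(triples: Iterable[Triple]) -> List[Triple]:
    """Keep only triples whose three terms are all valid RDF terms."""
    return [t for t in triples if all(is_valid_term(term) for term in t)]


# Assumed example data: only the first triple has no relative IRI.
triples = [
    ("https://example.org/pub", "http://schema.org/name", '"Moby Dick"'),
    ("chapter1.html", "http://schema.org/name", '"Chapter 1"'),
    ("_:b0", "http://schema.org/url", "chapter1.html"),
]
kept = drop_relative_triples(triples)
```

Here `kept` contains only the first triple; the other two are silently discarded because `chapter1.html` could not be made absolute.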

I'm not clear that we should say anything more about it in the spec.

@ajs6f's assertion that it is explicitly allowed is not generally true, as the cited error message is specifically for values of @base, and setting @base to null unassigns the value, reverting to the document location.

@iherman
Member Author

iherman commented Dec 7, 2018

@gkellogg what is not clear to me (sorry, I am not familiar with all the details of the API spec) is whether an implementation is supposed to raise the error and exactly under what circumstances.

The issue came up in the WPUB work (raised by @danielweck): what should a processor handling a Web Publication Manifest do if an embedded manifest does not have a valid base value? My approach is that this behavior should be in line with what a JSON-LD processor would do in general...

@gkellogg
Member

gkellogg commented Dec 7, 2018

No, the API doesn’t define such an error, nor do we test for it. It would currently create relative IRIs in expansion, and would drop triples in toRdf.

What behavior would be preferred?

@iherman
Member Author

iherman commented Dec 7, 2018

I must admit I do not have a very clear idea either.

Looking at the alternatives by @danielweck (#103 (comment)), I guess both of them go one step further. I think aborting the full processing is probably too drastic; I am not sure what "skipping the unresolved URL" would mean for the different APIs. For RDF generation I guess this would mean skipping all relevant triples; I am not sure about expansion and compaction.

I think that, at the minimum, a warning should be issued, fwiw. As you said, generating a relative URL in RDF indeed leads to non-conformant (i.e., possibly erroneous) RDF data.

Obviously, this is a very small corner case.
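
For illustration only, the kind of warning suggested above might look like the following Python sketch. Nothing here is defined by the JSON-LD API: `RelativeIRIWarning` and `check_iri` are hypothetical names, and the scheme test is a simplification of "absolute IRI".

```python
import warnings
from urllib.parse import urlsplit


class RelativeIRIWarning(UserWarning):
    """Hypothetical category; not defined by any JSON-LD spec or library."""


def check_iri(iri: str) -> bool:
    """Return True for an IRI with a scheme; otherwise warn and return False."""
    if urlsplit(iri).scheme:
        return True
    warnings.warn(
        f"relative IRI {iri!r} could not be resolved (no usable base); "
        "triples using it would not be valid RDF",
        RelativeIRIWarning,
        stacklevel=2,
    )
    return False


# Record the warnings instead of printing them, to show what is raised.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ok_absolute = check_iri("https://example.org/pub/chapter1.html")
    ok_relative = check_iri("chapter1.html")
```

A caller could use the boolean to decide whether to skip a triple, while the warning makes the otherwise silent drop visible.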

@dlongley
Contributor

dlongley commented Dec 7, 2018

It is useful behavior to be able to expand/compact without modifying URLs. My understanding is that when the base URL is explicitly set to null, a processor will skip URL processing and leave the values untouched during expansion/compaction. A base URL must be specified to convert to RDF (for conformance reasons). So I think what we've spec'd and what (I believe) the processors currently do is the right behavior.

@iherman
Member Author

iherman commented Dec 7, 2018

@dlongley is there any warning/error raised by the processor?

@dlongley
Contributor

dlongley commented Dec 7, 2018

@iherman, no warning or error raised.

@iherman
Member Author

iherman commented Dec 8, 2018

I believe there should be a warning at least. At the minimum when RDF is generated.

@BigBlueHat changed the title from "what if null base URL for an embedded json-ld?" to "what is null base URL for an embedded json-ld?" on Feb 21, 2019

@iherman
Member Author

iherman commented Aug 9, 2019

This issue was discussed in a meeting.

  • RESOLVED: close #103 as wontfix because the JSON-LD document would be treated as if it were alone–and not embedded.
Transcript: What is null base URL for an embedded json-ld?
Ivan Herman: See Issue 103
Benjamin Young: the question is raised when JSON-LD is embedded in HTML or anything.
Ivan Herman: someone came up with this example involving data: URL and iframe,
… leading to a situation where the base URL is not clear.
Gregg Kellogg: the fact that it is in HTML does not change anything,
… the processor tries to determine the base URL based on the source document URL.
Proposed resolution: close #103 as wontfix because the JSON-LD document would be treated as if it were alone–and not embedded. (Benjamin Young)
Ivan Herman: the problem is the use of the data: URL. The question is not JSON-LD specific. It would be the same with HTML
Gregg Kellogg: this is a TAG issue
Ivan Herman: It is a HTML problem. Whatever HTML does, we do it.
Dave Longley: +1
Ruben Taelman: +1
Gregg Kellogg: +1
Benjamin Young: +1
Benjamin Young: and if HTML has no answer wrt to the base URL, then you are on your own.
Ivan Herman: +1
Pierre-Antoine Champin: +1
Tim Cole: +1
Resolution #4: close #103 as wontfix because the JSON-LD document would be treated as if it were alone–and not embedded.
