Expanding @vocab properties consistently #56

Closed
pjohnston-wiley opened this issue Aug 22, 2018 · 10 comments

Comments

@pjohnston-wiley
Contributor

(I wasn't sure if this was a syntax or an API issue. If i got the wrong one, please let me know and i will refile)

Consider the following:

{
  "@context": {
    "@base": "http://example1.com/",
    "ex1": "http://example1.com/",
    "ex2": "http://example2.com/",
    "id": "@id",
    "fred": {
      "@id": "ex1:fred",
      "@type": "@vocab"
    },
    "barney": {
      "@id": "ex2:barney"
    },
    "mnemonic": "rdf:value"
  },
  "fred": [
    {
      "id": "barney",
      "mnemonic": "the sidekick"
    },
    "barney"
  ]
}

The counterintuitive result of expansion is:

[
  {
    "http://example1.com/fred": [
      {
        "@id": "http://example1.com/barney",
        "rdf:value": [
          {
            "@value": "the sidekick"
          }
        ]
      },
      {
        "@id": "http://example2.com/barney"
      }
    ]
  }
]

In other words, one barney is resolved against the top-level @base declaration, while the other is resolved through the explicit barney alias, because the fred property is of type @vocab.

I understand why this happens: the spec has different algorithms for strings on the right of a property vs objects, but it's a bit odd. To the lay reader (i.e. me), it seems like an inconsistent application. If the IRI expansion algorithm were able to take into account explicit aliases for @vocab properties, it could be made consistent, and a similar application to compaction would deliver a symmetric behavior.

The context, in terms of practical application, is that we maintain an overarching RDF model to govern our data and i would like the folks implementing our APIs to know as little about the RDF-ness of it as possible, which is where JSON-LD comes in. By using @vocab i was able to get implementers to omit a namespace qualifier on plain taxonomy values (the second barney in the original example), but as soon as they want to include anything richer they have to qualify it. Using the same context, to guarantee semantic consistency the example would have to be written as:

{
    "fred": [
      "barney",
      {
         "id": "ex2:barney",
         "mnemonic": "the sidekick"
      }
    ]
}

i.e. we have to qualify the second barney with ex2 for it to work consistently, which means exposing API developers to the RDF-ness of our model, or breaking the connection.

@gkellogg
Member

The reason for this is that @id, @type and properties are expanded using different bases. Typically, @id refers to something in the document, a property something in the vocabulary, and @type may be either. Thus, when you see "barney" as a string value of "fred", it is looked up relative to the vocabulary, as you have "@type": "@vocab" in the term definition for "fred"; if you used "@type": "@id", it would be document relative. When the algorithm sees "id": "barney", it is always interpreted document relative.

One way to fake out @id resolution is to use Compact IRIs. You can also make @vocab expand to be document relative in 1.1 by setting "@vocab": "".
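
As a rough illustration of the two resolution spaces described above, here is a minimal Python sketch. It is not the expansion algorithm from the spec: `expand_iri`, `TERMS` and `BASE` are hypothetical names, and the context from the first example has been flattened by hand into a plain term-to-IRI mapping. It only models the choice between looking a string up as a term (vocabulary space) and resolving it against the base (document space).

```python
from urllib.parse import urljoin

# Hypothetical, hand-flattened view of the context in the first example:
# every term is mapped straight to its expanded IRI.
TERMS = {
    "ex1": "http://example1.com/",
    "ex2": "http://example2.com/",
    "fred": "http://example1.com/fred",
    "barney": "http://example2.com/barney",
}
BASE = "http://example1.com/"


def expand_iri(value, terms, base, vocab_relative, vocab=None):
    """Simplified sketch of IRI expansion; not the full spec algorithm."""
    # Terms are only consulted when expanding in vocabulary space.
    if vocab_relative and value in terms:
        return terms[value]
    # Compact IRIs (prefix:suffix) are honoured in both spaces.
    if ":" in value:
        prefix, suffix = value.split(":", 1)
        if prefix in terms and not suffix.startswith("//"):
            return terms[prefix] + suffix
        return value  # treated as an absolute IRI already
    # A default vocabulary, if any, applies only in vocabulary space.
    if vocab_relative and vocab is not None:
        return vocab + value
    # Otherwise resolve against the document base.
    return urljoin(base, value)


# "barney" as a string value of fred ("@type": "@vocab"): vocabulary space.
print(expand_iri("barney", TERMS, BASE, vocab_relative=True))
# "barney" as the value of id/@id: document space, the term is ignored.
print(expand_iri("barney", TERMS, BASE, vocab_relative=False))
# A compact IRI is resolved through its prefix even in document space.
print(expand_iri("ex2:barney", TERMS, BASE, vocab_relative=False))
```

Under these assumptions the first and third calls yield http://example2.com/barney and the second yields http://example1.com/barney, which is the split seen in the expansion result at the top of the issue. Passing `vocab=BASE` loosely mimics the `"@vocab": ""` suggestion for terms that are not defined in the context.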

As for the why of this, it comes from the observation that documents and vocabularies are really different things, and the expansion rules need to consider this. Also, @vocab is a powerful feature that can lead to such confusing results.

Here are some links to explore for the history:

Turtle avoided this by always resolving relative IRIs to the base, and requiring prefixes (including the empty string "") for vocabulary terms. In JSON, the practice of using bare words as keys meant that we needed to interpret the intention of the JSON author to know what typically is used as a vocabulary term vs what is used for something within the document.

@azaroth42
Contributor

azaroth42 commented Aug 24, 2018

I think the resolution to not fix w3c/json-ld-api#7 would be a precedent for close won't fix on this issue as well.

@pjohnston-wiley
Contributor Author

Thanks @gkellogg for taking the time to provide the why and the suggestions. Unfortunately, they don't really help with the issue at hand (#wontfix notwithstanding). I'm not sure what the blank @version feature would do here - it's somewhat weird looking, and it doesn't appear to be implemented anywhere yet?

I am fine with making the @context more complicated – in our case it is generated programmatically anyway. What i am less inclined to do is make our API developers have to consider RDF-ness in an API context.

@azaroth42 One thing that allowed you to close issue 7 was that there was a way to solve the problem using scoped contexts. So far i don't have a way to address this particular scenario.

So... I went down a couple of rabbit holes.

Overriding @id behavior

Setting id to be of @type @vocab is syntactically valid, however the implementations currently ignore it. This would actually make some sense to me.

{
  "@context": {
    "@base": "http://example3.com/",
    "ex1": "http://example1.com/",
    "ex2": "http://example2.com/",
    "id": {
       "@id": "@id",
       "@type": "@vocab"
    },
    "fred": {
      "@id": "ex1:fred",
      "@type": "@vocab"
    },
    "barney": {
      "@id": "ex2:barney"
    },
    "mnemonic": "rdf:value"
  },
  "fred": [
    {
      "id": "barney",
      "mnemonic": "the sidekick"
    },
    "barney"
  ]
}

Scoped context

Next i went the way of scoped contexts. This is not particularly elegant, but will work for the specific use case where the name of the alias matches the name portion of the IRI, and where you mostly operate under a single namespace IRI for the given property:

{
  "@context": {
    "@version": 1.1,
    "@base": "http://example3.com",
    "ex1": "http://example1.com/",
    "ex2": "http://example2.com/",
    "id": "@id",
    "fred": {
      "@id": "ex1:fred",
      "@type": "@id",
      "@context": {"@base": "http://example2.com/"}
    },
    "mnemonic": "rdf:value"
  },
  "fred": [
    {
      "id": "barney",
      "mnemonic": "wilma"
    },
    "barney"
  ]
}

Note i added example3.com to illustrate how this differs from the global default.

@gkellogg this only works on the dev playground - am i right that your distiller doesn't yet support 1.1 features?

However, outside of this narrow use case, you are stuck. For example, if you want to alias barney to ex2:betty there is no way to support this as far as i can tell.

@azaroth42 azaroth42 added this to Discuss-Call in JSON-LD Management Aug 27, 2018
@gkellogg
Member

> I'm not sure what the blank @version feature would do here - it's somewhat weird looking, and it doesn't appear to be implemented anywhere yet?

You probably meant "blank @vocab feature"; it sets the default vocabulary to be the same as the document base, which would mean that otherwise undefined terms would expand the same way as @id values, so it sort of gets what you want; if you don't use undefined terms, then this isn't an issue. It should be implemented in my processor, available here, although it may need an update. The distiller probably supports the most 1.1 features of any such online service, but it needs to be updated when the underlying gems change; I can look at that.

> Setting id to be of @type @vocab is syntactically valid, however the implementations currently ignore it. This would actually make some sense to me.

We specifically discussed this and decided to not allow changing the processing model of keywords such as @type and @id, other than to change their container behavior.

If you're in the case where you want the values of @id to be vocabulary relative, it sounds like you're defining a vocabulary, so it would make sense that @base be the same as @vocab, which you could either do for the document as a whole, or in a scoped or embedded context.

Your second example should do what you want.

@azaroth42 azaroth42 moved this from Discuss-Call to Discuss-F2F in JSON-LD Management Sep 28, 2018
@iherman
Member

iherman commented Oct 1, 2018

This issue was discussed in a meeting.

  • No actions or resolutions
Transcript: 4.2. @vocab property expansion
Rob Sanderson: Link: #56
Rob Sanderson: This one is about the expansion of @vocab properties and the issue claims that it is counter-intuitive
… from my perspective, it goes back to #7 (won’t-fix) that the processing model for @type shouldn’t be able to be customized
… more than allowing @container: @set
… to avoid monkeying around and overcomplicating
… the response is that that fixes #7, but you can’t fix this issue with scoped contexts
… but Gregg has an example that fixes it…
Gregg Kellogg: well…..
… his feeling is that he’s willing to do things in a context to hide this, but @base is restricted from being in a remote context, and that doesn’t satisfy him
… this is about discrepancies between vocabularies space and document space, where you can use @vocab: “”
… and that might get to a point where you’re not surprised, but there’s overlap with Manu’s blank-nodes as properties, where @vocab is a solution, but not a wide enough one
… maybe we need to make that a bit wider, make it any relative IRI relative to the document base
… in 1.0, you can generate an error with this, and we could make it document relative, which widens what we can put in @vocab, which handles both this and Manu’s request
Rob Sanderson: can you talk through why the first Barney expands this way?
Gregg Kellogg: if we look at Barney, its value is a node definition. In any context, ex2:barney evaluates to the URL
… But fred has the value, with the value parent
… string values are intended to be treated as an IRI within the vocab space, so if barney is the string value of fred, we evaluate barney as vocab-relative
… there is no @vocab defined, but barney is a term, so we will now interpret them as IRIs, so it’s got a value of ex2:barney
… so when we evaluate things in the vocabulary space, it will look to terms and compact IRIs
… so it satisfies the term use case
… when we look at it as ID, which makes it @id, which is in doc space, and we do not interpret things in that space relative to the vocabularies
… in document space, it cannot be a bare term, so we don’t evaluate it
… it could have been barney:, which would work.
… he asked why ID can’t be treated in the vocab space, and that goes back to our decision
… why didn’t we want terms to be values?
… since we allow compact terms?
… typically, where you see this conflation, you are closer to being in the use case where you’re defining a vocabulary
… where it does make sense
Rob Sanderson: was that clear?
Ivan Herman: so I’m not ashamed to say, no.
… It is because it is not complicated, but what this means for us and readers, is that expansions, and the bifurcations, we should have that in a note and understand it
… frankly, I don’t have a clear idea how things work
… the other comment is that we should have a rule that when an issue is put in, and when it has an example, the example should be reduced.
… from the point of this issue, there are additional things that make things more complicated
Rob Sanderson: +1
Ivan Herman: we should really ask them to make it more simple and to be careful, since much here is not relevant
Adam Soroka: whatever comes out of this, we might really need a note that lays out the spaces around expansion, since the suspicion is that this arises because it’s really hard to understand this
Gregg Kellogg: I think that the API document is fairly consistent and clear about this, and the people who are the most likely to be confused are unlikely to read this
… so we’re stuck with the JSON vocab, in Turtle there are pnames or IRIs. In JSON, we can’t do that, so we need to discern the intent around meaning
… thus the bifurcation between the spaces
… it’s laid out, and it’s easy to get lost, and adding another note in syntax documents.
… how much detail do we need to get into?
… we need a primer
… and we need a champion for that primer
Ivan Herman: I get the problem, and it’s not 1.0, it’s just…terrible.
… and I don’t think it needs to be in the specification, and a primer would be the ideal place for it.
… authors will never read the API, and I don’t want to. I just want to put data in JSON-LD
… that’s why the primer is required.
… otherwise people will be lost
Jeff Mixter: the explanation is helpful, and it would be helpful to have the example to be recreated to be realistic, and it’s really confusing with freds and barneys
… it might be better to have real-world examples
Rob Sanderson: a complication with #33
Rob Sanderson: w3c/json-ld-api#33
Rob Sanderson: It would be nice to have the URIs compact to not just ID, but also to @id. Doesn’t this mean changing the space?
Gregg Kellogg: the way I take this option is to not compact to allow properties, but not values
… that things would remain in their expanded state.
… doing this wouldn’t add additional space
… it’s how you interpret string values. If you’ve got an object, it’s unambiguous
Rob Sanderson: If the context wanted ID: barney to be compacted to id: barney, and another term compacted to a string, then those two otherwise identical lines would have different value spaces
Gregg Kellogg: if you were able to specify the value space of a term, as in the first comment of #33…I think the issue comes down to wanting to specify for a property when compacting, you want to compact the term, but not the value.
… if you do this, it continues to be unambiguous, …something about strings…
Rob Sanderson: is there a proposal?
Gregg Kellogg: do we need an issue to change the rules about @vocab to use relative IRIs in document space
… I can create that issue
Ivan Herman: I would like to move this to face-to-face
Gregg Kellogg: sure
Rob Sanderson: sure
Adam Soroka: could we leave a note for the primer?
… maybe a label?
… about the spaces?
Gregg Kellogg: we should create an issue for the primer about that
Rob Sanderson: gregg, create the issue about vocabs, adam, about the primer, and we’ll defer this to face-to-face
Adam Soroka: +1 to that!

@iherman
Member

iherman commented Oct 26, 2018

Here is the minimal example that has the problem area focused on only:

{
  "@context": {
    "fred": {
      "@id": "http://a.b.c",
      "@type": "@vocab"
    },
    "barney": {
      "@id": "abc:def"
    }
  },
  "fred": "barney"
}
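
To make the behaviour of this minimal example concrete, here is a small Python sketch of how the string value is handled. It is a simplification and not the spec's expansion algorithm: `expand_minimal`, `MINIMAL` and the base `http://example.org/doc/` are hypothetical, and the function only handles flat documents whose values are plain strings and whose terms all use the expanded `{"@id": ...}` definition form.

```python
from urllib.parse import urljoin

# The minimal example, written as a Python dict.
MINIMAL = {
    "@context": {
        "fred": {"@id": "http://a.b.c", "@type": "@vocab"},
        "barney": {"@id": "abc:def"},
    },
    "fred": "barney",
}


def expand_minimal(doc, base):
    """Expand a flat document with string values (simplified sketch)."""
    ctx = doc["@context"]
    # Map every term to the IRI given in its definition.
    term_iris = {term: definition["@id"] for term, definition in ctx.items()}
    expanded = {}
    for key, value in doc.items():
        if key == "@context":
            continue
        coercion = ctx[key].get("@type")
        if coercion == "@vocab" and value in term_iris:
            # Vocabulary space: the string matches a term, so the
            # term's IRI is used.
            node = {"@id": term_iris[value]}
        elif coercion in ("@vocab", "@id"):
            # No matching term (or "@type": "@id"): resolve against the base.
            node = {"@id": urljoin(base, value)}
        else:
            # No type coercion: the string stays a plain literal.
            node = {"@value": value}
        expanded[term_iris[key]] = [node]
    return [expanded]


print(expand_minimal(MINIMAL, "http://example.org/doc/"))
```

With `"@type": "@vocab"` on fred, the sketch picks up the barney term definition and produces `abc:def` as the value's `@id`. Changing the coercion to `"@type": "@id"` makes the same string resolve against the (assumed) base instead, which is the contrast discussed earlier in the thread.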

@iherman
Member

iherman commented Oct 26, 2018

See also http://tinyurl.com/y78bzssn

@gkellogg gkellogg moved this from Discuss-F2F to Discuss-Call in JSON-LD Management Oct 27, 2018
@iherman
Member

iherman commented Oct 27, 2018

This issue was discussed in a meeting.

  • RESOLVED: Make #56 editorial to add a "@type":"@vocab" example and otherwise no change
Transcript:
Gregg Kellogg: the concern is that the same term expands differently in different places
… gkellogg then describes the example in the issue
Rob Sanderson: the example has two triples in it:
Benjamin Young: triples from the playground from the first example in #56
Benjamin Young: <http://example1.com/barney> <rdf:value> "the sidekick" .
Benjamin Young: _:b0 <http://example1.com/fred> <http://example1.com/barney> .
Benjamin Young: _:b0 <http://example1.com/fred> <http://example2.com/barney> .
Ivan Herman: I would say that in the second case (barney is a string) it’s the defn of barney that comes in
… nothing to do with that fred is defined by vocab
Gregg Kellogg: no, it does have to do with
… our rules are intentionally different for different kinds of values
… the complaint is that the rules for resolving IRIs vary
… in the example the value space of fred is vocab
… the fact that there is an entry in the vocab space for barney means nothing when we use barney as an ID.
Adam Soroka: that’s the two different spaces
Gregg Kellogg: unless it happens to be used in a CURI
Ivan Herman: this is confusing
Gregg Kellogg: in 1.0 there is a distinction between the vocabulary (properties, types) and values, which come from the document.
… if you have a term in the vocab with a name that is used as instance data, in each position they are resolved differently.
Benjamin Young: a remix of the example of #56 using 2 contexts in http://tinyurl.com/y7wpjogw
Gregg Kellogg: we’re getting back into deep history of 1.0 here. We felt there are good reasons to make this distinction, but we may have been wrong, of course
Adam Soroka: ivan is now showing examples on the projector, see http://tinyurl.com/yclwe2ax
Gregg Kellogg: when you have a doc, you ascribe meaning by bringing properties and types to it.
… but you would want to resolve relative value URIs differently than vocab terms
Rob Sanderson: but this kind of problematic situation shouldn’t arise
Benjamin Young: sometimes you don’t get a choice
Rob Sanderson: you should be able to avoid it by inverting the order of multiple contexts, if they are colliding in this way between them
Ivan Herman: in the first example with barney, http://example1.com/ is used for base and the first appearance of fred
Gregg Kellogg: the fact that barney is a term in the context doesn’t affect value expansion
Ivan Herman: if I change the base, various changes result
… we could pare down the example
Benjamin Young: this could confuse naive devs
Rob Sanderson: you should never get this to begin with
… what would be the use of interpreting barney as the same thing in the example
… you wouldn’t construct that data.
Benjamin Young: [uses his hands to construct a list of annotations]
… one past use case we’ve gotten is when you start with a list of ids, then you reference down into a list of inclusions.
… the use case from which this current issue came is gathering that up into a packaged graph with lots of graphs about the same stuff
Rob Sanderson: http://tinyurl.com/yb3olq5y
Benjamin Young: in JSON-LD we are constantly using many contexts, that raises the likelihood of this
Adam Soroka: more discussion about whether or not this situation can realistically arise
Rob Sanderson: the data in the OP example is meaningless
Adam Soroka: ivan brings more examples to the projector
Ivan Herman: here is a minimal situation to display the problem
Ivan Herman: A simpler version of the example:
{
  "@context": {
    "fred": {
      "@id": "http://a.b.c",
      "@type": "@vocab"
    },
    "barney": {
      "@id": "abc:def"
    }
  },
  "fred": "barney"
}
(see http://tinyurl.com/y78bzssn)
Adam Soroka: agreement that ivan’s new example minimally demonstrates the problem
Gregg Kellogg: why does one want to create terms with '@type':"@vocab" to begin with?
Ivan Herman: we should have a very clear example of this and explanation of this in the docs
Gregg Kellogg: I think we have examples
Ivan Herman: with lots of explanation
Gregg Kellogg: we have a document space and a vocab space, and the only time document terms get resolved in vocab space is in compact IRIs
… and when the term used as the property is defined as '@type':"@vocab"
Benjamin Young: this example wasn’t contrived—it came out of extant tooling
Adam Soroka: disagreement about how responsible users are to avoid these kinds of collisions when using other people’s contexts
Ivan Herman: Patrick (the OP) is a very expert user. What about the less-expert users?
… it’s a return to what I earlier said: what if I am semweb literate, and I come to this, and I can’t make my graph work sanely in JSON-LD
Benjamin Young: even with Patrick’s expertise he hit this problem, and not by trying to develop a curious example
… it was normal work
Gregg Kellogg: this might stem from divergent missions
… surprising that someone trying to turn Turtle into JSON-LD doesn’t find that obvious
Ivan Herman: might need to push this into the primer
… the doc as is tried to make it usable for JSON devs
… we may need to write from the other direction
… for RDF folks coming to JSON
Rob Sanderson: in Patrick’s defense, there is no good example for this
Gregg Kellogg: I can try to write this up
… one concern: we are still getting to “own” JSON-LD
… we all need to get to an adequate mastery level to be able to talk about this
… we could try assigning issues to people who raise them
Proposed resolution: Make #56 editorial to add a "@type":"@vocab" example (Rob Sanderson)
Adam Soroka: +1
Proposed resolution: Make #56 editorial to add a "@type":"@vocab" example and otherwise no change (Rob Sanderson)
Rob Sanderson: +1
Simon Steyskal: +1
Benjamin Young: +1
Gregg Kellogg: +1
Adam Soroka: +1
Harold Solbrig: +1
Resolution #4: Make #56 editorial to add a "@type":"@vocab" example and otherwise no change

@gkellogg gkellogg moved this from Discuss-Call to Editorial Work in JSON-LD Management Oct 29, 2018
@gkellogg gkellogg moved this from Editorial Work to Editorial work complete in JSON-LD Management Dec 6, 2018
@iherman
Member

iherman commented Dec 8, 2018

This issue was discussed in a meeting.

  • RESOLVED: close issues 56, 77, and 102 as having been addressed by recent merges
Transcript:
Gregg Kellogg: issue 56 - this was looking for examples to indicate unexpected behavior when type is a vocab
… i think ball was in Ivan’s court whether the examples were sufficient
… we are not closing the issue until after the PR, so question is whether we are ready to close
… we added 2 examples
… next issue is 77, disambiguate uses of @type
… Webex seems to have some problems with this meeting, especially if changing focus on my computer
… we discussed adding @datatype, instead decided to add text around different uses of @type
… are people satisfied with wording?
Gregg Kellogg: use of @none regarding language
… in seeming conflict with requirement that language has to be string, @none didn’t specify
… fixed that wording making clear that @none has to be a string
Benjamin Young: these are tagged proposed closing
… we need to decide if ready to close
Adam Soroka: I wanted to mention the @datatype issue - I’m okay with this, but we may need a GitHub tag for whenever issues about @type come up
… this will be a way to verify that spec wording update is enough to minimize confusion
Gregg Kellogg: w3c/json-ld-wg#24
Gregg Kellogg: I did raise issue 24 about surfacing the minutes on this topic like we did in Community Group
… could we do this?
… was useful in community group
Ivan Herman: minutes cleanup is done by me, and summarizing each meeting may not be a good idea
Gregg Kellogg: there is a scribe tool that builds summaries and indexes automatically
Benjamin Young: it works on issues and action lists at top/bottom of each meeting minutes
Gregg Kellogg: the good thing is it aggregates all minutes on each issue
Benjamin Young: I will take an action to investigate to see if we can build index of minutes by issue
… I think this is a good idea
Gregg Kellogg: will work with bigbluehat on this since I worked on CG
Benjamin Young: can we close these 3 issues
… 56, 77 and 102 ?
Ivan Herman: can we do a resolution to close these 3 issues?
Proposed resolution: close issues 56, 77, and 102 as having been addressed by recent merges (Benjamin Young)
Ivan Herman: +1
Gregg Kellogg: +1
Tim Cole: +1
Benjamin Young: +1
David Newbury: +1
Pierre-Antoine Champin: +1
Resolution #1: close issues 56, 77, and 102 as having been addressed by recent merges

@iherman iherman closed this as completed Dec 8, 2018
@gkellogg gkellogg removed this from Editorial work complete in JSON-LD Management Dec 8, 2018
pjohnston-wiley added a commit to WileyLabs/json-ld-syntax that referenced this issue Dec 9, 2018
@pjohnston-wiley
Contributor Author

Belatedly, thank you all for the time you spent on this. Closing. chz p
