Indexing without a predicate #19

Open
gkellogg opened this issue Jun 30, 2018 · 20 comments

Comments

@gkellogg
Member

commented Jun 30, 2018

For consideration by the JSON-LD 1.1 WG...

Assuming a nested set of resources where leaf nodes are frequently repeated, it is difficult to find the definition of the node after compaction. Imagine a classification that is used on the second item in a list, and again on the 26th. It would be nice to have a place to look up the label for the classification, instead of repeating it on both 2 and 26. Similarly, information about repeated people, services, or anything else could benefit from this pattern.

As prior art, and use case for inclusion, JSON API has the notion of "included" -- a slot where you can put resources that are included in others, such that developers can always know where to find them. In my work, this has come up with repeated services in IIIF, and classifications, people and places when describing the cultural heritage objects they relate to.

The identifier map pattern is already in this space, but insufficient as it requires a predicate to map to, and the relationship is to a resource somewhere nested in the data structure, not the top level resource. There would also need to be framing support as an extension to @embed:@never such that the inclusions were not embedded in the object data, but with a pointer to where they should go.

Example data:

{
  "id": "1",
  "type": "eg:Thing-with-Items",
  "eg:items": [
    {
      "id":"2",
      "classification": "enum:c6",
      "service": "enum:s2"
    },
    { "id": "3...26 go here", "type": "eg:X"}, 
    {
      "id": "27",
      "classification": "enum:c6"
    }    
  ],
  "included": {
    "enum:c6": {"type": "eg:Type", "label": "Classification 6"},
    "enum:p1": {"type": "eg:Person", "label": "Person 1"},
    "enum:s2": {"type": "eg:Service", "label": "Login Service"}
  }  
}

Playground example with identifier map: http://tinyurl.com/yd5z87xg
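
To make the lookup benefit concrete, here is a minimal Python sketch, assuming the example data above has been loaded as a plain dict. The helper name `label_for` is hypothetical and not part of any JSON-LD library, and no JSON-LD processing happens here. It only illustrates that, with a known `included` slot, finding the label for a classification is a single dictionary lookup rather than a walk over items 2 through 27.

```python
# Illustrative only: plain-dict handling of the example data, no JSON-LD processing.
doc = {
    "id": "1",
    "type": "eg:Thing-with-Items",
    "eg:items": [
        {"id": "2", "classification": "enum:c6", "service": "enum:s2"},
        {"id": "27", "classification": "enum:c6"},
    ],
    "included": {
        "enum:c6": {"type": "eg:Type", "label": "Classification 6"},
        "enum:p1": {"type": "eg:Person", "label": "Person 1"},
        "enum:s2": {"type": "eg:Service", "label": "Login Service"},
    },
}


def label_for(document, reference):
    """Return the label of an included resource, or None if it is absent."""
    node = document.get("included", {}).get(reference)
    return None if node is None else node.get("label")


# Both items resolve their classification through the same slot.
labels = [label_for(doc, item["classification"]) for item in doc["eg:items"]]
print(labels)
```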

The inclusion term could either be a new keyword like @id (@included or @inclusions) that was then re-aliased in the context (to, e.g. included), or it could be a new keyword value for @container (included: {@container: @included}). I think the former is the (slightly) better design, as it makes it more obviously a field rather than a data structure. It would only be usable in a resource that is not nested within other resources (e.g. the top level JSON object ... which might be in an array or @graph). Framing could then use the same keyword: @embed: @included.

Original issue: Indexing without a predicate #650

@gkellogg

Member Author

commented Aug 30, 2018

I wonder if this could leverage the @nest capability, which if you'll recall, allows you to nest properties of a node under an intermediate property. If that value were a node reference, then that could mean to apply the properties of the referenced node to the referencing node. For example:

{
  "@context": {
    "@vocab": "http://example/",
    "id": "@id", 
    "type": "@type",
    "eg": "http://example/",
    "classification": {"@type": "@id", "@nest": "@id"},
    "included": {"@container": "@id"}
  },
  "id": "1",
  "type": "eg:Thing-with-Items",
  "eg:items": [
    {
      "id":"2",
      "classification": "enum:c6",
      "service": "enum:s2"
    },
    { "id": "3...26 go here", "type": "eg:X"}, 
    {
      "id": "27",
      "classification": "enum:c6"
    }    
  ],
  "included": {
    "enum:c6": {"type": "eg:Type", "label": "Classification 6"},
    "enum:p1": {"type": "eg:Person", "label": "Person 1"},
    "enum:s2": {"type": "eg:Service", "label": "Login Service"}
  }  
}

Here, the "classification": {"@type": "@id", "@nest": "@id"} could signal that nested properties are found through the referenced id, and that strings are interpreted as IRIs. This could expand to something like the following:

[{
  "@id": "1",
  "@type": ["http://example/Thing-with-Items"],
  "http://example/items": [{
    "@id": "2",
    "@type": ["http://example/Type", "http://example/Service"],
    "http://example/label": [
      {"@value": "Classification 6"},
      {"@value": "Login Service"}
    ]
  }, {
    "@id": "2",
    ...
  }, {
    "@id": "27",
    "@type": ["http://example/Type"],
    "http://example/label": [{"@value": "Classification 6"}]
  }],
  "http://example/included": [{
    "@type": "http://example/Type", "http://example/label": [{"@value": "Classification 6"}]
  }, {
    "@type": "http://example/Person", "http://example/label": [{"@value": "Person 1"}]
  }, {
    "@type": "http://example/Service", "http://example/label": [{"@value": "Login Service"}]
  }]
}]

Of course, there may be issues with this, but it leverages the nesting concept and keyword and avoids the need for a new keyword.
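
The "apply the properties of the referenced node to the referencing node" idea can be sketched in plain Python. This is only an illustration under stated assumptions: `merge_referenced` is a hypothetical helper, it works on compacted dicts without any context processing, and it silently ignores references that have no entry in the included map (the thread does not settle that question).

```python
def merge_referenced(node, included, ref_props):
    """Copy types and properties of referenced included nodes onto `node`.

    Assumption: a reference with no entry in `included` is ignored.
    """
    # Drop the referencing properties themselves; they only act as pointers.
    result = {k: v for k, v in node.items() if k not in ref_props}
    existing = result.pop("type", [])
    types = [existing] if isinstance(existing, str) else list(existing)
    for prop in ref_props:
        target = included.get(node.get(prop))
        if target is None:
            continue
        for key, value in target.items():
            if key == "type":
                types.append(value)
            else:
                result.setdefault(key, []).append(value)
    if types:
        result["type"] = types
    return result


included = {
    "enum:c6": {"type": "eg:Type", "label": "Classification 6"},
    "enum:s2": {"type": "eg:Service", "label": "Login Service"},
}
item = {"id": "2", "classification": "enum:c6", "service": "enum:s2"}
merged = merge_referenced(item, included, ["classification", "service"])
print(merged)
```

The merged item carries both types and both labels, which mirrors what the expansion above shows for the node with id 2.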

@workergnome


commented Sep 1, 2018

I can say that this is a feature that would make my life significantly better--we've been using @embed: @always to deal with this which is not an optimal solution. And it would be really, really nice to have more alignment between JSON-LD and JSON-API, and this is one of the major differences that makes that hard to do.

@azaroth42

Contributor

commented Sep 4, 2018

Per json-ld-api#33, I prefer a new keyword rather than overloading permutations of existing keywords, which has proven confusing in the past. That said, @nest: @id seems reasonably close to the desired semantics for the relevant keywords and functionality, and @nest is new in 1.1.

@iherman

Member

commented Sep 5, 2018

I think I understand the intention (references to other syntaxes help a lot), but I do not understand the exact mechanism. The playground example doesn't help, all three snippets on the screen look identical to me (the @context are all the same). Can someone provide a clearer proposal?

(If my understanding is remotely correct, it is also based on the new @container feature involving @id. I must admit that I am increasingly worried about overloading meanings. "@container":"@list" is used for the characterization of objects, whereas "@container":"@id" is for indexing. These are very different notions in my mind, and I would prefer to make things explicit by using some sort of @index keyword. Overloading meanings makes the language more difficult to understand.)

@azaroth42

Contributor

commented Sep 17, 2018

Discussion on the WG call of 2018-09-14 led to the conclusion that this is likely a framing issue, not a syntax issue. Framing-based solutions are to be explored.

(For the sake of issue management, we'll leave the issue here)

@azaroth42

Contributor

commented Sep 17, 2018

In light of the framing discussion, how about something like a new value for the @embed flag that allows the frame to say that it should be put in @included rather than inline or omitted?

The syntax document would need to define "@included" so that it could be aliased to something else, otherwise expansion would try to apply the default vocab to it as a regular term.

Then the example in the issue would be generated by something like:

Data:

{
  "@context": {
    "included": "@included",
    "id": "@id"
  },
  ...
}

Frame:

{
 "items": [
  {
   "classification": { "@embed": "@included" },
   "service": {"@embed": "@included"}
  }
 ]
}
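
As a rough illustration of what an `@embed: @included` frame flag might do, the following Python sketch hoists embedded node objects into a shared map and leaves their ids behind as references. Everything here is an assumption for discussion: `hoist` is a hypothetical helper, it operates on compacted dicts keyed by `id`, and it is not the framing algorithm.

```python
def hoist(node, props, included=None):
    """Replace embedded node objects under `props` with their ids.

    Returns (new_node, included), where `included` maps id -> node body.
    """
    if included is None:
        included = {}
    out = {}
    for key, value in node.items():
        if key in props and isinstance(value, dict) and "id" in value:
            body = {k: v for k, v in value.items() if k != "id"}
            # Keep the first description seen for a given id.
            included.setdefault(value["id"], body)
            out[key] = value["id"]
        else:
            out[key] = value
    return out, included


c6 = {"id": "enum:c6", "type": "eg:Type", "label": "Classification 6"}
s2 = {"id": "enum:s2", "type": "eg:Service", "label": "Login Service"}
items = [
    {"id": "2", "classification": c6, "service": s2},
    {"id": "27", "classification": c6},
]

included = {}
framed_items = [hoist(i, {"classification", "service"}, included)[0] for i in items]
print(framed_items)
print(included)
```

Sharing one `included` dict across all items is what removes the repetition: the classification is described once, however many items refer to it.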

@azaroth42 azaroth42 moved this from Discuss-Call to Discuss-GH in JSON-LD Management Sep 17, 2018

@azaroth42 azaroth42 moved this from Discuss-GH to Discuss-Call in JSON-LD Management Oct 12, 2018

@azaroth42

Contributor

commented Dec 14, 2018

On the WG call of 2018-12-14, we discussed the feature in the abstract and two separate implementation patterns.

The feature is, in JSON-LD terms, to have a way to not embed nodes in the compacted JSON tree where they're encountered, but instead have the nodes serialized in a separate part of the tree and referenced from where they're encountered. In essence, it is a desire to exclude some nodes from being embedded in the tree and instead treat them like "striped" RDF/XML, or as if they were disconnected nodes but without requiring @graph: [] at the root of the document.

  1. The reference could be completely internal. Just a way to create a pointer within the document, that could be arbitrarily generated like a blank node.
  2. The reference could be the URI of the resource, making the references object function very similarly to an identifier map, just without the predicate.

My original issue assumed option 2, but the use case could be solved with option 1 as well. I think option 1 would require more machinery than option 2, as the references would need to be understood in the context.

@iherman

Member

commented Dec 16, 2018

This issue was discussed in a meeting.

  • No actions or resolutions
View the transcript 5.5. Indexing without a predicate
Benjamin Young: #19
Rob Sanderson: I opened issue 19 initially in the CG
… based on an observation by 2 communities
Rob Sanderson: last time we discussed this we thought it might be a good fit for framing
… for creating it; and the context definition for understanding what the particular … is
… you could have an “included block” that contains hundreds of items, which you only have to reference rather than add in full every time you use them
Gregg Kellogg: looks very much like “itemref” of microdata
… unwinding this in compaction might be challenging to say at least
Benjamin Young: the current container options don’t work because they force you to have a name or these?
Rob Sanderson: no, the issue we like to solve -> if you don’t have the entire graph in memory, you don’t necessarily find the reference for everything
… i.e. optimizing search
Gregg Kellogg: just noting that there’s a CG started on updating N3
… one of the things they tackle first is adding formula/pattern
… in RDFa we did something similar by adding some reasoning
Ivan Herman: I’m looking at the example and I’m not sure I understand what the graph is you want to generate out of that?
Rob Sanderson: a very connected one
… that allows to avoid having to define a specific node every time it is encountered
Ivan Herman: relative URIs within JSON-LD?
… couldn’t we use an internal URL, like one would use in turtle
Pierre-Antoine Champin: I’m a bit confused by Ivan’s last example
Ivan Herman: the original problem is “we don’t want to repeat things”
… one way to do this in turtle is to use internal URIs
Pierre-Antoine Champin: could I extend this by having only the included key in the top-level object and a set of ids in the corresponding object?
… basically what we discussed during TPAC
Rob Sanderson: we can already do that, so you can have the pattern RDF XML kinda uses
… but it’s a very good point
… some things we want to have nested, some enumerated
… potentially a framing algo could say “put these refs separate, but those others nested”
Rob Sanderson: in JSON schema there’s a definitions block one can reference
Benjamin Young: https://json-schema.org/latest/relative-json-pointer.html
Rob Sanderson: $ref something something magic
… I was thinking more about arbitrary ids, or something along those lines
Benjamin Young: $ref is actually here https://json-schema.org/latest/json-schema-core.html#rfc.section.8.3.2.p.1
Gregg Kellogg: it kinda looks like nesting?
Gregg Kellogg: [gives example]
Rob Sanderson: I don’t think it’s a graph container, as there’s only one graph. It could be either an identifier container or a nesting container, without a mapped predicate
Pierre-Antoine Champin: [thinks about possible hacks to do that]
Ivan Herman: not sure which way we want to go
Benjamin Young: no more calls for the rest of the year

gkellogg added a commit that referenced this issue Dec 21, 2018

Define additional profile URIs, and the interpretation of an unspecif…
…ied URL for specifying context or frame.

For #19.

gkellogg added a commit that referenced this issue Jan 12, 2019

Define additional profile URIs, and the interpretation of an unspecif…
…ied URL for specifying context or frame.

For #19.

@azaroth42 azaroth42 moved this from Discuss-Call to Discuss-F2F in JSON-LD Management Jan 25, 2019

@iherman

Member

commented Feb 9, 2019

This issue was discussed in a meeting.

  • RESOLVED: Continue to explore @nest with additional features, such as @container:@id, as a solution to issue #19
  • ACTION: gkellogg and pchampin to explore effect of @nest+@container:@id on compaction and expansion
View the transcript 3. “itemref”, issue 19
Rob Sanderson: issue occurs when resource occurs multiple times in the graph. What would be nice that if you knew that terms got used repeatedly…
… would be nice if you had references from the inclusion to included. JSON API calls it “included”
… JSON Schema has $ref.
David Newbury: https://jsonapi.org/format/1.1/#document-top-level
David Newbury: an example of it in the JSON-API spec is here: https://jsonapi.org/format/1.1/#document-compound-documents
Rob Sanderson: useful in graph context so you can use references rather than values
… is this a frame issue or syntax? We decided both - could go into framing to know that “included” is not a predicate, it is the inclusion
… references block rather than <base#>included.
Gregg Kellogg: did you consider the RDFa approach, where there is a way to output triples where after parsing there is a reasoning step?
Ivan Herman: I thought that was more directly done…
Gregg Kellogg: … that was microdata. RDFa is more directly — reasoner takes triples and outputs w/ different subject.
Jeff Mixter: is there a way to solve this with @graph?
… I have a first block of JSON which is object outside of a graph and add subgraphs with aliased keyword
Ivan Herman: this is mixing levels — syntax is similar but this is not a graph
Gregg Kellogg: inverse properties? Included have reverse relationships to items that are included
… classification_of is a term that is an @reverse – achieves separation of concerns but also includes expanding, compacting and framing for round trip
Rob Sanderson: would still need an @nest property.
Ivan Herman: there are two ways to look at this:
… 1) enum:c6 is an internal reference that we could handle with fragment id in graph, but I have an extra triple in the graph …
… you get extra links
… 2) conceptually expect value of enum:c6 to be physically replicated and put back into the node
… itemref did the replication option
… JSON Schema creates a fragment identifier, but is this what you are looking for?
David Newbury: our use case is the latter case
… because in a JSON only environment, knowing where to go is difficult.
Ivan Herman: Option 2) requires duplication and massaging in graph…
Rob Sanderson: gregg’s proposal w/ included : {"@container": "@id"} (sort of) works
Ivan Herman: included should be a nest
David Newbury: how do I get option 2) (included under classification)?
Gregg Kellogg: we’d still need an inverse thing. If I have an id map but want to say it is sort of transparent…
Ivan Herman: if a term is defined to be @nest, does @id still work or do you ignore that once and for all?
Gregg Kellogg: @nest allows me to use an intermediate property to hold things which are pushed up. We want subtree to be somewhere else
Ivan Herman: if included is @nest, is @container: @id still valid?
Gregg Kellogg: round tripping is an issue as well.
Benjamin Young: posted playground example above that uses “embedded”. Seems to do what you want. Note that “included” is an array in
… json API not an object. Also introducing a non-JSON reference mechanism
Ivan Herman: what you do is define a graph, not the content of the graph
Rob Sanderson: there is a blank node _:b0 which has a name and a type
Gregg Kellogg: use a preprocessing tool or do it the way RDFa does it?
David Newbury: I could do this but it wouldn’t be valid JSON-LD …
Gregg Kellogg: It would be, but it wouldn’t be the graph you are looking for
Harold Solbrig: (discussion about examples on FTF document… w/ @nest and rather than containing , references object…)
David Newbury: in practice we use @id in our main document and use a placeholder in data, but that requires an additional piece of semantic data
Pierre-Antoine Champin: 2 questions. 1) Do we agree that the enum term should be defined as well? (a: yes)
… 2) is "@type": "@nest" the way it would be written? (a: no)
Rob Sanderson: nest: https://www.w3.org/TR/json-ld11/#ex-65-defining-property-nesting
Gregg Kellogg: could handle it with n3 reasoning?
… it seems like we are trying to do things at a totally different level.
Adam Soroka: one other wrinkle … this would play oddly with a streaming processor.
Gregg Kellogg: this is the reason we did rdfa the way we did
Ivan Herman: in rdfa we define terms and additional semantic rules, which is what we do here.
Gregg Kellogg: it has already been done, we could just reference it.
Gregg Kellogg: https://www.w3.org/TR/html-rdfa/#property-copying
Ivan Herman: done through RDF, but way too complicated…
Pierre-Antoine Champin: reminds me of the very first version of RDF rdf:aboutEach
Rob Sanderson: http://tinyurl.com/ydgfcgl4
Harold Solbrig: (azaroth using playground example between jane and john…)
Ivan Herman: copying vs. referencing. We can say that copying stuff is outside json-ld.
… reference, however, might be doable. What do we need to make the example on the screen (enum:c6, … in issue #19) work
… . included is there because of bookkeeping. The approach feels natural
… if included is nested, you take it out of the equation altogether…
Rob Sanderson: needs to be a new syntax ("@id": "@nest"?)
Simon Steyskal: works as expected on playground but @id: @nest doesn’t work
Pierre-Antoine Champin: https://json-ld.org/playground-dev/#startTab=tab-nquads&json-ld=%7B%22%40context%22%3A%5B%22http%3A%2F%2Fschema.org%2F%22%2C%7B%22labels%22%3A%7B%22%40id%22%3A%22%40nest%22%7D%7D%5D%2C%22%40type%22%3A%22Person%22%2C%22labels%22%3A%5B%7B%22familyName%22%3A%22Doe%22%7D%2C%7B%22givenName%22%3A%22Jane%22%7D%5D%7D
Gregg Kellogg: is there a way through @nest to subsume @graph while defining a bush
Gregg Kellogg: today, nesting requires the object
Gregg Kellogg: There’s obviously work to be done…
Rob Sanderson: how much?
Gregg Kellogg: (waffles and ponders…) involves extending id of nesting… there a lot of angles to this, man.
David Newbury: to clarify, we’re not addressing framing right now, correct?
Ivan Herman: workergnome – is this approach still ok? Does it accomplish what you want?
Proposed resolution: Continue to explore @nest with additional features, such as @container:@id, as a solution to issue #19 (Rob Sanderson)
Ivan Herman: +1
Simon Steyskal: +1
Rob Sanderson: +1
Jeff Mixter: +1
Harold Solbrig: +1
Gregg Kellogg: +1
David I. Lehn: +1
Adam Soroka: +1
Pierre-Antoine Champin: +1
David Newbury: +1
Resolution #3: Continue to explore @nest with additional features, such as @container:@id, as a solution to issue #19
Benjamin Young: +1
Action #1: gkellogg and pchampin to explore effect of @nest+@container:@id on compaction and expansion
@iherman

Member

commented Mar 29, 2019

This issue was discussed in a meeting.

  • ACTION: propose a concrete solution, considering link and nest (Rob Sanderson)
  • ACTION: propose a concrete solution, considering link and nest (David Newbury)
View the transcript Indexing without a predicate
Rob Sanderson: Link: #19
Rob Sanderson: we discussed at the F2F
… there was an action of gkellogg and pchampin to look into it
Gregg Kellogg: I didn’t have time to look into it yet
Pierre-Antoine Champin: me neither
Rob Sanderson: when an item appears randomly in multiple places in the document,
… it would be nice to put this item in a kind of “bucket” where its full description is stored,
… rather than to have to browse the full document to find the random place where the full description is included
Ivan Herman: this is essentially the ‘itemref’ feature of microdata
… copying that mechanism in JSON-LD seems complicated, but maybe not impossible
Dave Longley: sounds like a framing issue, similar to "@anywhere"
Rob Sanderson: this is not only related to framing, you need something in the context as well
Gregg Kellogg: this is indeed very much like ‘itemref’
… my concern is that it will be complicated if we want to ensure round-trip (compaction/expansion)
… like we do for other features
… that could be done using default and framing, but seems like a very complex solution
Dave Longley: we do have special keywords in the framing compaction algorithm that are treated differently to avoid dropping undefined terms, etc.
David Newbury: is there a way to handle this as a post-processing step?
Gregg Kellogg: the RDFa reference mechanism involves looking in the graph, adding triples and remove triples that were part of the pattern
Ivan Herman: if we do that (i.e., reproduce the RDFa ref mechanism) people will run away screaming
… what we are trying to do is some sort of internal references, essentially relative URIs
… it would still require to define a bush and not a tree, which forces us to use @graph,
… but it might work
Dave Longley: if we consider working in memory, consider @link which is implemented in the Javascript processor
… to ensure that an object is stored only in one place
Gregg Kellogg: the problem is that you would typically create cycles internally
… I’m not sure relative URIs can be used without introducing a level of indirection
Action #3: propose a concrete solution, considering link and nest (Rob Sanderson)
Action #4: propose a concrete solution, considering link and nest (David Newbury)
Gregg Kellogg: if we are moving towards better support for streaming profiles
… we can’t rely on in-memory storage only
… You would need a lot of bookkeeping to handle this.
@azaroth42

Contributor

commented Apr 8, 2019

Requirement: Functionality that allows resource nodes to be serialized at a particular location in the JSON tree, rather than where they are encountered. There isn't necessarily a direct relationship between the top resource node in the tree and the resource node to be serialized (similar to @nest) and the serialization should be a JSON object where the URI of the node is the key (similar to @container @id).

Rationale: There are many use cases (e.g. IIIF, Linked Art, JSON-API,...) where nodes are referred to sporadically throughout the graph, but without any particular obvious first location. Embedding them always would be overkill in many situations, and the cost to find the first occurrence is arbitrarily high based on the size of the tree. Instead, having them at a knowable location makes this a single look up, rather than a tree traversal.

Proposed Solution:

Introduce a new keyword @included which may be present only as a key at the top of the serialized tree. It may be aliased. It functions like @container @id, in that the keys are terms that resolve to URIs, and the values are JSON objects that have the properties of the resource. Resources in the tree can refer to the URIs of nodes in the @included property in the normal way.

Like @nest, @included in instance data does not generate a triple during expansion, instead expansion simply descends into the JSON object and processes each key/value into the graph.

And now the solution splits into two options for discussion:

Option A - framing:

In framing, @included is a new value for the @embed directive. @embed: @included means to instead embed the resource encountered in the @included instance data property, rather than where it is currently encountered.

Option B - compaction:

In a context, a property may be defined as @type: @included. This means that all of the values of the property are to be compacted into the @included structure and the (compacted) URI of the resource is used where the reference to the resource is encountered. As @included only works for resources, in these situations it also has the same meaning as @type: @id in that the value is a URI.

@gkellogg

Member Author

commented Apr 22, 2019

I think that flattening/folding included blocks needs to be done as part of expansion. I don't think the logic works for compaction, as there is no a-priori knowledge of what is in the include blocks. It might work in framing, though:

If the frame includes an @include block, then each node object which is framed is compared against the contents of each include object and replaced with a reference to that include block. However, the only way to know what term to use as the reference to the include block is to rely on the nesting defined for the properties contained in that block. This is particularly an issue for @type, which can't use this. In your examples, the same properties are used in each included block, so you can't rely on nesting here either.

It might be that you need to just use a single property, such as "included" (as defined in the context/frame somehow) so that each included block is a reference from this common property. This shouldn't affect expansion, but I don't see how to figure this out during framing.

So, for example, if you have something based on the original example from above:

{
  "@context": {
    "@vocab": "http://example/",
    "@base": "http://example/",
    "id": "@id", 
    "type": "@type",
    "eg": "http://example/",
    "classification": {"@type": "@included"},
    "service": {"@type": "@included"},
    "included": "@included"
  },
  "id": "1",
  "type": "eg:Thing-with-Items",
  "eg:items": [
    {
      "id":"2",
      "classification": "enum:c6",
      "service": "enum:s2"
    },
    {
      "id": "27",
      "classification": "enum:c6"
    }    
  ],
  "included": {
    "enum:c6": {"type": "eg:Type", "label": "Classification 6"},
    "enum:p1": {"type": "eg:Person", "label": "Person 1"},
    "enum:s2": {"type": "eg:Service", "label": "Login Service"}
  }  
}

This could expand as follows:

[{
  "@id": "http://example/1",
  "@type": ["http://example/Thing-with-Items"],
  "http://example/items": [{
    "@id": "http://example/2",
    "@type": ["http://example/Type", "http://example/Service"],
    "http://example/label": [
      {"@value": "Classification 6"},
      {"@value": "Login Service"}
    ]
  }, {
    "@id": "http://example/27",
    "@type": ["http://example/Type"],
    "http://example/label": [
      {"@value": "Classification 6"}
    ]
  }]
}]

A frame to get back close might look like the following:

{
  "@context": {
    "@vocab": "http://example/",
    "@base": "http://example/",
    "id": "@id", 
    "type": "@type",
    "eg": "http://example/",
    "classification": {"@type": "@included"},
    "service": {"@type": "@included"},
    "included": "@included"
  },
  "type": "eg:Thing-with-Items",
  "eg:items": {},
  "included": {
    "enum:c6": {"type": "eg:Type", "label": "Classification 6"},
    "enum:p1": {"type": "eg:Person", "label": "Person 1"},
    "enum:s2": {"type": "eg:Service", "label": "Login Service"}
  }  
}

This could get you back to the following, but it would choose just one of "classification" or "service" to contain the references.

{
  "@context": {
    "@vocab": "http://example/",
    "@base": "http://example/",
    "id": "@id", 
    "type": "@type",
    "eg": "http://example/",
    "classification": {"@type": "@included"},
    "service": {"@type": "@included"},
    "included": "@included"
  },
  "id": "1",
  "type": "eg:Thing-with-Items",
  "eg:items": [
    {
      "id":"2",
      "classification": ["enum:c6", "enum:s2"]
    },
    {
      "id": "27",
      "classification": "enum:c6"
    }    
  ],
  "included": {
    "enum:c6": {"type": "eg:Type", "label": "Classification 6"},
    "enum:p1": {"type": "eg:Person", "label": "Person 1"},
    "enum:s2": {"type": "eg:Service", "label": "Login Service"}
  }  
}
@gkellogg

Member Author

commented May 3, 2019

See conversation in w3c/json-ld-wg#52 (comment) about @nest, but on reflection, this may play out a bit differently.

I suggest we do only the Expansion phase, where a property with @container: @included is expected to have one or more IRI values which are interpreted as references to a top-level @included (or alias) property, where those values are added as either values of the object, or as values of the property. If the property also has @nest, then adding the values to that property would have the same effect as lifting them up to the object itself.

It's probably easier to simply not involve @nest, and just add the values directly to the referencing object.

When complete, the property referencing the included values would be removed.

Is it an error to reference included IRIs which aren't present, or are they simply ignored?

@iherman

Member

commented May 4, 2019

This issue was discussed in a meeting.

  • No actions or resolutions
View the transcript Indexing without a predicate
Benjamin Young: link: #19
Gregg Kellogg: Related w3c/json-ld-wg#52
Benjamin Young: this issue is also known as @included
… proposed by azaroth
… there is a related proposal by gkellogg
Gregg Kellogg: there are several ways of doing something like id-ref
… one of them would be to combine @nest and @container:@id
… Rob’s proposal would better be handled in expansion (this is where syntactic sugar is removed).
… Properties declared as e.g. @container:@include would look into a special @include container.
… Problem with compaction, which can not easily reverse this kind of extension.
… More appropriate in Framing.
… Seems quite complex and convoluted, with a lot of corner cases.
@azaroth42

Contributor

commented Jul 12, 2019

A hopefully simpler proposal and example:

Introduce a new keyword @included which may be present only as a key at the top of the serialized tree. It may be aliased. The value space is always a JSON object, functionally equivalent to an @id container, in that the keys are always URIs (allowing for compaction as normal), and the values are the serialization of the resource identified by that URI.

@included does not generate a triple during expansion. Instead, expansion descends into the JSON object when its URI is encountered in the graph.

@included is generated during compaction when the algorithm encounters @type: @included in a context for a property definition.

For example, this data:

{
  "@context": {
    "eg": "https://example.com/ns/",
    "hasService": {"@id": "eg:hasService", "@type": "@id"},
    "label": "eg:label",
    "Thing": "eg:Thing",
    "Service": "eg:Service"
  },
  "@id": "https://example.org/1",
  "@type": "Thing",
  "hasService": {
    "@id": "https://example.org/service",
    "@type": "Service",
    "label": "My Service"
  }
}

Expands in 1.0 and 1.1 to this form:

  {
    "@id": "https://example.org/1",
    "@type": [
      "https://example.com/ns/Thing"
    ],
    "https://example.com/ns/hasService": [
      {
        "@id": "https://example.org/service",
        "@type": [
          "https://example.com/ns/Service"
        ],
        "https://example.com/ns/label": [
          {
            "@value": "My Service"
          }
        ]
      }
    ]
  }

The following would also expand to the same form:

{
  "@context": {
    "eg": "https://example.com/ns/",
    "hasService": {"@id": "eg:hasService", "@type": "@included"},
    "label": "eg:label",
    "Thing": "eg:Thing",
    "Service": "eg:Service",
    "included": "@included"
  },
  "@id": "https://example.org/1",
  "@type": "Thing",
  "hasService": "https://example.org/service",
  "included": {
    "https://example.org/service": {
      "@type": "Service",
      "label": "My Service"
    }
  }
}

When compacting, the @type: @included definition for hasService would trigger the creation of the included block (the alias of @included) and insert the compacted form of https://example.org/service into that JSON object.
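
As a rough illustration of the intended effect, the following Python sketch inlines included resources into the properties that reference them. It is not the JSON-LD expansion algorithm: it works on the compact JSON as parsed, only looks at top-level properties, assumes keys and references match as plain strings, and the name inline_included is made up for this example.

```python
import copy


def inline_included(doc):
    """Inline included resources into properties typed "@included".

    Illustrative only: no IRI expansion, top-level properties only.
    """
    context = doc.get("@context", {})
    # Terms whose definition carries "@type": "@included".
    included_terms = {
        term for term, definition in context.items()
        if isinstance(definition, dict) and definition.get("@type") == "@included"
    }
    # "@included" itself plus any term aliased to it.
    block_keys = {"@included"} | {
        term for term, definition in context.items() if definition == "@included"
    }

    included = {}
    for key in block_keys:
        included.update(doc.get(key, {}))

    result = {}
    for key, value in doc.items():
        if key in block_keys:
            continue  # the block itself contributes nothing directly
        if key in included_terms and isinstance(value, str) and value in included:
            # The map key supplies the identifier of the inlined resource.
            value = {"@id": value, **copy.deepcopy(included[value])}
        result[key] = value
    return result


doc = {
    "@context": {
        "hasService": {"@id": "eg:hasService", "@type": "@included"},
        "included": "@included",
    },
    "@id": "https://example.org/1",
    "hasService": "https://example.org/service",
    "included": {
        "https://example.org/service": {"@type": "Service", "label": "My Service"}
    },
}
inlined = inline_included(doc)
```

After this step, inlined has the same nesting as the first document above, so a regular expansion could take over from there.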

@gkellogg

Member Author

commented Jul 22, 2019

I have a straw man implementation of the expansion part of this, which raises some cases to consider:

  • I presume that if the JSON document is an array, each object member may have its own included block.
  • What if an included block exists but no term has "@type": "@included"? Presume it's simply ignored.
  • What if multiple properties map to @included in the top-level object? Error, or first found wins?
  • What if a property references a missing included key? Presume that it's simply dropped.
  • What if a property with @type: @included contains values that aren't strings mapping to IRIs? I presume they're added to the output as if there were no included.
  • What if @included appears in a non top-level object?
  • What if a value in an included block includes a property with @type: @included? Creates an order-dependent expansion, or requires that included value expansion does not happen until the value is processed, not at the beginning. Seems like this should be disallowed.
  • What if the included map includes the key @none? error or ignored?
  • What if the value of the included key is not a map? Error or dropped?
  • What if a property has @type: @included, but there is no included block? Presume it defaults to an empty map.
  • Can included keys be vocabulary relative? Why not just treat them as strings?

Basic implementation

  1. After looking for type-scoped contexts, if there is no included map, and a key expanding to @included is found, create included from the value, by expanding keys document-relative and expanding the values using the expansion algorithm, and pass this to subsequent invocations of the expansion algorithm.
  2. After looking for @type: @json, if the property has a type mapping of @included, the value must be a string or an array of strings. Expand each value relative to either the document or the vocabulary, depending on the type-mapping of the property term and add the concatenation of any included map value found for that expanded value.
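
The two steps could be sketched in Python roughly as follows. This is a simplification under stated assumptions, not the straw man implementation itself: expand_value stands in for the real expansion algorithm, urljoin approximates document-relative IRI expansion, both function names are invented here, and a missing key is silently dropped as presumed in the list above.

```python
from urllib.parse import urljoin


def build_included_map(raw_block, base, expand_value):
    """Step 1: expand keys document-relative and values with the given expander."""
    included = {}
    for key, value in raw_block.items():
        included[urljoin(base, key)] = expand_value(value)
    return included


def expand_included_property(value, included, base):
    """Step 2: concatenate the included values for each referenced key."""
    refs = [value] if isinstance(value, str) else value
    if not isinstance(refs, list) or not all(isinstance(r, str) for r in refs):
        raise ValueError("value must be a string or an array of strings")
    expanded = []
    for ref in refs:
        # Assumption: a reference without a matching key contributes nothing.
        expanded.extend(included.get(urljoin(base, ref), []))
    return expanded


def as_list(value):
    """Toy stand-in for the expansion algorithm: wrap non-arrays in an array."""
    return value if isinstance(value, list) else [value]


base = "http://example.org/doc"
included = build_included_map(
    {"data1": {"a": 1}, "http://example.org/data2": [{"b": 2}, {"c": 3}]},
    base,
    as_list,
)
values = expand_included_property(["data1", "data2"], included, base)
```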
@gkellogg

Member Author

commented Jul 23, 2019

Three other observations:

  1. as the expansion is entirely syntactic, why use IRI expansion for the keys? They could just be strings.
  2. Compaction would need to invent an identifier to use as a key of @included, as none remains in the expanded form. Plus, it would result in compacting the whole content of the property. If in compacted form the property of type @included had multiple string values, each one would be inserted when expanding, but could not reasonably be reconstructed when compacting.
  3. If the same value were repeated for different properties, it might be algorithmically challenging to match that with an existing value of @included.

We may want to skip compacting altogether. We previously thought about doing it with framing, although many of the same challenges remain.
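
To make the compaction concerns concrete, here is a hedged Python sketch of what hoisting values back into an @included map might involve. Everything in it is an assumption for illustration: the function name, the invented "_:incN" key scheme, and the use of a sorted JSON dump as a fingerprint to match repeated values. It moves the whole content of each property, as noted in the second observation.

```python
import json


def hoist_included(expanded_node, included_properties):
    """Move the values of selected properties into an invented @included map."""
    included = {}
    key_for = {}  # fingerprint of a value array -> invented key
    result = {}
    for prop, values in expanded_node.items():
        if prop not in included_properties:
            result[prop] = values
            continue
        # Identical value arrays share one entry; the key has to be invented
        # because the original key does not survive expansion.
        fingerprint = json.dumps(values, sort_keys=True)
        if fingerprint not in key_for:
            key_for[fingerprint] = f"_:inc{len(key_for)}"
            included[key_for[fingerprint]] = values
        result[prop] = key_for[fingerprint]
    result["@included"] = included
    return result


node = {
    "@id": "http://example.org/1",
    "http://example.org/a": [{"@value": "x"}],
    "http://example.org/b": [{"@value": "x"}],
    "http://example.org/c": [{"@value": "y"}],
}
hoisted = hoist_included(
    node,
    {"http://example.org/a", "http://example.org/b", "http://example.org/c"},
)
```

Note that a property which originally had several string references comes back as a single invented key, so the original compact form is not recovered.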

@gkellogg

Member Author

commented Jul 23, 2019

Some expansion examples:

Includes referenced identifier

{
  "@context": {
    "@vocab": "http://example.org/",
    "data": {"@type": "@included"}
  },
  "data": "http://example.org/data",
  "@included": {
    "http://example.org/data": {
      "@type": "Data",
      "label": "label"
    }
  }
}

=>

[{
  "http://example.org/data": [{
    "@type": ["http://example.org/Data"],
    "http://example.org/label": [{"@value": "label"}]
  }]
}]

Includes with array value

{
  "@context": {
    "@vocab": "http://example.org/",
    "data": {"@type": "@included"}
  },
  "data": "http://example.org/data",
  "@included": {
    "http://example.org/data": [{
      "@type": "Data",
      "label": "label"
    }, {
      "@type": "Data2",
      "label": "label2"
    }]
  }
}

=>

[{
  "http://example.org/data": [{
    "@type": ["http://example.org/Data"],
    "http://example.org/label": [{"@value": "label"}]
  }, {
    "@type": ["http://example.org/Data2"],
    "http://example.org/label": [{"@value": "label2"}]
  }]
}]

Included in different top-level objects of an array

[{
  "@context": {
    "@vocab": "http://example.org/",
    "data": {"@type": "@included"}
  },
  "data": "http://example.org/data",
  "@included": {
    "http://example.org/data": {
      "@type": "Data",
      "label": "label"
    }
  }
}, {
  "@context": {
    "@vocab": "http://example.org/",
    "data": {"@type": "@included"}
  },
  "data": "http://example.org/data",
  "@included": {
    "http://example.org/data": {
      "@type": "Data2",
      "label": "label2"
    }
  }
}]

=>

[{
  "http://example.org/data": [{
    "@type": ["http://example.org/Data"],
    "http://example.org/label": [{"@value": "label"}]
  }]
}, {
  "http://example.org/data": [{
    "@type": ["http://example.org/Data2"],
    "http://example.org/label": [{"@value": "label2"}]
  }]
}]

Multiple values for an @included property

{
  "@context": {
    "@vocab": "http://example.org/",
    "data": {"@type": "@included"}
  },
  "data": ["http://example.org/data1", "http://example.org/data2"],
  "@included": {
    "http://example.org/data1": {
      "@type": "Data",
      "label": "label"
    },
    "http://example.org/data2": {
      "@type": "Data2",
      "label": "label2"
    }
  }
}

=>

[{
  "http://example.org/data": [{
    "@type": ["http://example.org/Data"],
    "http://example.org/label": [{"@value": "label"}]
  }, {
    "@type": ["http://example.org/Data2"],
    "http://example.org/label": [{"@value": "label2"}]
  }]
}]

@included with no @type: @included

{
  "@context": {
    "@vocab": "http://example.org/",
    "data": {"@type": "@id"}
  },
  "data": "http://example.org/data",
  "@included": {
    "http://example.org/data": {
      "@type": "Data",
      "label": "label"
    }
  }
}

=>

[{
  "http://example.org/data": [{"@id": "http://example.org/data"}]
}]

Only uses first entry mapping to @included

{
  "@context": {
    "@vocab": "http://example.org/",
    "data": {"@type": "@included"},
    "includedA": "@included",
    "includedB": "@included"
  },
  "data": "http://example.org/data",
  "includedB": {
    "http://example.org/data": {
      "@type": "Data2",
      "label": "label2"
    }
  },
  "includedA": {
    "http://example.org/data": {
      "@type": "Data",
      "label": "label"
    }
  }
}

=>

[{
  "http://example.org/data": [{
    "@type": ["http://example.org/Data"],
    "http://example.org/label": [{"@value": "label"}]
  }]
}]

Missing @included

{
  "@context": {
    "@vocab": "http://example.org/",
    "data": {"@type": "@included"}
  },
  "data": "http://example.org/data"
}

=> Error: "missing @included referent"

Missing @included referent

{
  "@context": {
    "@vocab": "http://example.org/",
    "data": {"@type": "@included"}
  },
  "data": "http://example.org/data",
  "@included": {
    "http://example.org/other": {
      "@type": "Data",
      "label": "label"
    }
  }
}

=> Error: "missing @included referent"

@included in a non-top-level object

{
  "@context": {
    "@vocab": "http://example.org/",
    "data": {"@type": "@included"}
  },
  "block": {
    "data": "http://example.org/data",
    "@included": {
      "http://example.org/data": {
        "@type": "Data",
        "label": "label"
      }
    }
  }
}

=> Error: "invalid @included map"

@included term within @included map

{
  "@context": {
    "@vocab": "http://example.org/",
    "data": {"@type": "@included"}
  },
  "data": "http://example.org/data",
  "@included": {
    "http://example.org/data": {
      "@type": "Data",
      "label": "label",
      "data": "http://example.org/data"
    }
  }
}

=> Error: "invalid @included map"

@included not a map

{
  "@context": {
    "@vocab": "http://example.org/",
    "data": {"@type": "@included"}
  },
  "data": "http://example.org/data",
  "@included": true
}

=> Error: "invalid @included map"
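
The error cases above could be checked up front with something like the following Python sketch. It is a simplification and not the straw man implementation: it only recognises the unaliased @included key, compares keys and references as plain strings without IRI expansion, and check_included is a name invented for this example.

```python
def check_included(doc):
    """Raise ValueError with the error names used in the examples above."""
    context = doc.get("@context", {})
    included_terms = {
        term for term, definition in context.items()
        if isinstance(definition, dict) and definition.get("@type") == "@included"
    }
    block = doc.get("@included", {})
    if not isinstance(block, dict):
        raise ValueError("invalid @included map")

    def nodes(value):
        """Yield every JSON object nested inside value."""
        if isinstance(value, dict):
            yield value
            for child in value.values():
                yield from nodes(child)
        elif isinstance(value, list):
            for child in value:
                yield from nodes(child)

    # @included is only allowed in the top-level object.
    for key, value in doc.items():
        if key in ("@context", "@included"):
            continue
        if any("@included" in node for node in nodes(value)):
            raise ValueError("invalid @included map")

    # Included values may not themselves use @included or @included-typed terms.
    for node in nodes(list(block.values())):
        if "@included" in node or included_terms.intersection(node):
            raise ValueError("invalid @included map")

    # Every string reference on an @included-typed property must resolve.
    for term in included_terms.intersection(doc):
        refs = doc[term] if isinstance(doc[term], list) else [doc[term]]
        for ref in refs:
            if isinstance(ref, str) and ref not in block:
                raise ValueError("missing @included referent")
```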

@azaroth42

Contributor

commented Jul 23, 2019

What if multiple properties map to @included in the top-level object? Error, or first found wins?

I think it should be an error. I think defining multiple aliases should be okay, but only one should be present in the data. With multiple present and picking the first, the result would be determined by the key order, which isn't deterministic without further processing (which we already specify, but isn't lovely). Alternatively, if the keys are unique across the maps, they could be merged, but that seems like a challenge for creation, even if we don't specify it in 1.1. So my thought would be error.
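
For comparison, the merging alternative might look like this in Python. It is only a sketch of that option, not a proposal for the spec: the function name and the error message are hypothetical, and keys are compared as plain strings.

```python
def merge_included_maps(maps):
    """Merge several included maps, refusing keys that appear more than once."""
    merged = {}
    for block in maps:
        for key, value in block.items():
            if key in merged:
                # Hypothetical error; no such error code is defined anywhere.
                raise ValueError(f"duplicate @included key: {key}")
            merged[key] = value
    return merged


merged = merge_included_maps([
    {"http://example.org/a": {"label": "A"}},
    {"http://example.org/b": {"label": "B"}},
])
```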

What if a property references a missing included key? Presume that it's simply dropped.

i.e. that there's a reference but no included block with that value as a key? If the reference value MUST expand to an IRI, then there's still a valid triple, just no data with the URI as the subject. So I would keep the triple where the IRI is the object.

And this is why I think the keys in the block and the reference should be required to expand to IRIs.
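
A minimal Python sketch of that fallback, assuming the included map has already been keyed by expanded IRI and holds arrays of expanded nodes (the function name is invented for illustration):

```python
def resolve_reference(iri, included):
    """Return the included values for iri, or a bare node reference."""
    if iri in included:
        return included[iri]
    # No entry: keep the triple with the IRI as its object.
    return [{"@id": iri}]


included = {"http://example.org/data": [{"@type": ["http://example.org/Data"]}]}
found = resolve_reference("http://example.org/data", included)
fallback = resolve_reference("http://example.org/other", included)
```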

What if @included appears in a non top-level object?

I think that should be an error. There's no point having non-top-level inclusion blocks, as you would never find them.

What if the included map includes the key @none? error or ignored?

Error, as it doesn't expand to an IRI.

What if the value of the included key is not a map? Error or dropped?

Error.

Can included keys be vocabulary relative? Why not just treat them as strings?

That is an excellent question. I think they should be vocabulary relative, such that you could do crazy things like having RDFS descriptions in the @included to give the domain / range of properties that are defined in the context.

We may want to skip compacting altogether. We previously thought about doing it with framing, although many of the same challenges remain.

I think the strongest use case is expansion -- there's data that looks like this in the wild, and the structure helps the target audience of JSON developers do their thing. If we don't specify how to generate that form in compact() in 1.1 ... that's not the end of the world.

@azaroth42

Contributor

commented Jul 23, 2019

Thanks for the work on this, Gregg!!
