Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Framing blank node unnamed graphs #27

Closed
dlongley opened this issue Oct 29, 2018 · 24 comments · Fixed by #63
Closed

Framing blank node unnamed graphs #27

dlongley opened this issue Oct 29, 2018 · 24 comments · Fixed by #63

Comments

@dlongley
Copy link
Contributor

@dlongley dlongley commented Oct 29, 2018

I'm trying to create an "identity frame" for some data that uses "@container": "@graph". The data looks like this:

{
  "@context": {
    "@version": 1.1,
    "id": "@id",
    "claim": {"@id": "ex:claim", "@container": "@graph"},
    "name": "ex:name",
    "proof": {"@id": "ex:proof", "@type": "@id", "@container": "@graph"},
    "signer": {"@id": "ex:signer", "@type": "@id"},
    "subject": {"@id": "ex:subject", "@type": "@id"},
    "knows": {"@id": "ex:knows", "@type": "@id"}
  },
  "id": "ex:cred",
  "issuer": {
    "id": "ex:issuer",
    "name": "Foo"
  },
  "subject": {
    "id": "ex:subject",
    "name": "the subject",
    "knows": {
      "id": "ex:issuer",
      "name": "Someone else"
    }
  },
  "proof": {
    "@type": "ex:Proof",
    "name": "the proof",
    "signer": {
      "id": "ex:subject",
      "name": "something different"
    }
  }
}

And the frame like this:

{
  "@context": {
    "@version": 1.1,
    "id": "@id",
    "claim": {"@id": "ex:claim"},
    "name": "ex:name",
    "proof": {"@id": "ex:proof", "@type": "@id"},
    "signer": {"@id": "ex:signer", "@type": "@id"},
    "subject": {"@id": "ex:subject", "@type": "@id"},
    "knows": {"@id": "ex:knows", "@type": "@id"}
  },
  "subject": {
    "@embed": "@always"
  },
  "proof": {}
}

A link to the playground is here: http://tinyurl.com/yaewmkk5

The goal is for the output to match the input, but it does not. Does anyone know what tweaks are necessary to accomplish this? Do we have proper support for framing blank node named graphs? From looking at the jsonld.js implementation, it seemed like only non-blank node named graphs were supported... and the spec doesn't seem to give any examples with blank nodes. It seems we have both a spec and an implementation gap here. If not, could someone share a working example?

@gkellogg

This comment has been minimized.

Copy link
Member

@gkellogg gkellogg commented Oct 29, 2018

Note that the frame will need to have @graph to start with the default graph, rather than the merged graph.

It should do more than that, although my implementation doesn't quite get there. Use the following frame in my distiller:

{
  "@context": {
    "@version": 1.1,
    "id": "@id",
    "claim": {"@id": "ex:claim"},
    "name": "ex:name",
    "proof": {"@id": "ex:proof", "@type": "@id"},
    "signer": {"@id": "ex:signer", "@type": "@id"},
    "subject": {"@id": "ex:subject", "@type": "@id"},
    "knows": {"@id": "ex:knows", "@type": "@id"}
  },
  "@graph": {
    "subject": {
      "@embed": "@always"
    },
    "proof": {}
  }
}

I get the following output:

{
  "@context": {
    "@version": 1.1,
    "id": "@id",
    "claim": {
      "@id": "ex:claim"
    },
    "name": "ex:name",
    "proof": {
      "@id": "ex:proof",
      "@type": "@id"
    },
    "signer": {
      "@id": "ex:signer",
      "@type": "@id"
    },
    "subject": {
      "@id": "ex:subject",
      "@type": "@id"
    },
    "knows": {
      "@id": "ex:knows",
      "@type": "@id"
    }
  },
  "@graph": [
    {
      "ex:proof": {
        "@graph": [
          {
            "@type": "ex:Proof",
            "name": "the proof",
            "signer": "ex:subject"
          },
          {
            "id": "ex:subject",
            "name": "something different"
          }
        ]
      },
      "id": "ex:cred",
      "subject": {
        "id": "ex:subject",
        "knows": {
          "id": "ex:issuer",
          "name": "Someone else"
        },
        "name": "the subject"
      }
    }
  ]
}

The JavaScript implementation doesn't quite do this, and generates an error in any case using this frame.

I'll need to see why the proof term isn't being used, and why we're not dropping the @graph elements. We also don't embed ex:subject, which needs to be investigated.

The step at 4.4.1 in the Framing Algorithm should handle finding the graph and embed it. It should also work on the merged graph, I think, but doesn't seem to for me. I need to see if these are implementation issues, or if the algorithm is missing something. Help appreciated if you can look at it from your end.

@iherman

This comment has been minimized.

Copy link
Member

@iherman iherman commented Feb 9, 2019

This issue was discussed in a meeting.

  • RESOLVED: close framing#27 wontfix, as there’s no justification for the required RDF layer requirement that the blank node identity of the named graph is the default subject of the triples in the graph {: #resolution15 .resolution}
View the transcript Framing blank node unnamed graphs
Rob Sanderson: ref: #27
Gregg Kellogg: how can SHeX validate verifiable claims?
… there was no reasonable way for SHeX to figure out where to start in that graph to begin validation. Why not just reuse the blank node as the default subject of the graph?
Ivan Herman: I remember, and I am opposed to this.
Rob Sanderson: if it was not a blank node, does the problem go away?
Gregg Kellogg: if it had an identity, it wouldn’t get to this point.
Gregg Kellogg: if you use a graph container, should we use the blank node as the default subject for the graph?
Ivan Herman: that’s semantically wrong.
Rob Sanderson: is this a RDF problem?
Ivan Herman: no. A blank node for the graph and a blank node within the graph are two different things.
Gregg Kellogg: JSON people have a tree-based view, and graphs are not required to have a root.
… so it’s not unreasonable to add a property to indicate in the root.
Gregg Kellogg: this is used in framing, where the top node has a id
Harold Solbrig: I object to bnode, because if there’s not a stake in the ground, having magic to b-nodes…
Gregg Kellogg: fragment identifiers would be a better solution.
Proposed resolution: close framing#27 wontfix, as there’s no justification for the required RDF layer requirement that the blank node identity of the named graph is the default subject of the triples in the graph (Rob Sanderson)
Gregg Kellogg: +1
Rob Sanderson: +1
David Newbury: +1
Ivan Herman: +1
Harold Solbrig: +1!
David I. Lehn: +1
Jeff Mixter: +1
Adam Soroka: +0
Resolution #15: close framing#27 wontfix, as there’s no justification for the required RDF layer requirement that the blank node identity of the named graph is the default subject of the triples in the graph {: #resolution15 .resolution}
@iherman iherman closed this Feb 9, 2019
@dlongley

This comment has been minimized.

Copy link
Contributor Author

@dlongley dlongley commented Feb 9, 2019

@iherman,

The conversation doesn't seem to apply to this issue but rather to this other one about SHeX and blank node identifier reuse:

w3c/json-ld-api#26

Was the wrong issue closed?

@dlongley

This comment has been minimized.

Copy link
Contributor Author

@dlongley dlongley commented Feb 9, 2019

This issue has to do with a bug in either the framing algorithm or the spec (or both) ... it is unrelated to SHeX or reusing blank nodes with named graph subjects/etc.

@iherman

This comment has been minimized.

Copy link
Member

@iherman iherman commented Feb 9, 2019

@dlongley I just did what the resolution said in the minutes... @azaroth42, has there been a mistake in the minutes?

(I am on my way to the airport, I cannot check it now.)

@gkellogg

This comment has been minimized.

Copy link
Member

@gkellogg gkellogg commented Feb 9, 2019

Yes, this is an issue we didn’t discuss and needs to be reopened.

@gkellogg gkellogg reopened this Feb 9, 2019
@iherman

This comment has been minimized.

Copy link
Member

@iherman iherman commented Feb 9, 2019

Can somebody send me the changes on the minutes? Better: do the changes?

@iherman

This comment has been minimized.

Copy link
Member

@iherman iherman commented Feb 15, 2019

For the records: #27 (comment) is indeed unrelated to this issue. Change happened on the minutes on 2019-02-15.

@iherman

This comment has been minimized.

Copy link
Member

@iherman iherman commented Feb 15, 2019

This issue was discussed in a meeting.

  • RESOLVED: Leave framing 27 open, and to investigate the cause. Record open issue in the next PWD of Framing. {: #resolution2 .resolution}
View the transcript 4. Framing blank node unnamed graphs (Issue frame27)
Rob Sanderson: the issue is about Framing blank node unnamed graphs
Rob Sanderson: #27
Dave Longley: I have an example of a document with a graph container,
… and couldn’t to get the same document as my input using framing.
… It seems to defeat the purpose of framing.
Gregg Kellogg: It is true that this should be possible,
… but I doubt we can address that problem before the next publication.
Rob Sanderson: should there be a note in the next WD about that issue?
Proposed resolution: Leave framing 27 open, and to investigate the cause. Record open issue in the next PWD of Framing. (Rob Sanderson)
Gregg Kellogg: this would appear in the list of open issue that we append to each WD
Simon Steyskal: +1
Pierre-Antoine Champin: +1
Rob Sanderson: +1
Dave Longley: +1
Adam Soroka: +1
Harold Solbrig: +1
Ivan Herman: +1
David Newbury: +1
Benjamin Young: +1
Rob Sanderson: given the low number of open issue, I would prefer some editor text to make the issue more visible
David I. Lehn: +1
Resolution #2: Leave framing 27 open, and to investigate the cause. Record open issue in the next PWD of Framing. {: #resolution2 .resolution}
Gregg Kellogg: +1
Dave Longley: once we get sealed contexts working the way we need we might be able to look into addressing this one next
Dave Longley: (we == Digital Bazaar)
@azaroth42 azaroth42 moved this from Discuss-GH to Editorial Work in JSON-LD Management Apr 4, 2019
@gkellogg

This comment has been minimized.

Copy link
Member

@gkellogg gkellogg commented Aug 7, 2019

So, I re-tried this with the following frame:

{
  "@context": {
    "@version": 1.1,
    "id": "@id",
    "claim": {"@id": "ex:claim"},
    "name": "ex:name",
    "proof": {"@id": "ex:proof", "@type": "@id", "@container": "@graph"},
    "signer": {"@id": "ex:signer", "@type": "@id"},
    "subject": {"@id": "ex:subject", "@type": "@id"},
    "knows": {"@id": "ex:knows", "@type": "@id"}
  },
  "@graph": {
    "subject": {
      "@embed": "@always"
    },
    "proof": {}
  }
}

I do get mostly the expected results:

{
  "@context": {
    "@version": 1.1,
    "id": "@id",
    "claim": {
      "@id": "ex:claim"
    },
    "name": "ex:name",
    "proof": {
      "@id": "ex:proof",
      "@type": "@id",
      "@container": "@graph"
    },
    "signer": {
      "@id": "ex:signer",
      "@type": "@id"
    },
    "subject": {
      "@id": "ex:subject",
      "@type": "@id"
    },
    "knows": {
      "@id": "ex:knows",
      "@type": "@id"
    }
  },
  "id": "ex:cred",
  "subject": {
    "id": "ex:subject",
    "name": "the subject",
    "knows": {
      "id": "ex:issuer",
      "name": "Someone else"
    }
  },
  "proof": [
    {
      "@type": "ex:Proof",
      "name": "the proof",
      "signer": {
        "id": "ex:subject",
        "name": "something different"
      }
    },
    "ex:subject"
  ]
}

The issuer property is missing, but it's not defined in the context. By adding @container: @graph, the proof property is used, and contains the graph value, although it's in an array, and probably shouldn't be with out the @set being present in the context definition.

The playground still doesn't generate the right output, however, but it seems that the algorithm is (mostly) correct.

@gkellogg gkellogg moved this from Editorial Work to Editorial work complete in JSON-LD Management Aug 7, 2019
@gkellogg

This comment has been minimized.

Copy link
Member

@gkellogg gkellogg commented Aug 7, 2019

Actually, the only oddity here is that "proof" contains two items, the second of which is "ex:subject", and I'm not sure where that comes from.

@gkellogg

This comment has been minimized.

Copy link
Member

@gkellogg gkellogg commented Aug 19, 2019

Actually, the only oddity here is that "proof" contains two items, the second of which is "ex:subject", and I'm not sure where that comes from.

This is because an added restriction is necessary to get a single item, which is done in #63.

@azaroth42 azaroth42 moved this from Editorial work complete to Discuss-Call in JSON-LD Management Aug 29, 2019
@dlongley

This comment has been minimized.

Copy link
Contributor Author

@dlongley dlongley commented Aug 30, 2019

@gkellogg,

This is because an added restriction is necessary to get a single item...

So while the output in #63 looks right to me, I don't understand why adding the type should be required to avoid bringing in ex:subject as another match for proof. In fact, I still don't understand where that's coming from, it seems like an artifact that wouldn't generate any quads. Furthermore, the original data does not have a proof relationship that points at two blank node named graphs, rather it points at just one... anyway, I'm confused about that.

@iherman

This comment has been minimized.

Copy link
Member

@iherman iherman commented Aug 30, 2019

This issue was discussed in a meeting.

  • No actions or resolutions
View the transcript Rob Sanderson: See Framing #27
Dave Longley: See a particular comment
Dave Longley: I raised this issue while trying to frame VC-like data,
… asked if other implementations had the same issue.
… This example generate a strange artifact (“ex:subject” at the bottom).
Gregg Kellogg: this artifact does not produce any triple,
… we can remove it by filtering on type. But is the output still correct?
… The issue may not be in framing, but somewhere else (maybe compaction?).
Rob Sanderson: to try and summarize: in the original data, there is exactly one named graph.
… inside that is a proof which is a graph container.
… The frame generates the subject as a separate graph?
Gregg Kellogg: (something about this behavior being legitimate with other data)
Dave Longley: may be this could be solved with @include?
Gregg Kellogg: there might be other holes in framing which we have not seen yet…
… We should try to flesh out the test suite further.
@gkellogg gkellogg closed this in #63 Aug 30, 2019
@gkellogg gkellogg reopened this Aug 30, 2019
@gkellogg

This comment has been minimized.

Copy link
Member

@gkellogg gkellogg commented Aug 30, 2019

Some observations:

  • Some of the issues are in compaction, but I did try a test where I used @embed: @never at the top level, which prevents anything but a node reference from being added.
    • Seems like we should only honor the @never (or other embedding restrictions) if we are actually embedded, and not at the top level (may require some additional state).
    • Adding an embedded flag to the framing state, defaulting to false, and set to true when recursing to embedded nodes would seem to address this.
  • It seems we should not add a node at the top level if it is embedded, but this creates a processing order dependency.
  • In compaction, I think more work for graph containers is necessary, and use @included if there is more than one.
    • We could also put @included within the first/arbitrary object,
    • or also not select that property if there’s more than a single top-level node object, which may make sense too.
gkellogg added a commit that referenced this issue Aug 30, 2019
…efaulting to false, and set to true when recursing into node objects. Add tests to not add node at top level if embedded is false and it has already been serialized, and also use the check on `@embed: @never` to make that happen only when in an embedded node object.

Update test g010 to not use `@type` in framing.

For #27.
@dlongley

This comment has been minimized.

Copy link
Contributor Author

@dlongley dlongley commented Aug 31, 2019

@gkellogg,

  • Seems like we should only honor the @never (or other embedding restrictions) if we are actually embedded, and not at the top level (may require some additional state).

Would this make it impossible to use @never to produce output with only node references? For example:

Input:

{
  "@graph": [{
    "@id": "ex:foo1",
    "@type": "ex:Foo"
  }, {
    "@id": "ex:foo2",
    "@type": "ex:Foo"    
  }, {
    "@id": "ex:bar",
    "@type": "ex:Bar"
  }]
}

Frame:

{
  "@type": "ex:Foo",
  "@embed": "@never"
}

Output:

{
  "@graph": [
    {
      "@id": "ex:foo1"
    },
    {
      "@id": "ex:foo2"
    }
  ]
}

I don't know if that's a use case or not -- but I suppose it could be. I think that's my only potential concern with the approach.

  • In compaction, I think more work for graph containers is necessary, and use @included if there is more than one.
  • We could also put @included within the first/arbitrary object,
    or also not select that property if there’s more than a single top-level node object, which may make sense too.

So here are my thoughts on the two options:

My first thought was that seeing @included in the output when using "@container": "@graph" would be more undesirable/unexpected. Post framing, you'd have to check for this special case within your results. This suggested to me that we ought to treat it like a non-match. The whole point of using "@container": "@graph" is that you're defining a term that is intended to have a value of one or more blank node named graphs and you're looking for a clean, compact way to express that. Injecting an @included in there is at odds with that.

However, on second thought, I think it would be odd to make it impossible to compact to an "@container": "@graph" term when you've got a blank node named graphs with multiple roots (i.e., you're using @included). We just introduced this @included term to make it easier for people to do things just like that. I imagine authors may eventually craft JSON-LD that does just this and they'd be quite surprised that it would not round trip using their context. It feels like preventing a term match here would therefore be creating a special case which we should avoid.

So, I think using @included seems like the best path forward of the two options. I think frame authors should either focus on creating frames that ensure @included won't surface or be ready to handle/reject results where it does.

Also, I didn't quite understand this comment (emphasis mine):

We could also put @included within the first/arbitrary object

I may be misunderstanding what you're saying here, but it wouldn't be arbitrary nor would I think it would affect compaction because @included would be in the expanded output produced during framing.

So we'd get something like this from framing:

// top level
"@graph": [{
  // will be come a `@graph: @container`
  // post compaction
  "ex:proof": [{
    // first blank node named graph
    "@graph": [{
      // two matches at the graph root level
      "@included": [{
        "@id": "ex:foo1",
        "@type": ["ex:Foo"]
      }, {
        "@id": "ex:foo2"
        "@type": ["ex:Foo"]
      }]
    }]
  }, {
    // second blank node named graph
    "@graph": [{
      // just one match at the graph root level
      "@id": "ex:bar1",
      "@type": ["ex:Bar"]
    }]
  }]
}]

Compaction would just run normally on that and produce:

// top level
"@context": {
  "proof": {"@id": "ex:proof", "@container": "@graph"}
},
"@graph": [{
  "proof": [{
    // first blank node named graph
    // two matches at the root graph level
    "@included": [{
      "@id": "ex:foo1",
      "@type": "ex:Foo"
    }, {
      "@id": "ex:foo2"
      "@type": "ex:Foo"
    }]
  }, {
    // second blank node named graph
    // just one match at the root graph level
    "@id": "ex:bar1",
    "@type": "ex:Bar"
  }]
}]

So, framing would handle injecting @included based on the same named graph where multiple matches occurred. So, if you're inside a named graph and you're at the "root level" of that named graph, and we find more than one match, then we have to wrap the matches in an object with a key of @included and a value that is the array of matches -- and use that as the single value for the named graph @graph array. With just one match, we use it directly as the value for the named graph @graph array. If this is what you meant then we're on the same page. But it seems to me like this wouldn't be a modification to compaction at all.

In fact, maybe we could get away with always outputting @included for named graph matches and letting the compaction step of framing remove it when there's just one node (does compaction do that now?). I don't know what would be easier/more efficient, just putting that out there.

  • It seems we should not add a node at the top level if it is embedded, but this creates a processing order dependency.

I thought @once covered most of these cases already -- and that we didn't add something to the top level unless it matched the frame up there. If it matches at the top level and elsewhere, I don't see why we'd exclude it at the top level in particular. That seems like it would have to be a different @embed mode -- and that it would introduce more complexity. Do we really need to do anything here?

@gkellogg

This comment has been minimized.

Copy link
Member

@gkellogg gkellogg commented Aug 31, 2019

@gkellogg,

  • Seems like we should only honor the @never (or other embedding restrictions) if we are actually embedded, and not at the top level (may require some additional state).

Would this make it impossible to use @never to produce output with only node references? For example:

Input:

{
  "@graph": [{
    "@id": "ex:foo1",
    "@type": "ex:Foo"
  }, {
    "@id": "ex:foo2",
    "@type": "ex:Foo"    
  }, {
    "@id": "ex:bar",
    "@type": "ex:Bar"
  }]
}

Frame:

{
  "@type": "ex:Foo",
  "@embed": "@never"
}

Output:

{
  "@graph": [
    {
      "@id": "ex:foo1"
    },
    {
      "@id": "ex:foo2"
    }
  ]
}

I don't know if that's a use case or not -- but I suppose it could be. I think that's my only potential concern with the approach.

Yes, this wouldn't be supported, but that's not embedding. The @embed flag is about embedding nodes within other nodes, not about expressing top-level node references, for which there really is no use case as they simply describe free flowing nodes with no incoming or outgoing vertices. I'd say it's an anti-pattern to use node references with no incoming links.

From the syntax doc:

A common idiom found in JSON usage is objects being specified as the value of other objects, called object embedding in JSON-LD...

Embedding is a JSON-LD feature that allows an author to use node objects as property values. This is a commonly used mechanism for creating a parent-child relationship between two nodes.

(As an aside, the algorithm should probably also display node references as immediate values of @included.)

@gkellogg

This comment has been minimized.

Copy link
Member

@gkellogg gkellogg commented Aug 31, 2019

  • In compaction, I think more work for graph containers is necessary, and use @included if there is more than one.
  • We could also put @included within the first/arbitrary object,
    or also not select that property if there’s more than a single top-level node object, which may make sense too.

So here are my thoughts on the two options:

My first thought was that seeing @included in the output when using "@container": "@graph" would be more undesirable/unexpected. Post framing, you'd have to check for this special case within your results. This suggested to me that we ought to treat it like a non-match. The whole point of using "@container": "@graph" is that you're defining a term that is intended to have a value of one or more blank node named graphs and you're looking for a clean, compact way to express that. Injecting an @included in there is at odds with that.

👍

However, on second thought, I think it would be odd to make it impossible to compact to an "@container": "@graph" term when you've got a blank node named graphs with multiple roots (i.e., you're using @included). We just introduced this @included term to make it easier for people to do things just like that. I imagine authors may eventually craft JSON-LD that does just this and they'd be quite surprised that it would not round trip using their context. It feels like preventing a term match here would therefore be creating a special case which we should avoid.

But we have many cases where you can't compact using a term which doesn't match. I think it's in perfect keeping with the spirit of term selection (as challenging as it may be) to only match when the value pattern allows it to be expressed using this term, otherwise a more appropriate term could be selected.

So, I think using @included seems like the best path forward of the two options. I think frame authors should either focus on creating frames that ensure @included won't surface or be ready to handle/reject results where it does.

👎

Also, I didn't quite understand this comment (emphasis mine):

We could also put @included within the first/arbitrary object

I may be misunderstanding what you're saying here, but it wouldn't be arbitrary nor would I think it would affect compaction because @included would be in the expanded output produced during framing.

So we'd get something like this from framing:

// top level
"@graph": [{
  // will be come a `@graph: @container`
  // post compaction
  "ex:proof": [{
    // first blank node named graph
    "@graph": [{
      // two matches at the graph root level
      "@included": [{
        "@id": "ex:foo1",
        "@type": ["ex:Foo"]
      }, {
        "@id": "ex:foo2"
        "@type": ["ex:Foo"]
      }]
    }]
  }, {
    // second blank node named graph
    "@graph": [{
      // just one match at the graph root level
      "@id": "ex:bar1",
      "@type": ["ex:Bar"]
    }]
  }]
}]

Compaction would just run normally on that and produce:

// top level
"@context": {
  "proof": {"@id": "ex:proof", "@container": "@graph"}
},
"@graph": [{
  "proof": [{
    // first blank node named graph
    // two matches at the root graph level
    "@included": [{
      "@id": "ex:foo1",
      "@type": "ex:Foo"
    }, {
      "@id": "ex:foo2"
      "@type": "ex:Foo"
    }]
  }, {
    // second blank node named graph
    // just one match at the root graph level
    "@id": "ex:bar1",
    "@type": "ex:Bar"
  }]
}]

So, putting @embedded in the first object would be reproducible if the ordered option is used, so in your case above, the result could be the following:

// top level
"@context": {
  "proof": {"@id": "ex:proof", "@container": "@graph"}
},
"@graph": [{
  "proof": [{
    // First node exists as value of _proof_
    "@id": "ex:foo1",
    "@type": "ex:Foo"
    "@included": [{
      // Second node _included_ within first node.
      "@id": "ex:foo2"
      "@type": "ex:Foo"
    }]
  }, {
    // second blank node named graph
    // just one match at the root graph level
    "@id": "ex:bar1",
    "@type": "ex:Bar"
  }]
}]

This is more in keeping of the sprit of @included, IMHO.

So, framing would handle injecting @included based on the same named graph where multiple matches occurred. So, if you're inside a named graph and you're at the "root level" of that named graph, and we find more than one match, then we have to wrap the matches in an object with a key of @included and a value that is the array of matches -- and use that as the single value for the named graph @graph array. With just one match, we use it directly as the value for the named graph @graph array. If this is what you meant then we're on the same page. But it seems to me like this wouldn't be a modification to compaction at all.

If we were to do so, this step would most likely be handled by compaction, as framing just results in an expanded output, in which there is no @container: @graph to deal with, and arbitrarily inserting @included would not mesh well with other compaction strategies.

I think the mechanism I showed is more natural, but unless ordering, the selection of the first node may be arbitrary, but this is a corner-case. This could be something done in the compaction algorithm, when the term has @container: @graph and there are multiple values, but as I said above, I think this just shouldn't be a match.

In fact, maybe we could get away with always outputting @included for named graph matches and letting the compaction step of framing remove it when there's just one node (does compaction do that now?). I don't know what would be easier/more efficient, just putting that out there.

In expanded form, named graphs are serialized like the following:

{
  "@graph": [{
    "ex:proof": [
      {"@id": "_:b1"},
      {"@id": "_:b2"}
    ]
  }, {
    "@graph": [{
      "@id": "ex:foo2",
      "@type": "ex:Foo"
    }, {
      "@id": "ex:foo1",
      "@type": "ex:Foo"
    }]
  }, {
    "@graph": [{
      "@id": "ex:bar1",
      "@type": "ex:Bar"
    }]
  }]
}

This falls out of the current way RDF would serialize quads to JSON-LD. Of course, you could always frame with @embed to if that's the shape you want.

  • It seems we should not add a node at the top level if it is embedded, but this creates a processing order dependency.

I thought @once covered most of these cases already -- and that we didn't add something to the top level unless it matched the frame up there. If it matches at the top level and elsewhere, I don't see why we'd exclude it at the top level in particular. That seems like it would have to be a different @embed mode -- and that it would introduce more complexity. Do we really need to do anything here?

The issue came to light when creating a trivial frame with @embed: @never in the top-level of a named graph. In that case, I think it would indicate that all nodes should be created directly under @graph, and only node references embedded, but the algorithm didn't handle that, as it doesn't know the difference between the top-level and something embedded.

With @embed: @once, the implication is that it should not appear at the top-level if it is embedded, rather than serializing a node reference at the top-level. The wording of 4.3, which handles @once is a bit odd: "... and parent has an existing embedded node in parent associated with graph name ...". I think it means that if the node has already been serialized in the graph, output a node reference, but this doesn't not consider if it's at top-level.

The embedded flag would seem to be necessary to record if we're in an embedding state, or not.

@dlongley

This comment has been minimized.

Copy link
Contributor Author

@dlongley dlongley commented Aug 31, 2019

But we have many cases where you can't compact using a term which doesn't match. I think it's in perfect keeping with the spirit of term selection (as challenging as it may be) to only match when the value pattern allows it to be expressed using this term, otherwise a more appropriate term could be selected.

Can I get clarity on what you think should be a match and what shouldn't? For example:

Would this roundtrip (proof would be a match for the value)?

"@context": {
  "proof": {"@id": "ex:proof", "@container": "@graph"}
},
"proof": {
  "@id": "ex:foo1",
  "@type": "ex:Foo"
  "@included": {
    "@id": "ex:foo2"
    "@type": "ex:Foo"
  }
}

What about this?

"@context": {
  "proof": {"@id": "ex:proof", "@container": "@graph"}
},
"proof": {
  "@included": [{
    "@id": "ex:foo1",
    "@type": "ex:Foo"
  }, {
    "@id": "ex:foo2"
    "@type": "ex:Foo"
  }]
}

If you think both of these should be matches, can you give an example of something that wouldn't roundtrip? If you think the first should round trip but not the second, don't you think that will be confusing/inconsistent? I would think it would special case @included and make it harder to use to represent "bushes" since there are corner cases to consider.

I'd rather people framing things have to consider cases where their specified matches might produce multiple results at the root level of a graph. That's already a thing they have to do with the default graph when @container: @graph isn't involved.

It seems to me that the rule for whether or not @container: @graph is a match is just if a value uses a simple blank node named graph. That also matches the definition/use for the feature. Why would a user expect a non-match just because a particular blank node named graph had multiple roots?

@dlongley

This comment has been minimized.

Copy link
Contributor Author

@dlongley dlongley commented Aug 31, 2019

The issue came to light when creating a trivial frame with @embed: @Never in the top-level of a named graph. In that case, I think it would indicate that all nodes should be created directly under @graph, and only node references embedded, but the algorithm didn't handle that, as it doesn't know the difference between the top-level and something embedded.

With @embed: @once, the implication is that it should not appear at the top-level if it is embedded, rather than serializing a node reference at the top-level. The wording of 4.3, which handles @once is a bit odd: "... and parent has an existing embedded node in parent associated with graph name ...". I think it means that if the node has already been serialized in the graph, output a node reference, but this doesn't not consider if it's at top-level.

The embedded flag would seem to be necessary to record if we're in an embedding state, or not.

👍

@gkellogg

This comment has been minimized.

Copy link
Member

@gkellogg gkellogg commented Aug 31, 2019

But we have many cases where you can't compact using a term which doesn't match. I think it's in perfect keeping with the spirit of term selection (as challenging as it may be) to only match when the value pattern allows it to be expressed using this term, otherwise a more appropriate term could be selected.

Can I get clarity on what you think should be a match and what shouldn't? For example:

Would this roundtrip (proof would be a match for the value)?

"@context": {
  "proof": {"@id": "ex:proof", "@container": "@graph"}
},
"proof": {
  "@id": "ex:foo1",
  "@type": "ex:Foo"
  "@included": {
    "@id": "ex:foo2"
    "@type": "ex:Foo"
  }
}

Yes, because when selecting "proof", you have a value with just a single node definition. The contents of @included isn't considered.

What about this?

"@context": {
  "proof": {"@id": "ex:proof", "@container": "@graph"}
},
"proof": {
  "@included": [{
    "@id": "ex:foo1",
    "@type": "ex:Foo"
  }, {
    "@id": "ex:foo2"
    "@type": "ex:Foo"
  }]
}

Yes, that would match too, because there is still just a single member (@included: [...]).

If you think both of these should be matches, can you give an example of something that wouldn't roundtrip? If you think the first should round trip but not the second, don't you think that will be confusing/inconsistent? I would think it would special case @included and make it harder to use to represent "bushes" since there are corner cases to consider.

Look how each expands:

[{
  "ex:proof": [{
    "@graph": [{
      "@id": "ex:foo1",
      "@type": ["ex:Foo"],
      "@included": [{
        "@id": "ex:foo2",
        "@type": ["ex:Foo"]
      }]
    }]
  }]
}]

and

[{
  "ex:proof": [{
    "@graph": [{
      "@included": [{
        "@id": "ex:foo1",
        "@type": ["ex:Foo"]
      }, {
        "@id": "ex:foo2",
        "@type": ["ex:Foo"]
      }]
    }]
  }]
}]

If, instead, you had the following expanded:

[{
  "ex:proof": [{
    "@graph": [{
      "@id": "ex:foo1",
      "@type": ["ex:Foo"]
    }, {
      "@id": "ex:foo2",
      "@type": ["ex:Foo"]
    }]
  }]
}]

it would not select "proof", as the value to compact has two members. You can address this in framing through the use of @included.

I'd rather people framing things have to consider cases where their specified matches might produce multiple results at the root level of a graph. That's already a thing they have to do with the default graph when @container: @graph isn't involved.

It seems to me that the rule for whether or not @container: @graph is a match is just if a value uses a simple blank node named graph. That also matches the definition/use for the feature. Why would a user expect a non-match just because a particular blank node named graph had multiple roots?

Well, I won't stand in the way of this, but in that case, I'd say that the compaction algorithm should insert the @included in the first node definition member with the remaining nodes to yield the following:

{
  "@context": {
    "@version": 1.1,
    "proof": {"@id": "ex:proof", "@container": "@graph"}
  },
  "proof": {
    "@id": "ex:foo1",
    "@type": "ex:Foo",
    "@included": {
      "@id": "ex:foo2",
      "@type": "ex:Foo"
    }
  }
}
@gkellogg

This comment has been minimized.

Copy link
Member

@gkellogg gkellogg commented Aug 31, 2019

Note that this discussion is really about w3c/json-ld-api#143. If we agree what the framing algorithm should do, as indicated by this PR, we can close it.

The only change I'll make is to also consider values of @included to be top-level nodes and set embedded flag to false.

gkellogg added a commit that referenced this issue Sep 1, 2019
…efaulting to false, and set to true when recursing into node objects. Add tests to not add node at top level if embedded is false and it has already been serialized, and also use the check on `@embed: @never` to make that happen only when in an embedded node object.

Update test g010 to not use `@type` in framing.

For #27.
@gkellogg gkellogg closed this Sep 2, 2019
@gkellogg gkellogg removed this from Discuss-Call in JSON-LD Management Sep 2, 2019
@azaroth42

This comment has been minimized.

Copy link
Contributor

@azaroth42 azaroth42 commented Sep 5, 2019

Should have a formal resolution to close.

@azaroth42 azaroth42 reopened this Sep 5, 2019
@gkellogg

This comment has been minimized.

Copy link
Member

@gkellogg gkellogg commented Sep 9, 2019

This issue was discussed in a meeting.

  • RESOLVED: Close api #143 as resolved by api PR #145 and #146
View the transcript Framing blank nodes
Rob Sanderson: last discussion we agreed that we couldn’t solve it on a call
… so gkellog and dlongley went off to look at it
Gregg Kellogg: we found a problem in a framing test where @container : @graph got mangled in re-expansion
… a bug in the compaction algo
… if the value is an array, it puts them in an @included block
… i tried [s solution] but it turned out not to be defined well enough
Rob Sanderson: all of that is solved and merged?
Gregg Kellogg: yep
Gregg Kellogg: See API PR #146
Gregg Kellogg: See API PR #145
Proposed resolution: Close framing #27 as not being the issue, and the real issues being addressed is api #143, solved by api PRs # 145 and #146 (Rob Sanderson)
Rob Sanderson: +1
Benjamin Young: +1
Dave Longley: +1
Gregg Kellogg: +1
Ivan Herman: +1
Adam Soroka: +1
Pierre-Antoine Champin: +1
Ruben Taelman: +1
Rob Sanderson: RESOLVE: Close framing #27 as not being the issue, and the real issues being addressed is api #143, solved by api PRs # 145 and #146
Proposed resolution: Close api #143 as resolved by api PR #145 and #146 (Rob Sanderson)
Rob Sanderson: +1
Ivan Herman: +1
Ruben Taelman: +1
Gregg Kellogg: +1
Adam Soroka: +1
Pierre-Antoine Champin: +1
Benjamin Young: +1
Dave Longley: +1
Resolution #2: Close api #143 as resolved by api PR #145 and #146
@gkellogg gkellogg closed this Sep 9, 2019
@azaroth42 azaroth42 added the satisfied label Nov 18, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.