Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compacting graph container with multiple nodes broken #143

Closed
gkellogg opened this issue Aug 31, 2019 · 19 comments · Fixed by #146
Closed

Compacting graph container with multiple nodes broken #143

gkellogg opened this issue Aug 31, 2019 · 19 comments · Fixed by #146

Comments

@gkellogg
Copy link
Member

gkellogg commented Aug 31, 2019

As described in w3c/json-ld-framing#27 (comment), there is an issue when compacting named graphs with multiple top-level nodes under a property with @container: @graph.

Consider the following:

{
  "@context": {
    "@version": 1.1,
    "@vocab": "http://example.org/"
  },
  "input": {
    "@graph": [
      {"value": "x"},
      {"value": "y"}
    ]
  }
}

When compacted using:

{
  "@context": {
    "@version": 1.1,
    "@vocab": "http://example.org/",
    "input": {"@container": "@graph"}
  }
}

Generates the following:

{
  "@context": {
    "@version": 1.1,
    "@vocab": "http://example.org/",
    "input": {"@container": "@graph"}
  },
  "input": [
    {"value": "x"},
    {"value": "y"}
  ]
}

But this describes two different named graphs.

We should either not select the term for compaction if there is more than one top-level node, or use either @graph or @included to wrap them all together, which pretty much undoes the affect of @container: @graph.

@gkellogg
Copy link
Member Author

I think we should solve this by not selecting "input", as it can't express the result properly. Adding the use of @graph or @included would be just as if we weren't using a graph container, and could end up allocating a blank node for a graph name which isn't actually used.

@dlongley
Copy link
Contributor

@gkellogg,

Hmm, or is it the case that we have an expansion bug here? Shouldn't the above input actually be equivalent to this (note @graph: @container is removed)?:

{
  "@context": {
    "@version": 1.1,
    "@vocab": "http://example.org/"
  },
  "input": {
    "@graph": [{
      "@graph": [
        {"value": "x"},
        {"value": "y"}
      ]
    }]
  }
}

When using @graph inside of an @graph: @container it would seem to me to generate yet another blank node named graph. Which would mean that we'd want the expanded output to be:

[{
  "http://example.org/input": [{
    "@graph": [{
      "@graph": [{
        "http://example.org/value": [{"@value": "x"}]
      }, {
        "http://example.org/value": [{"@value": "y"}]
      }]
    }]
  }]
}]

But, instead, it's dropping an @graph:

[{
  "http://example.org/input": [{
    "@graph": [{
      "http://example.org/value": [{"@value": "x"}]
    }, {
      "http://example.org/value": [{"@value": "y"}]
    }]
  }]
}]

@dlongley
Copy link
Contributor

This step of the Expansion Algorithm has a caveat that causes the above to happen:

13.8.3.6.1) If container mapping includes @graph and if item is not a graph object, set item to a new map containing the key-value pair @graph-item, ensuring that the value is represented using an array.

That step is in there to prevent double expanding graphs -- but I think that might just be wrong. I'm going through the tests now to see what's going on there, and so far all of the expansion tests are just testing to make sure we don't do this double expansion whenever a @container: @graph is used. But, it seems to me, we would want it to do the double expansion.

@dlongley
Copy link
Contributor

Yes, that seems to be all that's tested for, to handle this special case. There are toRDF tests as well but those are just repeats of these special case expansion tests. It seems like they should either be changed or removed if we can agree that dropping an extra @graph into the value should change what's being expressed. I think it should and I'm not sure why it was ever any other way (perhaps it was something wacky I did in the original implementation). Ideally we'd just drop this special handling.

@gkellogg
Copy link
Member Author

@gkellogg,

Hmm, or is it the case that we have an expansion bug here? Shouldn't the above input actually be equivalent to this (note @graph: @container is removed)?:

{
  "@context": {
    "@version": 1.1,
    "@vocab": "http://example.org/"
  },
  "input": {
    "@graph": [{
      "@graph": [
        {"value": "x"},
        {"value": "y"}
      ]
    }]
  }
}

When using @graph inside of an @graph: @container it would seem to me to generate yet another blank node named graph. Which would mean that we'd want the expanded output to be:

That's not what @graph with multiple values means, from 4.9 Named Graphs, when not used at the top-level, @graph introduces a single named graph, taking the graph name from the node containing the @graph, if present, or a blank node otherwise. So, the double @graph isn't necessary (and would be extremely confusing).

Your example would yield the following quads:

_:b0 <http://example.org/input> _:b1 .
_:b4 <http://example.org/value> "y" _:b2 .
_:b3 <http://example.org/value> "x" _:b2 .

Without the extra @graph, you get the following:

_:b0 <http://example.org/input> _:b1 .
_:b3 <http://example.org/value> "y" _:b1 .
_:b2 <http://example.org/value> "x" _:b1 .

which is, what we want.

Going back to JSON-LD, my distiller gives the following:

{
  "@graph": [{
    "@id": "_:b0",
    "http://example.org/input": {"@id": "_:b1"}
  },
  {
    "@id": "_:b1",
    "@graph": [{
      "@id": "_:b3",
      "http://example.org/value": "y"
    }, {
      "@id": "_:b2",
      "http://example.org/value": "x"
    }
    ]
  }]
}

Which could then be trivially framed back to:

{
  "http://example.org/input": {
    "@graph": [{
      "http://example.org/value": "y"
    },{
      "http://example.org/value": "x"
    }]
  }
}

@dlongley
Copy link
Contributor

dlongley commented Aug 31, 2019

So, the double @graph isn't necessary (and would be extremely confusing).

I don't think this is about it being necessary, rather what it means if someone does it.

A term with @container: @graph means that its values will be interpreted as being in an implicit blank node named graph. If someone adds an additional @graph in there -- I don't think we should be doing some extra lifting to remove it for them, rather, we should assume that it is what they meant to express. Doing otherwise, I think, is more confusing. So if an author is using such a term, don't add @graph to a value unless you really want to create yet another named graph. And, yes, the quads would be just as you say.

Imagine a scenario where you wanted to make statements about another non-blank node named graph in a graph container. How would you do it? Imagine someone's attempt:

{
  "@context": {
    "@version": 1.1,
    "@vocab": "http://example.org/",
    "input": {"@container": "@graph"}
  },
  "input": {
    "@id": "ex:someIdentifier",
    "ex:about": "this graph",
    "@graph": [
      {"value": "x"},
      {"value": "y"}
    ]
  }
}

If you converted to quads, you'd get:

<ex:someIdentifier> <ex:about> "this graph" .
_:b0 <http://example.org/input> <ex:someIdentifier> .
_:b1 <http://example.org/value> "x" <ex:someIdentifier> .
_:b2 <http://example.org/value> "y" <ex:someIdentifier> .

But this is wrong! It's as if there was no @container: @graph at all. That "ex:about" statement is supposed to be wrapped up and isolated in a blank node named graph, but it's bled into the default graph instead. The quads you should be getting would be more like:

<ex:someIdentifier> <ex:about> "this graph" _:b1 .
_:b0 <http://example.org/input> _:b1 .
_:b2 <http://example.org/value> "x" <ex:someIdentifier> .
_:b3 <http://example.org/value> "y" <ex:someIdentifier> .

I'm asserting that this:

{
  "@context": {
    "@version": 1.1,
    "@vocab": "http://example.org/",
    "input": {"@container": "@graph"}
  },
  "input": {
    "@id": "ex:someIdentifier",
    "ex:about": "this graph",
    "@graph": [
      {"value": "x"},
      {"value": "y"}
    ]
  }
}

And this:

{
  "@context": {
    "@version": 1.1,
    "@vocab": "http://example.org/"
  },
  "input": {
    "@id": "ex:someIdentifier",
    "ex:about": "this graph",
    "@graph": [
      {"value": "x"},
      {"value": "y"}
    ]
  }
}

Are not the same thing. Why should we treat them as such?

@gkellogg
Copy link
Member Author

So, the double @graph isn't necessary (and would be extremely confusing).

I don't think this is about it being necessary, rather what it means if someone does it.

A term with @container: @graph means that its values will be interpreted as being in an implicit blank node named graph. If someone adds an additional @graph in there -- I don't think we should be doing some extra lifting to remove it for them, rather, we should assume that it is what they meant to express. Doing otherwise, I think, is more confusing. So if an author is using such a term, don't add @graph to a value unless you really want to create yet another named graph. And, yes, the quads would be just as you say.

Imagine a scenario where you wanted to make statements about another non-blank node named graph in a graph container. How would you do it? Imagine someone's attempt:

{
  "@context": {
    "@version": 1.1,
    "@vocab": "http://example.org/",
    "input": {"@container": "@graph"}
  },
  "input": {
    "@id": "ex:someIdentifier",
    "ex:about": "this graph",
    "@graph": [
      {"value": "x"},
      {"value": "y"}
    ]
  }
}

If you converted to quads, you'd get:

<ex:someIdentifier> <ex:about> "this graph" .
_:b0 <http://example.org/input> <ex:someIdentifier> .
_:b1 <http://example.org/value> "x" <ex:someIdentifier> .
_:b2 <http://example.org/value> "y" <ex:someIdentifier> .

But this is wrong! It's as if there was no @container: @graph at all. That "ex:about" statement is supposed to be wrapped up and isolated in a blank node named graph, but it's bled into the default graph instead. The quads you should be getting would be more like:

It is wrong, and that's not what I get on my distiller.

<ex:someIdentifier> <ex:about> "this graph" _:b1 .
_:b0 <http://example.org/input> _:b1 .
_:b2 <http://example.org/value> "x" <ex:someIdentifier> .
_:b3 <http://example.org/value> "y" <ex:someIdentifier> .

That is what I get. That's also what I get on the json-ld.org playground.

I'm asserting that this:

{
  "@context": {
    "@version": 1.1,
    "@vocab": "http://example.org/",
    "input": {"@container": "@graph"}
  },
  "input": {
    "@id": "ex:someIdentifier",
    "ex:about": "this graph",
    "@graph": [
      {"value": "x"},
      {"value": "y"}
    ]
  }
}

And this:

{
  "@context": {
    "@version": 1.1,
    "@vocab": "http://example.org/"
  },
  "input": {
    "@id": "ex:someIdentifier",
    "ex:about": "this graph",
    "@graph": [
      {"value": "x"},
      {"value": "y"}
    ]
  }
}

Are not the same thing. Why should we treat them as such?

We don't.

@dlongley
Copy link
Contributor

I was generating that output using the playground. I'm not sure what happened, but it was probably user (me) error.

So here's where things get weird then:

This:

{
  "@context": {
    "@version": 1.1,
    "@vocab": "http://example.org/",
    "input": {"@container": "@graph"}
  },
  "input": {
    "ex:about": "this graph",
    "@graph": [
      {"value": "x"},
      {"value": "y"}
    ]
  }
}

Yields these quads (if I'm not misusing the playground again somehow):

_:b0 <http://example.org/input> _:b1 .
_:b2 <ex:about> "this graph" _:b1 .
_:b3 <http://example.org/value> "x" _:b2 .
_:b4 <http://example.org/value> "y" _:b2 .

But this:

{
  "@context": {
    "@version": 1.1,
    "@vocab": "http://example.org/",
    "input": {"@container": "@graph"}
  },
  "input": {
    "@graph": [
      {"value": "x"},
      {"value": "y"}
    ]
  }
}

Yields these quads, eliminating the extra graph:

_:b0 <http://example.org/input> _:b1 .
_:b2 <http://example.org/value> "x" _:b1 .
_:b3 <http://example.org/value> "y" _:b1 .

Why would we want this? What benefit is there? If someone puts that extra @graph in there, I think we should presume it's what they meant to do. If we're going to have a special case in the expansion algorithm it should be justified -- do we know why we have it?

@dlongley
Copy link
Contributor

So I'm not sure what's going on with me or the playground, but I just tried this:

{
  "@context": {
    "@version": 1.1,
    "@vocab": "http://example.org/",
    "input": {"@container": "@graph"}
  },
  "input": [{
    "@id": "ex:someIdentifier",
    "@graph": [
      {"value": "x"},
      {"value": "y"}
    ]
  }]
}

And that produced these quads:

_:b0 <http://example.org/input> <ex:someIdentifier> .
_:b1 <http://example.org/value> "x" <ex:someIdentifier> .
_:b2 <http://example.org/value> "y" <ex:someIdentifier> .

There's no blank node named graph in there, which is wrong. Which means it's being treated the same as:

```js
{
  "@context": {
    "@version": 1.1,
    "@vocab": "http://example.org/"
  },
  "input": [{
    "@id": "ex:someIdentifier",
    "@graph": [
      {"value": "x"},
      {"value": "y"}
    ]
  }]
}

It should instead produce:

_:b0 <http://example.org/input> _:b1 .
_:b2 <http://example.org/value> "x" <ex:someIdentifier> .
_:b3 <http://example.org/value> "y" <ex:someIdentifier> .

Am I misusing the playground again?

@gkellogg
Copy link
Member Author

I was generating that output using the playground. I'm not sure what happened, but it was probably user (me) error.

So here's where things get weird then:

This:

{
  "@context": {
    "@version": 1.1,
    "@vocab": "http://example.org/",
    "input": {"@container": "@graph"}
  },
  "input": {
    "ex:about": "this graph",
    "@graph": [
      {"value": "x"},
      {"value": "y"}
    ]
  }
}

Yields these quads (if I'm not misusing the playground again somehow):

_:b0 <http://example.org/input> _:b1 .
_:b2 <ex:about> "this graph" _:b1 .
_:b3 <http://example.org/value> "x" _:b2 .
_:b4 <http://example.org/value> "y" _:b2 .

But this:

{
  "@context": {
    "@version": 1.1,
    "@vocab": "http://example.org/",
    "input": {"@container": "@graph"}
  },
  "input": {
    "@graph": [
      {"value": "x"},
      {"value": "y"}
    ]
  }
}

Yields these quads, eliminating the extra graph:

_:b0 <http://example.org/input> _:b1 .
_:b2 <http://example.org/value> "x" _:b1 .
_:b3 <http://example.org/value> "y" _:b1 .

Why would we want this? What benefit is there? If someone puts that extra @graph in there, I think we should presume it's what they meant to do. If we're going to have a special case in the expansion algorithm it should be justified -- do we know why we have it?

Looks like some bug in generating the node map, as the @graph should create a new graph name, and not @default. Perhaps in step 6.10, where id doesn't get set? I'll create a test for this.

@gkellogg
Copy link
Member Author

{
  "@context": {
    "@version": 1.1,
    "@vocab": "http://example.org/",
    "input": {"@container": "@graph"}
  },
  "input": [{
    "@id": "ex:someIdentifier",
    "@graph": [
      {"value": "x"},
      {"value": "y"}
    ]
  }]
}

Must be the same node map bug, as my implementation does just the same thing.

@gkellogg
Copy link
Member Author

I created #144 for these issues.

@dlongley
Copy link
Contributor

@gkellogg, I believe that if the special case code I mentioned in #143 (comment) is removed this problem goes away.

@dlongley
Copy link
Contributor

It seems to me that the purpose of this special case code may have been to allow @graph to be used like it is at the root level, to specify multiple roots in the @default graph, except here, for a blank node named graph.

If that's the case, it seems to me that trying to enable @graph to work that way in this one particular case is not preferable to using @included now that we have it. I believe it's also leading to these other bugs and/or points of confusion.

@gkellogg
Copy link
Member Author

gkellogg commented Sep 1, 2019

@gkellogg,

Hmm, or is it the case that we have an expansion bug here? Shouldn't the above input actually be equivalent to this (note @graph: @container is removed)?:

{
  "@context": {
    "@version": 1.1,
    "@vocab": "http://example.org/"
  },
  "input": {
    "@graph": [{
      "@graph": [
        {"value": "x"},
        {"value": "y"}
      ]
    }]
  }
}

I made an error (which I've corrected), as I was trying to illustrate the compaction problem.

@gkellogg
Copy link
Member Author

gkellogg commented Sep 1, 2019

This step of the Expansion Algorithm has a caveat that causes the above to happen:

13.8.3.6.1) If container mapping includes @graph and if item is not a graph object, set item to a new map containing the key-value pair @graph-item, ensuring that the value is represented using an array.

That step is in there to prevent double expanding graphs -- but I think that might just be wrong. I'm going through the tests now to see what's going on there, and so far all of the expansion tests are just testing to make sure we don't do this double expansion whenever a @container: @graph is used. But, it seems to me, we would want it to do the double expansion.

I can't think of a use case for double graphs. In other cases, if we have a container setting where the value already. Consider the following:

{
  "@context": {"@vocab": "http://example/", "list": {"@container": "@list"}},
  "list": {"@list": [1,2]}
}

This does not create a double list.

Also, @container: @id, or @container: @typework only when the value is a map, so it's entirely in keeping with this for the value of a property with@container: @graph` to not do a double expansion of graphs.

@gkellogg
Copy link
Member Author

gkellogg commented Sep 1, 2019

@gkellogg, I believe that if the special case code I mentioned in #143 (comment) is removed this problem goes away.

Given the potential to mix simple graphs with more complicated ones, it probably is, indeed, best to remove that special case.

@dlongley
Copy link
Contributor

dlongley commented Sep 2, 2019

I can't think of a use case for double graphs.

Me either; I don't really think it's about supporting them vs. removing special case behavior we don't need and that seems like it's gumming up the works.

Given the potential to mix simple graphs with more complicated ones, it probably is, indeed, best to remove that special case.

👍

gkellogg added a commit that referenced this issue Sep 2, 2019
…ers where the value is already a graph.

Fixes #144. Relates to #143.
gkellogg added a commit that referenced this issue Sep 2, 2019
…g `@included`, to make sure they are all contained in a single graph.

Fixes #143.
gkellogg added a commit that referenced this issue Sep 3, 2019
…g `@included`, to make sure they are all contained in a single graph.

Fixes #143.
@azaroth42 azaroth42 reopened this Sep 6, 2019
@iherman
Copy link
Member

iherman commented Sep 6, 2019

This issue was discussed in a meeting.

  • RESOLVED: Close api #143 as resolved by api PR #145 and #146
View the transcript Framing blank nodes
Rob Sanderson: last discussion we agreed that we couldn’t solve it on a call
… so gkellog and dlongley went off to look at it
Gregg Kellogg: we found a problem in a framing test where @container : @graph got mangled in re-expansion
… a bug in the compaction algo
… if the value is an array, it puts them in an @included block
… i tried [s solution] but it turned out not to be defined well enough
Rob Sanderson: all of that is solved and merged?
Gregg Kellogg: yep
Gregg Kellogg: See API PR #146
Gregg Kellogg: See API PR #145
Proposed resolution: Close framing #27 as not being the issue, and the real issues being addressed is api #143, solved by api PRs # 145 and #146 (Rob Sanderson)
Rob Sanderson: +1
Benjamin Young: +1
Dave Longley: +1
Gregg Kellogg: +1
Ivan Herman: +1
Adam Soroka: +1
Pierre-Antoine Champin: +1
Ruben Taelman: +1
Rob Sanderson: RESOLVE: Close framing #27 as not being the issue, and the real issues being addressed is api #143, solved by api PRs # 145 and #146
Proposed resolution: Close api #143 as resolved by api PR #145 and #146 (Rob Sanderson)
Rob Sanderson: +1
Ivan Herman: +1
Ruben Taelman: +1
Gregg Kellogg: +1
Adam Soroka: +1
Pierre-Antoine Champin: +1
Benjamin Young: +1
Dave Longley: +1
Resolution #2: Close api #143 as resolved by api PR #145 and #146

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants