Explore supporting "@embed": "@first" #43

Closed
dlongley opened this issue Mar 3, 2019 · 8 comments · Fixed by #51

Comments

@dlongley
Contributor

dlongley commented Mar 3, 2019

My understanding is that the use of @last is intended to ensure that all data related to whatever a frame filters on is not omitted (or duplicated) from the result of a framing operation. The embed location ("last") is actually not important. One use case for this feature is framing subgraphs in JSON-LD documents prior to verifying digital signatures on them.

This form of embedding was the only type available when framing was invented: the choice was to embed this way or not to embed at all; @always came later. I can't remember a good reason why the choice was made to place the only embed for a node wherever it is referenced "last" -- and at this point it feels arbitrary. Certainly the framing user doesn't care because they can't know where "last" is a priori. In fact, in some implementations, the keyword @once was used before @last was adopted instead.

If the case is that it is arbitrary, using "last" clearly has some drawbacks that might be avoided if we supported @first instead for this use case. Using @last requires some post processing that removes previous embeds to ensure that only the last one is used. This is costly in current implementations. I would think that using @first would cause all subsequent occurrences to just use a reference and to add any additional data collected to the first embedded node object, eliminating the need for any post processing.
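As a rough illustration of the "first" idea described above, here is a minimal sketch. It is not the JSON-LD framing algorithm: the function name `frame_embed_first`, the simplified node map, and the `ex:` identifiers are all assumptions made for this example. It embeds a node where it is first reached and emits a bare reference everywhere else, so no cleanup pass over earlier embeds is needed. It does not attempt the "add any additional data collected to the first embedded node" part.

```python
def frame_embed_first(root_id, nodes):
    """Embed each node at its first reference; later ones become {"@id": ...}.

    `nodes` is a hypothetical, simplified node map: id -> node object, where
    references to other nodes are written as {"@id": "..."}.
    """
    embedded = set()

    def build(node_id):
        # Already embedded (or unknown): emit a reference only.
        if node_id in embedded or node_id not in nodes:
            return {"@id": node_id}
        embedded.add(node_id)
        out = {"@id": node_id}
        for key, value in nodes[node_id].items():
            if key == "@id":
                continue
            if isinstance(value, dict) and "@id" in value:
                out[key] = build(value["@id"])
            else:
                out[key] = value
        return out

    return build(root_id)


nodes = {
    "ex:library": {
        "@id": "ex:library",
        "books": {"@id": "ex:republic"},
        "contains": {"@id": "ex:republic"},
    },
    "ex:republic": {"@id": "ex:republic", "title": "The Republic"},
}
result = frame_embed_first("ex:library", nodes)
```

In this sketch the embed lands under `books` only because Python dictionaries iterate in insertion order. That is the same kind of order dependence discussed later in this thread.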

Of course, not having looked into the framing algorithms in a while, this could also be naive. Therefore, I recommend that implementers (such as me) explore @first as it may be the case that existing implementations can be very easily adapted to support it -- and its simplicity could address a number of issues (for example: performance). If nothing else, a good explanation for why "last" was chosen should come out of this and could be placed into the spec.

Note: The first public description of framing I could find did not help me remember anything: https://lists.w3.org/Archives/Public/public-linked-json/2011Aug/0078.html

@gkellogg
Member

gkellogg commented Mar 4, 2019

I support adding @first. I'll need to work on it, but I think it would be fairly straightforward.

@azaroth42 azaroth42 added this to Discuss-Call in JSON-LD Management DEPRECATED Mar 7, 2019
@azaroth42
Contributor

👍

gkellogg added a commit that referenced this issue Mar 10, 2019
@gkellogg gkellogg self-assigned this Mar 10, 2019
gkellogg added a commit that referenced this issue Mar 13, 2019
@azaroth42 azaroth42 reopened this Mar 13, 2019
@azaroth42
Contributor

Was there a WG resolution that we should add this?

@gkellogg
Member

gkellogg commented Mar 13, 2019 via email

@iherman
Member

iherman commented Mar 15, 2019

It may be too late to consider this but... I am a little bit bothered by this feature altogether. Looking at, e.g., example 12, and if I understand it right, we are looking at

{
    "@id": "http://example.org/library",
    "@type": "Library",
    "books": "http://example.org/library/the-republic",
    "contains": "http://example.org/library/the-republic"
}

and the value of @first or @last refers to the order of books and contains to choose among them. However, the order of keys within a JSON object does not matter. (There are plenty of references in, e.g., Stack Overflow questions to various specs stating that.) Nor does the order matter in RDF. So how come we introduce a feature that explicitly refers to the order of keys?

(I may misunderstand something, though...)
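The point about key order can be checked with a small sketch using only Python's standard `json` module. The two documents below are assumptions based on the example above, with the keys swapped. Parsed objects compare equal regardless of key order, while their serialized text differs unless the keys are sorted first.

```python
import json

republic = "http://example.org/library/the-republic"

# Same members, written in a different order.
a = json.loads('{"books": "%s", "contains": "%s"}' % (republic, republic))
b = json.loads('{"contains": "%s", "books": "%s"}' % (republic, republic))

same_data = a == b                                    # object comparison
same_text = json.dumps(a) == json.dumps(b)            # string comparison
same_sorted_text = (
    json.dumps(a, sort_keys=True) == json.dumps(b, sort_keys=True)
)
```

Here `same_data` and `same_sorted_text` are true and `same_text` is false, so anything that picks a "first" or "last" key is relying on information that JSON itself does not guarantee.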

@gkellogg
Member

Both @first and @last depend on the algorithm sorting keys, which is optional. There’s an inline editor’s note about using something like @sample to use the first one found irrespective of order, which would lead to reproducibility issues.

We may simply need to stick with key ordering to make these work.
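To make the dependence on key ordering concrete, here is a toy sketch. The helper `embedding_key` is hypothetical and is not part of any JSON-LD implementation. The `ordered` parameter is only meant to mirror the idea of optionally sorting keys before processing. It reports which property would receive the embed under "first" and "last" for a given node.

```python
def embedding_key(node, target_id, ordered):
    """Return the first and last property referencing target_id.

    With ordered=True keys are processed in sorted order; otherwise in
    whatever order the object happens to carry (insertion order here).
    """
    keys = sorted(node) if ordered else list(node)
    referencing = [k for k in keys if node[k] == target_id]
    if not referencing:
        return None
    return {"first": referencing[0], "last": referencing[-1]}


republic = "http://example.org/library/the-republic"
# The author happened to write "contains" before "books".
library = {
    "@id": "http://example.org/library",
    "@type": "Library",
    "contains": republic,
    "books": republic,
}

as_written = embedding_key(library, republic, ordered=False)
sorted_keys = embedding_key(library, republic, ordered=True)
```

Without sorting, "first" is `contains`. With sorting it is `books`. So the result is reproducible only if every processor sorts, and even then it depends on the property names rather than on anything the author intended.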

@iherman
Member

iherman commented Mar 15, 2019

This issue was discussed in a meeting.

  • No actions or resolutions
View the transcript 4.1. @embed @first issue
Dave Longley: #43
Gregg Kellogg: implemented as well as accompanied by tests
… seems straightforward
Dave Longley: or @once (which is a historical name)
Gregg Kellogg: another alternative would be to remove first and last and go back to all or none
Dave Longley: there is an important use case for having at least one of @first or @last
… frame docs where you don’t care where something appears but it must appear
… or certain properties must appear in certain places
… historically the name for this was @once
… the change was mostly for testing purposes
Ivan Herman: I feel uneasy that we are talking about order in an environment that doesn’t inherently have order
Gregg Kellogg: that’s why the algorithms do ordering
Ivan Herman: from the user POV they have to know about the algorithms
… which is not ideal. I’m not objecting, just sharing uneasiness
Ivan Herman: can we push this to next week? It’s more than five minutes of talk.
Gregg Kellogg: ivan’s comment is not about this specific PR, but about the general point
… the existence of @last implies an @first, but the real question is should we be implying ordering at all
Dave Longley: +1

@iherman
Member

iherman commented Mar 22, 2019

This issue was discussed in a meeting.

  • RESOLVED: Change @last (back) to @once, don’t add @first and then figure out testing separately
View the transcript 1.2. @embed @first
Rob Sanderson: link: #43
Gregg Kellogg: We touched on this one. Originally, @embed was true or false, if it was true then one of the uses within a node would contain the embedded value vs. just a reference. That was expanded because it was useful to be able to say “we want all of them” because it would make it much simpler so clients don’t have to follow pointers. Also embed @link was added for navigating an internal representation.
… We updated it to be the keyword value. @last was a little arbitrary: the last one expressed in the frame would have the embedded value. @first would be a little more efficient, as you don’t have to remove other embeds.
… First and last imply ordering, and all of the ordering is optional. If you don’t have order, then first and last are nonsensical, so Dave said to use @once, but then testing is hard.
… Given that ordering is an option, this proposal makes sense: we should also be able to say @first vs. @last, and it’s a separate issue how these behave or whether we need something different if we are not ordering.
Pierre-Antoine Champin: Just an idea on the terminology: @first and @last are misleading … perhaps @earliest and @latest or something like that, to make the terms less confusing.
David Newbury: Going through the issue, I’m trying to understand the optionalness of the sorting. It feels like what we’re trying to make sure we can do is compare two JS objects as strings.
Gregg Kellogg: That sort of gets into canonicalization – I believe the tests say we should do a comparison of objects so that ordering isn’t important for comparing but also compare the resulting RDF graphs for isomorphism.
… But it does make it important to know which of the properties get included in the node or a given subject, in that case, it doesn’t change the RDF signature. They all will be references to another object. It’s really in the JSON thing.
… Whether you compare unordered or ordered, it still matters where the embedding takes place.
… The only way to satisfy these cases is so that you run the algorithm so it does process the properties in order. Otherwise you can’t do a structural comparison of the result.
David Newbury: I can say that I have sometimes reordered the properties … to match documentation. It shouldn’t do that if you use a different keyword. If you do anything to change the order of the JS it doesn’t change the meaning of the graph.
Gregg Kellogg: It is sort of a corner case where you have the same thing referred to multiple times.
David Newbury: And it comes up all the time in our data structures. Every time we want to talk about a first name we have an included object that defines first namedness.
… We do use repeated signatures throughout the graph.
Rob Sanderson: I can dig up an example. The ordering of the name is arbitrary, the order of the … firstnamedness and middlenamedness you don’t want to mix that up.
Rob Sanderson: So we do post processing to make sure it’s in the right order to look sane.
Ivan Herman: In general, I am really bothered by this. As an author of JSON-LD … if I really want to control how the framing works… I have to be aware of somewhat arbitrary ordering in the system that is against my way of ordering my properties in my object.
… For me, the whole thing is flawed.
Rob Sanderson: My understanding is very similar to David Newbury’s … if the ordering is important and that will determine where things will be embedded…
Rob Sanderson: Then the structure will potentially be very very different depending on the names of the keys. It would be very weird to have to put numbers at the beginning of every key to make sure they end up in the right place.
Ivan Herman: I have to do this elsewhere (iTunes) and I hate it.
Rob Sanderson: If the intent is that the structure should be comparable – then we don’t need @first … it’s just a flag that says “it goes here” such that it will be comparable.
Rob Sanderson: Then I don’t think that alphanumeric sorting will help, you don’t want @type appearing at the end or in the middle.
Pierre-Antoine Champin: I’m not sure I’m following ivan’s arguments
… about the expectations of the authors
… to me JSON objects are unordered
… as annoying as that is, I don’t expect the order I’ve written to be preserved
… the thing I queued up originally was to ask about @min and @max
… to be the smallest or highest key that one can find
… that would be reproducible
… and avoid key ordering
Dave Longley: so just looking at all this, I would remove all of the ordering
… and anything out of any algorithm that would attempt any ordering at all
… and highlights that the ordering of constituents back then was incorrect
… we mostly did this ordering for the test suites
Ivan Herman: +1 to dlongley
Dave Longley: original name was @once
… but what we bumped up again was how do we test this?
… because you can get really complicated output
… you can’t test from the RDF, because that isn’t the point of framing
… what I recommend we do, is in the test suite
… we order the keys in a certain way
… let’s somehow get the ordering into the test suite
… and out of the algorithms
Rob Sanderson: +1 to dlongley - ordering in the test suite, not in the spec
Dave Longley: so that we can still test correct ordering in the test suite
Jeff Mixter: +1 to dlongley
Dave Longley: and removing it out of the algorithm will also speed things up
… leaving it where it is will just confuse users anyhow
… so removing it is a win for them also
Gregg Kellogg: ordering in the test suite helps
… it’s been updated to not do order based testing of objects
… so I think the issue is about where to do the embedding
Dave Longley: -1 !
Gregg Kellogg: so we should either do no embedding or embed everywhere
… anytime you get into having to compare length of strings, you’re doing ordering
… ordering kills performance
… the issue of having @first is predicated on having @last
… so what I think we’re debating now is should there be any reason to have these at all
… so maybe we only normatively require @none and @always
Gregg Kellogg: https://w3c.github.io/json-ld-api/tests/#json-ld-object-comparison
Gregg Kellogg: there is a section in the test suite README about how we do object comparison
Ivan Herman: I wanted to answer pchampin, but I think we’ve moved past that
… but +1 to dlongley about removing this
Dave Longley: there’s no way we can just limit this to @none and @always because there are lots of reasons to just embed one
… there are lots of use case where data comes in from somewhere, and you need to pluck out the proof or whatever single thing
… and then you need to do some processing on those
… and if you inject new triples, that will ruin the data for that processing
… so you don’t care where the data occurs, just that it does occur
David Newbury: I want to second that
… I’d like to create shallow set of look-up types
… I don’t care where it is
… but it would be really obnoxious to have to dedupe it, etc
… but where it shows up in the structure is not something I’d ever trust
Gregg Kellogg: then what makes sense would be to change @last to @once and then figure out how to test it
… so when we’re testing we can’t do object comparison
… and again this sort of goes against the whole purpose of framing…but we would need to flatten the structure to do the comparison
Rob Sanderson: I think we’re ready for a proposal
Proposed resolution: Change @last (back) to @once, don’t add @first and then figure out testing separately (Rob Sanderson)
Dave Longley: +1
Rob Sanderson: +1
Gregg Kellogg: +1
Ivan Herman: +1
David Newbury: +1
Benjamin Young: +1
David I. Lehn: +1
Tim Cole: +1
Pierre-Antoine Champin: +1
Simon Steyskal: +1
Jeff Mixter: +1
Resolution #2: Change @last (back) to @once, don’t add @first and then figure out testing separately
Simon Steyskal: The default for the object embed flag is @first (in addition to the explicit inclusion flag being false), and now?
Dave Longley: default should be @once now because it doesn’t add any triples to the graph
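
The transcript's suggestion that testing @once would require flattening the structure before comparing can be sketched as follows. This is an assumption-laden toy, not the test suite's comparison procedure: `flatten` is a hypothetical helper, node references are assumed to be single `{"@id": ...}` objects, and arrays of nodes inside properties are not handled. Two outputs that embed the same node in different places compare unequal as objects but equal once flattened.

```python
def flatten(doc, node_map=None):
    """Collect node objects by @id, replacing nested embeds with references."""
    if node_map is None:
        node_map = {}
    if isinstance(doc, list):
        for item in doc:
            flatten(item, node_map)
        return node_map
    if not isinstance(doc, dict) or "@id" not in doc:
        return node_map
    node = node_map.setdefault(doc["@id"], {})
    for key, value in doc.items():
        if key == "@id":
            continue
        if isinstance(value, dict) and "@id" in value:
            flatten(value, node_map)            # merge the embedded node
            node[key] = {"@id": value["@id"]}   # keep only a reference here
        else:
            node[key] = value
    return node_map


republic = {"@id": "ex:republic", "title": "The Republic"}
ref = {"@id": "ex:republic"}

# Same data, embedded once, but in different places.
framed_a = {"@id": "ex:library", "books": republic, "contains": ref}
framed_b = {"@id": "ex:library", "books": ref, "contains": republic}

structurally_equal = framed_a == framed_b
equal_when_flattened = flatten(framed_a) == flatten(framed_b)
```

As noted in the discussion, this kind of check verifies that the data appears exactly as expected somewhere, but it gives up on checking the shape that framing was supposed to produce.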

@azaroth42 azaroth42 moved this from Discuss-Call to Editorial Work in JSON-LD Management DEPRECATED Apr 4, 2019
gkellogg added a commit that referenced this issue Apr 21, 2019
…ked `json-ld-1.0` and new tests added. The `ordered` option is added to these tests to guarantee predictable results.

Fixes #43.
gkellogg added a commit that referenced this issue Apr 22, 2019
…ked `json-ld-1.0` and new tests added. The `ordered` option is added to these tests to guarantee predictable results.

Fixes #43.
@gkellogg gkellogg moved this from Editorial Work to Editorial work complete in JSON-LD Management DEPRECATED Apr 22, 2019
@gkellogg gkellogg removed this from Editorial work complete in JSON-LD Management DEPRECATED Apr 22, 2019

4 participants