Embed behavior makes .frame's results hard to work with #119

jmandel · 2012-05-10T20:03:11Z

Executive summary

The framing algorithm's approach to "multiple embeds" makes it hard for developers to work with framed results.

Background

Developers want to frame JSON-LD payloads in ways that make them simple to work with. For example:

discover subjects of interest
loop over these subjects
resolve nested data with consistent paths

But in the current framing algorithm, the machinery for avoiding circularity and verbose output introduces complexity for developers. This is best understood with an example.

Example

I'll illustrate with MedicationLists that have Medications that have DrugCodes with titles and identifiers:
Framing Problem: example in Playground

How developers want framing to work:

jsonld.frame(raw_data, medframe, function(err, response){
    response['@graph'].forEach(function(medlist){
        medlist.hasMedications.forEach(function(med){
            console.log("Drug: " + med.drugCode.title + "::" + med.drugCode.identifier);
        });
    });
});

... but in the example above, when we hit ['@graph'][0].hasMedications[2].drugCode we find a reference, not an embed! It takes severely defensive programming to avoid this.

How developers need to work around the current framing behavior:

Since framed results don't reliably re-embed resources, developers need to check at each step whether an object is a reference or an embed. This means first creating a hash of known embeds, and then looking up values in this hash at every step through the framed result.

jsonld.frame(raw_data, medframe, function(err, response) {

    // identify an embed for each subject to resolve references
    var subjects = {};
    findSubjects(subjects, response['@graph']);

    response['@graph'].forEach(function(medlist){
        medlist.hasMedications.forEach(function(med){

            // need to ensure drugCode is an embed, not a reference
            var drugCode = subjects[med.drugCode['@id']];

            console.log("Drug code: " + drugCode.title + "::" + drugCode.identifier);
        });
    });
});

// pseudocode for finding subject embeds in framed results
// (_isArray, _isObject, and _isEmbed are helper predicates left undefined here)
function findSubjects(subjects, subtree) {
    if (_isArray(subtree)) {
        subtree.forEach(function(elt){
            findSubjects(subjects, elt);
        });

        return;
    }

    if (_isEmbed(subtree)) {
        subjects[subtree['@id']] = subtree;
    }

    if (_isObject(subtree)) {
        for (var k in subtree) {
            findSubjects(subjects, subtree[k]);
        }
    }
}

And the workaround isn't complete

This workaround presents limitations. For instance:

How to deal with subjects that are supposed to be framed in different ways?
How to properly implement _isEmbed?
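
One possible answer to the second question, offered only as a heuristic sketch: treat an object whose sole key is `@id` as a reference, and any other object carrying an `@id` as an embed. The names `isReference` and `isEmbed` are assumptions for illustration, not part of jsonld.js. The heuristic cannot tell a reference apart from a subject that legitimately has no other properties in the framed output, and it assumes references are not compacted to bare strings.

```javascript
// Heuristic only: a subject reference is assumed to be an object
// whose single key is '@id'.
function isReference(value) {
    if (value === null || typeof value !== 'object' || Array.isArray(value)) {
        return false;
    }
    var keys = Object.keys(value);
    return keys.length === 1 && keys[0] === '@id';
}

// Any other (non-array) object that carries an '@id' is treated as an embed.
function isEmbed(value) {
    return value !== null && typeof value === 'object' &&
        !Array.isArray(value) && '@id' in value && !isReference(value);
}
```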

Proposal: aggressive re-embedding

I'd recommend re-embedding resources aggressively -- right up to (but not crossing) the point of creating circular references. There are some risks here, including an explosion in the framing output size for graphs rich in bidirectional links. Does anyone have ideas for mitigating this explosion?
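
A minimal sketch of this proposal, not the framing algorithm itself. It assumes a flat map from subject `@id` to node, with references written as `{'@id': ...}`; the function names and the sample identifiers are hypothetical. A subject is embedded wherever it is referenced, and falls back to a reference only when it already appears on the path from the root.

```javascript
// subjects: { id: node }, where property values may be {'@id': id} references.
// path: ids of the subjects currently being embedded, root first.
function embedAggressively(subjects, id, path) {
    path = path || [];
    // Embedding a subject inside itself would create a cycle, so emit a
    // plain reference instead. Unknown ids also stay references.
    if (path.indexOf(id) !== -1 ||
            !Object.prototype.hasOwnProperty.call(subjects, id)) {
        return {'@id': id};
    }
    var nextPath = path.concat([id]);
    var node = subjects[id];
    var out = {};
    Object.keys(node).forEach(function(key) {
        out[key] = embedValue(subjects, node[key], nextPath);
    });
    return out;
}

function embedValue(subjects, value, path) {
    if (Array.isArray(value)) {
        return value.map(function(v) { return embedValue(subjects, v, path); });
    }
    if (value !== null && typeof value === 'object' && '@id' in value) {
        return embedAggressively(subjects, value['@id'], path);
    }
    return value; // literals are copied as-is
}

// Made-up sample data in the spirit of the MedicationList example.
var subjects = {
    'ex:list': {'@id': 'ex:list', hasMedications: [{'@id': 'ex:med1'}, {'@id': 'ex:med2'}]},
    'ex:med1': {'@id': 'ex:med1', drugCode: {'@id': 'ex:code'}, onList: {'@id': 'ex:list'}},
    'ex:med2': {'@id': 'ex:med2', drugCode: {'@id': 'ex:code'}},
    'ex:code': {'@id': 'ex:code', title: 'Example drug', identifier: 'X-1'}
};
var framed = embedAggressively(subjects, 'ex:list');
// 'ex:code' is embedded under both medications, while
// framed.hasMedications[0].onList stays a reference back to the list.
```

Because each embed is a fresh copy, the result stays serializable, at the cost of the output growth mentioned above.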

(One alternative approach is to allow a mode of operation that doesn't produce a serializable framing output, but instead produces an in-memory structure with potential circularity. For many applications, this in-memory, potentially circular structure is a very natural fit for developers' goals. This could be separate from framing, if there were a simple, consistent way to take a serialized framed result and convert to an appropriate in-memory structure.)
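
The conversion mentioned in parentheses could look roughly like this: a sketch that takes a serialized framed result and rewires every object carrying an `@id` to one shared in-memory object per subject. The name `linkInMemory` is hypothetical, and merging the properties of repeated embeds into one object (later values overwrite earlier ones) is an assumption about how duplicates should be reconciled.

```javascript
function linkInMemory(framed) {
    var nodes = {}; // one shared object per subject '@id'

    function link(value) {
        if (Array.isArray(value)) {
            return value.map(function(v) { return link(v); });
        }
        if (value === null || typeof value !== 'object') {
            return value;
        }
        // Objects without '@id' get a fresh copy; subjects share one object.
        var target = ('@id' in value)
            ? (nodes[value['@id']] = nodes[value['@id']] || {})
            : {};
        Object.keys(value).forEach(function(key) {
            target[key] = link(value[key]);
        });
        return target;
    }

    return link(framed);
}

// Made-up framed result: the second drugCode is only a reference.
var result = linkInMemory({
    '@graph': [{
        '@id': 'ex:list',
        hasMedications: [
            {'@id': 'ex:med1', drugCode: {'@id': 'ex:code', title: 'Example drug'}},
            {'@id': 'ex:med2', drugCode: {'@id': 'ex:code'}}
        ]
    }]
});
var meds = result['@graph'][0].hasMedications;
// meds[0].drugCode and meds[1].drugCode are now the same object.
```

The returned structure may contain cycles when subjects point back at each other, so it is meant for in-memory traversal rather than for `JSON.stringify`.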

gkellogg · 2012-05-10T22:26:48Z

Agreed, I was tripped up by this. I think when using @embed, it should always embed. My example was a bit simpler: http://tinyurl.com/7jzaqj3

Basically, given an object with two properties (doap:developer and dc:creator) I want them both to expand, not just one of them.

Input:

{
  "@context": {
    "doap:developer": {
      "@type": "@id",
      "@container": "@set"
    },
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dc:creator": {
      "@type": "@id",
      "@container": "@set"
    },
    "doap": "http://usefulinc.com/ns/doap#",
    "dc": "http://purl.org/dc/terms/",
    "@language": "en"
  },
  "@graph": [
  {
    "@id": "http://rubygems.org/gems/json-ld",
    "@type": "doap:Project",
    "dc:creator": ["http://greggkellogg.net/foaf#me"],
    "doap:developer": [
    {
      "@id": "http://greggkellogg.net/foaf#me",
      "@type": "foaf:Person",
      "foaf:homepage": "http://greggkellogg.net/",
      "foaf:name": "Gregg Kellogg"
    }],
    "doap:name": "JSON::LD"
  }]
}

Frame:

{
  "@context": {
    "@language": "en",
    "doap": "http://usefulinc.com/ns/doap#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dc": "http://purl.org/dc/terms/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "dc:creator": {"@type": "@id","@container": "@set"},
    "doap:homepage": {"@type": "@id"},
    "doap:implements": {"@type": "@id","@container": "@set"},
    "doap:developer": {"@type": "@id","@container": "@set"},
    "doap:helper": {"@type": "@id","@container": "@set"},
    "doap:created": {"@type": "xsd:date"},
    "foaf:homepage": {"@type": "@id"}
  },
  "@explicit": true,
  "@type": "doap:Project",
  "dc:creator": {
    "@explicit": true,
    "@embed": true,
    "@type": "foaf:Person",
    "foaf:name": {},
    "foaf:homepage": {}
  },
  "doap:developer": {
    "@explicit": true,
    "@embed": true,
    "@type": "foaf:Person",
    "foaf:name": {},
    "foaf:homepage": {}
  },
  "doap:name": {}
}

lanthaler · 2012-05-13T11:38:12Z

I think the easiest way to fix this would be to keep a list of subjects that have already been embedded (from the root to the current property). If a subject hasn't been embedded yet, embed it by default. If @embed is set, embed it even if it was already embedded in the path from the root to the current property, which requires breaking circular references. Maybe just holding references in the last embed would already solve this.

Thoughts?

dlongley · 2012-05-14T16:45:24Z

We already keep a list of what has been embedded.

When Josh first brought this up in #json-ld, I told him that I had been thinking of changing the framing behavior to do this anyway, as it would help solve a couple of issues: the unusual behavior of removing existing embeds, and a bug in that algorithm involving traversing the path to the root through arrays.

In any case, I'd support re-embedding information and avoiding cycles. I think re-embedding whenever possible will be preferred behavior, and we might want to have a "strict" flag to throw an exception when a cycle would occur and a re-embed was avoided.

gkellogg · 2012-05-14T17:14:01Z

+1

Cycles that would lead to recursive embedded declarations should probably just turn into subject references.

dlongley · 2012-05-14T18:19:40Z

We might be able to alter the current algorithm to just keep track of the root of the path currently being processed for each embed (instead of its immediate parent), and compare those when making embed decisions: either avoid cycles by using subject references, or throw exceptions when in strict mode. We'll probably need to make a couple of other changes, but hopefully nothing too drastic.

I'm not sure how we want to handle conflicts between auto-embeds and frame-specific (@embed: true) embeds. The older frame algorithm used to replace the auto-embeds with subject references -- which we could now do only if a cycle would be created. However, if we replace those auto-embeds then we'll have to keep the existing embed replacement code (and dangling embed clean up code) which I'd prefer not to. If we can work around that and still produce something that matches what we think people will expect from framing that would be best.

lanthaler · 2012-05-15T03:53:22Z

I think this issue (conflicts between auto-embeds and frame-specific (@embed: true) embeds) would be solved by not automatically including the whole "subtree" as proposed in #110. As a frame will never have an infinite depth, I think it would be OK to embed something several times in one path if the frame author wants that.

+1 to get rid of the existing embed replacement code

dlongley · 2012-05-15T13:48:52Z

Actually, we could just view the default behavior as @embed: true (which is really what happens now anyway) and then there's no conflict either. If you add @embed: true to your frame when that's the default option, it's just like repeating yourself so there's no issue. I think either way, if we decided @embed: false or @embed: true is the default, we can remove the embed replacement code and just check for cycles.

lanthaler · 2012-05-27T13:03:56Z

I just uploaded the latest version of my processor which supports aggressive re-embedding. To automatically include the whole sub-tree (which is not the default behavior), add "@embedChildren": true to the frame.

I uploaded a modified version of the playground so that you can try it without needing to download/install anything.

This addresses #110, #118, and #119.

lanthaler · 2012-09-04T15:40:58Z

RESOLVED: Do not support .frame() in JSON-LD 1.0 API.

gkellogg · 2016-09-21T22:39:14Z

This is handled in 5a3e506 to allow more values for @embed:

@always,
@last,
@link, -- Not really testable, as it's in-memory only
@never

In addition to true and false, which map to @always and @never.
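
The mapping described in this comment can be sketched as a small normalization step. The function name `normalizeEmbed` is hypothetical and this is not the code from the referenced commit; it only illustrates how `true` and `false` could fold into the keyword values listed above.

```javascript
var EMBED_KEYWORDS = ['@always', '@last', '@link', '@never'];

// Fold the boolean forms into keywords and reject anything else.
function normalizeEmbed(value) {
    if (value === true) { return '@always'; }
    if (value === false) { return '@never'; }
    if (EMBED_KEYWORDS.indexOf(value) !== -1) { return value; }
    throw new Error('Unsupported @embed value: ' + JSON.stringify(value));
}
```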

lanthaler added a commit that referenced this issue Aug 29, 2012

Add some issue markers to the Framing spec

b87e100

This addresses #110, #118, and #119.

cardinal27513 mentioned this issue Dec 22, 2012

aggressive re-embedding digitalbazaar/jsonld.js#20

Closed

mcollina mentioned this issue Jul 18, 2013

Query languages based on framing levelgraph/levelgraph-jsonld#2

Open

dlongley mentioned this issue Nov 26, 2014

Update framing spec to use new embed API #377

Closed

dr0i mentioned this issue Oct 13, 2015

Embed in frames does only work once, having the same URI jsonld-java/jsonld-java#150

Closed

gkellogg added 1.1 under-review and removed on-hold labels Sep 21, 2016

gkellogg removed the 1.1 label Oct 6, 2016

gkellogg closed this as completed in fdb51b0 Nov 18, 2016
