Consider objectify/link API method #140

Open
gkellogg opened this Issue Jun 26, 2012 · 22 comments

Owner

gkellogg commented Jun 26, 2012

Objectify is a mechanism proposed to turn JSON-LD into JavaScript (or other host language) objects, with subject references replaced by links to the actual objects. It was originally proposed by @jmandel in issue #109 as a means of flattening a graph, applying a context, and replacing subject references with in-memory object links.

Alternative names have been suggested, as no one is thrilled with objectify.

  • graphify would make it clear that we're turning a tree into a graph
  • link goes along with the notion of object linking implied in the name JSON-LD, and uses the term linking to apply to in-memory links as well as external links.
  • connect has the connotation of connecting references together.

Given that we've adopted a flatten API method already, partly to help support framing, and context application is implied in expansion, compaction and flattening already, the objectify API method really only need concern itself with replacing subject references with in-memory references.
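A minimal sketch of that narrower job, assuming the input is a flattened array of compacted nodes that all carry an `@id`. The `link` helper and the `ex:` identifiers are hypothetical and not part of any JSON-LD API:

```javascript
// Hypothetical helper: link a flattened array of node objects in place.
function link(nodes) {
  const byId = new Map(nodes.map((node) => [node["@id"], node]));

  // A subject reference is an object whose only key is "@id".
  const isReference = (value) =>
    value !== null &&
    typeof value === "object" &&
    !Array.isArray(value) &&
    Object.keys(value).length === 1 &&
    "@id" in value;

  // Swap a reference for the node it names; leave unknown IRIs untouched.
  const resolve = (value) =>
    isReference(value) && byId.has(value["@id"]) ? byId.get(value["@id"]) : value;

  for (const node of nodes) {
    for (const [key, value] of Object.entries(node)) {
      if (key === "@id") continue;
      node[key] = Array.isArray(value) ? value.map(resolve) : resolve(value);
    }
  }
  return byId;
}

const graph = link([
  { "@id": "ex:alice", "knows": { "@id": "ex:bob" } },
  { "@id": "ex:bob", "knows": [{ "@id": "ex:alice" }] },
]);
```

After linking, `graph.get("ex:alice").knows` is the same in-memory object as `graph.get("ex:bob")`, so the two nodes form a cycle.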

Conversation took place on the June 26, 2012 telecon.

Search

For objectify to serve as a potential replacement for framing, we also need to have a means of surfacing nodes within a graph.

  • JSON Select is one tool that can be used to generically select JSON, similar to using CSS selectors.
  • BGP: An implementation of BGP against JavaScript data structures would provide a normative path to full SPARQL query support of JSON-LD graphs, in this case, using framing-like syntax to define JSON structures which would be used for matching against JavaScript objects.

Concerns

Need to make sure code doesn't cycle when traversing such graphs, but if we flatten first, and keep track of visited nodes, this should ensure we don't loop.
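One way to keep track of visited nodes is sketched below. The `walk` helper is hypothetical; it assumes the graph has already been linked in memory and calls `visit` once for every non-array object it reaches (value objects included):

```javascript
// Hypothetical helper: visit every object reachable from `start` exactly once.
function walk(start, visit) {
  const visited = new Set();
  const stack = [start];
  while (stack.length > 0) {
    const current = stack.pop();
    // Skip literals and anything we have already seen; this breaks cycles.
    if (current === null || typeof current !== "object" || visited.has(current)) continue;
    visited.add(current);
    if (Array.isArray(current)) {
      stack.push(...current);
      continue;
    }
    visit(current);
    stack.push(...Object.values(current));
  }
}

// Two nodes that point at each other in memory.
const root = { "@id": "ex:root" };
const nested = { "@id": "ex:nested", "parent": root };
root.child = [nested];

const seen = [];
walk(root, (node) => seen.push(node["@id"]));
```

Because membership in `visited` is checked by object identity, the traversal terminates even though `root` and `nested` reference each other.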

Member

lanthaler commented Jun 27, 2012

Another name that could be used instead of objectify would be decode or parse.

An approach similar to JSON Select is JSONPath but IMO they are not particularly good candidates as all of them work on (DOM) trees whereas we work with graphs. Therefore I think it would indeed make more sense to look at approaches like BGP (SPARQL) or their NoSQL counterparts like Cypher.

Of course we could also leave the implementation of search etc. to the application developers for the time being and just provide very simple means of selecting the entry point to the graph (a subject IRI, e.g.)

Owner

gkellogg commented Jul 1, 2012

So, I implemented something in CoffeeScript for Backbone.js that works on a flattened object structure, which is pretty simple:

  # Resolve subject references {id: foo} given a map
  _resolve: (object) ->
    if object instanceof Array
      # Map each value
      @_resolve(value) for value in object
    else if object instanceof Object and _.keys(object).length is 1 and 'id' of object
      @collection.get(object.id) || object
    else if object instanceof Object
      # Interpolate each value
      res = {}
      res[key] = @_resolve(value) for key, value of object
      res
    else
      # Return unaltered object
      object

The key is that I only resolve once, so that I'll never face the case where something's already been resolved and I could end up in a loop.

Member

niklasl commented Jul 4, 2012

In my 'graphify' of JSON-LD experiment (at https://github.com/niklasl/rdfa-lab/blob/gh-pages/js/ld.js, usage example at https://github.com/niklasl/rdfa-lab/blob/gh-pages/test/test_ld.js), I've defined a graph object from which one can access resources by IRI. The graph behaves mostly like a regular JSON-LD tree, but supports "virtual" access via expanded IRIs and CURIEs as well (via a find method).

When objects are imported to the graph, they are connected just as this issue describes. This process also adds reverse relations, currently in an object accessed via a referrersVia attribute. I find this important to provide a means of effectively navigating a graph. (Every graph API I've seen, RDF-based and others, supports this.) This also gives us a findByType for free, since that operation just means getting the type and finding everything linking to it via rdf:type (illustrated in the usage example I link to above).
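The reverse-relation idea can be sketched roughly as follows. This is not the code from ld.js; `indexReferrers` and `findByType` here are hypothetical stand-ins, and the sketch assumes compact nodes whose `@type` values are plain strings:

```javascript
// Hypothetical sketch: build a reverse index over already linked nodes.
// reverse.get(target).get(property) is the list of nodes pointing at target.
function indexReferrers(nodes) {
  const reverse = new Map();
  const record = (target, property, source) => {
    if (!reverse.has(target)) reverse.set(target, new Map());
    const via = reverse.get(target);
    if (!via.has(property)) via.set(property, []);
    via.get(property).push(source);
  };
  for (const node of nodes) {
    for (const [property, value] of Object.entries(node)) {
      if (property === "@id") continue;
      for (const target of [].concat(value)) {
        // Index links to node objects, plus the type IRIs listed under "@type".
        const isNode =
          target !== null && typeof target === "object" && !("@value" in target);
        if (isNode || property === "@type") record(target, property, node);
      }
    }
  }
  return reverse;
}

// findByType is then just a reverse lookup on "@type".
function findByType(reverse, type) {
  const via = reverse.get(type);
  return via && via.has("@type") ? via.get("@type") : [];
}

const person = { "@id": "ex:alice", "@type": "ex:Person" };
const book = { "@id": "ex:book", "@type": "ex:Book", "author": person };
const referrers = indexReferrers([person, book]);
```

With this index, `referrers.get(person).get("author")` lists the nodes that name `person` as an author, and `findByType(referrers, "ex:Person")` returns every node typed as `ex:Person`.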

Specifically, I use clones of them to replace the link references, whose toJSON method returns undefined. This makes the graph directly serializable as JSON (albeit currently it also imposes restrictions for how to edit the graph).

It's very much work in progress, but I think the features there are important to consider. The mechanism is intended to mirror the RDFa API notion of Projection as well, but more closely aligned with JSON-LD contexts. And with the goal to be as little API and as much data as possible. I do hope we can turn this into a core "LinkedData" API, simpler than a full, triple-oriented RDF API but more general than a specific JSON-LD API.

(I should also point out that my experiment also toys with the idea of supporting SPARQL 1.1 property path expressions for more powerful navigation. But it's quite reasonable to have this as a utility add-on and not tied to any core spec, at least initially.)

Owner

msporny commented Sep 30, 2012

Moving this to the JSON-LD.next milestone. There isn't a single person that I know of that thinks that we have the time to write the spec text for this algorithm, create test cases, and put it in the 1.0 spec. There is no reason to spend telecon time discussing this any further.

To be clear - we all want this API method to exist... we just don't think we can get it done on the JSON-LD API 1.0 timeline.

Member

lanthaler commented Oct 1, 2012

Honestly, this worries me a bit as we also moved framing there. I wouldn't like to see a JSON-LD API that requires another API on top of it to be usable in practice.

The problem is that the same information can be expressed in various ways, and apart from flattening it and then looping over it there's no way to access the data directly without knowing its exact structure. I think that was one of the problems RDF/XML faced and I wouldn't like to see JSON-LD going down the same path - especially considering that we do have an API.

Member

niklasl commented Oct 2, 2012

I just wrote and pushed an initial draft for a JSON-LD Connect algorithm. I hope we can use that to explore this idea and come to some conclusion.

Owner

gkellogg commented Oct 2, 2012

Thanks for doing this Niklas.

We should probably have a disconnect algorithm to reverse this process, and essentially create a flattened map again.

It's also worth describing in a note somewhere that it can be useful to simulate the effect of connect when using model accessors such as Backbone.js Model get method. I implement a getModel (and setModel) method that replaces a node reference with a memory reference to the associated JSON object.

Member

lanthaler commented Oct 2, 2012

Just to make sure I understand this. The result of connect() is an object with an idMap and a typeMap member. The idMap is again an object where the keys are @ids and the values are node definitions (with an optional @rev key).

What is the typeMap used for? What data is in there? If I understand it correctly, all typed objects are not put in the idMap but in the typeMap, right? What’s the advantage of doing this? That you can get typed objects by type (but not by ID without looping)? In my implementation I solved this by using reverse links. First get the type node and then use the reverse @type to get all the relevant nodes.

Gregg, by disconnect you mean a method to convert such a structure to a valid JSON-LD document, right?

Member

niklasl commented Oct 2, 2012

Good catch, Markus; the type indexing step was incorrect. It only happened if there was no id, and the typeMap should be a map of type ids to a set of instances. Of course, the id and type indexing should both happen and not exclude each other. I just pushed a minor correction to that step, also excluding literals from being indexed (unless we see great value in that).

It is an open question whether the type index is needed. Using reverse links for types as well makes sense too (I've implemented a corresponding RDF API like that). But if you use compact forms with @type keys, the types are opaque tags, and it seems very beneficial to have easy access to them. (With full "rdf:type" keys, the reverse relation would be added by the current algorithm. Not so with @id coerced relations though; that's to be added.) Of course, it may be that you want to index on all sorts of values, and perhaps connect should be configurable to set up indexes for any desired property.
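As an illustration of the corrected indexing step only (not the text of the draft algorithm), here is a sketch in which id and type indexing happen independently and literals are skipped. The `buildIndexes` name and the sample data are assumptions:

```javascript
// Hypothetical sketch of the indexing step; the draft algorithm may differ.
function buildIndexes(nodes) {
  // Prototype-free objects, so keys such as "toString" cannot collide.
  const idMap = Object.create(null);
  const typeMap = Object.create(null);
  for (const node of nodes) {
    // Literals (plain values and value objects) are not indexed.
    if (node === null || typeof node !== "object" || "@value" in node) continue;
    if ("@id" in node) idMap[node["@id"]] = node;
    // Type indexing happens whether or not the node has an @id.
    for (const type of [].concat(node["@type"] || [])) {
      if (!(type in typeMap)) typeMap[type] = new Set();
      typeMap[type].add(node);
    }
  }
  return { idMap, typeMap };
}

const indexes = buildIndexes([
  { "@id": "ex:alice", "@type": ["ex:Person", "ex:Author"] },
  { "@type": "ex:Person", "name": "anonymous" },
  { "@value": "just a literal" },
]);
```

Here `indexes.typeMap["ex:Person"]` holds both typed nodes, while `indexes.idMap` holds only the one that has an `@id`.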

Member

niklasl commented Oct 2, 2012

Gregg: yes, I agree that we should write something about how this algorithm is but one way to achieve the sought-after convenience of simple key access to nodes in a graph structure. In more powerful dynamic languages (e.g. Python and Ruby), having dynamic access to live underlying RDF graphs is just as viable.

But we should probably constrain the scenarios, since it just takes a little imagination to replace this kind of "sparse" data with access objects supporting things like get methods (e.g. capable also of taking CURIEs and IRIs). But then we end up with a full RDF API. Which is certainly nice (and I've built that too in the RDFa lab), but beyond the scope of this idea.

As for disconnect: yes, possibly. That could also be a toJSON method on the result object though, to cooperate with JSON.stringify. More crucially, we get into the problem of whether the result of connect should be considered properly editable. If we keep indexing (and rev maps), which are extremely useful for general consumption (e.g. sending the result to templates for rendering), editing requires some care. Due to the lack of generic capturing of setting keys in JS (you can predefine setters, but not capture adding new keys AFAIK), adding a new link or node definition has to be done via an API call. We may want to add that; I'm not sure yet. That too may be veering into full RDF API territory.
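The toJSON variant of disconnect could look something like the sketch below. It assumes every linked node has an `@id` (as it would after flattening), and the `serializable` wrapper is a hypothetical name, not a proposed API:

```javascript
// Hypothetical sketch: wrap linked nodes so JSON.stringify emits a flattened array.
function serializable(nodes) {
  const members = new Set(nodes);
  // Turn an in-memory link back into a {"@id": ...} node reference.
  const unlink = (value) =>
    members.has(value) ? { "@id": value["@id"] } : value;
  return {
    nodes,
    // JSON.stringify calls toJSON() and serializes its return value instead.
    toJSON() {
      return nodes.map((node) => {
        const flat = {};
        for (const [key, value] of Object.entries(node)) {
          flat[key] = Array.isArray(value) ? value.map(unlink) : unlink(value);
        }
        return flat;
      });
    },
  };
}

const a = { "@id": "ex:a" };
const b = { "@id": "ex:b", "prev": a };
a.next = [b]; // a and b now form a cycle in memory

const text = JSON.stringify(serializable([a, b]));
```

Calling `JSON.stringify(a)` directly would throw on the cycle; going through the wrapper produces a flattened array in which the links are plain node references again. It does nothing for the editing problem described above.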

Member

lanthaler commented Oct 9, 2012

RESOLVED: Table issue-140 for the time being, delay discussion until all other issues for JSON-LD 1.0 have been addressed.

Owner

msporny commented Nov 20, 2012

I don't think we're going to be able to get to this issue, let's push it off for the next version of JSON-LD.

PROPOSAL 1: Defer creation of a .graphify() mechanism until after JSON-LD 1.0.

Owner

msporny commented Nov 20, 2012

PROPOSAL 1: +1

Contributor

tidoust commented Nov 20, 2012

PROPOSAL 1: +1


Owner

gkellogg commented Nov 20, 2012

Well, +1 too, but didn't we already resolve to delay until after JSON-LD 1.0?

Member

lanthaler commented Nov 20, 2012

RESOLVED: Defer creation of a .graphify() mechanism until after JSON-LD 1.0.

gkellogg added the 1.1 label Sep 22, 2016

gkellogg removed the 1.1 label Oct 6, 2016

Owner

gkellogg commented Apr 8, 2017 edited

I propose taking this off of the 1.1 milestone; please 👍 or 👎 to favor/disfavor removing.

Owner

dlongley commented Apr 8, 2017

I'd like to see the @link feature in framing move forward rather than a new API call.

Owner

dlongley commented Apr 8, 2017

@gkellogg are we 👍 for removing and 👎 for keeping? I did a 👍 for taking this off the 1.1 milestone.

Owner

gkellogg commented Apr 8, 2017

@dlongley 👍 to accept the proposal, and take off the 1.1 milestone. If you think this is handled by the @link feature, and would like to keep it in 1.1, can you propose text to be added to Framing (shouldn't be hard)? Not sure how to test this, given the need to serialize, which can't really be done with @link; ideas?

Owner

dlongley commented Apr 8, 2017 edited

@gkellogg,

If you think this is handled by the @link feature...

Yes, it is, I've implemented it.

...and would like to keep it in 1.1...

I'm not totally convinced it's ready. The @link feature is available in several implementations now but there may be some lingering questions about how it ought to work in some edge cases. So we may want to wait unless there's a groundswell of support to move forward with it (and that doesn't yet seem to be the case).

Not sure how to test this, given the need to serialize, which can't really be done with @link; ideas?

I haven't tried to figure anything out that isn't based on very customized testing yet. Off the top of my head, we could have a set of "paths" (that may include cycles) that the test suite will attempt to follow/dereference in the output to check object references and find some expected value. For example:

Test expectation data:

{
  "testPaths": [
    {
      "path": [
        {"object": 1, "term1"},
        {"object": 2, "term2"},
        {"object": 1, "term3"}
      ],
      "expected": "foo"
    }
  ]
}

This test output would pass:

{
  "@id": "root",
  "term1": {
    "@id": "nested",
    "term2": {/* cyclic link back to "root" */}
  },
  "term3": "foo"
}

Our test harness would build some code to assert that the same objects (in memory) were used where appropriate and that the expected values were found.
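A rough sketch of such a check follows. The shape of each path step is an assumption made for illustration: `term` is the key to follow, and `object` is a label for the in-memory object expected to hold it, so steps sharing a label must land on the same object. `checkPath` is a hypothetical name:

```javascript
// Hypothetical sketch: follow a path through linked output and verify that
// steps with the same label hit the same in-memory object.
function checkPath(root, steps, expected) {
  const labelled = new Map();
  let current = root;
  for (const { object, term } of steps) {
    if (current === null || typeof current !== "object") return false;
    if (labelled.has(object) && labelled.get(object) !== current) return false;
    labelled.set(object, current);
    current = current[term];
  }
  return current === expected;
}

// Linked output with a cycle: output.term1.term2 is output itself.
const output = { "@id": "root", "term3": "foo" };
output.term1 = { "@id": "nested", "term2": output };

const steps = [
  { object: 1, term: "term1" },
  { object: 2, term: "term2" },
  { object: 1, term: "term3" },
];
const passed = checkPath(output, steps, "foo");
```

An output in which `term2` held a copy of the root rather than the root itself would fail the third step, because label 1 would then refer to two different objects.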

Owner

gkellogg commented Apr 8, 2017

Okay, great. Pending further votes, we'll take it out of 1.1.

BTW, I've implemented it too, but don't have tests. We'd need to define a new kind of test, as you expect, that would look for specific paths in memory using something like you suggest.

gkellogg added to Framing in JSON-LD 1.1 Apr 10, 2017
