Formalize cycle-breaking and object framing #9

msporny · 2011-01-22T22:38:41Z

We've been doing a bit of research at Digital Bazaar on how to best meld graph-based object models with what most developers are familiar with these days - JSON-based object programming (aka: associative-array based object models). We want to enable developers to use the same data models that they use in JavaScript today, but to work with arbitrary graph data.

We think this issue is at the heart of why RDF has not caught on as a general data model: the data is very difficult to work with in programming languages. No native data structure makes graph data easy to handle without a complex set of APIs.

When a JavaScript author gets JSON-LD from a remote source, the graph that the JSON-LD expresses can take a number of different but valid forms. That is, the information expressed by the graph can be identical, but each graph can be structured differently.

Think of these two statements:

The Q library contains book X.
Book X is contained in the Q library.

The information that is expressed in both sentences is exactly the same, but the structure of each sentence is different. Structure is very important when programming. When you write code, you expect the structure of your data to not change.

However, when we program using graphs, the structure is almost always unknown, so a mechanism to impose a structure is required in order to help the programmer be more productive.

How the graph is represented depends entirely on the algorithm used to normalize it and the algorithm used to break its cycles. Consider the following example, a graph with three top-level objects: a library, a book and a chapter. The items are related to one another, so the graph can be expressed in JSON-LD in a number of different ways:

{
   "#": 
   {
      "dc": "http://purl.org/dc/elements/1.1/",
      "ex": "http://example.org/vocab#"
   },
   "@": 
   [
      {
         "@": "http://example.org/test#library",
         "a": "ex:Library",
         "ex:contains":  "<http://example.org/test#book>"
      },
      {
         "@": "<http://example.org/test#book>",
         "a": "ex:Book",
         "dc:contributor": "Writer",
         "dc:title": "My Book",
         "ex:contains": "<http://example.org/test#chapter>"
      },
      {
         "@": "http://example.org/test#chapter",
         "a": "ex:Chapter",
         "dc:description": "Fun",
         "dc:title": "Chapter One"
      }
   ]
}

The JSON-LD graph above could also be represented like so:

{
   "#": 
   {
      "dc": "http://purl.org/dc/elements/1.1/",
      "ex": "http://example.org/vocab#"
   },
   "@": "http://example.org/test#library",
   "a": "ex:Library",
   "ex:contains":
   {
      "@": "<http://example.org/test#book>",
      "a": "ex:Book",
      "dc:contributor": "Writer",
      "dc:title": "My Book",
      "ex:contains": 
      {
         "@": "http://example.org/test#chapter",
         "a": "ex:Chapter",
         "dc:description": "Fun",
         "dc:title": "Chapter One"
      }
   }
}

Both of the examples above express exactly the same information, but their structure is very different. If a developer can receive either object from a remote source, how do they ensure that a single code path handles both?

That is, how can a developer reliably write the following code:

// print all of the books and their corresponding chapters
var library = jsonld.toObject(jsonLdText);
for(var bookIndex = 0; bookIndex < library["ex:contains"].length; 
    bookIndex++)
{
   var book = library["ex:contains"][bookIndex];
   var bookTitle = book["dc:title"];
   for(var chapterIndex = 0; chapterIndex < book["ex:contains"].length; 
       chapterIndex++)
   {
      var chapter = book["ex:contains"][chapterIndex];
      var chapterTitle = chapter["dc:title"];
      console.log("Book: " + bookTitle + " Chapter: " + chapterTitle);
   }
}
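
To make the problem concrete, the sketch below inspects what a naive `ex:contains` lookup finds in each form. The `flat` and `nested` variables are abbreviated, hypothetical stand-ins for the two examples above (contexts and chapters omitted), and `describeContains` is an illustrative helper, not part of any JSON-LD API.

```javascript
// Abbreviated stand-in for the first example: subjects listed side by side.
var flat = {
   "@": [
      {
         "@": "http://example.org/test#library",
         "a": "ex:Library",
         "ex:contains": "<http://example.org/test#book>"
      },
      {
         "@": "<http://example.org/test#book>",
         "a": "ex:Book",
         "dc:title": "My Book"
      }
   ]
};

// Abbreviated stand-in for the second example: the book is embedded.
var nested = {
   "@": "http://example.org/test#library",
   "a": "ex:Library",
   "ex:contains": {
      "@": "<http://example.org/test#book>",
      "a": "ex:Book",
      "dc:title": "My Book"
   }
};

// Hypothetical helper: report the shape a plain property access finds.
function describeContains(doc) {
   var value = doc["ex:contains"];
   if(value === undefined) { return "missing"; }
   if(Array.isArray(value)) { return "array"; }
   return typeof value;
}

// Top level of the flat form: the library is buried inside the "@" array.
var flatTopLevel = describeContains(flat);
// The library inside the flat form: only an IRI reference to the book.
var flatLibrary = describeContains(flat["@"][0]);
// The nested form: one embedded object, not an array.
var nestedLibrary = describeContains(nested);

console.log(flatTopLevel, flatLibrary, nestedLibrary);
```

None of the three lookups yields the array of book objects that the loop expects, so each form would need its own code path.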

The answer boils down to ensuring that the data structure built for the developer from the JSON-LD is framed in a way that makes property access predictable. That is, the developer provides a structure that MUST be filled out by the JSON-LD API. The working title for this mechanism is "Cycle Breaking and Object Framing", since both mechanisms must be in place to solve this problem.

The developer would specify a Frame for their language-native object like the following:

{
   "#": {"ex": "http://example.org/vocab#"},
   "a": "ex:Library",
   "ex:contains": 
   {
      "a": "ex:Book",
      "ex:contains":
      {
         "a": "ex:Chapter"
      }
   }
}

The object frame above asserts that the developer expects to receive a library containing one or more books, each containing one or more chapters. This ensures that the data is structured predictably, so only one code path is needed to work with graphs that can take multiple forms. The API call would look something like this:

var library = jsonld.toObject(jsonLdText, objectFrame);

The mechanism in the API and the algorithm that is used to perform cycle breaking and object framing should be formalized in the JSON-LD specification.
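
As a starting point for discussion, here is a minimal sketch of what framing with cycle breaking might look like. It is not a proposed algorithm for the specification. It assumes the input is already a flat list of subjects whose references are IRI strings, matches only on the `a` type, and ignores the `#` context. The names `iri`, `indexSubjects`, `isSubFrame`, `frameNode` and `frameGraph` are all hypothetical.

```javascript
// Hypothetical helper: strip the "<...>" wrapper that some IRIs carry.
function iri(value) {
   return String(value).replace(/^<|>$/g, "");
}

// Index every subject in the flat node list by its IRI.
function indexSubjects(nodes) {
   var subjects = {};
   nodes.forEach(function(node) { subjects[iri(node["@"])] = node; });
   return subjects;
}

// A frame property is a sub-frame when its value is an object ("#" excluded).
function isSubFrame(frame, key) {
   return key !== "#" && frame[key] !== null && typeof frame[key] === "object";
}

// Build the framed object for one node. `path` lists the subjects already
// embedded above this node; it is what breaks cycles.
function frameNode(node, frame, subjects, path) {
   var here = path.concat(iri(node["@"]));
   var out = {};
   // Copy plain properties unchanged.
   Object.keys(node).forEach(function(key) {
      if(!isSubFrame(frame, key)) { out[key] = node[key]; }
   });
   // Framed properties always become arrays, even when empty.
   Object.keys(frame).forEach(function(key) {
      if(!isSubFrame(frame, key)) { return; }
      var refs = node[key] === undefined ? [] : [].concat(node[key]);
      out[key] = refs.map(function(ref) {
         var target = subjects[iri(ref)];
         // Embed only known subjects of the expected type that are not
         // already on the path; otherwise keep the bare reference.
         var embeddable = target && target["a"] === frame[key]["a"] &&
            here.indexOf(iri(ref)) === -1;
         return embeddable ?
            frameNode(target, frame[key], subjects, here) : ref;
      });
   });
   return out;
}

// Frame the first subject whose type matches the top of the frame.
function frameGraph(nodes, frame) {
   var subjects = indexSubjects(nodes);
   var roots = nodes.filter(function(node) {
      return node["a"] === frame["a"];
   });
   return roots.length > 0 ? frameNode(roots[0], frame, subjects, []) : null;
}

// Usage with the subjects and the frame from the examples above.
var nodes = [
   {"@": "http://example.org/test#library", "a": "ex:Library",
    "ex:contains": "<http://example.org/test#book>"},
   {"@": "<http://example.org/test#book>", "a": "ex:Book",
    "dc:title": "My Book",
    "ex:contains": "<http://example.org/test#chapter>"},
   {"@": "http://example.org/test#chapter", "a": "ex:Chapter",
    "dc:title": "Chapter One"}
];
var objectFrame = {
   "#": {"ex": "http://example.org/vocab#"},
   "a": "ex:Library",
   "ex:contains": {"a": "ex:Book", "ex:contains": {"a": "ex:Chapter"}}
};
var framedLibrary = frameGraph(nodes, objectFrame);
console.log(framedLibrary["ex:contains"][0]["ex:contains"][0]["dc:title"]);
```

Because every framed property comes back as an array, the nested loop shown earlier could run over `framedLibrary` unchanged. A reference that would re-enter a subject already on the current path is left as a plain IRI string rather than embedded again.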
