LDflex over In-Memory RDF Data #62

Open
Aklakan opened this issue Mar 14, 2020 · 8 comments

Comments

@Aklakan

Aklakan commented Mar 14, 2020

Hi,

Is there a way to use the proxying approach over in-memory data so that the await is no longer necessary? It feels quite clumsy in cases where all RDF data is locally available anyway.

For my use case I just want to import static RDF files (or fetch an RDF graph in batch from an endpoint), then have a resource-centric view over it (in contrast to triple-based ones), and finally have a JavaScript proxy over it that exposes outgoing and ingoing triples as virtual JSON attributes - in the simplest case by just matching magic JSON attributes to RDF local names.

To illustrate it conceptually step-by-step:

# example.ttl

@prefix eg: <http://www.example.org/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

eg:Foo eg:relatesTo eg:Bar .
eg:Bar rdfs:label "Bar" .

// Load a string or JSON representation of a batch of RDF from whatever source - possibly async
var stringOrJson = loadRawRdf("example.ttl")
// If I am not mistaken, import only works for JSON, so only for JSON-LD could we do "import data from 'example.jsonld'"

// Make the RDF string/JSON accessible through a proper RDF API
var rdfModel = makeRdfModel(stringOrJson)

// Get a resource-centric view
var resource = rdfModel.getResource("http://www.example.org/Foo")

// Now comes the proxy magic:
var fooProxy = makeRdfProxy(resource)

// Properties in RDF are multi-valued in general, so each one maps to an (unordered) array view
console.log(fooProxy.relatesTo[0].label)
// Output: Bar

// On the array proxy, try to delegate all calls to element [0] if no array index is used
console.log(fooProxy.relatesTo.label)
// Output: Bar
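
The steps above can be sketched as a fully synchronous proxy over an in-memory list of triples. This is only a minimal illustration under stated assumptions: the term objects, the `triples` array, `valuesOf`, and `makeValuesProxy` are hypothetical helpers, not part of LDflex or RDF/JS, and in this sketch `label` is itself a multi-valued view, so the string is read with `[0]`.

```javascript
// Hypothetical sketch: none of these names come from LDflex or RDF/JS.
// Terms are plain objects { termType, value }, loosely modeled on RDF/JS terms.
const namedNode = (value) => ({ termType: 'NamedNode', value });
const literal = (value) => ({ termType: 'Literal', value });
const sameTerm = (a, b) => a.termType === b.termType && a.value === b.value;

const EG = 'http://www.example.org/';
const RDFS = 'http://www.w3.org/2000/01/rdf-schema#';

// The data from example.ttl as an in-memory array of triples.
const triples = [
  { subject: namedNode(EG + 'Foo'), predicate: namedNode(EG + 'relatesTo'), object: namedNode(EG + 'Bar') },
  { subject: namedNode(EG + 'Bar'), predicate: namedNode(RDFS + 'label'), object: literal('Bar') },
];

// Local name = the part of an IRI after the last '#' or '/'.
const localName = (iri) =>
  iri.slice(Math.max(iri.lastIndexOf('#'), iri.lastIndexOf('/')) + 1);

// Synchronously collect the objects of all triples whose subject is `term`
// and whose predicate has the given local name.
function valuesOf(term, name) {
  return triples
    .filter((t) => sameTerm(t.subject, term) && localName(t.predicate.value) === name)
    .map((t) => (t.object.termType === 'Literal' ? t.object.value : makeRdfProxy(t.object)));
}

// Proxy over one resource: every string property access becomes a lookup.
function makeRdfProxy(term) {
  return new Proxy({}, {
    get(_target, key) {
      if (typeof key !== 'string') return undefined; // ignore symbols
      return makeValuesProxy(valuesOf(term, key));
    },
  });
}

// Array view that delegates unknown properties to element [0].
function makeValuesProxy(values) {
  return new Proxy(values, {
    get(target, key, receiver) {
      if (typeof key !== 'string' || key in target) return Reflect.get(target, key, receiver);
      const first = target[0];
      return first !== null && typeof first === 'object' ? first[key] : undefined;
    },
  });
}

const fooProxy = makeRdfProxy(namedNode(EG + 'Foo'));
console.log(fooProxy.relatesTo[0].label[0]); // Bar
console.log(fooProxy.relatesTo.label[0]);    // Bar
```

Matching on local names alone is ambiguous when two vocabularies share a local name; a real implementation would need a prefix or context mapping.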

The main benefit is that one can then create fuss-free HTML views like this:

<template>
  <h1>Hello {foo.relatesTo.label}</h1>
</template>
@RubenVerborgh
Member

Is there a way to use the proxying approach over in-memory data so that the await is no longer necessary?

Yes, in the sense that LDflex wiring is completely configurable.
So by rewiring this, you can have other behavior.

However, it would be difficult in the sense that:

  • LDflex depends on a lot of components and interfaces that only have an asynchronous implementation.
    • Thinking concretely about JSON-LD context parsing, which can involve network requests and thus needs to be asynchronous. You could plug in a simplified JSON-LD parser that skips these and thus is able to work synchronously, but you'd most probably have to write that first (as I am not aware of any implementations).
    • The Comunica query engine is also asynchronous; however, we have a query engine implementation with RDFlib.js (https://github.com/LDflex/LDflex-rdflib) that could be made synchronous.
  • LDflex relies on data.things.foo.bar and await data.things.foo.bar being different things. If they are not, then you would have to execute a query on every step, i.e., data.things, data.things.foo, data.things.foo.bar, even if you only need the last result.
    • Unless, of course, you go for a synchronous property or method like data.things.foo.bar.items.
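
To illustrate the second point, here is a minimal sketch of a lazy path whose property accesses only accumulate a path, while looking up `then` (which is what `await` does) triggers a single query. It is not LDflex's actual implementation; `createLazyPath` and the `execute` callback are hypothetical names.

```javascript
// Hypothetical sketch, not LDflex's actual implementation.
// `execute` stands in for whatever runs the query for a complete path.
function createLazyPath(execute, path = []) {
  return new Proxy({}, {
    get(_target, key) {
      if (key === 'then') {
        // `await` looks up `then`; only at this point is the query executed.
        return (resolve, reject) =>
          new Promise((done) => done(execute(path))).then(resolve, reject);
      }
      if (typeof key !== 'string') return undefined; // ignore symbols
      // Any other property just extends the path; nothing is queried.
      return createLazyPath(execute, [...path, key]);
    },
  });
}

const executed = [];
const data = createLazyPath((path) => {
  executed.push(path.join('.'));
  return `result for ${path.join('.')}`;
});

const bar = data.things.foo.bar; // three property accesses, zero queries
console.log(executed.length);    // 0

// Calling `then` directly mirrors what `await bar` does under the hood.
bar.then((result) => console.log(result));
console.log(executed.length);    // 1: only the full path was queried
```

If every step had to return a synchronous result instead, `data.things` and `data.things.foo` would each need their own query before `bar` could be reached.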

it feels quite clumsy in cases where all RDF data is locally available anyway.

Yeah, but as mentioned above, the JSON-LD parsing and the querying are still asynchronous, and await also serves as the trigger for "just query it now".

The main benefit is that one can then create fuss-free HTML views like this:

I see the appeal indeed. That would not be impossible to wire, but would take some implementation work.

@Aklakan
Author

Aklakan commented Mar 16, 2020

I have been away from the JS community for a while, so I wonder what the status of RDF in JS is today.
Is there by now something similar to Jena/RDF4J (java) or RDFlib (python) w.r.t. abstraction of RDF graphs? I could possibly look into contributing a proxy wrapper, but then I'd rather first collect some suggestions about which frameworks are the contemporary ones for loading static RDF into some JavaScript RDFGraph object. Also, I wouldn't want to tie a proxy implementation to JSON-LD directly, but rather to some RDF interfaces.

@RubenVerborgh
Member

Is there by now something similar to Jena/RDF4J (java) or RDFlib (python) w.r.t. abstraction of RDF graphs?

RDF/JS is the main thing now: https://rdf.js.org/
It's a federation of libraries that are interoperable by using the same interfaces.

I could possibly look into contributing a proxy wrapper

Super! Blocker is that most of RDF/JS is async now.

@Aklakan
Author

Aklakan commented Mar 16, 2020

Hm, so the data model and dataset API seem very synchronous to me.
Are other components actually using this model? Especially, can the existing parsers output a dataset? Well, one can always assemble one from a set of async quad events.

So having an interface for a set of quads with a synchronous match(g, s, p, o) method is already a good start - although I am a bit worried about not seeing a graph abstraction (triples).
Because typically one navigates from a node within a graph to other nodes within the graph - unless one explicitly creates a union-graph (again a graph) view over all named graphs in the dataset.
So my impression is, that there is a fundamental interface missing :(

W.r.t. async calls: if they are used to fetch whole graphs or datasets at once, then there is no issue, because once the data is placed as a whole into the (view) model, the view can update itself synchronously afterwards (I am thinking of Vue 2 models now).

@RubenVerborgh
Member

Hm, so the data model and dataset API seem very synchronous to me.

Yeah, for in-memory stores indeed.

Are other components actually using this model? Especially, can the existing parsers output a dataset? Well, one can always assemble one from a set of async quad events.

Parsers etc. are typically stream-based indeed. You can load their output into a dataset and then work synchronously from there on.

So my impression is, that there is a fundamental interface missing :(

Not sure about that, you can just pass a constant g?
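
A minimal sketch of that suggestion: a triple-level view that fixes the graph argument once and forwards every lookup to the dataset. `TinyDataset` and `graphView` are hypothetical stand-ins, not an RDF/JS implementation; the argument order (subject, predicate, object, graph) follows the `match` method of the RDF/JS dataset specification.

```javascript
// Hypothetical stand-in for a dataset; this is NOT an RDF/JS implementation.
const iri = (value) => ({ termType: 'NamedNode', value });
const sameTerm = (a, b) => a.termType === b.termType && a.value === b.value;

class TinyDataset {
  constructor(quads = []) {
    this.quads = quads;
  }

  // Synchronous lookup; a null or undefined argument acts as a wildcard.
  match(s, p, o, g) {
    const fits = (pattern, term) => pattern == null || sameTerm(pattern, term);
    return this.quads.filter((q) =>
      fits(s, q.subject) && fits(p, q.predicate) && fits(o, q.object) && fits(g, q.graph));
  }
}

// Triple-level view: the graph argument is fixed once and reused for every lookup.
function graphView(dataset, graph) {
  return { match: (s, p, o) => dataset.match(s, p, o, graph) };
}

const EG = 'http://www.example.org/';
const dataset = new TinyDataset([
  { subject: iri(EG + 'Foo'), predicate: iri(EG + 'relatesTo'), object: iri(EG + 'Bar'), graph: iri(EG + 'g1') },
  { subject: iri(EG + 'Foo'), predicate: iri(EG + 'relatesTo'), object: iri(EG + 'Baz'), graph: iri(EG + 'g2') },
]);

const g1 = graphView(dataset, iri(EG + 'g1'));
const objectsInG1 = g1.match(iri(EG + 'Foo'), null, null).map((q) => q.object.value);
console.log(objectsInG1); // only the Bar IRI, because Baz lives in graph g2
```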

@Aklakan
Author

Aklakan commented Mar 17, 2020

Yeah, it's not too complicated; it's just somewhat unfortunate that I'd have to roll my own 'Graph' interface, because it feels like reinventing the wheel over and over again. I suppose rdfstore-js already had something, but AFAIK it's discontinued (right?)

And when I look at the definition of TripleObject https://github.com/Callidon/sparql-engine#rdf-graphs it feels like the JavaScript world is still getting the basics wrong (sorry if my slight frustration offends anyone - because RDF terms ARE NOT STRINGS!)

interface TripleObject {
  subject: string; // The Triple's subject
  predicate: string; // The Triple's predicate
  object: string; // The Triple's object
}

On the positive side, the most important methods of a Graph are just three:

add(triple)
remove(triple)
match(s, p, o) // with all arguments being (RDF) terms

So the proven architecture from the Java world is essentially this:

[Model] -> [Graph] -> [GraphStore]

Model: High-level abstraction; provides Resources, which pair (RDFTerm, Graph) so that one can navigate along the properties and update them.

Graph: Provides the match(s, p, o) method. It may also provide graph statistics, but this is not relevant for creating a JSON view over RDF.

GraphStore: Actual implementation of the backend; it may, e.g.,
- linearly scan a list of triples for matching (inefficient but works)
- index using nested maps (plus dictionary encoding, because JavaScript associative arrays use strings as keys)
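
The layering described above could look roughly like the following in JavaScript. This is a sketch under assumptions: `ListGraphStore`, `Graph`, `Resource`, `outgoing`, and `incoming` are illustrative names rather than an existing library API, and the store uses the linear-scan variant.

```javascript
// Hypothetical sketch; class and method names are illustrative only.
const iri = (value) => ({ termType: 'NamedNode', value });
const lit = (value) => ({ termType: 'Literal', value });
const sameTerm = (a, b) => a.termType === b.termType && a.value === b.value;
const fits = (pattern, term) => pattern == null || sameTerm(pattern, term);

// GraphStore: the backend. This one linearly scans a list of triples.
class ListGraphStore {
  constructor() {
    this.triples = [];
  }
  find(s, p, o) {
    return this.triples.filter((t) =>
      fits(s, t.subject) && fits(p, t.predicate) && fits(o, t.object));
  }
  add(t) {
    // Set semantics: skip triples that are already present.
    if (this.find(t.subject, t.predicate, t.object).length === 0) this.triples.push(t);
  }
  remove(t) {
    this.triples = this.triples.filter((x) =>
      !(sameTerm(x.subject, t.subject) && sameTerm(x.predicate, t.predicate) && sameTerm(x.object, t.object)));
  }
}

// Graph: the three core methods, delegating to a store; all arguments are terms.
class Graph {
  constructor(store = new ListGraphStore()) {
    this.store = store;
  }
  add(triple) { this.store.add(triple); return this; }
  remove(triple) { this.store.remove(triple); return this; }
  match(s, p, o) { return this.store.find(s, p, o); }
}

// Model level: a Resource pairs a term with the graph it lives in.
class Resource {
  constructor(term, graph) {
    this.term = term;
    this.graph = graph;
  }
  // Follow outgoing triples: literals become strings, other terms become Resources.
  outgoing(predicate) {
    return this.graph.match(this.term, predicate, null).map((t) =>
      (t.object.termType === 'Literal' ? t.object.value : new Resource(t.object, this.graph)));
  }
  // Follow ingoing triples back to their subjects.
  incoming(predicate) {
    return this.graph.match(null, predicate, this.term).map((t) => new Resource(t.subject, this.graph));
  }
}

const EG = 'http://www.example.org/';
const RDFS = 'http://www.w3.org/2000/01/rdf-schema#';
const graph = new Graph()
  .add({ subject: iri(EG + 'Foo'), predicate: iri(EG + 'relatesTo'), object: iri(EG + 'Bar') })
  .add({ subject: iri(EG + 'Bar'), predicate: iri(RDFS + 'label'), object: lit('Bar') });

const foo = new Resource(iri(EG + 'Foo'), graph);
const bar = foo.outgoing(iri(EG + 'relatesTo'))[0];
console.log(bar.outgoing(iri(RDFS + 'label'))[0]);              // Bar
console.log(bar.incoming(iri(EG + 'relatesTo'))[0].term.value); // http://www.example.org/Foo
```

Because `Graph` only depends on `find`, `add`, and `remove`, an indexed store could replace the list-based one without touching `Resource`.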

@RubenVerborgh
Member

when I look at the definition of TripleObject https://github.com/Callidon/sparql-engine#rdf-graphs

But you're looking in the wrong place:

https://rdf.js.org/data-model-spec/#quad-interface

(and also, those strings have special markers to indicate the term type)

unfortunate that I'd have to roll my own 'Graph' interface

Not sure if there's a strong need for that, depending on what you want to achieve. Best to take that up in https://github.com/rdfjs/dataset-spec

@Aklakan
Author

Aklakan commented Mar 17, 2020

Yes, the cleanest approach for me seems to be to build on https://github.com/rdfjs/dataset-spec - add a small graph abstraction on top, and then the plumbing work is already done.

2 participants