Support JSON-LD in tabular JSON output #125

Open
akuckartz opened this issue Oct 15, 2020 · 12 comments
@akuckartz

akuckartz commented Oct 15, 2020

Why?

When the W3C Recommendation SPARQL 1.1 Query Results JSON Format was created in 2013, JSON-LD was not yet available. JSON-LD 1.0 was published in 2014. The specified query results JSON format therefore does not, and could not, specify a JSON-LD result format as an RDF serialisation.

The specification should be extended to also support JSON-LD 1.1 (or newer). This would make it possible to process the result of queries with RDF tools and eliminate the need for special parsers.

Previous work

Half a decade ago quite a bit of effort went into creating a set of W3C Recommendations enabling conversions from tabular data to RDF.

Regarding RDF* and SPARQL* see also: w3c/rdf-star#13

Proposed solution

Hopefully only a JSON-LD context needs to be added without modifying the syntax.

Considerations for backward compatibility

The syntax currently specified for the JSON documents should not be changed.

@kasei
Collaborator

kasei commented Oct 15, 2020

Are there any current implementations that do this? Would you expect that the RDF would align with the ResultSet vocabulary?

@afs
Collaborator

afs commented Oct 15, 2020

> process the result of queries with RDF tools

How does this relate to suitably shaped CONSTRUCT, and content-type JSON-LD?

Note: result sets are ordered so that the effect of ORDER BY is visible.

@kasei
Collaborator

kasei commented Oct 15, 2020

> How does this relate to suitably shaped CONSTRUCT, and content-type JSON-LD?

I'd think you'd have to have a different content-type, though it's not obvious what that would be given the current types' use of +json and +ld.

> Note: result sets are ordered so that the effect of ORDER BY is visible.

The result set vocab allows ordering, right?

@afs
Collaborator

afs commented Oct 15, 2020

> a different content-type

Not necessarily; the data could say what it is, cf. rdf:type owl:Ontology.

In #73, there is compound CONSTRUCT, which is in the direction of building up a result graph from several WHERE clauses. I can see that forming a graph - maybe with table like parts - is more general.

The use case:

> process the result of queries with RDF tools

I'd like to understand the process chain envisaged, because if the RDF tools simply pick data out of a table, then maybe it is really a CONSTRUCT case.

An application/sparql-results+json parser is one of the simpler ones to write. Wrapping up RDF tools, parsing out the ResultSet vocabulary, dealing with streaming, etc. to produce a table may be more work.
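
To illustrate how little is involved, here is a minimal Python sketch of such a parser for SELECT results. It only assumes the `head.vars` and `results.bindings` members defined by the SPARQL 1.1 Query Results JSON Format; the function name `parse_srj` and the sample document are made up for illustration, and ASK results and error handling are left out.

```python
import json

# Hypothetical sample in application/sparql-results+json form.
SAMPLE = """
{
  "head": {"vars": ["person", "name"]},
  "results": {"bindings": [
    {"person": {"type": "uri", "value": "http://me.example.com"},
     "name": {"type": "literal", "value": "John Smith"}},
    {"person": {"type": "uri", "value": "http://you.example.com"}}
  ]}
}
"""

def parse_srj(text):
    """Return (variables, rows) for a SELECT result document.

    Each row maps a variable name to the RDF term object as it appears
    in the document. Unbound variables are simply absent from the row.
    Row order is the order of the JSON array, so ORDER BY stays visible.
    """
    doc = json.loads(text)
    variables = doc["head"].get("vars", [])
    rows = doc["results"]["bindings"]
    return variables, rows

variables, rows = parse_srj(SAMPLE)
for row in rows:
    # Print one table line per solution; None marks an unbound variable.
    print([row[v]["value"] if v in row else None for v in variables])
```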

> The results set vocab allows ordering, right?

Yes - with an ad-hoc (not RDF list) approach. But how much is it used for (non-testing) real?

The vocabulary originated for test cases before XML and JSON result sets, which are off ramps from RDF data to non-RDF data systems, were around.
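
For comparison, a rough sketch of what "aligning with the ResultSet vocabulary" could look like when written as JSON-LD. The property names (`rs:resultVariable`, `rs:solution`, `rs:binding`, `rs:variable`, `rs:value`, `rs:index`) come from the result-set vocabulary used in the SPARQL test suite; the function names and the decision to emit one `rs:index` per solution are assumptions made here, not something any spec or implementation in this thread prescribes.

```python
import json

RS = "http://www.w3.org/2001/sw/DataAccess/tests/result-set#"

def term_to_jsonld(term):
    """Convert one SPARQL results JSON term object to a JSON-LD value."""
    if term["type"] == "uri":
        return {"@id": term["value"]}
    if term["type"] == "bnode":
        return {"@id": "_:" + term["value"]}
    value = {"@value": term["value"]}
    if "xml:lang" in term:
        value["@language"] = term["xml:lang"]
    elif "datatype" in term:
        value["@type"] = term["datatype"]
    return value

def rows_to_resultset_jsonld(variables, rows):
    """Encode rows with the ResultSet vocabulary (illustrative sketch)."""
    solutions = []
    for index, row in enumerate(rows, start=1):
        solutions.append({
            # rs:index is the ad-hoc ordering mechanism (not an RDF list).
            "rs:index": index,
            "rs:binding": [
                {"rs:variable": var, "rs:value": term_to_jsonld(row[var])}
                for var in variables if var in row
            ],
        })
    return {
        "@context": {"rs": RS},
        "@type": "rs:ResultSet",
        "rs:resultVariable": list(variables),
        "rs:solution": solutions,
    }

rows = [
    {"name": {"type": "literal", "value": "John Smith"}},
    {"name": {"type": "literal", "value": "Bessie Smith", "xml:lang": "en"}},
]
print(json.dumps(rows_to_resultset_jsonld(["name"], rows), indent=2))
```

Even for a two-row, one-column table the output is considerably larger than the equivalent application/sparql-results+json document.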

@kasei
Collaborator

kasei commented Oct 15, 2020

> In #73, there is compound CONSTRUCT, which is in the direction of building up a result graph from several WHERE clauses. I can see that forming a graph - maybe with table like parts - is more general.

Oh, that's interesting.

> Yes - with an ad-hoc (not RDF list) approach. But how much is it used for (non-testing) real?

To an approximation, I'd say it's not used at all. Every time I implement a new SPARQL system and test harness, I always get caught out by having to remember to implement a parser for the RDF encoded result sets.

> The vocabulary originated for test cases before XML and JSON result sets, which are off ramps from RDF data to non-RDF data systems, were around.

So… just like N-Triples? :-P

@afs
Collaborator

afs commented Oct 16, 2020

Aside - maybe we should convert the old (SPARQL 1.0) test results that are TTL to SRJ or SRX and flip the manifest entries. No point expecting implementers to write unnecessary code.

@VladimirAlexiev
Contributor

I think there are 2 forces in opposition here:

  1. On one hand, if the query returns straight values from the database, it would make sense to use the same props already defined in the underlying ontologies
  2. On the other hand, ResultSet describes a generic tabular structure that doesn't depend on the query or ontology

For 1, I could use CSVW metadata to recapture the original props (to reinterpret the table as pieces of the original graph), but why should I have to do this? Or, to think of it in a different way, what's the relation between CSVW metadata and a JSON-LD context?

For 2, you need it in many cases: if you have any calculated variables, or fetch the same prop from different spots in the graph and output the values in different cols, etc.

@gkellogg
Member

gkellogg commented Nov 3, 2020

> I think there are 2 forces in opposition here:
>
>   1. On one hand, if the query returns straight values from the database, it would make sense to use the same props already defined in the underlying ontologies
>   2. On the other hand, ResultSet describes a generic tabular structure that doesn't depend on the query or ontology
>
> For 1, I could use CSVW metadata to recapture the original props (to reinterpret the table as pieces of the original graph), but why should I have to do this? Or, to think of it in a different way, what's the relation between CSVW metadata and a JSON-LD context?

CSVW Metadata includes a limited JSON-LD Context, which doesn’t have the ability to add new term definitions, for example. Otherwise, the Metadata is really a schema for interpreting the CSV values.

You could certainly auto-generate CSVW Metadata to describe the result set, but a server would need to make it available at a location relative to the returned results: the Metadata has to be located near the resulting CSV, and it has to reference the table(s) it annotates.
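
As a rough illustration of such auto-generation, the Python sketch below builds a CSVW metadata document for a CSV result set. The keys `@context`, `url`, `tableSchema`, `columns`, `name`, `titles` and `propertyUrl` are standard CSVW metadata properties; the function name, the file name `results.csv` and the hand-supplied mapping from variables to properties are assumptions, since a server cannot in general derive that mapping from an arbitrary query.

```python
import json

def csvw_metadata(csv_url, columns):
    """Build CSVW metadata for a tabular result set (illustrative sketch).

    csv_url  -- URL of the CSV file, typically relative to the metadata
    columns  -- list of (variable name, property IRI or None) pairs
    """
    schema_columns = []
    for name, property_iri in columns:
        column = {"name": name, "titles": name}
        if property_iri is not None:
            # Recaptures the original property for this column.
            column["propertyUrl"] = property_iri
        schema_columns.append(column)
    return {
        "@context": "http://www.w3.org/ns/csvw",
        "url": csv_url,  # the table this metadata annotates
        "tableSchema": {"columns": schema_columns},
    }

metadata = csvw_metadata(
    "results.csv",  # hypothetical location next to the metadata file
    [
        ("person", None),  # no single property applies to the subject column
        ("name", "http://xmlns.com/foaf/0.1/name"),
        ("homepage", "http://xmlns.com/foaf/0.1/workplaceHomepage"),
    ],
)
print(json.dumps(metadata, indent=2))
```

Calculated variables (case 2 above) would simply get no `propertyUrl` in this sketch.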

Personally, I don't think that CSVW has had the impact that it should have, and this is a perfect example of where W3C specs should support each other.

@VladimirAlexiev
Contributor

@akuckartz and @gkellogg I completely support what you said, but how do we achieve these goals? Can you give an example or write a strawman spec?

@jaw111
Contributor

jaw111 commented May 13, 2021

I know Dydra (@lisp) supports a JSON-LD serialization of SELECT query results, which is arguably considerably simpler than the SPARQL 1.1 Query Results JSON Format.

@lisp
Contributor

lisp commented May 13, 2021

yes, this has been the case since the initial application/ld+json deployment.
the choice was either to respond to any select queries which specified that media type with bad-request errors or to do something reasonable.
the approach is very simple: treat a dimensioned, tabular solution field as a hypergraph, generate the corresponding json-ld context on the basis of the field dimensions and emit a framed result for that graph.
this causes a query like

$ curl https://dydra.com/json-ld/foaf/selection -H "Accept: application/sparql-query"
select ?person ?name ?homepage ?type
where {
 ?person <http://xmlns.com/foaf/0.1/name> ?name .
 ?person <http://xmlns.com/foaf/0.1/workplaceHomepage> ?homepage .
 ?person <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?type.
}

to produce

$ curl https://dydra.com/json-ld/foaf/selection -H "Accept: application/ld+json"
{"@graph": [{"@id": "_:g1", "type": "http://xmlns.com/foaf/0.1/Person", "homepage": "http://www.example.com/jsmith", "name": "John Smith", "person": "http://me.example.com"}, {"@id": "_:g2", "type": "http://xmlns.com/foaf/0.1/Person", "homepage": "http://www.example.com/bsmith", "name": "Bessie Smith", "person": "http://you.example.com"}, {"@id": "_:g3", "type": "http://xmlns.com/foaf/0.1/Person", "homepage": "http://www.example.com/jhacker", "name": "J Q Hacker", "person": "http://jhacker.example.com"}]}

a variant would use the initial projection value as the resource, but the current approach ensures that one does not produce an invalid "subject" term.

the result is more verbose than application/json, but less verbose than application/sparql-results+json and fits well with some client tooling.
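
a minimal python sketch of the general idea, assuming the solutions are already available in application/sparql-results+json form. this is not dydra's implementation: the function name, the `http://example.org/variables#` namespace for the generated terms and the rule for choosing `"@type": "@id"` are all assumptions made for illustration.

```python
import json

VAR_NS = "http://example.org/variables#"  # hypothetical namespace for column terms

def rows_to_framed_jsonld(variables, rows):
    """Turn tabular solutions into one blank node per row plus a context.

    The context is generated from the field dimensions (the variables).
    A column whose bound values are all IRIs is typed as "@id".
    Language tags and datatypes are dropped in this simplified sketch.
    """
    context = {}
    for var in variables:
        definition = {"@id": VAR_NS + var}
        bound = [row[var] for row in rows if var in row]
        if bound and all(term["type"] == "uri" for term in bound):
            definition["@type"] = "@id"
        context[var] = definition

    graph = []
    for index, row in enumerate(rows, start=1):
        # A fresh blank node per row avoids using an invalid "subject" term.
        node = {"@id": f"_:g{index}"}
        for var in variables:
            if var in row:
                node[var] = row[var]["value"]
        graph.append(node)
    return {"@context": context, "@graph": graph}

rows = [
    {"person": {"type": "uri", "value": "http://me.example.com"},
     "name": {"type": "literal", "value": "John Smith"}},
    {"person": {"type": "uri", "value": "http://you.example.com"},
     "name": {"type": "literal", "value": "Bessie Smith"}},
]
print(json.dumps(rows_to_framed_jsonld(["person", "name"], rows), indent=2))
```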

@TallTed
Member

TallTed commented May 24, 2021

Virtuoso's SPARQL query form currently offers JSON-LD with or without @context for CONSTRUCT, but not for SELECT. JSON-LD can be requested at any time via conneg, and should be delivered if the requested resource can be so serialized.

It's not clear to me whether this implementation delivers exactly what's being contemplated by this issue, but if not, it's not far off.

8 participants