JENA-1948: JSON-LD 1.1 - reader and writer #1045

afs · 2021-08-08T19:28:33Z

No description provided.

afs · 2021-08-09T21:36:57Z

Now includes prefix extraction (see LangJSONLD11.extractPrefixes - similar to jsonld-java except it follows the comment in the jsonld-java code and only considers an alias to be a prefix the uri string ends in #, / or :.

Includes a basic writer.

@acoburn - haven't figured out how to preload/cache contexts but that looks important for both performance and privacy/security. I think you have experience here?

ajs6f · 2021-08-09T22:12:42Z

From a security POV, we'd ideally like to offer pluggable context loading with a simple contract, e.g. some folks may want to load from a secure immutable store. That scenario was much-discussed during 1.1 WG. Does Titanium already offer something like that?

afs · 2021-08-10T08:20:33Z

@ajs6f Try asking on JENA-1948 which is more likely to get attention.

(Judging by class names, the answer is "yes" - the question is "how?")

afs · 2021-08-10T10:12:53Z

This work is now sufficiently complete to be put into the codebase.

acoburn · 2021-08-10T13:44:49Z

haven't figured out how to preload/cache contexts but that looks important for both performance and privacy/security.

In Titanium, one would write something like this:

JsonLd.toRdf(...).loader(myCustomLoader).get();

There is a built-in document loader with certain caching properties; the code above allows a user to override that default with their own implementation of the DocumentLoader interface.

If you're able to use your own DocumentLoader, then you have complete control over what is loaded and how that resolution is performed. That structure also allows you to selectively accept/reject certain context documents based on allow/deny lists. This gets to @ajs6f's comment about using an immutable store. In my own code, I have found this to be really useful, but also very specific to the particular use case.

afs · 2021-08-10T20:48:43Z

@acoburn -- thanks.

It's a JSON-LD issues that contexts are like a data version of tracking cookies and pixels, and not specifically about Titanium.

Anyway, if this PR looks good, we can go with the state here.

This is further along than I hoped for and more complete than the basic first PR. We may be able to smooth the legal issues if we have time. It's 425K more in the downloads.

acoburn · 2021-08-10T21:22:41Z

This PR looks good in its present form.

Eventually, it will be nice to have support for doing something like this:

RDFParser.create().source(...).lang(JSONLD11).context(jsonldReadContext).parse(...);

where that context contains some mechanism to adjust how the JSON-LD @context resolution happens in Titanium (e.g. document loader). For instance, since that Jena Context object is already sent to the LangJSONLD11::read method, it might be possible to use that object to customize how the Titanium parsing proceeds.

ajs6f · 2021-08-10T21:42:10Z

From an API POV, I'd distinguish .context(jsonldReadContext) from .contextLoader(jsonldReadContextLoader).

ajs6f · 2021-08-10T21:46:20Z

jena-arq/src/main/java/org/apache/jena/riot/RDFWriterRegistry.java

+            return new TriGWriter() ;
+        if ( Objects.equals(RDFFormat.TRIG_BLOCKS, serialization) )
+            return new TriGWriterBlocks() ;
+        if ( Objects.equals(RDFFormat.TRIG_FLAT, serialization) )


Would it be convenient for the future to let RDFFormat feature methods to get writers and readers? Then TRIG_BLOCKS has writer() { return new TriGWriterBlocks(); } etc. Happy to send an independent PR if so.

These few lines are only in the PR as a small bit of clearup to switch to using a function not an anonymous class.

What capability does that give that isn't available currently?

The right way to get a writer is via RDFWriter.create().

RDFFormat doesn't have readers.

One writer can service several RDFFormats - as happens for current JSON-LD.

The registry+factory approach means the association can be altered dynamically. We can use this to - for example - make Lang.JSONLD/RDFFormat.JSONLD switch from jsonld-java (1.0) and titanium (1.1).

As RDFFormat are keys into maps, returning "new" objects sort of breaks the substitution principle of value-equality. It would work for enums because there is only one object for each enum value. But enums are closed and RDFFormat is open.

I can see your point about closed vs open-- I guess I haven't ever added a format or had any reason to. In any event, this is a tiny thing and I am happy to leave it with your deep knowledge of the (de)ser systems. I haven't ever used them enough to understand where the proper entry points are.

These formats (and more generally Lang) are a way to have "type checked constants". The use of string for everything causes problems e.g. this week a question about model.read(String, String) where the second string is base URI, not syntax. A legacy thing - we are where we are.

Lang in particular,may not even be implemented. Lang also cover result sets.

It's a bit clunky in Java but there again, a programming language with all possible features makes long-term readability suffer.

afs · 2021-08-10T22:59:04Z

The way I think this would be done is via:

public RDFParserBuilder set(Symbol symbol, Object value)

RDFParser.create().source(...).lang(JSONLD11).set(symJsonldContext, jsonldReadContext).parse(...);

This style is done on newer WIP:

jena/jena-arq/src/main/java/org/apache/jena/sparql/exec/QueryExecDatasetBuilder.java

Line 115 in 061a713

public QueryExecDatasetBuilder set(Symbol symbol, Object value) {

where we have set operations for the context as well as context which is replacement.

    public QueryExecDatasetBuilder set(Symbol symbol, Object value)

    public QueryExecDatasetBuilder set(Symbol symbol, boolean value)

    public QueryExecDatasetBuilder context(Context context)

RDFParser only has the last form.
(I am trying to support this style by some simple code pattern but not had time; inheritance is used for something else in other cases.)

This avoids RDFParserBuilder having an API dependency specially for JSON-LD which I think is important.

ajs6f · 2021-08-11T01:37:11Z

Then we can distinguish with two different symbols, one for context and one for context loader, yes?

afs · 2021-08-11T07:48:38Z

two different symbols, one for context and one for context loader,

Yes.

The (RIOT) context is a map of symbols (best named with URIs for global uniqueness) to Objects.

So (JSON-LD) context setup and the (Titanium) context loader can be (RIOT) context entries.

Clean up

0cd8cc3

afs marked this pull request as draft August 8, 2021 20:45

afs force-pushed the jsonld11-2 branch from 1675698 to c28cf4c Compare August 9, 2021 21:10

afs changed the title ~~JENA-1948: Convert between Jena and Titanium RDF objects~~ JENA-1948: JSON-LD 1.1 - reader and writer Aug 9, 2021

JENA-1948: JSON-LD 1.1 - reader and writer

469251e

afs force-pushed the jsonld11-2 branch from c28cf4c to 469251e Compare August 10, 2021 09:55

afs marked this pull request as ready for review August 10, 2021 09:55

ajs6f approved these changes Aug 10, 2021

View reviewed changes

afs requested a review from acoburn August 11, 2021 10:33

afs merged commit 48187f6 into apache:main Aug 18, 2021

afs deleted the jsonld11-2 branch August 18, 2021 21:19

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

JENA-1948: JSON-LD 1.1 - reader and writer #1045

JENA-1948: JSON-LD 1.1 - reader and writer #1045

afs commented Aug 8, 2021

afs commented Aug 9, 2021

ajs6f commented Aug 9, 2021

afs commented Aug 10, 2021

afs commented Aug 10, 2021

acoburn commented Aug 10, 2021

afs commented Aug 10, 2021

acoburn commented Aug 10, 2021

ajs6f commented Aug 10, 2021

ajs6f Aug 10, 2021

afs Aug 10, 2021 •

edited

ajs6f Aug 12, 2021

afs Aug 12, 2021

afs commented Aug 10, 2021

ajs6f commented Aug 11, 2021

afs commented Aug 11, 2021

JENA-1948: JSON-LD 1.1 - reader and writer #1045

JENA-1948: JSON-LD 1.1 - reader and writer #1045

Conversation

afs commented Aug 8, 2021

afs commented Aug 9, 2021

ajs6f commented Aug 9, 2021

afs commented Aug 10, 2021

afs commented Aug 10, 2021

acoburn commented Aug 10, 2021

afs commented Aug 10, 2021

acoburn commented Aug 10, 2021

ajs6f commented Aug 10, 2021

ajs6f Aug 10, 2021

Choose a reason for hiding this comment

afs Aug 10, 2021 • edited

Choose a reason for hiding this comment

ajs6f Aug 12, 2021

Choose a reason for hiding this comment

afs Aug 12, 2021

Choose a reason for hiding this comment

afs commented Aug 10, 2021

ajs6f commented Aug 11, 2021

afs commented Aug 11, 2021

afs Aug 10, 2021 •

edited