New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JENA-1948: JSON-LD 1.1 - reader and writer #1045
Conversation
Now includes prefix extraction (see Includes a basic writer. @acoburn - haven't figured out how to preload/cache contexts but that looks important for both performance and privacy/security. I think you have experience here? |
From a security POV, we'd ideally like to offer pluggable context loading with a simple contract, e.g. some folks may want to load from a secure immutable store. That scenario was much-discussed during 1.1 WG. Does Titanium already offer something like that? |
@ajs6f Try asking on JENA-1948 which is more likely to get attention. (Judging by class names, the answer is "yes" - the question is "how?") |
This work is now sufficiently complete to be put into the codebase. |
In Titanium, one would write something like this: JsonLd.toRdf(...).loader(myCustomLoader).get(); There is a built-in document loader with certain caching properties; the code above allows a user to override that default with their own implementation of the If you're able to use your own DocumentLoader, then you have complete control over what is loaded and how that resolution is performed. That structure also allows you to selectively accept/reject certain context documents based on allow/deny lists. This gets to @ajs6f's comment about using an immutable store. In my own code, I have found this to be really useful, but also very specific to the particular use case. |
@acoburn -- thanks. It's a JSON-LD issues that contexts are like a data version of tracking cookies and pixels, and not specifically about Titanium. Anyway, if this PR looks good, we can go with the state here. This is further along than I hoped for and more complete than the basic first PR. We may be able to smooth the legal issues if we have time. It's 425K more in the downloads. |
This PR looks good in its present form. Eventually, it will be nice to have support for doing something like this: RDFParser.create().source(...).lang(JSONLD11).context(jsonldReadContext).parse(...); where that |
From an API POV, I'd distinguish |
return new TriGWriter() ; | ||
if ( Objects.equals(RDFFormat.TRIG_BLOCKS, serialization) ) | ||
return new TriGWriterBlocks() ; | ||
if ( Objects.equals(RDFFormat.TRIG_FLAT, serialization) ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be convenient for the future to let RDFFormat
feature methods to get writers and readers? Then TRIG_BLOCKS
has writer() { return new TriGWriterBlocks(); }
etc. Happy to send an independent PR if so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These few lines are only in the PR as a small bit of clearup to switch to using a function not an anonymous class.
What capability does that give that isn't available currently?
The right way to get a writer is via RDFWriter.create()
.
RDFFormat
doesn't have readers.
One writer can service several RDFFormats
- as happens for current JSON-LD.
The registry+factory approach means the association can be altered dynamically. We can use this to - for example - make Lang.JSONLD
/RDFFormat.JSONLD
switch from jsonld-java (1.0) and titanium (1.1).
As RDFFormat
are keys into maps, returning "new" objects sort of breaks the substitution principle of value-equality. It would work for enums because there is only one object for each enum value. But enums are closed and RDFFormat
is open.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see your point about closed vs open-- I guess I haven't ever added a format or had any reason to. In any event, this is a tiny thing and I am happy to leave it with your deep knowledge of the (de)ser systems. I haven't ever used them enough to understand where the proper entry points are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These formats (and more generally Lang
) are a way to have "type checked constants". The use of string for everything causes problems e.g. this week a question about model.read(String, String)
where the second string is base URI, not syntax. A legacy thing - we are where we are.
Lang
in particular,may not even be implemented. Lang
also cover result sets.
It's a bit clunky in Java but there again, a programming language with all possible features makes long-term readability suffer.
The way I think this would be done is via: public RDFParserBuilder set(Symbol symbol, Object value) RDFParser.create().source(...).lang(JSONLD11).set(symJsonldContext, jsonldReadContext).parse(...); This style is done on newer WIP: jena/jena-arq/src/main/java/org/apache/jena/sparql/exec/QueryExecDatasetBuilder.java Line 115 in 061a713
where we have public QueryExecDatasetBuilder set(Symbol symbol, Object value)
public QueryExecDatasetBuilder set(Symbol symbol, boolean value)
public QueryExecDatasetBuilder context(Context context)
This avoids |
Then we can distinguish with two different symbols, one for context and one for context loader, yes? |
Yes. The (RIOT) context is a map of symbols (best named with URIs for global uniqueness) to Objects. So (JSON-LD) context setup and the (Titanium) context loader can be (RIOT) context entries. |
No description provided.