fix: Redesign scoped context in jsonld #750

Panaetius · 2019-10-09T15:37:42Z

Closes #749

JSON-LD compaction goes through two phases, which we let pyld handle for us, namely:

Expansion: Short-form IRIs like schema:isPartOf get translated to full-form IRIs like http://schema.org/isPartOf. Only values/nodes that have a corresponding entry in the @context, or in scoped @contexts inside the nodes/values themselves, get expanded; all other values are dropped.
Compaction: A new, separate context is supplied and used to compact the JSON-LD from the previous step to a simpler form, replacing absolute IRIs with their short form where possible and removing values/nodes not found in the new context. This step ONLY pays attention to the supplied context, not to the one(s) found in the original JSON-LD document.
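As a rough illustration of the IRI translation in the two phases above, here is a toy sketch. It is not pyld's algorithm, only a prefix lookup; the names PREFIXES, expand_iri and compact_iri are made up for this example.

```python
# Toy model of prefix-based IRI expansion and compaction.
# This is NOT how pyld implements it; it only shows the direction of each step.
PREFIXES = {
    "schema": "http://schema.org/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand_iri(term, prefixes):
    """Turn a short form such as 'schema:isPartOf' into its full IRI."""
    prefix, sep, suffix = term.partition(":")
    if sep and prefix in prefixes and not suffix.startswith("//"):
        return prefixes[prefix] + suffix
    return term  # unknown prefix or already absolute: leave untouched


def compact_iri(iri, prefixes):
    """Inverse step: replace a known namespace with its prefix."""
    for prefix, namespace in prefixes.items():
        if iri.startswith(namespace):
            return f"{prefix}:{iri[len(namespace):]}"
    return iri  # no matching namespace: keep the absolute IRI


expanded = expand_iri("schema:isPartOf", PREFIXES)
compacted = compact_iri(expanded, PREFIXES)
print(expanded)
print(compacted)
```

The important point for this PR is that the second function only ever consults the prefixes it is handed, which mirrors compaction only looking at the supplied context.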

For instance, the following JSON-LD document:

{
  "@context": {
    "@version": 1.1,
    "name": "http://schema.org/name",
    "interest": {
      "@id": "http://xmlns.com/foaf/0.1/interest"
    }
  },
  "name": "Manu Sporny",
  "interest": {
    "@id": "https://www.w3.org/TR/json-ld/",
    "@context": {"@vocab": "http://xmlns.com/foaf/0.1/"},
    "name": "JSON-LD",
    "topic": "Linking Data"
  }
}

together with using the top-level context of the document

"@context": {
  "@version": 1.1,
  "name": "http://schema.org/name",
  "interest": {
    "@id": "http://xmlns.com/foaf/0.1/interest"
  }
}

while being perfectly valid JSON-LD with a perfectly valid context, would lose the topic entry on compaction, since only interest and name are specified in the context. When reading the document, topic is covered by the "@context": {"@vocab": "http://xmlns.com/foaf/0.1/"} line, but a context inside the document is not applied when compacting.

So to not lose the topic field, we'd either have to manually add all fields of child nodes to the context supplied to the compaction function, which is what we were doing until now. That leads to problems if two fields have the same name but come from different ontologies, and it makes it quite hacky to know what has to be added to the context.

Or we can use scoped (sub-)contexts, such as:

{
  "@version": 1.1,
  "name": "http://schema.org/name",
  "interest": {
    "@id": "http://xmlns.com/foaf/0.1/interest",
    "@context": {"@vocab": "http://xmlns.com/foaf/0.1/"}
  }
}

(Note the @context added to interest: it is a scoped context that only applies to interest entries.)

If we supply this context to compaction, all fields are preserved. Since we don't want two methods to calculate contexts, it makes sense to add this enhanced, nested context (instead of the current flat one) to the metadata files etc. as well. Since this puts the whole context at the top level of a document, we no longer need the contexts inside the values (as in the original example), as that would just be duplication.

This PR does exactly that: it automatically extends entries in the top-level context with their respective child contexts and removes the contexts inside values.
Collection types already had code to propagate their contexts up, though the implementation was broken and the code was never reached.
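A minimal sketch of that idea, using plain dictionaries and the example document from above. This is not the code in this PR; nest_child_context and strip_value_contexts are hypothetical helper names chosen for illustration.

```python
import copy


def nest_child_context(parent_context, term, child_context):
    """Return a copy of parent_context with child_context scoped under term."""
    result = copy.deepcopy(parent_context)
    result.setdefault("@version", 1.1)  # scoped contexts need JSON-LD 1.1
    definition = result.get(term)
    if definition is None:
        raise KeyError(f"{term!r} is not defined in the parent context")
    if isinstance(definition, str):
        # Plain 'term': 'IRI' mapping: promote it to an expanded definition.
        definition = {"@id": definition}
    definition["@context"] = copy.deepcopy(child_context)
    result[term] = definition
    return result


def _strip(node):
    """Recursively drop '@context' keys from a nested value."""
    if isinstance(node, list):
        return [_strip(item) for item in node]
    if isinstance(node, dict):
        return {k: _strip(v) for k, v in node.items() if k != "@context"}
    return node


def strip_value_contexts(document):
    """Keep the root '@context' but remove '@context' from all nested values."""
    return {k: (v if k == "@context" else _strip(v)) for k, v in document.items()}


parent = {
    "@version": 1.1,
    "name": "http://schema.org/name",
    "interest": {"@id": "http://xmlns.com/foaf/0.1/interest"},
}
child = {"@vocab": "http://xmlns.com/foaf/0.1/"}
nested = nest_child_context(parent, "interest", child)

document = {
    "@context": nested,
    "name": "Manu Sporny",
    "interest": {
        "@id": "https://www.w3.org/TR/json-ld/",
        "@context": child,
        "name": "JSON-LD",
        "topic": "Linking Data",
    },
}
cleaned = strip_value_contexts(document)
```

The helpers copy their inputs rather than mutating them, so the flat parent context stays usable while the nested one is being built.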

jsonld.ib now has a new type parameter that allows the type (or a fully namespaced string representation of the type, to help with dependency hell) to be set for a property. The type's context is then automatically added to the top-level @context on load.

So for a collection, it's still (example is for the Dataset class):

creator = jsonld.container.list(
    Creator,
    converter=_convert_dataset_creator,
    context='schema:creator',
    kw_only=True
)

with Creator being the class whose @context gets added to the Dataset's @context (if Dataset is the top-level element).
And for single-valued properties (one-to-one relations), like project.creator, it is now:

creator = jsonld.ib(
    default=None, kw_only=True, context='schema:creator', type=Creator
)

with type=Creator being new.
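For the string form of type, one way such a lookup could work is sketched below. This is an assumption about the general approach, not the implementation in this PR; resolve_type is a hypothetical name, and the example resolves a standard library class only so that it runs on its own.

```python
import importlib
from collections import OrderedDict


def resolve_type(type_):
    """Return a class for a class object or a 'package.module.Class' string."""
    if isinstance(type_, str):
        module_name, _, class_name = type_.rpartition(".")
        if not module_name:
            raise ValueError(f"expected a fully namespaced name, got {type_!r}")
        # Importing lazily, at lookup time, is what sidesteps circular imports.
        module = importlib.import_module(module_name)
        return getattr(module, class_name)
    return type_  # already a class: nothing to resolve


from_string = resolve_type("collections.OrderedDict")
from_class = resolve_type(OrderedDict)
```

Deferring the import until the type is actually needed is the reason a string can help with dependency problems between model modules.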

Old contexts get adjusted automatically on loading (since the code responsible for this now actually does its job), and if they're persisted they will have the new JSON-LD. Child-level property names don't have to be added manually to the top-level context anymore.

As this is a rather drastic rewrite of how we handle JSON-LD, we should be really sure it works before merging.

Other changes:

Creator has been moved to its own file, to prevent circular dependencies.
All JSON-LD now specifies that it's version 1.1, as otherwise scoped @contexts are not available.
Fixed a bug caused by JSON-LD compacting single-element arrays to just the element (which broke DatasetTag collections when there was only one tag).
Fixed a bug where raising an exception caused another exception.
Added the schema.org definition to Activity, as it's needed for CommitMixin. (This should not have to be done on every class inheriting from CommitMixin! We should fix this.) This change is also in "renku log to generate single identifier for dataset imports of the same resource" #719, as it was needed in both places.
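Regarding the single-element array item in the list above: one common way to guard against it on the reading side is to normalize the compacted value back into a list. The sketch below only illustrates that pattern and is not necessarily how this PR fixes it; ensure_list is a hypothetical helper.

```python
def ensure_list(value):
    """Undo compaction of single-element arrays: always hand back a list."""
    if value is None:
        return []  # missing property: treat as an empty collection
    if isinstance(value, (list, tuple)):
        return list(value)
    return [value]  # a lone element that compaction unwrapped


single_tag = ensure_list({"name": "v1"})
many_tags = ensure_list([{"name": "v1"}, {"name": "v2"}])
no_tags = ensure_list(None)
```

JSON-LD also lets a term definition declare "@container": "@set" so that compaction keeps the array form; whether that fits here depends on how the contexts are generated.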

jsam

Looks great. I've played with it a bit and it looks like nothing broke. Thanks a lot for looking into this. Let's resolve conflicts and get this merged! 🚀

jsam

Thank you.

Ralf Grubenmann added 5 commits October 9, 2019 12:41

Adds @context to toplevel @context instead of to values

2753f78

Fixes @context missing @id

36d84e2

Annotates json-ld properties with type

b7d3830

Merge branch 'master' into 749-json-ld-compacting

3041922

Fixes unit tests

5fbae23

Panaetius requested a review from a team as a code owner October 9, 2019 15:37

Ralf Grubenmann added 2 commits October 9, 2019 17:42

cleanup

07eaebc

Adds missing type declaration

79ee728

Panaetius mentioned this pull request Oct 10, 2019

JSON-LD Triple generation does not work for all attributes #725

Closed

Panaetius added the needs ✋ testing label Oct 10, 2019

Ralf Grubenmann and others added 2 commits October 10, 2019 16:21

Merge branch 'master' into 749-json-ld-compacting

bc6fb0c

Merge branch 'master' into 749-json-ld-compacting

075fdee

jsam previously approved these changes Oct 15, 2019

Panaetius dismissed jsam’s stale review via 05af025 October 15, 2019 14:37

Ralf Grubenmann and others added 2 commits October 15, 2019 16:37

Merge branch 'master' into 749-json-ld-compacting

05af025

Merge branch 'master' into 749-json-ld-compacting

9302c81

jsam approved these changes Oct 15, 2019

Panaetius merged commit 2b1948d into master Oct 16, 2019

Panaetius deleted the 749-json-ld-compacting branch October 16, 2019 06:33
