Consider whether to address JSON-LD Support, RDF Vocabulary and SPARQL #121

Open
timgdavies opened this issue Oct 5, 2018 · 6 comments
@timgdavies
Contributor

I've been exploring how we can work with BODS data as a Linked Data graph.

At least at a basic level it is possible to convert BODS JSON -> JSON-LD -> RDF with the addition of a short '@context' element.

A worked example of that is available here using the @context:

{
  "@context": {
    "@vocab": "http://bods.openownership.org/ns/",
    "@base": "http://example.bods.openownership.org/statements/",
    "statementID": "@id",
    "statementType": "@type",
    "describedByEntityStatement": {
      "@type": "@id"
    },
    "describedByPersonStatement": {
      "@type": "@id"
    }
  }
}

Simply adding this to the top of a BODS file and running it through a JSON-LD parser returns an easy-to-work-with graph, which can then be explored with SPARQL. Using SPARQL 1.1 property paths, the data is navigable.
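
As a minimal sketch of the "add a @context" step, the snippet below wraps a list of BODS statements in a JSON-LD document using only the Python standard library. The `to_jsonld` helper, the use of `@graph` to hold the statement list, and the three sample statements are assumptions made for illustration; they are not taken from the notebook or from real BODS data.

```python
import json

# The @context from this issue, reproduced as a Python dict.
BODS_CONTEXT = {
    "@vocab": "http://bods.openownership.org/ns/",
    "@base": "http://example.bods.openownership.org/statements/",
    "statementID": "@id",
    "statementType": "@type",
    "describedByEntityStatement": {"@type": "@id"},
    "describedByPersonStatement": {"@type": "@id"},
}


def to_jsonld(statements):
    """Wrap a list of BODS statements in a JSON-LD document.

    Assumption: the statements are placed under "@graph" so that a
    top-level JSON array can carry a single shared @context.
    """
    return {"@context": BODS_CONTEXT, "@graph": list(statements)}


# Hypothetical, heavily trimmed statements (illustrative only).
sample = [
    {"statementID": "ent-1", "statementType": "entityStatement"},
    {"statementID": "per-1", "statementType": "personStatement"},
    {
        "statementID": "ooc-1",
        "statementType": "ownershipOrControlStatement",
        "subject": {"describedByEntityStatement": "ent-1"},
        "interestedParty": {"describedByPersonStatement": "per-1"},
    },
]

doc = to_jsonld(sample)
print(json.dumps(doc, indent=2))
```

The resulting document can then be handed to any JSON-LD processor to obtain RDF triples for querying.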

I've got a Jupyter notebook exploring converting and querying the data here.

Implications

  • It may be useful to create worked examples in the documentation showing how this approach can help exploring BODS data. (Although understanding of SPARQL varies, it is a good general query language for graph data, so potentially v. useful as a teaching tool about BODS structures);

  • It may be useful to maintain a canonical JSON-LD @context;

  • There may be value in converting the schema to an ontology.

Limitations

If we want to provide an actual ontology, with properties living at the locations implied by the @context, then we would hit a problem wherever the current schema re-uses a term with different descriptions and semantics depending on where it appears in the JSON tree.

For example, at present the schema has:

  • personStatement/identifiers - defined as "One or more official identifiers for this person. Where available, official registration numbers should be provided."

and

  • entityStatement/identifiers - defined as "One or more official identifiers for this entity. Where available, official registration numbers should be provided."

The @context converts both of these instances to the graph property <http://bods.openownership.org/ns/identifiers>.
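
The collision can be illustrated with a deliberately simplified model of how `@vocab` resolves terms. The `expand_term` function below is a hypothetical stand-in for one small part of JSON-LD expansion (a term with no explicit definition becomes `@vocab` + term); it is not a JSON-LD processor.

```python
VOCAB = "http://bods.openownership.org/ns/"


def expand_term(term, vocab=VOCAB):
    """Simplified @vocab expansion: an undefined term maps to vocab + term.

    The position of the term in the JSON tree plays no part, which is
    exactly why the two 'identifiers' properties end up identical.
    """
    return vocab + term


# Schema locations of the two properties, as (parent, term) pairs.
person_path = ("personStatement", "identifiers")
entity_path = ("entityStatement", "identifiers")

person_iri = expand_term(person_path[-1])
entity_iri = expand_term(entity_path[-1])

print(person_iri)
print(person_iri == entity_iri)  # True: both map to the same property IRI
```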

If we were to provide schema information returned at http://bods.openownership.org/ns/identifiers, then there would be ambiguity over the definition of this property. The options here I think would be to:

  • Never provide an RDF vocabulary, and leave all properties non-dereferenceable
  • Give a more detailed @context (can this be done?)
  • Use different property names whenever definition is different
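
On the second option, JSON-LD 1.1 introduced scoped contexts, which allow a term's definition to depend on the `@type` of the node it appears in. The sketch below shows what a type-scoped @context might look like, built as a Python dict. The property names `personIdentifiers` and `entityIdentifiers` are hypothetical, and this has not been run through a JSON-LD processor against real BODS data; `identifiers_iri` is a plain dict lookup for illustration only.

```python
import json

NS = "http://bods.openownership.org/ns/"

# A type-scoped context (JSON-LD 1.1): each statement type carries its own
# nested @context that redefines 'identifiers' for nodes of that type.
scoped_context = {
    "@version": 1.1,
    "@vocab": NS,
    "statementID": "@id",
    "statementType": "@type",
    "personStatement": {
        "@id": NS + "personStatement",
        # Hypothetical property IRI, not part of any published vocabulary.
        "@context": {"identifiers": NS + "personIdentifiers"},
    },
    "entityStatement": {
        "@id": NS + "entityStatement",
        # Hypothetical property IRI, not part of any published vocabulary.
        "@context": {"identifiers": NS + "entityIdentifiers"},
    },
}


def identifiers_iri(statement_type):
    """Return the IRI that 'identifiers' is scoped to for a statement type.

    This reads the dict above directly; it does not perform JSON-LD expansion.
    """
    return scoped_context[statement_type]["@context"]["identifiers"]


print(json.dumps({"@context": scoped_context}, indent=2))
print(identifiers_iri("personStatement"))
print(identifiers_iri("entityStatement"))
```

Whether this is preferable to renaming properties in the schema itself is a separate design question; scoped contexts keep the JSON unchanged but require a JSON-LD 1.1 processor.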

I don't think any of this is a major issue right now, but I'm documenting it for consideration.

@timgdavies timgdavies changed the title JSON-LD Support and ontology JSON-LD Support, RDF Vocabulary and SPARQL Oct 5, 2018
@timgdavies timgdavies self-assigned this Mar 23, 2020
@timgdavies timgdavies changed the title JSON-LD Support, RDF Vocabulary and SPARQL Consider whether to address JSON-LD Support, RDF Vocabulary and SPARQL Mar 23, 2020
@Jeffrey04

So is publishing the standard as RDF being considered? Would you also consider compatibility with the Popolo schema?

https://www.popoloproject.com/

@stevenday

@Jeffrey04 - this is not something we currently have in scope for version 1.0. Do you have a use case for it that you could elaborate on?

On Popolo, our initial data modelling looked at a wide range of data standards, including Popolo, and I think our personStatement and entityStatement map across quite well to the basic fields of Popolo Persons and Organizations. The naming conventions might be slightly different, but I understand (for example) that Sinar Project's Politikus: https://sinarproject.org/transparency can produce Popolo and BODS representations of the same data quite easily.

Is there something specific that you'd like us to do more of in this regard, or should we just flag up the existing similarities better? Perhaps drop us an email at support@openownership.org if you'd like to discuss a project in more detail; this issue tracker is really just used for proposals to the standard.

@Jeffrey04

The import script that I built for sinarproject https://github.com/Jeffrey04/popit_relationship is very loosely designed to be somewhat compatible with RDF (I include enough data in the imported cache that a proper RDF graph can be generated if needed), so I thought it would be nice if this were published as RDF, as I see some overlap between this and Popolo.

@Jeffrey04

[image: example graph]

This is the example graph we generated. Besides the Popolo spec, we also used https://vocab.org/relationship/ for relationships between people.

@StephenAbbott
Member

StephenAbbott commented Jul 31, 2022

Updating this issue with work that Open Ownership did with Blue Anvil in 2020 and 2021 to model a Resource Description Framework (RDF) vocabulary for BODS (h/t @cosmin-marginean).

Here's a document discussing the principles and technical details of this proposal:
https://docs.google.com/document/d/1vej-UkK7QtmfKrmU6aD15vceIzJDsCv1jbHCJWgn9hs

There is also a GitHub repository containing related tooling and code samples, as well as some SPARQL queries that show the vocabulary working in practice.

This work was written up on the Open Ownership website: https://www.openownership.org/en/blog/an-rdf-vocabulary-for-beneficial-ownership-data-created-with-blue-anvil/

@StephenAbbott
Member

On 21 September 2023, Open Ownership announced a proof-of-concept project called BODS risk detection to demonstrate the use of BODS data in RDF format: https://www.openownership.org/en/blog/spotting-risks-by-combining-beneficial-ownership-public-procurement-and-sanctions-data/

See https://github.com/openownership/bodsriskdetection


5 participants