This issue was moved to a discussion.

Proper semantic type support #343

Closed
pwalsh opened this issue Dec 18, 2016 · 14 comments
Comments

@pwalsh
Member

pwalsh commented Dec 18, 2016

Context

I raised having a model spec in frictionlessdata/frictionlessdata.io#854, as model in Fiscal Data Package is basically a way to declare semantic types over one or more resources, and it is very useful and generally applicable.

Recently I've been diving into rdfType in the JSON Table Schema spec, and the way it is specified there is not very useful:

  • It is a field-level property
  • It provides no guidance on how the value should be serialised as a property on an instance of an RDF class
  • It misses the most compelling use case which is to capture multiple fields as properties of a larger concept. This is exactly what model does for Fiscal Data Package.

I think we should:

  1. Deprecate rdfType as a field-level property in the JSON Table Schema spec
  2. Generalise the model spec
  3. Allow model to be declared on a resource (map multiple fields in a single resource to higher-level concepts), and on a package (map across multiple resources)

We have a real living use case for this right now in Fiscal Data Package, and we also have some implementation of it in OpenSpending which is not yet officially in the specification.

I have two short term goals here:

  1. Iterate on the next version of Fiscal Data Package with more explicit emphasis on a set of SHOULD semantic types, in line with the great work on OBEU Data Model, and the work that @akariv has done with OS Data Types.
  2. Have an easy, generic way to implement semantic type support across resources and packages.
@rufuspollock
Contributor

@pwalsh very sound suggestion. I think we may want to think a bit further about whether rdfType offers some value before we deprecate.

One important point: is this post or pre v1?

@pwalsh
Member Author

pwalsh commented Dec 20, 2016

post v1.

@rufuspollock rufuspollock added this to the Backlog milestone Dec 21, 2016
@pwalsh pwalsh self-assigned this Feb 5, 2017
@pwalsh pwalsh modified the milestones: v1.1, Backlog Feb 5, 2017
@ppKrauss

Hi, I need to study all the issues here better... But there is perhaps an alternative, preserving rdfType and enhancing it, see https://discuss.okfn.org/t/enhancing-table-schema-with-resources-rdftype-and-fileds-rdftype-property/5365

@danfowler
Contributor

Perhaps it's also worth reaching out post-v1 to data.world to determine their motivation for using this field in their published Data Packages:

[Screenshot dated 2017-05-25, 16:41:48; image not preserved]

@rufuspollock
Contributor

@ppKrauss I definitely think expanding rdfType to cover properties as well as classes would make sense -- and I thought about that when doing the first pass on it. This is something potentially worth doing for v1 (if so we should create a dedicated issue).

@ppKrauss

Hi @rufuspollock, I am trying to create a dedicated issue, see #451.

Hi @danfowler, your example is a kind of "datatype defined by URL" (?)... Perhaps, to avoid confusion in the semantic use of rdfType, we need another descriptor, e.g. typeUrl.

@ioggstream

ioggstream commented Jul 11, 2022

Since rdfType is limited to an rdfs:Class, it is often redundant:

  • the syntactic type is defined in the frictionless schema (e.g. type: string), so when the rdfs:Class is an rdfs:Datatype we could end up with conflicting information;
  • limiting the metadata description to an rdfs:Class rules out semantic information like properties, e.g. the following is not valid because schema.org/givenName is not an rdfs:Class:
fields:
- name: given_name
  type: string
  rdfType: https://schema.org/givenName

The rdfProp proposed in #451 instead allows mapping CSV to RDF:

given_name,family_name
Roberto,Polli

->

_:b1 <https://schema.org/givenName> "Roberto" ;
     <https://schema.org/familyName> "Polli" .

In general, a more comprehensive approach that works on the whole table rather than on each single field could simplify resolving this issue, e.g. attaching a JSON-LD context outside fields.
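
As a rough illustration of the CSV-to-RDF mapping above, here is a minimal Python sketch. It assumes each field carries an `rdfProp` URI as proposed in #451 (not part of Table Schema today); the function names and the one-blank-node-per-row scheme are hypothetical choices for the example, and the output is N-Triples rather than the abbreviated Turtle form.

```python
import csv
import io

# Hypothetical schema: `rdfProp` is the property proposed in #451, not part of Table Schema.
SCHEMA = {
    "fields": [
        {"name": "given_name", "type": "string", "rdfProp": "https://schema.org/givenName"},
        {"name": "family_name", "type": "string", "rdfProp": "https://schema.org/familyName"},
    ]
}

CSV_TEXT = "given_name,family_name\nRoberto,Polli\n"


def escape_literal(value):
    """Escape backslashes, quotes and line breaks for an N-Triples string literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def rows_to_ntriples(csv_text, schema):
    """Emit one triple per (row, mapped field) pair, using one blank node per row."""
    props = {f["name"]: f["rdfProp"] for f in schema["fields"] if "rdfProp" in f}
    lines = []
    for index, row in enumerate(csv.DictReader(io.StringIO(csv_text)), start=1):
        subject = f"_:b{index}"
        for name, value in row.items():
            # Skip columns without a mapping and cells missing from short rows.
            if name in props and value is not None:
                lines.append(f'{subject} <{props[name]}> "{escape_literal(value)}" .')
    return lines


print("\n".join(rows_to_ntriples(CSV_TEXT, SCHEMA)))
```

Every value is emitted as a plain string literal here; typed literals and IRIs as objects would need extra per-field information.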

@roll roll removed this from the v1.1 milestone Apr 14, 2023
@roll roll unassigned pwalsh Jan 3, 2024
@nichtich
Contributor

nichtich commented Jan 7, 2024

Mapping tabular data to RDF should build on JSON-LD. Maybe we can define a mapping from Table Schema to a JSON-LD context document and a way to transform tabular data using this schema to JSON which can then be transformed with the JSON-LD context. May sound complicated but less complicated than defining a custom method of mapping from tabular data to RDF.
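
The "tabular data to JSON" half of that idea could look roughly like the Python sketch below. It only reads the `name` and `type` keys of a Table Schema; the `CASTS` table and the function name `rows_to_json` are assumptions made for the example, and the sample row is made up.

```python
import csv
import io
import json

# Table Schema fragment; only the `name` and `type` keys are read here.
SCHEMA = {
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "age", "type": "integer"},
        {"name": "homepage", "type": "string"},
    ]
}

# Deliberately tiny set of casts; real Table Schema types have many more rules.
CASTS = {"string": str, "integer": int, "number": float}

# Made-up sample data for illustration only.
CSV_TEXT = "name,age,homepage\nAlice,42,http://example.org/alice\n"


def rows_to_json(csv_text, schema):
    """Turn CSV rows into JSON-ready dicts, casting each cell by its declared type."""
    casts = {f["name"]: CASTS.get(f.get("type", "string"), str) for f in schema["fields"]}
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Columns that the schema does not declare are dropped.
        records.append({name: casts[name](value) for name, value in row.items() if name in casts})
    return records


print(json.dumps(rows_to_json(CSV_TEXT, SCHEMA), indent=2))
```

The resulting records carry no RDF semantics on their own; they become JSON-LD only once a context is attached, for example by wrapping them as `{"@context": ..., "@graph": records}` and handing that to a JSON-LD processor.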

@ioggstream

@nichtich in Italy we are publishing APIs using this specification to bridge JSON Schema and JSON-LD using JSON-LD Framing https://www.ietf.org/archive/id/draft-polli-restapi-ld-keywords-03.html

An easy way to integrate with this specification is the following:

  1. associate a JSON-Schema with a frictionless data table;
  2. populate the x-jsonld-type and x-jsonld-context keywords in the JSON-Schema;
  3. use the JSON-LD Framing specification to convert the data to JSON-LD and back.
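
Steps 1 and 2 above might be sketched as follows in Python. This is an illustration under assumptions, not the draft's reference behaviour: the `TYPE_MAP` table, both function names, the `Person` type and the `@vocab`-based context are all made up for the example, and `annotate` reflects only one reading of how the two keywords relate to an instance. Step 3 (framing) needs a real JSON-LD processor and is not shown.

```python
import json

# Assumed, deliberately small mapping from Table Schema types to JSON Schema types.
TYPE_MAP = {"string": "string", "integer": "integer", "number": "number", "boolean": "boolean"}


def table_schema_to_json_schema(table_schema, jsonld_type, jsonld_context):
    """Build a JSON Schema describing one row and attach the two LD keywords."""
    properties = {
        # Unknown Table Schema types fall back to "string" (an assumption).
        field["name"]: {"type": TYPE_MAP.get(field.get("type", "string"), "string")}
        for field in table_schema["fields"]
    }
    return {
        "type": "object",
        "x-jsonld-type": jsonld_type,
        "x-jsonld-context": jsonld_context,
        "properties": properties,
    }


def annotate(row, json_schema):
    """Attach @context and @type taken from the schema keywords to one row."""
    return {
        "@context": json_schema["x-jsonld-context"],
        "@type": json_schema["x-jsonld-type"],
        **row,
    }


TABLE_SCHEMA = {
    "fields": [
        {"name": "given_name", "type": "string"},
        {"name": "family_name", "type": "string"},
    ]
}
# Hypothetical context mapping the column names onto schema.org properties.
CONTEXT = {"@vocab": "https://schema.org/", "given_name": "givenName", "family_name": "familyName"}

ROW_SCHEMA = table_schema_to_json_schema(TABLE_SCHEMA, "Person", CONTEXT)
print(json.dumps(annotate({"given_name": "Roberto", "family_name": "Polli"}, ROW_SCHEMA), indent=2))
```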

These slides provide some more context https://docs.google.com/presentation/d/1o__StiDhbNEfB0Htf-lsTtLm8zzSXmLiyD9QSKxkXIQ/edit#slide=id.g13c8404df7f_0_24

cc: @mfortini

@nichtich
Contributor

nichtich commented Jan 9, 2024

I wrote:

> a mapping from Table Schema to a JSON-LD context document [...] May sound complicated

Actually it only requires an optional root property @context (or another name) to hold the JSON-LD context and an optional field property (e.g. definition) to hold a JSON-LD term definition for a field. An example (taken from JSON-LD example 40):

{
  "@context": {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "name": "foaf:name" // term definition in root context
  },
  "fields": [
    { "name": "name", "type": "string" }, // term definition given above 
    {
      "name": "age",
      "type": "integer", 
      "definition": { // term definition inline at the field
        "@id": "foaf:age",
        "@type": "xsd:integer"
      }
    },
    {
      "name": "homepage",
      "type": "string",
      "definition": {
        "@id": "foaf:homepage",
        "@type": "@id"
      }
    }
  ]
}

This can be combined into one JSON-LD context:

{
  "@context": {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "name": "foaf:name",
    "age": {
      "@id": "foaf:age",
      "@type": "xsd:integer"
    },
    "homepage": {
      "@id": "foaf:homepage",
      "@type": "@id"
    }
  }
}
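
The combination step is mechanical. A minimal Python sketch, assuming the property names `@context` and `definition` proposed in this comment (neither exists in Table Schema); the function name `build_context` and the rule that a field-level definition wins over a same-named root term are choices made for the example:

```python
import json


def build_context(schema, context_key="@context", term_key="definition"):
    """Merge the root context with per-field term definitions into one JSON-LD context."""
    context = dict(schema.get(context_key, {}))
    for field in schema.get("fields", []):
        if term_key in field:
            # Assumed precedence: a field-level definition overrides the root term.
            context[field["name"]] = field[term_key]
    return {"@context": context}


SCHEMA = {
    "@context": {
        "foaf": "http://xmlns.com/foaf/0.1/",
        "xsd": "http://www.w3.org/2001/XMLSchema#",
        "name": "foaf:name",
    },
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "age", "type": "integer",
         "definition": {"@id": "foaf:age", "@type": "xsd:integer"}},
        {"name": "homepage", "type": "string",
         "definition": {"@id": "foaf:homepage", "@type": "@id"}},
    ],
}

print(json.dumps(build_context(SCHEMA), indent=2))
```

Passing `context_key="x-jsonld-context"` would apply the same merge to a differently named root property.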

The content of @context and definition, and how to use them to transform data to RDF, are out of scope for the Table Schema specification, as it is all defined in JSON-LD. People and applications not interested in RDF can just ignore both properties.

@ioggstream

Hi @nichtich, I think there's more here than meets the eye. For example:

  • adding a @context directly in Table Schema actually makes the schema a JSON-LD document, which it is not. This is the reason why in LD-Keywords we use specific keywords (e.g., x-jsonld-context and x-jsonld-type)
  • every field could have its own context (e.g., when the entries refer to a vocabulary).

I think that we could consider an extension to Table Schema based on LD Keywords, provided some interoperability checks.

Another option could be to use JSON-Schema instead of Table Schema for this use case: this could benefit people that need to publish REST APIs based on frictionless data.

WDYT?

@pwalsh
Member Author

pwalsh commented Jan 10, 2024

Hey @nichtich @ioggstream I'm not sure about this. This is actually an issue I started 7 years ago and to my knowledge there is very little interest in really trying to add comprehensive RDF/JSON-LD support directly to the specs, which is where the recent discussion has led.

Do we really need it in the spec, and why?

I think it would be very cool if we have implementations or published patterns on mapping, say, Table Schema to JSON-LD. And possibly some of the work @akariv did on column types for Fiscal Data Package ref. which does have a semantic concept layer, could be useful as a more general starting point.

@nichtich
Contributor

@ioggstream good points. We can use a name other than @context; e.g. x-jsonld-context as defined by LD-Keywords fully serves the purpose and would be enough. As far as I understand LD-Keywords, the x-jsonld-type can be mapped to the current rdfType by expanding its value to a full URI.
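
A simplified Python sketch of that expansion, to show what "expanding to a full URI" involves. The function name `expand_term` is hypothetical, and it handles only two cases (a `prefix:suffix` compact IRI and a bare term resolved via `@vocab`); the full JSON-LD IRI expansion algorithm covers more, so a real implementation should delegate to a JSON-LD processor.

```python
def expand_term(value, context):
    """Expand a compact IRI or a bare term to a full IRI, using a flat JSON-LD context."""
    if ":" in value:
        prefix, suffix = value.split(":", 1)
        if suffix.startswith("//"):
            return value  # already absolute, e.g. https://...
        base = context.get(prefix)
        if isinstance(base, str):
            return base + suffix  # compact IRI such as foaf:Person
        return value  # unknown prefix: leave untouched
    vocab = context.get("@vocab")
    if isinstance(vocab, str):
        return vocab + value  # bare term resolved against @vocab
    return value


print(expand_term("foaf:Person", {"foaf": "http://xmlns.com/foaf/0.1/"}))
print(expand_term("Person", {"@vocab": "https://schema.org/"}))
```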

> Another option could be to use JSON-Schema instead of Table Schema

Sure, there are alternatives to the Table Schema specification and we should not try to fully reinvent them. If supporting information on how to map tabular data to RDF is possible without much complexity, I'd nevertheless argue for including it in the specification.

> every field could have its own context

I don't see a use case, as most parts of a JSON-LD context can already be specified in a JSON-LD term definition. In short, the root element x-jsonld-context (or another name) would already be enough. An additional element at field level (x-jsonld-term or another name) would be just a convenient shortcut.

@ioggstream

> In short, the root element x-jsonld-context (or another name) would already be enough

In Italy, there was the idea to standardize CSV data exchanges between agencies using frictionless: I think @mfortini is still working on that. Since public sector data can be complex, the idea was to have a general and reliable solution capable of covering most of the use cases.

@frictionlessdata frictionlessdata locked and limited conversation to collaborators Oct 21, 2024
@roll roll converted this issue into discussion #996 Oct 21, 2024
