for dcat:byteSize the range is dct:decimal: will this change to xsd:positiveInteger? #214

Open
sabinem opened this issue Dec 7, 2021 · 3 comments
Labels
release:3.0.0 https://semiceu.github.io/DCAT-AP/releases/3.0.0 status:fixed This issue has been fixed in a draft.

sabinem commented Dec 7, 2021

We are currently improving the DCAT-AP conformance of DCAT-AP CH. In this regard, I have two questions regarding dcat:byteSize.

Our specification currently defines it as rdfs:Literal typed as xsd:decimal, but we implemented it as rdfs:Literal typed as xsd:integer on our open data portal.

My questions are:

  1. Does the implementation conform to the specification, given that xsd:integer is a type derived from xsd:decimal, or do we need to change our implementation to conform to our own specification and also to DCAT-AP?
  2. I noticed that DCAT version 3 recommends rdfs:Literal typed as xsd:nonNegativeInteger (https://www.w3.org/TR/vocab-dcat-3/#Property:distribution_size) instead of the rdfs:Literal typed as xsd:decimal that it had in version 2 (https://www.w3.org/TR/vocab-dcat-2/#Property:distribution_size). Will DCAT-AP go along with this? Does it even make sense, then, to move away from our integer implementation?

I would appreciate learning where DCAT-AP intends to go with this, and also how derived types such as xsd:integer conform to base types such as xsd:decimal (see the hierarchy of XSD types here: https://www.w3.org/TR/xpath-datamodel-3/#types-hierarchy).

@bertvannuffelen
Contributor

@sabinem this is a non-trivial issue, and the problem is more on the client side than on the "semantics side". On the semantics side, the derivation relationships between the XSD types can be (and are) considered part of the specification, so using xsd:integer instead of xsd:decimal is acceptable for an implementation.

However, the problem is on the client side: if the client does not apply this reasoning, it might consider the data incorrect.
Take for instance our SHACL templates: unless you encode in the SHACL all XSD types compatible with xsd:decimal, a SHACL validator will conclude that the data is incorrect. As nobody can control the amount of inference a client will do, this is an unsolvable issue.
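
To make the difference concrete, here is a minimal Python sketch of the two client behaviours. It is not how any particular SHACL validator is implemented; the function names `strict_check` and `is_derived_from` are hypothetical, and the parent table only covers the built-in XSD types derived from xsd:decimal.

```python
# Parent links for the built-in XSD types derived from xsd:decimal.
# Only the numeric branch of the XSD type hierarchy is listed here.
XSD_PARENT = {
    "xsd:integer": "xsd:decimal",
    "xsd:nonPositiveInteger": "xsd:integer",
    "xsd:negativeInteger": "xsd:nonPositiveInteger",
    "xsd:long": "xsd:integer",
    "xsd:int": "xsd:long",
    "xsd:short": "xsd:int",
    "xsd:byte": "xsd:short",
    "xsd:nonNegativeInteger": "xsd:integer",
    "xsd:unsignedLong": "xsd:nonNegativeInteger",
    "xsd:unsignedInt": "xsd:unsignedLong",
    "xsd:unsignedShort": "xsd:unsignedInt",
    "xsd:unsignedByte": "xsd:unsignedShort",
    "xsd:positiveInteger": "xsd:nonNegativeInteger",
}


def strict_check(datatype: str, expected: str) -> bool:
    """Client without inference: the datatype IRI must match exactly."""
    return datatype == expected


def is_derived_from(datatype: str, base: str) -> bool:
    """Client with inference: walk up the derivation chain until base is found."""
    current = datatype
    while current is not None:
        if current == base:
            return True
        current = XSD_PARENT.get(current)  # None once the chain ends
    return False


# A value typed xsd:integer against a range of xsd:decimal:
print(strict_check("xsd:integer", "xsd:decimal"))     # False
print(is_derived_from("xsd:integer", "xsd:decimal"))  # True
```

Both clients read the same data and the same specification, but only the second one accepts the value.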

So if your profile uses a more restrictive range definition, this is all fine in my view. Note that XSD allows you to create your own datatypes (with min and max constraints); using these is also fine.

Today there are no expectations formulated that a client application must implement dereferencing (be it of types, definitions, profiles, or data entities).

On your second question: is there an intent to align with W3C DCAT? The answer is yes.

But this case shows a funny opposite situation: the specification exploits more of XSD than developers actually do. I seldom see values being typed as xsd:nonNegativeInteger. Most implementations internally use the integer datatype and block negative numbers in program logic. (So far I have not encountered a programming language that has an initialisation like var x = NonNegativeInteger())
In order to comply with this XSD typing requirement, implementers have to code an additional rule: at the time of producing the RDF representation, the value must be an xsd:nonNegativeInteger; otherwise there is not only a business error but also a serialization error. This might lead to complaints from the RDF parser that were not present before, and these become a major issue because the whole input will be disregarded (a parser is binary: parseable or non-parseable). To resolve a parser issue, a developer has to investigate the problem, so as long as no person looks into it, the source cannot be harvested, because parsing happens before SHACL validation (business level) can take place. This is a tedious and complex process, because the source of the parsing problem is not a technical problem but a business problem that has been turned into a technical problem.
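
As an illustration of the additional rule, a producer could guard the value just before serialization. This is only a sketch under assumptions: the function name `byte_size_literal` is hypothetical, the output is a Turtle-style literal, and the `xsd:` prefix is assumed to be declared in the exported document.

```python
def byte_size_literal(value) -> str:
    """Return a Turtle-style literal for dcat:byteSize, or raise ValueError.

    The check runs on the producer side, so an invalid value is reported
    as a business error before any RDF is written.
    """
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError(f"byteSize must be an integer, got {value!r}")
    if value < 0:
        raise ValueError(f"byteSize must not be negative, got {value}")
    return f'"{value}"^^xsd:nonNegativeInteger'


print(byte_size_literal(1024))  # "1024"^^xsd:nonNegativeInteger
```

Without such a guard, a value like -1 would be written with the xsd:nonNegativeInteger datatype and only be detected downstream by the harvester.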

So from a semantic perspective xsd:nonNegativeInteger is fine, but from the implementation perspective I would recommend not using it and leaving the enforcement of non-negativeness to the implementers.

From the book of fun with harvesting: this category of problems also includes "spaces in URIs". Many non-native RDF systems create URIs for their DCAT export by concatenating values, but they do not check whether the produced URI is valid. The harvesting then fails with a parse error.
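
A small sketch of how an exporter could avoid this, assuming URIs are built from a base and a free-text identifier. The helper names `build_dataset_uri` and `has_illegal_whitespace` and the example.org base are made up for illustration; `urllib.parse.quote` is from the Python standard library.

```python
from urllib.parse import quote


def build_dataset_uri(base: str, identifier: str) -> str:
    """Concatenate base and identifier, percent-encoding the identifier."""
    # safe="" also encodes "/" so the identifier stays a single path segment.
    return base.rstrip("/") + "/" + quote(identifier, safe="")


def has_illegal_whitespace(uri: str) -> bool:
    """Cheap pre-export check: whitespace is never allowed in a URI."""
    return any(ch.isspace() for ch in uri)


naive = "https://example.org/dataset/" + "my data set"
print(has_illegal_whitespace(naive))  # True
print(build_dataset_uri("https://example.org/dataset/", "my data set"))
# https://example.org/dataset/my%20data%20set
```

The whitespace check is deliberately narrow; it does not validate the URI in full, it only catches the specific problem described above.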

sabinem commented Jan 12, 2022

@bertvannuffelen Thank you so much for your detailed and thorough answer on this. It helped me a lot to better understand the issue at hand. Regarding the harvesting, I agree; I have also encountered that issue with URIs before. I have no experience with implementing a nonNegativeInteger, but thanks for your opinion that it would not be advisable. This probably means that DCAT-AP might not go this way any time soon, even though it would be an option for semantic reasons.

@bertvannuffelen
Contributor

On this topic I would like to refer to an issue raised in W3C DCAT, w3c/dxwg#1536, which is worth being aware of when using native JSON numbers in REST API payloads and converting them to DCAT.
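
One way to read the concern, without reproducing the linked issue here: a native JSON number carries no XSD datatype, so the converter has to attach one explicitly. The sketch below assumes a hypothetical REST payload with a `byteSize` key and emits a JSON-LD style typed value object; the function name `byte_size_to_jsonld` is made up for illustration.

```python
import json


def byte_size_to_jsonld(payload: str) -> dict:
    """Convert a native JSON number into an explicitly typed value object."""
    record = json.loads(payload)
    size = record["byteSize"]  # hypothetical key in the REST payload
    # json.loads yields int for 1024 and float for 1024.0 or 1e3;
    # only a non-negative int is accepted here.
    if isinstance(size, bool) or not isinstance(size, int) or size < 0:
        raise ValueError(f"byteSize must be a non-negative integer, got {size!r}")
    return {
        "dcat:byteSize": {
            "@value": str(size),
            "@type": "xsd:nonNegativeInteger",
        }
    }


print(byte_size_to_jsonld('{"byteSize": 1024}'))
```

Writing the value as a string with an explicit `@type` avoids relying on whatever default datatype a JSON-LD processor assigns to native numbers.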

bertvannuffelen added the release:3.0.0 https://semiceu.github.io/DCAT-AP/releases/3.0.0 label Jun 21, 2023