Clarification of character encoding definition #88

Open
jakubklimek opened this issue Mar 3, 2024 · 1 comment
Labels
alignment:dcat-ap-3.0 DCAT-AP 3 alignment related release:3.0.0 Actively being worked on for GeoDCAT-AP 3.0.0 status:fixed Resolution applied in draft type:improvement Improvement of current handling of a problem webinar:2024-03-12 To be discussed in the 2024-03-12 webinar

jakubklimek commented Mar 3, 2024

Problem statement
In GeoDCAT-AP 2.0.0, the usage note of character encoding refers to the whole instance of the class on which it is used.

  • For Catalogue Record:

This property SHOULD be used to specify the character encoding of the Catalogue Record

  • For Distribution:

This property SHOULD be used to specify the character encoding of the Distribution

This is confusing, as it actually refers to:

  • Catalogue Record: the textual metadata properties used on the Dataset linked using foaf:primaryTopic
  • Distribution: the textual content of the downloadable file linked using dcat:downloadURL or findable using dcat:accessURL, or in the output of the data service linked using dcat:accessService
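
To illustrate the Distribution case, here is a minimal Python sketch of how a consumer might rely on the declared character encoding when reading the downloadable file. The dictionary layout, the `character_encoding` key, the example URL and the `decode_distribution_content` helper are all assumptions made up for this illustration; they are not part of GeoDCAT-AP or of any library.

```python
# Hypothetical, simplified stand-in for a Distribution description.
# The keys and the URL are illustrative assumptions, not GeoDCAT-AP terms.
distribution = {
    "download_url": "https://example.org/data/stations.csv",
    "character_encoding": "ISO-8859-1",  # describes the file content
}

# Bytes as they might be returned from the download URL.
# They are inlined here so that the sketch runs offline.
file_bytes = "name;city\nGare de Liège;Liège\n".encode("iso-8859-1")


def decode_distribution_content(raw: bytes, dist: dict) -> str:
    """Decode file bytes using the declared encoding (UTF-8 if none is declared)."""
    encoding = dist.get("character_encoding", "UTF-8")
    return raw.decode(encoding)


text = decode_distribution_content(file_bytes, distribution)
```

In this sketch the declared encoding says nothing about the metadata describing the Distribution; it only determines how the bytes of the file are turned into text.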

Proposal
Follow a pattern similar to the definition of CatalogueRecord.language in DCAT-AP, which says:

A language used in the textual metadata describing titles, descriptions, etc. of the Catalogued Resource.

and change the definition to:

  • Catalogue Record:

A character encoding used in the textual metadata describing titles, descriptions, etc. of the Catalogued Resource.

  • Distribution:

A character encoding used in the downloadable file or output of the data service represented by the Distribution.

This may also have an impact on the Distribution.language definition in DCAT-AP, as it currently says simply:

A language used in the Distribution.

@jakubklimek jakubklimek added alignment:dcat-ap-3.0 DCAT-AP 3 alignment related release:3.0.0 Actively being worked on for GeoDCAT-AP 3.0.0 type:improvement Improvement of current handling of a problem webinar:2024-03-12 To be discussed in the 2024-03-12 webinar next-webinar To be discussed in the next webinar labels Mar 3, 2024
@jakubklimek jakubklimek added status:resolution-provided Resolution statement present, not yet applied in draft and removed next-webinar To be discussed in the next webinar labels Mar 21, 2024
@jakubklimek

Based on the discussion in the webinar, the Distribution usage note will be changed to:

A character encoding used in the downloadable file or output of the data service represented by the Distribution.

A counterproposal was formulated for CatalogueRecord: drop character encoding from CatalogueRecord because, in RDF, UTF-8 is always used for all data (which is not the case for XML). It was approved.
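
The contrast with XML can be sketched in Python using only the standard library. An XML document may declare its own encoding in the prolog, so a record stored as XML can legitimately be in something other than UTF-8, whereas Turtle (one common RDF serialization) is defined as UTF-8. The sample documents and the `example.org` IRI below are invented for illustration.

```python
import xml.etree.ElementTree as ET

# An XML metadata fragment that declares a non-UTF-8 encoding in its prolog.
xml_bytes = (
    '<?xml version="1.0" encoding="ISO-8859-1"?><title>Café</title>'
).encode("iso-8859-1")

# The parser reads the declaration and decodes the content accordingly.
title = ET.fromstring(xml_bytes).text

# A Turtle fragment carries no encoding declaration of its own;
# the serialization is defined as UTF-8, so it is decoded as such.
turtle_bytes = (
    '<https://example.org/d> <http://purl.org/dc/terms/title> "Café" .'
).encode("utf-8")
turtle_text = turtle_bytes.decode("utf-8")
```

Under these assumptions, a per-record encoding property is informative for XML-based records but adds nothing for a record that is itself expressed in such an RDF serialization.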

@jakubklimek jakubklimek added status:fixed Resolution applied in draft and removed status:resolution-provided Resolution statement present, not yet applied in draft labels Apr 22, 2024