Dataset citation [RDSC] #61
Comments
That information is already in the metadata. Is the requirement that there is a separate property that contains the text of a citation? A problem might be that there are several citation styles. Would it not be more sensible to derive the citation from the information already there in the metadata?
+1 from me. However, the problem is that DCAT does not include all the metadata elements required for creating a citation (in particular, dataset authors). An additional issue is that, since DCAT is not meant to specify mandatory, recommended and optional metadata elements, the relevant information may be missing. To address this, an option could be to include a guidance section saying that, if you want to support data citation, you need to include dataset authors, etc. As for which metadata elements should be recommended: as you say, there are different citation styles, possibly requiring different information. Here my suggestion is to follow DataCite, and recommend (at least) the mandatory elements they define (which correspond to the minimal set of information needed to create a citation). An overview of these elements is available in the background document of the work we did to map DataCite to DCAT-AP (go to section "DataCite and DCAT-AP at a glance"): https://ec-jrc.github.io/datacite-to-dcat-ap/#comparison The DataCite to DCAT-AP mappings are here: https://ec-jrc.github.io/datacite-to-dcat-ap/#mapping-summary
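To make the idea of deriving a citation from existing metadata concrete, here is a minimal sketch in Python. It assumes a plain dictionary holding the five elements named in the requirement (authors, publication year, title, publisher, persistent identifier). The key names, the sample record, and the output pattern "Authors (Year): Title. Publisher. Identifier" are illustrative assumptions, not something defined by DCAT or prescribed by this thread.

```python
# Hypothetical key names for the five elements listed in the requirement.
REQUIRED_KEYS = ("creators", "publication_year", "title", "publisher", "identifier")


def format_citation(record):
    """Build one citation string from a metadata record (a dict).

    Raises ValueError if any element needed for the citation is absent or empty.
    """
    missing = [key for key in REQUIRED_KEYS if not record.get(key)]
    if missing:
        raise ValueError("cannot build citation, missing: " + ", ".join(missing))
    authors = "; ".join(record["creators"])
    return (
        f"{authors} ({record['publication_year']}): {record['title']}. "
        f"{record['publisher']}. {record['identifier']}"
    )


# Made-up sample record, for illustration only.
sample = {
    "creators": ["Doe, J.", "Roe, R."],
    "publication_year": 2018,
    "title": "Example Dataset",
    "publisher": "Example Publisher",
    "identifier": "https://doi.org/10.1234/example",
}
print(format_citation(sample))
```

Other citation styles would only need a different formatting function over the same record, which is the argument for storing the elements rather than a pre-formatted citation string.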
A Use Case based on this for the automated generation of a citation suggestion using provenance elements is recorded at http://patterns.promsns.org/usecase/46.
I suggest removing the "quality" tag, as the requirement itself is not directly related to data quality.
I'd like to revitalize this issue in light of the work since the last post here on profiles. It may be possible to make a profile (a Profile Dec Ont Profile) of DCAT2018 that adheres to DataCite. Then, if the Profile includes an Implementation Resource Descriptor that, in turn, includes machine-readable constraints, it will be possible to validate whether a particular instance of DCAT2018 adheres to the profile and thus if it's able to be used to generate DataCite-specified citations. Specifically:
@nicholascar While I see all of this as being possible, it may be beyond our stated requirements. In other words, we don't disallow the creation of such profiles, but nothing in our profile requirements would lead us to present this as anything more than one of many possibilities. I don't think we should be looking at anything more than the bare bones of the requirement, which is to include information in DCAT that could support citing the dataset.
@kcoyle you’ve interpreted the suggestion the wrong way around: this isn’t a proposal for out-of-scope extra profile work but a proposal to satisfy the RDSC requirement. I think we can satisfy it using profiling, so if this group wishes to address the requirement, it could do so via profiling. The requirement has already been ruled in scope, so now it’s a matter of deciding how to implement it.
I agree with @nicholascar that implementing this requirement would mean providing a DCAT profile for data citation (thus indicating which metadata are required to enable data citation), in addition to making sure that DCAT does indeed have all the required properties, as @andrea-perego pointed out. As this data citation profile would consider DataCite, this issue is related to #152
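As a rough illustration of what "indicating the required metadata" could mean in practice, the following Python sketch checks a dataset description against a list of required properties and reports what is missing. The property names (`dct:creator`, `dct:title`, `dct:issued`, `dct:publisher`, `dct:identifier`) are an assumed choice for carrying the citation elements, and the dict-based record is a stand-in for a real RDF graph. An actual profile would more likely express such constraints in a machine-readable constraint language than in ad hoc code.

```python
# Assumed set of properties a hypothetical data citation profile might require.
CITATION_PROFILE = (
    "dct:creator",
    "dct:title",
    "dct:issued",
    "dct:publisher",
    "dct:identifier",
)


def check_against_profile(description, required=CITATION_PROFILE):
    """Return (conforms, missing) for a description given as a dict of property -> value."""
    missing = [prop for prop in required if not description.get(prop)]
    return (len(missing) == 0, missing)


# Made-up description lacking author information.
description = {
    "dct:title": "Example Dataset",
    "dct:issued": "2018-05-01",
    "dct:publisher": "Example Publisher",
    "dct:identifier": "https://doi.org/10.1234/example",
}
conforms, missing = check_against_profile(description)
print(conforms, missing)
```

The vocabulary itself stays permissive in this sketch; only the profile check decides whether a given description is complete enough to cite.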
@agbeltran Actually, offering a DCAT profile is out of scope, and is directly stated to be out of scope in the group charter. If DCAT has the needed properties, then nothing needs to be done. If it does not, then this is a DCAT requirement.
@kcoyle ok, thanks - yes, the requirement states "provide a way to specify information for data citation", so we should focus on making sure that all metadata can be represented (e.g. dataset authors); providing the profile would be the way to ensure that all information is available in a given dataset description. Good to know that it is out of scope, but it would be a neat example of the use of a profile.
@kcoyle, @agbeltran, @nicholascar, an option is to adopt a more general approach, and relate it to a use case of "data citation". For instance, we can say that, if you want to publish records enabling data citation, you should include this information. We can also mention DataCite, but, after all, the information needed for data citation was not invented by DataCite: it is the basic information that has traditionally been used for bibliographic references. Maybe this usage scenario can be added as one of the examples at the beginning of the DCAT spec.
As an FYI, the Dublin Core approach to vocabularies (ontologies) vs. profiles is that a vocabulary should be defined with a minimum semantic commitment, and that vocabulary usage (cardinality, value ranges, etc.) is the role of the profile. The upshot of this (and I believe this is compatible with RDF) is that for most uses a vocabulary alone will not be sufficient. So a vocabulary becomes a kind of building block, but the profile turns it into a usable structure. You can then design your vocabulary for maximum reuse.
See some notes at: https://github.com/w3c/dxwg/wiki/Data-Citation
One way to look at this is that the vocabulary addresses interoperability of elements within a resource, but profiles address usability and interoperability of the resource itself, with all its contained elements. So DCAT supports this, but does not mandate it. It therefore seems an important example of how DCAT works with other vocabularies, and we can introduce a hypothetical profile as a recommended way to use DCAT when such choices need to be made consistently within a community of practice.
Initial notes are in PR #321. It can be viewed here: https://rawgit.com/w3c/dxwg/agb-data-citation/dcat/index.html
This PR has been merged, so the rawgit link no longer works. The additions are now visible in the current version of the document:
As agreed in today's call, this issue can be closed.
Dataset citation [RDSC]
Provide a way to specify information required for data citation (e.g., dataset authors, title, publication year, publisher, persistent identifier)
Related use cases: Requirements for data citation [ID10]