DOIs for CF Convention releases? #127

Closed
rsignell-usgs opened this issue Jan 18, 2018 · 101 comments · Fixed by #443
Labels
enhancement Proposals to add new capabilities, improve existing ones in the conventions, improve style or format

Comments

@rsignell-usgs
Member

rsignell-usgs commented Jan 18, 2018

Seems like getting a new DOI for each release of CF would be a good idea.

And getting a DOI is pretty easy for GitHub releases:
https://guides.github.com/activities/citable-code/

What do folks think?


In March 2023 the CF governance panel decided to use Zenodo for CF DOIs, as reported by Ethan @ethanrd. After the annual meeting in September 2023, Gui @castelao prepared pull request 443 to support CF's adoption of GitHub/Zenodo integration.

@davidhassell
Contributor

An excellent idea, I think.

@cf-metadata-list

cf-metadata-list commented Jan 18, 2018 via email

@ethanrd
Member

ethanrd commented Jan 18, 2018

Another option is to have a single DOI and recommend that users include the version number when citing CF.

What URL should result when dereferencing a CF DOI? I would think either the main CF web page or the current CF specification document.

@neumannd
Contributor

It sounds like a good idea to assign DOIs to the CF convention documents. The content to which a DOI points has to be invariable. Therefore, a DOI can only be assigned to a particular version of the CF convention document and not to the CF conventions in general.

@davidhassell
Contributor

I know that on some DOI services (e.g. https://zenodo.org/) you can have a unique DOI for each release, but also a generic DOI that always resolves to the latest version. I don't know if this feature is ubiquitous, though.

For instance, https://doi.org/10.5281/zenodo.832255 resolves to the latest version of cf-python, whatever it may be. Right now it's v2.1, and v2.1 has its own DOI https://zenodo.org/record/1039367

@ethanrd
Member

ethanrd commented Jan 18, 2018

The DOI itself is permanent, the URL that results from dereferencing the DOI can be changed. The object/concept the DOI identifies should be permanent. What that object/concept actually represents and the possible versioning of that object, I believe, is up to those stewarding that object.

DataCite [1] is the DOI minting service I've used. Their metadata schema [2] includes a field for version information. There are some notes on versioning on page 28 of the "DataCite Metadata Schema Documentation for the Publication and Citation of Research Data" [3] including:

Suggested practice: track major_version.minor_version.

Register a new identifier for a major version change. Individual stewards need to determine which are major vs. minor versions.

Not sure what other DOI minting services recommend or how this might work if using the GitHub DOI minting tie-in with FigShare.

[1] https://www.datacite.org

[2] http://doi.org/10.5438/0014

[3] https://schema.datacite.org/meta/kernel-4.1/doc/DataCite-MetadataKernel_v4.1.pdf

@ethanrd
Member

ethanrd commented Jan 18, 2018

Yes, I kind of like the idea of having a top-level DOI and one for each version. Though, more DOIs means more things to maintain and more DOIs to include when tracking citations.

With a top-level DOI and individual version DOIs, what would be the recommended citation? Including the version information in the citation is more transparent (at least to the human eye).

The DataCite metadata schema includes a relationType property that can be used to give a relationship with another DOI-ed resource. It has a controlled list of values that includes HasVersion and IsVersionOf. So, perhaps defining these relationships will help ameliorate some of the issues around multiple DOIs.
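
To make the relationType idea concrete, here is a minimal sketch of how a top-level record and its version records could point at each other. The property names follow my reading of the DataCite schema; the DOIs and the helper function names are hypothetical placeholders, not registered CF identifiers or any real API.

```python
# Hypothetical DOIs used only for illustration.
CONCEPT_DOI = "10.1234/cf.concept"
VERSION_DOIS = {"1.10": "10.1234/cf.v1-10", "1.11": "10.1234/cf.v1-11"}


def related_identifiers_for_concept(version_dois):
    """Entries for the top-level record: one HasVersion link per release."""
    return [
        {
            "relatedIdentifier": doi,
            "relatedIdentifierType": "DOI",
            "relationType": "HasVersion",
        }
        for _, doi in sorted(version_dois.items())
    ]


def related_identifiers_for_version(concept_doi):
    """Entry for a single release record: an IsVersionOf link back to the top level."""
    return [
        {
            "relatedIdentifier": concept_doi,
            "relatedIdentifierType": "DOI",
            "relationType": "IsVersionOf",
        }
    ]


concept_links = related_identifiers_for_concept(VERSION_DOIS)
version_links = related_identifiers_for_version(CONCEPT_DOI)
```

With both directions recorded, anyone tracking citations could in principle walk from a version DOI to the top-level one and back, which is the maintenance concern raised above.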

@neumannd
Contributor

OK, thanks for the clarification. I wasn't aware of that possibility.

@rsignell-usgs
Member Author

@davidhassell would you be willing to make this happen?

@graybeal

OK, looks like I'll be the odd one out here. Let me ask a few questions:

  • What will the DOI(s) be used for that the canonical URLs can not?
  • What capability do the DOIs have that the canonical URLs do not?
  • How will you resolve the duality of two canonical references, one being the DOI and the other being the canonical URL?
  • How will the DOIs representing different versions be recognizably different versions of the same entity/publication?
  • How will the DOIs be recognizably associated with the CF conventions, without having to actually resolve them? (This, at least, there is a known answer to, just want to be sure we are leveraging it.)

I know the community likes DOIs, but I'm not convinced there is any analytical advantage to the function provided by the DOIs.

@dopplershift

I completely reject the idea that a URL on the internet is a suitable fixed point of reference. The "canonical URL" for the CF-conventions has changed over time, rendering unusable any publication citation that relied upon that.

DOIs provide a fixed record suitable for citation that is capable of being updated to point to new "landing pages" for the same content.
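
The indirection being described can be illustrated with a toy registry. This is only a sketch of the idea (a stable identifier whose target can be re-pointed); the class, identifier and URLs are all made up and say nothing about how a real DOI resolver is implemented.

```python
class IdentifierRegistry:
    """Toy stand-in for a resolver: identifiers stay fixed, targets can move."""

    def __init__(self):
        self._targets = {}

    def register(self, identifier, url):
        self._targets[identifier] = url

    def repoint(self, identifier, new_url):
        # Only an existing identifier can be re-pointed; the identifier itself never changes.
        if identifier not in self._targets:
            raise KeyError(identifier)
        self._targets[identifier] = new_url

    def resolve(self, identifier):
        return self._targets[identifier]


registry = IdentifierRegistry()
registry.register("10.1234/cf.example", "http://old-host.example/cf/")
# The landing page moves; citations that used the identifier keep working.
registry.repoint("10.1234/cf.example", "https://new-host.example/cf/")
```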

@dblodgett-usgs
Contributor

dblodgett-usgs commented Jan 19, 2018 via email

@dopplershift

Sure, everything digital needs upkeep--that's the blessing and the curse.

It's not my area of expertise, so I'm not really qualified to debate this with an informed point of view. Therefore, when it comes to best practice for long-term reference and archival, I'll trust what the experts (i.e. digital library people) tell me to do: DOIs.

@graybeal

The only reason canonical URIs have to change is that they have been chosen and managed without regard to their final purpose. (Something that DOIs are also vulnerable to, though I agree not as commonly.) Put me in the Cool URIs Don't Change camp.

@cf-metadata-list

cf-metadata-list commented Jan 19, 2018 via email

@davidhassell
Contributor

I am happy to make something happen!

The DOI server would, I think, keep a copy of the versioned document(s), thereby decoupling the need for a stable URL.

@graybeal

TLDR version: I will not object further nor complain if you go the DOI path (except occasionally with a wink and nudge to close colleagues). Thanks for listening to my input!

I just have a few followups, to fully explain my perspective.

I am not aware of DOI servers being used to archive content. In fact not sure how they would know what to archive, given they just point to another resource, which could have arbitrarily many links to its parts (if the document is maintained as a set of pages, for example). I'm interested to know more.

I accept the judgment of the library community that DOIs are perfect unique identifiers for bibliographic materials, that is their clear community choice. On the other hand, the expert librarians I talk to at Stanford are open to the possibility that DOIs are not the primary references for certain other kinds of digital content. The kind of content where I am most experienced is semantic content, where IRIs are the typical (but not universal) identifier of choice, because of the W3C semantic standards. So, in short, I think one identifier type does not fit all needs.

I accept that DOIs were designed to decouple content; they were poorly designed to resolve content, without knowing what to add to them to make them resolvable. That said, you can generally find a DOI with Google, and yes, DOIs are easy(ier) to re-point by design.

I also concede that the DOI infrastructure is well-enough funded (and consistently-enough-used for this kind of thing) that the DOI infrastructure will not cause as many long-term headaches as most IRIs will. So I will not be trying to argue further, but I do want to note:

  • updating the DOI requires authority to update the DOI
  • over time, that authority must be passed on to others in an organized way, ideally through organizational accounts and permissions
  • if you have not properly prepared your organization for managing the DOIs, you will not be able to update the DOIs without at least some pain and suffering (the more rigorously DOI servers care about transitioning ownership, the more pain and suffering you'll face—since you don't want people stealing your DOI maintenance role from you)
  • you remain at the mercy of the company managing the DOI, and the services they provide.

These realities seem to map one-to-one with the realities of creating IRIs to decouple the content from the particulars of how and where it's served (I recommend Tim Berners-Lee's Cool URIs document, it's a short read and a fun bit of history). Either way, to have a successful persistent identifier, you have to be thoughtful, you have to invest resources in managing the maintenance and succession processes, and you have to understand that this is an indirection service that is run by an organization, one which you may or may not have full control over for the (eternal!) life of the identifier. If you manage those issues, either technology is equally effective, with only minor differences in cost-per-identifier and user pain to resolve the identifier.

@neumannd
Contributor

neumannd commented Feb 9, 2018

I just realized that netCDF also has its own DOI as mentioned here:

https://www.unidata.ucar.edu/software/netcdf/docs/faq.html#How-should-I-cite-use-of-netCDF-software

It is written (if the URL does not work at some point in the future):

The registered Digital Object Identifier for all versions of netCDF software is http://doi.org/10.5065/D6H70CW6.

The following can be used as a citation:

Unidata, (year): Network Common Data Form (netCDF) version nc_version [software]. Boulder, CO: UCAR/Unidata. (http://doi.org/10.5065/D6H70CW6)

where year is the year in which the work being described was done and nc_version is the version of netCDF used. For example:

Unidata, (2015): Network Common Data Form (netCDF) version 4.3.3.1 [software]. Boulder, CO: UCAR/Unidata. (http://doi.org/10.5065/D6H70CW6) 
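
The netCDF approach (one DOI, with the version carried in the citation text) could be sketched like this. The template and DOI are copied from the FAQ text quoted above; the function name is just an illustrative choice.

```python
NETCDF_DOI_URL = "http://doi.org/10.5065/D6H70CW6"


def netcdf_citation(year, nc_version):
    """Fill the quoted citation template with the year of the work and the netCDF version used."""
    return (
        f"Unidata, ({year}): Network Common Data Form (netCDF) version "
        f"{nc_version} [software]. Boulder, CO: UCAR/Unidata. ({NETCDF_DOI_URL})"
    )


example = netcdf_citation(2015, "4.3.3.1")
```

A CF equivalent would presumably substitute the CF title, authors and DOI, none of which had been decided at this point in the thread.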

@dblodgett-usgs
Contributor

Was there a conclusion to this issue? Is someone going to move it forward?

@taylor13

Could we discuss this at the meeting in Reading in June?

@castelao
Member

I believe that there was an agreement in Reading to create a DOI for the CF convention documentation. Is that correct? If so, shall we discuss the details on how to do it?

We have a few options on how to implement it. One of them is using Zenodo as suggested by @rsignell-usgs , which would also archive the document itself as mentioned by @davidhassell , and would allow a general DOI grouping all releases as suggested by @ethanrd . I use Zenodo in other projects and it is minimal work to operate in a GitHub environment.

I'm checking an alternative through the UCSD library, which offers similar resources, and I just learned that they operate in partnership with NCAR. I'll post here once I have some news.

@rsignell-usgs
Member Author

@castelao , thanks for picking this issue up again!

@erget
Member

erget commented Jul 19, 2018

I was really impressed by Zenodo and think it would be a great idea - lots of benefits, low workload.

@castelao
Member

The UCSD library could provide that, but they suggested using Zenodo since it can be integrated with GitHub, which I can confirm is nearly zero maintenance. My contact in the library also mentioned that they trust Zenodo because of the solid institutions that support it.

I can do the repository setup to connect it with Zenodo automatically if there is a consensus to move this forward.

@ethanrd
Member

ethanrd commented Jul 25, 2018

As I recall, the decision at the Reading meeting was to mint a DOI for CF in general rather than for any particular version of any particular document. Is there a way using Zenodo with GitHub to mint a DOI that isn't associated with a particular document/artifact/release?

Or, perhaps the overarching DOI should be tied to the CF web page repo rather than the CF conventions document repo. (Seems an appropriate repo since we want the DOI to dereference to https://cfconventions.org.)

PS David and I have started on a meeting summary document. We'll share it out for comment and such once it isn't quite so rough.

@castelao
Member

Sorry for the delay, I'm back.

Thanks for the correction @ethanrd. Yes, I also recall an agreement for a single DOI. Although I would recommend using a master DOI with one child DOI for each release, it is possible to use a single DOI for the CF concept. Thus it would not be associated with a specific version. In that case, I would recommend that it point to the general https://cfconventions.org website, not the repository.

My question is, how to move forward? If nobody says anything against this in 3 weeks, shall I start implementing such a single DOI?

@davidhassell
Contributor

Creating a single DOI pointing to https://cfconventions.org would be great, I think, and what was decided at the Reading meeting. We didn't decide to not create further DOIs (e.g. for different conventions versions) simply because we couldn't decide in the limited time how best to proceed. These will come later ...

@rsignell-usgs
Member Author

Sounds like a thumbs up, @castelao !

@castelao
Member

Great! We need to put some information together to move this forward:

  • Are there funding agencies? Which ones?
  • Who are the creators? The list of authors in the main document?
  • There are other categories of contributors that are not creators/authors. Who are the contributors and which category? There are more options if the list below is not enough.
    • Editor
    • Data collector
    • Data curator
    • Project leader
    • Project manager
    • Project member
  • For everyone who goes in this DOI, it would be nice to have their ORCIDs, so this DOI is connected directly to each one.

The other fields should be straightforward, but I would print it all here for approval before submitting it.
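
As a rough sketch of how the information listed above might be assembled, the snippet below builds a metadata dictionary with creators and typed contributors and serialises it to JSON. The key names and the spelling of the contributor types are what I believe Zenodo's metadata file uses, which should be checked against its documentation; the people, affiliation and ORCID are placeholders, and `person` and `build_metadata` are hypothetical helpers.

```python
import json

# Roles from the list above mapped to the type spelling assumed for the metadata file.
ROLE_TO_TYPE = {
    "Editor": "Editor",
    "Data collector": "DataCollector",
    "Data curator": "DataCurator",
    "Project leader": "ProjectLeader",
    "Project manager": "ProjectManager",
    "Project member": "ProjectMember",
}


def person(name, affiliation=None, orcid=None, role=None):
    """Build one creator (no role) or contributor (with role) entry."""
    entry = {"name": name}
    if affiliation:
        entry["affiliation"] = affiliation
    if orcid:
        entry["orcid"] = orcid
    if role is not None:
        entry["type"] = ROLE_TO_TYPE[role]  # KeyError for a role outside the list
    return entry


def build_metadata(title, creators, contributors):
    return {"title": title, "creators": creators, "contributors": contributors}


metadata = build_metadata(
    "Example title",
    creators=[person("Author, Example", "Example Institute", "0000-0002-1825-0097")],
    contributors=[person("Helper, Example", role="Data curator")],
)
zenodo_json_text = json.dumps(metadata, indent=2)
```

Printing `zenodo_json_text` here for review before submitting would match the approval step proposed above.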

@ethanrd
Member

ethanrd commented Feb 1, 2024

I just created a PR (#507) which adds a CITATION.cff file. The .cff file is a possible alternative to creating a separate "How to Cite CF" page. Though it is pretty minimal and clunky so maybe we'll want to do both. For instance, I don't think there's a way to describe citing with the overarching CF DOI vs the CF version DOIs.
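
For readers who have not seen a .cff file, here is a minimal sketch that renders one as plain text. The keys shown (cff-version, message, title, version, doi, authors) are ones I understand to be part of the Citation File Format; the values, the DOI and the `render_citation_cff` helper are hypothetical and are not the content of PR #507. The sketch does no escaping of quotes inside values.

```python
def render_citation_cff(title, version, doi, authors):
    """Render a minimal CITATION.cff document as a YAML string."""
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this work, please cite it as below."',
        f'title: "{title}"',
        f'version: "{version}"',
        f'doi: "{doi}"',
        "authors:",
    ]
    for author in authors:
        lines.append(f'  - family-names: "{author["family"]}"')
        lines.append(f'    given-names: "{author["given"]}"')
    return "\n".join(lines) + "\n"


cff_text = render_citation_cff(
    title="Example Conventions Document",
    version="1.11",
    doi="10.1234/cf.example",  # placeholder, not a real CF DOI
    authors=[{"family": "Author", "given": "Example"}],
)
```

Note that this layout carries a single top-level `doi` value, which is consistent with the limitation mentioned above about distinguishing the overarching DOI from the version DOIs.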

@JonathanGregory
Contributor

Dear @ethanrd, Gui @castelao, et al.

Thanks for your PRs. #507 by Ethan to add the .cff file looks fine to me. Are you suggesting we add some text also in the CF convention document on "How to cite", Ethan? Shall we do that now, in this issue?

#443 by Gui has some comments outstanding.

  • Whether the licence should be CC0. As Ethan says, some reservations were mentioned a while ago. I had a quick look at the licences and alternatives. There didn't seem an obviously better alternative to me. CC0 is a deed by which we irrevocably allow anyone to use CF metadata for any purpose at all. That is, we put it in the public domain. The alternatives would require some acknowledgement, in a form appropriate to the circumstances, but I think it's really not practical for an acknowledgement to be made in most circumstances where the CF conventions or standard names are used. Does anyone have definite objections to CC0, alternatives to suggest, or corrections to my summary here?

  • Comment on the text for the "description" of CF in the .json file. The text Gui has used is the Abstract of the conventions document, verbatim. That's a logical choice for content, I think. Ethan and @davidhassell have made editorial suggestions, which I agree with, namely to omit "Descriptive information", "[NetCDF]", "[COARDS]" and "\n". The last is present because the Abstract is two paragraphs. With those changes, the .json file would have

"description": "This document describes the CF conventions for climate and forecast metadata designed to promote the processing and sharing of files created with the netCDF Application Programmer Interface. The conventions define metadata that provide a definitive description of what the data in each variable represents, and of the spatial and temporal properties of the data. This enables users of data from different sources to decide which quantities are comparable, and facilitates building applications with powerful extraction, regridding, and display capabilities. The CF conventions generalize and extend the COARDS conventions. The extensions include metadata that provides a precise definition of each variable via specification of a standard name, describes the vertical locations corresponding to dimensionless vertical coordinate values, and provides the spatial coordinates of non-rectilinear gridded data. Since climate and forecast data are often not simply representative of points in space/time, other extensions provide for the description of coordinate intervals, multidimensional cells and climatological time coordinates, and indicate how a data value is representative of an interval or cell. This standard also relaxes the COARDS constraints on dimension order and specifies methods for reducing the size of datasets.",

David made suggestions to update this text, to cover UGRID. David, I think it would be better to keep the .json description and the Abstract the same. Would it be OK with you if we conclude this issue first, and then you start a new one to update them both?

This is the oldest open conventions issue, and I'm hopeful that we can conclude it very soon. We're nearly there!

Best wishes

Jonathan
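
If the .json description is to stay in step with the Abstract, the mechanical part of the editorial changes listed above (dropping the "[NetCDF]" and "[COARDS]" markers and the paragraph break) could be scripted. This is only a sketch: `clean_description` is a hypothetical helper, the sample abstract is made up, and the omission of "Descriptive information" is not handled because its exact context is not shown here.

```python
import json
import re


def clean_description(abstract):
    """Strip citation markers and paragraph breaks from an abstract."""
    text = abstract.replace("[NetCDF]", "").replace("[COARDS]", "")
    text = text.replace("\\n", " ")             # literal backslash-n sequences
    text = re.sub(r"\s+", " ", text)            # real newlines and doubled spaces
    text = re.sub(r"\s+([.,;:])", r"\1", text)  # space left behind before punctuation
    return text.strip()


# Made-up two-paragraph sample, not the real Abstract.
sample = (
    "The netCDF library [NetCDF] is used.\n"
    "The CF conventions generalize and extend the COARDS conventions [COARDS]."
)
cleaned = clean_description(sample)
json_fragment = json.dumps({"description": cleaned})
```

Using `json.dumps` also takes care of quoting the string, so the resulting fragment is valid JSON.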

@JonathanGregory
Contributor

In website issue 182 Ethan @ethanrd wrote

While there were some comments, there were no objections to the CF Governance Panel decision to license CF with CC0. So we will move forward with implementing CC0 for CF.

I'm repeating that here because the licence has also been mentioned in PR #443 linked to this issue. Since Ethan commented two weeks ago today, I think we can regard the choice of CC0 as having been agreed if no-one objects before next Friday 9th.

@ethanrd
Member

ethanrd commented Feb 2, 2024

Hi Jonathan @JonathanGregory - Sorry, we may have rushed things a bit (after a very long wait). The CC0 license has already been implemented in both the conventions repo and the website repo (PR #504 and website PR #440).

The decision was made back in 2022 (see this comment). As you quoted, there was discussion (long after the decision) but no objection. Two weeks ago I created the PRs. With the long delay and given the text is defined by the license/deed, we perhaps rushed the 3-week rule.

@ethanrd
Member

ethanrd commented Feb 2, 2024

Hi Jonathan @JonathanGregory,

In terms of "How to Cite", I was actually thinking about a page on the CF website. But I do like the idea of having some mention in the CF Conventions document. Yes, I think we could discuss that here and start up another PR to add citation information. Should we start another issue in the website repo to discuss a "How to Cite CF" web page or can we do that here as well?

@JonathanGregory
Contributor

Dear Ethan @ethanrd

The CC0 license has already been implemented in both the conventions repo and the website repo (PR #504 and website PR #440).

Yes, I don't think that's a problem. Sorry if it looked like I was trying to turn the clock back. The discussion of the licence was mostly in the website repo, but obviously affects the conventions repo as well, and had been mentioned in this issue previously. Because it's an important decision and was an outstanding question on PR #443, I thought it would do no harm to state it clearly in this issue, so that we can be perfectly clear we've followed the usual decision process. I don't expect that anyone will object. Is it OK to leave this issue open for one more week? In any case, we have other things not quite concluded here.

Best wishes

Jonathan

@JonathanGregory
Contributor

@ethanrd

In terms of "How to Cite", I was actually thinking about a page on the CF website. But I do like the idea of having some mention in the CF Conventions document. Yes, I think we could discuss that here and start up another PR to add citation information. Should we start another issue in the website repo to discuss a "How to Cite CF" web page or can we do that here as well?

It's a good idea to put it on the website. Maybe we could agree the words, and where to put them on the website, in a website issue, and then return to this issue to decide where to put the same words in the conventions document?

Also, I think it would be appropriate to state the licence in the conventions document as well.

Cheers, Jonathan

@JonathanGregory
Contributor

Dear @ethanrd, Gui @castelao and others

Is the DOI for the conventions document specifically, or for CF as a whole, including the standard names? At the moment, #443 lists the authors of the convention document as the authors. That is consistent with what Ethan did for "How to cite", and makes sense if the DOI is for the conventions document.

There is also a list of contributors. I think they are the contributors to the CF convention, excluding the authors, but I'm not sure. Has this been discussed before? I'm sorry, I don't remember. It feels a bit unfair to me to list all the contributors to the conventions but not the contributors to the standard names or to information management. But if we include all these lists, that will be a lot of people, and will require continual maintenance to keep them up to date. Instead of listing contributors by name, is there a way to refer to the CF website for those lists?

Best wishes

Jonathan

@larsbarring
Contributor

Hi all,
Sorry for being somewhat slow in coming into the conversation, but here are some thoughts and ideas:

1
In comment Ethan writes about the cff file

I don't think there's a way to describe citing with the overarching CF DOI vs the CF version DOIs

To my understanding (I might very well be wrong here!) the .cff file is useful as a file in the repo as such, but does not add much to a website or an html document. So I guess that we could have one cff file for the cf-conventions repo and another for the cf-convention.github.io repo. The reason I am suggesting this is that a general citation of the "CF Convention" or "Climate and Forecasting Convention" as a community and community activity might usefully point to the website https://cfconventions.org/ where the "authors" should, I imagine, be the "CF Community". A citation of the "CF Conventions document" should include the author list.

2
I find it somewhat inconsistent to have a cff file citing the original authors by name and then add the other authors as the "CF Community", whereas the document as such lists all authors by name. As far as I know, the "house rules" vary quite a lot among publishers regarding the number of authors. Some allow three author names, others only one before the "et al." kicks in, and others again allow quite long author lists. If someone citing CF is using reference management software, such details are often handled automatically when formatting the reference list. As it is now, the cff file risks interfering with this. Hence I suggest that PR #507 is updated to include all authors.

3
Regarding where to put the licence information on the web page, I see two alternative places that are "persistent" across all(?) pages of the website: Either the black banner/menu at the top, or the footer that currently shows:

Contact the CF community with questions, comments and suggestions about CF metadata or this website

This site is open source. Improve this page!

I don't know how these headers/footers are set up, but I imagine that the Info Management team could fix this.

4
In the pdf version of the Conventions document it should be quite easy to add a reference to the CC0 Licence in the footer. Something like "This document is put in the public domain under the CC0 Licence". Someone more into licensing details should improve on the exact wording. Moreover, in Asciidoc there is a section type called "Colophon" that I believe could be used to include more details, such as the full or abridged version of the licence text. But I have not worked with this before, so it might take some experimentation.

5
When it comes to the html version of the Conventions document it is not so clear where to put the licence information. Something very brief could probably be crammed into the "versions line" that now reads:

AuthorX · AuthorY · AuthorZ ‒ version 1.11, 05 December, 2023
See https://cfconventions.org/ for further information.

But there is nothing like a header or footer, because the html version is just one long page.

@castelao
Member

@JonathanGregory, my understanding is that #443 was for the DOI of the CF-Conventions only. And an equivalent would be done for standard names.

I think it is important to have the names, with affiliations and preferably with ORCIDs, of contributors in the .zenodo.json, thus on the DOI record. The citation text is less important since, with the DOI record, everyone is appropriately linked. Indeed, there is some work to keep this list updated, but I think it is important to give such credit if we expect the community to dedicate time to contributing to CF.

My understanding is that 'CITATION.cff' is meant for machine-to-machine communication.

For the first time, we will have to link the repo with Zenodo, and after that, the following releases will trigger a new DOI automatically. During this very first time, it will create an overarching DOI (in this case, overarching among the CF-Conventions versions, not overarching for the whole CF) and another one specific for the release. The following ones will trigger just a new release DOI. I think the 'how to cite' text is important, and I usually instruct people to cite the release DOI only when referring to a specific version, otherwise, the default would be to cite the overarching one.
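
The citing rule described above (cite the release DOI only when referring to a specific version, otherwise the overarching one) is simple enough to sketch. The DOIs below are hypothetical placeholders and `doi_to_cite` is an illustrative name, not part of any tool.

```python
def doi_to_cite(concept_doi, version_dois, version=None):
    """Return the overarching DOI unless a specific release is being referred to."""
    if version is None:
        return concept_doi
    try:
        return version_dois[version]
    except KeyError:
        raise ValueError(f"no DOI recorded for version {version!r}") from None


# Placeholder identifiers for illustration only.
CONCEPT = "10.1234/cf-conventions.all"
RELEASES = {"1.11": "10.1234/cf-conventions.v1-11"}

general = doi_to_cite(CONCEPT, RELEASES)
specific = doi_to_cite(CONCEPT, RELEASES, version="1.11")
```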

@JonathanGregory
Contributor

Dear all

I agree with Gui @castelao that public credit should be given to the contributors to the conventions for the effort they have dedicated. That is indeed the purpose of the list on the website, which so far I have compiled and maintained. Does everyone agree that we need to list them all individually in .zenodo.json as well, rather than giving a URL to the web page?

If this is necessary, then I think the two lists should be identical, because that will simplify the maintenance. At present, I think zenodo.json lists a subset of the contributors.

As Gui says, we could assign a DOI to the standard name table. I think the contributors to standard names would be the authors of that document, because there are no other authors. I am concerned that the contributions to information management should be recognised as well. Many of their contributions aren't specific to conventions or standard names. They should therefore be identified as contributors to both, I suppose, which I think is another argument for giving URLs to the lists in .zenodo.json, rather than listing the names.

I agree with @larsbarring that the cff file for the CF conventions and any text about "How to cite" it should list all the authors, not just the original authors.

Best wishes

Jonathan

@ethanrd
Member

ethanrd commented Feb 15, 2024

I agree that all authors should be listed in both the zenodo.json and CITATION.cff files. I'm not sure about including the contributors.

Does anyone know how contributor information in a zenodo.json file would be used/usable or visible? Does it show up on the Zenodo DOI landing page? I suspect at some point it will turn out to be useful to have the author/contributor data in a machine readable format. Maybe the zenodo.json file is a good start in that direction. However, until we know there are concrete benefits, I'm not sure we should take on the extra work involved in maintaining multiple copies. So, for now, I like the idea of including a link/reference to the web site contributor lists in the zenodo.json file. But I don't feel strongly about it.

As far as I can see, the CITATION.cff file is only used to automatically add a citation link/dropdown to the right-hand navigation on the GH repo main page (see my test repo for an example). So, I think the .cff file should be kept pretty minimal, i.e., not adding any information that doesn't show up in the drop-down.

@JonathanGregory - I agree with your earlier comments on discussing the content of a "How to Cite CF" website page in a website repo issue. I will try to start an issue for that in the next few days. Unless we need a

@JonathanGregory
Contributor

Just to be perfectly clear, no-one objected regarding this:

In website issue 182 Ethan @ethanrd wrote

While there were some comments, there were no objections to the CF Governance Panel decision to license CF with CC0. So we will move forward with implementing CC0 for CF.

I'm repeating that here because the licence has also been mentioned in PR #443 linked to this issue. Since Ethan commented two weeks ago today, I think we can regard the choice of CC0 as having been agreed if no-one objects before next Friday 9th.

As @ethanrd said, CC0 has been implemented by stating it in LICENSE.md in the conventions and website repos.

@castelao
Member

castelao commented Feb 15, 2024

@JonathanGregory , I copied the list of contributors (website) to create the Zenodo contributors list. If anyone is missing, it was my mistake. Do you know who is missing? Note that I intentionally removed all authors from the contributors list, since the authors list is a higher 'rank', and my understanding is that such redundancy isn't required. There is no restriction on adding them as well if you prefer.

I see the authors' list as an equivalent of the authors of a peer reviewed paper, and contributors would be the equivalent of everyone on the acknowledgments of that paper. There is a value in finding and fixing a typo and it should be recognized, but also should be clear that it is different than the time committed from the authors.

@ethanrd , the value of having everyone explicitly listed with their respective ORCIDs is that the DOI links everyone. Those are fields in the DOI database. This is very different from having text on a website or in the document itself. The DOI database links objects with authors and contributors, and holds much more metadata for machine-to-machine communication. If you go to the ORCID of those authors and contributors, CF would be listed. Only the 'rank' authors would show up in the citation text. If you look at my ORCID, there is a mix of peer-reviewed papers, software, and data, each with a different 'rank', as author or contributor.

It is important to include the DOI that will be generated in the CITATION.cff. In theory, it is used beyond the dropdown menu on the side that suggests the citation text.

@ethanrd
Member

ethanrd commented Feb 15, 2024

Hi Gui @castelao - Do you mean that the DOI metadata gets automatically harvested (by Zenodo or DataCite, I guess) and pushed to ORCID metadata? That definitely makes it worth listing everybody.

Yes, I agree, the actual CF Conventions DOI (the top-level one) needs to be included in the CITATION.cff. Yes, the metadata in the CITATION.cff file could be used beyond the GH dropdown menu. I'm just not sure it is currently.

Once we agree on and merge the zenodo.json file and then connect the repo to Zenodo, can we force the minting of DOIs before the release of CF v1.12? I think a GH release is required. Perhaps we could do a "blank" release to mint the initial DOIs, delete the release, and then edit the DOI metadata in Zenodo to point the version DOI to v1.11.

@castelao
Member

castelao commented Feb 15, 2024 via email

@JonathanGregory
Contributor

Dear Gui @castelao and @ethanrd

I agree there is an advantage to listing all the contributors in the zenodo.json file with their ORCIDs, if that has the effect of linking the CF DOIs automatically to their ORCIDs. Thanks for clarifying.

Yes, I think the authors of the CF document should also be listed as contributors, as they are in the contributors page. This reflects their contribution to discussions on agreeing conventions, not their authorship of parts of the text. Since this means we will have two lists of contributors, we will have to keep them consistent, and we could think about how to automate that once zenodo.json is in place. Maybe conventions_contributors.md could be generated from zenodo.json whenever the latter is updated?

Cheers

Jonathan
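Jonathan's idea of generating conventions_contributors.md from zenodo.json could be sketched roughly as below. This is only an illustration under assumptions: it assumes the file follows Zenodo's deposit metadata layout with `creators` and `contributors` lists whose entries carry `name` and an optional `orcid`, and the function name `contributors_markdown`, the Markdown layout, and the example people are all hypothetical (the ORCID shown is ORCID's own documentation example, not a CF contributor's).

```python
import json


def contributors_markdown(zenodo_json_text: str) -> str:
    """Render a Markdown bullet list of everyone named in Zenodo-style metadata."""
    metadata = json.loads(zenodo_json_text)
    # Authors ("creators") are included as contributors too, as proposed above.
    people = metadata.get("creators", []) + metadata.get("contributors", [])
    seen = set()
    lines = []
    for person in people:
        name = person["name"]
        if name in seen:  # skip anyone listed both as author and contributor
            continue
        seen.add(name)
        orcid = person.get("orcid")
        if orcid:
            lines.append(f"- {name} ([ORCID](https://orcid.org/{orcid}))")
        else:
            lines.append(f"- {name}")
    return "\n".join(lines) + "\n"


# Hypothetical input; the real file would be read from the repository.
example = """{
  "creators": [{"name": "Author, Example", "orcid": "0000-0002-1825-0097"}],
  "contributors": [
    {"name": "Helper, Example", "type": "Other"},
    {"name": "Author, Example", "type": "Other", "orcid": "0000-0002-1825-0097"}
  ]
}"""
print(contributors_markdown(example))
```

Treating zenodo.json as the single source and regenerating the Markdown page from it would avoid having to edit two lists by hand, at the cost of keeping any free-text notes out of the generated page.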

@castelao
Member

castelao commented Feb 16, 2024 via email

@larsbarring
Contributor

I am planning to open a PR to add the CC0 license information to the conventions document. But we should also add the DOI for the particular CF version in the document. Does anyone know how we obtain a separate DOI for each version, and how/if we can get it ahead of actually submitting it to zenodo (else we have a chicken and egg problem)?

@JonathanGregory
Contributor

From @castelao #443 (comment):

Hi everyone. Is this (#443) ready to move forward? It is still missing a few ORCIDs, but we can always add them later.
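For tracking the ORCIDs that are still missing, a small check along the following lines might help. It is a sketch, not part of the PR: it assumes zenodo.json uses `creators` and `contributors` lists with `name` and optional `orcid` fields, and the function names and sample entries are hypothetical. The check digit test uses the ISO 7064 MOD 11-2 scheme that ORCID documents for its identifiers.

```python
import json


def orcid_checksum_ok(orcid: str) -> bool:
    """Return True if the final character matches the ISO 7064 MOD 11-2 check digit."""
    digits = orcid.replace("-", "")
    if len(digits) != 16 or not digits[:15].isdigit():
        return False
    total = 0
    for ch in digits[:15]:
        total = (total + int(ch)) * 2
    check = (12 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return digits[15] == expected


def orcid_report(zenodo_json_text: str):
    """Return (names without an ORCID, names whose ORCID fails the checksum)."""
    metadata = json.loads(zenodo_json_text)
    missing, invalid = [], []
    for person in metadata.get("creators", []) + metadata.get("contributors", []):
        orcid = person.get("orcid")
        if not orcid:
            missing.append(person["name"])
        elif not orcid_checksum_ok(orcid):
            invalid.append(person["name"])
    return missing, invalid


# Hypothetical entries for illustration only.
sample = """{
  "creators": [{"name": "Author, Example", "orcid": "0000-0002-1825-0097"}],
  "contributors": [
    {"name": "Helper, Example", "type": "Other"},
    {"name": "Typo, Example", "type": "Other", "orcid": "0000-0002-1825-0098"}
  ]
}"""
missing, invalid = orcid_report(sample)
print("missing:", missing)
print("invalid:", invalid)
```

The checksum only catches mistyped identifiers; it cannot confirm that an ORCID belongs to the person named.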

@JonathanGregory
Contributor

Yes, I think it's ready. It would be great to close this ancient issue! Shall I merge #443, @castelao?

@larsbarring
Contributor

As the associated PR #443 includes ORCIDs for many authors, can cf-convention/discuss#178 also be closed at the same time as this issue?

@JonathanGregory
Contributor

Good idea, @larsbarring. I have linked cf-convention/discuss#178 to PR #443 as well, so it should be closed automagically upon merging. Thanks.

@larsbarring
Contributor

larsbarring commented Apr 29, 2024

Another thought: do we want to close this issue when PR #443 is merged?
IF SO, then we should probably open a new issue to keep the outstanding loose ends together. I am thinking of

  • The discussions to have (if I have not missed something)
    • one overarching DOI for the CF Conventions that points to the website
    • another one always pointing to the current Conventions document
    • yet other ones pointing to the specific version of the Conventions documents
    • the same for the Standard Name Table (one for current, and separate ones for each version)
  • Implement the CFF file for the website (see draft PR #507)
  • How to keep all the different author lists in sync. (see this comment)
  • How to actually set this up in relation to Zenodo

... probably there are more aspects that I have missed

OR, do we want to keep this issue open and keep the "coordination" of these tasks here?

@JonathanGregory
Contributor

Dear Lars

Thanks for reminding us of all these loose ends. Since there are discussions to be had about what we want, such as you mention, I suggest it would be good to let this issue be closed and start again in a new Discussion (rather than an issue) to consider what else needs to be done. The CFF file could have its own issue to accompany the PR that @ethanrd did. That isn't the same thing as adding the DOI. It's related only because the citation lists the DOI.

Best wishes

Jonathan

@larsbarring
Contributor

In another issue I was reminded that Heinke Höck has already submitted her ORCID. @castelao, would you mind updating the PR to include it?

Many thanks,
Lars

@JonathanGregory
Contributor

Referring to Lars's comment above:

  • one overarching DOI for the CF Conventions that points to the website
  • another one always pointing to the current Conventions document
  • yet other ones pointing to the specific version of the Conventions documents

in issue 513, Ethan comments that the first can't be done and the other two will be done automatically by Zenodo once the PR for this issue is merged.

  • the same for the Standard Name Table (one for current, and separate ones for each version)

I have just added this to the related discussion 296 about old versions of the standard name table.

  • Implement the CFF file for the website (see draft PR #507)
  • How to keep all the different author lists in sync. (see this comment)

are now also raised in issue 513.

Two weeks ago I asked if it was OK now to merge @castelao's PR, which will implement DOIs for the conventions by Zenodo. @castelao asked the same question. No-one has objected. If no-one objects today, I will merge the PR tomorrow and thus close this issue (the oldest one which is outstanding) - unless someone else does it before me!
