Record contributor roles/contribution types #112

Open
sdruskat opened this issue Sep 18, 2020 · 30 comments
Labels
discussion enhancement Enhancement ideas for the format/schema itself minor-version Issues that affect a minor version release

Comments

@sdruskat
Member

Just dropping this as an idea I just had: We could add role/roles to person/entity/the new umbrella object to provide a simple, works-now solution for recording contribution roles for people/institutions (this would open CFF up to recording funding as well for example).

This isn't ready to go I think, still need to check it against the other plans for CFF.

@jspaaks
Member

jspaaks commented Jan 18, 2021

Related: #84, #27, #66

@sdruskat sdruskat added this to the Record contributors milestone May 31, 2021
@sdruskat sdruskat added enhancement Enhancement ideas for the format/schema itself and removed format/schema labels Aug 5, 2021
@jcolomb

jcolomb commented Nov 11, 2021

Hello,
you may want to have a look at our work at https://github.com/jam-schema/jams

Our main focus has been to get all the author information available in JATS XML (the standard for research papers) into a YAML format. Apart from issues with authors that are groups of people, we have covered a lot of cases, including how to define roles.

role:
      vocab-term-identifier: free text explanation
      

Note that this also answers #66, as a free-text explanation can be added to a computer-readable category of contribution.

Our work is mostly meant to be used with the CRediT taxonomy, but for the CFF purpose one should probably look at other taxonomies (all-contributors or [CRO](https://bioportal.bioontology.org/ontologies/CRO/?p=classes&conceptid=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FCRO_0000060); see #27 for other propositions, as well as jam-schema/jams#18), as CFF is mostly meant for software citation.

@arfon: you probably would be the best person to see how the projects could interact?

@arfon

arfon commented Nov 24, 2021

> @arfon: you probably would be the best person to see how the projects could interact?

Yup, happy to try and assist with coordination here. @sdruskat – is adding contributor roles something you're still interested in?

@sdruskat
Member Author

Yes, still interested! I'll try and pick this up next week when I'm back from leave.

@sdruskat
Member Author

Related: https://upstream.force11.org/posts/deep-dive-into-ethics-of-contributor-roles

This will still take some time due to other responsibilities.

@jcolomb

jcolomb commented Jan 25, 2022

@1kastner

This sounds very interesting indeed!

@sdruskat
Member Author

Also related, an initial crosswalk for contributor roles by Ted Habermann: https://zenodo.org/record/4767798.

@kevinmatthes
Contributor

Related: all-contributors/all-contributors#664

@kevinmatthes
Contributor

I would like to suggest making the roles field an array of strings, to best support the possibilities of the All Contributors bot.
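
A minimal sketch of what such an array could look like, purely for illustration. It assumes a hypothetical `roles` key on a person entry (no released CFF schema defines it, so a file like this would not validate against CFF 1.2.0) and borrows its values from the All Contributors contribution-type keys:

```yaml
# Hypothetical sketch: `roles` is an assumed key, not part of CFF 1.2.0.
cff-version: 1.2.0
message: If you use this software, please cite it as below.
title: Example Software
authors:
  - given-names: Jane
    family-names: Doe
    roles:          # assumed key; values are All Contributors type keys
      - code
      - doc
      - review
```

Keeping the values as plain strings would let a bot append or remove entries without having to understand any nested structure.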

@jcolomb

jcolomb commented Dec 8, 2022

Great idea @kevinmatthes, also related to all-contributors/all-contributors#471.
Contribution roles would make most sense if they can be taken up by Zenodo. (Zenodo has roles for contributors, not for authors; the roles listed are, I think, a home-made list not related to CRediT or the All Contributors bot.)

It all comes down to building a standard way to describe authors and contributions, right? Maybe you want to join the JAMS mini-community: jam-schema/jams#7

@sdruskat sdruskat changed the title Create role as a simple way to record contribution roles? Create roles as a simple way to record contribution roles? Dec 8, 2022
@sdruskat
Member Author

sdruskat commented Dec 8, 2022

@kevinmatthes Thanks. Yes, I think having roles be plural and take an array of strings is a good start. Ideally I would like this to be an enum, but there is no agreement yet on what the values would have to be: the standard vocabulary for software contribution roles is still a work in progress, it seems.

@jcolomb I never found the time to engage with JAMS 😢, but is building this (sub-)vocabulary in the scope of JAMS?

@jcolomb

jcolomb commented Dec 8, 2022

No, "building this (sub-)vocabulary" is more a job for NISO/FORCE11.
But JAMS is looking for technical ways to use different vocabularies; people tend to think about ontologies when we present the problem(s).

@sdruskat
Member Author

sdruskat commented Dec 8, 2022

I see, thanks.

Also found the Contributor Attribution Model from the National Center for Data to Health: https://contributor-attribution-model.readthedocs.io/en/latest/introduction.html.

@jcolomb

jcolomb commented Dec 8, 2022

They use the CRO ontology in their example; I also think the people behind it are also behind CRO (Melissa Haendel and co.). (CRO is behind the extension of CRediT.) You may want to see https://doi.org/10.1002/leap.1496 for a discussion of how these vocabularies are being developed.

@apirogov

apirogov commented Mar 2, 2023

I'd also be happy to have some roles concept, to distinguish "main authors" from "acknowledged contributors"

I'd also like to have an optional free-text field to state the actual main contribution(s) of each mentioned person, because in the end any fixed "vocabulary" will not be expressive enough, and I think a "who did what" list deserves at least some semi-structured form of expression.
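
One way this combination could be sketched, with the caveat that both `roles` and `contribution-note` are hypothetical key names invented for this illustration and are not part of any CFF schema:

```yaml
# Hypothetical keys: `roles` and `contribution-note` are illustrative only.
authors:
  - given-names: Jane
    family-names: Doe
    roles:
      - main author
    contribution-note: Designed the architecture and wrote most of the core library.
  - given-names: John
    family-names: Smith
    roles:
      - acknowledged contributor
    contribution-note: Reported and helped debug several platform-specific issues.
```

The role string stays short enough for tools to group people by, while the note carries the "who did what" detail for human readers.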

@dmoracze

+1 on creating a way to denote contribution type.
And I'll echo the suggestion that takes advantage of the work already done by the CRediT team: https://credit.niso.org/
What about a solution that looks something like:

authors:
  - given-names: Author
    family-names: One
    affiliation: Institute
    orcid: 'https://orcid.org/0000-0000-0000-0000'
    contribution:
      - role: Conceptualization
        credit-id: '8b73531f-db56-4914-9502-4cc4d4d8ed73'
      - role: Formal analysis
        credit-id: '8b73531f-db56-4914-9502-4cc4d4d8ed73'
      - role: Writing - original draft
        credit-id: '43ebbd94-98b4-42f1-866b-c930cef228ca'
  - given-names: Author
    family-names: Two
    affiliation: Institute
    orcid: 'https://orcid.org/0000-0000-0000-0001'
    contribution:
      - role: Data curation
        credit-id: 'f93e0f44-f2a4-4ea1-824a-4e0853b05c9d'
  - given-names: Author
    family-names: Three
    affiliation: Institute
    orcid: 'https://orcid.org/0000-0000-0000-0002'
    contribution:
      - role: Funding acquisition
        credit-id: '34ff6d68-132f-4438-a1f4-fba61ccf364a'
      - role: Conceptualization
        credit-id: '8b73531f-db56-4914-9502-4cc4d4d8ed73'
      - role: Writing - review & editing
        credit-id: 'd3aead86-f2a2-47f7-bb99-79de6421164d'

@dmoracze dmoracze mentioned this issue Jun 27, 2023
@sdruskat
Member Author

Thanks, @dmoracze, for your comment. Contribution roles are planned to be part of the next minor release (1.3.0).

We're vetting different taxonomies for software contributions. CRediT has exactly one software role, so we likely won't use this taxonomy as is. In other work (v. early draft!), we've started looking at harmonizing taxonomies. We may use the outputs of this work, as it also seeks support from community initiatives.

@dmoracze

Sounds good, @sdruskat! I look forward to it.

@effigies

We're looking into using CITATION.cff to describe datasets, rather than software, and we register these datasets for DOIs using DataCite. To the extent that roles can correspond with DataCite's contributorType, we will be able to more faithfully map uploader intent onto DOI metadata.

I'm quite sure you've seen it, but figured I'd link it into this thread.

Another resource that might be useful for the dataset use case is the schema used by the DANDI archive, which includes EthicsApproval and Maintainer, neither of which shows up in CRediT or DataCite: https://github.com/dandi/schema/blob/1866f4d/releases/0.6.4/dandiset.json#L246-L278.
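
To make the DataCite correspondence concrete, here is a rough sketch assuming a hypothetical `roles` key whose values are taken directly from DataCite's contributorType list (`DataCurator`, `ContactPerson`, and `HostingInstitution` are real contributorType terms; the key itself is an assumption and not part of CFF):

```yaml
# Hypothetical: `roles` is an assumed key; values are DataCite contributorType terms.
authors:
  - given-names: Jane
    family-names: Doe
    roles:
      - DataCurator
      - ContactPerson
  - name: Example Institute
    roles:
      - HostingInstitution
```

With values drawn from the target vocabulary, a converter could copy each role into the DOI metadata unchanged instead of guessing at a mapping.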

@ericearl

ericearl commented Aug 7, 2023

I tried to apply something close to @dmoracze's solution here, and it occurred to me that there are so many competing standards for what roles or contributions should be called or should look like. Would it be reasonable to allow something more basic with no ontologies or exact options behind it, such as just a free-text unordered list? This way, you can simply enforce that it is a list. Something like roles or contributions like this:

  - given-names: Eric
    family-names: Earl
    affiliation: >-
      Data Science & Sharing Team, National Institute of
      Mental Health, Bethesda, MD, USA
    orcid: https://orcid.org/0000-0001-5512-0083
    contributions:
        - Data curation

Thank you all for your involvement and consideration here!

@ericearl

ericearl commented Aug 7, 2023

Oh! My colleague @dmoracze just pointed out: #338

Sorry! Ignore my last comment then.

@apirogov

apirogov commented Aug 8, 2023

All these developments look great. I also think free text plus a recommended list of canonical "contribution types" would be best, and the list should be quite comprehensive in covering the scope of a project (including non-technical and academic aspects). Also, not being familiar with the current candidate "standards", I at least hope they include something like contribution-start and contribution-end dates. Once we really start doing microattribution seriously, having a chronology in that info could make it clearer what is more recent/relevant and what is not, who is still active and who isn't, etc.
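
A sketch of how such a chronology might be recorded, assuming the `contribution-start` and `contribution-end` names from this comment as hypothetical keys next to an equally hypothetical `roles` key (none of these exist in a released CFF schema):

```yaml
# Hypothetical keys: `roles`, `contribution-start`, `contribution-end`.
authors:
  - given-names: Jane
    family-names: Doe
    roles:
      - maintenance
    contribution-start: "2019-03-01"
    contribution-end: "2021-11-30"
  - given-names: John
    family-names: Smith
    roles:
      - code
    contribution-start: "2021-06-15"   # no end date: read as still active
```

Quoting the dates keeps them as strings in every YAML parser, and a missing end date could be interpreted as an ongoing contribution.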

@jspaaks
Member

jspaaks commented Sep 19, 2023

The upcoming release 1.3.0 of the CFF schema adds a key contributors which can help solve the problem where repository owners

  1. gift authorship to contributors even when the contribution is insignificant, simply because there is no other mechanism to give thanks OR
  2. omit their contributors entirely, thus giving the impression that only they should be credited with the perceived benefits of the software.

However, the current issue as well as some others

  1. Align person roles with the OpenRIF Contribution Role Ontology #27
  2. Allow differentiation between authors/contributors #84

suggest that CFF needs more granularity in differentiating the specific roles that each contributor had. I'm wondering, who or what would benefit from the additional level of detail? How would these metadata be used?

@tobyhodges

> I'm wondering, who or what would benefit from the additional level of detail? How would these metadata be used?

I think your question is about exactly how we would benefit from the description of types of contribution, e.g. contributor:reviewed and contributor:funding acquisition rather than only contributor? I will give an example from our use case (adding CFFs to Carpentries lesson repositories), where we want to be able to identify the reviewers and current+past maintainers of a lesson as well as the authors. We will use this information to populate a) the related Zenodo record, b) JSON-LD metadata according to the TrainingMaterial specification provided by Bioschemas, and c) an automatically-generated "Lesson Credit" page that lists these different people/entities and the role(s) they have taken in bringing the lesson to its current state.

I will say, though, that the addition of a contributors field to the CFF spec will already be very helpful and I would be perfectly happy to wait for a future release that included more granularity in the role of contributors.

@jspaaks
Member

jspaaks commented Oct 5, 2023

Hi Toby! Indeed you have interpreted my question as intended. I'm trying to organize my thoughts around what would be a good list of terms to use for contributor roles in CFF (we are planning to have contributors be part of the next release, version 1.3.0, possibly with a subkey roles).

I'm leaning towards at least having the option of a controlled vocabulary, because without it, machine-readability is basically impossible. Additionally, there's value for users to being able to use a free text description of their contribution, because unavoidably there will be things that don't fit the chosen vocabulary. Long term, we could then analyze how people use the free text option to make informed decisions about where to go with the development of the controlled vocabulary for roles.
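
As a rough sketch of how a controlled vocabulary and a free-text fallback could coexist, the fragment below expresses a hypothetical JSON Schema definition in YAML. It is not taken from the CFF schema; the `roles` key, the `description` property, and the four enum values are placeholders chosen only for illustration:

```yaml
# Hypothetical schema fragment (JSON Schema written as YAML); not from the CFF schema.
roles:
  type: array
  uniqueItems: true
  items:
    oneOf:
      - type: string
        enum:              # placeholder vocabulary, for illustration only
          - code
          - documentation
          - review
          - maintenance
      - type: object
        additionalProperties: false
        required:
          - description
        properties:
          description:     # free-text fallback for anything outside the vocabulary
            type: string
            minLength: 1
```

Under this assumption, machine consumers could rely on the enum entries, and the free-text objects could later be collected to see which terms the vocabulary is missing.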

I tried building some crosswalks for example using the Allcontributors terms and see how well they translate to Zenodo/Datacite, Codemeta 3.0 (not released yet), schema.org, CRediT, etc. I was somewhat surprised to learn that

  1. most vocabularies are limited in what they can express, specifically with regard to software development
  2. conversions between formats are so lossy, it's basically not worth doing in the first place.

It seems that your bullet point (a) may not be a solvable problem given my bullet 2 above; bullet point (b) will work and could additionally use schema.org's Role (see my notes here citation-file-format/cffconvert#366 (comment)) but also note that schema's (and Codemeta 3.0's to a lesser extent) approach is just to pass along whatever terms were present in the original format, making it the consumer's problem to derive meaning from each term; for bullet point (c), you're basically free to do what you want as long as you know which key to automatically copy paste into the relevant part of the credit page.

So if automatic conversion between formats doesn't work anyway, maybe CFF should either not have granularity of contributions (i.e. not have roles), or it can pick a controlled vocabulary with which users can at least express their contributions well enough to human readers.

Thanks for your input!

@sdruskat
Member Author

sdruskat commented Oct 5, 2023

Just very briefly, my gut feeling is that it's worthwhile to leave the decision about roles, and about a controlled vocabulary to use for them (of which I'm in favour), to another release (i.e., not make an attempt to include it in 1.3.0). This is mainly because the preliminary work mentioned above to create such a vocabulary based on community needs is now being picked up by a ReSA task force, so I think it doesn't hurt to wait until we see what their (our) results are.

@tobyhodges

> I tried building some crosswalks for example using the Allcontributors terms and see how well they translate to Zenodo/Datacite, Codemeta 3.0 (not released yet), schema.org, CRediT, etc. I was somewhat surprised to learn that
>
> 1. most vocabularies are limited in what they can express, specifically with regard to software development
> 2. conversions between formats are so lossy, it's basically not worth doing in the first place.
>
> It seems that your bullet point (a) may not be a solvable problem given my bullet 2 above; bullet point (b) will work and could additionally use schema.org's Role (see my notes here citation-file-format/cffconvert#366 (comment)) but also note that schema's (and Codemeta 3.0's to a lesser extent) approach is just to pass along whatever terms were present in the original format, making it the consumer's problem to derive meaning from each term; for bullet point (c), you're basically free to do what you want as long as you know which key to automatically copy paste into the relevant part of the credit page.

This is very useful insight, thanks for sharing. Indeed I had suspected that a lot of this might not be easy.

I suppose with HERMES there's the possibility of writing a plugin that would do some processing of metadata and wrangle the role info in the project CFF into a format that Zenodo can parse. But we are still quite a long way out from integrating HERMES into the lesson publication workflow. (We are also a niche case, as I must acknowledge that CFF is primarily intended for software and data.)

@sdruskat sdruskat changed the title Create roles as a simple way to record contribution roles? Record contributor roles/contribution types Jan 16, 2024
@sdruskat
Member Author

This is now the designated place to discuss further implementation of contributor roles and contribution types for the qualification of authors and/or contributors.

Supersedes discussions in:

@jspaaks
Member

jspaaks commented Apr 9, 2024

Found another vocabulary of contributor roles https://vocabularies.cessda.eu/vocabulary/ContributorRole
