Updating oboInOwl:created_by to dcterms:contributor for old terms created by ORCID-less contributors #1459

Closed
gouttegd opened this issue Jul 20, 2022 · 11 comments · Fixed by #1749

@gouttegd
Contributor

As part of the OBO-wide move from http://www.geneontology.org/formats/oboInOwl#created_by to http://purl.org/dc/terms/contributor, we want to replace all the existing oboInOwl:created_by annotations in FBbt en masse (and in the other FlyBase ontologies as well).

However, many of the existing oboInOwl:created_by annotations in FBbt have a value that is not an ORCID URI, or indeed a URI at all. Instead, their values typically correspond to the initials of the FlyBase ontologist who created the term, such as djs93, mmc46, sr544, etc. Widespread use of ORCIDs only started a few years ago, so almost all “old” terms have a non-ORCID, non-URI creator.

But the ODK now enforces that all dcterms:contributor annotations have a URI-like value (as part of the iri-range-violation.sparql standard check), so blindly replacing oboInOwl:created_by with dcterms:contributor results in a build failure, because annotations like dcterms:contributor mmc46 violate that constraint.
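
To make the constraint concrete, here is a minimal Python sketch that flags creator values which would probably trip such a check. It is only a rough heuristic (an http(s) scheme plus a host), not the actual logic of iri-range-violation.sparql; the function name `looks_like_iri` and the sample list are assumptions made for illustration.

```python
from urllib.parse import urlparse


def looks_like_iri(value: str) -> bool:
    """Heuristic: True if the value has an http(s) scheme and a host part."""
    parsed = urlparse(value.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


# Sample values: two initials-style tags and one ORCID URI.
creators = ["mmc46", "http://orcid.org/0000-0002-1373-1705", "sr544"]

# Values that would need attention before switching to dcterms:contributor.
blocked = [c for c in creators if not looks_like_iri(c)]
print(blocked)
```

Running something like this over the existing annotations gives a quick list of the values that cannot be moved to dcterms:contributor as they stand.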

@matentzn You’re the one who is keen to see oboInOwl:created_by (along with oboInOwl:creation_date) replaced with equivalent DC terms, so your input here would be welcome: How should old, ORCID-less contributors be dealt with?

  1. Leave them alone, as oboInOwl:created_by annotations; replace oboInOwl:created_by with dcterms:contributor only when the contributor is represented by an IRI (typically an ORCID URI).

Probably not very satisfying if the goal is to get rid of oboInOwl:created_by, as that would leave quite a large number of such annotations.

  2. Make sure that all contributors have an ORCID.

I don’t think this is realistically feasible for old contributors who may not even work for FlyBase anymore.

  3. Relax the iri-range-violation.sparql constraint to exclude dcterms:contributor from the list of annotations that must have an IRI value.

If we go that route, it should be done at the ODK level.

  4. Disable the iri-range-violation.sparql check in FBbt.

Doesn’t require any change at the ODK level, but then we lose the benefit of that check entirely; I am not really keen to do that.

  5. Create pseudo-URIs to represent old contributors.

For example, transform mmc46 into something like http://flybase.org/contributors/mmc46. This can be done even if said contributor is no longer there (unlike creating an ORCID). The constructed pseudo-URI would not resolve to anything meaningful, though, unlike an ORCID. (Unless we ask our web developers to make those URIs point to something on our website, but that seems like overkill to me.)
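
As a rough illustration of the pseudo-URI option, the transformation could look like the Python sketch below. The base URL is the example given above and does not resolve to anything; the function name `to_pseudo_uri` is hypothetical.

```python
# Example base taken from the proposal above; it is not a real, resolvable namespace.
PSEUDO_URI_BASE = "http://flybase.org/contributors/"


def to_pseudo_uri(creator: str) -> str:
    """Leave existing http(s) URIs untouched; wrap anything else in a pseudo-URI."""
    if creator.startswith(("http://", "https://")):
        return creator
    return PSEUDO_URI_BASE + creator


print(to_pseudo_uri("mmc46"))
```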

@gouttegd gouttegd self-assigned this Jul 20, 2022
@Clare72
Contributor

Clare72 commented Jul 21, 2022

Does prefixing with "FBC:" stop the check failing?

@matentzn
Collaborator

Great you are looking into this!

I would suggest the following:

  1. Try to identify whether the person has an ORCID now, and replace if they do. This is what Anita did for Uberon, and maybe she still has the tables for the tag -> ORCID mapping.
  2. If there are not too many left after that, create Wikidata entries for these (this takes 1 minute or less per person).

If 2 is too much work, I would suggest you use dcelements:contributor for all cases where the range is a string, and add a prefix, as in "Flybase:mmc24".

Definitely avoid your solutions 3 and 4.

@Clare72
Contributor

Clare72 commented Jul 21, 2022

actually there are not that many creators in FBbt:
created_by: camcur
created_by: david
created_by: djs93
created_by: dos
created_by: http://orcid.org/0000-0002-1373-1705
created_by: http://orcid.org/0000-0002-6095-8718
created_by: mmc46
created_by: sr544
created_by: temj2

(from grep "created_by: " fbbt-edit.obo | sort | uniq > unique_creators.txt)
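
For anyone who also wants the number of occurrences per creator, a small Python sketch along the same lines is shown below. It assumes the `created_by: ` tag sits at the start of a line (the grep above matches it anywhere in the line); the function name `count_creators` and the sample lines are made up for illustration.

```python
from collections import Counter
from typing import Iterable

TAG = "created_by: "


def count_creators(lines: Iterable[str]) -> Counter:
    """Count how often each created_by value appears in OBO-style lines."""
    counts: Counter = Counter()
    for line in lines:
        if line.startswith(TAG):
            counts[line[len(TAG):].strip()] += 1
    return counts


# Made-up sample lines; for the real file, pass an open file handle instead,
# e.g. count_creators(open("fbbt-edit.obo", encoding="utf-8")).
sample = [
    "name: example term",
    "created_by: mmc46",
    "created_by: david",
    "created_by: mmc46",
]
for creator, n in sorted(count_creators(sample).items()):
    print(creator, n)
```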

@Clare72
Contributor

Clare72 commented Jul 21, 2022

temj2 is my direct predecessor, Tamsin Jones - http://orcid.org/0000-0002-0027-0858
mmc46 is Marta Costa - http://orcid.org/0000-0001-5948-3092
I am not sure who sr544 is
@dosumis are you david, djs93 and/or dos?
maybe we could delete camcur? it refers to all FB Cambridge curators, which is a changing group of people, so maybe not a useful attribution
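
Putting the suggestions so far together, the replacement rule could be sketched in Python as below: keep existing URIs, swap in a known ORCID when the tag has one, and otherwise fall back to a prefixed string under dcelements:contributor. The table holds only the two ORCIDs given in this comment; the function name `rewrite_creator` is hypothetical, and the sketch returns (property, value) pairs rather than committing to any particular OBO serialisation.

```python
from typing import Dict, Tuple

# Tag -> ORCID pairs identified in this comment; other tags are still unresolved.
TAG_TO_ORCID: Dict[str, str] = {
    "temj2": "http://orcid.org/0000-0002-0027-0858",
    "mmc46": "http://orcid.org/0000-0001-5948-3092",
}


def rewrite_creator(value: str, tag_to_orcid: Dict[str, str]) -> Tuple[str, str]:
    """Return the (property, value) pair that should replace one created_by value."""
    if value.startswith(("http://", "https://")):
        # Already a URI (typically an ORCID): only the property changes.
        return ("dcterms:contributor", value)
    if value in tag_to_orcid:
        # Old initials-style tag with a known ORCID.
        return ("dcterms:contributor", tag_to_orcid[value])
    # No ORCID known: keep a string value, with the suggested prefix.
    return ("dcelements:contributor", "Flybase:" + value)


for tag in ["mmc46", "sr544", "http://orcid.org/0000-0002-1373-1705"]:
    print(rewrite_creator(tag, TAG_TO_ORCID))
```

Tags that end up in the fallback branch are the ones that would still need an ORCID, a Wikidata entry, or a decision to drop the annotation.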

@gouttegd
Contributor Author

@Clare72 : No, FBC:xxx still violates the constraint. My understanding is that it should be written as <FBC:xxx> to be recognised as an IRI, but this doesn’t seem to work in an OBO file.

Thanks for the ORCIDs above.

@matentzn : Will create Wikidata entries if needed (if we can’t identify remaining contributors).

@gouttegd
Contributor Author

According to the SourceForge pages of the OBO Foundry (didn’t know someone was still using SourceForge…):

  • sr544 is Simon Reeve (who is also listed on FlyBase.org as a former member, so it checks out), who doesn’t seem to have an ORCID;
  • djs93 is David Osumi-Sutherland.

That leaves dos (one occurrence) and david (151 occurrences), who are probably David Osumi-Sutherland as well, though I’d like confirmation.

And I agree about deleting camcur (18 occurrences) – if we don’t know who contributed the term exactly, there’s not much point in having a dcterms:contributor annotation at all.

@matentzn
Collaborator

For camcur, I would like to add that we do give credit to groups in a few ontologies, and it is not redundant to do so. You could use wikidata:Q3074571 (https://www.wikidata.org/entity/Q3074571).

@dosumis
Collaborator

dosumis commented Jul 21, 2022 via email

@gouttegd
Contributor Author

Remaining contributors referenced from within FBbt by their name rather than an identifier:

@Clare72
Contributor

Clare72 commented Jan 10, 2024

orcids look good (yes you have the right Kei Ito)

@Clare72
Contributor

Clare72 commented Jan 10, 2024

maybe we could use these for the others:
https://www.researchgate.net/scientific-contributions/Gary-Grumbling-2028777112
https://www.researchgate.net/scientific-contributions/Simon-Reeve-2162827703


4 participants