Improve incomplete records from RRID #954

Open
cthoyt opened this issue Oct 6, 2023 · 2 comments

@cthoyt
Member

cthoyt commented Oct 6, 2023

In #952, we aligned the RRID resources with the Bioregistry. However, there are a number of resources referenced in RRID that didn't align to existing Bioregistry prefixes, nor could they be curated as new ones. I've included that table here:

UNCURATABLE = {
    "XEP": "could not find an example entity number",
    "CWRU": "could not find evidence that this is an identifier resource",
    "XGSC": "could not find evidence that this is an identifier resource",
    "SSCLBR": "dead resource",
    "EXRC": "resource does not have stable/referencable identifiers for entities",
    "IMSR": "meta-site that seems to wrap other IMSR sites",
    "IMSR_CARD": "dead website",
    "IMSR_CMMR": "just a wrapper around MGI",
    "IMSR_CRL": "Massive site, too cryptic, can't find",
    "IMSR_GPT": "actual URLs don't match accession numbers",
    "IMSR_HAR": "could not find evidence that this is an identifier resource",
    "IMSR_NM-KI": "multiple conflicting identifiers - actual URLs don't match accession numbers",
    "IMSR_NIG": "could not find evidence that this is an identifier resource",
    "IMSR_TIGM": "could not find evidence that this is an identifier resource",
}

The question is: is there still a way that RRID incorporates information from these resources? For most of them, I could not actually find any information about the identifier resource. What do you think @bandrow? Is there a way to get example local unique identifiers for these resources that are pre-indexed in RRID? That might be a way forward.

cthoyt added the Curation label Oct 6, 2023
@bandrow
Contributor

bandrow commented Oct 6, 2023

All of these are animal stock centers, so they fall under the organism RRID category. To find an example RRID:

  1. Go to
    https://scicrunch.org/resources/data/source/nlx_154697-1/search?q=xep&l=xep
  2. Grab the first result, which is from Xenopus Express; its RRID is RRID:XEP_Xep.
  3. To check whether an RRID is valid, request https://n2t.net/RRID:XEP_Xep.json; a successful response means it is valid.

Example RRIDs for these resources:

    RRID:XEP_Xep
    RRID:CWRU_CFC
    RRID:XGSC_SR
    RRID:SSCLBR_Cdk8 (yes, the resource is currently dead; we will need to redirect to RGD shortly)
    RRID:EXRC_0216
    RRID:IMSR_CARD:1153
    RRID:IMSR_CMMR:516C10
    RRID:IMSR_CRL:023
    RRID:IMSR_GPT:T057702
    RRID:IMSR_HAR:2115
    RRID:IMSR_NIG:186
    RRID:IMSR_TIGM:IST14962C7
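
The validity check in step 3 could be scripted roughly as follows. This is only a sketch: the helper names `resolver_url` and `rrid_resolves` are hypothetical, and it assumes that an HTTP 200 response from the `.json` URL is what "success" means. Only the URL pattern itself comes from the steps above.

```python
import urllib.error
import urllib.request


def resolver_url(rrid: str) -> str:
    """Build the n2t.net JSON URL for an RRID such as 'RRID:XEP_Xep'."""
    return f"https://n2t.net/{rrid}.json"


def rrid_resolves(rrid: str, timeout: float = 10.0) -> bool:
    """Return True if the resolver answers with HTTP 200.

    Assumption: a 200 response means the RRID is valid; any HTTP error,
    network failure, or timeout is treated as "not resolved".
    """
    try:
        with urllib.request.urlopen(resolver_url(rrid), timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, TimeoutError):
        return False
```

Calling `rrid_resolves("RRID:XEP_Xep")` needs network access, so a `False` result can also mean the resolver was unreachable rather than that the RRID is invalid.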

The place where this will fail is IMSR: as you say, this is a pan-organization, so it contains no specific RRIDs; it houses a bunch of mouse repository data.

I need to figure out what the heck IMSR_NM-KI is that breaks all rules. I will try to track this down.

@bandrow
Contributor

bandrow commented Oct 6, 2023

I see what happened: the curator captured an extra parameter in the RRID:IMSR_NM-KI resource. It should be
RRID:IMSR_NM
example:
RRID:IMSR_NM-NSG-001

Thanks for catching this one. They broke the rules on IDs at this repository, but we didn't catch it.
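
For curation, the example RRIDs in this thread can be split into a prefix and a local unique identifier. The sketch below is not how RRID or the Bioregistry actually parses identifiers. It assumes the prefix list from the `UNCURATABLE` table (with `IMSR_NM-KI` corrected to `IMSR_NM`, and the bare `IMSR` meta-site left out), and it assumes the separator is one of `_`, `:` or `-`, inferred only from the examples given here. `KNOWN_PREFIXES` and `split_rrid` are hypothetical names.

```python
# Prefixes from the UNCURATABLE table, with IMSR_NM-KI corrected to IMSR_NM.
# The bare IMSR meta-site is omitted because it holds no RRIDs of its own.
KNOWN_PREFIXES = [
    "XEP", "CWRU", "XGSC", "SSCLBR", "EXRC",
    "IMSR_CARD", "IMSR_CMMR", "IMSR_CRL", "IMSR_GPT",
    "IMSR_HAR", "IMSR_NM", "IMSR_NIG", "IMSR_TIGM",
]

# Assumption: separators observed in the examples in this thread.
SEPARATORS = "_:-"


def split_rrid(rrid: str) -> tuple:
    """Split e.g. 'RRID:IMSR_CARD:1153' into ('IMSR_CARD', '1153')."""
    body = rrid[len("RRID:"):] if rrid.startswith("RRID:") else rrid
    # Try longer prefixes first so the most specific one wins.
    for prefix in sorted(KNOWN_PREFIXES, key=len, reverse=True):
        rest = body[len(prefix):]
        if body.startswith(prefix) and len(rest) > 1 and rest[0] in SEPARATORS:
            return prefix, rest[1:]
    raise ValueError(f"no known prefix matches {rrid!r}")


for example in ["RRID:XEP_Xep", "RRID:IMSR_CARD:1153", "RRID:IMSR_NM-NSG-001"]:
    print(example, "->", split_rrid(example))
```

Under these assumptions, `RRID:IMSR_NM-NSG-001` splits into the prefix `IMSR_NM` and the local identifier `NSG-001`.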
