Records for languages like SMILES, HGVS, and SPDI #460
Comments
TLDR: this is about a 2/5 on being in scope, but we might be able to give some support anyway.
It's possible to register prefixes even if there's no website that resolves them. However, I'm familiar with HGVS, and it's not clear whether HGVS strings count as identifiers the way entries in other "controlled" vocabularies do (similarly, we don't usually think of InChI and SMILES strings as identifiers under a prefix). That said, both of those managed to get prefixes anyway, so it might not be the worst thing to skip passing judgment. Please go ahead and send new prefix requests for these, and we will do our best to gather as much info about them as we can before accepting.
Let's say that for these to be useful in the Bioregistry, we need a very good regex for enforcement.
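As a rough starting point for that regex discussion, here is a minimal sketch of what enforcement could look like for SPDI. It assumes the colon-separated `sequence:position:deletion:insertion` layout and RefSeq-style versioned accessions only; the pattern, the `parse_spdi` helper, and the example strings are made up for illustration and are not an official or Bioregistry-approved pattern. A real one would need review against the SPDI specification.

```python
import re

# Hypothetical pattern (an assumption, not an official SPDI or Bioregistry regex).
# Layout assumed: <sequence accession>:<position>:<deletion>:<insertion>
SPDI_PATTERN = re.compile(
    r"(?P<sequence>[A-Z]{2}_\d+\.\d+)"  # assumed RefSeq-style accession with version
    r":(?P<position>\d+)"               # non-negative integer position
    r":(?P<deletion>[A-Z]*|\d+)"        # deleted sequence (may be empty) or its length
    r":(?P<insertion>[A-Z]*)"           # inserted sequence (may be empty)
)


def parse_spdi(text):
    """Return the four SPDI fields as a dict, or None if the string does not match."""
    match = SPDI_PATTERN.fullmatch(text)
    return match.groupdict() if match else None


# Illustrative strings only; they are not taken from a real data set.
print(parse_spdi("NC_000001.11:12345:A:G"))
print(parse_spdi("NC_000001.11:12345:A"))  # missing the insertion field
```

The sketch uses `fullmatch` so that trailing characters cause a rejection rather than a partial match. HGVS has a much larger grammar, so a single regex is unlikely to be enough for it.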
J. Bradley Holmes (orcid:0000-0001-8354-5062) might be a good owner of this prefix because he authored "SPDI: data model for variants and applications at NCBI".
SPDI nomenclature syntax examples:
Similar use cases:
Background:
SPDI (Sequence Position Deletion Insertion) nomenclature and HGVS (Human Genome Variation Society) nomenclature are two standards that, when used correctly, can uniquely identify sequence variants. Both provide a shorthand notation for capturing the genome, assembly, position, and sequence change of a sequence variant. In this way, they are a kind of identifier.
Motivation for Prefixes:
We have a group of users that would like to identify a prefix for either (or both of):
Use Cases:
Continuing from biolink-model issue biolink/biolink-model#1042:
It would be helpful to be able to reuse the existing architecture above to place HGVS and SPDI "identifiers" in their appropriate biolink model classes and normalize them in the context of other sequence variant identifiers in disparate data sets across NCATS Data Translator.
Challenges:
Questions:
Does this group have any opinions on registering a prefix for an identifier that isn't resolvable (and isn't a parked prefix; that is, no service plans to support expansion of the registered prefix)?