Add prefix uniprot.mnemonic #1110
Comments
Some comments on this entry: UniProt mnemonics are often used to refer to proteins in various places (e.g., in proteomics data sets). Mnemonics can also be resolved on uniprot.org. Their validation and resolution are of broad interest. The In this sense, the relationship of One issue this raises is that, because the resolver URL for both accession numbers and mnemonics is https://www.uniprot.org/uniprotkb/$1, reverse-mapping URLs to prefixes can be ambiguous.
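To make the reverse-mapping concern concrete, here is a minimal sketch of one possible way to disambiguate a shared URL by checking the local identifier against each pattern. The helper name `parse_uniprot_url`, the use of `uniprot` as the accession prefix, and the accession pattern (taken from UniProt's documented accession format) are assumptions for illustration and are not part of this request.

```python
import re

URI_BASE = "https://www.uniprot.org/uniprotkb/"

# Pattern proposed for mnemonics in this request (anchors dropped, fullmatch used).
MNEMONIC = re.compile(r"[A-Z0-9]{1,10}_[A-Z0-9]{1,5}")
# Assumption: accession pattern as documented by UniProt, mapped to prefix "uniprot".
ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def parse_uniprot_url(url):
    """Hypothetical helper: map a UniProtKB URL to a (prefix, local id) pair."""
    if not url.startswith(URI_BASE):
        return None
    # Keep only the first path segment after the base, e.g. drop "/entry".
    local_id = url[len(URI_BASE):].split("/")[0]
    if MNEMONIC.fullmatch(local_id):
        return "uniprot.mnemonic", local_id
    if ACCESSION.fullmatch(local_id):
        return "uniprot", local_id
    return None


print(parse_uniprot_url("https://www.uniprot.org/uniprotkb/BRAF_HUMAN"))
print(parse_uniprot_url("https://www.uniprot.org/uniprotkb/P15056"))
```

Because mnemonics always contain an underscore and accessions never do, the two patterns cannot both match the same string, so a pattern check of this kind would be enough to pick a prefix when both share one URI format.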
I just want to note that these are not stable, nor guaranteed to point to the same entity over time.
There was a previous discussion on resolvers for expressions. The thinking at that time was that this was only 2/5 in scope, so it seems that resolving things like gene symbols (or symbol-species tuples) is even less in scope. At least expression symbols have fixed semantics, but as @JervenBolleman says, label/name/symbol lookup is inherently unstable. I think if we do add these, we need a clear mechanism to distinguish identifier prefixes from expression prefixes from label lookup prefixes, communicating which are stable. But IMO this would be scope creep. There is room for a general lookup or name resolver service (this is something we do in NCATS Translator - https://name-resolution-sri.renci.org/docs), but I think this should be separate from identifier standardization and resolution.
I am also on the fence about this one. Let me explain why I opened this: I was trying to demo how you can take a data set and annotate its various experimental factors (proteins, drugs, cell lines, etc.) with identifiers. In this data set, proteins are identified using UniProt mnemonics, which may be bad practice but is widespread. I think UniProt mnemonics occupy a special place in that they are kind of like identifiers: they have well-defined syntax patterns and can be independently resolved through a URI pattern. If you do use UniProt mnemonics in some context, it's hugely more useful to be able to use a CURIE like
Prefix
uniprot.mnemonic
Name
UniProt
Homepage
https://www.uniprot.org/
Source Code Repository
No response
Description
The UniProt Knowledgebase (UniProtKB) is a comprehensive resource for protein sequence and functional information with extensive cross-references to more than 120 external databases. Besides amino acid sequence and a description, it also provides taxonomic data and citation information.
This entry represents UniProt mnemonics, which combine an alphanumeric representation of the protein name with a species identification code representing the biological source of the protein.
License
No response
Publications
pubmed:16381842
Example Local Unique Identifier
BRAF_HUMAN
Regular Expression Pattern for Local Unique Identifier
^[A-Z0-9]{1,10}_[A-Z0-9]{1,5}$
URI Format String
https://www.uniprot.org/uniprotkb/$1
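As an illustration of how the pattern and URI format string above fit together, the following minimal sketch validates a local unique identifier and substitutes it for `$1`. The function names `is_valid_mnemonic` and `expand` are hypothetical helpers, not part of any existing API.

```python
import re

# Values copied from the form fields of this request.
PREFIX = "uniprot.mnemonic"
PATTERN = re.compile(r"^[A-Z0-9]{1,10}_[A-Z0-9]{1,5}$")
URI_FORMAT = "https://www.uniprot.org/uniprotkb/$1"


def is_valid_mnemonic(local_id):
    """Return True if local_id matches the proposed pattern exactly."""
    # fullmatch rejects trailing newlines that "$" alone would tolerate.
    return PATTERN.fullmatch(local_id) is not None


def expand(local_id):
    """Hypothetical helper: substitute local_id for $1 in the URI format."""
    if not is_valid_mnemonic(local_id):
        raise ValueError(f"not a valid {PREFIX} identifier: {local_id!r}")
    return URI_FORMAT.replace("$1", local_id)


print(expand("BRAF_HUMAN"))
```

With the example identifier from this request, the sketch builds https://www.uniprot.org/uniprotkb/BRAF_HUMAN; a UniProt accession such as P15056 is rejected because it lacks the underscore-separated species code.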
Wikidata Property
No response
Contributor Name
Benjamin M. Gyori
Contributor GitHub
bgyori
Contributor ORCiD
0000-0001-9439-5346
Contributor Email
b.gyori@northeastern.edu
Contact Name
No response
Contact ORCiD
No response
Contact GitHub
No response
Contact Email
No response
Additional Comments
No response