Add prefix uniprot.mnemonic #1110

bgyori · 2024-05-05T23:30:20Z

Prefix

uniprot.mnemonic

Name

UniProt

Homepage

https://www.uniprot.org/

Source Code Repository

No response

Description

The UniProt Knowledgebase (UniProtKB) is a comprehensive resource for protein sequence and functional information with extensive cross-references to more than 120 external databases. Besides amino acid sequence and a description, it also provides taxonomic data and citation information.

This entry represents UniProt mnemonics, which combine an alphanumeric representation of the protein name with a species identification code representing the biological source of the protein.

License

No response

Publications

pubmed:16381842

Example Local Unique Identifier

BRAF_HUMAN

Regular Expression Pattern for Local Unique Identifier

^[A-Z0-9]{1,10}_[A-Z0-9]{1,5}$
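As a quick sanity check, the pattern above can be exercised against the example identifier. This is a minimal sketch using Python's standard `re` module; the helper name `is_valid_mnemonic` is hypothetical and not part of any Bioregistry API.

```python
import re

# Pattern copied verbatim from this request.
MNEMONIC_PATTERN = re.compile(r"^[A-Z0-9]{1,10}_[A-Z0-9]{1,5}$")


def is_valid_mnemonic(local_id: str) -> bool:
    """Return True if local_id matches the proposed mnemonic pattern."""
    # fullmatch also rejects a trailing newline, which a bare `$` would tolerate.
    return MNEMONIC_PATTERN.fullmatch(local_id) is not None


print(is_valid_mnemonic("BRAF_HUMAN"))  # example identifier from this request
print(is_valid_mnemonic("braf_human"))  # lowercase is rejected by the pattern
```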

URI Format String

https://www.uniprot.org/uniprotkb/$1
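For illustration, the format string can be expanded by substituting `$1` with the local unique identifier. This is only a sketch: the function name `expand_uri` is hypothetical, and URL-quoting the identifier is a defensive assumption rather than something the request specifies.

```python
from urllib.parse import quote

# URI format string copied from this request.
URI_FORMAT = "https://www.uniprot.org/uniprotkb/$1"


def expand_uri(local_id: str, uri_format: str = URI_FORMAT) -> str:
    """Substitute the $1 placeholder with the URL-quoted local identifier."""
    return uri_format.replace("$1", quote(local_id, safe=""))


print(expand_uri("BRAF_HUMAN"))
# https://www.uniprot.org/uniprotkb/BRAF_HUMAN
```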

Wikidata Property

No response

Contributor Name

Benjamin M. Gyori

Contributor GitHub

bgyori

Contributor ORCiD

0000-0001-9439-5346

Contributor Email

b.gyori@northeastern.edu

Contact Name

No response

Contact ORCiD

No response

Contact GitHub

No response

Contact Email

No response

Additional Comments

No response

bgyori · 2024-05-05T23:35:28Z

Some comments on this entry: UniProt mnemonics are often used to refer to proteins in various places (e.g., in proteomics data sets). Mnemonics can also be resolved on uniprot.org, so their validation and resolution are of broad interest. The existing uniprot prefix represents UniProt accession numbers (in UniProt's own terminology).

In this sense, the relationship between uniprot and uniprot.mnemonic is similar to that between hgnc identifiers and hgnc.symbol, which have two separate entries in the Bioregistry.

One issue this raises: because the resolver URL for both accession numbers and mnemonics is https://www.uniprot.org/uniprotkb/$1, reverse-mapping URLs to prefixes becomes ambiguous.
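To make the ambiguity concrete: a reverse mapping keyed only on the URL prefix cannot tell the two entries apart, but the local identifier patterns can. The sketch below is an illustration under stated assumptions, not how the Bioregistry implements reverse mapping. The accession pattern is assumed from UniProt's documented accession format and is not taken from this issue; the names `CANDIDATES` and `prefixes_for_url` are hypothetical.

```python
import re

URI_PREFIX = "https://www.uniprot.org/uniprotkb/"

# Both prefixes share URI_PREFIX, so only the local-identifier pattern differs.
CANDIDATES = {
    # Assumed accession pattern (UniProt's documented format); not from this issue.
    "uniprot": re.compile(
        r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2}"
    ),
    # Pattern proposed in this issue, without anchors since fullmatch is used.
    "uniprot.mnemonic": re.compile(r"[A-Z0-9]{1,10}_[A-Z0-9]{1,5}"),
}


def prefixes_for_url(url: str) -> list[str]:
    """Return every candidate prefix whose pattern accepts the URL's local ID."""
    if not url.startswith(URI_PREFIX):
        return []
    local_id = url[len(URI_PREFIX):]
    return [
        prefix
        for prefix, pattern in CANDIDATES.items()
        if pattern.fullmatch(local_id)
    ]


print(prefixes_for_url(URI_PREFIX + "BRAF_HUMAN"))
print(prefixes_for_url(URI_PREFIX + "P15056"))
```

Because mnemonics always contain an underscore and accession numbers never do, the two patterns are disjoint under these assumptions, so each URL maps back to exactly one prefix.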

JervenBolleman · 2024-05-11T15:44:42Z

I just want to note that these are not stable, nor guaranteed to point to the same entity over time.

cmungall · 2024-05-31T15:43:25Z

There was a previous discussion on resolvers for expressions

Records for languages like SMILES, HGVS, and SPDI #460

The thinking at that time was that this was only 2/5 in scope - it seems resolving things like gene symbols (or symbol-species tuples) is even less in scope? At least expression symbols have fixed semantics, but as @JervenBolleman says, label/name/symbol lookup is inherently unstable.

I think if we do add these we need a clear mechanism to distinguish identifier prefixes from expression prefixes from label lookup prefixes, communicating which are stable.

But IMO this would be scope creep. There is room for a general lookup or name resolver service (this is something we do in NCATS Translator - https://name-resolution-sri.renci.org/docs) but I think this should be separate from identifier standardization and resolution.

bgyori · 2024-05-31T15:56:47Z

I am also on the fence about this one. Let me explain why I opened this: I was trying to demo how you can take a data set and annotate its various experimental factors (proteins, drugs, cell lines, etc.) with identifiers. In this data set, proteins are identified using UniProt mnemonics - which may be bad practice but is widespread. I think UniProt mnemonics occupy a special place in that they are kind of like identifiers: they have well-defined syntax patterns and can be independently resolved through a URI pattern. If you do use UniProt mnemonics in some context, it's hugely more useful to be able to use a CURIE like uniprot.mnemonic:BRAF_HUMAN (and leverage validation, resolving, etc.) than just BRAF_HUMAN without any additional context. A second argument is that UniProt mnemonics vs UniProt IDs (accession numbers) are somewhat analogous to HGNC symbols vs HGNC IDs, and HGNC symbols have their own entry and resolver (see, e.g., https://bioregistry.io/registry/hgnc.symbol, which also harmonizes across all the external registries that have entries for it).

bgyori added the Prefix and New labels on May 5, 2024

github-actions bot mentioned this issue on May 5, 2024: Add prefix: uniprot.mnemonic #1111 (Closed)
