Add prefix http://purl.uniprot.org/core/ #1113

Open
jamesamcl opened this issue May 11, 2024 · 9 comments
Labels
New Used in combination with prefix, metaprefix, or collection for new entries Prefix

Comments

@jamesamcl
Contributor

jamesamcl commented May 11, 2024

Prefix

up

Name

UniProt core

Homepage

https://sparql.uniprot.org/

Source Code Repository

No response

Description

This IRI prefix http://purl.uniprot.org/core/ is used by the UniProt RDF representation as seen on https://sparql.uniprot.org/

CC @JervenBolleman

License

No response

Publications

No response

Example Local Unique Identifier

up:mnemonic

Regular Expression Pattern for Local Unique Identifier

No response

URI Format String

No response

Wikidata Property

No response

Contributor Name

no attribution required

Contributor GitHub

Contributor ORCiD

Contributor Email

No response

Contact Name

No response

Contact ORCiD

No response

Contact GitHub

No response

Contact Email

No response

Additional Comments

No response

@jamesamcl jamesamcl added New Used in combination with prefix, metaprefix, or collection for new entries Prefix labels May 11, 2024
@jamesamcl
Contributor Author

jamesamcl commented May 11, 2024

Looks like UniProt also acts as a registry defining prefixes for lots of other DBs, e.g.:

<rdfs:seeAlso rdf:resource="http://purl.uniprot.org/geneid/235256"/>
<rdfs:seeAlso rdf:resource="http://purl.uniprot.org/kegg/mmu:235256"/>
<rdfs:seeAlso rdf:resource="http://purl.uniprot.org/ucsc/uc009oyk.1"/>
<rdfs:seeAlso rdf:resource="http://purl.uniprot.org/agr/MGI:2660716"/>
<rdfs:seeAlso rdf:resource="http://purl.uniprot.org/agr/RGD:1332791"/>
<rdfs:seeAlso rdf:resource="http://purl.uniprot.org/ctd/235256"/>
<rdfs:seeAlso rdf:resource="http://purl.uniprot.org/mgi/2660716"/>
<rdfs:seeAlso rdf:resource="http://purl.uniprot.org/veupathdb/HostDB:ENSMUSG00000062121"/>
<rdfs:seeAlso rdf:resource="http://purl.uniprot.org/eggnog/ENOG502SH9P"/>
<rdfs:seeAlso rdf:resource="http://purl.uniprot.org/genetree/ENSGT01050000244869"/>
<rdfs:seeAlso rdf:resource="http://purl.uniprot.org/hogenom/CLU_012526_8_1_1"/>
<rdfs:seeAlso rdf:resource="http://purl.uniprot.org/inparanoid/Q60888"/>
<rdfs:seeAlso rdf:resource="http://purl.uniprot.org/oma/QVAFFPE"/>
<rdfs:seeAlso rdf:resource="http://purl.orthodb.org/odbgroup/4307986at2759"/>
<rdfs:seeAlso rdf:resource="http://purl.uniprot.org/phylomedb/Q60888"/>
<rdfs:seeAlso rdf:resource="http://purl.uniprot.org/treefam/TF336512"/>

(taken from one of the RDF files in the latest release)

Not sure how to best approach this @cthoyt ? I guess we should add these as alternate prefixes for all the other DBs but a full list would be helpful (@JervenBolleman?)
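To make the pattern in the `rdfs:seeAlso` IRIs above concrete, here is a minimal Python sketch that splits a `purl.uniprot.org` IRI into its first path segment and the remainder. It assumes, based only on the examples listed above, that the first path segment identifies the external database and everything after it is the local identifier; the function name `split_uniprot_purl` is hypothetical.

```python
from urllib.parse import urlparse


def split_uniprot_purl(iri):
    """Split a purl.uniprot.org IRI into (namespace segment, local identifier).

    Returns None for IRIs on other hosts or without a local identifier.
    Assumption: the first path segment names the external database.
    """
    parsed = urlparse(iri)
    if parsed.netloc != "purl.uniprot.org":
        return None
    # "/kegg/mmu:235256" -> ("kegg", "/", "mmu:235256")
    namespace, sep, local_id = parsed.path.lstrip("/").partition("/")
    if not sep or not local_id:
        return None
    return namespace, local_id
```

Note that IRIs such as http://purl.orthodb.org/odbgroup/4307986at2759 live on a different host, so this sketch deliberately returns `None` for them rather than guessing a namespace.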

@jamesamcl
Contributor Author

jamesamcl commented May 11, 2024

Full list of databases obtained from the SPARQL endpoint, though I am not sure exactly how these map to the URI prefixes seen above:

http://purl.uniprot.org/database/EnsemblFungi
http://purl.uniprot.org/database/EnsemblProtists
http://purl.uniprot.org/database/EnsemblMetazoa
http://purl.uniprot.org/database/PDB
http://purl.uniprot.org/database/Ensembl
http://purl.uniprot.org/database/EnsemblPlants
http://purl.uniprot.org/database/EnsemblBacteria
http://purl.uniprot.org/database/Gene3D
http://purl.uniprot.org/database/KEGG
http://purl.uniprot.org/database/PDBsum
http://purl.uniprot.org/database/SUPFAM
http://purl.uniprot.org/database/DNASU
http://purl.uniprot.org/database/MoonProt
http://purl.uniprot.org/database/EMDB
http://purl.uniprot.org/database/Allergome
http://purl.uniprot.org/database/Araport
http://purl.uniprot.org/database/MaizeGDB
http://purl.uniprot.org/database/PlantReactome
http://purl.uniprot.org/database/ProMEX
http://purl.uniprot.org/database/TAIR
http://purl.uniprot.org/database/REBASE
http://purl.uniprot.org/database/CLAE
http://purl.uniprot.org/database/PomBase
http://purl.uniprot.org/database/SGD
http://purl.uniprot.org/database/JaponicusDB
http://purl.uniprot.org/database/BMRB
http://purl.uniprot.org/database/UniLectin
http://purl.uniprot.org/database/CORUM
http://purl.uniprot.org/database/CPTC
http://purl.uniprot.org/database/CarbonylDB
http://purl.uniprot.org/database/ComplexPortal
http://purl.uniprot.org/database/DOSAC-COBS-2DPAGE
http://purl.uniprot.org/database/IDEAL
http://purl.uniprot.org/database/MetOSite
http://purl.uniprot.org/database/MoonDB
http://purl.uniprot.org/database/OGP
http://purl.uniprot.org/database/PRIDE
http://purl.uniprot.org/database/UCD-2DPAGE
http://purl.uniprot.org/database/WBParaSite
http://purl.uniprot.org/database/ZFIN
http://purl.uniprot.org/database/PHI-base
http://purl.uniprot.org/database/DEPOD
http://purl.uniprot.org/database/ConoServer
http://purl.uniprot.org/database/GlyConnect
http://purl.uniprot.org/database/euHCVdb
http://purl.uniprot.org/database/AlphaFoldDB
http://purl.uniprot.org/database/EMBL
http://purl.uniprot.org/database/HAMAP
http://purl.uniprot.org/database/HOGENOM
http://purl.uniprot.org/database/InParanoid
http://purl.uniprot.org/database/InterPro
http://purl.uniprot.org/database/NCBIfam
http://purl.uniprot.org/database/OrthoDB
http://purl.uniprot.org/database/Pfam
http://purl.uniprot.org/database/RefSeq
http://purl.uniprot.org/database/SMR
http://purl.uniprot.org/database/STRING
http://purl.uniprot.org/database/CAZy
http://purl.uniprot.org/database/MEROPS
http://purl.uniprot.org/database/PIR
http://purl.uniprot.org/database/AGR
http://purl.uniprot.org/database/Antibodypedia
http://purl.uniprot.org/database/Bgee
http://purl.uniprot.org/database/BindingDB
http://purl.uniprot.org/database/BioGRID-ORCS
http://purl.uniprot.org/database/BioMuta
http://purl.uniprot.org/database/CCDS
http://purl.uniprot.org/database/CPTAC
http://purl.uniprot.org/database/CTD
http://purl.uniprot.org/database/ChiTaRS
http://purl.uniprot.org/database/DMDM
http://purl.uniprot.org/database/DisGeNET
http://purl.uniprot.org/database/DrugCentral
http://purl.uniprot.org/database/EPD
http://purl.uniprot.org/database/ExpressionAtlas
http://purl.uniprot.org/database/FlyBase
http://purl.uniprot.org/database/GeneCards
http://purl.uniprot.org/database/GeneReviews
http://purl.uniprot.org/database/GeneTree
http://purl.uniprot.org/database/GeneWiki
http://purl.uniprot.org/database/Genevisible
http://purl.uniprot.org/database/GenomeRNAi
http://purl.uniprot.org/database/GlyGen
http://purl.uniprot.org/database/HGNC
http://purl.uniprot.org/database/HPA
http://purl.uniprot.org/database/MANE-Select
http://purl.uniprot.org/database/MGI
http://purl.uniprot.org/database/MIM
http://purl.uniprot.org/database/MalaCards
http://purl.uniprot.org/database/MassIVE
http://purl.uniprot.org/database/MaxQB
http://purl.uniprot.org/database/OpenTargets
http://purl.uniprot.org/database/Orphanet
http://purl.uniprot.org/database/PCDDB
http://purl.uniprot.org/database/PathwayCommons
http://purl.uniprot.org/database/PeptideAtlas
http://purl.uniprot.org/database/PharmGKB
http://purl.uniprot.org/database/Pharos
http://purl.uniprot.org/database/PhosphoSitePlus
http://purl.uniprot.org/database/ProteomicsDB
http://purl.uniprot.org/database/Pumba
http://purl.uniprot.org/database/REPRODUCTION-2DPAGE
http://purl.uniprot.org/database/RGD
http://purl.uniprot.org/database/RNAct
http://purl.uniprot.org/database/SIGNOR
http://purl.uniprot.org/database/SignaLink
http://purl.uniprot.org/database/SwissPalm
http://purl.uniprot.org/database/TopDownProteomics
http://purl.uniprot.org/database/TreeFam
http://purl.uniprot.org/database/UCSC
http://purl.uniprot.org/database/VGNC
http://purl.uniprot.org/database/WormBase
http://purl.uniprot.org/database/Xenbase
http://purl.uniprot.org/database/dbSNP
http://purl.uniprot.org/database/jPOST
http://purl.uniprot.org/database/neXtProt
http://purl.uniprot.org/database/GuidetoPHARMACOLOGY
http://purl.uniprot.org/database/NIAGADS
http://purl.uniprot.org/database/PATRIC
http://purl.uniprot.org/database/BRENDA
http://purl.uniprot.org/database/DrugBank
http://purl.uniprot.org/database/EvolutionaryTrace
http://purl.uniprot.org/database/SABIO-RK
http://purl.uniprot.org/database/TCDB
http://purl.uniprot.org/database/iPTMnet
http://purl.uniprot.org/database/ABCD
http://purl.uniprot.org/database/BioGRID
http://purl.uniprot.org/database/DIP
http://purl.uniprot.org/database/DisProt
http://purl.uniprot.org/database/ELM
http://purl.uniprot.org/database/ESTHER
http://purl.uniprot.org/database/GlyCosmos
http://purl.uniprot.org/database/IntAct
http://purl.uniprot.org/database/MINT
http://purl.uniprot.org/database/OMA
http://purl.uniprot.org/database/PRO
http://purl.uniprot.org/database/PaxDb
http://purl.uniprot.org/database/PhylomeDB
http://purl.uniprot.org/database/Reactome
http://purl.uniprot.org/database/SWISS-2DPAGE
http://purl.uniprot.org/database/VEuPathDB
http://purl.uniprot.org/database/dictyBase
http://purl.uniprot.org/database/Leproma
http://purl.uniprot.org/database/SwissLipids
http://purl.uniprot.org/database/LegioList
http://purl.uniprot.org/database/BioCyc
http://purl.uniprot.org/database/eggNOG
http://purl.uniprot.org/database/CGD
http://purl.uniprot.org/database/COMPLUYEAST-2DPAGE
http://purl.uniprot.org/database/CDD
http://purl.uniprot.org/database/PANTHER
http://purl.uniprot.org/database/PIRSF
http://purl.uniprot.org/database/PRINTS
http://purl.uniprot.org/database/PROSITE
http://purl.uniprot.org/database/SFLD
http://purl.uniprot.org/database/SMART
http://purl.uniprot.org/database/GeneID
http://purl.uniprot.org/database/Gramene
http://purl.uniprot.org/database/PeroxiBase
http://purl.uniprot.org/database/ChEMBL
http://purl.uniprot.org/database/SASBDB
http://purl.uniprot.org/database/World-2DPAGE
http://purl.uniprot.org/database/CollecTF
http://purl.uniprot.org/database/TubercuList
http://purl.uniprot.org/database/EchoBASE
http://purl.uniprot.org/database/PseudoCAP
http://purl.uniprot.org/database/ArachnoServer
http://purl.uniprot.org/database/IMGT_GENE-DB
http://purl.uniprot.org/database/ClinGen
http://purl.uniprot.org/database/DDBJ
http://purl.uniprot.org/database/ENZYME
http://purl.uniprot.org/database/GO
http://purl.uniprot.org/database/GPCRDB
http://purl.uniprot.org/database/GenAtlas
http://purl.uniprot.org/database/GenBank
http://purl.uniprot.org/database/GenCC
http://purl.uniprot.org/database/HUGE
http://purl.uniprot.org/database/MobiDB
http://purl.uniprot.org/database/ModBase
http://purl.uniprot.org/database/PDBe-KB
http://purl.uniprot.org/database/PDBj
http://purl.uniprot.org/database/Proteomes
http://purl.uniprot.org/database/RCSB-PDB
http://purl.uniprot.org/database/Rouge
http://purl.uniprot.org/database/SOURCE
http://purl.uniprot.org/database/SWISS-MODEL-Workspace
http://purl.uniprot.org/database/UniPathway

@JervenBolleman

JervenBolleman commented May 11, 2024

For more information about xrefs in UniProt see https://web.expasy.org/docs/userman.html#DR_line and https://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/complete/docs/dbxref.txt which are the equivalent of the RDF file https://ftp.uniprot.org/pub/databases/uniprot/current_release/rdf/databases.rdf.xz

One can ask for information about where these links go with a SPARQL query at our public endpoint:

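PREFIX up: <http://purl.uniprot.org/core/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# List every cross-referenced database with its optional metadata.
SELECT *
WHERE {
  GRAPH <http://sparql.uniprot.org/database> {
    ?database a up:Database .
    # URL template used to build links into the external database
    OPTIONAL {
      ?database up:urlTemplate ?urlTemplate .
    }
    OPTIONAL {
      ?database up:abbreviation ?abbreviation .
    }
    OPTIONAL {
      ?database rdfs:seeAlso ?furtherInformation .
    }
  }
}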

However, what the RDF representation uses for each of these databases is currently not exposed. If there is a desire for this, please contact UniProt via the contact form.

We also do not guarantee that these links to other databases will be maintained if the xref is removed from use in UniProt.

There will also be a few prefixes not in the list above that come from the UniParc representation in RDF.
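For anyone who wants to run the query above from a script, the following is a rough Python sketch using only the standard library. It assumes the endpoint accepts the query as a `query` parameter at https://sparql.uniprot.org/sparql and returns SPARQL JSON results when asked via the `Accept` header; the names `build_request` and `fetch_databases` are hypothetical, and the query is shortened to two of the optional fields.

```python
import json
import urllib.parse
import urllib.request

# Assumption: SPARQL protocol endpoint behind https://sparql.uniprot.org/
ENDPOINT = "https://sparql.uniprot.org/sparql"

QUERY = """
PREFIX up: <http://purl.uniprot.org/core/>
SELECT * WHERE {
  GRAPH <http://sparql.uniprot.org/database> {
    ?database a up:Database .
    OPTIONAL { ?database up:urlTemplate ?urlTemplate . }
    OPTIONAL { ?database up:abbreviation ?abbreviation . }
  }
}
"""


def build_request(query, endpoint=ENDPOINT):
    """Build a GET request that asks for SPARQL JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def fetch_databases(query=QUERY):
    """Send the query and return the list of result bindings (needs network)."""
    with urllib.request.urlopen(build_request(query), timeout=60) as response:
        results = json.load(response)
    return results["results"]["bindings"]
```

Calling `fetch_databases()` performs the network request; each returned binding is a dict keyed by variable name (for example `database`), with optional variables absent when unbound.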

@bgyori
Contributor

bgyori commented May 11, 2024

Hello, @jamesamcl and @JervenBolleman, thanks for submitting this. The Bioregistry already integrates the UniProt registry, see here: https://bioregistry.io/metaregistry/uniprot. Among other metadata, this page links to https://bioregistry.io/api/metaregistry/uniprot/mappings.json which is the set of mappings from bioregistry prefixes to the IDs UniProt assigns to these external databases. In addition, if you go to e.g., https://bioregistry.io/registry/ensembl.fungi, you will see that UniProt is shown and linked out to in this table:

[image: screenshot of the table on the ensembl.fungi registry page]

Given the above, I'm trying to understand if there are specific changes / extensions we would want to make in the Bioregistry.

@JervenBolleman

Hi @bgyori, UniProt does not use DB-0148 as a "prefix" in any way; we would have used EnsemblFungi (we don't, as there was an agreement with ENSEMBL to use rdf.ebi.ac.uk IRIs in the RDF).

As a general note, please do not use any single PI as the contact point for UniProt. Just use the UniProt contact form.

@bgyori
Contributor

bgyori commented May 11, 2024

I think that's just a slight terminology issue: the column in the table uses "prefix" to refer to how an external registry catalogs an identifier resource. Some external registries use what you would actually call a prefix, others not so much; still, this is just a column name.

In terms of contacts, at least so far, the approach in Bioregistry advocated by @cthoyt has been to use specific people as contacts when possible. I will look into it and get back to you.

@JervenBolleman

What you can do is add one more prefix to the UniProt subconcepts, uniprot.core, which will represent the concepts nested in http://purl.uniprot.org/core/, e.g. http://purl.uniprot.org/core/Protein. (The complete current list is available in the file https://ftp.uniprot.org/pub/databases/uniprot/current_release/rdf/core.owl)

@jamesamcl
Contributor Author

@JervenBolleman Apologies for tagging you directly if this was inappropriate; I was under the impression you were responsible for the RDF serialization of UniProt and our culture has generally been to involve the relevant database contacts in these discussions wherever possible (as IMO it is not particularly in the spirit of FAIR to discuss interoperability issues in a private support ticket tracker).

@bgyori Thanks for the info, I didn't think to look for UniProt on the metaregistry collections. I am hoping to use BioRegistry to map IRIs to one consistent set of CURIEs, but I found that UniProt's IRI prefixes weren't included in the standard BioRegistry prefix map which is what prompted this issue.
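As a sketch of the IRI-to-CURIE mapping described above, the snippet below compresses an IRI against a small prefix map using longest-prefix matching. The map is illustrative only: the `uniprot.core` entry follows the http://purl.uniprot.org/core/ prefix discussed in this issue, the `uniprot` entry is an assumption added for contrast, and `compress` is a hypothetical helper rather than a Bioregistry API.

```python
# Illustrative prefix map; only the uniprot.core entry comes from this issue.
PREFIX_MAP = {
    "uniprot.core": "http://purl.uniprot.org/core/",
    "uniprot": "http://purl.uniprot.org/uniprot/",  # assumed for the example
}


def compress(iri, prefix_map=PREFIX_MAP):
    """Return a CURIE for the IRI, or None if no URI prefix matches.

    The longest matching URI prefix wins, so nested namespaces resolve
    to the most specific entry.
    """
    best = None
    for prefix, uri_prefix in prefix_map.items():
        if iri.startswith(uri_prefix):
            if best is None or len(uri_prefix) > len(best[1]):
                best = (prefix, uri_prefix)
    if best is None:
        return None
    prefix, uri_prefix = best
    return f"{prefix}:{iri[len(uri_prefix):]}"
```

With this map, `compress("http://purl.uniprot.org/core/Protein")` yields `uniprot.core:Protein`, while an IRI outside the map yields `None` so the caller can decide how to handle unmapped namespaces.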

@JervenBolleman

@jamesamcl tagging me is fine, but might lead to very slow or missed responses. To be honest I am the public face of the UniProt RDF serialization, my colleagues at SIB are the real stars on this topic, but are less keen on presenting :(

The UniProt protocol is to prefer that contact happens in the first instance via our contact form, so that someone can be assigned to help; this might be as little as "please have a look at GitHub issue X or Biostars question Y". The helpdesk/contact form is staffed by multiple people. PIs and individual staff get so much mail that they might respond much later, or sometimes not at all, or are just not available during certain periods.

To be honest, we at UniProt think that UniProt is a team effort, which is why for the main NAR UniProt paper we no longer list individual authors but the consortium as an author.

If you feel the need to put in names, then you should really fill in the whole list from https://www.uniprot.org/help/key_staff, plus the PIs.

bgyori added a commit that referenced this issue May 12, 2024
This PR adds a URI pattern for `uniprot.core` (which already exists as a
prefix in Bioregistry but so far had no URI defined).
Related to #1113