Align GO JSON-LD context with dipper curie-map #582

cmungall · 2018-05-08T22:13:55Z

TODO: report on clashes

cmungall · 2018-05-08T22:15:50Z

CGD Was=http://identifiers.org/cgd/ [registry/go_context.jsonld], Now=http://ohsu.edu/cgd/ [registry/monarch_context.jsonld] M-FIX
EC Was=http://www.expasy.org/enzyme/ [registry/go_context.jsonld], Now=http://identifiers.org/ec-code/ [registry/monarch_context.jsonld]
GO_REF Was=http://purl.obolibrary.org/obo/go/references/ [registry/go_context.jsonld], Now=http://www.geneontology.org/cgi-bin/references.cgi#GO_REF: [registry/monarch_context.jsonld] M-FIX
GenBank Was=http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val= [registry/go_context.jsonld], Now=http://www.ncbi.nlm.nih.gov/nuccore/ [registry/monarch_context.jsonld] GO-FIX (is this used in URIs?)
HGNC Was=http://identifiers.org/hgnc/ [registry/go_context.jsonld], Now=http://identifiers.org/hgnc/HGNC: [registry/monarch_context.jsonld] TODO
MGI Was=http://identifiers.org/mgi/ [registry/go_context.jsonld], Now=http://www.informatics.jax.org/accession/MGI: [registry/monarch_context.jsonld] M-FIX but careful about doubling up
NCBIGene Was=http://identifiers.org/ncbigene/ [registry/go_context.jsonld], Now=http://www.ncbi.nlm.nih.gov/gene/ [registry/monarch_context.jsonld]
OMIM Was=http://omim.org/entry/ [registry/go_context.jsonld], Now=http://purl.obolibrary.org/obo/OMIM_ [registry/monarch_context.jsonld] M-FIX (but coordinate with mondo)
PAINT_REF Was=http://www.pantherdb.org/panther/lookupId.jsp?id=PTHR [registry/go_context.jsonld], Now=http://www.geneontology.org/gene-associations/submission/paint/ [registry/monarch_context.jsonld]
PANTHER Was=http://identifiers.org/panther.family/ [registry/go_context.jsonld], Now=http://www.pantherdb.org/panther/family.do?clsAccession= [registry/monarch_context.jsonld]
PDB Was=http://www.rcsb.org/pdb/cgi/explore.cgi?pdbId= [registry/go_context.jsonld], Now=http://identifiers.org/PDB: [registry/monarch_context.jsonld]
PMCID Was=http://www.ncbi.nlm.nih.gov/sites/entrez?db=pmc&cmd=search&term= [registry/go_context.jsonld], Now=http://www.ncbi.nlm.nih.gov/pmc/ [registry/monarch_context.jsonld]
PomBase Was=http://identifiers.org/pombase/ [registry/go_context.jsonld], Now=http://identifiers.org/PomBase: [registry/monarch_context.jsonld] M-FIX
RGD Was=http://identifiers.org/rgd/ [registry/go_context.jsonld], Now=http://rgd.mcw.edu/rgdweb/report/gene/main.html?id= [registry/monarch_context.jsonld] M-FIX
RefSeq Was=http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?val= [registry/go_context.jsonld], Now=http://www.ncbi.nlm.nih.gov/refseq/?term= [registry/monarch_context.jsonld]
TAIR Was=http://identifiers.org/tair.locus/ [registry/go_context.jsonld], Now=http://identifiers.org/TAIR: [registry/monarch_context.jsonld] TAIR locus identifiers: ATGs vs numeric IDs identifiers-org/registry#5
ZFIN Was=http://identifiers.org/zfin/ [registry/go_context.jsonld], Now=http://zfin.org/ [registry/monarch_context.jsonld]
dbSNP Was=http://identifiers.org/dbsnp/ [registry/go_context.jsonld], Now=http://www.ncbi.nlm.nih.gov/projects/SNP/snp_ref.cgi?rs= [registry/monarch_context.jsonld] M-FIX
dictyBase Was=http://identifiers.org/dictybase/ [registry/go_context.jsonld], Now=http://dictybase.org/gene/ [registry/monarch_context.jsonld]
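A report in the format above can be produced by intersecting the two prefix maps and printing every prefix whose URI base differs. The following is only a minimal sketch, not the script that generated the list: `report_clashes` is a hypothetical name, and the two small maps are hand-copied excerpts (the `GO` entry is added purely to show a non-clashing prefix).

```python
def report_clashes(was, now, was_name, now_name):
    """Return one report line per prefix present in both maps with a different URI base."""
    lines = []
    for prefix in sorted(set(was) & set(now)):
        if was[prefix] != now[prefix]:
            lines.append(
                f"{prefix} Was={was[prefix]} [{was_name}], "
                f"Now={now[prefix]} [{now_name}]"
            )
    return lines


# Hand-copied excerpts for illustration; the real contexts are much larger.
go_context = {
    "EC": "http://www.expasy.org/enzyme/",
    "ZFIN": "http://identifiers.org/zfin/",
    "GO": "http://purl.obolibrary.org/obo/GO_",
}
monarch_context = {
    "EC": "http://identifiers.org/ec-code/",
    "ZFIN": "http://zfin.org/",
    "GO": "http://purl.obolibrary.org/obo/GO_",
}

clashes = report_clashes(
    go_context,
    monarch_context,
    "registry/go_context.jsonld",
    "registry/monarch_context.jsonld",
)
for line in clashes:
    print(line)
```

Prefixes that exist in only one map, or that differ only in spelling, are not reported by this approach.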

TomConlin · 2018-05-09T00:08:36Z

I applaud giving the primary database URLs their due.
dipper warns on non 1:1 maps (https://github.com/monarch-initiative/dipper/blob/master/dipper/utils/CurieUtil.py#L21); GO might want to as well.
mind httpS where possible
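A 1:1 check of this kind could look roughly like the sketch below. It is not dipper's `CurieUtil` code: `find_shared_bases` is a hypothetical helper, and the `EntrezGene` alias in the sample map is invented for the demonstration.

```python
import logging
from collections import defaultdict

logger = logging.getLogger(__name__)


def find_shared_bases(curie_map):
    """Return {uri_base: [prefixes]} for every base claimed by more than one prefix."""
    by_base = defaultdict(list)
    for prefix, base in curie_map.items():
        by_base[base].append(prefix)
    shared = {
        base: sorted(prefixes)
        for base, prefixes in by_base.items()
        if len(prefixes) > 1
    }
    for base, prefixes in shared.items():
        # A shared base means URI -> CURIE contraction is ambiguous.
        logger.warning("curie map is not 1:1: %s all map to %s", prefixes, base)
    return shared


sample_map = {
    "NCBIGene": "http://www.ncbi.nlm.nih.gov/gene/",
    "EntrezGene": "http://www.ncbi.nlm.nih.gov/gene/",  # hypothetical alias
    "ZFIN": "http://zfin.org/",
}
shared = find_shared_bases(sample_map)
```

A map that passes this check can be inverted safely, which is what contracting a URI back to a CURIE relies on.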

cmungall · 2018-05-10T03:25:22Z

> I applaud giving the primary database URLs their due

The problem here is that these are often ad hoc and subject to change. In GO we are going for identifiers.org, which is more stable and predictable.

> dipper warns on non 1:1 maps https://github.com/monarch-initiative/dipper/blob/master/dipper/utils/CurieUtil.py#L21 GO might want to as well

We should look at merging dipper CurieUtil and https://github.com/prefixcommons/prefixcommons-py

> mind httpS where possible

I have gotten assurances from identifiers.org that they will support http in perpetuity. The same is of course true for OBO. Stability is key here.

cmungall · 2018-05-10T03:27:45Z

Note the list above only includes cases where the prefix matches or the URL matches.

It isn't reporting the fact that GO has

- database: Reactome
  name: Reactome - a curated knowledgebase of biological pathways
  synonyms:
    - REACTOME
    - REAC
  rdf_uri_prefix: http://identifiers.org/reactome/
  generic_urls:
    - http://www.reactome.org/
  entity_types:
    - type_name: entity
      type_id: BET:0000000
      id_syntax: R-[A-Z]{3}-[0-9]+(-[0-9]+){0,1}(\.[0-9]+){0,1}
      url_syntax: http://www.reactome.org/content/detail/[example_id]
      example_id: Reactome:R-HSA-109582
      example_url: http://www.reactome.org/content/detail/R-HSA-109582

whereas dipper has

'REACT': 'http://www.reactome.org/PathwayBrowser/#/'

It looks like we have just recommended REACT to the translator folks, ah well. I'm not sure where this abbreviation came from.

But the URL is a good example of a bad semantic web PURL: http://www.reactome.org/PathwayBrowser/#/
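One reason a base ending in `#/` makes a poor PURL is that everything after `#` is a fragment, which a client does not send to the server. The sketch below uses the standard library's `urldefrag` to show what each style of URI looks like from the server's side; `split_for_server` is a hypothetical helper name, and the local ID is the `example_id` from the GO entry above.

```python
from urllib.parse import urldefrag


def split_for_server(uri):
    """Return (the part sent in an HTTP request, the fragment kept by the client)."""
    result = urldefrag(uri)
    return result.url, result.fragment


local_id = "R-HSA-109582"
dipper_style = "http://www.reactome.org/PathwayBrowser/#/" + local_id
go_style = "http://identifiers.org/reactome/" + local_id

# With the dipper base, the identifier ends up entirely in the fragment.
print(split_for_server(dipper_style))
# With the GO base, the identifier is part of the path the server sees.
print(split_for_server(go_style))
```

With the fragment-based form, every Reactome entity shares the same server-visible URL, so the server alone cannot tell them apart.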

jmcmurry · 2018-05-19T02:06:18Z

Please note that short-form CURIE resolution is now supported in identifiers.org, for example http://identifiers.org/MGI:3764834. My preference would be to use these simple URIs throughout our stack, except for OBO PURLs and other sources that have additional semantic sugar. I've made specific recommendations here.

nathandunn · 2018-05-22T23:45:05Z

@jmcmurry (sorry to interject) I was talking with @TomConlin about this. I think that it's going to be problematic even if it goes to the canonical source. I think you're going to run into problems if you squat on the base-level CURIE. I would propose something like the following (such that it's always scoped):

http://identifiers.org/monarch/MGI:3764834

This way, AGR, MONARCH, MGI, etc. can each choose where their external links resolve, and it reduces the possibility of data collisions along the way. Doing it this way, you don't really have to consult anyone outside Monarch, whereas doing it at the root level will require a higher level of coordination for establishing and changing them.

nathandunn · 2018-05-22T23:52:40Z

But I really do like the identifiers.org approach overall. It's a nice approach to the ever-moving / dying web. I'm not sure if there is a better solution; I would just scope any CURIE in a way that you can own it long-term.

nathandunn · 2018-05-23T14:57:30Z

Just to clarify my point. It might be fine to use the short-form if, for example, MGI is committed to supporting it internally, as they do the rest of their IDs, but even then, I think you are better off coming up with a scoping model. The reasons are:

1 - prevent potential collisions (can you register an entire CURIE?)

2 - allow an organization that doesn't own the IDs to quickly update changed external IDs (for example, a downstream organization that uses your IDs in a load won't pick up your changed links)

3 - allow individual organizations to change where a given ID points, as there are several entities that house the same IDs. E.g., http://identifiers.org/monarch/MGI:107476 points to http://www.informatics.jax.org/marker/MGI:107476 , but http://identifiers.org/myorg/MGI:107476 points to https://www.alliancegenome.org/gene/MGI:107476

4 - at a minimum, I don't think we'll be able to grab CURIEs for organizations we don't actively own (I imagine orgs would be furious if an organization other than their own controlled their CURIE). It wouldn't be a bad thing to encourage the MODs (for example) to register these with identifiers.org as @jmcmurry suggested, though.

This sort of resolves to a poor man's DNS in some ways, but I think it simplifies things quite a bit.
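The scoping model in the list above can be sketched as a lookup keyed first by scope and then by prefix. This is only an illustration of the proposal, not how identifiers.org works: `SCOPED_TARGETS` and `resolve_scoped` are hypothetical names, and the two targets are the ones given in point 3.

```python
# scope -> prefix -> URL template; each scope owner maintains its own entries.
SCOPED_TARGETS = {
    "monarch": {"MGI": "http://www.informatics.jax.org/marker/MGI:{local_id}"},
    "myorg": {"MGI": "https://www.alliancegenome.org/gene/MGI:{local_id}"},
}


def resolve_scoped(uri, base="http://identifiers.org/"):
    """Map <base><scope>/<PREFIX>:<local_id> to the target chosen by that scope."""
    if not uri.startswith(base):
        raise ValueError(f"not a scoped URI under {base}: {uri}")
    scope, slash, curie = uri[len(base):].partition("/")
    prefix, colon, local_id = curie.partition(":")
    if not slash or not colon:
        raise ValueError(f"expected <scope>/<PREFIX>:<id>, got: {uri}")
    try:
        template = SCOPED_TARGETS[scope][prefix]
    except KeyError:
        raise LookupError(f"no target registered for {scope}/{prefix}") from None
    return template.format(local_id=local_id)


print(resolve_scoped("http://identifiers.org/monarch/MGI:107476"))
print(resolve_scoped("http://identifiers.org/myorg/MGI:107476"))
```

Each scope can repoint its own entries without coordinating with the others, at the cost of the two URIs for the same gene no longer being identical strings.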

@cmungall / @jmcmurry / @TomConlin I would be happy to chat about this. A lot of orgs are going to face this. I think that identifiers.org is the right way to do this for many reasons, but I think there needs to be a bit of nuance on the implementation.

cmungall · 2018-05-25T12:59:56Z

@nathandunn I think you're starting from some different assumptions. The primary use case here is joining triples, not resolution. Hence the URIs must be identical, and organization-specific URIs run contrary to this.

The choice is between the standard id.org URLs and the newer ones that embed CURIEs directly in the URL. The latter is preferable for many reasons, but there are concerns over the effect of colons in various semweb specs.
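The join requirement can be illustrated with plain string expansion: if two sources expand the same CURIE with different prefix maps, the resulting URIs differ and their triples no longer share a node. This is a minimal sketch; `expand` is a hypothetical helper doing naive concatenation, the first two maps copy the MGI entries from the clash list, and `shared_map` assumes the CURIE-embedding form of identifiers.org URL mentioned earlier in the thread.

```python
def expand(curie, prefix_map):
    """Expand PREFIX:local_id by naive concatenation with the mapped URI base."""
    prefix, colon, local_id = curie.partition(":")
    if not colon or prefix not in prefix_map:
        raise ValueError(f"cannot expand {curie!r}")
    return prefix_map[prefix] + local_id


go_map = {"MGI": "http://identifiers.org/mgi/"}
monarch_map = {"MGI": "http://www.informatics.jax.org/accession/MGI:"}

# Same gene, two different URIs: a join on subject finds nothing.
go_subjects = {expand("MGI:107476", go_map)}
monarch_subjects = {expand("MGI:107476", monarch_map)}
joined_with_own_maps = go_subjects & monarch_subjects

# Assumed shared map using the CURIE-embedding URL form.
shared_map = {"MGI": "http://identifiers.org/MGI:"}
joined_with_shared_map = {expand("MGI:107476", shared_map)} & {
    expand("MGI:107476", shared_map)
}

print(len(joined_with_own_maps), len(joined_with_shared_map))
```

The shared base ends in a colon, which is exactly the character whose handling in various semweb specs is the concern raised above.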

nathandunn · 2018-05-25T14:18:28Z

@cmungall Thanks for the clarification, and sorry for any confusion. Yes, the CURIE is a no-brainer.

jmcmurry · 2018-05-30T00:08:33Z

No prob Nathan, agreed: we would never ever squat on a CURIE for our own 3rd party purposes. It would break the trust of both users and providers.
The new identifiers.org syntax is such that a provider can be specified OR omitted as the user desires; however, where the user omits the provider, they're redirected to whichever of the trusted authoritative original sources and their close collaborators has the best uptime record that month. There are some issues related to that, but it is what it is.

cmungall added a commit to prefixcommons/biocontext that referenced this issue May 8, 2018

Improved on reporting, generated go-monarch clash report, see monarch-initiative/dipper#582

5e174ad

cmungall changed the title ~~Align GO and Monarch json-ld contexts~~ Align GO JSON-LD context with dipper curie-map May 8, 2018

cmungall added a commit to biolink/biolink-model that referenced this issue May 10, 2018

Using reactome for now, but see monarch-initiative/dipper#582

e44fa5b

lpalbou mentioned this issue Jul 30, 2018

Wrong JSON-LD Context ? monarch-initiative/biolink-api#204

Open
