This repository has been archived by the owner on Apr 28, 2023. It is now read-only.

REST API does not link to owl file correctly #388

Closed
markmcdowall opened this issue Jul 20, 2020 · 1 comment

@markmcdowall

In the REST API the links to the owl files don't always work.

I get the list of ontologies by:

import json
import requests
# url is the OLS REST API endpoint that lists the ontologies (not shown here)
request_handle = requests.get(url)
json_document = json.loads(request_handle.text)

If I then test the headers of the URL given by json_document['_embedded']['ontologies']['config']['id'] for each ontology:

valid = requests.head(ontology['owl'])

Then the following URLs either return 404, or have various connection issues:

<class 'requests.exceptions.ConnectionError'>: enm (5.0.1): http://purl.enanomapper.net/onto/enanomapper.owl

<class 'requests.exceptions.ConnectionError'>: hcao (2020-05-22): http://ontology.data.humancellatlas.org/ontologies/hcao

nmrcv - nmrcv (1.1.0): http://nmrML.org/nmrCV
<Response [404]>

<class 'requests.exceptions.ConnectionError'>: scdo (2019-06-26): http://scdontology.h3abionet.org/ontology/scdo.owl

afo - afo (REC/2019/05/10): http://purl.allotrope.org/voc/afo/merged-OLS/REC/2019/05/10
<Response [404]>

# This is the root page rather than the one for the actual owl file
edam - edam (17-07-2019): http://edamontology.org

ppo - ppo (2018-10-26): https://raw.githubusercontent.com/PlantPhenoOntology/ppo/master/ppo.owl
<Response [404]>

<class 'requests.exceptions.ConnectionError'>: sdgio (2018-08-10): http://purl.unep.org/sdg/sdgio.owl
teddy - teddy (rel-2014-04-24): http://identifiers.org/teddy/
<Response [400]>

However, if I instead use json_document['_embedded']['ontologies']['config']['fileLocation'], which does give me the EDAM owl file, this raises errors for other vocabularies, and it sometimes returns links to the EBI's internal file system (/nfs/panda/ensembl/.../.../PHI.obo):

<class 'requests.exceptions.InvalidSchema'>: dicom (None): ftp://medical.nema.org/MEDICAL/Dicom/Resources/Ontology/DCM/dcm.owl

<class 'requests.exceptions.ConnectionError'>: enm (5.0.1): http://purl.enanomapper.net/onto/enanomapper.owl

<class 'requests.exceptions.ConnectionError'>: afo (REC/2019/05/10): http://afo-ols.semanticsfirst.com/ontologies/afo

genepio - genepio (2018-06-15): https://raw.githubusercontent.com/GenEpiO/genepio/master/src/ontology/genepio-merged-cardfix.owl
<Response [404]>

<class 'requests.exceptions.InvalidSchema'>: phi (10-12-2018): file:/nfs/panda/ensembl/production/ensprod/ontologies/phi/PHI.obo

<class 'requests.exceptions.ConnectionError'>: sdgio (2018-08-10): http://purl.unep.org/sdg/sdgio.owl

Neither fileLocation nor the id seems to guarantee getting the source owl file for all ontologies.
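
The InvalidSchema errors listed above come from locations whose scheme is not http or https (the ftp:// and file: entries); requests only ships connection adapters for those two schemes by default. A minimal sketch of filtering such locations out before issuing a request follows. The helper name `is_http_location` is hypothetical and is not part of requests or the OLS API.

```python
from urllib.parse import urlparse


def is_http_location(location):
    """Return True if the location uses a scheme requests handles by default.

    Hypothetical helper for illustration: it only inspects the URL scheme
    and makes no network call.
    """
    return urlparse(location).scheme in ("http", "https")


# Locations copied from the listings above.
locations = [
    "http://purl.enanomapper.net/onto/enanomapper.owl",
    "ftp://medical.nema.org/MEDICAL/Dicom/Resources/Ontology/DCM/dcm.owl",
    "file:/nfs/panda/ensembl/production/ensprod/ontologies/phi/PHI.obo",
]
for location in locations:
    print(is_http_location(location), location)
```

This only separates the InvalidSchema cases from the rest; an http location that passes the check can still fail with a ConnectionError or a 404.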

@jamesamcl
Member

You've encountered one of the main problems we have as OLS maintainers: the ontology links often go dead. Our own indexer, which runs daily, also hits many of these dead links, so we have to skip the ones that fail, and those ontologies are not updated until we update the OLS configuration with the new URLs.

We make a best effort to update the URLs where we can, but with the growing number of ontologies indexed it is a moving target. Therefore, you should do as our indexer does and not assume that all of the URLs in the fileLocation property will resolve correctly.

phi is indeed loaded from a file:// URL on our internal NFS, which is not ideal. I will see if there is an HTTP URL we can use instead.
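
Following the advice above not to assume that every fileLocation resolves, a client can record failures and carry on rather than stop at the first bad link. The sketch below is one way to do that with requests; it is not the OLS indexer's actual code. The function name `check_links`, the `{name: url}` input shape, and the injectable `head` parameter are all assumptions made for illustration.

```python
import requests


def check_links(locations, head=requests.head, timeout=10):
    """Split a {name: url} mapping into reachable and failed entries.

    Hypothetical helper: `head` defaults to requests.head and can be
    replaced with a stub so the function can be exercised offline.
    """
    reachable, failed = {}, {}
    for name, url in locations.items():
        try:
            response = head(url, allow_redirects=True, timeout=timeout)
        except requests.exceptions.RequestException as exc:
            # RequestException is the base class of ConnectionError,
            # InvalidSchema, Timeout and the other requests errors.
            failed[name] = type(exc).__name__
            continue
        if response.ok:  # True for status codes below 400
            reachable[name] = url
        else:
            failed[name] = f"HTTP {response.status_code}"
    return reachable, failed
```

Calling `check_links({"enm": "http://purl.enanomapper.net/onto/enanomapper.owl"})` returns two dictionaries; entries in the second one can be skipped or retried later instead of aborting the whole run.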
