Decide how to handle MESH to CHEBI mappings. Currently there is a GitHub Gist (ncbo_rest_api.py) that queries the BioPortal API and can be run as part of the KG CI/CD build.
Problems: The ncbo_rest_api.py script runs fine, but it's brittle because it relies on the BioPortal API, which is notoriously unstable. A potential solution (for now or in the future) would be to implement the LOOM algorithm, which is what creates the mappings underlying the API.
TODO
Decide whether to use the current script or implement LOOM
@bill-baumgartner - this is complete (will be integrated with PR #81). I followed the details for the LOOM algorithm described on the BioPortal Wiki. It's very simple, just a few methods. There is nothing fancy: it is essentially accomplished by preprocessing the input MeSH and ChEBI data and performing an inner join to find overlapping concepts.
In a Nutshell: We download the mesh2021.nt data file directly from MeSH and the Flat_file_tab_delimited/names.tsv.gz file directly from ChEBI. Using these files, we have recapitulated the LOOM algorithm implemented by BioPortal when creating mappings between these resources. The procedure is relatively straightforward and utilizes the following information from each resource:
For all MeSH SCR Chemicals, obtain the following information:
Identifiers: MeSH identifiers
Labels: string labels using the RDFS:label object property
Synonyms: track down all synonyms using the vocab:concept and vocab:preferredConcept object properties
For all ChEBI classes, obtain the following information:
Labels: string labels using the RDFS:label object property
Synonyms: track down all synonyms using all synonym object properties
You can see details with a description in the notebook here under ChEBI Identifiers as well as in the scripted version of this notebook (lines: 496-628, here)
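As a rough illustration of the preprocess-then-inner-join idea described above, here is a minimal sketch in plain Python. It is not the code from the notebook or the script: the function names, the normalization rule (lowercase, drop every non-alphanumeric character), the minimum string length, and the toy identifiers are all assumptions made for this example.

```python
import re
from collections import defaultdict


def normalize(text):
    """Lowercase and drop every character outside a-z and 0-9.

    Assumed normalization for this sketch; the real preprocessing may differ.
    """
    return re.sub(r"[^a-z0-9]", "", text.lower())


def index_terms(records, min_len=3):
    """Map each normalized label/synonym to the identifiers that carry it.

    `records` maps an identifier to its list of labels and synonyms.
    `min_len` is a hypothetical cutoff that skips very short strings.
    """
    index = defaultdict(set)
    for identifier, strings in records.items():
        for s in strings:
            key = normalize(s)
            if len(key) >= min_len:
                index[key].add(identifier)
    return index


def loom_style_mappings(mesh, chebi):
    """Inner join of the two indexes on the normalized string."""
    mesh_idx = index_terms(mesh)
    chebi_idx = index_terms(chebi)
    shared = mesh_idx.keys() & chebi_idx.keys()
    pairs = {(m, c) for key in shared for m in mesh_idx[key] for c in chebi_idx[key]}
    return sorted(pairs)


# Toy records with made-up identifiers, for illustration only.
mesh = {
    "MESH:C000001": ["Example Acid", "example-acid sodium"],
    "MESH:C000002": ["Other Compound"],
}
chebi = {
    "CHEBI:0000001": ["example acid"],
    "CHEBI:0000002": ["unrelated substance"],
}

mappings = loom_style_mappings(mesh, chebi)
print(mappings)
```

With the toy data, only the first MeSH record and the first ChEBI record share a normalized string ("exampleacid"), so a single pair is returned. Because every label and synonym is indexed, one concept can map to several concepts on the other side.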
TASK
Task Type:
CODEBASE
TODO
Write tests against script