Improve performance of descendants call #980

Closed
marcosmro opened this issue Feb 28, 2019 · 3 comments

@marcosmro
Member

marcosmro commented Feb 28, 2019

Some calls to BioPortal's descendants endpoint take a long time or time out. Here is an example of a call that returns a 504 Gateway Time-out:

http://data.bioontology.org/ontologies/NCBITAXON/classes/http%3A%2F%2Fpurl.bioontology.org%2Fontology%2FNCBITAXON%2F131567/descendants?&page=1&pagesize=50&include=prefLabel,hasChildren,created,synonym,definition
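For readers who want to reproduce the request, here is a minimal sketch of how such a descendants URL can be assembled. The helper name `descendants_url` is hypothetical and not part of any BioPortal client; the host, path layout, and query parameters are copied from the example URL above. The sketch only builds the string and does not send a request.

```python
from urllib.parse import quote, urlencode

# Host taken from the example URL in this issue.
BASE_URL = "http://data.bioontology.org"


def descendants_url(ontology, class_iri, page=1, pagesize=50, include=()):
    """Build a descendants URL (hypothetical helper, for illustration only)."""
    # The class IRI is one path segment, so every reserved character,
    # including "/" and ":", has to be percent-encoded.
    encoded_iri = quote(class_iri, safe="")
    path = f"{BASE_URL}/ontologies/{ontology}/classes/{encoded_iri}/descendants"
    params = {"page": page, "pagesize": pagesize}
    if include:
        params["include"] = ",".join(include)
    # Keep commas literal so the include list reads as in the example URL.
    return f"{path}?{urlencode(params, safe=',')}"


url = descendants_url(
    "NCBITAXON",
    "http://purl.bioontology.org/ontology/NCBITAXON/131567",
    include=("prefLabel", "hasChildren", "created", "synonym", "definition"),
)
print(url)
```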

The BioPortal team recently enhanced the search index (see #32) to optimize subtree search. This new index could be used to improve the performance of the descendants call too.

Here are a couple of observations made by @mdorf (extracted from #32):

  • The original /descendants endpoint should not be modified, as it acts consistently with the rest of the hierarchical endpoints (/ascendants, /parents, /children). The latter three cannot take advantage of the search index.
  • I can simulate the functionality you're seeking by enabling a queryless search in a branch. The signature will look like:

/search?ontology=SNOMEDCT&subtree_root_id=http%3A%2F%2Fpurl.bioontology.org%2Fontology%2FSNOMEDCT%2F64572001
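As a rough illustration of the queryless search signature above, the sketch below builds the same request from its two parameters. The function name `subtree_search_url` is hypothetical, and the host is an assumption: it is the one used in the descendants example in this issue, which the signature itself does not state. No request is sent.

```python
from urllib.parse import urlencode

# Assumption: same host as the descendants example in this issue.
BASE_URL = "http://data.bioontology.org"


def subtree_search_url(ontology, subtree_root_id):
    """Build a queryless subtree search URL (hypothetical helper)."""
    # No "q" parameter is set: the search is restricted only by the
    # ontology and by the root class of the subtree.
    params = {"ontology": ontology, "subtree_root_id": subtree_root_id}
    # urlencode percent-encodes ":" and "/" in the class IRI.
    return f"{BASE_URL}/search?{urlencode(params)}"


search_url = subtree_search_url(
    "SNOMEDCT",
    "http://purl.bioontology.org/ontology/SNOMEDCT/64572001",
)
print(search_url)
```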

@graybeal
Member

graybeal commented Mar 2, 2019

Question about bullet 1 above. It seems to me the search index does not change the functionality; it just makes the call faster. In that case, it doesn't matter that one call can use it and the other calls can't, since the signature doesn't change. Is there some additional feature being requested that affects the descendants API?

@mdorf

mdorf commented Mar 5, 2019

> Question about bullet 1 above. It seems to me the search index does not change the functionality; it just makes the call faster. In that case, it doesn't matter that one call can use it and the other calls can't, since the signature doesn't change. Is there some additional feature being requested that affects the descendants API?

My primary reason is a reluctance to make a core API endpoint rely on data located in a search index. My thinking on this can be summed up by this quote from a user in a related thread:

> Solr is a great way to have fast access to big data on a budget. Be advised that Solr cannot replace RDBMS since it cannot serve as the authoritative, canonical representation of your data. You will have to rebuild your index from other sources eventually.

The core endpoint, though slower at the moment (which may no longer be the case once we move to AllegroGraph), will always be more reliable, in my opinion, since it retrieves data directly from the primary source. Having a way to do it via search gives you the option of retrieving the data from a "secondary cache" (the search index).

mdorf added a commit to ncbo/ontologies_api that referenced this issue Mar 12, 2019
@mdorf

mdorf commented Mar 18, 2019

The code has been deployed to prod.

@mdorf mdorf closed this as completed Mar 18, 2019
@mdorf mdorf removed the Release 2.4 label Mar 18, 2019