schedule task to build genome index if one is not available #49

benjiec · 2021-04-16T13:17:46Z

This allows a GET request to the RO server to trigger a celery task that builds the genome index if one does not exist. Currently, if there is no genome index, the GET call fails and the index is never built, so subsequent calls also fail.

yaoyuyang · 2021-04-16T14:19:29Z

src/edge/tasks.py

+@shared_task
+def build_genome_fragment_indices(genome_id):
+    genome = Genome.objects.get(pk=genome_id)
+    genome.indexed_genome()


Cool. So this is different from building the blastdb for the genome, right? Also, where is the index stored?

The indices are per-fragment, and each one maps sequence bps to numeric bp positions. They are stored in the database.

yaoyuyang · 2021-04-16T14:20:30Z

src/edge/views.py

@@ -330,6 +330,10 @@ def on_get(self, request, genome_id):
        args = q_parser.parse_args(request)
        field = args['field']

+        if not genome.has_location_index:


What causes the genome index not to be built when the genome is first created? Random failures?

The indices are not required. In fact, because the indices are not shared between genomes, building one per fragment every time would remove some of the storage advantage of Edge for engineered genomes. Hence, we don't create one when a new genome is created, only when someone visits the genome on the website or hits an API that requires it. The issue is that if you only ever hit the RO server, the indices never get built. So the solution here is to have the indices built via celery when an RO API needs them. If an RO server receives an API call that's meant for updating, that's a caller error.
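For readers following along, here is a rough, self-contained sketch of the flow described above. It is not the code in this PR: `Genome`, `TaskQueue`, and `on_get` are hypothetical stand-ins written in plain Python so the example runs without Django or celery, "RO" is assumed to mean read-only, and the response returned while the index is still missing is a guess, since the diff above is cut off after the `has_location_index` check. The duplicate check in `delay()` is also an assumption added for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Genome:
    """Hypothetical stand-in for the Django model; only the flag matters here."""
    id: int
    has_location_index: bool = False

    def indexed_genome(self):
        # Stand-in for the real index build, which stores per-fragment
        # indices in the database.
        self.has_location_index = True
        return self


@dataclass
class TaskQueue:
    """Hypothetical stand-in for the celery broker: delay() only records work."""
    pending: list = field(default_factory=list)

    def delay(self, genome_id):
        # Assumption: skip duplicates so repeated GETs do not queue the
        # same build many times.
        if genome_id not in self.pending:
            self.pending.append(genome_id)

    def run_worker(self, genomes):
        # A worker with write access drains the queue and builds each index.
        while self.pending:
            genomes[self.pending.pop(0)].indexed_genome()


def on_get(genome, queue):
    """Read-only request handler: it never builds the index itself."""
    if not genome.has_location_index:
        queue.delay(genome.id)
        return {"status": "index scheduled"}  # assumed response shape
    return {"status": "ok"}


# Usage: the first GET schedules the build, a worker runs it,
# and the next GET finds the index in place.
genomes = {1: Genome(id=1)}
queue = TaskQueue()
first = on_get(genomes[1], queue)
queue.run_worker(genomes)
second = on_get(genomes[1], queue)
```

The point of the split is that the request path only reads state and enqueues work, so it stays valid on a server that cannot write, while the write happens in the worker.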

schedule task to build genome index if one is not available

c9255e1

benjiec requested a review from yaoyuyang April 16, 2021 13:17

yaoyuyang reviewed Apr 16, 2021

benjiec merged commit 5308315 into master Apr 16, 2021

This pull request was closed.
