Blazegraph upload tasks should be added to their own queue #247

@kevinkle

Right now, every call to datastruct_savvy() calls upload_graph() separately. With a large number of workers, these simultaneous uploads might be what causes Blazegraph to hang when running in corefacility.

The way to solve this would be to merge a few of the current queues:

  1. priority is currently used to run blazegraph queries for the frontend
  2. blazegraph is currently used to reserve spfyids for uploaded files
  3. multiples (for RGI) and singles (for ECTyper) can each invoke the upload_graph() function and cause simultaneous uploading of result graphs.

There are a number of permutations for this, but for now I'm going to try grouping just the uploads from 3. into their own queue. Queue 2. is valuable because all tasks depend on it, so we want to keep it separate. Ideally, by merging the uploads from 3. into one queue with only one worker on it, we can avoid overloading Blazegraph.
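The single-worker idea can be illustrated without RQ or Blazegraph. The sketch below is only a stand-in under stated assumptions: a `queue.Queue` plays the merged upload queue, threads play the multiples and singles workers, and `UploadRecorder.upload_graph()` is a hypothetical placeholder for the real upload_graph(). It shows that with one consumer, no two uploads ever run at the same time, however many producers submit work.

```python
import queue
import threading


class UploadRecorder:
    """Hypothetical stand-in for upload_graph(); tracks concurrent calls."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = 0
        self.max_active = 0
        self.uploaded = []

    def upload_graph(self, graph):
        with self._lock:
            self._active += 1
            self.max_active = max(self.max_active, self._active)
        # The real upload to Blazegraph would happen here.
        self.uploaded.append(graph)
        with self._lock:
            self._active -= 1


def run_demo(jobs_per_worker=5):
    """Two producers feed one upload queue drained by a single worker."""
    uploads = queue.Queue()  # stand-in for the merged upload queue
    recorder = UploadRecorder()

    def upload_worker():
        # The only consumer: uploads are processed one at a time.
        while True:
            graph = uploads.get()
            if graph is None:  # sentinel: no more work
                break
            recorder.upload_graph(graph)

    def analysis_worker(name):
        # Stand-in for a multiples or singles worker producing result graphs.
        for i in range(jobs_per_worker):
            uploads.put(f"{name}-graph-{i}")

    consumer = threading.Thread(target=upload_worker)
    consumer.start()
    producers = [
        threading.Thread(target=analysis_worker, args=(name,))
        for name in ("multiples", "singles")
    ]
    for producer in producers:
        producer.start()
    for producer in producers:
        producer.join()
    uploads.put(None)  # all producers are done, so stop the consumer
    consumer.join()
    return recorder.max_active, len(recorder.uploaded)
```

Calling `run_demo()` returns the highest number of uploads seen running at once and the total number of graphs uploaded. In the actual deployment the same effect would presumably come from starting exactly one worker process on the upload queue.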

A few approaches to do this:

  1. Create a new task for uploading. This requires modifying the routes to return the upload task, instead of the datastruct_savvy() task, as the end task. (Again, still waiting on multi-job dependencies: "Multi dependencies (my take)", rq/rq#856.)
  2. Create a decorator for uploading. This sidesteps route modification, but users can't see when their files are actually loaded into the database, though they will still get results.

I'm going to go with 2., as it will be fast to develop and lets us test this theory; we can also use the decorators to eventually build full job classes.
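A minimal sketch of what the decorator in approach 2. might look like. Everything here is an assumption for illustration: the name `uploads_result`, the `enqueue` callable, the list used as a stand-in queue, and the simplified signatures of datastruct_savvy() and upload_graph(). In the real system, `enqueue` could be the `enqueue` method of a dedicated RQ queue.

```python
import functools


def uploads_result(upload_func, enqueue):
    """Hypothetical decorator: run the task, then hand its result to the
    upload queue instead of uploading inline."""

    def decorator(task):
        @functools.wraps(task)
        def wrapper(*args, **kwargs):
            result = task(*args, **kwargs)
            # Fire and forget: the caller gets the result right away and
            # never sees when the upload itself completes.
            enqueue(upload_func, result)
            return result

        return wrapper

    return decorator


# Stand-ins for illustration only.
pending_uploads = []  # plays the role of the dedicated upload queue
uploaded = []  # plays the role of the database


def enqueue_upload(func, *args):
    """Record the upload job; a real queue would hand it to a worker."""
    pending_uploads.append((func, args))


def upload_graph(graph):
    """Placeholder for the real upload to Blazegraph."""
    uploaded.append(graph)


@uploads_result(upload_graph, enqueue_upload)
def datastruct_savvy(results):
    """Placeholder task: pretend to turn analysis results into a graph."""
    return {"graph_for": results}
```

Calling `datastruct_savvy("ecoli.fasta")` returns the graph immediately and leaves one `(upload_graph, (graph,))` entry in `pending_uploads` for the upload worker to run later. That matches the trade-off described above: results come back as before, but the task's completion says nothing about whether the upload has happened.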
