Polyphemus is a worker that finds data uploaded to Magma and manages its analysis via an external pipeline.
Summary
At the moment, data from external pipelines is fed into Magma by third parties. A typical workflow might go like this:
1. The analyst receives data (e.g. raw sequence from RNAseq) and performs analysis on it (e.g. alignment + quantification).
2. The analyst composes a document, validates it against a Magma template, and sends it with appropriate credentials to Magma for insertion.
3. Magma approves and inserts the document.
However, as more data flows into the system and the number of tasks increases, this sort of manual intervention will become problematic. We can imagine, instead, replacing the analyst's role with a software worker (Polyphemus). This time the workflow might go like this:
1. Data is uploaded into Magma (e.g. raw sequence from RNAseq).
2. Polyphemus searches for unanalyzed data to consume (e.g. a sample with raw sequence but no quantification results).
3. Polyphemus determines the correct analysis to run on the unanalyzed data and composes a configuration script for a remote pipeline.
4. The job is dispatched to the remote pipeline, which requests the appropriate raw data from Magma and analyzes it.
5. The remote pipeline pushes records (or errors) back to Magma, which alerts Polyphemus that the job is complete.
6. If there is an error, Polyphemus makes note of it and asks for intervention.
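To make the proposed workflow concrete, here is a minimal in-memory sketch of one pass of the worker. Everything in it is an assumption for illustration: the `Sample` record shape, the `PIPELINES` mapping, and the `dispatch` callable (which stands in for the remote pipeline and runs synchronously here, whereas the proposal has the pipeline push results back to Magma itself). None of these names come from Magma or Polyphemus.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sample:
    # Hypothetical record shape; real Magma templates will differ.
    name: str
    data_type: str                    # e.g. "rnaseq"
    raw_path: Optional[str] = None    # raw sequence, if uploaded
    results: Optional[dict] = None    # quantification, if analyzed
    error: Optional[str] = None       # last pipeline error, if any


# Hypothetical mapping from data type to pipeline name.
PIPELINES = {"rnaseq": "rnaseq"}


def find_unanalyzed(samples):
    """Return samples with raw data, no results, and no recorded error."""
    return [s for s in samples
            if s.raw_path and s.results is None and s.error is None]


def compose_job(sample):
    """Build a configuration for the remote pipeline."""
    return {"pipeline": PIPELINES[sample.data_type],
            "sample": sample.name,
            "input": sample.raw_path}


def run_once(samples, dispatch):
    """Dispatch one job per unanalyzed sample.

    `dispatch` is a stand-in for the remote pipeline: it takes a job
    configuration and returns a results dict, or raises on failure.
    Returns the names of samples that need manual intervention.
    """
    needs_intervention = []
    for sample in find_unanalyzed(samples):
        try:
            sample.results = dispatch(compose_job(sample))
        except Exception as exc:
            # Record the error so the sample is not retried blindly.
            sample.error = str(exc)
            needs_intervention.append(sample.name)
    return needs_intervention
```

Recording the error on the sample keeps a failed job out of the next pass until someone clears it, which matches the "asks for intervention" step above.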
Polyphemus could also track what version of a specific pipeline was used, and whether it has been invalidated by a newer version. For example, if we discover our RNAseq alignment was incorrect, we can compose a new, better pipeline and update the analysis requirements (e.g., "RNAseq raw data must be analyzed with at least version 10.1 of pipeline 'rnaseq'"). Polyphemus can then find all data that has not yet been analyzed, or data that needs to be re-analyzed using a new method, and queue new jobs for analysis.
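The version rule could be checked along these lines. This is a sketch under assumptions: the `REQUIREMENTS` table, the `Analysis` record, and the dotted-integer version format are all hypothetical, chosen only to mirror the "at least version 10.1 of pipeline 'rnaseq'" example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical requirement table: pipeline name -> minimum version.
REQUIREMENTS = {"rnaseq": "10.1"}


@dataclass
class Analysis:
    # Hypothetical record of which pipeline version produced a result.
    sample: str
    pipeline: str
    version: Optional[str]  # None if the sample was never analyzed


def parse_version(text):
    """Turn "10.1" into (10, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def needs_analysis(analysis, requirements=REQUIREMENTS):
    """True if the sample is unanalyzed or its analysis is out of date."""
    if analysis.version is None:
        return True
    minimum = requirements.get(analysis.pipeline)
    if minimum is None:
        return False  # no requirement recorded for this pipeline
    return parse_version(analysis.version) < parse_version(minimum)


def queue_jobs(analyses, requirements=REQUIREMENTS):
    """Names of samples that should be (re-)analyzed."""
    return [a.sample for a in analyses if needs_analysis(a, requirements)]
```

Comparing parsed tuples rather than raw strings matters here: as strings, "9.4" sorts after "10.1", so an outdated analysis would be treated as current.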
In general, this means we only need to feed data into the system, and over time it will produce the best available analysis of the data as it currently exists.