Commit
Formalize workflow invocation and invocation step outputs.
Workflow Invocations
--------------------

The workflow invocation outputs half of this is relatively straightforward. It is modelled somewhat on job outputs: output datasets and output dataset collections are now tracked for each workflow invocation and exposed via the workflow invocation API. This required adding new tables (linked to WorkflowInvocations and WorkflowOutputs) that track these output associations.

Previously one could imagine backtracking this information for simple tool steps via the WorkflowInvocationStep -> Job table, but for steps that have many jobs (i.e. mapping over a collection) or for non-tool steps such information was more difficult to recover (and simply couldn't be recovered from the API at all, or even internally without significant knowledge of the underlying workflow).

Workflow Invocation Steps
-------------------------

Tracking the outputs of WorkflowInvocationSteps was not previously done at all; one would have to follow the Job table as well. A significant downside to this is that one cannot map over empty collections in a workflow, since no such job would exist.

Tracking job outputs for WorkflowInvocationSteps is not a simple matter of just attaching outputs to an existing table, because we had no concept of a single tracked workflow step within an invocation - there could be many WorkflowInvocationSteps corresponding to the same combination of WorkflowInvocation and WorkflowStep. That should feel wrong, and that is because it is - when collections were added, the possibility of having many jobs for the same combination of WorkflowInvocation and WorkflowStep was added. I should have split WorkflowInvocationSteps into WorkflowInvocationSteps and WorkflowInvocationStepJobAssociations at that time but didn't. This commit now does it - effectively normalizing the ``workflow_invocation_step`` table by introducing the new ``workflow_invocation_step_job_association`` table.
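As a rough illustration of the normalized shape, here is a minimal in-memory sketch. It is not Galaxy's model code: the class names, field names, and the ``per_job_outputs`` helper are hypothetical stand-ins chosen for this example, and the real tables carry more columns. It only shows one step row per (invocation, workflow step) pair, many job associations under it, and outputs persisted on the step itself so that a step with zero jobs still has a (possibly empty) output.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Job:
    # Stand-in for a job; maps output name -> dataset label (hypothetical).
    id: int
    outputs: Dict[str, str] = field(default_factory=dict)


@dataclass
class StepJobAssociation:
    # Hypothetical mirror of one workflow_invocation_step_job_association row.
    job: Job


@dataclass
class InvocationStep:
    # After the split: one object per (invocation, workflow step) pair.
    workflow_step_id: int
    job_associations: List[StepJobAssociation] = field(default_factory=list)
    # Outputs persisted on the step itself, e.g. implicitly created collections.
    output_collections: Dict[str, List[str]] = field(default_factory=dict)


def per_job_outputs(step: InvocationStep, name: str) -> List[str]:
    """Walk step -> job association -> job -> output for one output name."""
    return [
        assoc.job.outputs[name]
        for assoc in step.job_associations
        if name in assoc.job.outputs
    ]


# A step mapped over a two-element collection: two jobs, one mapped-over output.
mapped = InvocationStep(
    workflow_step_id=1,
    job_associations=[
        StepJobAssociation(Job(1, {"out": "d1"})),
        StepJobAssociation(Job(2, {"out": "d2"})),
    ],
    output_collections={"out": ["d1", "d2"]},
)

# A step mapped over an empty collection: no jobs, but the output still exists.
empty = InvocationStep(workflow_step_id=2, output_collections={"out": []})
```

With the outputs attached to the step rather than inferred from jobs, the empty-collection case needs no special handling: ``empty`` simply has no job associations and an empty output collection.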
Splitting up the WorkflowInvocationStep table this way allows recovering the mapped over output (e.g. the implicitly created collection from all the jobs) as well as the outputs from the individual jobs (by walking WorkflowInvocationStep -> WorkflowInvocationStepJobAssociation -> Job -> JobToOutput*Association).

This split involves fairly substantial changes to the workflow module interface. Any place a list of WorkflowInvocationSteps was assumed, I reworked it to just expect a single WorkflowInvocationStep. I vastly simplified recover_mapping to just use the persisted outputs (this was needed in order to also implement empty collection mapping in workflows). This also fixes a bug (or implements a missing feature) where subworkflow modules had no recover_mapping methods - so for instance if a tool that produces dynamic collections appeared anywhere in a workflow after a subworkflow step, that workflow would not complete scheduling properly.

Now that we have a way to reference the set of jobs corresponding to a workflow step within an invocation, we can start to track partial scheduling of such steps. This is outlined in #3883 and refactoring toward this goal is included here. That refactoring includes adding a state to WorkflowInvocationStep so Galaxy can determine if it has started scheduling this step, adding an index when scheduling jobs so it can tell how far into scheduling things have gone, and augmenting the tool executor to take a maximum number of jobs to execute and to allow recovery of existing jobs for collection building purposes.

*Applications*

These changes will enable:

- A simple, consistent API for finding workflow outputs that can be consumed by Planemo for testing workflows.
- Mapping over empty collections in workflows.
- Re-scheduling workflow invocations that include subworkflow steps.
- Partial scheduling within steps requiring a large number of jobs when scheduling workflow invocations.
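The partial scheduling idea described above (a per-step state, an index into the jobs, and a cap on jobs per pass) can be sketched roughly as follows. This is an assumption-laden toy, not the tool executor's actual interface: ``StepProgress``, ``schedule_some``, the ``max_jobs`` parameter name, and the state strings are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class StepProgress:
    # Hypothetical per-step record: a state plus how far scheduling has gone.
    state: str = "new"
    next_index: int = 0
    job_ids: List[int] = field(default_factory=list)


def schedule_some(
    progress: StepProgress,
    elements: List[str],
    submit: Callable[[str], int],
    max_jobs: int,
) -> StepProgress:
    """Submit at most max_jobs more jobs, resuming from progress.next_index."""
    progress.state = "scheduling"
    end = min(progress.next_index + max_jobs, len(elements))
    for element in elements[progress.next_index:end]:
        # Jobs from earlier passes stay in job_ids, so they can be reused
        # later when building the output collection.
        progress.job_ids.append(submit(element))
    progress.next_index = end
    if progress.next_index >= len(elements):
        progress.state = "scheduled"
    return progress


submitted: List[str] = []


def fake_submit(element: str) -> int:
    # Stand-in for job creation; returns a fake job id.
    submitted.append(element)
    return len(submitted)


elements = ["a", "b", "c", "d", "e"]
progress = StepProgress()
schedule_some(progress, elements, fake_submit, max_jobs=2)  # first pass only
```

Calling ``schedule_some`` again with the same ``progress`` picks up at ``next_index`` rather than resubmitting earlier elements, and an empty ``elements`` list moves straight to the final state with zero jobs.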
Showing 10 changed files with 739 additions and 157 deletions.