
Allow API import/export of format 2 workflows. #6776

Merged
merged 8 commits into from Oct 26, 2018


@jmchilton jmchilton commented Sep 26, 2018

This builds on a set of Galaxy PRs (#6746, #6807, #6811) and a series of gxformat2 releases.

In addition to bringing in the latest gxformat2 changes that allow the syntax changes described below, and updating hundreds of lines of test workflows, this PR finally enables such workflows to work directly with the Galaxy API (switched on with enable_beta_workflow_format). When enabled, workflows imported from dictionaries are checked, and if they look like a format 2 workflow they are pre-converted to the native format before import. Uploading YAML wrapped in JSON, to avoid dictionary handling changes, is also allowed. Likewise, there are a couple of new download styles that attempt to extract format 2 workflows from the native representation, using functionality added in recent versions of gxformat2. These can be downloaded as JSON, or as YAML string content wrapped in JSON in order to preserve the pretty YAML formatting and dictionary ordering implemented in gxformat2.
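
A minimal sketch of the import-side check described above, not Galaxy's actual implementation. The function names and the `yaml_content` key are hypothetical and chosen only for illustration; the one detail taken from the examples in this description is that format 2 workflows carry `class: GalaxyWorkflow` or a top-level `$graph`.

```python
import json


def looks_like_format2(as_dict):
    """Heuristic check; the keys tested here are assumptions for illustration."""
    return (
        as_dict.get("class") == "GalaxyWorkflow"
        or "$graph" in as_dict
        or "yaml_content" in as_dict  # hypothetical key for YAML wrapped in JSON
    )


def wrap_yaml_for_upload(yaml_text):
    """Wrap raw YAML text in a JSON object so the text travels unchanged."""
    return json.dumps({"yaml_content": yaml_text})


# A native-style dictionary, a format 2 dictionary, and YAML wrapped in JSON.
native = {"a_galaxy_workflow": "true", "steps": {}}
format2 = {"class": "GalaxyWorkflow", "inputs": {}, "steps": {}}
wrapped = json.loads(wrap_yaml_for_upload("class: GalaxyWorkflow\nsteps: {}\n"))

for candidate in (native, format2, wrapped):
    print(looks_like_format2(candidate))
```

Because the wrapped form carries the YAML as a single string, key order and formatting are whatever the author wrote, independent of how the JSON layer handles dictionaries.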

If the JSON-wrapped YAML-in-strings approach seems odd, consider that an important future direction of this work is likely to be storing the originally supplied YAML alongside the native representation of the workflow, and then allowing the workflow editor to apply a series of deltas to both in parallel. This can be done with a round-trip-aware YAML parser/writer such as ruamel.yaml, so that user comments and formatting, as well as extraneous data in the YAML, are preserved.

This PR continues to refine the workflow syntax toward a more concise and CWL-compatible form. The following two code blocks are a before-and-after example demonstrating the syntax changes.

Before this series of PRs:

class: GalaxyWorkflow
inputs:
  - id: input_fastqs
    type: collection
  - id: reference
outputs:
  - id: pileup_output
    source: pileup#out_file1
steps:
  - label: map_over_mapper
    tool_id: mapper
    state:
      input1:
        $link: input_fastqs
      reference:
        $link: reference
  - label: pileup
    tool_id: pileup
    state:
      input1:
        $link: map_over_mapper#out_file1
      reference:
        $link: reference

My recommended best practice after this PR is merged:

class: GalaxyWorkflow
inputs:
  input_fastqs: collection
  reference: data
outputs:
  pileup_output:
    outputSource: pileup/out_file1
steps:
  map_over_mapper:
    tool_id: mapper
    in:
      input1: input_fastqs
      reference: reference
  pileup:
    tool_id: pileup
    in:
      input1: map_over_mapper/out_file1
      reference: reference
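
The mapping between the two styles can be made explicit with a small converter. This is an illustrative sketch only and is not how gxformat2 performs any conversion; `modernize` and `link_to_source` are hypothetical helpers, and the rule that an input without a `type` means `data` is inferred from the `reference` input in the two examples above.

```python
def link_to_source(link):
    """Rewrite an old-style 'step#output' reference as 'step/output'."""
    return link.replace("#", "/")


def modernize(old):
    """Convert the list-based syntax to the mapping-based syntax (illustrative)."""
    new = {"class": old["class"]}
    # Inputs: a list of {id, type} becomes {id: type}; missing type assumed "data".
    new["inputs"] = {i["id"]: i.get("type", "data") for i in old.get("inputs", [])}
    # Outputs: 'source' becomes 'outputSource'.
    new["outputs"] = {
        o["id"]: {"outputSource": link_to_source(o["source"])}
        for o in old.get("outputs", [])
    }
    # Steps: the label becomes the key and $link entries in 'state' move to 'in'.
    new["steps"] = {}
    for step in old.get("steps", []):
        connections = {
            name: link_to_source(value["$link"])
            for name, value in step.get("state", {}).items()
            if isinstance(value, dict) and "$link" in value
        }
        new["steps"][step["label"]] = {"tool_id": step["tool_id"], "in": connections}
    return new


old = {
    "class": "GalaxyWorkflow",
    "inputs": [{"id": "input_fastqs", "type": "collection"}, {"id": "reference"}],
    "outputs": [{"id": "pileup_output", "source": "pileup#out_file1"}],
    "steps": [
        {
            "label": "map_over_mapper",
            "tool_id": "mapper",
            "state": {
                "input1": {"$link": "input_fastqs"},
                "reference": {"$link": "reference"},
            },
        },
        {
            "label": "pileup",
            "tool_id": "pileup",
            "state": {
                "input1": {"$link": "map_over_mapper#out_file1"},
                "reference": {"$link": "reference"},
            },
        },
    ],
}

new = modernize(old)
print(new["steps"]["pileup"]["in"]["input1"])
```

The sketch only handles the constructs that appear in the two examples; non-connection tool state and other step types are deliberately out of scope.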

Prior to this pull request, all subworkflows (format-version 0.1 or 2) would be imported repeatedly - once per step that referenced them. This PR introduces a CWL-derived syntax for repeatedly referencing the same workflow and updates the Galaxy import functionality to properly resolve these references and import such workflows only once instead of once per step.

$graph:
- id: nested
  class: GalaxyWorkflow
  inputs:
    inner_input: data
  outputs:
    inner_output:
      outputSource: inner_cat/out_file1
  steps:
    inner_cat:
      tool_id: cat
      in:
        input1: inner_input
        queries_0|input2: inner_input
- id: main
  class: GalaxyWorkflow
  inputs:
    outer_input: data
  steps:
    outer_cat:
      tool_id: cat
      in:
        input1: outer_input
    nested_workflow_1:
      run: '#nested'
      in:
        inner_input: outer_cat/out_file1
    nested_workflow_2:
      run: '#nested'
      in:
        inner_input: nested_workflow_1/inner_output
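
The import-once behavior for `$graph` documents can be sketched as a memoized resolver. This is a simplified model under stated assumptions, not the Galaxy import code: `import_graph` is a hypothetical name, the entry workflow is assumed to have the id `main` as in the example above, and an "imported" workflow is represented by a plain dictionary.

```python
def import_graph(graph, main_id="main"):
    """Import each workflow in a $graph once, resolving 'run: #id' references."""
    by_id = {wf["id"]: wf for wf in graph}
    imported = {}  # workflow id -> handle standing in for the stored workflow

    def import_one(wf_id):
        if wf_id in imported:  # already imported: reuse instead of re-importing
            return imported[wf_id]
        handle = {"id": wf_id, "subworkflows": {}}
        imported[wf_id] = handle  # register first so repeated references are shared
        for label, step in by_id[wf_id].get("steps", {}).items():
            run = step.get("run")
            if isinstance(run, str) and run.startswith("#"):
                handle["subworkflows"][label] = import_one(run[1:])
        return handle

    return import_one(main_id), imported


graph = [
    {"id": "nested", "steps": {"inner_cat": {"tool_id": "cat"}}},
    {
        "id": "main",
        "steps": {
            "outer_cat": {"tool_id": "cat"},
            "nested_workflow_1": {"run": "#nested"},
            "nested_workflow_2": {"run": "#nested"},
        },
    },
]

main, imported = import_graph(graph)
print(len(imported))
print(main["subworkflows"]["nested_workflow_1"] is main["subworkflows"]["nested_workflow_2"])
```

Both subworkflow steps end up pointing at the same handle, which is the property the PR describes: the referenced workflow is imported a single time regardless of how many steps run it.
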
@jmchilton jmchilton force-pushed the gxformat2_integration branch 3 times, most recently from a90464c to bc9717b Sep 30, 2018
@jmchilton jmchilton force-pushed the gxformat2_integration branch 6 times, most recently from 50bdfd5 to eb0ddac Oct 6, 2018
@jmchilton jmchilton force-pushed the gxformat2_integration branch 2 times, most recently from 540706a to 3be0747 Oct 9, 2018
@jmchilton jmchilton changed the title [WIP] gxformat2 workflow integration with Galaxy API Allow API import/export of format 2 workflows. Oct 9, 2018
@jmchilton jmchilton force-pushed the gxformat2_integration branch from 3be0747 to 397f237 Oct 10, 2018
@jmchilton jmchilton changed the title Allow API import/export of format 2 workflows. [WIP] Allow API import/export of format 2 workflows. Oct 10, 2018
@jmchilton jmchilton force-pushed the gxformat2_integration branch from 397f237 to 2fac59a Oct 11, 2018
@jmchilton jmchilton force-pushed the gxformat2_integration branch from 2fac59a to f7415cb Oct 12, 2018
@jmchilton jmchilton changed the title [WIP] Allow API import/export of format 2 workflows. Allow API import/export of format 2 workflows. Oct 12, 2018
@mvdbeek mvdbeek merged commit dda297c into galaxyproject:dev Oct 26, 2018
6 checks passed

@jmchilton jmchilton commented Oct 26, 2018

Thanks a million @mvdbeek !


@nsoranzo nsoranzo deleted the gxformat2_integration branch Nov 30, 2018
nsoranzo added a commit to nsoranzo/galaxy that referenced this issue Nov 30, 2018
with `make config-rebuild` .
Follow-up on galaxyproject#6776 .