Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add fastp tool to IUC #1763

Merged
merged 15 commits into from
Mar 10, 2018
Merged

Add fastp tool to IUC #1763

merged 15 commits into from
Mar 10, 2018

Conversation

mblue9
Copy link
Contributor

@mblue9 mblue9 commented Mar 7, 2018

fastp - a tool for fastq processing, trimming, QC etc (website: https://github.com/OpenGene/fastp)
has been wrapped by @HassanAmr (hooray! 🎉) bgruening/galaxytools#691
and this PR is to move it to IUC

A tool designed to provide fast all-in-one preprocessing for FastQ files. This tool is developed in C++ with multithreading supported to afford high performance.
remote_repository_url: https://github.com/bgruening/galaxytools/tree/master/tools/fastp
name: fastp
owner: rnateam
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

iuc :)

@bgruening
Copy link
Member

@nsoranzo I increased the Galaxy version against which we are testing here. I don't know why this matters for the test but it worked in galaxytools since a few days and there we are testing against 18.01.

Can we merge it with this change, as 18.01 will be out "soon"?

@nsoranzo
Copy link
Member

nsoranzo commented Mar 8, 2018

@bgruening I re-ran the last Travis build using 17.09 (i.e. the one for commit e8a18b1) and it passed this time, but I see no problem running tests on 18.01 at this point of the release cycle.

@bgruening
Copy link
Member

@nsoranzo I'm also not sure, it passed for me as well locally, but travis failed multiple times. Please feel free to merge this. @mblue9 would be happy to use it :)

@bgruening
Copy link
Member

Danke Nicola!

homepage_url: https://github.com/OpenGene/fastp
long_description: |
A tool designed to provide fast all-in-one preprocessing for FastQ files. This tool is developed in C++ with multithreading supported to afford high performance.
remote_repository_url: https://github.com/bgruening/galaxytools/tree/master/tools/fastp
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

name: fastp
owner: iuc
type: unrestricted

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove empty line

@@ -0,0 +1,357 @@
<tool id="fastp" name="fastp" version="0.12.4">
<description>Fast all-in-one preprocessing for FastQ files</description>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/FastQ/FASTQ/

<requirements>
<requirement type="package" version="0.12.4">fastp</requirement>
</requirements>
<version_command>fastp --version | tail -1</version_command>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tail -n 1, -1 is not POSIX

</requirements>
<version_command>fastp --version | tail -1</version_command>
<command><![CDATA[

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove empty line

</tests>
<help>
<![CDATA[
A tool designed to provide fast all-in-one preprocessing for FastQ files. This tool is developed in C++ with multithreading supported to afford high performance.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/FastQ/FASTQ/


10. support long reads (data from PacBio / Nanopore devices).

This tool is being intensively developed, and new features can be implemented soon if they are considered useful. If you have any additional requirement for fastp, please file an issue:https://github.com/OpenGene/fastp/issues/new
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd remove this line.

<![CDATA[
A tool designed to provide fast all-in-one preprocessing for FastQ files. This tool is developed in C++ with multithreading supported to afford high performance.

**features**
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Capital F

@@ -0,0 +1,11 @@
categories:
- Sequence Analysis
description: Fast all-in-one preprocessing for FastQ files
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/FastQ/FASTQ/

description: Fast all-in-one preprocessing for FastQ files
homepage_url: https://github.com/OpenGene/fastp
long_description: |
A tool designed to provide fast all-in-one preprocessing for FastQ files. This tool is developed in C++ with multithreading supported to afford high performance.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/FastQ/FASTQ/

@mblue9
Copy link
Contributor Author

mblue9 commented Mar 8, 2018

Thanks for the review @nsoranzo ! I've made those changes.

</section>
</when>
<when value="paired">
<param name="in1" argument="-i" type="data" format="fastq" label="Input 1" help="Input FASTQ file #1."/>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should be able to add fastq.gz here and for the other inputs, right ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well I wondered that and so tested it with a fastq.gz file and it ran fine so it didn't seem necessary to add 😕. But what do you think? Should I instead add all possibilities- fastq, fastq.gz, fastqsanger, fastqsanger.gz?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fastqsanger, fastqillumina etc are all subclasses of fastq, so that's fine.
fastqsanger.gz, fastqillumina.gz etc, are subclasses of fastq.gz, and should normally require implicit conversion to their uncompressed equivalent if you want to use them with a fastq input.
Is that perhaps why it worked for you ? The disadvantage of not specifying fastq.gz is the additional, unnecessary uncompressed dataset that would be created.

</inputs>

<outputs>
<data name="out1" format="fastq" label="${tool.name} on ${on_string}: Read 1 Output"/>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we make this format_source="in1" (and below) ?

<test>
<param name="in1" value="R1.fq"/>
<param name="single_paired_selector" value="single"/>
<output name="out1" file="out1.fq" ftype="fastq"/>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this would be fastqsanger now, unless you change <param name="in1" value="R1.fq"/> to <param name="in1" value="R1.fq" ftype="fastq"/>


#if $single_paired.adapter_trimming_options.disable_adapter_trimming:
$single_paired.adapter_trimming_options.disable_adapter_trimming
#end if
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The #if and #end if are not needed since you're defining truevalue and falsevalue in the <param>. Same for most other boolean params below.

@mblue9
Copy link
Contributor Author

mblue9 commented Mar 9, 2018

Thanks @mvdbeek and @nsoranzo I've made those changes

@nsoranzo
Copy link
Member

nsoranzo commented Mar 9, 2018

@mblue9 Thanks a lot for the changes! There are still 3 standing comments from my previous review at https://github.com/galaxyproject/tools-iuc/pull/1763/files#diff-12b0b82edec68d4ab4e953daf7087da6 , if you have time.

@nsoranzo
Copy link
Member

nsoranzo commented Mar 9, 2018

@bgruening The merge of master didn't work well, the new tools in tools/charts/ appear now in the diff.

@bgruening
Copy link
Member

@nsoranzo on it, one commit is also missing. Please review in 10min and merge :)
@mblue9 needs it when she wakes up :)

@bgruening
Copy link
Member

So done.


fastp

-i input1.${in1.ext}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a doubt: does fastp support all our weird ext, like fastqillumina.gz or fastqsolexa, which are all subtypes of fastq/fastq.gz?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most likely not and if we want to support this we porbably need to add --phred64 as well.

#set $ext = 'fastq'
#else:
#set $ext = $in1.ext
#end if
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be:

#if $in1.is_of_type('fastq.gz')
    #set ext = 'fastq.gz'
#else
    #set ext = 'fastq'
#end if

#end if


## Run fastp

fastp

-i input1.${in1.ext}
-o first.${in1.ext}
#if $in1.ext in ['fastqillumina', 'fastqsolexa', 'fastqillumina.gz', 'fastqsolexa.gz']:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#if $in1.is_of_type('fastqillumina', 'fastqsolexa', 'fastqillumina.gz', 'fastqsolexa.gz')

@mblue9
Copy link
Contributor Author

mblue9 commented Mar 9, 2018

Sorry @nsoranzo I missed those other changes! I've added your suggestions now and if you think I should make more changes please let me know.

@mblue9
Copy link
Contributor Author

mblue9 commented Mar 10, 2018

:( I don't know why it's failing here on the 9th test with "No output has been received in the last 20m0s", it passes fine locally 😕

@nsoranzo
Copy link
Member

It's green after a restart!

Thanks @mblue9 and @bgruening!

@nsoranzo nsoranzo merged commit 2e41f5d into galaxyproject:master Mar 10, 2018
@mblue9
Copy link
Contributor Author

mblue9 commented Mar 10, 2018

Thanks a lot @nsoranzo ! I'm trying it now and it's looking good! 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants